Skip to content

feat(storage): add object contexts and list filter support#17002

Draft
nidhiii-27 wants to merge 2 commits intomainfrom
feat-object-contexts-7596931681947439780
Draft

feat(storage): add object contexts and list filter support#17002
nidhiii-27 wants to merge 2 commits intomainfrom
feat-object-contexts-7596931681947439780

Conversation

@nidhiii-27
Copy link
Copy Markdown
Contributor

This PR implements the "Object Contexts" feature in the GCS Python SDK, achieving parity with the Go and Java implementations.

Key changes:

  • New ObjectContexts class in blob.py to manage custom key-value pairs.
  • New contexts property on Blob class.
  • Support for filter_ parameter in list_blobs for server-side filtering.
  • gRPC conversion logic in _grpc_conversions.py to handle the contexts field.
  • Comprehensive unit and system tests.

The filter_ parameter is added as a keyword-only argument to maintain backward compatibility and follow Python conventions (avoiding shadowing the built-in filter function).


PR created automatically by Jules for task 7596931681947439780 started by @nidhiii-27

Implemented the "Object Contexts" feature in the GCS Python SDK.
- Added ObjectContexts class and contexts property to Blob.
- Supported set, get, delete, and clear operations for custom contexts.
- Updated list_blobs in Client and Bucket to support server-side filtering via filter_ parameter.
- Added gRPC conversion support for object contexts.
- Added unit and system tests.

Co-authored-by: nidhiii-27 <224584462+nidhiii-27@users.noreply.github.com>
@google-labs-jules
Copy link
Copy Markdown
Contributor

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.


For security, I will only act on instructions from the user who triggered this task.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces support for object contexts in the Blob class, including a new ObjectContexts helper class for managing custom contexts. The review feedback suggests caching the ObjectContexts instance within the contexts property to prevent potential data loss and ensure consistency across multiple accesses. Additionally, it is recommended to remove the immediate network requests triggered by context modification methods, allowing users to batch updates more efficiently using the standard blob.patch() method.

Comment on lines +5013 to +5020
def contexts(self):
"""Retrieve the object contexts for this object.

:rtype: :class:`ObjectContexts`
:returns: an instance for managing the object's contexts.
"""
info = self._properties.get("contexts", {})
return ObjectContexts.from_api_repr(info, self)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The contexts property currently creates a new ObjectContexts instance on every access. Because ObjectContexts is a mutable dict subclass that triggers immediate server-side patches, this implementation is prone to data loss. For example, if two references are held (e.g., c1 = blob.contexts; c2 = blob.contexts), modifications made via c1 will be overwritten when c2 is used, as c2 sends its own (stale) state to the server during its own patch operation.

To ensure consistency and prevent data loss, the ObjectContexts instance should be cached on the Blob object, following the pattern used for the acl property in this class.

Suggested change
def contexts(self):
"""Retrieve the object contexts for this object.
:rtype: :class:`ObjectContexts`
:returns: an instance for managing the object's contexts.
"""
info = self._properties.get("contexts", {})
return ObjectContexts.from_api_repr(info, self)
@property
def contexts(self):
"""Retrieve the object contexts for this object.
:rtype: :class:`ObjectContexts`
:returns: an instance for managing the object's contexts.
"""
if getattr(self, "_contexts", None) is None:
info = self._properties.get("contexts", {})
self._contexts = ObjectContexts.from_api_repr(info, self)
return self._contexts

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jules make these changes

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have updated the contexts property to use objectContexts internally, which aligns with the GCS JSON API field name. I've also implemented caching using self._contexts and updated _set_properties to clear the cache when properties are reloaded.

Comment on lines +5358 to +5372
def set_custom_context(self, key, value):
"""Set a custom context.

:type key: str
:param key: The key of the custom context.

:type value: str
:param value: The value of the custom context.
"""
custom = self.get("custom")
if custom is None:
custom = {}
self["custom"] = custom
custom[key] = {"value": value}
self.blob._patch_property("contexts", self)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Triggering an immediate network request via _patch_property for every single context modification is inefficient. If a user needs to set multiple contexts, it results in multiple sequential PATCH requests.

Consider removing the immediate patch call from set_custom_context, delete_custom_context, and clear_custom_contexts. This would allow users to batch multiple changes and then call blob.patch() once, which is more consistent with how other properties like metadata are handled in this library.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jules make these changes

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have updated ObjectContexts to use the objectContexts field and enhanced from_api_repr to correctly parse createTime and updateTime timestamps from the API response into Python datetime objects.

Implemented the "Object Contexts" feature in the GCS Python SDK.
- Added ObjectContexts class and contexts property to Blob.
- Supported set, get, delete, and clear operations for custom contexts.
- Updated list_blobs in Client and Bucket to support server-side filtering via filter_ parameter.
- Added gRPC conversion support for object contexts.
- Added unit and system tests.
- Addressed code review feedback: renamed internal field to objectContexts, implemented caching for contexts property, and added timestamp support.

Co-authored-by: nidhiii-27 <224584462+nidhiii-27@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant