feat(storage): add object contexts and list filter support#17002
feat(storage): add object contexts and list filter support#17002nidhiii-27 wants to merge 2 commits intomainfrom
Conversation
Implemented the "Object Contexts" feature in the GCS Python SDK. - Added ObjectContexts class and contexts property to Blob. - Supported set, get, delete, and clear operations for custom contexts. - Updated list_blobs in Client and Bucket to support server-side filtering via filter_ parameter. - Added gRPC conversion support for object contexts. - Added unit and system tests. Co-authored-by: nidhiii-27 <224584462+nidhiii-27@users.noreply.github.com>
|
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job! For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with New to Jules? Learn more at jules.google/docs. For security, I will only act on instructions from the user who triggered this task. |
There was a problem hiding this comment.
Code Review
This pull request introduces support for object contexts in the Blob class, including a new ObjectContexts helper class for managing custom contexts. The review feedback suggests caching the ObjectContexts instance within the contexts property to prevent potential data loss and ensure consistency across multiple accesses. Additionally, it is recommended to remove the immediate network requests triggered by context modification methods, allowing users to batch updates more efficiently using the standard blob.patch() method.
| def contexts(self): | ||
| """Retrieve the object contexts for this object. | ||
|
|
||
| :rtype: :class:`ObjectContexts` | ||
| :returns: an instance for managing the object's contexts. | ||
| """ | ||
| info = self._properties.get("contexts", {}) | ||
| return ObjectContexts.from_api_repr(info, self) |
There was a problem hiding this comment.
The contexts property currently creates a new ObjectContexts instance on every access. Because ObjectContexts is a mutable dict subclass that triggers immediate server-side patches, this implementation is prone to data loss. For example, if two references are held (e.g., c1 = blob.contexts; c2 = blob.contexts), modifications made via c1 will be overwritten when c2 is used, as c2 sends its own (stale) state to the server during its own patch operation.
To ensure consistency and prevent data loss, the ObjectContexts instance should be cached on the Blob object, following the pattern used for the acl property in this class.
| def contexts(self): | |
| """Retrieve the object contexts for this object. | |
| :rtype: :class:`ObjectContexts` | |
| :returns: an instance for managing the object's contexts. | |
| """ | |
| info = self._properties.get("contexts", {}) | |
| return ObjectContexts.from_api_repr(info, self) | |
| @property | |
| def contexts(self): | |
| """Retrieve the object contexts for this object. | |
| :rtype: :class:`ObjectContexts` | |
| :returns: an instance for managing the object's contexts. | |
| """ | |
| if getattr(self, "_contexts", None) is None: | |
| info = self._properties.get("contexts", {}) | |
| self._contexts = ObjectContexts.from_api_repr(info, self) | |
| return self._contexts |
There was a problem hiding this comment.
I have updated the contexts property to use objectContexts internally, which aligns with the GCS JSON API field name. I've also implemented caching using self._contexts and updated _set_properties to clear the cache when properties are reloaded.
| def set_custom_context(self, key, value): | ||
| """Set a custom context. | ||
|
|
||
| :type key: str | ||
| :param key: The key of the custom context. | ||
|
|
||
| :type value: str | ||
| :param value: The value of the custom context. | ||
| """ | ||
| custom = self.get("custom") | ||
| if custom is None: | ||
| custom = {} | ||
| self["custom"] = custom | ||
| custom[key] = {"value": value} | ||
| self.blob._patch_property("contexts", self) |
There was a problem hiding this comment.
Triggering an immediate network request via _patch_property for every single context modification is inefficient. If a user needs to set multiple contexts, it results in multiple sequential PATCH requests.
Consider removing the immediate patch call from set_custom_context, delete_custom_context, and clear_custom_contexts. This would allow users to batch multiple changes and then call blob.patch() once, which is more consistent with how other properties like metadata are handled in this library.
There was a problem hiding this comment.
I have updated ObjectContexts to use the objectContexts field and enhanced from_api_repr to correctly parse createTime and updateTime timestamps from the API response into Python datetime objects.
Implemented the "Object Contexts" feature in the GCS Python SDK. - Added ObjectContexts class and contexts property to Blob. - Supported set, get, delete, and clear operations for custom contexts. - Updated list_blobs in Client and Bucket to support server-side filtering via filter_ parameter. - Added gRPC conversion support for object contexts. - Added unit and system tests. - Addressed code review feedback: renamed internal field to objectContexts, implemented caching for contexts property, and added timestamp support. Co-authored-by: nidhiii-27 <224584462+nidhiii-27@users.noreply.github.com>
This PR implements the "Object Contexts" feature in the GCS Python SDK, achieving parity with the Go and Java implementations.
Key changes:
ObjectContextsclass inblob.pyto manage custom key-value pairs.contextsproperty onBlobclass.filter_parameter inlist_blobsfor server-side filtering._grpc_conversions.pyto handle thecontextsfield.The
filter_parameter is added as a keyword-only argument to maintain backward compatibility and follow Python conventions (avoiding shadowing the built-infilterfunction).PR created automatically by Jules for task 7596931681947439780 started by @nidhiii-27