Tree node listing helpers + server-authoritative S3 tagging#403
Pull request overview
This PR adds configurable file upload options to support S3-compatible storage systems that may not fully implement all S3 features, particularly object tagging. The implementation introduces a FilesConfig class for runtime configuration with three options: useS3Tagging (to disable S3 tagging headers for incompatible storage), maxMultipartRetries (configurable retry count), and fileUploadTimeoutMs (configurable timeout).
- Introduces `FilesConfig` class with static configuration pattern for file upload settings
- Refactors `DirectUploadClient` constructor to accept a configuration object instead of a plain number
- Implements conditional S3 tagging header inclusion based on configuration
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| `test/unit/files/FilesConfig.test.ts` | New test suite covering `FilesConfig` initialization and configuration retrieval |
| `test/unit/files/DirectUploadClient.test.ts` | Updated tests to verify S3 tagging behavior and use the new config object pattern |
| `src/files/infra/clients/DirectUploadClient.ts` | Implements the `DirectUploadClientConfig` interface and conditional S3 tagging in single-part uploads |
| `src/files/index.ts` | Adds `FilesConfig` class with lazy initialization pattern for `DirectUploadClient` and `UploadFile` instances |
| `CHANGELOG.md` | Documents new features and breaking changes |
FYI - there's already a `dataverse.files.<driverId>.disable-tagging` setting - is that the same as the new one here?
If there is one already, then we do not need the new one. I know there is one in Java; I did not know there was one in JavaScript too. I made this PR a draft; it is part of a bigger idea: reusing the SPA parts in other UIs. I think it is something for the tech hours to discuss first. There are three other PRs coming to illustrate what I mean (one in the new SPA, one in dvwebloader, and one in the JSF frontend). The dvwebloader that uses the SPA file upload can function without these PRs ever being merged, and the dvwebloader one is also optional (it is something we want to use at KU Leuven, but it is fine if it is not mainstream).
A tech hour makes sense for the bigger picture. re: the tagging - the store is configured on the server and the signed URLs either will/won't allow a tag header based on that (calls will fail if they don't match the server setting). Do we need an API so you can discover that setting? (We've recently been adding more settings to the storageDriver GET API). |
The `x-amz-tagging` header value is now taken from `destination.tagging`, which is populated by the server when tagging is enabled. This removes the client-side `FilesConfig`/`useS3Tagging` flag that duplicated the backend `DISABLE_S3_TAGGING` JVM setting and would drift silently.

- `FileUploadDestination`: add `tagging?: string`
- `fileUploadDestinationsTransformers`: pass `tagging` through both paths
- `DirectUploadClient`: use `destination.tagging` as the header value; remove the `useS3Tagging` field and config option
- `files/index.ts`: remove the `FilesConfig` class and lazy-init pattern; `uploadFile` is now a plain `UploadFile` instance
- `package.json`: scope the prettier/eslint scripts to `./src` to avoid permission errors scanning `test/environment/docker-dev-volumes`
This reverts commit 5a9f204.
I reverted the template functional test isolation change from this PR to keep the upload PR scoped. The failing test appears to be shared-state flakiness. That commit can be reused in a separate PR dedicated to functional test isolation.
`npm run format` and `npm run lint:eslint` traverse the whole repo by default, which fails on systems where `test/environment/docker-dev-volumes` contains directories owned by container users (e.g. solr/data) and is not readable by the developer's UID. The pre-commit hook then aborts. Narrowing the globs to `./src` keeps the formatters and linters running on what we actually care about — application source — and lets the pre-commit hook succeed regardless of how container volumes are provisioned.
Adds `DataverseApiAuthMechanism` to the existing `core/index.ts` re-export alongside `ApiConfig` so consumers don't have to deep-import it from `@iqss/dataverse-client-javascript/dist/core/infra/repositories/ApiConfig`. This is the SDK side of a small two-line change agreed with the dataverse-frontend reusable-components track: once a prerelease ships this export, the standalone uploader can import the enum from the package's public surface. Until then, consumers can keep the deep import. Non-breaking additive change.
New use cases backing the paginated dataset version tree endpoint:
GET /api/datasets/{id}/versions/{versionId}/tree
- listDatasetTreeNode: single-page lookup. Accepts path, limit,
cursor, include (all/folders/files), order (NameAZ/NameZA),
includeDeaccessioned, originals.
- iterateDatasetTreeNode: async generator that walks the cursor
chain so callers can consume one folder's children without
driving pagination by hand.
Wire format mirrors the backend response 1:1 (folder items carry
optional `counts`, file items add id/size/contentType/access/
checksum/downloadUrl). Order/include parsing falls back to
defaults on unknown values for forward-compat.
Includes Jest unit tests for the use cases and the transformer.
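The cursor-walking pattern behind `iterateDatasetTreeNode` can be sketched as follows; `TreePage`, `fetchPage`, and `iterateTreeNode` are illustrative stand-ins, not the SDK's actual exports:

```typescript
// Minimal sketch of the cursor-walking pattern behind iterateDatasetTreeNode.
// TreePage, fetchPage, and iterateTreeNode are illustrative names only.
interface TreePage {
  items: string[]
  nextCursor?: string // opaque keyset cursor; absent on the last page
}

// Stand-in for the single-page use case (listDatasetTreeNode): returns one
// page of a folder's children plus a cursor for the next page, if any.
async function fetchPage(cursor?: string): Promise<TreePage> {
  const pages: Record<string, TreePage> = {
    start: { items: ['docs/', 'data/'], nextCursor: 'p2' },
    p2: { items: ['readme.md'] }
  }
  return pages[cursor ?? 'start']
}

// Async generator that walks the cursor chain so callers never drive
// pagination by hand: they just consume items.
async function* iterateTreeNode(): AsyncGenerator<string> {
  let cursor: string | undefined
  do {
    const page = await fetchPage(cursor)
    for (const item of page.items) yield item
    cursor = page.nextCursor
  } while (cursor !== undefined)
}

async function main(): Promise<void> {
  const seen: string[] = []
  for await (const item of iterateTreeNode()) seen.push(item)
  console.log(seen.join(',')) // → docs/,data/,readme.md
}

main()
```

Callers that only need the first page can break out of the `for await...of` loop early; the generator stops fetching as soon as consumption stops.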
PR #12182 merged on dataverse develop and moved the per-collection
storage-driver endpoint:
OLD: PUT /api/admin/dataverse/{alias}/storageDriver
NEW: PUT /api/dataverses/{alias}/storageDriver
The CI integration tests on PR #403 now run against a Dataverse
container that includes the move, so setStorageDriverViaApi was
hitting the old admin path and getting 404, which cascaded into
every dataset/file test that depends on the directUploadTestCollection
having LocalStack as its storage driver.
Fix: update setStorageDriverViaApi to use the new public endpoint.
The endpoint still requires X-Dataverse-Key for write operations
(superuser only), so authentication is unchanged.
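For illustration, a minimal stand-in for the updated helper, showing only the path and header that changed; the function name and request shape here are hypothetical, not the real test utility:

```typescript
// Illustrative placeholder token; the real tests read a superuser key
// from the test environment.
const DATAVERSE_API_KEY = 'dummy-superuser-key'

// Builds the request for the moved endpoint. After IQSS/dataverse#12182 the
// per-collection storage-driver endpoint lives under /api/dataverses/...
// (old path: /api/admin/dataverse/{alias}/storageDriver).
function buildSetStorageDriverRequest(
  baseUrl: string,
  alias: string,
  driver: string
): { url: string; method: string; headers: Record<string, string>; body: string } {
  return {
    url: `${baseUrl}/api/dataverses/${alias}/storageDriver`,
    method: 'PUT',
    // Write operations still require the superuser API token header.
    headers: { 'X-Dataverse-Key': DATAVERSE_API_KEY },
    body: driver
  }
}

const req = buildSetStorageDriverRequest(
  'http://localhost:8080',
  'directUploadTestCollection',
  'LocalStack'
)
console.log(req.url)
// → http://localhost:8080/api/dataverses/directUploadTestCollection/storageDriver
```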
- `docs/useCases.md`: add 'List a Folder of a Dataset Version (Tree View)' and 'Iterate a Folder of a Dataset Version (Tree View)' under Datasets read use cases, with example calls and notes on cursor / ETag / ordering. Adds matching TOC entries.
- `CHANGELOG.md` (Unreleased): add a one-line note about re-exporting `DataverseApiAuthMechanism` from the public surface so the standalone reusable-component bundles can import it without a deep path.
What this PR does / why we need it:
Brings the `dataverse-client-javascript` SDK in line with the server-authoritative S3 tagging design and adds two new SDK surfaces that the upcoming reusable-components / tree-view track will consume.
1. Server-authoritative S3 tagging
Removes `FilesConfig` and the client-side `useS3Tagging` flag. The `x-amz-tagging` header value is now read from the upload destination response (`FileUploadDestination.tagging`), which is populated by the server based on its `dataverse.files.<driverId>.disable-tagging` JVM setting.

This fixes a class of silent upload failures: `x-amz-tagging` is part of the presigned URL signature, so the client must send the header exactly when the server included it in the signature. A client-side boolean flag that has to stay in sync with a server JVM setting drifts in practice. The server is now the single source of truth.

Pairs with a small additive change in the `dataverse` repo (`S3AccessIO.generateTemporaryS3UploadUrls` adds the `tagging` field to its JSON response when tagging is enabled).
2. Tree node listing helpers (#6691 prep)
Two new use cases backing the upcoming `GET /api/datasets/{id}/versions/{versionId}/tree` endpoint:

- `listDatasetTreeNode`: single-page lookup. Accepts `path`, `limit`, `cursor`, `include` (all/folders/files), `order` (`NameAZ`/`NameZA`), `includeDeaccessioned`, `originals`.
- `iterateDatasetTreeNode`: async generator that walks the cursor chain so callers can consume all children of a folder without driving pagination by hand.

Wire format (folders first, then files; opaque keyset cursor; per-file `downloadUrl`) is documented in the matching backend PR's Sphinx section. The transformer falls back to defaults on unknown `order`/`include` values for forward-compat.

3. Public re-export of `DataverseApiAuthMechanism`

Adds `DataverseApiAuthMechanism` to `core/index.ts` alongside `ApiConfig`, so consumers can import the enum from the public package surface instead of the deep path `@iqss/dataverse-client-javascript/dist/core/infra/repositories/ApiConfig`. Non-breaking additive change.
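The forward-compat fallback mentioned for the tree helpers (unknown `order`/`include` values mapping to defaults) can be sketched as follows; `TreeNodeOrder` and `parseOrder` are illustrative names, not the SDK's actual exports:

```typescript
// Illustrative enum mirroring the wire values named in this PR.
enum TreeNodeOrder {
  NameAZ = 'NameAZ',
  NameZA = 'NameZA'
}

// Forward-compat parsing: a missing or unknown wire value falls back to the
// default ordering instead of throwing, so an older SDK build keeps working
// if the server later adds new order values.
function parseOrder(value: string | undefined): TreeNodeOrder {
  return Object.values(TreeNodeOrder).includes(value as TreeNodeOrder)
    ? (value as TreeNodeOrder)
    : TreeNodeOrder.NameAZ
}

console.log(parseOrder('NameZA')) // known value passes through → NameZA
console.log(parseOrder('SizeDesc')) // unknown value falls back → NameAZ
```

The same pattern would apply to the `include` parameter (all/folders/files).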
4. CI test-helper update for IQSS/dataverse#12182
The integration test helper `setStorageDriverViaApi` was using the old `/api/admin/dataverse/{alias}/storageDriver` endpoint that PR #12182 moved to `/api/dataverses/{alias}/storageDriver`. Updated.

5. Build hygiene

The `format` and `lint` npm scripts are scoped to `./src` so `npm run format` and the husky pre-commit hook don't fail on systems where `test/environment/docker-dev-volumes` is owned by container users (typical local-Docker setups).
Which issue(s) this PR closes:
Prepares the SDK side of:
Related Dataverse PRs:
The `dataverse` change (`S3AccessIO` adds the `tagging` field) is landing alongside this on the `IQSS/dataverse#6691_reusable_components` branch; this PR can merge before or after that one (the field is optional).
The `dataverse-frontend` consumer is on `feature/standalone-file-uploader`.

Special notes for your reviewer:

`FilesConfig` is removed, not deprecated. There were no released versions exposing it as a public API; it lived briefly on this branch only. CHANGELOG covers it.
The tree-node use cases do not depend on the backend endpoint being deployed; they will return whatever the server returns, with defaults on unknown enum values. The frontend repository in the consumer PR has a fallback path for installations without the endpoint.
`iterateDatasetTreeNode` is a `for await...of` generator, not a list. Callers control consumption.
Disclosure
Implementation was iterated with Claude Code (Anthropic) in a
pair-with-AI workflow; reviewers should evaluate accordingly. All
commits are authored by a human reviewer.
Suggestions on how to test this:
- `npm run test:unit`: covers the transformer, both tree-node use cases, and the updated `DirectUploadClient`/`FilesRepository`.
- `npm run test:integration`: exercises single-part and multipart upload against the LocalStack-backed container, including the tagging behaviour.
- Point the `dataverse-frontend` standalone uploader at a prerelease of this package and verify file upload works against a Dataverse with `dataverse.files.<driverId>.disable-tagging=true` (no tagging header expected) and `=false` / unset (header expected).

Is there a release notes or changelog update needed for this change?:
Yes.
`CHANGELOG.md` has an "Unreleased" entry covering the tree-node helpers and the `DataverseApiAuthMechanism` re-export. The server-authoritative tagging change is documented inline in the `Removed` section of the prior `FilesConfig` work.

Additional documentation:

- `src/datasets/index.ts`
- `dataverse-context/tree_view_plan.md` (working doc, not part of this PR)