fix: Add option to bundle submit CLI and API to always perform S3 head check #957
Conversation
I guess one other option that doesn't seem to be mentioned is to just have the option do reset_s3_check_cache. If I have a bucket with data from many different scenes I've worked on, and I know a lot of the data in my bucket may have been removed by lifecycle rules or altered somehow, I think I could get into a state where I'd have to keep using this option until I think I'd covered all of the possible assets in my local S3 cache, versus the one-time "ok, just invalidate all of my locally recorded S3 hashes". This way does save even looking at the DB, which is nice though. Maybe there's a case longer term for both options?
Good suggestion - but deleting the database is more invasive, right? In the end, both would lead to an S3 head and then an upload. The current option does the head and updates the database with the current status. We can add this as another option too; both would solve this problem.
From our experience - several "3D artists" work on projects and use defined queues (which are tied to buckets). For us, these artists share common libraries with textures etc. for different projects. We often encounter the issue that especially these shared files expire - even though they are frequently used. So a single artist will not be aware of when shared files were uploaded, and it's difficult to gauge when files might be expired due to lifecycle policies. Right now we remind the artists to frequently clear their cache - nevertheless we experience frequent job aborts because of missing assets. While having a CLI command would be nice - right now just deleting the local cache folder does the same job, but it is a cumbersome experience. The strategy from this PR seems more reasonable, especially if a team works on projects with shared resources. The performance trade-off is okay for us.
- True: Skip cache, always do S3 HEAD to verify existence (skips integrity check)
- False/None: Use cache with periodic integrity sampling against S3 (default)
Attempted wording improvement, including phrase "S3 check cache" because that's what it's called in the code and on disk so easier to connect the dots.
Suggested change:
- True: Skip the S3 check cache, always check whether uploads are already in S3.
- False/None: Use the S3 check cache, with periodic integrity sampling against S3 (default)
Thx for the suggestion! Updated.
- True: Skip cache, always do S3 HEAD to verify existence
- False/None: Use cache, fall back to S3 HEAD if not in cache (default)
Suggested change:
- True: Skip the S3 check cache, always check whether uploads are already in S3.
- False/None: Use the S3 check cache, with periodic integrity sampling against S3 (default)
Thx for the suggestion, updated.
- True: Skip cache, always do S3 HEAD to verify existence (skips integrity check)
- False/None: Use cache with periodic integrity sampling against S3 (default)
Suggested change:
- True: Skip the S3 check cache, always check whether uploads are already in S3.
- False/None: Use the S3 check cache, with periodic integrity sampling against S3 (default)
Thx for the suggestion, updated.
Signed-off-by: David Leong <[email protected]>
| "Add...": "添加...", | ||
| "All": "全部", | ||
| "All fields below are optional": "以下所有字段均为可选", | ||
| "Always check S3 job attachments": "始终检查 S3 作业附件", |
Do we translate job attachments to 作业附件? Also curious why the simplified is 作业 while traditional is 任務.
nit - PR description seems to be outdated with previous names.
…-deadline#957) Signed-off-by: David Leong <[email protected]> Signed-off-by: Louise Fox <[email protected]>



Fixes: #634
Usage:
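For example (the bundle path below is a placeholder):

```bash
# Opt in for a single submission: always HEAD-check S3 job attachments.
deadline bundle submit ./my_job_bundle --verify-job-attachment-existence

# Or persist the behavior as a default for future submissions.
deadline config set settings.verify_job_attachment_existence true
```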
What was the problem/requirement? (What/Why)
When S3 Lifecycle policies delete job attachment files from the CAS bucket, the local s3_check_cache.db becomes stale. Subsequent job submissions trust the cached hash and skip uploading, resulting in jobs that reference non-existent S3 objects. Users had to manually delete ~/.deadline/cache/s3_check_cache.db to recover.

What was the solution? (How)
Added a --verify-job-attachment-existence CLI flag and corresponding settings.verify_job_attachment_existence config setting. When enabled, the submitter bypasses the local S3 check cache and always performs an S3 HEAD check before deciding whether to upload. This approach keeps all verification logic in the upload phase without coupling the HashCache and S3CheckCache.
What is the impact of this change?
How was this change tested?
See DEVELOPMENT.md for information on running tests.
Additional testing
Have you run the integration tests?
Have you made changes to the download or asset_sync modules? If so, then it is highly recommended that you ensure that the docker-based unit tests pass.

Was this change documented?
Does this PR introduce new dependencies?
This library is designed to be integrated into third-party applications that have bespoke and customized deployment environments. Adding dependencies will increase the chance of library version conflicts and incompatibilities. Please evaluate the addition of new dependencies. See the Dependencies section of DEVELOPMENT.md for more details.
Is this a breaking change?
No. This is an additive change with a new opt-in flag. Default behavior is unchanged.
Does this change impact security?
No. This change only adds an optional S3 HEAD verification step during upload. It does not modify file permissions, create new files/directories, or change authentication/authorization flows.
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
Design Proposal: HashCache Job Attachment Verification Option
Problem Statement
Currently, the HashCache and S3CheckCache operate independently during job submission:
HashCache (_process_input_path in upload.py:1053-1108): Looks up a file by (file_path, hash_algorithm). If found and mtime matches, it skips hashing entirely and uses the cached hash.

S3CheckCache (upload_object_to_cas in upload.py:639-680): During the upload phase, checks if the S3 key exists in the cache. If found, skips both the S3 HEAD request and the upload.

The Gap: When a file is in HashCache with a matching mtime, the system trusts the cached hash. But if the corresponding object was deleted from S3 (bucket cleanup, lifecycle policy, manual deletion), the S3CheckCache may be stale or the entry may have expired (30-day TTL). This results in a job referencing non-existent S3 objects.
Current Code Flow
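In pseudocode (the cache helper names below are illustrative, not the exact upload.py API):

```python
# Hashing phase (_process_input_path): trust the HashCache when mtime matches.
entry = hash_cache.get_entry(file_path, hash_algorithm)
if entry and entry.last_modified_time == current_mtime(file_path):
    file_hash = entry.file_hash                      # reused without re-hashing
else:
    file_hash = hash_file(file_path)                 # re-hash and refresh the cache

# Upload phase (upload_object_to_cas): trust the S3CheckCache when present.
if s3_check_cache.get_entry(cas_key(file_hash)):
    pass                                             # skip the HEAD request *and* the upload
elif not s3_head_exists(cas_key(file_hash)):
    upload_file(file_path, cas_key(file_hash))
```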
Proposed Solution
Add a --verify-job-attachment-existence CLI option (and corresponding API parameter) that performs an S3 HEAD check for files found in HashCache before trusting the cached hash.

New Flow with verify_job_attachment_existence=True
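In the same pseudocode style (illustrative helper names), the upload-phase decision becomes:

```python
if verify_job_attachment_existence:
    exists_in_s3 = s3_head_exists(cas_key(file_hash))    # always ask S3, ignore the cache
else:
    exists_in_s3 = bool(s3_check_cache.get_entry(cas_key(file_hash))) or s3_head_exists(
        cas_key(file_hash)
    )

if not exists_in_s3:
    upload_file(file_path, cas_key(file_hash))            # re-upload lifecycle-deleted objects
s3_check_cache.put_entry(cas_key(file_hash))               # record the freshly verified status
```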
Implementation Plan

1. CLI Option (bundle_group.py)

Add the --verify-job-attachment-existence option to the bundle submit command (see the CLI sketch under "CLI Integration" below).

2. API Parameter (_submit_job_bundle.py)

Add a verify_job_attachment_existence: bool = False parameter to create_job_from_job_bundle(). Pass it through to _upload_attachments(). For example:
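Apart from the new verify_job_attachment_existence parameter, the argument names below are assumptions about the existing signature:

```python
from deadline.client import api

job_id = api.create_job_from_job_bundle(
    job_bundle_dir="/path/to/job_bundle",   # assumed keyword name for the bundle location
    verify_job_attachment_existence=True,   # always HEAD-check S3 before trusting caches
)
print(f"Submitted {job_id}")
```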
3. Core Implementation (upload.py)

3.1 Modify S3AssetUploader.upload_object_to_cas()

Add parameter: verify_job_attachment_existence: bool = False

New logic:
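A minimal, self-contained sketch of that decision using a plain boto3 client; the real upload_object_to_cas() signature and cache helpers differ, and these function names are illustrative:

```python
import botocore.exceptions


def object_exists_in_s3(s3_client, bucket: str, key: str) -> bool:
    """HEAD the CAS object; True if it exists, False on 404/NotFound."""
    try:
        s3_client.head_object(Bucket=bucket, Key=key)
        return True
    except botocore.exceptions.ClientError as exc:
        if exc.response["Error"]["Code"] in ("404", "NotFound", "NoSuchKey"):
            return False
        raise


def should_upload(s3_client, bucket: str, key: str, s3_check_cache, verify: bool) -> bool:
    """Decide whether to upload one CAS object under the proposed flag semantics."""
    if verify:
        # Flag set: bypass the local S3 check cache and ask S3 directly, so a
        # stale cache entry can never mask an object removed by a lifecycle rule.
        return not object_exists_in_s3(s3_client, bucket, key)
    # Default behavior: trust the cache, fall back to a HEAD check on a cache miss.
    if s3_check_cache.get(key):
        return False
    return not object_exists_in_s3(s3_client, bucket, key)
```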
3.2 Modify S3AssetUploader.upload_input_files()

Pass verify_job_attachment_existence to upload_object_to_cas() calls.

3.3 Modify S3AssetManager.upload_assets()

Add the parameter and pass it through to upload_input_files().

4. Config File Setting (Optional Enhancement)
Add settings.verify_job_attachment_existence to allow setting this as a default behavior.

Design Decision
Selected: Option D - Always S3 HEAD in upload phase
This approach is the simplest and keeps all verification logic in the upload phase:
Why HashCache invalidation is NOT needed: The HashCache stores the mapping of (file_path, hash_algorithm) → (file_hash, mtime). This remains correct even when S3 objects are deleted - the local file hasn't changed, so its hash is still valid. The only stale cache is the S3CheckCache, which Option D bypasses entirely by doing the actual S3 HEAD check.

Alternative approaches rejected:
Files to Modify
- src/deadline/client/cli/_groups/bundle_group.py: add the --verify-job-attachment-existence CLI option
- src/deadline/client/api/_submit_job_bundle.py: add the verify_job_attachment_existence parameter, pass it to _upload_attachments
- src/deadline/client/api/_job_attachment.py: pass the flag through to the asset upload
- src/deadline/job_attachments/upload.py: update upload_object_to_cas, upload_input_files, and S3AssetManager.upload_assets to accept and use the flag

Performance Considerations

With the flag enabled, every input file whose upload would otherwise be skipped via the S3 check cache incurs one additional S3 HEAD request; the default behavior is unchanged.

Alternative Approaches Considered

- The existing periodic integrity sampling (verify_hash_cache_integrity samples 30 entries), but it only runs at upload time

Testing Strategy

- Unit tests covering upload_object_to_cas with verify_job_attachment_existence=True. For example:
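A sketch of such a test (pytest style with unittest.mock), exercising the illustrative should_upload() helper from the section 3.1 sketch rather than the real upload.py API:

```python
from unittest import mock

import botocore.exceptions


def _not_found_error() -> botocore.exceptions.ClientError:
    return botocore.exceptions.ClientError(
        {"Error": {"Code": "404", "Message": "Not Found"}}, "HeadObject"
    )


def test_verify_flag_bypasses_stale_cache():
    s3_client = mock.Mock()
    s3_client.head_object.side_effect = _not_found_error()   # object was lifecycle-deleted
    stale_cache = {"Data/abc123.xxh128": "cached"}            # cache still claims it exists

    # With verification enabled, the stale cache entry must not suppress the upload.
    assert should_upload(s3_client, "my-bucket", "Data/abc123.xxh128", stale_cache, verify=True)
    s3_client.head_object.assert_called_once()
```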
Persisted Configuration Design

Overview
In addition to the CLI flag --verify-job-attachment-existence, this setting can be persisted in the Deadline Cloud configuration file (~/.deadline/config). This allows users to enable verification by default without specifying the flag on every submission.

Precedence
When the CLI flag is provided, it overrides the config file setting. This follows the existing pattern used by --yes / settings.auto_accept.
Configuration File Changes

1. Add Setting to SETTINGS Dictionary (config_file.py)
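A sketch of the new entry; the exact SETTINGS schema is assumed from existing boolean settings such as settings.auto_accept:

```python
# Illustrative only: the real SETTINGS schema in config_file.py includes more
# fields (section mapping, descriptions, etc.) than shown here.
SETTINGS = {
    # ... existing entries such as "settings.auto_accept" ...
    "settings.verify_job_attachment_existence": {
        "default": "false",  # opt-in: default submission behavior is unchanged
    },
}
```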
CLI Integration (bundle_group.py)

The CLI flag behavior:
- If --verify-job-attachment-existence is provided → use True
- Otherwise → fall back to the settings.verify_job_attachment_existence config setting
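A click-based sketch of that fallback; the option wiring and the config helper below are illustrative, not the actual bundle_group.py code:

```python
import click


def _config_setting_is_true(name: str) -> bool:
    """Stand-in for reading a boolean setting from ~/.deadline/config."""
    return False  # the real client would consult the persisted config here


@click.command(name="submit")
@click.option(
    "--verify-job-attachment-existence",
    is_flag=True,
    default=None,  # None = flag not given on the command line
    help="Always HEAD-check S3 for job attachments instead of trusting the local cache.",
)
def bundle_submit(verify_job_attachment_existence):
    # CLI flag wins; otherwise fall back to the persisted config setting.
    if verify_job_attachment_existence is None:
        verify_job_attachment_existence = _config_setting_is_true(
            "settings.verify_job_attachment_existence"
        )
    click.echo(f"verify_job_attachment_existence={verify_job_attachment_existence}")
```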
GUI Integration (Optional)

Add a checkbox in DeadlineConfigDialog under "General settings".

Config File Example
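For example (the section layout of ~/.deadline/config is an assumption here):

```ini
[settings]
verify_job_attachment_existence = true
```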
Files to Modify (Additional)
- src/deadline/client/config/config_file.py: add settings.verify_job_attachment_existence to the SETTINGS dict
- src/deadline/client/cli/_groups/bundle_group.py: fall back to the config setting when the flag is not given
- src/deadline/client/ui/dialogs/deadline_config_dialog.py: optional checkbox in the settings dialog

User Workflow

- Enable by default: deadline config set settings.verify_job_attachment_existence true
- Use --no-verify-job-attachment-existence (if implemented) to disable verification for a single submission

Alternative: --no-verify-job-attachment-existence Flag

To allow users to disable verification when the config is enabled, consider adding a negation flag:
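A sketch using click's paired on/off flag form; the surrounding command wiring is omitted (see the CLI Integration sketch above):

```python
@click.option(
    "--verify-job-attachment-existence/--no-verify-job-attachment-existence",
    "verify_job_attachment_existence",
    default=None,  # None = neither flag given, defer to settings.verify_job_attachment_existence
    help="Force the S3 existence check on or off for this submission, overriding the config setting.",
)
```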
This provides full flexibility:
- Config false + no flag → false
- Config false + --verify-job-attachment-existence → true
- Config true + no flag → true
- Config true + --no-verify-job-attachment-existence → false