Description
2025-10-24 13:40:58,654 ERROR: Error occurred during cycle: 'NoneType' object has no attribute 'lower'
Traceback (most recent call last):
File "/global/homes/n/nmdcda/nmdc_automation/prod/nmdc_automation/nmdc_automation/workflow_automation/watch_nmdc.py", line 477, in watch
self.cycle()
File "/global/homes/n/nmdcda/nmdc_automation/prod/nmdc_automation/nmdc_automation/workflow_automation/watch_nmdc.py", line 434, in cycle
successful_jobs, failed_jobs = self.job_manager.get_finished_jobs()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/global/homes/n/nmdcda/nmdc_automation/prod/nmdc_automation/nmdc_automation/workflow_automation/watch_nmdc.py", line 235, in get_finished_jobs
if status.lower() == "succeded":
^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'lower'
I believe this comes from a combination of status: done and result: null values from JAWS. These come from an edge case where JAWS purged our inputs.json before the runs were started. See https://code.jgi.doe.gov/dsi/advanced-analysis/jaws/jaws-support/-/issues/338
example:
jaws status 137411
{
"compute_site_id": "nmdc",
"cpu_hours": null,
"cromwell_run_id": null,
"id": 137411,
"input_site_id": "nmdc",
"json_file": "/tmp/tmpkw0k3ebi.json",
"output_dir": null,
"result": null,
"status": "done",
"status_detail": "The run is complete.",
"submitted": "2025-10-01 03:05:42",
"tag": "nmdc:omprc-11-weea4z31/nmdc:wfmag-11-pp2xkf68.1",
"team_id": "nmdc",
"updated": "2025-10-16 15:05:56",
"user_id": "nmdcda",
"wdl_file": "/tmp/tmptx3y8ksc/tmp7lqghtzf.wdl",
"workflow_name": null,
"workflow_root": null
}
jaws log 137411
#STATUS_FROM STATUS_TO TIMESTAMP COMMENT
created upload queued 2025-10-01 03:06:00
upload queued upload complete 2025-10-01 03:06:00
upload complete ready 2025-10-01 03:06:28
ready submission failed 2025-10-16 15:05:25 File not found: /pscratch/sd/n/nmjaws/nmdc-prod/inputs/c09f5922-4549-43d1-b997-5aede600c913.json
submission failed slack succeeded 2025-10-16 15:05:44
slack succeeded done 2025-10-16 15:05:56
The immediate fix would be to count these runs as 'failed' with respect to nmdc_automation, increment the failure count, and run jaws submit. jaws resubmit should not be used in this case because something went wrong with the original submission.
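A minimal sketch of that classification, assuming the record is the parsed JSON from jaws status as in the example above. The function name classify_jaws_run is hypothetical, and the result values "succeeded" and "failed" are assumptions, not confirmed against the JAWS API or the current watch_nmdc.py logic:

```python
from typing import Any, Mapping


def classify_jaws_run(record: Mapping[str, Any]) -> str:
    """Return "pending", "succeeded", or "failed" for one JAWS status record.

    Hypothetical helper: field names come from the `jaws status` example
    in this issue; the result strings are assumed.
    """
    # `or ""` guards against JSON null, which is what raised the AttributeError.
    status = str(record.get("status") or "").lower()
    result = str(record.get("result") or "").lower()

    if status != "done":
        return "pending"
    if result == "succeeded":
        return "succeeded"
    # Covers an explicit failure and the done + null edge case alike.
    return "failed"
```

With this shape, a done + null record lands in the failed bucket, so the existing failure count and fresh jaws submit path can pick it up.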
A longer-term fix would be to add support for checking jaws log for more advanced debugging of job failures.
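As a rough illustration of what such a check could look like, the sketch below scans captured jaws log text for a transition into "submission failed" and returns its comment. The column layout is assumed from the pasted example only (state names, then a timestamp, then an optional comment), and find_submission_failure is a hypothetical name:

```python
import re
from typing import Optional

# Assumed line shape: "<from state> <to state> YYYY-MM-DD HH:MM:SS [comment]".
_LOG_LINE = re.compile(
    r"^(?P<transition>.+?)\s+"
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})"
    r"(?:\s+(?P<comment>.*))?$"
)


def find_submission_failure(log_text: str) -> Optional[str]:
    """Return the comment of the first transition into "submission failed".

    Returns None when no such transition is present, and an empty string
    when the transition has no comment.
    """
    for line in log_text.splitlines():
        match = _LOG_LINE.match(line.strip())
        if match is None:
            continue  # header line or unrecognized format
        # The "to" state is the tail of the transition text.
        if match.group("transition").endswith("submission failed"):
            return match.group("comment") or ""
    return None
```

For run 137411 this would surface the "File not found" comment, which distinguishes a purged inputs.json from other kinds of failure.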
We have 518 JAWS submissions in this state (done + null), so we need to implement something quickly. My suggestion is to hotfix production.
This doesn't kill the watcher, but based on the logs it does kill the polling cycle, so also consider exception handling.
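One way to keep a single bad record from aborting the whole cycle is to isolate the per-job evaluation in its own try/except. This is a sketch under assumptions: split_finished_jobs and the classify callable are hypothetical and do not reflect the actual get_finished_jobs signature in watch_nmdc.py:

```python
import logging
from typing import Callable, Iterable, List, Tuple

logger = logging.getLogger(__name__)


def split_finished_jobs(
    job_ids: Iterable[int],
    classify: Callable[[int], str],
) -> Tuple[List[int], List[int]]:
    """Sort jobs into (successful, failed), skipping jobs that raise.

    `classify` is a caller-supplied function (assumed here) that returns
    "succeeded", "failed", or anything else for a job that is not finished.
    """
    successful: List[int] = []
    failed: List[int] = []
    for job_id in job_ids:
        try:
            outcome = classify(job_id)
        except Exception:
            # Log with traceback and move on so the rest of the cycle runs.
            logger.exception("Could not evaluate job %s; skipping", job_id)
            continue
        if outcome == "succeeded":
            successful.append(job_id)
        elif outcome == "failed":
            failed.append(job_id)
    return successful, failed
```

A skipped job is simply retried on the next cycle, which trades a noisy log for a watcher that keeps making progress on the other jobs.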