Skip to content

Features/hpcdatamgm 2097 version2#155

Open
saradachintala wants to merge 29 commits intoreleases/3.24.0from
features/HPCDATAMGM-2097-version2
Open

Features/hpcdatamgm 2097 version2#155
saradachintala wants to merge 29 commits intoreleases/3.24.0from
features/HPCDATAMGM-2097-version2

Conversation

@saradachintala
Copy link
Copy Markdown
Contributor

@saradachintala saradachintala commented Mar 17, 2026

This Pull Request contains the code for downloading a single DataObject and a single Collection from an External Archive to a Globus Destination.

The tickets associated with this PR are
HPCDATAMGM-2097: Implement External Download API for downloading a dataObject to Globus
HPCDATAMGM-2168: Implement Download API to download a dataObject from an External Archive to Globus
HPCDATAMGM-2172: Implement Download API to download a Collection from an External Archive to Globus

Copilot AI review requested due to automatic review settings March 17, 2026 21:34
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds support for downloading data objects and collections from an external archive into a Globus destination by introducing new “external source” download endpoints and propagating an externalArchiveFlag/RECEIVED_EXTERNAL state through the transfer/task pipeline.

Changes:

  • Added REST API endpoints for external-source downloads (collection, data object, and bulk).
  • Extended schemas/types to include externalArchiveFlag and added a new RECEIVED_EXTERNAL status for download task workflows.
  • Updated scheduler + bus/app/DAO layers to persist and process the new external-archive download flow.

Reviewed changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated 14 comments.

Show a summary per file
File Description
src/hpc-server/hpc-ws-rs-impl/src/main/java/gov/nih/nci/hpc/ws/rs/impl/HpcDataManagementRestServiceImpl.java Wires new REST methods to bus-service methods for external-source downloads.
src/hpc-server/hpc-ws-rs-api/src/main/java/gov/nih/nci/hpc/ws/rs/HpcDataManagementRestService.java Adds new /ext/.../download REST endpoints.
src/hpc-server/hpc-scheduler/src/main/resources/WEB-INF/spring/hpc-scheduler.properties Adds cron property for archive-link download task starter.
src/hpc-server/hpc-scheduler/src/main/java/gov/nih/nci/hpc/scheduler/impl/HpcScheduledTasksImpl.java Adds scheduled task to start processing RECEIVED_EXTERNAL data object downloads.
src/hpc-server/hpc-dto/src/main/resources/schema/HpcDataManagement.v2.xsd Adds optional externalArchiveFlag to download request DTOs.
src/hpc-server/hpc-domain-types/src/main/resources/schema/HpcDataTransferTypes.xsd Adds RECEIVED_EXTERNAL statuses and externalArchiveFlag elements to transfer domain types.
src/hpc-server/hpc-dao-impl/src/main/scripts/migration/release-3.20.0/hpc_data_object_download_task_update.sql Adds DB column for external archive flag (currently only for data object tasks).
src/hpc-server/hpc-dao-impl/src/main/java/gov/nih/nci/hpc/dao/oracle/impl/HpcDataDownloadDAOImpl.java Persists/loads EXTERNAL_ARCHIVE_FLAG and updates SQL for tasks.
src/hpc-server/hpc-bus-service-impl/src/main/java/gov/nih/nci/hpc/bus/impl/HpcSystemBusServiceImpl.java Processes external collection tasks and adds RECEIVED_EXTERNAL handling for data object tasks.
src/hpc-server/hpc-bus-service-impl/src/main/java/gov/nih/nci/hpc/bus/impl/HpcDataManagementBusServiceImpl.java Implements external-source download methods and propagates the external flag into transfer service calls.
src/hpc-server/hpc-bus-service-api/src/main/java/gov/nih/nci/hpc/bus/HpcSystemBusService.java Exposes startArchiveLinkDataObjectDownloadTasks() in the system bus API.
src/hpc-server/hpc-bus-service-api/src/main/java/gov/nih/nci/hpc/bus/HpcDataManagementBusService.java Exposes new external-source download methods in the bus API.
src/hpc-server/hpc-app-service-impl/src/test/java/gov/nih/nci/hpc/service/impl/HpcDataTransferServiceImplTest.java Updates tests for changed downloadDataObject signature.
src/hpc-server/hpc-app-service-impl/src/main/java/gov/nih/nci/hpc/service/impl/HpcDataTransferServiceImpl.java Adds externalArchiveFlag plumbing, new external-status transition method, and status handling in resets.
src/hpc-server/hpc-app-service-impl/src/main/java/gov/nih/nci/hpc/service/impl/HpcDataManagementServiceImpl.java Adds lookup for S3 archive configuration by external POSIX path.
src/hpc-server/hpc-app-service-impl/src/main/java/gov/nih/nci/hpc/service/impl/HpcDataManagementConfigurationLocator.java Exposes S3 archive configurations for the external-path lookup.
src/hpc-server/hpc-app-service-api/src/main/java/gov/nih/nci/hpc/service/HpcDataTransferService.java Extends service signatures and adds new external-status transition method.
src/hpc-server/hpc-app-service-api/src/main/java/gov/nih/nci/hpc/service/HpcDataManagementService.java Adds API to find transfer configuration for an external path.
Comments suppressed due to low confidence (1)

src/hpc-server/hpc-app-service-impl/src/main/java/gov/nih/nci/hpc/service/impl/HpcDataManagementServiceImpl.java:79

  • There are duplicate imports for gov.nih.nci.hpc.domain.model.HpcDataTransferConfiguration (imported twice). Please remove the duplicate to keep the imports clean and avoid potential style/lint failures.
import gov.nih.nci.hpc.domain.model.HpcDataManagementConfiguration;
import gov.nih.nci.hpc.domain.model.HpcDataTransferConfiguration;
import gov.nih.nci.hpc.domain.model.HpcDataObjectRegistrationRequest;
import gov.nih.nci.hpc.domain.model.HpcDataObjectRegistrationResult;
import gov.nih.nci.hpc.domain.model.HpcDataTransferConfiguration;
import gov.nih.nci.hpc.domain.model.HpcStorageRecoveryConfiguration;

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

saradachintala and others added 13 commits March 17, 2026 17:46
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Copy link
Copy Markdown
Collaborator

@dinhys dinhys left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've left a few comments, we can review it in our Dev meeting today.

ALTER TABLE HPC_COLLECTION_DOWNLOAD_TASK add (
EXTERNAL_ARCHIVE_FLAG CHAR default '0'
);
COMMENT ON COLUMN HPC_COLLECTION_DOWNLOAD_TASK.EXTERNAL_ARCHIVE_FLAG IS 'Indicates if the data is stored externally';
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to retain this EXTERNAL_ARCHIVE_FLAG in HPC_DOWNLOAD_TASK_RESULT?

globusDownloadDestination, s3DownloadDestination, googleDriveDownloadDestination,
googleCloudStorageDownloadDestination, asperaDownloadDestination, boxDownloadDestination, false,
false, userId, retryItem.getDestinationLocation(), collectionDownloadTaskId);
false, userId, retryItem.getDestinationLocation(), collectionDownloadTaskId, false);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just confirming whether collection retries should be disabled for download from external archive.

downloadTask.setExternalArchiveFlag(true);
dataTransferService.updateCollectionDownloadTask(downloadTask);
}

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Process Collection Download task will now process all RECEIVED_EXTERNAL records first before processing RECEIVED. Is this ok if we have a big backlog for RECEIVED_EXTERNAL?

* {@code "pathWithPosixPathRemoved"}.
* @throws HpcException on service failure or invalid configuration.
*/
private Map<String, String> externalDownloadDetailsMapping(HpcDataTransferConfiguration s3ArchiveConfiguration, String path) throws HpcException {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we name this with a verb getExternalDownloadDetailsMapping or Config instead of Detail? Just a thought.

directoryScanRegistrationItem.setS3ArchiveConfigurationId(s3ArchiveConfiguration.getId());
HpcBulkDataObjectRegistrationRequestDTO registrationBulkRequestDTO = new HpcBulkDataObjectRegistrationRequestDTO();
registrationBulkRequestDTO.getDirectoryScanRegistrationItems().add(directoryScanRegistrationItem);
registerDataObjects(registrationBulkRequestDTO);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need anything from the reponseDTO like task ID?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants