Skip to content

Conversation

@kysrpex
Copy link
Contributor

@kysrpex kysrpex commented Jun 30, 2025

Introduce a JSON transfer action that indicates that the pulsar server should create a JSON manifest that can be used to stage files by an external system that can stage files in and out of the compute environment. This should be the general strategy for collecting input and output files for ARC, DIRAC, AWS batch etc.

Before launching jobs, Pulsar clients receive the staging manifest as a dictionary. When the job is complete, the script pulsar-create-output-manifest can be used to create a manifest declaring which files should be staged out by the external system.

This PR also updates the coexecutor Dockerfile so that it works again and so that it builds the Pulsar wheel (no need to build it separately anymore).

This should be the general strategy for collecting input and output
files for ARC, DIRAC, AWS batch etc.
@kysrpex
Copy link
Contributor Author

kysrpex commented Jun 30, 2025

@mvdbeek That's all I needed to build the ARC integration (coming soon as a different PR). You may want to compare it to your branch offline_connector_marius. I stripped out the Pulsar client (that comes in the other PR) and a couple of things that were not needed to complete the integration.

mvdbeek and others added 6 commits June 30, 2025 17:07
Change base image from `conda/miniconda3` (based off Debian Stretch) to `python:3.12-bookworm`. Miniconda is not required in the base image.

Add the Galaxy Depot repository, which provides SLURM DRMAA packages for Debian Buster and newer releases.

Do not install the package `apt-transport-https`, it is now a dummy package, see https://packages.debian.org/en/bookworm/apt-transport-https. Install the package `slurm` instead of `slurm-llnl`.

Newer versions of the `munge` package include the binary `/usr/sbin/mungekey` instead of `/usr/sbin/create-munge-key`. Nevertheless, the key seems to be created automatically when installing the package, as running `mungekey` yields 'mungekey: Error: Failed to create "/etc/munge/munge.key": File exists'.
Build wheel automatically when building the Docker image. Exclude the source code from the output image through a multistage build.
…oexecutionLaunchMixin` to `BaseRemoteConfiguredJobClient`
@kysrpex kysrpex force-pushed the offline_collector branch from 1e1a3c7 to 7c8371a Compare June 30, 2025 15:07
@kysrpex
Copy link
Contributor Author

kysrpex commented Jun 30, 2025

If this one is fine, then #370 should be closed.

@@ -1,41 +1,60 @@
FROM conda/miniconda3
FROM python:3.12-bookworm
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jmchilton commented:

I don't think it is fair at all to say "Miniconda" is not required in the base image as the commit message suggests. I get that it isn't the modality that you wish to run it in but it is documented in https://pulsar.readthedocs.io/en/latest/containers.html#co-execution as an option and it is an option that makes a lot of sense to me. Is the slurm stuff required for your use case?

Copy link
Contributor Author

@kysrpex kysrpex Jul 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it is fair at all to say "Miniconda" is not required in the base image as the commit message suggests.

I am assuming "Miniconda" is not required in the base image because I saw line 60 (the last line): RUN _pulsar-conda-init --conda_prefix=/pulsar_dependencies/conda. It looks like it installs Miniconda, thus it should make it to the Docker image without needing to put it in the base image. But I am not familiar with Pulsar so I am not sure if line 60 actually has an effect equivalent to including it in the base image.

Copy link
Contributor Author

@kysrpex kysrpex Jul 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About the SLURM stuff, it is not needed for this use case but it was already in the old Dockerfile (it is needed for other use cases).

    # Install packages
    && apt-get update \
    && apt-get install -y --no-install-recommends gcc \
        libcurl4-openssl-dev \
        cvmfs cvmfs-config-default \
        slurm-llnl slurm-drmaa-dev \
        bzip2 \

Because I am changing the base image to Debian Bookworm, the changes are required not to break the build with the new base image (that's why the whole commit has the title "Change base image in coexecutor Dockerfile"). slurm-drmaa1 is not available on the Debian repos anymore for Bookworm (it has to be installed from the Galaxy Depot). The latest version of slurm-drmaa1 available in the Galaxy depot (1.1.4) requires a library libslurm36 from Debian Bullseye. There exists a new version 1.1.5 of libslurm-drmaa1 on GitHub that maybe works on Bookworm without libraries from Bullseye, but @natefoo has not published a Debian package yet.

kysrpex added 3 commits July 21, 2025 17:06
…Collector`

At the moment, JSON staging and outputs manifests are constructed by tracking all actions mapped by the `FileActionMapper` using a list `FileActionMapper.actions`. This makes the `FileActionMapper` stateful, requires including `file_type` as keyword argument for `BaseAction` and its children, requires defining a finalize()` method for `FileActionMapper` and for `BaseAction` and its children.

Paying the small price of refactoring `JsonTransferAction`, generate the staging manifest from `FileStager.transfer_tracker.remote_staging_actions` and the output manifest as `ResultsCollector` collects the outputs.
Set `JsonTransferAction.whole_directory_transfer_supported` to `False`, as the job files API is not capable of serving directories.
Using `basename(action.path)` creates a flat structure for each file type (e.g. job_directory/unstructured/human.fa), but Pulsar expects tree structures to work (e.g. job_directory/unstructured/f0d0164494db6cbf92c12aeb6119ac38/bwa/human.fa).
@kysrpex kysrpex force-pushed the offline_collector branch from 1480061 to 496185e Compare July 21, 2025 15:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants