Fix #19: add dataset collection support and robust collection detection #26

ldalcolmo · 2026-01-02T00:48:15Z

Description

This PR fixes Issue #19 by adding native support for dataset collections in Galaxy MCP, enabling agent workflows to correctly discover, inspect, and navigate histories that contain collections.

It introduces:

Proper exposure of dataset collections in get_history_contents()
A new MCP tool get_collection_details() for inspecting collection structure and members
A robust, API-based guard in get_dataset_details() to prevent agents from treating collections as datasets

These changes make Galaxy MCP fully usable in agent-native environments where histories frequently contain list, paired, or nested dataset collections.
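
As a rough illustration of what exposing collections in get_history_contents() makes possible, the sketch below groups history items by their history_content_type value. The item dictionaries are made-up sample data and group_history_items is a hypothetical helper, not code from this PR.

```python
from collections import defaultdict
from typing import Any


def group_history_items(items: list[dict[str, Any]]) -> dict[str, list[str]]:
    """Group item IDs by their ``history_content_type`` value."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for item in items:
        # Assumption: an item without the field is treated as a plain dataset.
        kind = item.get("history_content_type", "dataset")
        grouped[kind].append(item["id"])
    return dict(grouped)


# Made-up sample resembling a mixed history (IDs are placeholders).
sample_items = [
    {"id": "d1", "name": "reads.fastq", "history_content_type": "dataset"},
    {
        "id": "c1",
        "name": "paired reads",
        "history_content_type": "dataset_collection",
        "collection_type": "list:paired",
    },
    {"id": "d2", "name": "report.html", "history_content_type": "dataset"},
]

grouped = group_history_items(sample_items)
```

An agent could then route the IDs under "dataset_collection" to get_collection_details() and the rest to get_dataset_details().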

Type of Change

🐛 Bug fix (non-breaking change that fixes an issue)
✨ New feature (non-breaking change that adds functionality)
💥 Breaking change (fix or feature that would cause existing functionality to not work as expected)
📚 Documentation update
⬆️ Dependency update
🧰 Maintenance/chore

Checklist

I have performed a self-review of my code
I have added tests for my changes
I have updated the documentation accordingly
My changes generate no new warnings

Related Issues

Closes #19

ldalcolmo · 2026-01-02T00:51:35Z

This PR fixes Issue #19 and enables robust handling of dataset collections in Galaxy MCP, which is critical for agent-native workflows and LLM integrations.

What it does:

get_history_contents() now returns both datasets and dataset collections with a clear history_content_type field.
Added the new get_collection_details() tool, which returns normalized collection and member dataset info for agent traversal.
get_dataset_details() now includes an API-based guard to detect collection IDs and provide a helpful error message, avoiding fragile text parsing.
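
The PR text does not show how the guard is implemented, so the following is only one way an API-based check could look: when the dataset lookup fails, ask a second lookup whether the ID belongs to a collection, rather than inspecting the error text. The names dataset_details_with_guard, CollectionIdError, and the two fetch callables are hypothetical stand-ins, not the functions in this PR.

```python
from typing import Any, Callable


class CollectionIdError(ValueError):
    """Raised when a dataset lookup is given a collection ID."""


def dataset_details_with_guard(
    item_id: str,
    fetch_dataset: Callable[[str], dict[str, Any]],
    fetch_collection: Callable[[str], dict[str, Any]],
) -> dict[str, Any]:
    """Return dataset details, or raise a targeted error for collection IDs."""
    try:
        return fetch_dataset(item_id)
    except Exception as dataset_error:
        # Ask the API whether the ID is a collection instead of parsing
        # the text of the original error message.
        try:
            fetch_collection(item_id)
        except Exception:
            # Not a collection either: surface the original failure.
            raise dataset_error from None
        raise CollectionIdError(
            f"'{item_id}' is a dataset collection, not a dataset; "
            "use get_collection_details() instead."
        ) from dataset_error


# Stand-in lookups for demonstration only (IDs are placeholders).
def fake_fetch_dataset(item_id: str) -> dict[str, Any]:
    if item_id != "d1":
        raise KeyError(item_id)
    return {"id": "d1", "name": "reads.fastq"}


def fake_fetch_collection(item_id: str) -> dict[str, Any]:
    if item_id != "c1":
        raise KeyError(item_id)
    return {"id": "c1", "collection_type": "list"}
```

Passing the lookups in as callables keeps the sketch independent of any particular client library; the cost of this design is one extra request on the failure path only.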

Tests included:

Mixed history contents
Truncation for large collections
Correct guard behavior when passing a collection ID
All existing tests remain untouched and pass
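
For readers unfamiliar with the truncation case, here is a minimal sketch of capping the number of members returned for a large collection. The input keys (elements, element_identifier, element_type, object) are assumed to match the collection shape Galaxy returns; the output keys, the max_elements parameter, and the summarize_collection name are invented for illustration and may differ from the merged code.

```python
from typing import Any


def summarize_collection(
    collection: dict[str, Any], max_elements: int = 100
) -> dict[str, Any]:
    """Return a compact view of a collection with at most ``max_elements`` members."""
    elements = collection.get("elements") or []
    shown = elements[:max_elements]
    return {
        "id": collection.get("id"),
        "collection_type": collection.get("collection_type"),
        "element_count": len(elements),
        "elements": [
            {
                "identifier": el.get("element_identifier"),
                "type": el.get("element_type"),
                "object_id": (el.get("object") or {}).get("id"),
            }
            for el in shown
        ],
        # Flag lets a caller know that some members were left out.
        "truncated": len(elements) > len(shown),
    }


# Made-up collection with 250 members (IDs are placeholders).
big_collection = {
    "id": "c1",
    "collection_type": "list",
    "elements": [
        {
            "element_identifier": f"sample_{i}",
            "element_type": "hda",
            "object": {"id": f"d{i}"},
        }
        for i in range(250)
    ],
}

summary = summarize_collection(big_collection, max_elements=100)
```

Reporting element_count alongside the truncated flag lets an agent decide whether it needs to request more members.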

This change is backward-compatible and purely additive, and should help unblock agent workflows that need to navigate collections.

Happy to make any adjustments if needed! 🙌

bgruening · 2026-01-02T12:43:59Z

Welcome @ldalcolmo! I started the CI :)

ldalcolmo · 2026-01-02T13:32:22Z

Welcome @ldalcolmo! I started the CI :)

Thanks @bgruening!
I cleaned up all ruff/E501 and formatting issues in a few follow-up commits so CI should be green now.
The functional changes are in the initial commit (collections + guards + new MCP tool), the others are formatting only.
Happy to adjust anything if you have suggestions.

dannon · 2026-01-05T15:52:30Z

Thanks for this contribution! Adding collection support is a welcome improvement!

A couple of things I noticed:

Performance consideration: The change from gi.datasets.get_datasets() (server-side pagination) to gi.histories.show_history(contents=True) means we now fetch all items and paginate client-side. For large histories this could be slower, though I understand it's necessary to include collections. Might be worth a comment noting this tradeoff.
Unused parameter: The details parameter in get_history_contents() is no longer used in the new implementation.
Minor test nit: A few test methods have their docstrings after the first line of code rather than immediately after the signature (e.g., test_get_collection_details_list_collection).

The new get_collection_details() tool looks well-structured, and the error handling that detects when someone passes a collection ID to get_dataset_details() is a nice touch for agent UX.
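
The performance tradeoff described above can be sketched roughly as follows: once the full contents list has been fetched, paging becomes a local slice, so the request cost no longer shrinks with the page size. The paginate_client_side helper and its return keys are hypothetical, and the list below stands in for the data a call such as gi.histories.show_history(history_id, contents=True) would return.

```python
from typing import Any


def paginate_client_side(
    items: list[dict[str, Any]], limit: int, offset: int = 0
) -> dict[str, Any]:
    """Slice an already-fetched list of history items into one page."""
    if limit < 1 or offset < 0:
        raise ValueError("limit must be >= 1 and offset must be >= 0")
    page = items[offset : offset + limit]
    return {
        "items": page,
        "total": len(items),
        "offset": offset,
        "limit": limit,
        "has_more": offset + len(page) < len(items),
    }


# Stand-in for the complete, already-downloaded history contents.
all_items = [{"id": f"item{i}"} for i in range(7)]

first_page = paginate_client_side(all_items, limit=3)
last_page = paginate_client_side(all_items, limit=3, offset=6)
```

Every page request pays for the whole history download, which is why the effect grows with history size.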

ldalcolmo · 2026-01-05T17:07:05Z

Thanks a lot for the review and the kind words! 🙏
Great points.

Performance: agreed — switching from gi.datasets.get_datasets() (server-side pagination) to show_history(contents=True) does fetch all items and paginates client-side. I’ll add an explicit comment/docstring note explaining the tradeoff and why it’s needed to include dataset collections.

Unused details parameter: good catch — it’s now redundant in the new implementation. I’ll remove it (and update call sites/tests accordingly).

Test docstring placement: also agreed — I’ll move those docstrings to be the first statement in the test methods.

I’ll push a small follow-up commit with these changes shortly. Will ping you once CI is green again.
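
On the docstring nit, the placement matters beyond style: Python only stores a string literal as the docstring when it is the first statement in the function body. The two functions below are illustrative examples, not tests from this PR.

```python
def test_with_docstring_first():
    """Docstring placed directly after the signature."""
    value = 1
    assert value == 1


def test_with_docstring_late():
    value = 1
    """This string is an ordinary expression statement, not a docstring."""
    assert value == 1


first_doc = test_with_docstring_first.__doc__
late_doc = test_with_docstring_late.__doc__
```

Here first_doc holds the docstring text while late_doc is None, so tools that read __doc__ would see nothing for the second function.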

dannon

Thank you!

ldalcolmo · 2026-01-09T20:25:47Z

Thanks a lot for the review and approval!
Really appreciate the feedback.

If there are other MCP or Galaxy integration issues where I can help, feel free to point me to them — happy to contribute.

Fix galaxyproject#19: add dataset collection support and robust colle…

3023dba

…ction detection

ldalcolmo added 3 commits January 2, 2026 10:12

Fix formatting and ruff violations for CI

3c6ee51

Fix ruff E501 violations: break long lines to meet 100 char limit

f95a74e

Fix formatting/ruff and keep CI green

8b4044e

Doc performance tradeoff, remove unused details param, fix test docst…

0a42636

…rings

dannon approved these changes Jan 7, 2026

View reviewed changes

dannon merged commit 508a6b0 into galaxyproject:main Jan 13, 2026
6 checks passed
