Releases: txn2/mcp-datahub
mcp-datahub-v1.8.1
mcp-datahub v1.8.1 — Fix OutputSchema Validation on Empty Search Results
Fixes a bug where datahub_search returned "entities": null instead of "entities": [] when a search matched zero results, causing MCP OutputSchema validation failures. This affected both keyword and semantic search modes.
Bug fix
datahub_search returns null entities on zero results (#131)
When a search query matched no entities, SearchResult.Entities was never initialized (Go nil slice), and json.Marshal serialized it as null. The OutputSchema declares entities as "type": "array" which rejects null, causing clients to receive a validation error instead of an empty result set.
Root cause: doSearchAcrossEntities and Search created the SearchResult struct without initializing the Entities slice. The slice was only populated via append in a loop over results — zero results meant the loop never executed and Entities stayed nil.
Fix: Initialize Entities with make([]SearchEntity, 0, len(results)) in both code paths. This produces "entities": [] in JSON and pre-allocates the right capacity when results do exist.
Affected methods:
- client.SearchAcrossEntities() (keyword + semantic via doSearchAcrossEntities)
- client.Search() (legacy type-scoped search)
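
The nil-versus-empty distinction behind this fix can be reproduced with the standard library alone. The sketch below uses simplified stand-ins for SearchResult and SearchEntity (the real structs have more fields), and marshalResult is a hypothetical helper written only for illustration:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SearchEntity and SearchResult are simplified stand-ins for the client
// types named in the release notes; the field sets are assumptions.
type SearchEntity struct {
	URN string `json:"urn"`
}

type SearchResult struct {
	Entities []SearchEntity `json:"entities"`
	Total    int            `json:"total"`
}

// marshalResult (hypothetical) builds a result by appending in a loop.
// With initialize=false the slice stays nil when urns is empty, which is
// the pre-fix behavior; with initialize=true it is allocated up front.
func marshalResult(urns []string, initialize bool) string {
	result := SearchResult{Total: len(urns)}
	if initialize {
		result.Entities = make([]SearchEntity, 0, len(urns))
	}
	for _, u := range urns {
		result.Entities = append(result.Entities, SearchEntity{URN: u})
	}
	out, err := json.Marshal(result)
	if err != nil {
		panic(err)
	}
	return string(out)
}

func main() {
	fmt.Println(marshalResult(nil, false)) // {"entities":null,"total":0}
	fmt.Println(marshalResult(nil, true))  // {"entities":[],"total":0}
}
```

Only the second form satisfies a schema that declares entities as an array.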
Upgrading
This is a patch release with no breaking changes. All users of v1.8.0 should upgrade.
Installation
Claude Desktop (macOS/Windows)
Download the .mcpb bundle for your platform and double-click to install:
- macOS Apple Silicon (M1/M2/M3/M4): mcp-datahub_1.8.1_darwin_arm64.mcpb
- macOS Intel: mcp-datahub_1.8.1_darwin_amd64.mcpb
- Windows: mcp-datahub_1.8.1_windows_amd64.mcpb

Homebrew (macOS)
```bash
brew install txn2/tap/mcp-datahub
```
Claude Code CLI
```bash
claude mcp add datahub \
  -e DATAHUB_URL=https://your-datahub.example.com/api/graphql \
  -e DATAHUB_TOKEN=your-token \
  -- mcp-datahub
```
Docker
```bash
docker pull ghcr.io/txn2/mcp-datahub:v1.8.1
```
Verification
All release artifacts are signed with Cosign. Verify with:
```bash
cosign verify-blob --bundle mcp-datahub_1.8.1_linux_amd64.tar.gz.sigstore.json \
  mcp-datahub_1.8.1_linux_amd64.tar.gz
```

mcp-datahub-v1.8.0
mcp-datahub v1.8.0 — Advanced Search Filters via searchAcrossEntities
Upgrades datahub_search to use DataHub's searchAcrossEntities GraphQL API, unlocking advanced field-level filtering and multi-type search. Agents can now answer questions like "find tables with an email column" or "datasets owned by data-eng on the trino platform" without scanning every schema individually.
+1,405 lines | -178 lines | 25 files changed
Highlights
- Advanced filters: the new filters parameter supports DataHub's full filter vocabulary, namely fieldPaths (column names), fieldTags (column-level tags), fieldGlossaryTerms, fieldDescriptions, platform, domains, owners, tags, glossaryTerms, typeNames
- Multi-type search: the new types parameter searches across multiple entity types in a single call (e.g., ["DATASET", "DASHBOARD"])
- Full backward compatibility: existing entity_type and simple query usage unchanged; defaults to DATASET
- Semantic search parity: SemanticSearch now also supports types and filters
New tool parameters
```
datahub_search(
  query: "*",
  types: ["DATASET", "DASHBOARD"],
  filters: [
    { field: "fieldPaths", values: ["email"], condition: "CONTAIN" },
    { field: "platform", value: "urn:li:dataPlatform:trino" },
    { field: "tags", values: ["urn:li:tag:deprecated"], negated: true }
  ]
)
```
Filter fields reference
| Field | Matches |
|---|---|
| fieldPaths | Column/schema field names |
| fieldTags | Column-level tags |
| fieldGlossaryTerms | Column-level glossary terms |
| fieldDescriptions | Column-level descriptions |
| platform | Data platform (URN format) |
| domains | Domain (URN format) |
| owners | Owner (URN format) |
| tags | Entity-level tags |
| glossaryTerms | Entity-level glossary terms |
| typeNames | Entity subtypes |
New client API
| Addition | Description |
|---|---|
| client.SearchAcrossEntities() | Search with types and filters via searchAcrossEntities GraphQL endpoint |
| client.SearchFilter | Filter struct: Field, Values, Condition, Negated |
| client.WithTypes() | Search option for multi-type search |
| client.WithSearchFilters() | Search option for advanced field-level filters |
| client.DefaultEntityType | Constant for the default entity type ("DATASET") |
Bug fixes
- Context documents default to visible: documents created via datahub_create what=document now default to GlobalContext: true and Status: "PUBLISHED" so they appear in the DataHub UI immediately (#122, #123)
Other changes
- CI: bump actions/setup-go, codecov/codecov-action, sigstore/cosign-installer, actions/deploy-pages (#128)
- SemanticSearchQuery now shares entity fragments with SearchAcrossEntitiesQuery (DataFlow, Tag, Document fragments added)
- Shared doSearchAcrossEntities helper eliminates duplication between keyword and semantic code paths
- Input validation rejects filters with empty field or missing value/values
Compatibility
| DataHub Version | Search Features |
|---|---|
| 1.3.x+ (minimum) | All search features including types, filters, searchAcrossEntities |
| 1.4.x+ | + Semantic search mode (mode: "semantic") with types and filters |
Installation
Claude Desktop (macOS/Windows)
Download the .mcpb bundle for your platform and double-click to install:
- macOS Apple Silicon (M1/M2/M3/M4): mcp-datahub_1.8.0_darwin_arm64.mcpb
- macOS Intel: mcp-datahub_1.8.0_darwin_amd64.mcpb
- Windows: mcp-datahub_1.8.0_windows_amd64.mcpb

Homebrew (macOS)
```bash
brew install txn2/tap/mcp-datahub
```
Claude Code CLI
```bash
claude mcp add datahub \
  -e DATAHUB_URL=https://your-datahub.example.com/api/graphql \
  -e DATAHUB_TOKEN=your-token \
  -- mcp-datahub
```
Docker
```bash
docker pull ghcr.io/txn2/mcp-datahub:v1.8.0
```
Verification
All release artifacts are signed with Cosign. Verify with:
```bash
cosign verify-blob --bundle mcp-datahub_1.8.0_linux_amd64.tar.gz.sigstore.json \
  mcp-datahub_1.8.0_linux_amd64.tar.gz
```

mcp-datahub-v1.7.1
mcp-datahub v1.7.1 — Default Documents to Visible and Published
Fixes document creation defaults so that documents are immediately visible in the DataHub UI. Previously, documents created via UpsertContextDocument or datahub_create what=document were hidden because showInGlobalContext defaulted to false and status defaulted to UNPUBLISHED.
Problem
Both document creation paths produced invisible documents:
| Creation Path | GlobalContext | Status | Visible in UI? |
|---|---|---|---|
| UpsertContextDocument → createContextDocument | false (Go zero value) | PUBLISHED | No, hidden from global search |
| datahub_create what=document → handleCreateDocument | false (Go zero value) | empty (server default = UNPUBLISHED) | No, draft and hidden |
This affected every consumer of UpsertContextDocument (used by mcp-data-platform's apply_knowledge tool for add_context_document workflows) and every datahub_create what=document call that didn't explicitly set both flags.
Fix (#123)
Both paths now default to visible, published documents:
| Creation Path | GlobalContext | Status |
|---|---|---|
| createContextDocument | true | PUBLISHED (unchanged) |
| handleCreateDocument | true | PUBLISHED |
Explicit overrides preserved
The global_context field on CreateInput was changed from bool to *bool, allowing callers to distinguish "not provided" (defaults to true) from "explicitly set to false":
```
// Default: visible and published
{"what": "document", "name": "My Doc", "description": "Content"}

// Explicit draft, hidden from global search
{"what": "document", "name": "My Draft", "status": "UNPUBLISHED", "global_context": false}
```
Changes
| File | Change |
|---|---|
| pkg/client/context_documents.go | Added GlobalContext: true to CreateDocumentInput in createContextDocument |
| pkg/tools/write_create.go | Default Status to "PUBLISHED" when empty; changed GlobalContext from bool to *bool with true default |
| pkg/tools/write_create_test.go | Added tests for default behavior and explicit override |
| pkg/client/context_documents_test.go | Added showInGlobalContext: true assertion in create path |
Closes #122
Compatibility
| Requirement | Version |
|---|---|
| Go | 1.25+ |
| DataHub (minimum) | 1.3.x |
| DataHub (full feature set incl. documents) | 1.4.x+ |
| DataHub (schema validated against) | v1.5.0.1 |
Note: The global_context field type change from bool to *bool in CreateInput is a breaking change for Go callers that set the field directly. JSON callers (MCP tool users) are unaffected — true, false, and omitted all work as expected.
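
The pointer-based defaulting can be illustrated with a short sketch. The createInput struct and resolveDefaults function below are hypothetical and trimmed to the fields discussed here; the real CreateInput has more fields and its handler may apply defaults differently.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// createInput is a trimmed, hypothetical version of CreateInput.
// GlobalContext is a *bool so that an omitted JSON key (nil) can be told
// apart from an explicit false.
type createInput struct {
	What          string `json:"what"`
	Name          string `json:"name"`
	Status        string `json:"status,omitempty"`
	GlobalContext *bool  `json:"global_context,omitempty"`
}

// resolveDefaults decodes a request and applies the defaults described in
// these notes: an empty status becomes "PUBLISHED" and a missing
// global_context becomes true.
func resolveDefaults(raw string) (status string, globalContext bool, err error) {
	var in createInput
	if err = json.Unmarshal([]byte(raw), &in); err != nil {
		return "", false, err
	}
	status = in.Status
	if status == "" {
		status = "PUBLISHED"
	}
	globalContext = true
	if in.GlobalContext != nil {
		globalContext = *in.GlobalContext
	}
	return status, globalContext, nil
}

func main() {
	fmt.Println(resolveDefaults(`{"what":"document","name":"My Doc"}`))
	fmt.Println(resolveDefaults(`{"what":"document","name":"My Draft","status":"UNPUBLISHED","global_context":false}`))
}
```

Go callers that previously assigned a bool literal to the field now need to pass a pointer, for example by taking the address of a local variable.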
Installation
Claude Desktop (macOS/Windows)
Download the .mcpb bundle for your platform and double-click to install:
- macOS Apple Silicon (M1/M2/M3/M4): mcp-datahub_1.7.1_darwin_arm64.mcpb
- macOS Intel: mcp-datahub_1.7.1_darwin_amd64.mcpb
- Windows: mcp-datahub_1.7.1_windows_amd64.mcpb

Homebrew (macOS)
```bash
brew install txn2/tap/mcp-datahub
```
Claude Code CLI
```bash
claude mcp add datahub \
  -e DATAHUB_URL=https://your-datahub.example.com/api/graphql \
  -e DATAHUB_TOKEN=your-token \
  -- mcp-datahub
```
Docker
```bash
docker pull ghcr.io/txn2/mcp-datahub:v1.7.1
```
Verification
All release artifacts are signed with Cosign. Verify with:
```bash
cosign verify-blob --bundle mcp-datahub_1.7.1_linux_amd64.tar.gz.sigstore.json \
  mcp-datahub_1.7.1_linux_amd64.tar.gz
```

mcp-datahub-v1.7.0
mcp-datahub v1.7.0 — GraphQL Schema Alignment & Validation Infrastructure
Corrects GraphQL query field paths across four client modules by validating every query against the upstream DataHub schema source files. Adds automated schema validation infrastructure to prevent future drift — all 59 query/mutation constants are now checked against the official .graphql definitions from datahub-project/datahub.
+25,426 lines | -318 lines | 46 files changed
Highlights
GraphQL Query Corrections (#121)
Cross-referenced all GraphQL queries with the upstream DataHub schema files (datahub-graphql-core/src/main/resources/*.graphql) and corrected field paths that did not match the actual API:
| Module | Issue | Fix |
|---|---|---|
| documents.go | DocumentRelatedAsset, DocumentRelatedDocument, DocumentParentDocument queried a direct urn field that doesn't exist on these wrapper types | Changed to relatedAssets { asset { urn } }, relatedDocuments { document { urn } }, parentDocument { document { urn } } per upstream documents.graphql |
| structured_properties.go | Fragment targeted non-existent type EntityStructuredPropertiesResult | Changed to StructuredProperties per upstream entity.graphql |
| data_contracts.go | Queried contract { result(refresh: false) { type assertionResults { ... } } }; the result field and its nested structure don't exist on DataContract | Rewrote to contract { properties { freshness/schema/dataQuality { assertion { urn } } } status { state } } per upstream contract.graphql |
| semantic_search.go | Used non-existent input type SemanticSearchInput | Changed to SearchAcrossEntitiesInput with searchAcrossEntities query per upstream search.graphql |
All corrections were verified against both the upstream .graphql source files (v1.4.0.3 and v1.5.0.1) and a live DataHub v1.4.0.3 instance.
Schema Validation Infrastructure (#121)
Adds automated, offline-capable validation of GraphQL queries against the upstream DataHub schema:
- testdata/datahub-schema/: 31 .graphql schema files synced from datahub-project/datahub at tag v1.5.0.1, checked into the repo for CI without network access
- testdata/datahub-schema/sync.sh: downloads schema files from any tagged DataHub release
- pkg/client/schema_validation_test.go: validates all 59 query/mutation constants against the schema, checking fragment targets, top-level query/mutation fields, inline fragment type names, and input type references
- make schema-sync: download schema files for a target version
- make schema-check: run schema validation (now part of make verify)
Workflow for targeting a new DataHub version:
```bash
DATAHUB_VERSION=v1.5.0.1 make schema-sync  # pull schema files
make schema-check                          # validate all queries
```
DataHub Version Compatibility Matrix
Updated CLAUDE.md with a verified compatibility matrix:
| DataHub Version | Features Available | Schema Validated |
|---|---|---|
| 1.3.x+ (minimum) | All read tools, all write operations except documents | No (pre-dates schema sync) |
| 1.4.x+ (full) | + Documents (create/update/delete), semantic search | Yes (v1.4.0.3) |
| 1.5.x+ (current) | + Batch data product operations | Yes (v1.5.0.1) |
Schema files were diff'd between v1.4.0.3 and v1.5.0.1 — the only change is a new batchAddToDataProducts/batchRemoveFromDataProducts mutation in entity.graphql. All types used by this library are identical across both versions.
Breaking Changes
types.AssertionResult simplified
The AssertionResult type in pkg/types/data_contracts.go was simplified to match the actual DataHub DataContract schema:
Removed fields:
- ResultType string: the real API does not expose per-assertion result types through the contract query
- NativeResults map[string]string: the real API does not expose native result details through the contract query

Before:
```go
type AssertionResult struct {
	AssertionURN  string            `json:"assertion_urn"`
	Type          string            `json:"type"`
	ResultType    string            `json:"result_type"`
	NativeResults map[string]string `json:"native_results,omitempty"`
}
```
After:
```go
type AssertionResult struct {
	AssertionURN string `json:"assertion_urn"`
	Type         string `json:"type"`
}
```
If you were reading ResultType or NativeResults from AssertionResult, those fields were never populated by the actual DataHub API.
DataContract.Status values changed
The Status field now contains the DataContractState enum value from the status.state field (e.g., "ACTIVE", "PENDING") rather than the previously unpopulated result.type field (which was intended to contain "PASSING" / "FAILING").
Compatibility
| Requirement | Version |
|---|---|
| Go | 1.25+ |
| DataHub (minimum) | 1.3.x |
| DataHub (full feature set incl. documents) | 1.4.x+ |
| DataHub (schema validated against) | v1.5.0.1 |
Installation
Claude Desktop (macOS/Windows)
Download the .mcpb bundle for your platform and double-click to install:
- macOS Apple Silicon (M1/M2/M3/M4): mcp-datahub_1.7.0_darwin_arm64.mcpb
- macOS Intel: mcp-datahub_1.7.0_darwin_amd64.mcpb
- Windows: mcp-datahub_1.7.0_windows_amd64.mcpb

Homebrew (macOS)
```bash
brew install txn2/tap/mcp-datahub
```
Claude Code CLI
```bash
claude mcp add datahub \
  -e DATAHUB_URL=https://your-datahub.example.com/api/graphql \
  -e DATAHUB_TOKEN=your-token \
  -- mcp-datahub
```
Docker
```bash
docker pull ghcr.io/txn2/mcp-datahub:v1.7.0
```
Verification
All release artifacts are signed with Cosign. Verify with:
```bash
cosign verify-blob --bundle mcp-datahub_1.7.0_linux_amd64.tar.gz.sigstore.json \
  mcp-datahub_1.7.0_linux_amd64.tar.gz
```

mcp-datahub-v1.6.0
mcp-datahub v1.6.0 — Context Documents Client API & Security Update
Adds a high-level Context Documents client API for entity-scoped document CRUD, enabling downstream knowledge pipeline workflows in mcp-data-platform. Also resolves two security vulnerabilities via the go-sdk v1.4.1 upgrade.
+1,075 lines | -21 lines | 14 files changed
Closes #116
Highlights
Context Documents CRUD Client Methods (#117)
Three new convenience methods compose existing Document primitives into an entity-scoped API with simplified types for downstream consumption:
| Method | Signature | Description |
|---|---|---|
| GetContextDocuments | (ctx, urn) → ([]ContextDocument, error) | Retrieves context documents linked to an entity via a dedicated GraphQL query with ownership and timestamp fields |
| UpsertContextDocument | (ctx, entityURN, doc) → (*ContextDocument, error) | Creates (empty ID) or updates (populated ID) a context document; returns the full document after operation |
| DeleteContextDocument | (ctx, documentID) → error | Deletes a context document by its ID |
New types in pkg/types/context_document.go:
- ContextDocument: simplified, flattened view with ID, Title, Content, ContentType, Category, CreatedAt, UpdatedAt, Author
- ContextDocumentAuthor: author identity (URN, Username) derived from ownership
- ContextDocumentInput: upsert input (empty ID = create, populated = update)
Design decisions:
- Context documents are standard DataHub Document entities with showInGlobalContext: false, linked to entities via relatedAssets; no separate DataHub API exists
- UpsertContextDocument performs a post-write GetDocument fetch to return complete server-side data (timestamps, ownership)
- CreateDocument now always sends the settings.showInGlobalContext field explicitly (even when false) rather than omitting it, ensuring consistent wire behavior
- Author username is extracted from the owner URN consistently across both conversion paths
- Supported entity types for GetContextDocuments: Dataset, GlossaryTerm, GlossaryNode, Container
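
As a rough illustration of deriving a username from an owner URN, the sketch below strips the standard DataHub user prefix. The function name usernameFromURN and the fallback of returning the input unchanged are assumptions; the release notes do not show the library's actual extraction logic.

```go
package main

import (
	"fmt"
	"strings"
)

// corpUserPrefix is the standard DataHub URN prefix for users.
const corpUserPrefix = "urn:li:corpuser:"

// usernameFromURN (hypothetical) returns the part after the corpuser prefix.
// Inputs without that prefix are returned unchanged; that fallback is an
// assumption made for this sketch.
func usernameFromURN(urn string) string {
	if strings.HasPrefix(urn, corpUserPrefix) {
		return strings.TrimPrefix(urn, corpUserPrefix)
	}
	return urn
}

func main() {
	fmt.Println(usernameFromURN("urn:li:corpuser:jdoe")) // jdoe
}
```

Keeping the extraction in one function is one way to get the same result on every conversion path.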
Downstream usage — these methods enable mcp-data-platform to implement:
- datahub_get_context_documents MCP tool
- apply_knowledge change types: add_context_document, update_context_document, remove_context_document
- Search/entity response enrichment with context documentation
Security: go-sdk v1.4.1 (#114)
Bumps github.com/modelcontextprotocol/go-sdk from v1.4.0 to v1.4.1, which includes:
- Cross-origin request protection: verifies Content-Type and Origin headers on JSON-RPC POST requests
- Unicode zero character fix: patches a parsing vulnerability in the segmentio/encoding JSON library
- Custom HTTP client for OAuth: allows SSRF protection in AuthorizationCodeHandler
CI Maintenance
- golangci-lint bumped to v2.11.4 (required for Go 1.25+ compatibility)
- anchore/sbom-action 0.23.1 → 0.24.0 (#120)
- github/codeql-action 4.32.6 → 4.34.1 (#119)
- codecov/codecov-action 5.5.2 → 5.5.3 (#118)
Compatibility
| Requirement | Version |
|---|---|
| Go | 1.25+ (raised by go-sdk v1.4.1) |
| DataHub (minimum) | 1.3.x |
| DataHub (full feature set incl. documents) | 1.4.x+ |
Breaking change note: CreateDocument now always includes settings: {showInGlobalContext: false} in the GraphQL mutation input, where previously it omitted the settings block when GlobalContext was false. This makes the wire behavior explicit but is functionally equivalent for DataHub — false is the server default.
Installation
Claude Desktop (macOS/Windows)
Download the .mcpb bundle for your platform and double-click to install:
- macOS Apple Silicon (M1/M2/M3/M4): mcp-datahub_1.6.0_darwin_arm64.mcpb
- macOS Intel: mcp-datahub_1.6.0_darwin_amd64.mcpb
- Windows: mcp-datahub_1.6.0_windows_amd64.mcpb

Homebrew (macOS)
```bash
brew install txn2/tap/mcp-datahub
```
Claude Code CLI
```bash
claude mcp add datahub \
  -e DATAHUB_URL=https://your-datahub.example.com/api/graphql \
  -e DATAHUB_TOKEN=your-token \
  -- mcp-datahub
```
Docker
```bash
docker pull ghcr.io/txn2/mcp-datahub:v1.6.0
```
Verification
All release artifacts are signed with Cosign. Verify with:
```bash
cosign verify-blob --bundle mcp-datahub_1.6.0_linux_amd64.tar.gz.sigstore.json \
  mcp-datahub_1.6.0_linux_amd64.tar.gz
```

mcp-datahub-v1.5.0
mcp-datahub v1.5.0 — CRUD Write Tools & Full DataHub Mutation Coverage
The biggest write capability release yet: reduces tool count from 16 to 12 while expanding write operations from 7 to 35 — covering the full DataHub mutation surface. All 36 GraphQL mutations verified against actual DataHub schema files.
+4,909 lines | -1,525 lines | 60 files changed | 91.3% tools coverage | 94.6% client coverage
Highlights
3 CRUD Tools Replace 7 Write Tools
The datahub_browse discriminator pattern (what parameter) is now applied to write operations. Three tools — datahub_create, datahub_update, datahub_delete — replace seven fine-grained tools while covering 5x more operations:
| | v1.4.0 | v1.5.0 |
|---|---|---|
| MCP tools | 16 (9 read + 7 write) | 12 (9 read + 3 write) |
| Write operations | 7 | 35 |
| Entity creation | None | 10 entity types |
| Entity deletion | Queries only | 8 entity types |
Context Document Support
Full CRUD support for DataHub context documents (DataHub 1.4.x+), enabling AI knowledge capture workflows. Documents are integrated into existing read tools (datahub_search, datahub_get_entity) and the new CRUD write tools.
Per-Connection Write Control
Multi-server deployments can now set write_enabled per connection, with proper override semantics:
- null: inherit from global toolkit config
- true: explicitly enabled (overrides global false)
- false: explicitly disabled (overrides global true)
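
These override semantics map naturally onto a nullable boolean. The following is a minimal sketch, assuming the per-connection setting is modeled as a *bool; effectiveWriteEnabled and boolPtr are hypothetical names, not part of the library's documented API.

```go
package main

import "fmt"

// effectiveWriteEnabled (hypothetical) resolves the write setting for one
// connection. A nil per-connection value inherits the global setting; a
// non-nil value overrides it in either direction.
func effectiveWriteEnabled(global bool, perConnection *bool) bool {
	if perConnection == nil {
		return global
	}
	return *perConnection
}

// boolPtr is a small helper for building explicit overrides.
func boolPtr(b bool) *bool { return &b }

func main() {
	fmt.Println(effectiveWriteEnabled(true, nil))            // true: inherited
	fmt.Println(effectiveWriteEnabled(false, boolPtr(true))) // true: override enables
	fmt.Println(effectiveWriteEnabled(true, boolPtr(false))) // false: override disables
}
```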
New Write Tools
datahub_create — 10 operations
Creates new entities via the what discriminator:
| what | Creates | Required fields |
|---|---|---|
| tag | Tag | name |
| domain | Domain | name |
| glossary_term | Glossary term | name |
| data_product | Data product | name, domain_urn |
| document | Context document | name |
| application | Application | name |
| query | Saved query | value (SQL) |
| incident | Incident | name, incident_type, entity_urns |
| structured_property | Structured property | qualified_name, value_type, entity_types |
| data_contract | Data contract | dataset_urns |
datahub_update — 17 operations
Updates existing entities via what + optional action:
| what | action | Description |
|---|---|---|
| description | (not used) | Set entity description |
| column_description | (not used) | Set schema field description |
| tag | required: add/remove | Add or remove a tag |
| glossary_term | required: add/remove | Add or remove a glossary term |
| link | required: add/remove | Add or remove a link |
| owner | required: add/remove | Add or remove an owner |
| domain | set/remove (default: set) | Set or remove domain assignment |
| structured_properties | set/remove (default: set) | Set or remove structured property values |
| structured_property | (not used) | Update a structured property definition |
| incident_status | (not used) | Update incident status (requires state) |
| incident | (not used) | Update incident details |
| query | (not used) | Update query properties |
| document_contents | (not used) | Update document title/text |
| document_status | (not used) | Update document status |
| document_related_entities | (not used) | Update document related entities |
| document_sub_type | (not used) | Update document sub-type |
| data_contract | (not used) | Upsert a data contract |
datahub_delete — 8 operations
Deletes entities via what: query, tag, domain, glossary_entity, data_product, application, document, structured_property.
MCP Annotations
Each CRUD tool has distinct behavior annotations:
| Tool | DestructiveHint | IdempotentHint | OpenWorldHint |
|---|---|---|---|
| datahub_create | false | false | true |
| datahub_update | false | true | true |
| datahub_delete | true | true | true |
DataHub Version Compatibility
Minimum: DataHub 1.3.x. Full feature set: DataHub 1.4.x.
| DataHub Version | Features Available |
|---|---|
| 1.3.x+ (minimum) | All read tools, all write operations except documents (tags, domains, glossary, data products, queries, owners, links, descriptions, incidents, applications, structured properties incl. delete, data contracts) |
| 1.4.x+ (full) | + Documents (create/update/delete) |
All version claims verified against the DataHub v1.3.0 GraphQL schema source. The client gracefully returns empty results (not errors) when a read feature is unavailable on older versions.
New Client Methods (24 new, 8 newly exposed)
Entity creation (8)
CreateTag, CreateDomain, CreateGlossaryTerm, CreateDataProduct, CreateDocument, CreateApplication, CreateStructuredProperty, UpsertDataContract
Entity update (10)
AddOwner, RemoveOwner, SetDomain, UnsetDomain, UpdateIncident, UpdateStructuredProperty, UpdateDocumentContents, UpdateDocumentStatus, UpdateDocumentRelatedEntities, UpdateDocumentSubType
Entity delete (7)
DeleteTag, DeleteDomain, DeleteGlossaryEntity, DeleteDataProduct, DeleteApplication, DeleteDocument, DeleteStructuredProperty
Newly exposed (existed in client, now wired to MCP tools)
UpdateColumnDescription, CreateQuery, UpdateQuery, DeleteQuery, UpsertStructuredProperties, RemoveStructuredProperties, RaiseIncident, ResolveIncident
Bug Fixes (PRs #106, #108, #109)
- UpsertStructuredProperties wrong field name: used propertyUrn (REST name) instead of structuredPropertyUrn (GraphQL name)
- UpsertStructuredProperties/RemoveStructuredProperties missing selection set: caused SubselectionRequired validation errors
- UpsertStructuredProperties raw values: passed raw values instead of typed objects ({"stringValue": "..."})
- ResolveIncident wrong input type: UpdateIncidentStatusInput! renamed to IncidentStatusInput! in DataHub 1.4.x
- RaiseIncident wrong field: sent resourceUrns (array of objects) instead of resourceUrn (singular string)
- Tags/descriptions/glossary terms on domains/glossary entities: REST API rejects these aspects for domain/glossaryTerm/glossaryNode; now routed through GraphQL mutations
- UpdateDescription rejected domain/glossaryTerm: missing entries in descriptionAspectMap
Removed Tools (Breaking Change)
The following tools are replaced by datahub_update with the corresponding what + action parameters:
| Removed Tool | Replacement |
|---|---|
| datahub_update_description | datahub_update with what=description |
| datahub_add_tag | datahub_update with what=tag, action=add |
| datahub_remove_tag | datahub_update with what=tag, action=remove |
| datahub_add_glossary_term | datahub_update with what=glossary_term, action=add |
| datahub_remove_glossary_term | datahub_update with what=glossary_term, action=remove |
| datahub_add_link | datahub_update with what=link, action=add |
| datahub_remove_link | datahub_update with what=link, action=remove |
Library Breaking Changes
- ToolName constants removed: ToolUpdateDescription, ToolAddTag, ToolRemoveTag, ToolAddGlossaryTerm, ToolRemoveGlossaryTerm, ToolAddLink, ToolRemoveLink
- ToolName constants added: ToolCreate, ToolUpdate, ToolDelete
- WriteTools() returns 3 tools instead of 7
- DataHubClient interface grows by ~30 methods
- Output types replaced: 7 per-tool output structs replaced by 3 CRUD output structs (CreateOutput, UpdateOutput, DeleteOutput)
Installation
Claude Desktop (macOS/Windows)
Download the .mcpb bundle for your platform and double-click to install:
- macOS Apple Silicon (M1/M2/M3/M4): mcp-datahub_1.5.0_darwin_arm64.mcpb
- macOS Intel: mcp-datahub_1.5.0_darwin_amd64.mcpb
- Windows: mcp-datahub_1.5.0_windows_amd64.mcpb

Homebrew (macOS)
```bash
brew install txn2/tap/mcp-datahub
```
Claude Code CLI
```bash
claude mcp add datahub \
  -e DATAHUB_URL=https://your-datahub.example.com/api/graphql \
  -e DATAHUB_TOKEN=your-token \
  -- mcp-datahub
```
Docker
```bash
docker pull ghcr.io/txn2/mcp-datahub:v1.5.0
```
Verification
All release artifacts are signed with Cosign. Verify with:
```bash
cosign verify-blob --bundle mcp-datahub_1.5.0_linux_amd64.tar.gz.sigstore.json \
  mcp-datahub_1.5.0_linux_amd64.tar.gz
```

mcp-datahub-v1.4.3
What's Changed
Bug Fix: Structured Property Value Serialization (#109)
set_structured_property was failing with Expected type 'Map' but was 'String' errors when writing structured properties to DataHub 1.4.x.
Root cause: UpsertStructuredProperties was passing raw values (e.g., "2 years", 30) directly in the values array, but DataHub's GraphQL API expects typed value objects ({"stringValue": "..."} or {"numberValue": ...}).
Fix: Added a toTypedPropertyValue helper that wraps each raw Go value in the appropriate typed map before building the GraphQL mutation variables. This mirrors the read-side deserialization already handled by propertyValueEntry.toAny().
Affected tool: datahub_set_structured_property (write tool, requires WriteEnabled: true)
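
The wrapping step can be sketched as below. The name toTypedPropertyValue comes from these notes, but the body is an assumption: the set of handled Go types and the string fallback for anything else are illustrative choices, not the library's confirmed behavior.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// toTypedPropertyValue wraps a raw Go value in the typed object shape that
// the GraphQL API expects: strings become {"stringValue": ...} and numbers
// become {"numberValue": ...}. Falling back to a string for other types is
// an assumption made for this sketch.
func toTypedPropertyValue(v any) map[string]any {
	switch x := v.(type) {
	case string:
		return map[string]any{"stringValue": x}
	case int:
		return map[string]any{"numberValue": float64(x)}
	case int64:
		return map[string]any{"numberValue": float64(x)}
	case float64:
		return map[string]any{"numberValue": x}
	default:
		return map[string]any{"stringValue": fmt.Sprint(x)}
	}
}

func main() {
	raw := []any{"2 years", 30}
	typed := make([]map[string]any, 0, len(raw))
	for _, v := range raw {
		typed = append(typed, toTypedPropertyValue(v))
	}
	out, err := json.Marshal(typed)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out)) // [{"stringValue":"2 years"},{"numberValue":30}]
}
```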
Full Changelog: v1.4.2...v1.4.3
Installation
Claude Desktop (macOS/Windows)
Download the .mcpb bundle for your platform and double-click to install:
- macOS Apple Silicon (M1/M2/M3/M4): mcp-datahub_1.4.3_darwin_arm64.mcpb
- macOS Intel: mcp-datahub_1.4.3_darwin_amd64.mcpb
- Windows: mcp-datahub_1.4.3_windows_amd64.mcpb

Homebrew (macOS)
```bash
brew install txn2/tap/mcp-datahub
```
Claude Code CLI
```bash
claude mcp add datahub \
  -e DATAHUB_URL=https://your-datahub.example.com/api/graphql \
  -e DATAHUB_TOKEN=your-token \
  -- mcp-datahub
```
Docker
```bash
docker pull ghcr.io/txn2/mcp-datahub:v1.4.3
```
Verification
All release artifacts are signed with Cosign. Verify with:
```bash
cosign verify-blob --bundle mcp-datahub_1.4.3_linux_amd64.tar.gz.sigstore.json \
  mcp-datahub_1.4.3_linux_amd64.tar.gz
```

mcp-datahub-v1.4.2
mcp-datahub v1.4.2 — DataHub 1.4.x Write Method Fixes (Part 2)
Fixes four additional write operation bugs discovered when running against DataHub 1.4.x instances. Continues the fixes started in v1.4.1. All read paths and REST write paths for dataset, dashboard, chart, dataFlow, dataJob, container, and dataProduct entities are unaffected.
+638 lines added, -25 removed across 6 files | All modified lines covered by tests | All CI checks pass
Bug Fixes
UpsertStructuredProperties Failed with "propertyUrn is not defined"
The GraphQL mutation input used propertyUrn (the REST API field name) instead of structuredPropertyUrn (the GraphQL field name), causing every call to fail with:
field name 'propertyUrn' is not defined for input object type 'StructuredPropertyInputParams'
Fix: Renamed the input field from propertyUrn to structuredPropertyUrn in the mutation variables.
ResolveIncident Failed with Wrong Input Type
The mutation declared $input: UpdateIncidentStatusInput! but DataHub 1.4.x renamed this type to IncidentStatusInput!, causing:
Variable 'input' of type 'UpdateIncidentStatusInput!' used in position expecting type 'IncidentStatusInput!'
Fix: Changed the type declaration from UpdateIncidentStatusInput! to IncidentStatusInput!.
Tags on Domain, GlossaryTerm, and GlossaryNode Failed via REST
The REST API does not register globalTags as a writable aspect for domain, glossaryTerm, or glossaryNode entities. AddTag and RemoveTag calls on these entity types failed with:
Unknown aspect globalTags for entity domain
Fix: These entity types are now routed through DataHub's GraphQL addTag/removeTag mutations instead of REST. All other entity types (dataset, dashboard, chart, etc.) continue to use REST unchanged.
Descriptions on Domain, GlossaryTerm, and GlossaryNode Failed via REST
Same root cause as the tag bug: the REST ingestProposal endpoint does not accept description aspect writes (domainProperties, glossaryTermInfo, glossaryNodeInfo) for these entity types, returning 422 validation errors.
Fix: These entity types are now routed through the GraphQL updateDescription mutation. All other entity types continue to use REST unchanged.
Glossary Term Associations on Domain, GlossaryTerm, and GlossaryNode Failed via REST
The REST API also does not register the glossaryTerms aspect for these entity types.
Fix: AddGlossaryTerm and RemoveGlossaryTerm for these entity types are now routed through GraphQL addTerm/removeTerm mutations.
Technical Details
Introduced a graphQLWriteTypes routing map for entity types that require GraphQL mutations instead of REST for write operations. The public API methods (AddTag, RemoveTag, AddGlossaryTerm, RemoveGlossaryTerm, UpdateDescription) check the entity type and route accordingly. All GraphQL mutation signatures were verified against the official DataHub schema (datahub-graphql-core/src/main/resources/entity.graphql).
Entity type routing:
| Entity Type | Tag Writes | Description Writes | Glossary Term Writes |
|---|---|---|---|
| dataset, dashboard, chart, dataFlow, dataJob, container, dataProduct | REST | REST | REST |
| domain, glossaryTerm, glossaryNode | GraphQL | GraphQL | GraphQL |
Note: AddLink/RemoveLink operations continue to use REST for all entity types including domain, glossaryTerm, and glossaryNode — the institutionalMemory aspect IS registered for these types in DataHub's entity registry.
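
The routing decision in the table can be sketched as a lookup keyed by the entity type parsed from the URN. The map name graphQLWriteTypes is taken from these notes; its exact type, and the helpers entityTypeFromURN and writeTransport, are assumptions for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// graphQLWriteTypes lists entity types whose tag, description, and glossary
// term writes go through GraphQL. The map[string]bool shape is an assumption.
var graphQLWriteTypes = map[string]bool{
	"domain":       true,
	"glossaryTerm": true,
	"glossaryNode": true,
}

// entityTypeFromURN (hypothetical) extracts the type segment from a URN of
// the form urn:li:<type>:<id>. It returns "" for malformed input.
func entityTypeFromURN(urn string) string {
	parts := strings.SplitN(urn, ":", 4)
	if len(parts) < 4 || parts[0] != "urn" || parts[1] != "li" {
		return ""
	}
	return parts[2]
}

// writeTransport (hypothetical) reports which API a write for the given URN
// would use under the routing table above.
func writeTransport(urn string) string {
	if graphQLWriteTypes[entityTypeFromURN(urn)] {
		return "graphql"
	}
	return "rest"
}

func main() {
	fmt.Println(writeTransport("urn:li:domain:marketing"))
	fmt.Println(writeTransport("urn:li:dataset:(urn:li:dataPlatform:trino,db.orders,PROD)"))
}
```

SplitN with a limit of 4 keeps the identifier intact even when it contains colons, as dataset URNs do.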
Upgrade Notes
- No breaking changes. All fixes are backward compatible.
- No configuration changes required.
- Users who were working around the structured property or incident bugs can now use the MCP tools directly.
- Tag, description, and glossary term operations on domain, glossaryTerm, and glossaryNode entities now work — previously these returned errors.
Installation
Claude Desktop (macOS/Windows)
Download the .mcpb bundle for your platform and double-click to install:
- macOS Apple Silicon (M1/M2/M3/M4):
mcp-datahub_1.4.2_darwin_arm64.mcpb - macOS Intel:
mcp-datahub_1.4.2_darwin_amd64.mcpb - Windows:
mcp-datahub_1.4.2_windows_amd64.mcpb
Homebrew (macOS)
brew install txn2/tap/mcp-datahubClaude Code CLI
claude mcp add datahub \
-e DATAHUB_URL=https://your-datahub.example.com/api/graphql \
-e DATAHUB_TOKEN=your-token \
-- mcp-datahub
Docker
docker pull ghcr.io/txn2/mcp-datahub:v1.4.2
Verification
All release artifacts are signed with Cosign. Verify with:
cosign verify-blob --bundle mcp-datahub_1.4.2_linux_amd64.tar.gz.sigstore.json \
mcp-datahub_1.4.2_linux_amd64.tar.gz
mcp-datahub-v1.4.1
mcp-datahub v1.4.1 — DataHub 1.4.x Write Method Fixes
Fixes three broken write codepaths discovered when running against DataHub 1.4.x instances. All read paths from v1.4.0 were unaffected.
+131 lines added, -48 removed across 7 files | All modified lines covered by tests | All CI checks pass
Closes #106
Bug Fixes
Structured Property Mutations Failed with SubselectionRequired
UpsertStructuredProperties and RemoveStructuredProperties mutations were missing a required GraphQL selection set. DataHub 1.4.x returns StructuredProperties! (a non-scalar type) from these mutations, which requires the client to specify which fields to select. Without it, the GraphQL server rejected every call with a SubselectionRequired validation error.
Fix: Added { properties { structuredProperty { urn } } } selection set to both mutations and updated response structs to match. The returned data is not used — the selection set exists solely to satisfy GraphQL's type system.
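As an illustration of the fix, the sketch below pairs a mutation that carries the selection set with a response struct shaped to match it. The selection set is taken from the note above; the operation name, the `UpsertStructuredPropertiesInput` type name, and the Go identifiers are assumptions for this example and should be checked against the DataHub schema in use.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// upsertMutation includes the selection set required because the mutation
// returns a non-scalar type. The input type name is an assumption.
const upsertMutation = `mutation upsert($input: UpsertStructuredPropertiesInput!) {
  upsertStructuredProperties(input: $input) {
    properties { structuredProperty { urn } }
  }
}`

// upsertResponse mirrors the selection set so the response decodes cleanly,
// even though callers ignore the returned data.
type upsertResponse struct {
	UpsertStructuredProperties struct {
		Properties []struct {
			StructuredProperty struct {
				URN string `json:"urn"`
			} `json:"structuredProperty"`
		} `json:"properties"`
	} `json:"upsertStructuredProperties"`
}

// countReturnedProperties decodes the "data" object of a response and
// reports how many properties the server echoed back.
func countReturnedProperties(data []byte) (int, error) {
	var resp upsertResponse
	if err := json.Unmarshal(data, &resp); err != nil {
		return 0, err
	}
	return len(resp.UpsertStructuredProperties.Properties), nil
}

func main() {
	fmt.Println(upsertMutation)
	sample := []byte(`{"upsertStructuredProperties":{"properties":[{"structuredProperty":{"urn":"urn:li:structuredProperty:io.acryl.privacy.retentionTime"}}]}}`)
	n, err := countReturnedProperties(sample)
	fmt.Println(n, err)
}
```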
RaiseIncident Failed with "at least 1 resource urn"
The RaiseIncident mutation was sending resourceUrns as an array of {"urn":"..."} objects. DataHub 1.4.x expects resourceUrn — a singular String! field.
Fix: Changed to send resourceUrn with the first element as a plain string. Added input validation that returns a clear error when no resource URNs are provided.
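A minimal sketch of the validation and field change described above. The struct fields, the error value, and the `buildRaiseIncidentVariables` helper are hypothetical; only the singular `resourceUrn` string and the "no resource URNs" check come from the release notes.

```go
package main

import (
	"errors"
	"fmt"
)

// RaiseIncidentInput loosely mirrors the input described in these notes.
// Field names are assumptions for this sketch.
type RaiseIncidentInput struct {
	Type         string
	Title        string
	Description  string
	ResourceURNs []string
}

var errNoResourceURN = errors.New("raise incident: at least one resource URN is required")

// buildRaiseIncidentVariables validates the input and builds GraphQL
// variables that send the first URN as a plain string under "resourceUrn".
func buildRaiseIncidentVariables(in RaiseIncidentInput) (map[string]any, error) {
	if len(in.ResourceURNs) == 0 {
		return nil, errNoResourceURN
	}
	return map[string]any{
		"input": map[string]any{
			"type":        in.Type,
			"title":       in.Title,
			"description": in.Description,
			"resourceUrn": in.ResourceURNs[0],
		},
	}, nil
}

func main() {
	_, err := buildRaiseIncidentVariables(RaiseIncidentInput{Title: "Pipeline down"})
	fmt.Println("empty input:", err)

	vars, err := buildRaiseIncidentVariables(RaiseIncidentInput{
		Type:         "OPERATIONAL",
		Title:        "Pipeline down",
		ResourceURNs: []string{"urn:li:dataset:(urn:li:dataPlatform:snowflake,db.users,PROD)"},
	})
	fmt.Println(vars, err)
}
```

Failing before the request is sent gives callers a local, readable error instead of the server's "at least 1 resource urn" response.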
UpdateDescription Rejected Domain and GlossaryTerm Entities
UpdateDescription returned ErrUnsupportedEntityType for domain and glossaryTerm URNs. The descriptionAspectMap did not include these entity types, even though DataHub's entity registry confirms domainProperties and glossaryTermInfo are registered aspects writable via the REST ingest proposal API.
Fix: Added both mappings to descriptionAspectMap:
- domain → domainProperties (field: description)
- glossaryTerm → glossaryTermInfo (field: definition, matching the GlossaryTermInfo PDL schema)
Also expanded globalTagsSupportedTypes and glossaryTermsSupportedTypes to include domain, glossaryTerm, and glossaryNode, enabling tag and glossary term operations on these entity types.
Upgrade Notes
- No breaking changes. All fixes are backward compatible.
- No configuration changes required.
- Users who were working around the structured property or incident bugs by calling the DataHub API directly can now use the MCP tools.
- UpdateDescription on domain and glossaryTerm entities now works; previously these calls returned errors.
Installation
Claude Desktop (macOS/Windows)
Download the .mcpb bundle for your platform and double-click to install:
- macOS Apple Silicon (M1/M2/M3/M4): mcp-datahub_1.4.1_darwin_arm64.mcpb
- macOS Intel: mcp-datahub_1.4.1_darwin_amd64.mcpb
- Windows: mcp-datahub_1.4.1_windows_amd64.mcpb
Homebrew (macOS)
brew install txn2/tap/mcp-datahub
Claude Code CLI
claude mcp add datahub \
-e DATAHUB_URL=https://your-datahub.example.com/api/graphql \
-e DATAHUB_TOKEN=your-token \
-- mcp-datahub
Docker
docker pull ghcr.io/txn2/mcp-datahub:v1.4.1
Verification
All release artifacts are signed with Cosign. Verify with:
cosign verify-blob --bundle mcp-datahub_1.4.1_linux_amd64.tar.gz.sigstore.json \
mcp-datahub_1.4.1_linux_amd64.tar.gz
mcp-datahub-v1.4.0
mcp-datahub v1.4.0 — DataHub 1.4.x Feature Support
Full support for DataHub 1.4.x features: structured properties, incidents, data contracts, semantic search, and context documents — all delivered through existing MCP tools with zero new tools added. Gracefully degrades on DataHub 1.3.x deployments.
+3,483 lines added across 21 files | 96.6% patch coverage | All CI checks pass
Closes #94 (DataHub 1.4.x Compatibility & Feature Support tracking issue), #88, #89, #90, #91, #92, #93
Highlights
Zero New MCP Tools
All DataHub 1.4.x features are delivered through the existing datahub_get_entity and datahub_search tools. The MCP tool count remains at 16 (9 read + 7 write). This keeps the AI assistant's tool surface minimal and avoids tool explosion.
Backward Compatible
All new features gracefully degrade on DataHub 1.3.x:
- Read methods return nil/empty results when the server doesn't support the feature
- Write methods propagate errors, since callers need to know mutations failed
- Semantic search propagates errors, since the caller explicitly chose that mode
- No configuration flags needed — degradation is automatic via GraphQL error detection
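The degradation rules in this list can be sketched as a small wrapper around read calls. Everything here is hypothetical: the release notes do not say how unsupported-feature errors are detected, so the substring checks in `isUnsupportedFeature` are an assumption, as are the `readWithFallback` and `simulatedFetch` helpers.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// isUnsupportedFeature guesses whether an error means the server's schema
// lacks the requested field or type. Matching on these substrings is an
// assumption for this sketch, not the library's detection logic.
func isUnsupportedFeature(err error) bool {
	if err == nil {
		return false
	}
	msg := err.Error()
	return strings.Contains(msg, "FieldUndefined") || strings.Contains(msg, "UnknownType")
}

// readWithFallback runs a read and swallows unsupported-feature errors,
// returning the zero value instead. Any other error is propagated.
func readWithFallback[T any](fetch func() (T, error)) (T, error) {
	v, err := fetch()
	if isUnsupportedFeature(err) {
		var zero T
		return zero, nil
	}
	return v, err
}

// simulatedFetch stands in for a client call; a non-empty errMsg makes it fail.
func simulatedFetch(errMsg string) func() ([]string, error) {
	return func() ([]string, error) {
		if errMsg != "" {
			return nil, errors.New(errMsg)
		}
		return []string{"urn:li:structuredProperty:example"}, nil
	}
}

func main() {
	// Older server: the field is unknown, so the read degrades to nil.
	props, err := readWithFallback(simulatedFetch("Validation error (FieldUndefined)"))
	fmt.Println(props, err)

	// Unrelated failure: still reported to the caller.
	_, err = readWithFallback(simulatedFetch("connection refused"))
	fmt.Println(err)
}
```

Write methods and semantic search would skip this wrapper and return the error unchanged.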
Concurrent Entity Enrichment
datahub_get_entity now fetches structured properties, active incidents, and data contract status in parallel using goroutines, reducing enrichment latency from 3 sequential API round-trips to 1 concurrent batch.
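One way to run the three lookups in parallel is a `sync.WaitGroup` with one goroutine per lookup, each writing to its own field. The sketch below assumes that shape; the `fetchers` struct, the simplified result types, and the choice to leave a field empty when its lookup fails are illustrative assumptions rather than the project's implementation.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// enrichment holds simplified stand-ins for the three enrichment results.
type enrichment struct {
	StructuredProperties []string
	ActiveIncidents      []string
	DataContractStatus   string
}

// fetchers is a hypothetical bundle of the three read calls.
type fetchers struct {
	properties func(ctx context.Context, urn string) ([]string, error)
	incidents  func(ctx context.Context, urn string) ([]string, error)
	contract   func(ctx context.Context, urn string) (string, error)
}

// enrich runs the three lookups concurrently. Each goroutine writes a
// distinct field, so no mutex is needed; wg.Wait orders the reads after
// the writes. A failed lookup leaves its field at the zero value.
func enrich(ctx context.Context, urn string, f fetchers) enrichment {
	var (
		wg  sync.WaitGroup
		out enrichment
	)
	wg.Add(3)
	go func() {
		defer wg.Done()
		if v, err := f.properties(ctx, urn); err == nil {
			out.StructuredProperties = v
		}
	}()
	go func() {
		defer wg.Done()
		if v, err := f.incidents(ctx, urn); err == nil {
			out.ActiveIncidents = v
		}
	}()
	go func() {
		defer wg.Done()
		if v, err := f.contract(ctx, urn); err == nil {
			out.DataContractStatus = v
		}
	}()
	wg.Wait()
	return out
}

// demoEnrich wires enrich to canned responses for demonstration.
func demoEnrich() enrichment {
	return enrich(context.Background(), "urn:li:dataset:example", fetchers{
		properties: func(context.Context, string) ([]string, error) { return []string{"retentionTime"}, nil },
		incidents:  func(context.Context, string) ([]string, error) { return []string{"Pipeline down"}, nil },
		contract:   func(context.Context, string) (string, error) { return "PASSING", nil },
	})
}

func main() {
	fmt.Printf("%+v\n", demoEnrich())
}
```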
New Features
Structured Properties (#88)
Typed custom metadata (retention policies, data classifications, SLAs) that replaces freeform custom properties for governed use cases.
New client methods:
| Method | Type | Fallback on 1.3.x |
|---|---|---|
| GetStructuredProperties(ctx, urn) | Read | nil (graceful) |
| ListStructuredPropertyDefinitions(ctx) | Read | nil (graceful) |
| UpsertStructuredProperties(ctx, urn, properties) | Write | error |
| RemoveStructuredProperties(ctx, urn, propertyURNs) | Write | error |
Supported entity types: Dataset, Dashboard, Chart, DataFlow, DataJob, Container, DataProduct
New types:
- types.StructuredPropertyDefinition: property schema (qualified name, display name, value type, cardinality, allowed values, applicable entity types)
- types.StructuredPropertyValue: property assignment on an entity (property URN, definition, values)
- types.StructuredPropertyInput: input for upsert operations (property URN, values)
- types.AllowedValue: permitted value with optional description
MCP integration: datahub_get_entity responses now include a structured_properties field (array of property values with full definitions). Nil/omitted on DataHub < 1.4.x.
GraphQL implementation: Uses a named StructuredProps fragment to avoid duplicating the field selection across 7 entity type inline fragments. Handles the StringValue | NumberValue GraphQL union type.
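Decoding a string-or-number union usually means checking which member's field is present. The sketch below assumes the union members expose `stringValue` and `numberValue` fields in the JSON response; those field names and the `decodeValues` helper are assumptions for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// propertyValue models one union member. Pointer fields distinguish
// "absent" from a zero value. JSON field names are assumed.
type propertyValue struct {
	StringValue *string  `json:"stringValue"`
	NumberValue *float64 `json:"numberValue"`
}

// decodeValues flattens a list of union members into plain Go values.
// The result is allocated with make so an empty list encodes as [] rather
// than null if it is later marshaled.
func decodeValues(raw []byte) ([]any, error) {
	var vals []propertyValue
	if err := json.Unmarshal(raw, &vals); err != nil {
		return nil, err
	}
	out := make([]any, 0, len(vals))
	for _, v := range vals {
		switch {
		case v.StringValue != nil:
			out = append(out, *v.StringValue)
		case v.NumberValue != nil:
			out = append(out, *v.NumberValue)
		}
	}
	return out, nil
}

func main() {
	vals, err := decodeValues([]byte(`[{"numberValue":30},{"stringValue":"PII"}]`))
	fmt.Println(vals, err)
}
```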
Incidents (#89)
Asset health management — read active incidents, raise new incidents, resolve incidents.
New client methods:
| Method | Type | Fallback on 1.3.x |
|---|---|---|
| GetIncidents(ctx, urn) | Read | nil (graceful) |
| RaiseIncident(ctx, input) | Write | error |
| ResolveIncident(ctx, incidentURN, message) | Write | error |
Supported entity types: Dataset, Dashboard, Chart, DataFlow, DataJob, Container, DataProduct
New types:
- types.Incident: incident details (URN, type, custom type, title, description, state, source, created/updated timestamps and actors)
- types.IncidentResult: incident list with total count
- types.RaiseIncidentInput: input for creating incidents (type, title, description, resource URNs)
MCP integration: datahub_get_entity responses now include an active_incidents field with total count and incident details. Nil/omitted on DataHub < 1.4.x or when no active incidents exist.
GraphQL implementation: Uses a named IncidentFields fragment. Fetches up to MaxLimit incidents since this is enrichment data (not user-paginated). Only retrieves ACTIVE state incidents.
Data Contracts (#90)
Freshness, schema, and data quality assertion results bundled into a single pass/fail quality signal per dataset.
New client method:
| Method | Type | Fallback on 1.3.x |
|---|---|---|
| GetDataContract(ctx, datasetURN) | Read | nil (graceful) |
Datasets only. The DataHub GraphQL API only supports data contracts on dataset entities.
New types:
- types.DataContract: contract status (PASSING/FAILING) with individual assertion results
- types.AssertionResult: assertion URN, type (FRESHNESS, SCHEMA, DATA_QUALITY), result type, and platform-specific native results
MCP integration: datahub_get_entity responses for datasets now include a data_contract field. Nil/omitted for non-dataset entities or DataHub < 1.4.x.
Enriched Entity Responses (#91)
datahub_get_entity automatically includes all three 1.4.x features in every response:
{
"urn": "urn:li:dataset:(urn:li:dataPlatform:snowflake,db.users,PROD)",
"type": "DATASET",
"name": "users",
"structured_properties": [
{
"property_urn": "urn:li:structuredProperty:io.acryl.privacy.retentionTime",
"definition": { "qualified_name": "io.acryl.privacy.retentionTime", "value_type": "NUMBER", ... },
"values": [30]
}
],
"active_incidents": {
"total": 1,
"incidents": [
{ "urn": "urn:li:incident:abc123", "type": "OPERATIONAL", "title": "Pipeline down", "state": "ACTIVE" }
]
},
"data_contract": {
"status": "PASSING",
"assertion_results": [
{ "assertion_urn": "urn:li:assertion:freshness-1", "type": "FRESHNESS", "result_type": "SUCCESS" }
]
}
}
All three fields use omitempty JSON tags, so they are completely absent from responses on DataHub 1.3.x, keeping payloads lean.
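The effect of `omitempty` on nil slices and nil pointers can be checked with a trimmed-down struct. The types below are reduced stand-ins for illustration, not the library's full definitions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Reduced stand-ins for the enrichment types.
type StructuredPropertyValue struct {
	PropertyURN string `json:"property_urn"`
}
type IncidentResult struct {
	Total int `json:"total"`
}
type DataContract struct {
	Status string `json:"status"`
}

// Entity keeps only the fields needed to show the omitempty behavior.
type Entity struct {
	URN                  string                    `json:"urn"`
	StructuredProperties []StructuredPropertyValue `json:"structured_properties,omitempty"`
	ActiveIncidents      *IncidentResult           `json:"active_incidents,omitempty"`
	DataContract         *DataContract             `json:"data_contract,omitempty"`
}

// render marshals an entity and returns the JSON text, or "" on error.
func render(e Entity) string {
	b, err := json.Marshal(e)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	// All enrichment fields are nil, so only the URN is emitted.
	fmt.Println(render(Entity{URN: "urn:li:dataset:example"}))
	// {"urn":"urn:li:dataset:example"}

	// A populated pointer field appears as usual.
	fmt.Println(render(Entity{
		URN:          "urn:li:dataset:example",
		DataContract: &DataContract{Status: "PASSING"},
	}))
	// {"urn":"urn:li:dataset:example","data_contract":{"status":"PASSING"}}
}
```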
Semantic Search (#92)
Vector-based natural language search using the existing datahub_search tool.
New mode parameter on datahub_search:
- keyword (default): existing behavior, no change
- semantic: vector embedding search via the semanticSearchAcrossEntities GraphQL query
Requirements: DataHub 1.4.x with OpenSearch 2.19.3+
New client method:
| Method | Type | Fallback on 1.3.x |
|---|---|---|
| SemanticSearch(ctx, query, opts...) | Read | error (explicit mode) |
Semantic search propagates errors (rather than returning empty results) because the caller explicitly chose this mode and needs to know it's unavailable.
Input validation: Unrecognized mode values (e.g., "vector", "hybrid") are rejected with a clear error message before any API call is made.
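A validation step of this kind might look like the sketch below. The `validateSearchMode` name and the error wording are hypothetical; the accepted values and the keyword default come from the notes above.

```go
package main

import "fmt"

// validateSearchMode normalizes the mode parameter and rejects anything
// other than "keyword" or "semantic". An empty mode falls back to the
// keyword default. Runs before any API call is attempted.
func validateSearchMode(mode string) (string, error) {
	switch mode {
	case "", "keyword":
		return "keyword", nil
	case "semantic":
		return "semantic", nil
	default:
		return "", fmt.Errorf("invalid search mode %q: must be \"keyword\" or \"semantic\"", mode)
	}
}

func main() {
	for _, m := range []string{"", "semantic", "hybrid"} {
		mode, err := validateSearchMode(m)
		fmt.Printf("input=%q mode=%q err=%v\n", m, mode, err)
	}
}
```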
GraphQL implementation: The semantic search query supports 9 entity types (Dataset, Dashboard, Chart, DataFlow, DataJob, Container, DataProduct, GlossaryTerm, Tag) with full entity detail including ownership, tags, and domain information. Reuses the existing searchResultGQL parsing logic.
Context Documents (#93)
DataHub 1.4.x context documents are searchable via the existing datahub_search tool with entity_type: DOCUMENT. No new tools or client methods needed.
Entity Type Changes
The types.Entity struct gains three new fields:
// Nil when running against DataHub < 1.4.x
StructuredProperties []StructuredPropertyValue `json:"structured_properties,omitempty"`
ActiveIncidents *IncidentResult `json:"active_incidents,omitempty"`
DataContract *DataContract `json:"data_contract,omitempty"`
Breaking Change
The DataHubClient interface gains 9 new methods. Consumers implementing this interface will need to add stubs for all 9 methods. This affects custom implementations only — users of the provided client.Client struct are unaffected.
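For custom implementations such as test doubles, one common Go approach to absorbing interface growth is to embed a base type whose methods all return a "not implemented" error, then override only what is needed. The sketch below uses a two-method stand-in interface to show the pattern; `unimplementedClient`, `myClient`, and `GetEntity` are hypothetical names, and nothing here implies the library ships such a base type.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// DataHubClient is a two-method stand-in for the real, much larger interface.
type DataHubClient interface {
	GetEntity(ctx context.Context, urn string) (string, error)
	GetDataContract(ctx context.Context, datasetURN string) (string, error)
}

var errNotImplemented = errors.New("not implemented")

// unimplementedClient (hypothetical) provides a default for every method.
type unimplementedClient struct{}

func (unimplementedClient) GetEntity(context.Context, string) (string, error) {
	return "", errNotImplemented
}

func (unimplementedClient) GetDataContract(context.Context, string) (string, error) {
	return "", errNotImplemented
}

// myClient embeds the defaults and overrides only GetEntity, so methods
// added to the base type later do not break it.
type myClient struct {
	unimplementedClient
}

func (myClient) GetEntity(_ context.Context, urn string) (string, error) {
	return "entity:" + urn, nil
}

// Compile-time check that myClient satisfies the interface.
var _ DataHubClient = myClient{}

// callDataContract exercises the method that myClient did not override.
func callDataContract(c DataHubClient) error {
	_, err := c.GetDataContract(context.Background(), "urn:li:dataset:example")
	return err
}

func main() {
	fmt.Println(callDataContract(myClient{}))
}
```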
New interface methods:
// Structured properties
GetStructuredProperties(ctx, urn) ([]types.StructuredPropertyValue, error)
ListStructuredPropertyDefinitions(ctx) ([]types.StructuredPropertyDefinition, error)
UpsertStructuredProperties(ctx, urn, []types.StructuredPropertyInput) error
RemoveStructuredProperties(ctx, urn, []string) error
// Incidents
GetIncidents(ctx, urn) (*types.IncidentResult, error)
RaiseIncident(ctx, types.RaiseIncidentInput) (string, error)
ResolveIncident(ctx, incidentURN, message) error
// Data contracts
GetDataContract(ctx, datasetURN) (*types.DataContract, error)
// Semantic search
SemanticSearch(ctx, query, ...client.SearchOption) (*types.SearchResult, error)
Internal Improvements
- Entity type constants: Extracted string literals ("dataset", "dashboard", etc.) into package-level constants in pkg/client/write.go to fix goconst lint violations
- Named GraphQL fragments: StructuredProps and IncidentFields fragments reduce query duplication across entity type inline fragments
- Debug logging: All graceful degradation paths (GetStructuredProperties, ListStructuredPropertyDefinitions, GetIncidents, GetDataContract) log debug messages when errors are swallowed, visible when debug logging is enabled
Test Coverage
- 96.6% patch coverage (12 lines uncovered out of 353 new lines)
- 90.18% project coverage (+0.67% from v1.3.0)
-...