
Releases: txn2/mcp-datahub

mcp-datahub-v1.8.1

04 Apr 21:25
ee23900

mcp-datahub v1.8.1 — Fix OutputSchema Validation on Empty Search Results

Fixes a bug where datahub_search returned "entities": null instead of "entities": [] when a search matched zero results, causing MCP OutputSchema validation failures. This affected both keyword and semantic search modes.

Bug fix

datahub_search returns null entities on zero results (#131)

When a search query matched no entities, SearchResult.Entities was never initialized (Go nil slice), and json.Marshal serialized it as null. The OutputSchema declares entities as "type": "array" which rejects null, causing clients to receive a validation error instead of an empty result set.

Root cause: doSearchAcrossEntities and Search created the SearchResult struct without initializing the Entities slice. The slice was only populated via append in a loop over results — zero results meant the loop never executed and Entities stayed nil.

Fix: Initialize Entities with make([]SearchEntity, 0, len(results)) in both code paths. This produces "entities": [] in JSON and pre-allocates the right capacity when results do exist.

Affected methods:

  • client.SearchAcrossEntities() (keyword + semantic via doSearchAcrossEntities)
  • client.Search() (legacy type-scoped search)
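The nil-versus-empty distinction behind this fix can be reproduced in isolation. The following is a minimal sketch, not the library's code: SearchEntity and SearchResult are trimmed stand-ins for the real client types, and marshalEntities is a hypothetical helper that toggles the initialization described above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SearchEntity and SearchResult are simplified stand-ins for the client
// types (assumption: the real structs carry more fields).
type SearchEntity struct {
	URN string `json:"urn"`
}

type SearchResult struct {
	Entities []SearchEntity `json:"entities"`
}

// marshalEntities (hypothetical helper) builds a SearchResult from urns and
// returns its JSON. With initialize=false the slice stays nil when urns is
// empty, which is the pre-fix behavior.
func marshalEntities(urns []string, initialize bool) string {
	var res SearchResult
	if initialize {
		res.Entities = make([]SearchEntity, 0, len(urns))
	}
	for _, u := range urns {
		res.Entities = append(res.Entities, SearchEntity{URN: u})
	}
	b, err := json.Marshal(res)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	fmt.Println(marshalEntities(nil, false)) // {"entities":null}
	fmt.Println(marshalEntities(nil, true))  // {"entities":[]}
}
```

A nil slice and an empty slice behave the same for len and range, so the difference only surfaces at the JSON boundary, which is why schema validation was the first place it showed up.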

Upgrading

This is a patch release with no breaking changes. All users of v1.8.0 should upgrade.


Installation

Claude Desktop (macOS/Windows)

Download the .mcpb bundle for your platform and double-click to install:

  • macOS Apple Silicon (M1/M2/M3/M4): mcp-datahub_1.8.1_darwin_arm64.mcpb
  • macOS Intel: mcp-datahub_1.8.1_darwin_amd64.mcpb
  • Windows: mcp-datahub_1.8.1_windows_amd64.mcpb

Homebrew (macOS)

brew install txn2/tap/mcp-datahub

Claude Code CLI

claude mcp add datahub \
  -e DATAHUB_URL=https://your-datahub.example.com/api/graphql \
  -e DATAHUB_TOKEN=your-token \
  -- mcp-datahub

Docker

docker pull ghcr.io/txn2/mcp-datahub:v1.8.1

Verification

All release artifacts are signed with Cosign. Verify with:

cosign verify-blob --bundle mcp-datahub_1.8.1_linux_amd64.tar.gz.sigstore.json \
  mcp-datahub_1.8.1_linux_amd64.tar.gz

mcp-datahub-v1.8.0

04 Apr 19:45
65ce402

mcp-datahub v1.8.0 — Advanced Search Filters via searchAcrossEntities

Upgrades datahub_search to use DataHub's searchAcrossEntities GraphQL API, unlocking advanced field-level filtering and multi-type search. Agents can now answer questions like "find tables with an email column" or "datasets owned by data-eng on the trino platform" without scanning every schema individually.

+1,405 lines | -178 lines | 25 files changed

Highlights

  • Advanced filters — new filters parameter supports DataHub's full filter vocabulary: fieldPaths (column names), fieldTags (column-level tags), fieldGlossaryTerms, fieldDescriptions, platform, domains, owners, tags, glossaryTerms, typeNames
  • Multi-type search — new types parameter searches across multiple entity types in a single call (e.g., ["DATASET", "DASHBOARD"])
  • Full backward compatibility — existing entity_type and simple query usage unchanged; defaults to DATASET
  • Semantic search parity: SemanticSearch now also supports types and filters

New tool parameters

datahub_search(
  query: "*",
  types: ["DATASET", "DASHBOARD"],
  filters: [
    { field: "fieldPaths", values: ["email"], condition: "CONTAIN" },
    { field: "platform", value: "urn:li:dataPlatform:trino" },
    { field: "tags", values: ["urn:li:tag:deprecated"], negated: true }
  ]
)

Filter fields reference

| Field | Matches |
| --- | --- |
| fieldPaths | Column/schema field names |
| fieldTags | Column-level tags |
| fieldGlossaryTerms | Column-level glossary terms |
| fieldDescriptions | Column-level descriptions |
| platform | Data platform (URN format) |
| domains | Domain (URN format) |
| owners | Owner (URN format) |
| tags | Entity-level tags |
| glossaryTerms | Entity-level glossary terms |
| typeNames | Entity subtypes |

New client API

| Addition | Description |
| --- | --- |
| client.SearchAcrossEntities() | Search with types and filters via searchAcrossEntities GraphQL endpoint |
| client.SearchFilter | Filter struct: Field, Values, Condition, Negated |
| client.WithTypes() | Search option for multi-type search |
| client.WithSearchFilters() | Search option for advanced field-level filters |
| client.DefaultEntityType | Constant for the default entity type ("DATASET") |

Bug fixes

  • Context documents default to visible — documents created via datahub_create what=document now default to GlobalContext: true and Status: "PUBLISHED" so they appear in the DataHub UI immediately (#122, #123)

Other changes

  • CI: bump actions/setup-go, codecov/codecov-action, sigstore/cosign-installer, actions/deploy-pages (#128)
  • SemanticSearchQuery now shares entity fragments with SearchAcrossEntitiesQuery (DataFlow, Tag, Document fragments added)
  • Shared doSearchAcrossEntities helper eliminates duplication between keyword and semantic code paths
  • Input validation rejects filters with empty field or missing value/values
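The filter validation rule in the last bullet can be sketched as follows. This is an illustration under assumptions, not the library's implementation: the SearchFilter field names come from the client API table above, but the field types, the validateFilters name, and the error messages are hypothetical.

```go
package main

import "fmt"

// SearchFilter mirrors the fields named in the release notes
// (Field, Values, Condition, Negated). The Go types are assumptions.
type SearchFilter struct {
	Field     string
	Values    []string
	Condition string
	Negated   bool
}

// validateFilters (hypothetical) rejects filters with an empty field
// or with no values, as the release notes describe.
func validateFilters(filters []SearchFilter) error {
	for i, f := range filters {
		if f.Field == "" {
			return fmt.Errorf("filter %d: field is required", i)
		}
		if len(f.Values) == 0 {
			return fmt.Errorf("filter %d (%s): at least one value is required", i, f.Field)
		}
	}
	return nil
}

func main() {
	ok := []SearchFilter{{Field: "fieldPaths", Values: []string{"email"}, Condition: "CONTAIN"}}
	fmt.Println(validateFilters(ok))

	bad := []SearchFilter{{Field: "platform"}}
	fmt.Println(validateFilters(bad))
}
```

Validating before the GraphQL call keeps malformed filters from reaching DataHub, where the resulting error would be harder for an agent to interpret.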

Compatibility

| DataHub Version | Search Features |
| --- | --- |
| 1.3.x+ (minimum) | All search features including types, filters, searchAcrossEntities |
| 1.4.x+ | + Semantic search mode (mode: "semantic") with types and filters |

Installation

Claude Desktop (macOS/Windows)

Download the .mcpb bundle for your platform and double-click to install:

  • macOS Apple Silicon (M1/M2/M3/M4): mcp-datahub_1.8.0_darwin_arm64.mcpb
  • macOS Intel: mcp-datahub_1.8.0_darwin_amd64.mcpb
  • Windows: mcp-datahub_1.8.0_windows_amd64.mcpb

Homebrew (macOS)

brew install txn2/tap/mcp-datahub

Claude Code CLI

claude mcp add datahub \
  -e DATAHUB_URL=https://your-datahub.example.com/api/graphql \
  -e DATAHUB_TOKEN=your-token \
  -- mcp-datahub

Docker

docker pull ghcr.io/txn2/mcp-datahub:v1.8.0

Verification

All release artifacts are signed with Cosign. Verify with:

cosign verify-blob --bundle mcp-datahub_1.8.0_linux_amd64.tar.gz.sigstore.json \
  mcp-datahub_1.8.0_linux_amd64.tar.gz

mcp-datahub-v1.7.1

30 Mar 03:21
1e10f9f

mcp-datahub v1.7.1 — Default Documents to Visible and Published

Fixes document creation defaults so that documents are immediately visible in the DataHub UI. Previously, documents created via UpsertContextDocument or datahub_create what=document were hidden because showInGlobalContext defaulted to false and status defaulted to UNPUBLISHED.


Problem

Both document creation paths produced invisible documents:

| Creation Path | GlobalContext | Status | Visible in UI? |
| --- | --- | --- | --- |
| UpsertContextDocument → createContextDocument | false (Go zero value) | PUBLISHED | No (hidden from global search) |
| datahub_create what=document → handleCreateDocument | false (Go zero value) | empty (server default = UNPUBLISHED) | No (draft and hidden) |

This affected every consumer of UpsertContextDocument (used by mcp-data-platform's apply_knowledge tool for add_context_document workflows) and every datahub_create what=document call that didn't explicitly set both flags.

Fix (#123)

Both paths now default to visible, published documents:

| Creation Path | GlobalContext | Status |
| --- | --- | --- |
| createContextDocument | true | PUBLISHED (unchanged) |
| handleCreateDocument | true | PUBLISHED |

Explicit overrides preserved

The global_context field on CreateInput was changed from bool to *bool, allowing callers to distinguish "not provided" (defaults to true) from "explicitly set to false":

// Default: visible and published
{"what": "document", "name": "My Doc", "description": "Content"}

// Explicit draft, hidden from global search
{"what": "document", "name": "My Draft", "status": "UNPUBLISHED", "global_context": false}

Changes

| File | Change |
| --- | --- |
| pkg/client/context_documents.go | Added GlobalContext: true to CreateDocumentInput in createContextDocument |
| pkg/tools/write_create.go | Default Status to "PUBLISHED" when empty; changed GlobalContext from bool to *bool with true default |
| pkg/tools/write_create_test.go | Added tests for default behavior and explicit override |
| pkg/client/context_documents_test.go | Added showInGlobalContext: true assertion in create path |

Closes #122


Compatibility

| Requirement | Version |
| --- | --- |
| Go | 1.25+ |
| DataHub (minimum) | 1.3.x |
| DataHub (full feature set incl. documents) | 1.4.x+ |
| DataHub (schema validated against) | v1.5.0.1 |

Note: The global_context field type change from bool to *bool in CreateInput is a breaking change for Go callers that set the field directly. JSON callers (MCP tool users) are unaffected — true, false, and omitted all work as expected.
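The reason for the pointer type can be seen in a small sketch. The createInput struct and resolveDefaults function below are hypothetical, trimmed stand-ins rather than the actual CreateInput or handler code. They only illustrate how a *bool lets an omitted global_context default to true while an explicit false is preserved.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// createInput is a trimmed, hypothetical stand-in for CreateInput.
type createInput struct {
	What          string `json:"what"`
	Name          string `json:"name"`
	Status        string `json:"status,omitempty"`
	GlobalContext *bool  `json:"global_context,omitempty"`
}

// resolveDefaults applies the defaults described in these notes:
// PUBLISHED and globally visible unless the caller says otherwise.
func resolveDefaults(in createInput) (status string, globalContext bool) {
	status = in.Status
	if status == "" {
		status = "PUBLISHED"
	}
	globalContext = true
	if in.GlobalContext != nil {
		globalContext = *in.GlobalContext
	}
	return status, globalContext
}

// parseAndResolve decodes a JSON payload and returns the effective settings.
func parseAndResolve(payload string) (string, bool, error) {
	var in createInput
	if err := json.Unmarshal([]byte(payload), &in); err != nil {
		return "", false, err
	}
	status, global := resolveDefaults(in)
	return status, global, nil
}

func main() {
	fmt.Println(parseAndResolve(`{"what":"document","name":"My Doc"}`))
	fmt.Println(parseAndResolve(`{"what":"document","name":"My Draft","status":"UNPUBLISHED","global_context":false}`))
}
```

With a plain bool, the omitted case and the explicit false case both decode to false, so a true default could never be overridden. Go callers that previously assigned a bool literal now need to pass a pointer to a bool variable.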


Installation

Claude Desktop (macOS/Windows)

Download the .mcpb bundle for your platform and double-click to install:

  • macOS Apple Silicon (M1/M2/M3/M4): mcp-datahub_1.7.1_darwin_arm64.mcpb
  • macOS Intel: mcp-datahub_1.7.1_darwin_amd64.mcpb
  • Windows: mcp-datahub_1.7.1_windows_amd64.mcpb

Homebrew (macOS)

brew install txn2/tap/mcp-datahub

Claude Code CLI

claude mcp add datahub \
  -e DATAHUB_URL=https://your-datahub.example.com/api/graphql \
  -e DATAHUB_TOKEN=your-token \
  -- mcp-datahub

Docker

docker pull ghcr.io/txn2/mcp-datahub:v1.7.1

Verification

All release artifacts are signed with Cosign. Verify with:

cosign verify-blob --bundle mcp-datahub_1.7.1_linux_amd64.tar.gz.sigstore.json \
  mcp-datahub_1.7.1_linux_amd64.tar.gz

mcp-datahub-v1.7.0

29 Mar 23:54
7a406a8

mcp-datahub v1.7.0 — GraphQL Schema Alignment & Validation Infrastructure

Corrects GraphQL query field paths across four client modules by validating every query against the upstream DataHub schema source files. Adds automated schema validation infrastructure to prevent future drift — all 59 query/mutation constants are now checked against the official .graphql definitions from datahub-project/datahub.

+25,426 lines | -318 lines | 46 files changed


Highlights

GraphQL Query Corrections (#121)

Cross-referenced all GraphQL queries with the upstream DataHub schema files (datahub-graphql-core/src/main/resources/*.graphql) and corrected field paths that did not match the actual API:

| Module | Issue | Fix |
| --- | --- | --- |
| documents.go | DocumentRelatedAsset, DocumentRelatedDocument, DocumentParentDocument queried a direct urn field that doesn't exist on these wrapper types | Changed to relatedAssets { asset { urn } }, relatedDocuments { document { urn } }, parentDocument { document { urn } } per upstream documents.graphql |
| structured_properties.go | Fragment targeted non-existent type EntityStructuredPropertiesResult | Changed to StructuredProperties per upstream entity.graphql |
| data_contracts.go | Queried contract { result(refresh: false) { type assertionResults { ... } } }; the result field and its nested structure don't exist on DataContract | Rewrote to contract { properties { freshness/schema/dataQuality { assertion { urn } } } status { state } } per upstream contract.graphql |
| semantic_search.go | Used non-existent input type SemanticSearchInput | Changed to SearchAcrossEntitiesInput with searchAcrossEntities query per upstream search.graphql |

All corrections were verified against both the upstream .graphql source files (v1.4.0.3 and v1.5.0.1) and a live DataHub v1.4.0.3 instance.

Schema Validation Infrastructure (#121)

Adds automated, offline-capable validation of GraphQL queries against the upstream DataHub schema:

  • testdata/datahub-schema/ — 31 .graphql schema files synced from datahub-project/datahub at tag v1.5.0.1, checked into the repo for CI without network access
  • testdata/datahub-schema/sync.sh — downloads schema files from any tagged DataHub release
  • pkg/client/schema_validation_test.go — validates all 59 query/mutation constants against the schema: checks fragment targets, top-level query/mutation fields, inline fragment type names, and input type references
  • make schema-sync — download schema files for a target version
  • make schema-check — run schema validation (now part of make verify)

Workflow for targeting a new DataHub version:

DATAHUB_VERSION=v1.5.0.1 make schema-sync   # pull schema files
make schema-check                             # validate all queries

DataHub Version Compatibility Matrix

Updated CLAUDE.md with a verified compatibility matrix:

| DataHub Version | Features Available | Schema Validated |
| --- | --- | --- |
| 1.3.x+ (minimum) | All read tools, all write operations except documents | No (pre-dates schema sync) |
| 1.4.x+ (full) | + Documents (create/update/delete), semantic search | Yes (v1.4.0.3) |
| 1.5.x+ (current) | + Batch data product operations | Yes (v1.5.0.1) |

Schema files were diffed between v1.4.0.3 and v1.5.0.1; the only change is the addition of the batchAddToDataProducts and batchRemoveFromDataProducts mutations in entity.graphql. All types used by this library are identical across both versions.


Breaking Changes

types.AssertionResult simplified

The AssertionResult type in pkg/types/data_contracts.go was simplified to match the actual DataHub DataContract schema:

Removed fields:

  • ResultType string — the real API does not expose per-assertion result types through the contract query
  • NativeResults map[string]string — the real API does not expose native result details through the contract query

Before:

type AssertionResult struct {
    AssertionURN  string            `json:"assertion_urn"`
    Type          string            `json:"type"`
    ResultType    string            `json:"result_type"`
    NativeResults map[string]string `json:"native_results,omitempty"`
}

After:

type AssertionResult struct {
    AssertionURN string `json:"assertion_urn"`
    Type         string `json:"type"`
}

If you were reading ResultType or NativeResults from AssertionResult, those fields were never populated by the actual DataHub API.

DataContract.Status values changed

The Status field now contains the DataContractState enum value from the status.state field (e.g., "ACTIVE", "PENDING") rather than the previously unpopulated result.type field (which was intended to contain "PASSING" / "FAILING").


Compatibility

| Requirement | Version |
| --- | --- |
| Go | 1.25+ |
| DataHub (minimum) | 1.3.x |
| DataHub (full feature set incl. documents) | 1.4.x+ |
| DataHub (schema validated against) | v1.5.0.1 |

Installation

Claude Desktop (macOS/Windows)

Download the .mcpb bundle for your platform and double-click to install:

  • macOS Apple Silicon (M1/M2/M3/M4): mcp-datahub_1.7.0_darwin_arm64.mcpb
  • macOS Intel: mcp-datahub_1.7.0_darwin_amd64.mcpb
  • Windows: mcp-datahub_1.7.0_windows_amd64.mcpb

Homebrew (macOS)

brew install txn2/tap/mcp-datahub

Claude Code CLI

claude mcp add datahub \
  -e DATAHUB_URL=https://your-datahub.example.com/api/graphql \
  -e DATAHUB_TOKEN=your-token \
  -- mcp-datahub

Docker

docker pull ghcr.io/txn2/mcp-datahub:v1.7.0

Verification

All release artifacts are signed with Cosign. Verify with:

cosign verify-blob --bundle mcp-datahub_1.7.0_linux_amd64.tar.gz.sigstore.json \
  mcp-datahub_1.7.0_linux_amd64.tar.gz

mcp-datahub-v1.6.0

28 Mar 23:51
fef02a4

mcp-datahub v1.6.0 — Context Documents Client API & Security Update

Adds a high-level Context Documents client API for entity-scoped document CRUD, enabling downstream knowledge pipeline workflows in mcp-data-platform. Also resolves two security vulnerabilities via the go-sdk v1.4.1 upgrade.

+1,075 lines | -21 lines | 14 files changed

Closes #116


Highlights

Context Documents CRUD Client Methods (#117)

Three new convenience methods compose existing Document primitives into an entity-scoped API with simplified types for downstream consumption:

| Method | Signature | Description |
| --- | --- | --- |
| GetContextDocuments | (ctx, urn) → ([]ContextDocument, error) | Retrieves context documents linked to an entity via a dedicated GraphQL query with ownership and timestamp fields |
| UpsertContextDocument | (ctx, entityURN, doc) → (*ContextDocument, error) | Creates (empty ID) or updates (populated ID) a context document; returns the full document after operation |
| DeleteContextDocument | (ctx, documentID) → error | Deletes a context document by its ID |

New types in pkg/types/context_document.go:

  • ContextDocument — simplified, flattened view with ID, Title, Content, ContentType, Category, CreatedAt, UpdatedAt, Author
  • ContextDocumentAuthor — author identity (URN, Username) derived from ownership
  • ContextDocumentInput — upsert input (empty ID = create, populated = update)
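The "empty ID means create, populated ID means update" convention can be sketched against an in-memory store. Everything here is illustrative: the trimmed ContextDocumentInput, the memoryStore type, the generated ID format, and the not-found error on update are assumptions, and the real client talks to DataHub rather than a map.

```go
package main

import "fmt"

// ContextDocumentInput is a trimmed stand-in; the real type has more fields.
type ContextDocumentInput struct {
	ID      string
	Title   string
	Content string
}

// memoryStore is a hypothetical in-memory backend used only for illustration.
type memoryStore struct {
	docs   map[string]ContextDocumentInput
	nextID int
}

func newMemoryStore() *memoryStore {
	return &memoryStore{docs: make(map[string]ContextDocumentInput)}
}

// Upsert creates a document when in.ID is empty and updates it otherwise.
// Returning an error for an unknown ID is an assumption of this sketch.
func (s *memoryStore) Upsert(in ContextDocumentInput) (ContextDocumentInput, error) {
	if in.ID == "" {
		s.nextID++
		in.ID = fmt.Sprintf("doc-%d", s.nextID)
		s.docs[in.ID] = in
		return in, nil
	}
	if _, ok := s.docs[in.ID]; !ok {
		return ContextDocumentInput{}, fmt.Errorf("document %q not found", in.ID)
	}
	s.docs[in.ID] = in
	return in, nil
}

func main() {
	s := newMemoryStore()
	created, _ := s.Upsert(ContextDocumentInput{Title: "Runbook"})
	updated, _ := s.Upsert(ContextDocumentInput{ID: created.ID, Title: "Runbook v2"})
	fmt.Println(created.ID, updated.Title)
}
```

Folding create and update into one call means a pipeline can replay the same change set without first checking whether a document exists, as long as it records the ID returned by the first write.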

Design decisions:

  • Context documents are standard DataHub Document entities with showInGlobalContext: false, linked to entities via relatedAssets — no separate DataHub API exists
  • UpsertContextDocument performs a post-write GetDocument fetch to return complete server-side data (timestamps, ownership)
  • CreateDocument now always sends the settings.showInGlobalContext field explicitly (even when false) rather than omitting it, ensuring consistent wire behavior
  • Author username is extracted from the owner URN consistently across both conversion paths
  • Supported entity types for GetContextDocuments: Dataset, GlossaryTerm, GlossaryNode, Container
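Deriving the author username from an owner URN, as the design decisions mention, might look like the sketch below. It assumes owners are DataHub corpuser URNs of the form urn:li:corpuser:<username>; the usernameFromOwnerURN name and the choice to return an empty string for other URN types are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// usernameFromOwnerURN (hypothetical) returns the username portion of a
// corpuser URN, or "" when the URN is not a corpuser URN.
func usernameFromOwnerURN(urn string) string {
	const prefix = "urn:li:corpuser:"
	if !strings.HasPrefix(urn, prefix) {
		return ""
	}
	return strings.TrimPrefix(urn, prefix)
}

func main() {
	fmt.Println(usernameFromOwnerURN("urn:li:corpuser:jdoe")) // jdoe
}
```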

Downstream usage — these methods enable mcp-data-platform to implement:

  • datahub_get_context_documents MCP tool
  • apply_knowledge change types: add_context_document, update_context_document, remove_context_document
  • Search/entity response enrichment with context documentation

Security: go-sdk v1.4.1 (#114)

Bumps github.com/modelcontextprotocol/go-sdk from v1.4.0 to v1.4.1, which includes:

  • Cross-origin request protection — verifies Content-Type and Origin headers on JSON-RPC POST requests
  • Unicode zero character fix — patches a parsing vulnerability in the segmentio/encoding JSON library
  • Custom HTTP client for OAuth — allows SSRF protection in AuthorizationCodeHandler

CI Maintenance

  • golangci-lint bumped to v2.11.4 (required for Go 1.25+ compatibility)
  • anchore/sbom-action 0.23.1 → 0.24.0 (#120)
  • github/codeql-action 4.32.6 → 4.34.1 (#119)
  • codecov/codecov-action 5.5.2 → 5.5.3 (#118)

Compatibility

| Requirement | Version |
| --- | --- |
| Go | 1.25+ (raised by go-sdk v1.4.1) |
| DataHub (minimum) | 1.3.x |
| DataHub (full feature set incl. documents) | 1.4.x+ |

Breaking change note: CreateDocument now always includes settings: {showInGlobalContext: false} in the GraphQL mutation input, where previously it omitted the settings block when GlobalContext was false. This makes the wire behavior explicit but is functionally equivalent for DataHub — false is the server default.


Installation

Claude Desktop (macOS/Windows)

Download the .mcpb bundle for your platform and double-click to install:

  • macOS Apple Silicon (M1/M2/M3/M4): mcp-datahub_1.6.0_darwin_arm64.mcpb
  • macOS Intel: mcp-datahub_1.6.0_darwin_amd64.mcpb
  • Windows: mcp-datahub_1.6.0_windows_amd64.mcpb

Homebrew (macOS)

brew install txn2/tap/mcp-datahub

Claude Code CLI

claude mcp add datahub \
  -e DATAHUB_URL=https://your-datahub.example.com/api/graphql \
  -e DATAHUB_TOKEN=your-token \
  -- mcp-datahub

Docker

docker pull ghcr.io/txn2/mcp-datahub:v1.6.0

Verification

All release artifacts are signed with Cosign. Verify with:

cosign verify-blob --bundle mcp-datahub_1.6.0_linux_amd64.tar.gz.sigstore.json \
  mcp-datahub_1.6.0_linux_amd64.tar.gz

mcp-datahub-v1.5.0

17 Mar 20:47
e3fd598

mcp-datahub v1.5.0 — CRUD Write Tools & Full DataHub Mutation Coverage

The biggest write capability release yet: reduces tool count from 16 to 12 while expanding write operations from 7 to 35 — covering the full DataHub mutation surface. All 36 GraphQL mutations verified against actual DataHub schema files.

+4,909 lines | -1,525 lines | 60 files changed | 91.3% tools coverage | 94.6% client coverage

Closes #112, #107


Highlights

3 CRUD Tools Replace 7 Write Tools

The datahub_browse discriminator pattern (what parameter) is now applied to write operations. Three tools — datahub_create, datahub_update, datahub_delete — replace seven fine-grained tools while covering 5x more operations:

| | v1.4.0 | v1.5.0 |
| --- | --- | --- |
| MCP tools | 16 (9 read + 7 write) | 12 (9 read + 3 write) |
| Write operations | 7 | 35 |
| Entity creation | None | 10 entity types |
| Entity deletion | Queries only | 8 entity types |

Context Document Support

Full CRUD support for DataHub context documents (DataHub 1.4.x+), enabling AI knowledge capture workflows. Documents are integrated into existing read tools (datahub_search, datahub_get_entity) and the new CRUD write tools.

Per-Connection Write Control

Multi-server deployments can now set write_enabled per connection, with proper override semantics:

  • null — inherit from global toolkit config
  • true — explicitly enabled (overrides global false)
  • false — explicitly disabled (overrides global true)
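The three-state override reduces to a nil check on a pointer. The function below is a minimal sketch of those semantics; effectiveWriteEnabled and boolPtr are hypothetical names, not part of the library's API.

```go
package main

import "fmt"

// effectiveWriteEnabled (hypothetical) resolves the per-connection setting
// against the global one: nil inherits, any non-nil value overrides.
func effectiveWriteEnabled(global bool, connection *bool) bool {
	if connection == nil {
		return global
	}
	return *connection
}

// boolPtr is a small helper for building optional bool values.
func boolPtr(b bool) *bool { return &b }

func main() {
	fmt.Println(effectiveWriteEnabled(false, nil))            // false (inherited)
	fmt.Println(effectiveWriteEnabled(false, boolPtr(true)))  // true (overrides global false)
	fmt.Println(effectiveWriteEnabled(true, boolPtr(false)))  // false (overrides global true)
}
```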

New Write Tools

datahub_create — 10 operations

Creates new entities via the what discriminator:

| what | Creates | Required fields |
| --- | --- | --- |
| tag | Tag | name |
| domain | Domain | name |
| glossary_term | Glossary term | name |
| data_product | Data product | name, domain_urn |
| document | Context document | name |
| application | Application | name |
| query | Saved query | value (SQL) |
| incident | Incident | name, incident_type, entity_urns |
| structured_property | Structured property | qualified_name, value_type, entity_types |
| data_contract | Data contract | dataset_urns |

datahub_update — 17 operations

Updates existing entities via what + optional action:

| what | action | Description |
| --- | --- | --- |
| description | (not used) | Set entity description |
| column_description | (not used) | Set schema field description |
| tag | required: add/remove | Add or remove a tag |
| glossary_term | required: add/remove | Add or remove a glossary term |
| link | required: add/remove | Add or remove a link |
| owner | required: add/remove | Add or remove an owner |
| domain | set/remove (default: set) | Set or remove domain assignment |
| structured_properties | set/remove (default: set) | Set or remove structured property values |
| structured_property | (not used) | Update a structured property definition |
| incident_status | (not used) | Update incident status (requires state) |
| incident | (not used) | Update incident details |
| query | (not used) | Update query properties |
| document_contents | (not used) | Update document title/text |
| document_status | (not used) | Update document status |
| document_related_entities | (not used) | Update document related entities |
| document_sub_type | (not used) | Update document sub-type |
| data_contract | (not used) | Upsert a data contract |
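The action rules in the datahub_update table can be expressed as a small resolver. This is a sketch of the rules as listed, not the tool's actual validation code; resolveAction and both lookup maps are hypothetical, and the sketch does not check whether what itself is a known value.

```go
package main

import "fmt"

// Assumption: these sets are transcribed from the datahub_update table.
var actionRequired = map[string]bool{"tag": true, "glossary_term": true, "link": true, "owner": true}
var actionDefaultsToSet = map[string]bool{"domain": true, "structured_properties": true}

// resolveAction (hypothetical) returns the effective action for a what value,
// or an error when a required or invalid action is supplied.
func resolveAction(what, action string) (string, error) {
	switch {
	case actionRequired[what]:
		if action != "add" && action != "remove" {
			return "", fmt.Errorf("what=%s requires action add or remove, got %q", what, action)
		}
		return action, nil
	case actionDefaultsToSet[what]:
		if action == "" {
			return "set", nil
		}
		if action != "set" && action != "remove" {
			return "", fmt.Errorf("what=%s accepts action set or remove, got %q", what, action)
		}
		return action, nil
	default:
		// action is not used for the remaining what values, so it is ignored here.
		return "", nil
	}
}

func main() {
	fmt.Println(resolveAction("tag", "add"))
	fmt.Println(resolveAction("domain", ""))
	fmt.Println(resolveAction("owner", ""))
}
```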

datahub_delete — 8 operations

Deletes entities via what: query, tag, domain, glossary_entity, data_product, application, document, structured_property.

MCP Annotations

Each CRUD tool has distinct behavior annotations:

| Tool | DestructiveHint | IdempotentHint | OpenWorldHint |
| --- | --- | --- | --- |
| datahub_create | false | false | true |
| datahub_update | false | true | true |
| datahub_delete | true | true | true |

DataHub Version Compatibility

Minimum: DataHub 1.3.x. Full feature set: DataHub 1.4.x.

| DataHub Version | Features Available |
| --- | --- |
| 1.3.x+ (minimum) | All read tools, all write operations except documents (tags, domains, glossary, data products, queries, owners, links, descriptions, incidents, applications, structured properties incl. delete, data contracts) |
| 1.4.x+ (full) | + Documents (create/update/delete) |

All version claims verified against the DataHub v1.3.0 GraphQL schema source. The client gracefully returns empty results (not errors) when a read feature is unavailable on older versions.


New Client Methods (25 new, 8 newly exposed)

Entity creation (8)

CreateTag, CreateDomain, CreateGlossaryTerm, CreateDataProduct, CreateDocument, CreateApplication, CreateStructuredProperty, UpsertDataContract

Entity update (10)

AddOwner, RemoveOwner, SetDomain, UnsetDomain, UpdateIncident, UpdateStructuredProperty, UpdateDocumentContents, UpdateDocumentStatus, UpdateDocumentRelatedEntities, UpdateDocumentSubType

Entity delete (7)

DeleteTag, DeleteDomain, DeleteGlossaryEntity, DeleteDataProduct, DeleteApplication, DeleteDocument, DeleteStructuredProperty

Newly exposed (existed in client, now wired to MCP tools)

UpdateColumnDescription, CreateQuery, UpdateQuery, DeleteQuery, UpsertStructuredProperties, RemoveStructuredProperties, RaiseIncident, ResolveIncident


Bug Fixes (PRs #106, #108, #109)

  • UpsertStructuredProperties wrong field name — used propertyUrn (REST name) instead of structuredPropertyUrn (GraphQL name)
  • UpsertStructuredProperties / RemoveStructuredProperties missing selection set — caused SubselectionRequired validation errors
  • UpsertStructuredProperties raw values — passed raw values instead of typed objects ({"stringValue": "..."})
  • ResolveIncident wrong input type: UpdateIncidentStatusInput! was renamed to IncidentStatusInput! in DataHub 1.4.x
  • RaiseIncident wrong field — sent resourceUrns (array of objects) instead of resourceUrn (singular string)
  • Tags/descriptions/glossary terms on domains/glossary entities — REST API rejects these aspects for domain/glossaryTerm/glossaryNode; now routed through GraphQL mutations
  • UpdateDescription rejected domain/glossaryTerm — missing entries in descriptionAspectMap

Removed Tools (Breaking Change)

The following tools are replaced by datahub_update with the corresponding what + action parameters:

| Removed Tool | Replacement |
| --- | --- |
| datahub_update_description | datahub_update with what=description |
| datahub_add_tag | datahub_update with what=tag, action=add |
| datahub_remove_tag | datahub_update with what=tag, action=remove |
| datahub_add_glossary_term | datahub_update with what=glossary_term, action=add |
| datahub_remove_glossary_term | datahub_update with what=glossary_term, action=remove |
| datahub_add_link | datahub_update with what=link, action=add |
| datahub_remove_link | datahub_update with what=link, action=remove |
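For callers that still hold the old tool names, the mapping above can be encoded as a lookup table. This is a migration aid sketched for illustration; the updateCall type and migrateTool function are hypothetical and do not ship with the library.

```go
package main

import "fmt"

// updateCall holds the what/action pair to pass to datahub_update.
type updateCall struct {
	What   string
	Action string // empty when the operation takes no action
}

// removedTools transcribes the replacement table from these release notes.
var removedTools = map[string]updateCall{
	"datahub_update_description":   {What: "description"},
	"datahub_add_tag":              {What: "tag", Action: "add"},
	"datahub_remove_tag":           {What: "tag", Action: "remove"},
	"datahub_add_glossary_term":    {What: "glossary_term", Action: "add"},
	"datahub_remove_glossary_term": {What: "glossary_term", Action: "remove"},
	"datahub_add_link":             {What: "link", Action: "add"},
	"datahub_remove_link":          {What: "link", Action: "remove"},
}

// migrateTool (hypothetical) returns the datahub_update parameters that
// replace a removed tool, and false if the name was not a removed tool.
func migrateTool(name string) (updateCall, bool) {
	call, ok := removedTools[name]
	return call, ok
}

func main() {
	call, ok := migrateTool("datahub_remove_tag")
	fmt.Println(call.What, call.Action, ok) // tag remove true
}
```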

Library Breaking Changes

  • ToolName constants removed: ToolUpdateDescription, ToolAddTag, ToolRemoveTag, ToolAddGlossaryTerm, ToolRemoveGlossaryTerm, ToolAddLink, ToolRemoveLink
  • ToolName constants added: ToolCreate, ToolUpdate, ToolDelete
  • WriteTools() returns 3 tools instead of 7
  • DataHubClient interface grows by ~30 methods
  • Output types replaced: 7 per-tool output structs replaced by 3 CRUD output structs (CreateOutput, UpdateOutput, DeleteOutput)

Installation

Claude Desktop (macOS/Windows)

Download the .mcpb bundle for your platform and double-click to install:

  • macOS Apple Silicon (M1/M2/M3/M4): mcp-datahub_1.5.0_darwin_arm64.mcpb
  • macOS Intel: mcp-datahub_1.5.0_darwin_amd64.mcpb
  • Windows: mcp-datahub_1.5.0_windows_amd64.mcpb

Homebrew (macOS)

brew install txn2/tap/mcp-datahub

Claude Code CLI

claude mcp add datahub \
  -e DATAHUB_URL=https://your-datahub.example.com/api/graphql \
  -e DATAHUB_TOKEN=your-token \
  -- mcp-datahub

Docker

docker pull ghcr.io/txn2/mcp-datahub:v1.5.0

Verification

All release artifacts are signed with Cosign. Verify with:

cosign verify-blob --bundle mcp-datahub_1.5.0_linux_amd64.tar.gz.sigstore.json \
  mcp-datahub_1.5.0_linux_amd64.tar.gz

mcp-datahub-v1.4.3

16 Mar 04:26
dfe078c

What's Changed

Bug Fix: Structured Property Value Serialization (#109)

set_structured_property was failing with Expected type 'Map' but was 'String' errors when writing structured properties to DataHub 1.4.x.

Root cause: UpsertStructuredProperties was passing raw values (e.g., "2 years", 30) directly in the values array, but DataHub's GraphQL API expects typed value objects ({"stringValue": "..."} or {"numberValue": ...}).

Fix: Added a toTypedPropertyValue helper that wraps each raw Go value in the appropriate typed map before building the GraphQL mutation variables. This mirrors the read-side deserialization already handled by propertyValueEntry.toAny().
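The wrapping step might look roughly like the sketch below. It is not the library's toTypedPropertyValue: the wrapPropertyValue name, the set of Go types handled, and the error for unsupported types are assumptions. Only the stringValue and numberValue keys come from these notes.

```go
package main

import "fmt"

// wrapPropertyValue (hypothetical) converts a raw Go value into the typed
// object shape the GraphQL API expects. Which Go types the real helper
// accepts is not stated in the release notes; this sketch handles a few.
func wrapPropertyValue(v any) (map[string]any, error) {
	switch x := v.(type) {
	case string:
		return map[string]any{"stringValue": x}, nil
	case int:
		return map[string]any{"numberValue": float64(x)}, nil
	case int64:
		return map[string]any{"numberValue": float64(x)}, nil
	case float64:
		return map[string]any{"numberValue": x}, nil
	default:
		return nil, fmt.Errorf("unsupported structured property value type %T", v)
	}
}

func main() {
	for _, raw := range []any{"2 years", 30} {
		typed, err := wrapPropertyValue(raw)
		fmt.Println(typed, err)
	}
}
```

Returning an error for unknown types, rather than passing the raw value through, keeps the original failure mode (a type error from the server) from reappearing silently.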

Affected tool: datahub_set_structured_property (write tool, requires WriteEnabled: true)

Commits

  • dfe078c: fix: wrap structured property values in typed objects for GraphQL (#109) (@cjimti)

Full Changelog: v1.4.2...v1.4.3


Installation

Claude Desktop (macOS/Windows)

Download the .mcpb bundle for your platform and double-click to install:

  • macOS Apple Silicon (M1/M2/M3/M4): mcp-datahub_1.4.3_darwin_arm64.mcpb
  • macOS Intel: mcp-datahub_1.4.3_darwin_amd64.mcpb
  • Windows: mcp-datahub_1.4.3_windows_amd64.mcpb

Homebrew (macOS)

brew install txn2/tap/mcp-datahub

Claude Code CLI

claude mcp add datahub \
  -e DATAHUB_URL=https://your-datahub.example.com/api/graphql \
  -e DATAHUB_TOKEN=your-token \
  -- mcp-datahub

Docker

docker pull ghcr.io/txn2/mcp-datahub:v1.4.3

Verification

All release artifacts are signed with Cosign. Verify with:

cosign verify-blob --bundle mcp-datahub_1.4.3_linux_amd64.tar.gz.sigstore.json \
  mcp-datahub_1.4.3_linux_amd64.tar.gz

mcp-datahub-v1.4.2

16 Mar 02:51
9e89e56

mcp-datahub v1.4.2 — DataHub 1.4.x Write Method Fixes (Part 2)

Fixes four additional write operation bugs discovered when running against DataHub 1.4.x instances. Continues the fixes started in v1.4.1. All read paths and REST write paths for dataset, dashboard, chart, dataFlow, dataJob, container, and dataProduct entities are unaffected.

+638 lines added, -25 removed across 6 files | All modified lines covered by tests | All CI checks pass

Closes #107 via #108


Bug Fixes

UpsertStructuredProperties Failed with "propertyUrn is not defined"

The GraphQL mutation input used propertyUrn (the REST API field name) instead of structuredPropertyUrn (the GraphQL field name), causing every call to fail with:

field name 'propertyUrn' is not defined for input object type 'StructuredPropertyInputParams'

Fix: Renamed the input field from propertyUrn to structuredPropertyUrn in the mutation variables.

ResolveIncident Failed with Wrong Input Type

The mutation declared $input: UpdateIncidentStatusInput! but DataHub 1.4.x renamed this type to IncidentStatusInput!, causing:

Variable 'input' of type 'UpdateIncidentStatusInput!' used in position expecting type 'IncidentStatusInput!'

Fix: Changed the type declaration from UpdateIncidentStatusInput! to IncidentStatusInput!.

Tags on Domain, GlossaryTerm, and GlossaryNode Failed via REST

The REST API does not register globalTags as a writable aspect for domain, glossaryTerm, or glossaryNode entities. AddTag and RemoveTag calls on these entity types failed with:

Unknown aspect globalTags for entity domain

Fix: These entity types are now routed through DataHub's GraphQL addTag/removeTag mutations instead of REST. All other entity types (dataset, dashboard, chart, etc.) continue to use REST unchanged.

Descriptions on Domain, GlossaryTerm, and GlossaryNode Failed via REST

Same root cause as the tag bug: the REST ingestProposal endpoint does not accept description aspect writes (domainProperties, glossaryTermInfo, glossaryNodeInfo) for these entity types, returning 422 validation errors.

Fix: These entity types are now routed through the GraphQL updateDescription mutation. All other entity types continue to use REST unchanged.

Glossary Term Associations on Domain, GlossaryTerm, and GlossaryNode Failed via REST

The REST API also does not register the glossaryTerms aspect for these entity types.

Fix: AddGlossaryTerm and RemoveGlossaryTerm for these entity types are now routed through GraphQL addTerm/removeTerm mutations.


Technical Details

Introduced a graphQLWriteTypes routing map for entity types that require GraphQL mutations instead of REST for write operations. The public API methods (AddTag, RemoveTag, AddGlossaryTerm, RemoveGlossaryTerm, UpdateDescription) check the entity type and route accordingly. All GraphQL mutation signatures were verified against the official DataHub schema (datahub-graphql-core/src/main/resources/entity.graphql).

Entity type routing:

| Entity Type | Tag Writes | Description Writes | Glossary Term Writes |
| --- | --- | --- | --- |
| dataset, dashboard, chart, dataFlow, dataJob, container, dataProduct | REST | REST | REST |
| domain, glossaryTerm, glossaryNode | GraphQL | GraphQL | GraphQL |

Note: AddLink/RemoveLink operations continue to use REST for all entity types including domain, glossaryTerm, and glossaryNode — the institutionalMemory aspect IS registered for these types in DataHub's entity registry.
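
The routing for tag, description, and glossary term writes can be pictured with a small sketch. Only the name graphQLWriteTypes comes from these notes; the map's shape, the writeBackend type, and the backendFor helper are assumptions made for illustration and may not match the project's code.

```go
package main

import "fmt"

// writeBackend names the API used for a write (hypothetical type).
type writeBackend string

const (
	backendREST    writeBackend = "REST"
	backendGraphQL writeBackend = "GraphQL"
)

// graphQLWriteTypes lists entity types whose tag, description, and glossary
// term writes go through GraphQL. The name is from the release notes; the
// map[string]bool shape is an assumption.
var graphQLWriteTypes = map[string]bool{
	"domain":       true,
	"glossaryTerm": true,
	"glossaryNode": true,
}

// backendFor picks the backend for a tag, description, or glossary term
// write. Link writes are not covered here; per the notes they stay on REST.
func backendFor(entityType string) writeBackend {
	if graphQLWriteTypes[entityType] {
		return backendGraphQL
	}
	return backendREST
}

func main() {
	for _, t := range []string{"dataset", "domain", "glossaryNode"} {
		fmt.Printf("%s -> %s\n", t, backendFor(t))
	}
}
```

A lookup table keeps the public methods free of per-type branching: supporting another GraphQL-only type would be a one-line change to the map.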


Upgrade Notes

  • No breaking changes. All fixes are backward compatible.
  • No configuration changes required.
  • Users who were working around the structured property or incident bugs can now use the MCP tools directly.
  • Tag, description, and glossary term operations on domain, glossaryTerm, and glossaryNode entities now work — previously these returned errors.

Installation

Claude Desktop (macOS/Windows)

Download the .mcpb bundle for your platform and double-click to install:

  • macOS Apple Silicon (M1/M2/M3/M4): mcp-datahub_1.4.2_darwin_arm64.mcpb
  • macOS Intel: mcp-datahub_1.4.2_darwin_amd64.mcpb
  • Windows: mcp-datahub_1.4.2_windows_amd64.mcpb

Homebrew (macOS)

brew install txn2/tap/mcp-datahub

Claude Code CLI

claude mcp add datahub \
  -e DATAHUB_URL=https://your-datahub.example.com/api/graphql \
  -e DATAHUB_TOKEN=your-token \
  -- mcp-datahub

Docker

docker pull ghcr.io/txn2/mcp-datahub:v1.4.2

Verification

All release artifacts are signed with Cosign. Verify with:

cosign verify-blob --bundle mcp-datahub_1.4.2_linux_amd64.tar.gz.sigstore.json \
  mcp-datahub_1.4.2_linux_amd64.tar.gz

mcp-datahub-v1.4.1

15 Mar 23:57
Immutable release. Only release title and notes can be modified.
994577a

mcp-datahub v1.4.1 — DataHub 1.4.x Write Method Fixes

Fixes three broken write codepaths discovered when running against DataHub 1.4.x instances. All read paths from v1.4.0 were unaffected.

+131 lines added, -48 removed across 7 files | All modified lines covered by tests | All CI checks pass

Closes #106


Bug Fixes

Structured Property Mutations Failed with SubselectionRequired

UpsertStructuredProperties and RemoveStructuredProperties mutations were missing a required GraphQL selection set. DataHub 1.4.x returns StructuredProperties! (a non-scalar type) from these mutations, which requires the client to specify which fields to select. Without it, the GraphQL server rejected every call with a SubselectionRequired validation error.

Fix: Added { properties { structuredProperty { urn } } } selection set to both mutations and updated response structs to match. The returned data is not used — the selection set exists solely to satisfy GraphQL's type system.
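
As a rough illustration of the fix, the sketch below pairs a mutation that carries the selection set with a response struct shaped to match it. The selection set is the one quoted above; the operation name, the input type name UpsertStructuredPropertiesInput, and the Go identifiers are assumptions, not copied from the project.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// upsertMutation shows where the selection set sits. The input type name
// is an assumption; check it against the DataHub schema before use.
const upsertMutation = `
mutation Upsert($input: UpsertStructuredPropertiesInput!) {
  upsertStructuredProperties(input: $input) {
    properties { structuredProperty { urn } }
  }
}`

// upsertResponse mirrors the selection set field for field (hypothetical).
type upsertResponse struct {
	UpsertStructuredProperties struct {
		Properties []struct {
			StructuredProperty struct {
				URN string `json:"urn"`
			} `json:"structuredProperty"`
		} `json:"properties"`
	} `json:"upsertStructuredProperties"`
}

// decodeUpsert parses the "data" payload and reports how many properties
// came back. The caller can ignore the count; decoding only has to succeed.
func decodeUpsert(data []byte) (int, error) {
	var resp upsertResponse
	if err := json.Unmarshal(data, &resp); err != nil {
		return 0, err
	}
	return len(resp.UpsertStructuredProperties.Properties), nil
}

func main() {
	sample := []byte(`{"upsertStructuredProperties":{"properties":[{"structuredProperty":{"urn":"urn:li:structuredProperty:example"}}]}}`)
	n, err := decodeUpsert(sample)
	fmt.Println(n, err)
	fmt.Println(upsertMutation)
}
```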

RaiseIncident Failed with "at least 1 resource urn"

The RaiseIncident mutation was sending resourceUrns as an array of {"urn":"..."} objects. DataHub 1.4.x expects resourceUrn — a singular String! field.

Fix: Changed to send resourceUrn with the first element as a plain string. Added input validation that returns a clear error when no resource URNs are provided.
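
A minimal sketch of that behavior, assuming the input is built as a plain variables map. The helper name incidentVariables, the error text, and the "title" key are hypothetical; only the resourceUrn field and the first-element rule come from the notes.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoResourceURN is returned before any API call when no URN is given.
var errNoResourceURN = errors.New("raise incident: at least one resource URN is required")

// incidentVariables builds GraphQL input for a raise-incident call. It sends
// the first URN as a plain string under resourceUrn, as the fix describes.
func incidentVariables(title string, resourceURNs []string) (map[string]any, error) {
	if len(resourceURNs) == 0 {
		return nil, errNoResourceURN
	}
	return map[string]any{
		"title":       title,
		"resourceUrn": resourceURNs[0],
	}, nil
}

func main() {
	vars, err := incidentVariables("Pipeline down", []string{"urn:li:dataset:example"})
	fmt.Println(vars, err)

	_, err = incidentVariables("Pipeline down", nil)
	fmt.Println(err)
}
```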

UpdateDescription Rejected Domain and GlossaryTerm Entities

UpdateDescription returned ErrUnsupportedEntityType for domain and glossaryTerm URNs. The descriptionAspectMap did not include these entity types, even though DataHub's entity registry confirms domainProperties and glossaryTermInfo are registered aspects writable via the REST ingest proposal API.

Fix: Added both mappings to descriptionAspectMap:

  • domain → domainProperties (field: description)
  • glossaryTerm → glossaryTermInfo (field: definition, matching the GlossaryTermInfo PDL schema)
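
One way to picture the two mappings is a lookup keyed by entity type. The names descriptionAspectMap and ErrUnsupportedEntityType appear in these notes, but the struct layout, the lowercase error variable, and the lookup helper below are assumptions for illustration only.

```go
package main

import (
	"errors"
	"fmt"
)

// descriptionAspect says which aspect holds the description and which
// field inside that aspect to write (hypothetical layout).
type descriptionAspect struct {
	Aspect string
	Field  string
}

// descriptionAspectMap holds only the two entries named in the notes.
var descriptionAspectMap = map[string]descriptionAspect{
	"domain":       {Aspect: "domainProperties", Field: "description"},
	"glossaryTerm": {Aspect: "glossaryTermInfo", Field: "definition"},
}

// errUnsupportedEntityType stands in for the client's ErrUnsupportedEntityType.
var errUnsupportedEntityType = errors.New("unsupported entity type")

// descriptionTarget resolves the aspect and field for an entity type.
func descriptionTarget(entityType string) (descriptionAspect, error) {
	target, ok := descriptionAspectMap[entityType]
	if !ok {
		return descriptionAspect{}, errUnsupportedEntityType
	}
	return target, nil
}

func main() {
	fmt.Println(descriptionTarget("glossaryTerm"))
	fmt.Println(descriptionTarget("unknownType"))
}
```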

Also expanded globalTagsSupportedTypes and glossaryTermsSupportedTypes to include domain, glossaryTerm, and glossaryNode, enabling tag and glossary term operations on these entity types.


Upgrade Notes

  • No breaking changes. All fixes are backward compatible.
  • No configuration changes required.
  • Users who were working around the structured property or incident bugs by calling the DataHub API directly can now use the MCP tools.
  • UpdateDescription on domain and glossaryTerm entities now works — previously these returned errors.

Installation

Claude Desktop (macOS/Windows)

Download the .mcpb bundle for your platform and double-click to install:

  • macOS Apple Silicon (M1/M2/M3/M4): mcp-datahub_1.4.1_darwin_arm64.mcpb
  • macOS Intel: mcp-datahub_1.4.1_darwin_amd64.mcpb
  • Windows: mcp-datahub_1.4.1_windows_amd64.mcpb

Homebrew (macOS)

brew install txn2/tap/mcp-datahub

Claude Code CLI

claude mcp add datahub \
  -e DATAHUB_URL=https://your-datahub.example.com/api/graphql \
  -e DATAHUB_TOKEN=your-token \
  -- mcp-datahub

Docker

docker pull ghcr.io/txn2/mcp-datahub:v1.4.1

Verification

All release artifacts are signed with Cosign. Verify with:

cosign verify-blob --bundle mcp-datahub_1.4.1_linux_amd64.tar.gz.sigstore.json \
  mcp-datahub_1.4.1_linux_amd64.tar.gz

mcp-datahub-v1.4.0

15 Mar 19:38
Immutable release. Only release title and notes can be modified.
06816e9

mcp-datahub v1.4.0 — DataHub 1.4.x Feature Support

Full support for DataHub 1.4.x features: structured properties, incidents, data contracts, semantic search, and context documents — all delivered through existing MCP tools with zero new tools added. Gracefully degrades on DataHub 1.3.x deployments.

+3,483 lines added across 21 files | 96.6% patch coverage | All CI checks pass

Closes #94 (DataHub 1.4.x Compatibility & Feature Support tracking issue), #88, #89, #90, #91, #92, #93


Highlights

Zero New MCP Tools

All DataHub 1.4.x features are delivered through the existing datahub_get_entity and datahub_search tools. The MCP tool count remains at 16 (9 read + 7 write). This keeps the AI assistant's tool surface minimal and avoids tool explosion.

Backward Compatible

All new features gracefully degrade on DataHub 1.3.x:

  • Read methods return nil/empty results when the server doesn't support the feature
  • Write methods propagate errors, since callers need to know mutations failed
  • Semantic search propagates errors, since the caller explicitly chose that mode
  • No configuration flags needed — degradation is automatic via GraphQL error detection

Concurrent Entity Enrichment

datahub_get_entity now fetches structured properties, active incidents, and data contract status in parallel using goroutines, reducing enrichment latency from 3 sequential API round-trips to 1 concurrent batch.
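
A minimal sketch of this fan-out pattern, using sync.WaitGroup with one goroutine per lookup. The entity fields and the three fetch callbacks are simplified stand-ins, and the project may coordinate its goroutines differently.

```go
package main

import (
	"fmt"
	"sync"
)

// entity is a simplified stand-in for the enriched entity.
type entity struct {
	StructuredProperties []string
	ActiveIncidents      int
	ContractStatus       string
}

// enrich runs the three lookups concurrently and waits for all of them.
// Each goroutine writes a different field, so no mutex is needed.
func enrich(fetchProps func() []string, fetchIncidents func() int, fetchContract func() string) entity {
	var e entity
	var wg sync.WaitGroup
	wg.Add(3)
	go func() { defer wg.Done(); e.StructuredProperties = fetchProps() }()
	go func() { defer wg.Done(); e.ActiveIncidents = fetchIncidents() }()
	go func() { defer wg.Done(); e.ContractStatus = fetchContract() }()
	wg.Wait()
	return e
}

func main() {
	e := enrich(
		func() []string { return []string{"retentionTime"} },
		func() int { return 1 },
		func() string { return "PASSING" },
	)
	fmt.Printf("%+v\n", e)
}
```

With this shape, total latency is bounded by the slowest of the three calls rather than their sum.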


New Features

Structured Properties (#88)

Typed custom metadata (retention policies, data classifications, SLAs) that replaces freeform custom properties for governed use cases.

New client methods:

| Method | Type | Fallback on 1.3.x |
| --- | --- | --- |
| GetStructuredProperties(ctx, urn) | Read | nil (graceful) |
| ListStructuredPropertyDefinitions(ctx) | Read | nil (graceful) |
| UpsertStructuredProperties(ctx, urn, properties) | Write | error |
| RemoveStructuredProperties(ctx, urn, propertyURNs) | Write | error |

Supported entity types: Dataset, Dashboard, Chart, DataFlow, DataJob, Container, DataProduct

New types:

  • types.StructuredPropertyDefinition — property schema (qualified name, display name, value type, cardinality, allowed values, applicable entity types)
  • types.StructuredPropertyValue — property assignment on an entity (property URN, definition, values)
  • types.StructuredPropertyInput — input for upsert operations (property URN, values)
  • types.AllowedValue — permitted value with optional description

MCP integration: datahub_get_entity responses now include a structured_properties field (array of property values with full definitions). Nil/omitted on DataHub < 1.4.x.

GraphQL implementation: Uses a named StructuredProps fragment to avoid duplicating the field selection across 7 entity type inline fragments. Handles the StringValue | NumberValue GraphQL union type.
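
Handling the union on the Go side could look roughly like this. The sketch assumes each union member exposes its payload under stringValue or numberValue; those field names and the helper are assumptions, not taken from the project.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rawValue holds either branch of the StringValue | NumberValue union.
// Only one pointer is expected to be set per element (assumed field names).
type rawValue struct {
	StringValue *string  `json:"stringValue"`
	NumberValue *float64 `json:"numberValue"`
}

// decodeValues flattens union members into plain Go values:
// string for StringValue, float64 for NumberValue.
func decodeValues(data []byte) ([]any, error) {
	var raws []rawValue
	if err := json.Unmarshal(data, &raws); err != nil {
		return nil, err
	}
	out := make([]any, 0, len(raws))
	for _, r := range raws {
		switch {
		case r.StringValue != nil:
			out = append(out, *r.StringValue)
		case r.NumberValue != nil:
			out = append(out, *r.NumberValue)
		}
	}
	return out, nil
}

func main() {
	vals, err := decodeValues([]byte(`[{"stringValue":"PII"},{"numberValue":30}]`))
	fmt.Println(vals, err)
}
```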

Incidents (#89)

Asset health management — read active incidents, raise new incidents, resolve incidents.

New client methods:

| Method | Type | Fallback on 1.3.x |
| --- | --- | --- |
| GetIncidents(ctx, urn) | Read | nil (graceful) |
| RaiseIncident(ctx, input) | Write | error |
| ResolveIncident(ctx, incidentURN, message) | Write | error |

Supported entity types: Dataset, Dashboard, Chart, DataFlow, DataJob, Container, DataProduct

New types:

  • types.Incident — incident details (URN, type, custom type, title, description, state, source, created/updated timestamps and actors)
  • types.IncidentResult — incident list with total count
  • types.RaiseIncidentInput — input for creating incidents (type, title, description, resource URNs)

MCP integration: datahub_get_entity responses now include an active_incidents field with total count and incident details. Nil/omitted on DataHub < 1.4.x or when no active incidents exist.

GraphQL implementation: Uses a named IncidentFields fragment. Fetches up to MaxLimit incidents since this is enrichment data (not user-paginated). Only retrieves ACTIVE state incidents.

Data Contracts (#90)

Freshness, schema, and data quality assertion results bundled into a single pass/fail quality signal per dataset.

New client method:

| Method | Type | Fallback on 1.3.x |
| --- | --- | --- |
| GetDataContract(ctx, datasetURN) | Read | nil (graceful) |

Datasets only. The DataHub GraphQL API only supports data contracts on dataset entities.

New types:

  • types.DataContract — contract status (PASSING / FAILING) with individual assertion results
  • types.AssertionResult — assertion URN, type (FRESHNESS, SCHEMA, DATA_QUALITY), result type, and platform-specific native results

MCP integration: datahub_get_entity responses for datasets now include a data_contract field. Nil/omitted for non-dataset entities or DataHub < 1.4.x.

Enriched Entity Responses (#91)

datahub_get_entity automatically includes the three 1.4.x features in its response whenever they are available (data_contract applies to datasets only):

{
  "urn": "urn:li:dataset:(urn:li:dataPlatform:snowflake,db.users,PROD)",
  "type": "DATASET",
  "name": "users",
  "structured_properties": [
    {
      "property_urn": "urn:li:structuredProperty:io.acryl.privacy.retentionTime",
      "definition": { "qualified_name": "io.acryl.privacy.retentionTime", "value_type": "NUMBER", ... },
      "values": [30]
    }
  ],
  "active_incidents": {
    "total": 1,
    "incidents": [
      { "urn": "urn:li:incident:abc123", "type": "OPERATIONAL", "title": "Pipeline down", "state": "ACTIVE" }
    ]
  },
  "data_contract": {
    "status": "PASSING",
    "assertion_results": [
      { "assertion_urn": "urn:li:assertion:freshness-1", "type": "FRESHNESS", "result_type": "SUCCESS" }
    ]
  }
}

All three fields use omitempty JSON tags — they are completely absent from responses on DataHub 1.3.x, keeping payloads lean.

Semantic Search (#92)

Vector-based natural language search using the existing datahub_search tool.

New mode parameter on datahub_search:

  • keyword (default) — existing behavior, no change
  • semantic — vector embedding search via semanticSearchAcrossEntities GraphQL query

Requirements: DataHub 1.4.x with OpenSearch 2.19.3+

New client method:

| Method | Type | Fallback on 1.3.x |
| --- | --- | --- |
| SemanticSearch(ctx, query, opts...) | Read | error (explicit mode) |

Semantic search propagates errors (rather than returning empty results) because the caller explicitly chose this mode and needs to know it's unavailable.

Input validation: Unrecognized mode values (e.g., "vector", "hybrid") are rejected with a clear error message before any API call is made.
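
A sketch of such a check, assuming an empty mode falls back to the keyword default. The function name and error wording are hypothetical.

```go
package main

import "fmt"

// validateMode normalizes the search mode and rejects unknown values
// before any request is sent. An empty string means "use the default".
func validateMode(mode string) (string, error) {
	switch mode {
	case "", "keyword":
		return "keyword", nil
	case "semantic":
		return "semantic", nil
	default:
		return "", fmt.Errorf("invalid search mode %q: must be \"keyword\" or \"semantic\"", mode)
	}
}

func main() {
	for _, m := range []string{"", "semantic", "hybrid"} {
		mode, err := validateMode(m)
		fmt.Printf("%q -> %q, err=%v\n", m, mode, err)
	}
}
```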

GraphQL implementation: The semantic search query supports 9 entity types (Dataset, Dashboard, Chart, DataFlow, DataJob, Container, DataProduct, GlossaryTerm, Tag) with full entity detail including ownership, tags, and domain information. Reuses the existing searchResultGQL parsing logic.

Context Documents (#93)

DataHub 1.4.x context documents are searchable via the existing datahub_search tool with entity_type: DOCUMENT. No new tools or client methods needed.


Entity Type Changes

The types.Entity struct gains three new fields:

// Nil when running against DataHub < 1.4.x
StructuredProperties []StructuredPropertyValue `json:"structured_properties,omitempty"`
ActiveIncidents      *IncidentResult           `json:"active_incidents,omitempty"`
DataContract         *DataContract             `json:"data_contract,omitempty"`
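
The effect of the omitempty tags on these fields can be checked with a cut-down struct. The types below are simplified stand-ins for the real ones (the slice element type in particular is reduced to string); only the JSON tags match the fields shown above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Simplified stand-ins for the real types.
type incidentResult struct {
	Total int `json:"total"`
}

type dataContract struct {
	Status string `json:"status"`
}

type entity struct {
	URN                  string          `json:"urn"`
	StructuredProperties []string        `json:"structured_properties,omitempty"`
	ActiveIncidents      *incidentResult `json:"active_incidents,omitempty"`
	DataContract         *dataContract   `json:"data_contract,omitempty"`
}

// encode marshals an entity, returning "" on the (unexpected) error path.
func encode(e entity) string {
	out, err := json.Marshal(e)
	if err != nil {
		return ""
	}
	return string(out)
}

func main() {
	// Nil slice and nil pointers: the three keys are absent entirely.
	fmt.Println(encode(entity{URN: "urn:li:dataset:example"}))

	// A populated field appears as usual.
	fmt.Println(encode(entity{
		URN:             "urn:li:dataset:example",
		ActiveIncidents: &incidentResult{Total: 1},
	}))
}
```

Note that omitempty drops both nil and zero-length slices, so a consumer cannot distinguish "feature unsupported" from "no structured properties" by the key's absence alone.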

Breaking Change

The DataHubClient interface gains 9 new methods. Consumers implementing this interface will need to add stubs for all 9 methods. This affects custom implementations only — users of the provided client.Client struct are unaffected.

New interface methods:

// Structured properties
GetStructuredProperties(ctx, urn) ([]types.StructuredPropertyValue, error)
ListStructuredPropertyDefinitions(ctx) ([]types.StructuredPropertyDefinition, error)
UpsertStructuredProperties(ctx, urn, []types.StructuredPropertyInput) error
RemoveStructuredProperties(ctx, urn, []string) error

// Incidents
GetIncidents(ctx, urn) (*types.IncidentResult, error)
RaiseIncident(ctx, types.RaiseIncidentInput) (string, error)
ResolveIncident(ctx, incidentURN, message) error

// Data contracts
GetDataContract(ctx, datasetURN) (*types.DataContract, error)

// Semantic search
SemanticSearch(ctx, query, ...client.SearchOption) (*types.SearchResult, error)

Internal Improvements

  • Entity type constants: Extracted string literals ("dataset", "dashboard", etc.) into package-level constants in pkg/client/write.go to fix goconst lint violations
  • Named GraphQL fragments: StructuredProps and IncidentFields fragments reduce query duplication across entity type inline fragments
  • Debug logging: All graceful degradation paths (GetStructuredProperties, ListStructuredPropertyDefinitions, GetIncidents, GetDataContract) log debug messages when errors are swallowed, visible when debug logging is enabled

Test Coverage

  • 96.6% patch coverage (12 lines uncovered out of 353 new lines)
  • 90.18% project coverage (+0.67% from v1.3.0)
    -...