llms.txt: publish KNOWN_AGENT_HOSTS as a two-way attribution contract #23
Merged
Add a ### Attribution name mapping subsection to the llms.txt
output that renders the full hostname → brand name map as a
Markdown table. Converts attribution from "merchant-side opaque
policy" into "declared two-way contract":
- Agents building UCP integrations see, up front, what brand
name their orders will be attributed under in the merchant's
Orders list. No need to complete a test checkout and inspect
utm_source to find out.
- Unmapped vendors have an explicit path to request
canonicalization (GitHub issues link). An agent at
mistral.ai can see their hostname isn't mapped, see that
the alternative would be "Source: mistral.ai" passthrough,
and decide whether to open an issue to get "Source: Mistral"
instead.
- Single source of truth: the table is rendered at generation
time from WC_AI_Syndication_UCP_Agent_Header::KNOWN_AGENT_HOSTS,
so the published contract and the runtime canonicalizer
cannot drift. Adding a vendor is still a single constant edit.
Also fixes stale copy introduced pre-1.6.7 that described the
server as "using the hostname from that header as utm_source
automatically" — it now correctly describes the canonicalization
step and points at the newly-published mapping table.
Grouping choice: one row per brand with hostnames comma-separated
(ChatGPT | chatgpt.com, openai.com) rather than one row per
hostname. More scannable for the ~14-entry map and makes multi-
hostname brands obvious at a glance.
Contract completeness: the section also documents the two
fallback paths — unknown hostname passes through verbatim,
missing/malformed UCP-Agent header yields the `ucp_unknown`
sentinel. Without these, novel vendors would have to guess.
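The three outcomes above (mapped, passthrough, sentinel) can be sketched as a single lookup. This is a minimal illustration, not the plugin's code: `EXAMPLE_AGENT_HOSTS` is a hypothetical three-entry excerpt of the real map, `canonicalize_agent_source()` is a made-up name, and the hostname is assumed to have been extracted from the UCP-Agent header already (null when the header is missing or malformed).

```php
<?php
// Hypothetical excerpt; the real map is
// WC_AI_Syndication_UCP_Agent_Header::KNOWN_AGENT_HOSTS (~14 entries).
const EXAMPLE_AGENT_HOSTS = [
    'chatgpt.com'       => 'ChatGPT',
    'openai.com'        => 'ChatGPT',
    'gemini.google.com' => 'Gemini',
];

/**
 * Resolve the attribution source for a hostname taken from the
 * UCP-Agent header. Header parsing itself is out of scope here;
 * null or '' stands in for "missing or malformed".
 */
function canonicalize_agent_source(?string $hostname): string {
    if ($hostname === null || $hostname === '') {
        // Fallback 2: no usable header yields the sentinel.
        return 'ucp_unknown';
    }
    // Known hostname maps to its brand name;
    // fallback 1: an unknown hostname passes through verbatim.
    return EXAMPLE_AGENT_HOSTS[$hostname] ?? $hostname;
}
```

Under these assumptions, `canonicalize_agent_source('openai.com')` gives `ChatGPT`, `canonicalize_agent_source('mistral.ai')` gives `mistral.ai`, and `canonicalize_agent_source(null)` gives `ucp_unknown`.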
Regression tests:
- Section header + Markdown table structure present
- Every KNOWN_AGENT_HOSTS key AND value appears in output
(auto-synchronized via foreach over the constant, so adding
a vendor to the map doesn't require a test update)
- Aliased hostnames (chatgpt.com + openai.com → ChatGPT) group
on a single row with the expected delimiter
- Both fallback clauses documented (passthrough + ucp_unknown)
- GitHub issues link present for vendor-addition requests
Quality gates: 379 PHPUnit tests / 1075 assertions (+5 from prior),
PHPCS clean, PHPStan clean, .pot regenerated.
Ships queued on main, no tag.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This was referenced Apr 18, 2026
Summary
Publish the full hostname → brand name map (`WC_AI_Syndication_UCP_Agent_Header::KNOWN_AGENT_HOSTS`) as a Markdown table in llms.txt, under a new `### Attribution name mapping` subsection inside the existing `## Attribution` section.

This converts attribution from "merchant-side opaque policy" into "declared two-way contract."
Why
Before this, the canonicalization introduced in 1.6.7 was one-sided: the merchant saw `Source: Gemini` instead of `Source: Gemini.google.com`, but the agent had no way to know we were doing this or what brand name they'd be attributed under. They'd learn it only by completing a test checkout and inspecting `utm_source` on the resulting order.

After this:
- An agent at `mistral.ai` can see their hostname isn't mapped → passthrough displays `Source: mistral.ai` → they can open an issue to request `Source: Mistral`.
- Both fallback paths are documented: verbatim passthrough for unknown hostnames, and `ucp_unknown` for missing/malformed headers. Novel vendors don't have to guess.

How
The table is rendered at generation time from the PHP constant:
Single source of truth → the published contract and the runtime canonicalizer cannot drift. Adding a vendor to the map automatically updates llms.txt on the next cache-miss regeneration.
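A minimal sketch of what rendering from the constant could look like, assuming the map is a flat `hostname => brand` array. The function name `render_attribution_table()`, the column headers, and the sample entries are illustrative assumptions, not the plugin's actual code; only the one-row-per-brand grouping and the comma-separated hostnames come from this PR.

```php
<?php
/**
 * Render a hostname => brand map as a Markdown table with one row
 * per brand and that brand's hostnames comma-separated.
 * Column headers are assumed for illustration.
 */
function render_attribution_table(array $hosts): string {
    // Invert hostname => brand into brand => [hostnames],
    // keeping brands in first-seen order.
    $by_brand = [];
    foreach ($hosts as $hostname => $brand) {
        $by_brand[$brand][] = $hostname;
    }

    $lines = [
        '| Brand name | Hostnames |',
        '| --- | --- |',
    ];
    foreach ($by_brand as $brand => $hostnames) {
        $lines[] = sprintf('| %s | %s |', $brand, implode(', ', $hostnames));
    }

    return implode("\n", $lines) . "\n";
}

// Hypothetical excerpt standing in for KNOWN_AGENT_HOSTS.
$sample_hosts = [
    'chatgpt.com'       => 'ChatGPT',
    'openai.com'        => 'ChatGPT',
    'gemini.google.com' => 'Gemini',
];

echo render_attribution_table($sample_hosts);
```

Because the rows are derived by iterating the map, a new entry shows up in the table without any template change, which is the property the single-source-of-truth argument relies on.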
Sample output
Also: stale copy fix
Line 333 of `class-wc-ai-syndication-llms-txt.php` said "the server uses the hostname from that header as `utm_source` automatically", which was correct pre-1.6.7 but stale after. Updated to describe the canonicalization step and point at the mapping table.

Test plan
- `LlmsTxtTest.php` (32 tests total, 101 assertions)
- Every `KNOWN_AGENT_HOSTS` entry appears (auto-synchronized via `foreach`)
- `.pot` regenerated

Ships as
Merged to `main`, not tagged. Queues with PR #21 (tooltip) and PR #22 (rate-limit move) behind the next release signal.

🤖 Generated with Claude Code