rustdoc: add doc_link_canonical feature #143158

Open: wants to merge 2 commits into base: master

Conversation

lolbinarycat
Contributor

@lolbinarycat lolbinarycat commented Jun 28, 2025

Proposing this as a t-rustdoc experiment (do we have a process for those?). The implementation was surprisingly simple, so this is basically a complete implementation of my design, although ideally it would be integrated into cargo in the future using --crate-attr. It will likely need an FCP and then an RFC for stabilization. It may also need perf testing to make sure the added allocations have a negligible impact; I'm not sure.

fixes #143139

r? @GuillaumeGomez

Problem Statement

There are two main use cases for this feature:

  1. crates that wish to have their canonical documentation site be somewhere other than docs.rs.
  2. crates that want to have docs.rs be their canonical documentation site, but their crate docs are frequently included in doc bundles hosted by a third party.

In both cases, the same documentation ends up reachable at more than one URL, which can cause search engines to duplicate results or show the wrong (non-canonical) page in results.

Solution

A new crate-level doc attribute, html_link_canonical, is added. It inserts a link rel="canonical" element into the head of every documentation page, leveraging the existing mechanism search engines use for de-duplicating equivalent pages. Pages such as help and settings are excluded because they do not belong to any crate in particular.
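
As a rough illustration of the intended effect (not the code in this PR): the opt-in is shown in comments, with the feature gate name taken from the PR title and a placeholder URL. The helper `canonical_link_tag` is hypothetical and assumes the canonical URL is formed by appending the page's relative path to the attribute value.

```rust
// Crate-level opt-in, shown as a comment because it needs a nightly toolchain
// with this PR applied (the URL is a placeholder):
//
//   #![feature(doc_link_canonical)]
//   #![doc(html_link_canonical = "https://example.com/docs/")]

/// Hypothetical helper: builds the `<link rel="canonical">` element for one page.
/// Assumption: the page's relative path is appended to the configured base URL.
fn canonical_link_tag(base: &str, page_path: &str) -> String {
    // Normalize slashes so exactly one separates the base from the page path.
    let base = base.trim_end_matches('/');
    let page = page_path.trim_start_matches('/');
    // Minimal attribute escaping so the href cannot break out of its quotes.
    let href = format!("{}/{}", base, page)
        .replace('&', "&amp;")
        .replace('"', "&quot;");
    format!("<link rel=\"canonical\" href=\"{}\">", href)
}
```

Under these assumptions, `canonical_link_tag("https://example.com/docs/", "my_crate/struct.Foo.html")` would yield a link element whose href is `https://example.com/docs/my_crate/struct.Foo.html`.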

Future Work

  1. When html_link_canonical is used with no value, it will use the value of html_root_url, or --extern-html-root-url, if present.
  2. A cargo flag that implies -Zrustdoc-map and passes --crate-attr='doc(html_link_canonical)' to each rustdoc invocation. This would cause all crates that do not manually specify html_root_url or html_link_canonical to use the docs.rs page as the canonical page. Alternatively, instead of implying -Zrustdoc-map, cargo could simply reuse its logic, taking the value that would be passed to --extern-html-root-url and passing it via --crate-attr='doc(html_link_canonical="...")' instead.
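
The fallback in item 1 could be sketched roughly like this. The types and function are a hypothetical model, not rustdoc internals. The precedence of html_root_url over --extern-html-root-url is an assumption based only on the order in which they are listed above.

```rust
/// How the attribute was written in the crate (hypothetical model, not rustdoc's types).
enum LinkCanonical {
    /// Attribute absent: no canonical link is emitted.
    Absent,
    /// `#![doc(html_link_canonical)]` with no value.
    NoValue,
    /// `#![doc(html_link_canonical = "...")]` with an explicit URL.
    Value(String),
}

/// Picks the base URL for canonical links, or `None` if no link should be emitted.
/// Assumption: `html_root_url` is preferred over `--extern-html-root-url`.
fn resolve_canonical_base(
    attr: &LinkCanonical,
    html_root_url: Option<&str>,
    extern_html_root_url: Option<&str>,
) -> Option<String> {
    match attr {
        LinkCanonical::Absent => None,
        // An explicit value always wins.
        LinkCanonical::Value(url) => Some(url.clone()),
        // No value: fall back to the root URLs, if either is present.
        LinkCanonical::NoValue => html_root_url
            .or(extern_html_root_url)
            .map(str::to_owned),
    }
}
```

With this model, a crate that writes the attribute with no value and has neither root URL available would simply get no canonical link.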

@rustbot rustbot added A-attributes Area: Attributes (`#[…]`, `#![…]`) S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue. T-rustdoc-frontend Relevant to the rustdoc-frontend team, which will review and decide on the web UI/UX output. labels Jun 28, 2025
@rustbot
Collaborator

rustbot commented Jun 28, 2025

Some changes occurred in compiler/rustc_passes/src/check_attr.rs

cc @jdonszelmann

@lolbinarycat lolbinarycat force-pushed the rustdoc-rel-canonical-143139 branch from 50577b1 to 0a25ca7 Compare June 28, 2025 20:31
@lolbinarycat
Contributor Author

The one feature that isn't currently implemented is having the default value be the same as from html_root_url if you just specify doc(html_link_canonical) with no value.

@GuillaumeGomez
Member

Took a look at the implementation and it seems correct. However, I'm not sure it solves the problem the right way; I would need to fully understand the issue first (I don't have time right now, but I didn't want to leave the PR pending with no feedback).

So for experimental features, we generally go through an FCP. Now, for a PR to be able to go through FCP, the first comment needs to contain a lot more information: describe exactly the problem it tries to solve, how it solves it, and why it solves it this way.

@lolbinarycat
Contributor Author

@GuillaumeGomez did my best to describe the experiment, let me know if there's anything I can give more details on or write better.

@bors
Collaborator

bors commented Jul 20, 2025

☔ The latest upstream changes (presumably #144181) made this pull request unmergeable. Please resolve the merge conflicts.

@GuillaumeGomez
Member

Thanks for the complete explanation. Time to start the FCP then. :)

@rfcbot fcp merge

@rfcbot rfcbot added proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. labels Jul 20, 2025
@GuillaumeGomez
Member

Whoops, too many teams pinged. Sorry for the notifications...

@rfcbot fcp cancel

@rfcbot
Collaborator

rfcbot commented Jul 20, 2025

@GuillaumeGomez proposal cancelled.

@rfcbot rfcbot removed proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. labels Jul 20, 2025
@GuillaumeGomez GuillaumeGomez removed T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jul 20, 2025
@GuillaumeGomez
Member

@rfcbot fcp merge

@rfcbot
Collaborator

rfcbot commented Jul 20, 2025

Team member @GuillaumeGomez has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. labels Jul 20, 2025
@notriddle
Contributor

notriddle commented Jul 21, 2025

crates that want to have docs.rs be their canonical documentation site, but their crate docs are frequently included in doc bundles hosted by a third party.

Based on what I've seen discussed in rust-lang/docs.rs#1438, I think there's a double bind if you try to do that.

  • you can't put a latest URL in here, because if this current crate is not the latest, it would result in a "canonical" link to a page that is different from the current crate, and that'll get you penalized
  • you shouldn't put a versioned URL in here, because the docs.rs team wants the latest URL to get more link juice, so that users don't get outdated docs in their search results

So I don't think you can actually use it for that.

@lolbinarycat
Contributor Author

@notriddle couldn't you gate the attribute behind cfg(not(docsrs)) so that the attribute is only generated for third-party docs?

Of course, that would either have to wait for stabilization or require a separate nightly feature, so it wouldn't be a perfect solution.

@notriddle
Contributor

couldn't you gate the attribute behind cfg(not(docsrs)) so that the attribute is only generated for third-party docs?

The problem isn't with linking from docs.rs. The problem is linking to docs.rs. There's no URL to point at that simultaneously satisfies the rel="canonical" law and the team's SEO goals.

Outside of docs.rs, you've still got the same core problem: you have to make sure that the page you link to has the same crate version, not just the same name, or Google will ignore the link. Other than docs.rs and doc.rust-lang.org, most sites have no way to do that, because they don't have version numbers in their URLs.

Labels
A-attributes Area: Attributes (`#[…]`, `#![…]`) disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-rustdoc-frontend Relevant to the rustdoc-frontend team, which will review and decide on the web UI/UX output.
Development

Successfully merging this pull request may close these issues.

Rustdoc: Option to insert rel="canonical" links to docs.rs (deduplicate search engine results)
7 participants