Extraneous text in PDF document outline items #484

Closed
sogaiu opened this issue Feb 20, 2025 · 5 comments

sogaiu commented Feb 20, 2025

Please have a look at the following screenshot of a PDF viewer window displaying a portion of "The Racket Reference" (Version 8.15) [1]:

[Screenshot: PDF viewer with the outline pane on the left and the document text on the right]

The left side shows the "Outline" (IIUC, sometimes referred to as Bookmarks or Table of Contents) for the PDF.

The right side shows a bit of the document proper.

With that basic description in mind, please note that the highlighted line on the left side shows the text:

3.1 Modules: IdentifierColorblackmodule, IdentifierColorblackmodule*, ...

It looks to me like there are instances of extra text (i.e. "IdentifierColorblack"). I think this should probably be more like:

3.1 Modules: module, module*, ...

(see the website version of the same section for comparison).

This doesn't appear to be an isolated incident as a similar situation exists in at least version 8.15 of the PDF for "The Racket Guide" [2], e.g. for the outline item for section 2.2.5 which has the text:

2.2.5 Conditionals with IdentifierColorblackif, IdentifierColorblackand, IdentifierColorblackor, and IdentifierColorblackcond


Some investigation revealed that one of the related .scrbl files has code like [3]:

@section[#:tag "module"]{Modules: @racket[module], @racket[module*], ...}

...and the corresponding locally generated .tex files contained lines like:

\Ssection{Modules: \RktSym{\badlink{\RktValLink{module}}}, \RktSym{\badlink{\RktValLink{module*}}}, ...}{Modules: \RktSym{\badlink{\RktValLink{module}}}, \RktSym{\badlink{\RktValLink{module*}}}, ...}\label{t:x28part_x22module22x29}

and:

\newcommand{\RktSym}[1]{\inColor{IdentifierColor}{#1}}

It looks like things wrapped in @racket[...] are leading to the "extra text" results.


(On a side note, some of the above investigation was carried out on the scribble Discord channel with spdegabrielle, soegaard, and velcrovan. velcrovan mentioned something about filing an issue.)


[1] PDF of The Racket Reference (Version 8.15)
[2] PDF of The Racket Guide (Version 8.15)
[3] Please see this bit for the corresponding "The Racket Guide" section 2.2.5 example.

otherjoel commented Feb 20, 2025

(I'm velcrovan on Discord) From section 6.11.5 in the Scribble docs:

\Ssection{}{}, \Ssubsection{}{}, \Ssubsubsection{}{}, \Ssubsubsubsection{}{}, \Ssubsubsubsubsection{}{} — for a top-level section, a second-level section, etc., where the last variant is used for all sections that are deeper than four levels. The first argument corresponds to the optional argument to \section, which is used for the table of contents.
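To make the quoted description concrete, here is a minimal sketch of a two-argument section macro that forwards its first argument to the optional argument of \section. \DemoSsection is a hypothetical stand-in used only for illustration; it is not Scribble's actual definition of \Ssection, which may do more.

```latex
\documentclass{article}
\usepackage{hyperref} % builds the PDF outline (bookmarks)

% Hypothetical stand-in for \Ssection:
%   #1 -> optional argument of \section (text for the table of contents)
%   #2 -> mandatory argument of \section (heading printed on the page)
\newcommand{\DemoSsection}[2]{\section[#1]{#2}}

\begin{document}
\tableofcontents

\DemoSsection{Title for the contents}{Title printed as the heading}
Body text.
\end{document}
```

With hyperref loaded, the outline entries are built from the same contents-line text, so any markup in the first argument also reaches the outline.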

As @sogaiu has shown, the generated latex supplies the same content for both arguments to \Ssection:

\Ssection{Modules: \RktSym{\badlink{\RktValLink{module}}}, \RktSym{\badlink{\RktValLink{module*}}}, ...}{Modules: \RktSym{\badlink{\RktValLink{module}}}, \RktSym{\badlink{\RktValLink{module*}}}, ...}

It seems that the result of \inColor is probably being interpreted as literal content when building the bookmarks/outline for the PDF. One approach might be for Scribble to strip off the formatting for that first argument, generating instead, e.g.:

\Ssection{Modules: module, module*, ...}{Modules: \RktSym{\badlink{\RktValLink{module}}}, \RktSym{\badlink{\RktValLink{module*}}}, ...}

But it might also be desirable to keep that formatting in the TOC shown in the PDF itself.

This StackOverflow answer seems like a better approach. Scribble appears to use the hyperref package to generate the PDF bookmarks. So if Scribble is able to create a version of the content with all the formatting stripped off, it might be able to use \texorpdfstring inside the first argument to \Ssection, supplying the plain-text version of the content as the second argument to \texorpdfstring. This approach might fix the bookmarks while still allowing the formatting in the TOC shown in the document:

\Ssection{\texorpdfstring{Modules: \RktSym{\badlink{\RktValLink{module}}}, \RktSym{\badlink{\RktValLink{module*}}}, ...}}{Modules: module, module*, ...}{Modules: \RktSym{\badlink{\RktValLink{module}}}, \RktSym{\badlink{\RktValLink{module*}}}, ...}

I suggest manually applying this fix on one of the affected section headings in a generated .tex file (I don't have time to do it myself). If that works then someone who knows Scribble guts would have a head start on fixing the issue.
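For anyone who wants to try the idea outside of Scribble first, the following is a minimal standalone sketch of \texorpdfstring in a section title. It assumes only the xcolor and hyperref packages; \DemoSym is a hypothetical stand-in for \RktSym (colored monospace text) and does not reproduce Scribble's real macros or preamble.

```latex
\documentclass{article}
\usepackage{xcolor}
\usepackage{hyperref} % provides \texorpdfstring and builds the PDF outline

% Hypothetical stand-in for Scribble's \RktSym: colored monospace text.
\newcommand{\DemoSym}[1]{\texttt{\textcolor{blue}{#1}}}

\begin{document}
\tableofcontents

% Optional argument: text for the table of contents and the outline.
%   \texorpdfstring{<typeset version>}{<plain-text version>}
% The typeset version is used on the page (including the printed contents);
% the plain-text version is used for the PDF outline entry.
\section[\texorpdfstring{Modules: \DemoSym{module}, \DemoSym{module*}, ...}%
                        {Modules: module, module*, ...}]%
        {Modules: \DemoSym{module}, \DemoSym{module*}, ...}

Body text.
\end{document}
```

Run pdflatex on this file twice so that the table of contents and the outline are both populated, then compare the outline entry with the printed contents line.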

sogaiu commented Feb 21, 2025

Thanks for the further investigation and hints.

I tried applying the suggestion by wrapping \texorpdfstring around the first argument to \Ssection as:

\Ssection{\texorpdfstring{Modules: \RktSym{\badlink{\RktValLink{module}}}, \RktSym{\badlink{\RktValLink{module*}}}, ...}{Modules: module, module*, ...}}{Modules: \RktSym{\badlink{\RktValLink{module}}}, \RktSym{\badlink{\RktValLink{module*}}}, ...}\label{t:x28part_x22module22x29}

where the original was:

\Ssection{Modules: \RktSym{\badlink{\RktValLink{module}}}, \RktSym{\badlink{\RktValLink{module*}}}, ...}{Modules: \RktSym{\badlink{\RktValLink{module}}}, \RktSym{\badlink{\RktValLink{module*}}}, ...}\label{t:x28part_x22module22x29}

Due to things being on a single line, I think that's a bit hard to see here. Basically it's replacing the first argument to \Ssection with:

\texorpdfstring{Modules: \RktSym{\badlink{\RktValLink{module}}}, \RktSym{\badlink{\RktValLink{module*}}}, ...}{Modules: module, module*, ...}

which is calling \texorpdfstring with:

{Modules: \RktSym{\badlink{\RktValLink{module}}}, \RktSym{\badlink{\RktValLink{module*}}}, ...}

and:

{Modules: module, module*, ...}

Running the modified .tex file through pdflatex a couple of times seems to have had the desired effect.

In the image below, the highlighted item on the left side of the window is the one that has been modified and it appears to be lacking the IdentifierColor... bits as desired. (Note that as expected there are a number of other items that still possess some instances of IdentifierColor....)

[Screenshot: PDF viewer with the modified outline item highlighted on the left]

The steps I followed were roughly:

cd racket/pkgs/racket-doc/scribblings/reference

# create .tex file
scribble --latex syntax.scrbl

# modify the resulting syntax.tex to use \texorpdfstring (around line 549)
# start editor and cast appropriate spells

# invoke pdflatex a couple of times to end up with syntax.pdf
pdflatex syntax.tex
pdflatex syntax.tex

otherjoel commented

Thanks, it looks like you figured out what I meant despite my sloppy code example. In the modified PDF, does the monospace formatting still appear correctly in the on-page TOC and in the heading itself?

sogaiu commented Feb 22, 2025

Sorry, I didn't figure out how to build a .pdf for the whole reference with the suggested changes, so I haven't verified the on-page TOC [1]. I'm willing to try; it's just that I'm pretty ignorant of the necessary commands (^^;

As for the monospace formatting being preserved in the heading on the page, that looks like it worked:

[Screenshot: PDF viewer showing the section heading on the page]

It's the part with the text:

1 Modules: module, module*, ...

at the bottom of the right-hand side of the window in the image above, right?


[1] By "on-page TOC", I presume you mean the bit that starts on page 2 of the whole reference and starts with the word "Contents" at the top of the page.

sogaiu commented Feb 22, 2025

Not sure if this was appropriate, but I tried the following steps to create a reference.pdf file using the changes in #485 (c54da36):

  1. Built racket "in-place" from source (racket/racket@1f49402)
  2. Modified the PATH environment variable to include $HOME/src/racket/racket/bin [1] (I presume this is what I need to use the in-place build of racket.)
  3. Hand-patched the two files from c54da36 that live under $HOME/src/racket/racket/share/pkgs/scribble-lib/scribble. (I did this in the hopes that invocations of scribble would be suitably affected.)
  4. Invoked scribble --pdf reference.scrbl from within the $HOME/src/pkgs/racket-doc/scribblings/reference directory.

Below is a screenshot of the resulting reference.pdf:

[Screenshot: the resulting reference.pdf in a PDF viewer]

The result looks good to me 👍


[1] $HOME/src/racket is where I've cloned the racket repository to.

mflatt closed this as completed in 0680862 on Feb 25, 2025