@xworld21
Contributor

This is an attempt at refactoring all I/O calls that deal with file/directory names through wrapper functions in Util::Pathname. This means all the -X tests, open, opendir, stat, and unlink. There might be others I did not catch.

My goal is to guarantee that normal LaTeXML code is never exposed to raw file names, which are not Unicode strings and must be decoded and encoded before use. The encoding and decoding will be done in the wrapper functions, although not yet: I first want to know whether this refactor is acceptable.

(The encoding and decoding itself is simple, see #2545 for Unix using Encode::Locale. For Windows, I concluded that the only way to go is Win32::LongPath.)
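To make the idea concrete, here is a minimal sketch of what such wrappers could look like once encoding is added. It is not the code in this PR: the helpers `fs_encode`, `fs_decode` and `pathname_readdir` are hypothetical names, and the file system encoding is simply assumed to be UTF-8 so that the example needs only core modules (on Unix, Encode::Locale would supply the real encoding).

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Assumption: file names on disk are UTF-8 bytes. A real implementation
# would look up the file system encoding instead of hard-coding it.
my $FS_ENCODING = 'UTF-8';

# Hypothetical helpers: Unicode string <-> raw file name (bytes).
sub fs_encode { my ($name)  = @_; return encode($FS_ENCODING, $name); }
sub fs_decode { my ($bytes) = @_; return decode($FS_ENCODING, $bytes); }

# Wrapper for the -f file test; callers pass a Unicode string.
sub pathname_test_f {
  my ($pathname) = @_;
  my $bytes = fs_encode($pathname);
  return -f $bytes; }

# Hypothetical wrapper for opendir/readdir; returns decoded entry names.
sub pathname_readdir {
  my ($dir) = @_;
  opendir(my $dh, fs_encode($dir)) or return;
  my @names = map { fs_decode($_) } grep { $_ ne '.' && $_ ne '..' } readdir($dh);
  closedir($dh);
  return @names; }
```

With wrappers of this shape, calling code only ever sees Unicode strings, and the byte-level file names stay inside Util::Pathname.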

@brucemiller
Owner

This is very scary, but I really like the idea of collecting all the I/O stuff into one (hopefully, eventually) consistent API. It should also help us finally focus on some of the Issues & PRs that you've worked on regarding portability. We'll definitely want to think through the naming conventions, though. And we'll need a lot of feedback from @dginev, when he's available. [In the meantime: Thanks for all the effort!]

Collaborator

@dginev dginev left a comment

I am slightly concerned about the performance loss (many new function calls), but if the consistency is worthwhile that is likely fine. We only manage on the order of hundreds of files in a single conversion run.

The names of the new functions are fine actually - they fit the other pathname_* functions. If we wanted to be explicit that these are file tests, we could consider adding a _filetest_ piece, giving pathname_filetest_r. Or we could go even shorter and use filetest_r? But if we were to add a new term, I'd use the one from the Perl documentation.

@dginev
Collaborator

dginev commented Oct 15, 2025

Also, could you please rebase the PR to the master branch? Windows CI should be passing again.

Collaborator

@dginev dginev left a comment

Circling back here (while Bruce is still "on holiday"), I still think we need a better naming strategy for the testing subroutines.

pathname_f just doesn't evoke any useful meaning to me. filetest_f does - so maybe that is a reasonable change? Or a name on those lines?

I mean that specifically for the subs that do the Perl -X ("dash") file tests.

@zmughal

zmughal commented Oct 29, 2025

I don't have deep knowledge of the code, but I did notice this PR.

Could some of this be done by bringing in Path::Tiny and subclassing it? It is a fairly standard OO wrapper for path handling. Via the ->stat method it calls the module File::stat, which is a core module that provides the ->cando() method to do the equivalent of the -X type tests.

It won't solve the encoding problem, but it may help with the API design.
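As a rough illustration of the `->cando()` approach, here is a sketch that uses the core File::stat module directly (Path::Tiny's `->stat` returns the same kind of object). The helper name `path_permissions` is hypothetical and only meant to show the API shape, not a proposal for LaTeXML's naming.

```perl
use strict;
use warnings;
use Fcntl qw(:mode);    # S_IRUSR, S_IWUSR, S_IXUSR, S_ISDIR
use File::stat;         # overrides stat() to return an object

# Hypothetical helper: answer several -X style questions from one stat call.
sub path_permissions {
  my ($path) = @_;
  my $st = stat($path) or return;    # nothing to report if stat fails
  my %info = (
    readable   => ($st->cando(S_IRUSR, 1) ? 1 : 0),    # like -r
    writable   => ($st->cando(S_IWUSR, 1) ? 1 : 0),    # like -w
    executable => ($st->cando(S_IXUSR, 1) ? 1 : 0),    # like -x
    is_dir     => (S_ISDIR($st->mode)     ? 1 : 0),    # like -d
  );
  return \%info; }
```

The second argument to `cando` selects the effective (true) or real (false) user and group IDs, which matches the distinction between `-r` and `-R` in the built-in tests.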

@xworld21
Contributor Author

xworld21 commented Nov 1, 2025

> pathname_f just doesn't evoke any useful meaning to me. filetest_f does - so maybe that is a reasonable change? Or a name on those lines?

In the meantime I changed it to pathname_test_X.

4 participants