Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: relative cache support #114

Open
wants to merge 11 commits into
base: main
Choose a base branch
from
215 changes: 215 additions & 0 deletions designs/2023-relative-cache/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,215 @@
Repo: eslint/eslint
- Start Date: 2023-02-14
- RFC PR: <https://github.com/eslint/rfcs/pull/106>
- Authors: Christian Schulz (@cschulz) and cs6cs6

# Relative cache location strategy

## Summary

To be able to share the created eslint cache between different systems f.e. local machine and CI, there is a need for a relative path stored in the cache.

## Motivation

The overall motivation for doing this is performance during the eslint process.

Most people are running eslint on local developer machine and additional on the CI workers.
In high changing repositories there are several pull requests at the same time changing different files requiring linting all the time.
This is a time consuming process depending on the enabled rules like type checking rules.

If the CI worker is able to cache its last run, it can just pick this cache up and use it.
Also developers can reuse this cache or commit it as part of a repository for the CI.

## Detailed Design

I have a branch with the following design already. It is not fully tested, so I'm not 100% sure this will be the final implementation but I think it is a good start.

The LintResultCache takes a file path as parameter to find or store a file cache entry.

The suggested approach is to truncate the given absolute path of the file to a relative path in the cache.

### Example existing cache file

Here is an example existing cache file for an extremely small project with two source files. It has been formatted for readability.

You can see that the full path of the file is included in the filenames, '/home/USER/git/samplecode'. There is also an obscure field, the value
1k0siqs. That is the hash-value of a stringified ConfigArray object that contains all the configuration for the entire eslint run. It is there so
that any rule changes or configuration changes will cause a complete eslint recheck on all files.

```
[
{
"/home/USER/git/samplecode/src/formatter.ts": "1",
"/home/USER/git/samplecode/src/vite-env.d.ts": "2"
},
{
"size": 2584,
"mtime": 1695823503788,
"results": "3",
"hashOfConfig": "4"
},
{
"size": 75,
"mtime": 1695814197580,
"results": "5",
"hashOfConfig": "4"
},
{
"filePath": "6",
"messages": "7",
"suppressedMessages": "8",
"errorCount": 0,
"fatalErrorCount": 0,
"warningCount": 0,
"fixableErrorCount": 0,
"fixableWarningCount": 0
},
"1k0siqs",
{
"filePath": "9",
"messages": "10",
"suppressedMessages": "11",
"errorCount": 0,
"fatalErrorCount": 0,
"warningCount": 0,
"fixableErrorCount": 0,
"fixableWarningCount": 0
},
"/home/USER/git/samplecode/src/formatter.ts",
[],
[],
"/home/USER/git/samplecode/src/vite-env.d.ts",
[],
[]
]

```

### Suggested updated cache file

Here is a suggested updated cache file, with relative paths instead of full paths. The . is the directory from which
cs6cs6 marked this conversation as resolved.
Show resolved Hide resolved
eslint is run. Typically, eslint is always run from the same place, so using a relative path should not cause a problem.
cs6cs6 marked this conversation as resolved.
Show resolved Hide resolved

Note that the obscure hash value is still there, and is now Ak0sovs. This hash value will need to be calculated using
relative paths only.

```
[
{
"./src/formatter.ts": "1",
"./src/vite-env.d.ts": "2"
},
{
"size": 2584,
"mtime": 1695823503788,
"results": "3",
"hashOfConfig": "4"
},
{
"size": 75,
"mtime": 1695814197580,
"results": "5",
"hashOfConfig": "4"
},
{
"filePath": "6",
"messages": "7",
"suppressedMessages": "8",
"errorCount": 0,
"fatalErrorCount": 0,
"warningCount": 0,
"fixableErrorCount": 0,
"fixableWarningCount": 0
},
"Ak0sovs",
{
"filePath": "9",
"messages": "10",
"suppressedMessages": "11",
"errorCount": 0,
"fatalErrorCount": 0,
"warningCount": 0,
"fixableErrorCount": 0,
"fixableWarningCount": 0
},
"./src/formatter.ts",
[],
[],
"./src/vite-env.d.ts",
[],
[]
]
```

### Adding the command line parameter
- conf/default-cli-options.js: add the property 'shareableCache' with a default of false. It should be put in the section with the other cache variables, below cacheStrategy.
- docs/src/use/command-line-interface.md: Add an explanation of the 'shareable-cache' variable with the other cache variables: "By default, an eslintcache contains full file paths and thus cannot readily be shared across developers or ci machines. "False" is that default. "True" changes the internal storage to store a relative path to the process execution directory, thus making the eslintcache shareable across developers and ci machines. . If you change this setting, you must regenerate your entire eslint cache."
- eslint-helpers.js: add shareableCache variable (set to false) to the processOptions, and make sure it must be a boolean.
- lib/options.js: Add an option 'shareable-cache' of type Boolean with a nice description for people to read: By default, an eslintcache contains full file paths and thus cannot readily be shared across developers or ci machines. "False" is that default. "True" changes the internal storage to store a relative path to the process execution directory, thus making the eslintcache shareable across developers and ci machines. . If you change this setting, you must regenerate your entire eslint cache.
- lib/cli.js: Add the shareableCache option, defaulted to false, in the translateOptions function.
cs6cs6 marked this conversation as resolved.
Show resolved Hide resolved

cs6cs6 marked this conversation as resolved.
Show resolved Hide resolved
### Changing cache file serialization
- lib/cli-engine/lint-result-cache.js: Add the properties 'cwd' and 'shareableCache' to the LintResultCache class. 'cwd' is a string, the current working directory. shareableCache is a boolean, the result of the user's passed in command line parameter. Use these values in two main places:
1. Whenever we store or retrieve file results in the cache, use the results of the new class function, getFileKey(filePath). If shareableCache is false, it uses a full file path as the key. If shareableCache is true, uses a relative file path.
2. Update the function hashOfConfigFor, so that when the configuration object is hashed into a hash key and put into the eslintcache, if shareableCache is set to true, the file paths hashed are all relative paths. Implementation detail:
cs6cs6 marked this conversation as resolved.
Show resolved Hide resolved
```
cs6cs6 marked this conversation as resolved.
Show resolved Hide resolved
function hashOfConfigFor(config, cwd) {
if (!configHashCache.has(config)) {
// replace the full directory path in the config string to make the hash location-neutral
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this feature will only be implemented for the new config system, I believe there is no need for this because there are no absolute paths in configs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A place where we do have absolute paths is LintResult#filePath, so that's something we'll need to convert from absolute to relative and vice versa when cache entries are stored and retrieved.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A place where we do have absolute paths is LintResult#filePath, so that's something we'll need to convert from absolute to relative and vice versa when cache entries are stored and retrieved.

Or, we could omit LintResult#filePath property from cache entries and add it back when cache entries are retrieved.

const stringConfig = stringify(config).replaceAll(cwd, '.');
cs6cs6 marked this conversation as resolved.
Show resolved Hide resolved
configHashCache.set(config, hash(`${pkg.version}_${nodeVersion}_${stringConfig}`));
}
```
- lib/cli-engine/cli-engine.js: Update the constructor call to LintResultCache to add the new parameters cwd and shareableCache.
- lib/eslint/flat-eslint.js: Update the constructor call to LintResultCache to add the new parameters cwd and shareableCache.

## Documentation

Add the new CLI flag to the CLI documentation. The new cli flag will be called shareable-cache.

## Drawbacks
cs6cs6 marked this conversation as resolved.
Show resolved Hide resolved

Adding a new CLI flag adds some complexity for the end user. They will need to understand why the feature exists.

Furthermore, defaulting to the existing behavior does help make upgrading less painful, but it is possible most people expect a shareable cache by default. This may cause confusion.

## Backwards Compatibility Analysis

The new flag, shareable-cache, passed in on the command line, will default to false. Unless people explicitly set it to true, then the old algorithms will
be used. Users of the eslint cache will still need to regenerate their cache no matter how they set this flag, because the current implementation forces a cache rebuild if there are
any configuration changes. Adding a new property counts as a change.
Copy link
Member

@fasttime fasttime May 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you clarify the last sentence? Is a new property some field in the cache, in the config, or what exactly? Will removing a property or changing its value also count as a change?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any new version of ESLint invalidates cache entries stored by a previous version, so I think this is not a concern.

https://github.com/eslint/eslint/blob/d2d83e045ad03f024d1679275708054d789ebe20/lib/cli-engine/lint-result-cache.js#L50


## Alternatives

A CLI tool which translates the existing cache to the current folder structure. This could be any sort of script that will basically do a replace-all on the paths in
the .eslintcache file and also recalculates the hash so that they are the new location rather than the old. A downside to this is that any tool based on string
replacement would be fragile at best.

## Open Questions

No.

## Help Needed

Decision needed: Make every cache shareable, or keep the shreable-cache flag default to false?
cs6cs6 marked this conversation as resolved.
Show resolved Hide resolved

I originally planned the shareable-cache flag so that users of eslint could not be surprised by changed cache behavior, and would also not have to regenerate their cache
for a patch version change. But I discovered by testing that there is no way to NOT force people to regenerate their eslint cache with this change. That's because by adding
the 'shareable-cache' flag, it adds a property to the ConfigArray object. Even if it's set to false (the old behavior), the object's structure changes with a new property
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is this changing the ConfigArray? This setting is at the CLI level, not at the config level, so there shouldn't be any changes to the ConfigArray.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The object itself, along with all of its properties (set or not) is rolled into the computed hash code that goes into the file. So, simply by adding a new property, the hash computed of the object changes, even if that property is deliberately set to create no change in behavior.

field. This in turn changes the ConfigArray object's stringify results, which are fed into a hash function to create a hash key. (See lint-result-cache.js function hashOfConfigFor.)
In turn, that becomes a part of the eslintcache file. I think it's so that everything will be re-scanned with configuration changes.

For that reason, should we simply have a shareable cache full of relative paths be the only behavior? That means no command line flag, and strictly internal changes that
cs6cs6 marked this conversation as resolved.
Show resolved Hide resolved
would be invisible to the end user. On the other hand, if this change DOES have undesirable side effects because people are using eslint in unexpected ways, having the flag
default to the old behavior would save them from surprises on upgrade. Judging by the number of .gitignore files with .eslintcache in them, people are used to this
cache not being shareable.

What would the eslint team prefer? No flag, or keep the shareable-cache flag and default to false?

## Frequently Asked Questions

TBD

## Related Discussions

[Change Request: Eslintcache relative #16493](https://github.com/eslint/eslint/issues/16493)