Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix nls crash, and better refresh diagnostics. #1944

Merged
merged 5 commits into from
Jun 10, 2024
Merged

Fix nls crash, and better refresh diagnostics. #1944

merged 5 commits into from
Jun 10, 2024

Conversation

jneem
Copy link
Member

@jneem jneem commented Jun 5, 2024

In nls, if main.ncl imports dep.ncl and dep.ncl changes, we need
to invalidate some cached data about main.ncl and re-check it.
Previously we were simply resetting main.ncl to the "parsed" state in
the cache, but this isn't sufficient because our cached term for
main.ncl might depend on dep.ncl. Specifically, import resolution of
main.ncl transforms the term in a way that depends on whether
dep.ncl parsed.

This PR does the most inefficient (but easiest to get correct)
thing: throwing out all the parsed data and re-parsing from scratch.
This can be optimized, but it probably isn't too much slower than the
status quo, because we were re-typechecking from scratch anyway.

Fixes #1943 and maybe also #1942 and #1935. There are still some bugs related to cache state in the background worker, which I will investigate next.

jneem added 3 commits June 5, 2024 15:27
In nls, if `main.ncl` imports `dep.ncl` and `dep.ncl` changes, we need
to invalidate some cached data about `main.ncl` and re-check it.
Previously we were simply resetting `main.ncl` to the "parsed" state in
the cache, but this isn't sufficient because our cached term for
`main.ncl` might depend on `dep.ncl`. Specifically, import resolution of
`main.ncl` transforms the term in a way that depends on whether
`dep.ncl` parsed.

This commit does the most inefficient (but easiest to get correct)
thing: throwing out all the parsed data and re-parsing from scratch.
This can be optimized, but it probably isn't too much slower than the
status quo, because we were re-typechecking from scratch anyway.
When `main.ncl` imports `dep.ncl` and `dep.ncl` changes, we invalidate
`main.ncl` and re-do some checks. This commit adds background
invalidation to the list of checks that we re-do.
/// Remove the cached term associated with this id.
///
/// The file contents associated with this id remain, and they will be re-parsed if necessary.
pub fn reset(&mut self, file_id: FileId) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have mixed feeling about this function. I wonder if it can lead to inconsistent state, like the entry's state saying that the file is parsed while it's not in the terms cache. Or some ResolvedImport somewhere depending on this file_id which suddenly becomes invalid.

I guess a first step would be to change the state of the entry as well. I would prefer this function to be private, but nls needs it obviously. Maybe both a comment warning about the fact that invalidating a file_id that may be in use somewhere is "dangereous" (can cause panics in practice), and/or giving the function an explicit name, like unsafe_reset or _reset or whatever?

Another possibility might be to handle the re-parsing in the same function, so that you can't observe externally the missing file, but I don't know if that would fit your workflow here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I should be clearing some of the other state (i.e. imports, rev_imports, and wildcards), but the entry state is contained in terms so there's no chance in it being stale, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After discussion at the weekly, I'll refactor this so that Cache's API does the rev-dependency invalidation. This way Cache's public API can't leave things in an invalid state.

// Note that this will cause the contents to be re-parsed, which might be avoidable
// in some situations. It's safest to completely reset the state because post-parsing
// transformations (especially import resolution) can change the cached term in ways
// that are invalidated by the changes we just received.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, morally, would re-running the pipeline from import resolution (and then typechecking and analysis) be sufficient? Of course not in the current state, it would need some adaptation to program transformation. I'm just trying to understand at a high-level how we could make it better in the long term.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think re-running from import resolution should work, but then we'd need to keep a copy of the pre-import-resolution term (right now we consume it to produce the import-resolved term).

The more performant option would be to keep track of what changed with the dependency: if the dependency's success/failure status didn't change, we don't need to re-run import resolution. If the dependency's type didn't change, we don't need to re-run typechecking.

@github-actions github-actions bot temporarily deployed to pull request June 7, 2024 17:09 Inactive
@jneem
Copy link
Member Author

jneem commented Jun 7, 2024

@yannham I've implemented the solution we discussed yesterday, where the full recursive invalidation happens in the cache.

It occurred to me that there's still some possibility for misuse in the cache API, since replace_string deletes the cached term without invalidating things that import it. Maybe replace_string should do the invalidation automatically? I didn't do that here because it's more invasive...

@jneem jneem requested a review from yannham June 7, 2024 17:43
@yannham
Copy link
Member

yannham commented Jun 10, 2024

It occurred to me that there's still some possibility for misuse in the cache API, since replace_string deletes the cached term without invalidating things that import it. Maybe replace_string should do the invalidation automatically? I didn't do that here because it's more invasive...

Hmm, I'm not entirely set on this. On one hand you're right that this has the same risk of kinda inconsistent state lying around. On the other hand replace_string is used in practice for like queries and error message snippets, which shouldn't be imported from anywhere. Well then maybe it's virtually free to invalidate the reverse dependencies, because in practice there shouldn't be any?

Comment on lines -96 to -97
// Invalidate any cached inputs that imported the newly-opened file, so that any
// cross-file references are updated.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this comment stay?

@jneem
Copy link
Member Author

jneem commented Jun 10, 2024

Well then maybe it's virtually free to invalidate the reverse dependencies, because in practice there shouldn't be any?

That's probably right, I just didn't want this bugfix to have unintended consequences...

@jneem jneem enabled auto-merge June 10, 2024 14:13
@github-actions github-actions bot temporarily deployed to pull request June 10, 2024 14:13 Inactive
@jneem jneem added this pull request to the merge queue Jun 10, 2024
Merged via the queue into master with commit 446128b Jun 10, 2024
5 checks passed
@jneem jneem deleted the nls-crash branch June 10, 2024 14:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

LSP crash on _some_ modifications of a file imported in another
2 participants