
Add serialized parsing cache for shipped libraries#351

Merged
nlothian merged 2 commits into main from
codex/implement-serialization-for-parsed-clauses
Dec 13, 2025

Conversation

@nlothian
Owner

Summary

  • add on-disk serialization for parsed shipped library modules keyed by mtime and parser configuration
  • load cached ASTs during consult before parsing and regenerate serialized artifacts when sources change
  • cover cache hits, invalidation, and behavior parity with new tests
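The caching scheme the summary describes (on-disk entries keyed by source mtime and parser configuration) could be sketched roughly as below. This is a hypothetical illustration, not the actual vibeprolog implementation: the function names (`cache_key`, `load_cached_ast`, `store_ast`) and the exact key construction are invented; only the `.vibe_parsed_cache` directory name and the mtime/config keying come from the PR discussion.

```python
import hashlib
import pickle
from pathlib import Path

CACHE_DIR_NAME = ".vibe_parsed_cache"  # directory name mentioned in the review

def cache_key(source: Path, parser_config: dict) -> str:
    """Derive a cache key from the source path, its mtime, and the parser config."""
    config_digest = hashlib.sha256(
        repr(sorted(parser_config.items())).encode()
    ).hexdigest()
    mtime = source.stat().st_mtime_ns  # changes when the source file changes
    return hashlib.sha256(f"{source}:{mtime}:{config_digest}".encode()).hexdigest()

def load_cached_ast(source: Path, parser_config: dict):
    """Return the cached AST for this source, or None on a cache miss."""
    cache_file = source.parent / CACHE_DIR_NAME / f"{cache_key(source, parser_config)}.pkl"
    if not cache_file.exists():
        return None  # never cached, or key changed because mtime/config changed
    with cache_file.open("rb") as handle:
        return pickle.load(handle)

def store_ast(source: Path, parser_config: dict, ast) -> None:
    """Serialize a freshly parsed AST next to the source library."""
    cache_dir = source.parent / CACHE_DIR_NAME
    cache_dir.mkdir(exist_ok=True)
    cache_file = cache_dir / f"{cache_key(source, parser_config)}.pkl"
    with cache_file.open("wb") as handle:
        pickle.dump(ast, handle)
```

Because the mtime is folded into the key, touching the source file changes the key and naturally invalidates the old entry, which matches the invalidation behavior the tests cover.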

Testing

  • uv run pytest tests/test_module_parsing_cache.py

Codex Task


@kilo-code-bot (bot) left a comment


✅ No Issues Found

2 files reviewed | Confidence: 95% | Recommendation: Merge

Review Details

Files: tests/test_module_parsing_cache.py (4 new tests), vibeprolog/interpreter.py (new serialization methods)

Checked: Security, bugs, performance, error handling, cache invalidation logic

Summary: The PR adds on-disk caching for parsed shipped library modules using pickle serialization. Cache is validated by file mtime and parser configuration. Tests cover cache hits, invalidation, and behavior parity. Implementation is robust with proper error handling and security considerations.

Contributor

@gemini-code-assist (bot) left a comment


Code Review

This pull request introduces an on-disk cache for parsed library modules, which is a great performance enhancement. The implementation correctly uses file modification times and parser configuration to key the cache, ensuring correctness. The logic for loading from cache, invalidating it, and populating it on a miss seems sound. The new tests are thorough and cover cache hits, invalidation, and parity.

I've added a couple of comments on vibeprolog/interpreter.py:

  • A high-severity security note about the use of pickle for deserialization, which can be a vector for arbitrary code execution if cache files are located in untrusted directories.
  • A medium-severity suggestion to improve error handling when loading from the cache to avoid silently swallowing all exceptions, which will aid in debugging.

Overall, this is a solid contribution that should improve performance. Addressing the feedback will make it more robust and secure.
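The error-handling suggestion above can be illustrated with a small sketch: instead of a bare `except Exception` that silently swallows everything, catch only the failures a cache load can realistically produce and log them. The function name `load_cache_payload` and the exact exception list are assumptions for illustration, not the code actually merged in this PR.

```python
import logging
import pickle
from pathlib import Path

logger = logging.getLogger(__name__)

def load_cache_payload(cache_file: Path):
    """Load a pickled cache entry; treat corruption as a miss, but log it."""
    try:
        with cache_file.open("rb") as handle:
            return pickle.load(handle)
    except FileNotFoundError:
        return None  # ordinary cache miss; nothing worth logging
    except (pickle.UnpicklingError, EOFError, AttributeError, OSError) as exc:
        # Corrupt, truncated, or stale entry: log for debugging, then fall
        # back to reparsing the source instead of crashing consult.
        logger.warning("Ignoring unreadable parse cache %s: %s", cache_file, exc)
        return None
```

Narrowing the handler this way preserves the "fall back to parsing" behavior while letting genuinely unexpected errors (e.g. a `KeyboardInterrupt`) propagate, which is the debugging benefit the reviewer is pointing at.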

Comment thread vibeprolog/interpreter.py

try:
    with cache_file.open("rb") as handle:
        payload: SerializedParsedModule = pickle.load(handle)
Contributor


security-high

Loading data with pickle.load from a file can lead to arbitrary code execution if the file's content can be controlled by an attacker. The cache files are stored in .vibe_parsed_cache directories within the library search paths. If any of these library paths are in a world-writable location (e.g., /tmp), it could expose a security vulnerability. Please ensure library paths are in trusted, permission-controlled locations. For greater security, consider using a safer serialization format like JSON, although this may require more work to serialize/deserialize the complex objects.
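The safer-format alternative the reviewer mentions could look roughly like this: a cache entry serialized to JSON, which `json.loads` can never turn into arbitrary code execution. This is only a sketch under the assumption that the parsed clauses can be reduced to plain dicts/lists/strings; the real clause objects would need explicit to/from-data conversion that this snippet does not attempt, and the function names are invented.

```python
import json
from pathlib import Path

def store_entry_json(cache_file: Path, mtime_ns: int, ast_as_data) -> None:
    """Write a cache entry as JSON; json carries no executable payloads."""
    cache_file.write_text(json.dumps({"mtime_ns": mtime_ns, "ast": ast_as_data}))

def load_entry_json(cache_file: Path, expected_mtime_ns: int):
    """Return the cached AST data only if the recorded mtime still matches."""
    try:
        entry = json.loads(cache_file.read_text())
    except (FileNotFoundError, json.JSONDecodeError):
        return None  # missing or corrupt entry: treat as a cache miss
    if entry.get("mtime_ns") != expected_mtime_ns:
        return None  # source changed since caching; entry is stale
    return entry["ast"]
```

The trade-off is exactly the one the reviewer notes: JSON is safe against untrusted cache files but requires extra serialization code for complex AST objects, whereas pickle round-trips them for free at the cost of trusting the cache directory.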

Comment thread vibeprolog/interpreter.py Outdated
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@nlothian nlothian merged commit e7a51a8 into main Dec 13, 2025
2 checks passed
@nlothian nlothian deleted the codex/implement-serialization-for-parsed-clauses branch December 13, 2025 07:06
