[FEATURE]: create python binding #128

Merged
haochengxia merged 10 commits into 1a1a11a:develop from haochengxia:pybind on Jul 13, 2025

@haochengxia
Collaborator

@haochengxia haochengxia commented Feb 22, 2025

Roadmap in #123

Example

import libcachesim as cachesim

# Create a cache with FIFO eviction policy
cache = cachesim.FIFO(cache_size=1024*1024)

# Create a request
req = cachesim.Request()
req.obj_id = 1
req.obj_size = 100

# Check if object is in cache
hit = cache.get(req)
print(f"Cache hit: {hit}")

@haochengxia haochengxia marked this pull request as draft February 23, 2025 18:27
@haochengxia haochengxia changed the title [WIP] a prototype for python binding [WIP] a prototype for Python binding Feb 23, 2025
@haochengxia haochengxia changed the title [WIP] a prototype for Python binding [WIP] Python binding Feb 24, 2025
@haochengxia haochengxia force-pushed the pybind branch 4 times, most recently from ebbd4a6 to fea7e86 Compare June 20, 2025 08:48
@haochengxia
Collaborator Author

haochengxia commented Jun 20, 2025

Our project now includes initial Python bindings. Regarding clang-tidy analysis, I've temporarily skipped its checks for the binding-related code. The reason is that pybind11, a required dependency for these bindings, is only configured in the Python package's CMakeLists.txt and not recognized by the main CMake build system, leading to clang-tidy errors.

Therefore, I am considering different methods to support the full check:

  1. Add find_package(pybind11) to the main CMakeLists.txt.
  2. Use separate check logic for the binding code.

@haochengxia haochengxia marked this pull request as ready for review June 20, 2025 08:56
@haochengxia haochengxia requested a review from 1a1a11a June 20, 2025 08:56
Repository owner deleted a comment from sonarqubecloud bot Jun 20, 2025
@haochengxia haochengxia requested a review from Copilot June 20, 2025 09:26

@haochengxia haochengxia changed the title [WIP] Python binding Python binding Jun 20, 2025
@haochengxia haochengxia requested a review from Copilot June 21, 2025 15:00
@haochengxia haochengxia self-assigned this Jun 21, 2025

@1a1a11a
Owner

1a1a11a commented Jun 22, 2025

> Our project now includes initial Python bindings. Regarding clang-tidy analysis, I've temporarily skipped its checks for the binding-related code. The reason is that pybind11, a required dependency for these bindings, is only configured in the Python package's CMakeLists.txt and not recognized by the main CMake build system, leading to clang-tidy errors.
>
> Therefore, I am considering different methods to support the full check:
>
>   1. Add find_package(pybind11) to the main CMakeLists.txt.
>   2. Use separate check logic for the binding code.

Since we have disabled clang-tidy, do we still need this?

@haochengxia
Collaborator Author

> Since we have disabled clang-tidy, do we still need this?

Removing it

@1a1a11a
Owner

1a1a11a commented Jun 22, 2025

Awesome work! One concern: if we only support running one request at a time, it will be very slow. Shall we support querying the trace length (i.e., #requests and #seconds) and replaying the full trace, a fraction of the trace, N requests, or N seconds of requests?

@haochengxia
Collaborator Author

> Awesome work! One concern: if we only support running one request at a time, it will be very slow. Shall we support querying the trace length (i.e., #requests and #seconds) and replaying the full trace, a fraction of the trace, N requests, or N seconds of requests?

Good question! I think we can support it. The idea is that we do not move the data to the Python side but consume it directly on the C side.

Per-request replay (each request crosses into Python and back):

graph LR
    C --Req--> Python --Req--> C;

Trace replay consumed directly on the C side:

graph LR
    C --Req--> C;

Therefore, we need a structure on both the C and Python sides to represent a range of requests. Currently, only the working set size is exposed on the Python side. We can add other methods to Reader for trace info and statistics. In addition, let me add a process_trace() method to the Cache object.

The interface looks like this:

      .def("process_trace", [](cache_t& self, reader_t& reader, 
                              py::object max_requests, py::object max_seconds, 
                              py::object start_time, py::object end_time) {
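
To make the intent of these four bounds concrete, here is a minimal pure-Python sketch. It is not the binding's implementation: TinyFIFO, the (timestamp, obj_id, obj_size) tuple layout, and the exact way the limits are applied are assumptions for illustration only, and the trace is assumed to be sorted by timestamp.

```python
from collections import OrderedDict
from typing import Iterable, Optional, Tuple

# Hypothetical request layout: (timestamp_seconds, obj_id, obj_size)
Request = Tuple[int, int, int]


class TinyFIFO:
    """Hypothetical byte-capacity FIFO cache, a stand-in for a real cache object."""

    def __init__(self, cache_size: int) -> None:
        self.cache_size = cache_size
        self.used = 0
        self.objs: "OrderedDict[int, int]" = OrderedDict()  # obj_id -> obj_size

    def get(self, obj_id: int, obj_size: int) -> bool:
        """Return True on a hit; on a miss, admit the object and return False."""
        if obj_id in self.objs:
            return True
        if obj_size > self.cache_size:
            return False  # can never fit: count as a miss, do not admit
        while self.used + obj_size > self.cache_size:
            _, evicted_size = self.objs.popitem(last=False)  # evict oldest
            self.used -= evicted_size
        self.objs[obj_id] = obj_size
        self.used += obj_size
        return False


def process_trace(
    cache: TinyFIFO,
    trace: Iterable[Request],
    max_requests: Optional[int] = None,
    max_seconds: Optional[int] = None,
    start_time: Optional[int] = None,
    end_time: Optional[int] = None,
) -> Tuple[int, float]:
    """Replay a bounded slice of the trace; return (n_requests, miss_ratio)."""
    n_req = n_miss = 0
    first_ts = None
    for ts, obj_id, obj_size in trace:
        if start_time is not None and ts < start_time:
            continue  # skip requests before the window
        if end_time is not None and ts > end_time:
            break  # relies on the trace being sorted by timestamp
        if first_ts is None:
            first_ts = ts
        if max_seconds is not None and ts - first_ts >= max_seconds:
            break
        if max_requests is not None and n_req >= max_requests:
            break
        n_req += 1
        if not cache.get(obj_id, obj_size):
            n_miss += 1
    return n_req, (n_miss / n_req if n_req else 0.0)
```

With a signature like this, the whole loop runs in one call, so the caller pays the Python-to-native crossing once per replay instead of once per request.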

In the future, we can add seamless data communication through zero-copy data movement, with each side interpreting the data directly.

graph LR
    C --Reqs--> Python;
    Python --Reqs--> C;
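
As a rough illustration of what "interpreting the data directly" could look like on the Python side, the sketch below overlays a ctypes structure array on an existing buffer without copying it. The record layout (PackedRequest) and the helper names are hypothetical and do not correspond to libCacheSim's request_t or to anything in this PR.

```python
import ctypes
import struct
from typing import Iterable, Tuple


class PackedRequest(ctypes.LittleEndianStructure):
    # Hypothetical fixed-width record (16 bytes, naturally aligned).
    # This is NOT libCacheSim's request_t layout.
    _fields_ = [
        ("obj_id", ctypes.c_uint64),
        ("clock_time", ctypes.c_uint32),
        ("obj_size", ctypes.c_uint32),
    ]


def pack_requests(rows: Iterable[Tuple[int, int, int]]) -> bytearray:
    """Serialize (obj_id, clock_time, obj_size) rows into one contiguous buffer."""
    buf = bytearray()
    for obj_id, clock_time, obj_size in rows:
        buf += struct.pack("<QII", obj_id, clock_time, obj_size)
    return buf


def view_requests(buf: bytearray):
    """Interpret a writable buffer as an array of PackedRequest without copying."""
    n = len(buf) // ctypes.sizeof(PackedRequest)
    # from_buffer shares memory with buf, so no per-request objects are copied.
    return (PackedRequest * n).from_buffer(buf)
```

Because the array is a view, a change written into the underlying buffer is visible through the structure fields, which is the property a batch interface between the two sides would rely on.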

@haochengxia haochengxia force-pushed the pybind branch 4 times, most recently from 0ab5446 to fffa5db Compare June 25, 2025 21:13

@haochengxia haochengxia changed the title Python binding [FEAT]: create python binding Jul 11, 2025
@haochengxia haochengxia changed the title [FEAT]: create python binding [FEATURE]: create python binding Jul 11, 2025
@haochengxia haochengxia requested a review from Copilot July 11, 2025 03:06

@haochengxia haochengxia requested a review from Copilot July 11, 2025 03:16

@haochengxia haochengxia requested a review from Copilot July 11, 2025 03:27
Contributor

Copilot AI left a comment

Pull Request Overview

This PR adds Python bindings for libCacheSim, including C++ wrapper code, Python package modules, build scripts, and comprehensive tests and examples.

  • Introduce pylibcachesim.cpp to expose core simulator APIs via Pybind11
  • Add Python package structure under libCacheSim-python with eviction, const, and __init__ modules
  • Provide build helper scripts (sync_python_version.py, install_python.sh), CMake integration, and CI workflow
  • Include extensive tests in libCacheSim-python/tests and usage examples

Reviewed Changes

Copilot reviewed 25 out of 27 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| scripts/sync_python_version.py | Script to sync Python package version |
| scripts/install_python.sh | Automated Python binding build/install script |
| libCacheSim-python/src/pylibcachesim.cpp | Pybind11 C++ wrapper for libCacheSim API |
| libCacheSim-python/pyproject.toml | Python package metadata and build config |
| libCacheSim-python/CMakeLists.txt | CMake setup for building Python extension |
| CMakeLists.txt (root) | Export CMake vars for Python binding |
| .github/workflows/python.yml | GitHub Actions for Python build and tests |
| libCacheSim-python/libcachesim/const.py | Define Python-side TraceType enum |
| libCacheSim-python/libcachesim/eviction.py | High-level Python eviction policy registry |
| libCacheSim-python/libcachesim/__init__.py | Package initialization exports |
| libCacheSim-python/libcachesim/__init__.pyi | Stub declarations for typing |
| libCacheSim-python/tests/utils.py | Test utility for reference data lookup |
| libCacheSim-python/tests/test_unified_interface.py | Tests unified interface across policies |
| libCacheSim-python/tests/test_python_hook_cache.py | Tests PythonHookCachePolicy implementation |
| libCacheSim-python/tests/test_process_trace.py | Tests process_trace for native and hook caches |
| libCacheSim-python/tests/test_eviction.py | Parametrized eviction policy correctness tests |
| libCacheSim-python/tests/reference.csv | Reference miss-ratio data |
| libCacheSim-python/tests/pytest.ini | Pytest configuration |
| libCacheSim-python/tests/conftest.py | Pytest fixtures |
| libCacheSim-python/examples/python_hook_cache_example.py | Example of custom hook caches |
| libCacheSim-python/examples/demo_unified_interface.py | Unified interface demo script |
| libCacheSim-python/export/README.md | Documentation for CMake export mechanism |
| libCacheSim-python/export/CMakeLists.txt | Export selected project variables to Python |
| libCacheSim-python/.gitignore | Ignore rules for Python build artifacts |

@haochengxia haochengxia requested a review from 1a1a11a July 11, 2025 04:12
@haochengxia
Collaborator Author

Uploaded to PyPI

pip install libcachesim

@1a1a11a
Owner

1a1a11a commented Jul 13, 2025

We may also want to promote the Python package in the front-page README, but this can be done in a separate PR.

@1a1a11a
Owner

1a1a11a commented Jul 13, 2025

Shall I merge it?

@haochengxia
Collaborator Author

Yes, it can be merged now.

@haochengxia haochengxia merged commit 3b29ba6 into 1a1a11a:develop Jul 13, 2025
6 checks passed
@1a1a11a
Owner

1a1a11a commented Jul 13, 2025

Congratulations on the exciting feature!
