Skip to content

Conversation

@timosachsenberg
Copy link
Collaborator

@timosachsenberg timosachsenberg commented Nov 29, 2025

  • Add tests for hash-based lookups in unordered_map (find, count, at)
  • Add tests for membership checks in unordered_set (count, find)
  • Add lookup tests for wrapped class containers (map, unordered_map, unordered_set)
  • Fix bug where loop variables 'key'/'value' could shadow function parameters
    of the same name in StdMapConverter and StdUnorderedMapConverter

The new tests verify that dict/set operations actually use hashing for O(1)
lookups rather than just iterating through all items.

Summary by CodeRabbit

  • New Features

    • Added C++17 STL container support for unordered_map, unordered_set, deque, list, optional, and string_view with Python bindings.
  • Tests

    • Expanded test coverage for C++17 STL containers and wrapped container scenarios including lookups and membership checks.
  • Refactor

    • Improved internal conversion logic for consistent handling of container iterations and element access.

✏️ Tip: You can customize this high-level summary in your review settings.

…collision bug

- Add tests for hash-based lookups in unordered_map (find, count, at)
- Add tests for membership checks in unordered_set (count, find)
- Add lookup tests for wrapped class containers (map, unordered_map, unordered_set)
- Fix bug where loop variables 'key'/'value' could shadow function parameters
  of the same name in StdMapConverter and StdUnorderedMapConverter

The new tests verify that dict/set operations actually use hashing for O(1)
lookups rather than just iterating through all items.
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Nov 29, 2025

Walkthrough

This PR refactors STL container iteration variable naming in the code generator's conversion logic to use mangled names (loop_key/loop_value), and introduces comprehensive C++17 STL container support. It replaces the new_stl_test framework with cpp17_stl_test, adds extended test infrastructure for wrapped containers, and includes new runtime validation tests covering unordered_map, unordered_set, deque, list, optional, and string_view bindings.

Changes

Cohort / File(s) Summary
Core conversion logic
autowrap/ConversionProvider.py
Updates map/set iteration to use mangled loop variable names (loop_key/loop_value) instead of direct argument names, preventing parameter-name collisions across nested conversions and all affected code paths.
Test infrastructure removal
tests/test_code_generator_new_stl.py, tests/test_files/new_stl_test.pxd
Removes old STL test scaffolding, including code-generation and runtime test suites for the previous STL container implementation.
C++17 STL test support (new)
tests/test_files/cpp17_stl_test.hpp, tests/test_files/cpp17_stl_test.pxd, tests/test_files/cpp17_stl_test.pyx, tests/test_code_generator_cpp17_stl.py
Introduces comprehensive C++17 STL container bindings and validation suite covering unordered_map, unordered_set, deque, list, optional, and string_view with code-generation verification and extensive runtime tests.
Wrapped container test helpers
tests/test_files/new_stl_test.hpp, tests/test_files/wrapped_container_test.hpp, tests/test_files/wrapped_container_test.pxd, tests/test_wrapped_containers.py
Adds lookup, membership, and value-retrieval methods for map/unordered_map/unordered_set operations on wrapped types; updates hash specialization for Item; expands test coverage with new assertions for hash semantics and edge cases.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • autowrap/ConversionProvider.py: Review the critical loop variable mangling logic across all code paths (enum types, pointer types, vector/map/set conversions) to ensure consistent and correct variable substitution.
  • Cython files (cpp17_stl_test.pyx): Verify C++/Python type conversions, container marshalling, in-place mutations, and correctness of string_view and optional handling.
  • Test validation: Ensure comprehensive coverage of edge cases (missing keys, None values, hash collisions) across all container types in the new test suite.

Possibly related PRs

  • Add C++17 STL container converters #202: Adds initial STL container support infrastructure; this PR refines and extends those conversion paths with corrected loop variable naming and comprehensive C++17 STL bindings.

Suggested reviewers

  • jpfeuffer
  • poshul

Poem

🐰 Loop variables dance in mangled names so fine,
Containers wrap in C++, from map to optional's shrine,
Unordered sets and deques now play,
String views peek without delay,
Autowrap hops forward—another milestone divine! 🌟

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed Title accurately describes the main changes: adding lookup/membership tests for STL containers and fixing a variable name collision bug in the conversion logic.
Docstring Coverage ✅ Passed Docstring coverage is 81.48% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch claude/add-stl-container-tests-01SghEieKPbunGiJjFeExriA

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
tests/test_files/new_stl_test.pyx (1)

346-372: Minor docstring/assertion inconsistency in string_view methods.

The docstrings declare parameter type as bytes, but the assertions check for (bytes, str). While functionally harmless (the assertion is more permissive), this creates a documentation inconsistency.

Either update the docstring to match the assertion:

-        getStringViewLength(self, sv: bytes ) -> int
+        getStringViewLength(self, sv: Union[bytes, str] ) -> int

Or update the assertion to match the docstring:

-        assert isinstance(sv, (bytes, str)), 'arg sv wrong type'
+        assert isinstance(sv, bytes), 'arg sv wrong type'
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6535794 and 7cd8106.

📒 Files selected for processing (8)
  • autowrap/ConversionProvider.py (3 hunks)
  • tests/test_code_generator_new_stl.py (3 hunks)
  • tests/test_files/new_stl_test.hpp (2 hunks)
  • tests/test_files/new_stl_test.pxd (1 hunks)
  • tests/test_files/new_stl_test.pyx (6 hunks)
  • tests/test_files/wrapped_container_test.hpp (3 hunks)
  • tests/test_files/wrapped_container_test.pxd (3 hunks)
  • tests/test_wrapped_containers.py (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
tests/test_wrapped_containers.py (2)
tests/test_files/new_stl_test.hpp (9)
  • m (23-29)
  • m (31-37)
  • m (31-31)
  • m (40-46)
  • m (40-40)
  • m (49-51)
  • m (49-49)
  • m (54-56)
  • m (54-54)
tests/test_files/wrapped_container_test.hpp (27)
  • m (102-108)
  • m (102-102)
  • m (119-125)
  • m (119-119)
  • m (128-130)
  • m (128-128)
  • m (136-142)
  • Item (21-21)
  • Item (22-22)
  • Item (23-23)
  • Item (24-24)
  • items (58-64)
  • items (58-58)
  • items (74-76)
  • items (74-74)
  • items (82-88)
  • items (82-82)
  • items (300-306)
  • items (300-300)
  • items (320-326)
  • items (320-320)
  • items (340-346)
  • items (340-340)
  • items (357-359)
  • items (357-357)
  • items (362-368)
  • items (362-362)
autowrap/ConversionProvider.py (1)
autowrap/Code.py (1)
  • add (56-75)
tests/test_files/new_stl_test.hpp (1)
tests/test_files/wrapped_container_test.hpp (16)
  • m (102-108)
  • m (102-102)
  • m (119-125)
  • m (119-119)
  • m (128-130)
  • m (128-128)
  • m (136-142)
  • m (136-136)
  • m (176-185)
  • m (176-176)
  • m (191-197)
  • m (191-191)
  • m (211-217)
  • m (211-211)
  • m (231-237)
  • m (231-231)
🪛 Ruff (0.14.6)
tests/test_code_generator_new_stl.py

126-126: Do not assert False (python -O removes these calls), raise AssertionError()

Replace assert False

(B011)


127-128: try-except-pass detected, consider logging the exception

(S110)


127-127: Do not catch blind exception: Exception

(BLE001)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: test (==3.2.0, 3.13)
  • GitHub Check: test (==3.2.0, 3.12)
  • GitHub Check: test (==3.1.0, 3.10)
  • GitHub Check: test (==3.2.0, 3.11)
  • GitHub Check: test (==3.1.0, 3.13)
  • GitHub Check: test (==3.2.0, 3.10)
  • GitHub Check: test (==3.1.0, 3.11)
  • GitHub Check: test (==3.1.0, 3.12)
🔇 Additional comments (21)
autowrap/ConversionProvider.py (3)

860-863: Excellent bug fix for variable name collision.

The introduction of mangled loop variable names (loop_key and loop_value) properly addresses the issue where loop variables could shadow function parameters with the same names. Using mangle() with the argument_var ensures unique variable names in the generated Cython code.


865-972: Consistent application of mangled variables throughout the method.

The mangled loop_key and loop_value variables are correctly used in all conversion expressions and the loop iteration (line 972). This ensures the fix is complete and prevents collisions across all code paths in the StdMapConverter.


2237-2260: Bug fix correctly applied to StdUnorderedMapConverter.

The same mangled variable approach is consistently implemented here, mirroring the fix in StdMapConverter. This ensures both map-based converters are protected from parameter name collision issues.

tests/test_files/new_stl_test.hpp (2)

39-56: LGTM! Clean implementation of unordered_map accessor methods.

The three accessor methods follow established patterns: find() for safe lookup with sentinel return, count() for existence check, and at() for throwing accessor. This aligns well with the patterns in wrapped_container_test.hpp (lookupMapIntToItem, hasKeyMapIntToItem).


74-91: LGTM! Consistent unordered_set accessor implementations.

The set methods mirror the map patterns appropriately. While findUnorderedSet returning the found value seems redundant (you already know what you're searching for), it effectively validates that the hash-based find() operation works correctly through the Python bindings.

tests/test_files/new_stl_test.pxd (1)

37-46: LGTM! Correct Cython declarations with proper exception handling.

The except + annotation on getValueUnorderedMap is correctly applied since std::unordered_map::at() throws std::out_of_range for missing keys. The other methods use safe patterns (returning sentinel values) and don't require exception handling.

tests/test_files/wrapped_container_test.hpp (3)

118-130: LGTM! Proper map lookup implementations.

The find()-based lookup and count()-based existence check follow the established patterns. Using const reference for the map parameter is correct.


247-259: LGTM! Consistent unordered_map implementations.

These mirror the std::map versions exactly, which is appropriate since both containers share the same lookup interface.


356-368: LGTM! Clean unordered_set membership and find implementations.

The hash-based membership test and find operation correctly use the Item's hash function defined earlier in the file.

tests/test_code_generator_new_stl.py (2)

47-62: LGTM! Good coverage of generated method assertions.

The assertions verify that all new lookup and membership methods are correctly generated in the output.


138-152: LGTM! Comprehensive unordered_set membership and find tests.

Good coverage of both present and missing values, verifying the hash-based O(1) operations work correctly through the bindings.

tests/test_wrapped_containers.py (3)

147-178: LGTM! Thorough map lookup and key existence tests.

Good coverage testing both existing keys (1, 5, 10) and missing keys (999, 3). The tests properly verify the -1 sentinel return for missing keys.


341-372: LGTM! Consistent unordered_map tests matching the map tests.

The docstrings correctly note "hash-based O(1) lookup", which aligns with the PR objective of verifying that dict/set operations use hashing rather than iteration.


503-541: LGTM! Well-designed unordered_set membership and find tests.

Creating new Item instances (search_item1, search_item2) for lookup rather than reusing the original references is a good pattern - it verifies that the hash function and equality operator work correctly for Items with the same value.

tests/test_files/wrapped_container_test.pxd (3)

54-55: LGTM! Correct Cython declarations for map lookups.


94-95: LGTM! Correct Cython declarations for unordered_map lookups.


125-126: LGTM! Correct Cython declarations for unordered_set operations.

tests/test_files/new_stl_test.pyx (4)

26-26: LGTM! Added shared_ptr import for proper memory management.


67-68: Good fix: Loop variables renamed to prevent parameter shadowing.

The change from generic key/value loop variables to _loop_key_m/_loop_value_m prevents the bug where function parameters named key or value would be shadowed by the loop variables. This aligns with the PR objective of fixing the variable name collision bug in StdMapConverter and StdUnorderedMapConverter.


81-145: LGTM! Correctly generated unordered_map wrapper methods.

The three new methods (lookupUnorderedMap, hasKeyUnorderedMap, getValueUnorderedMap) follow the established pattern for container conversion and sync-back. The _loop_key_m/_loop_value_m naming is consistently applied.


179-243: LGTM! Correctly generated unordered_set wrapper methods.

The set methods properly handle the conversion to C++ libcpp_unordered_set and sync modifications back to the Python set.

- Update Item hash function to combine both value_ and name_ members
- Add test verifying that items with same value_ but different name_
  are treated as distinct (tests proper multi-field hashing)
key_conv_cleanup = ""

# Use mangled variable names to avoid collision with function parameters
loop_key = mangle("_loop_key_" + argument_var)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jpfeuffer is this extra mangling safer? needed?

return py_result

def getStringViewLength(self, sv ):
def getStringViewLength(self, bytes sv ):
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jpfeufer no clue here. are both needed or now only one sufficient?

- Rename test files: new_stl_test.* -> cpp17_stl_test.*
- Rename test module: test_code_generator_new_stl.py -> test_code_generator_cpp17_stl.py
- Rename class: NewSTLTest -> Cpp17STLTest
- Add user-centric documentation with container mappings and usage examples
- Document Python types for each C++ container (dict, set, list, T|None, bytes)
- Include example syntax for passing/receiving containers
- Document compilation requirements (C++17)
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (6)
tests/test_code_generator_cpp17_stl.py (4)

143-143: Remove extraneous f-prefix from string literal.

This f-string has no placeholders.

-        f"lookupUnorderedMap('grape') should return -1 for missing key"
+        "lookupUnorderedMap('grape') should return -1 for missing key"

157-161: Improve exception handling: use pytest.raises or catch specific exception.

Using bare except Exception with pass obscures which exception is expected and silences unintended errors. Additionally, assert False is removed by python -O.

-    # Verify at() throws exception for missing key
-    try:
-        obj.getValueUnorderedMap(test_map, b"missing")
-        assert False, "getValueUnorderedMap should raise exception for missing key"
-    except Exception:
-        pass  # Expected - std::out_of_range from at()
+    # Verify at() throws exception for missing key
+    with pytest.raises(Exception):  # std::out_of_range from at()
+        obj.getValueUnorderedMap(test_map, b"missing")

196-196: Remove extraneous f-prefix from string literal.

This f-string has no placeholders.

-        f"findUnorderedSet(999) should return -1 for missing element"
+        "findUnorderedSet(999) should return -1 for missing element"

238-240: Unused loop variable i in assertion.

The loop variable i is declared but never used in the loop body. Consider using _ or incorporating i into the error message.

-    for i, (r, e) in enumerate(zip(list_data, expected_doubled)):
-        assert abs(r - e) < 0.0001, \
-            f"doubleListElements should modify list in place: {list_data}"
+    for idx, (r, e) in enumerate(zip(list_data, expected_doubled)):
+        assert abs(r - e) < 0.0001, \
+            f"doubleListElements mismatch at index {idx}: {list_data}"
tests/test_files/cpp17_stl_test.pyx (2)

346-358: Docstring mismatch with assertion for getStringViewLength.

The docstring on line 348 says sv: bytes, but the assertion on line 350 accepts both bytes and str. This is intentional (allowing str with UTF-8 encoding), but the docstring should reflect the actual accepted types.

     def getStringViewLength(self, bytes sv ):
         """
-        getStringViewLength(self, sv: bytes ) -> int
+        getStringViewLength(self, sv: Union[bytes, str] ) -> int
         """

360-372: Same docstring inconsistency for stringViewToString.

The docstring says bytes but the implementation accepts both bytes and str.

     def stringViewToString(self, bytes sv ):
         """
-        stringViewToString(self, sv: bytes ) -> bytes
+        stringViewToString(self, sv: Union[bytes, str] ) -> bytes
         """
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3094352 and a72596f.

📒 Files selected for processing (6)
  • tests/test_code_generator_cpp17_stl.py (1 hunks)
  • tests/test_code_generator_new_stl.py (0 hunks)
  • tests/test_files/cpp17_stl_test.hpp (1 hunks)
  • tests/test_files/cpp17_stl_test.pxd (1 hunks)
  • tests/test_files/cpp17_stl_test.pyx (7 hunks)
  • tests/test_files/new_stl_test.pxd (0 hunks)
💤 Files with no reviewable changes (2)
  • tests/test_code_generator_new_stl.py
  • tests/test_files/new_stl_test.pxd
🧰 Additional context used
🪛 Clang (14.0.6)
tests/test_files/cpp17_stl_test.hpp

[error] 19-19: 'unordered_map' file not found

(clang-diagnostic-error)

🪛 Ruff (0.14.6)
tests/test_code_generator_cpp17_stl.py

143-143: f-string without any placeholders

Remove extraneous f prefix

(F541)


159-159: Do not assert False (python -O removes these calls), raise AssertionError()

Replace assert False

(B011)


160-161: try-except-pass detected, consider logging the exception

(S110)


160-160: Do not catch blind exception: Exception

(BLE001)


196-196: f-string without any placeholders

Remove extraneous f prefix

(F541)


238-238: Loop control variable i not used within loop body

(B007)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: test (==3.2.0, 3.10)
  • GitHub Check: test (==3.2.0, 3.13)
  • GitHub Check: test (==3.2.0, 3.11)
  • GitHub Check: test (==3.1.0, 3.10)
  • GitHub Check: test (==3.2.0, 3.12)
  • GitHub Check: test (==3.1.0, 3.11)
  • GitHub Check: test (==3.1.0, 3.12)
🔇 Additional comments (17)
tests/test_code_generator_cpp17_stl.py (4)

1-44: LGTM! Comprehensive module docstring and proper setup.

The docstring clearly documents container mappings, usage examples, and compilation requirements. Imports are appropriate for the test context.


68-104: Good coverage of code generation verification.

The assertions verify all expected imports and method signatures are present in the generated code. This provides good regression protection for the code generator.


119-276: Comprehensive runtime test coverage for all C++17 STL containers.

The test thoroughly exercises:

  • unordered_map: return, iteration, find, count, at
  • unordered_set: return, iteration, membership, count, find
  • deque: return, iteration, mutable reference
  • list: return, iteration, mutable reference
  • optional: value/None handling
  • string_view: length and round-trip

Edge cases like missing keys/elements are properly covered.


212-216: Mutable reference behavior is correctly implemented.

The binding properly handles in-place modification through the standard Cython pattern:

  1. Creates a C++ deque from the Python list
  2. Calls the C++ method to modify it
  3. Copies the modified deque back to the original Python list using slice assignment (d[:] = [...] on line 275 of the .pyx file)

The test at lines 212-216 will pass as expected because the binding correctly reflects modifications back to the Python list.

tests/test_files/cpp17_stl_test.pyx (3)

67-68: Good fix: Loop variables renamed to avoid shadowing function parameters.

The renaming from key/value to _loop_key_m/_loop_value_m prevents variable name collisions when the function has parameters named key or value. This addresses the bug fix mentioned in the PR objectives.


81-101: New lookup method correctly implements hash-based key lookup.

The lookupUnorderedMap method properly:

  1. Validates input types
  2. Converts Python dict to C++ unordered_map
  3. Calls the C++ lookup with the key parameter
  4. Copies any modifications back to the Python dict
  5. Returns the result

The loop variable naming (_loop_key_m, _loop_value_m) correctly avoids shadowing the key parameter.


179-199: Implementation pattern is consistent across new unordered_set methods.

The hasValueUnorderedSet method follows the same pattern as the map methods, with proper type validation, C++ container conversion, method call, and write-back. The loop variable naming avoids shadowing the value parameter.

tests/test_files/cpp17_stl_test.pxd (3)

1-46: Excellent documentation in the PXD file.

The header comments clearly document container type mappings, required imports, and compilation requirements. This serves as good reference documentation for users of autowrap.


65-65: Correct use of except + for exception-throwing method.

The getValueUnorderedMap method correctly declares except + to propagate the C++ std::out_of_range exception from at() to Python. This enables the test's exception checking to work correctly.


47-127: Complete and consistent method declarations.

All methods in the C++ header are properly declared with matching signatures. The use of references (&) for container parameters is correct for the mutable reference semantics.

tests/test_files/cpp17_stl_test.hpp (7)

19-27: Standard C++ includes are correct.

The clang error about 'unordered_map' file not found is a false positive from the static analysis environment, which likely lacks proper C++ standard library paths or C++17 support configured. The includes are standard C++17 headers.


28-30: Class structure is clean and appropriate for testing.

The default constructor is sufficient for this test class that has no member state.


56-74: Hash-based lookup methods correctly demonstrate O(1) operations.

The lookupUnorderedMap, hasKeyUnorderedMap, and getValueUnorderedMap methods properly use find(), count(), and at() respectively—all O(1) average-time operations for hash-based containers. This aligns with the PR's goal of testing hash-based lookups.


106-114: findUnorderedSet returns the searched value itself when found.

This is an unusual pattern—typically find either returns an iterator or a boolean. Returning the value itself (which the caller already has) isn't particularly useful. Consider whether this is the intended behavior or if it should return a different indicator.

The current implementation returns value when found, which the caller already knows. This seems designed purely to test that find() works, which is valid for a test case, but the API design is unconventional.


136-142: Mutable reference correctly allows in-place modification.

The doubleDequeElements method takes a non-const reference, enabling modifications to be reflected back to the caller. The Cython wrapper correctly copies the modified container back to the Python list.


177-189: Clean std::optional implementation using value_or.

The getOptionalValue and unwrapOptional methods cleanly demonstrate optional value handling. Using value_or(-1) is idiomatic C++17.


197-205: string_view handling is correct.

The methods properly accept std::string_view and demonstrate conversion to std::string. Note that the Python side passes bytes, and the Cython layer handles the conversion correctly.

@timosachsenberg
Copy link
Collaborator Author

ok I was able to add an example with hashing. I see some looping but I am not sure if / where it could be prevented.

@timosachsenberg
Copy link
Collaborator Author

ok I merge and try to build openms with it

@timosachsenberg timosachsenberg merged commit 18c6fd9 into master Nov 29, 2025
11 checks passed
@jpfeuffer
Copy link
Contributor

jpfeuffer commented Nov 29, 2025

This is not really what I meant. Now it implemented functions on the cython level to do lookups.

I just want to see that d[key] == value in python. With a wrapped class.
Before, that stupid thing was just doing
for k,v in d.items():
assert some kind of bullshit

And I have no idea why this name mangling should be needed. Was there an error somewhere?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants