@levi42x levi42x commented Nov 26, 2025

Fixes #4501

Description

This PR implements support for parsing Python UV package manager manifests in scancode-toolkit.

Changes

  • UvPyprojectTomlHandler: Parses pyproject.toml files with [tool.uv] sections
  • UvLockHandler: Parses uv.lock lockfiles with full dependency resolution
  • BaseUvPythonLayout: Manages UV project assembly and resource coordination
  • Helper functions:
      • is_uv_pyproject_toml() for UV project detection
      • parse_dependency_requirement() for PEP 508 dependency parsing

Tasks

  • Reviewed contribution guidelines
  • PR is descriptively titled 📑 and links the original issue above 🔗
  • Tests pass -- look for a green checkbox ✔️ a few minutes after opening your PR
    Run tests locally to check for errors.
  • Commits are in a uniquely named feature branch and have no merge conflicts 📁
  • Updated documentation pages (if applicable)
  • Updated CHANGELOG.rst (if applicable)

Signed-off-by: Shekhar Suman [email protected]

- Implement UvPyprojectTomlHandler to parse pyproject.toml with [tool.uv] sections
- Implement UvLockHandler to parse uv.lock lockfiles
- Add BaseUvPythonLayout for UV project assembly logic
- Add is_uv_pyproject_toml() helper for UV project detection
- Add parse_dependency_requirement() helper for dependency parsing
- Support standard dependencies, dev-dependencies, and optional groups
- Handle UV lockfile format with [[package]] entries and markers
- Add comprehensive test fixtures and test cases

Signed-off-by: Shekhar Suman [email protected]
Signed-off-by: Shekhar <[email protected]>
@levi42x levi42x force-pushed the feature/parse-uv_package branch from 64d01f0 to 900452d on November 27, 2025 03:45
Member

@AyanSinhaMahapatra AyanSinhaMahapatra left a comment

Thanks++ @levi42x for the PR, much needed update! ❤️

This is a great start and a step in the right direction, but needs some more updates and love. See my comments for your consideration.

yield from yield_dependencies_from_package_resource(resource)
return

if codebase.has_single_resource:
Member

This should be at the beginning of the function
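
A minimal sketch of the reordering being asked for, using stand-in objects. FakeCodebase and assemble_sketch are hypothetical names for this example; only has_single_resource comes from the diff above. The point is that the single-resource check runs before any other work.

```python
from dataclasses import dataclass, field


@dataclass
class FakeCodebase:
    """Stand-in for a scan codebase; models only the attribute used here."""
    resources: list = field(default_factory=list)

    @property
    def has_single_resource(self):
        return len(self.resources) == 1


def assemble_sketch(resource, codebase):
    """Yield (path_taken, resource) pairs describing the assembly path."""
    # Guard clause first: a single-file scan has no sibling files to combine.
    if codebase.has_single_resource:
        yield ("single-file", resource)
        return

    # Only reached for multi-file codebases.
    for sibling in codebase.resources:
        yield ("project", sibling)


single = list(assemble_sketch("uv.lock", FakeCodebase(["uv.lock"])))
multi = list(assemble_sketch("uv.lock", FakeCodebase(["pyproject.toml", "uv.lock"])))
print(single)
print(multi)
```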

package = pypi.UvLockHandler.parse(test_file)
expected_loc = self.get_test_loc('pypi/uv/attrs-uv.lock-expected.json')
self.check_packages_data(package, expected_loc, regen=REGEN_TEST_FIXTURES)

Member

Can you also add an end-to-end test that runs a --package scan on a uv package manifest layout (see the existing tests in the test file for examples and add one similarly) and shows that package assembly works as expected? You've added an assemble function, but it is not tested otherwise.

@@ -0,0 +1,23 @@
[project]
name = "attrs-example"
Member

Please use real-world examples, and include the link to the source.

Member

If you include a large file, please shorten it to keep only the parts the parser test needs, as we want to keep file size and repo size as small as possible.

Member

Also find a file that has all, or almost all, of the values that could be present. For example, this fake example has no licenses, copyrights, or other usable package data fields.


extra_data = {}
extra_data['python_version'] = requires_python
extra_data['lock_version'] = version
Member

Suggested change
- extra_data['lock_version'] = version
+ extra_data['lock_version'] = lock_version

This could be confused with the actual package version. Let's be as descriptive as possible in variable names to improve readability.

Member

This is also likely wrong, as you're using the same variable for the package version.

extra_data['lock_version'] = version

package_data = dict(
datasource_id=cls.datasource_id,
Member

There are many more package data fields that can be parsed and added; please update the handler to include them. See

class PackageData(IdentifiablePackageData):
for details on all the fields, and refer to the other package data parsers too.

type=cls.default_package_type,
primary_language='Python',
extra_data=extra_data,
dependencies=dependencies,
Member

We also need to populate the URLs using the package URL fields; see

urls = get_pypi_urls(name, version)
for an example.
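
A hedged sketch of what populating these fields could look like. The pypi_urls_sketch and build_purl functions below are hypothetical stand-ins written for this example, not the real get_pypi_urls; the dictionary keys and URL patterns are assumptions, and the handler should call the existing helper instead.

```python
def normalize_pypi_name(name):
    """Lowercase and replace underscores with dashes, as purl expects for pypi names."""
    return name.lower().replace("_", "-")


def build_purl(name, version=None):
    """Build a pkg:pypi package URL string from a name and optional version."""
    purl = f"pkg:pypi/{normalize_pypi_name(name)}"
    return f"{purl}@{version}" if version else purl


def pypi_urls_sketch(name, version=None):
    """Hypothetical stand-in for get_pypi_urls(); keys and URL patterns are assumed."""
    if not name:
        return {}
    api_path = f"{name}/{version}/json" if version else f"{name}/json"
    return {
        "repository_homepage_url": f"https://pypi.org/project/{name}",
        "api_data_url": f"https://pypi.org/pypi/{api_path}",
    }


name, version = "attrs", "23.2.0"
package_data = dict(
    type="pypi",
    name=name,
    version=version,
    primary_language="Python",
    purl=build_purl(name, version),  # shown explicitly for illustration
    **pypi_urls_sketch(name, version),
)
print(package_data)
```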


def is_uv_pyproject_toml(location):
    with open(location, 'r') as file:
        data = file.read()
Member

Suggested change
-        data = file.read()
+        if "tool.uv" in file.read():
+            return True
+    return False
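
For reference, a condensed, runnable form of the suggested check, exercised against two temporary files. The file contents are made up for this example. Note that a plain substring test also matches sub-tables such as [tool.uv.sources], and would match the text inside a comment too.

```python
import os
import tempfile


def is_uv_pyproject_toml(location):
    """Return True if the file at `location` mentions a tool.uv table."""
    with open(location, "r") as file:
        # Plain substring check, equivalent to the if/return pair suggested above.
        return "tool.uv" in file.read()


with tempfile.TemporaryDirectory() as tmp:
    uv_path = os.path.join(tmp, "pyproject.toml")
    with open(uv_path, "w") as f:
        f.write('[project]\nname = "demo"\n\n[tool.uv]\ndev-dependencies = []\n')

    plain_path = os.path.join(tmp, "other.toml")
    with open(plain_path, "w") as f:
        f.write('[project]\nname = "demo"\n')

    uv_result = is_uv_pyproject_toml(uv_path)
    plain_result = is_uv_pyproject_toml(plain_path)

print(uv_result, plain_result)
```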

yield models.PackageData.from_data(package_data, package_only)


def parse_dependency_requirement(requirement, scope='dependencies', is_runtime=True):
Member

Could this be replaced by using get_requires_dependencies? The processing looks similar.

Or maybe abstract the shared code into helper functions that both of these functions use? We have at least some code duplication here.
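
One way to read the second option, as a sketch only: factor the requirement-string handling into one small helper that both callers build on. split_requirement, to_dependency, and the returned field names are hypothetical; the regex is deliberately simplified and is not a complete PEP 508 parser (it ignores URL requirements, for instance).

```python
import re

# Simplified pattern: name, optional [extras], optional specifier, optional ; marker.
_REQ_RE = re.compile(
    r"^\s*(?P<name>[A-Za-z0-9][A-Za-z0-9._-]*)"
    r"\s*(?:\[(?P<extras>[^\]]*)\])?"
    r"\s*(?P<spec>[^;]*?)"
    r"\s*(?:;\s*(?P<marker>.*?))?\s*$"
)


def split_requirement(requirement):
    """Split a requirement string into its parts, or return None if it does not match."""
    match = _REQ_RE.match(requirement)
    if not match:
        return None
    raw_extras = match.group("extras") or ""
    return {
        "name": match.group("name"),
        "extras": [e.strip() for e in raw_extras.split(",") if e.strip()],
        "specifier": match.group("spec").strip() or None,
        "marker": match.group("marker") or None,
    }


def to_dependency(requirement, scope="dependencies", is_runtime=True):
    """Hypothetical caller: wrap the shared parts with scope information."""
    parts = split_requirement(requirement)
    if parts is None:
        return None
    return {**parts, "scope": scope, "is_runtime": is_runtime}


parsed = split_requirement("requests[security]>=2.0,<3 ; python_version < '3.11'")
dev_dep = to_dependency("pytest>=7", scope="dev", is_runtime=False)
print(parsed)
print(dev_dep)
```

Any second caller (for example, one handling optional groups) would reuse split_requirement in the same way, so the string handling lives in a single place.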

Development

Successfully merging this pull request may close these issues.

Support parsing python UV manifests
