@Sameena-Thabassum

Problem Statement
The unstructured-api does not currently support the 'basic' chunking strategy, even though the official documentation lists it. According to the documentation, users should be able to choose between chunking strategies such as "by_title" and "basic". In practice, the chunking_strategy parameter only accepts 'by_title', so the documented behavior and the actual behavior of the API do not match.

Solution
The chunking_strategy parameter is updated to accept both 'by_title' and 'basic', as described in the official documentation.

The Literal type hint for chunking_strategy is updated to include both 'by_title' and 'basic' as valid options.

With this change, the API accepts 'basic' as a valid chunking strategy, and values outside the allowed set can still be flagged as invalid.
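
For reviewers, here is a minimal sketch of the idea, not the actual diff. The names `ChunkingStrategy` and `validate_chunking_strategy` are hypothetical and used only for illustration; the real parameter definition in unstructured-api may look different. Note that a `Literal` hint is not enforced by Python at runtime on its own, so the sketch adds an explicit check.

```python
from typing import Literal, Optional, get_args

# Hypothetical alias for illustration: the allowed values after this change.
# Before the change, only "by_title" would have been listed here.
ChunkingStrategy = Literal["by_title", "basic"]


def validate_chunking_strategy(value: Optional[str]) -> Optional[str]:
    """Return the value if it is an allowed strategy (or None), else raise."""
    if value is None:
        # No chunking requested.
        return None
    allowed = get_args(ChunkingStrategy)  # ("by_title", "basic")
    if value not in allowed:
        raise ValueError(
            f"Invalid chunking_strategy {value!r}; expected one of {allowed}"
        )
    return value
```

Under these assumptions, `validate_chunking_strategy("basic")` is accepted, while an unlisted value such as `"by_page"` raises `ValueError`.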

Why this PR should be accepted
This pull request brings the unstructured-api in line with its documented features by adding support for the 'basic' chunking strategy.

@Sameena-Thabassum changed the title from "Add support for 'basic' chunking strategy to match documentation" to "fix: Add support for 'basic' chunking strategy to match documentation" on Apr 22, 2025
@Sameena-Thabassum
Author

@dkarlovi Please review

@PastelStorm
Contributor

Needs a version bump and a changelog update. Other than that looks good to me!
