Text splitting - Document Splitting #14

@emileastih1

Description

We should revisit how we split our documents into chunks, and how each chunk is represented as an embedding vector.
This matters because the chunks need to be split in a way that makes sense. As a starting default, we could split documents into the paragraphs found in a PDF document.
Afterwards, we should work out which splitting mechanism is the most efficient and the most relevant to our use cases, so that we can apply a different one to each document type.
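
To make the paragraph-based default concrete, here is a minimal sketch of what it could look like. It assumes the PDF text has already been extracted to a plain string by some other step, and that paragraphs are separated by blank lines, which may not hold for every extractor. The function name `split_into_paragraphs` and the `min_chars` parameter are hypothetical, not existing project code.

```python
import re


def split_into_paragraphs(text: str, min_chars: int = 1) -> list[str]:
    """Split already-extracted plain text into paragraph chunks.

    Assumption: a paragraph is a run of lines separated from its
    neighbours by at least one blank line. PDF extractors differ,
    so this boundary rule may need adjusting per document type.
    """
    # Normalise line endings so the blank-line pattern below is reliable.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # One or more blank (whitespace-only) lines mark a paragraph boundary.
    raw_blocks = re.split(r"\n\s*\n", text)
    chunks = []
    for block in raw_blocks:
        # Collapse hard line wraps inside a paragraph into single spaces.
        paragraph = " ".join(block.split())
        # Drop empty or very short fragments (hypothetical threshold).
        if len(paragraph) >= min_chars:
            chunks.append(paragraph)
    return chunks
```

Each returned string would then be passed to whatever embedding step we use. Raising `min_chars` is one simple way to filter out stray fragments such as page numbers, though a per-document-type strategy would probably need more than that.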

Metadata

Assignees

No one assigned

    Labels

    enhancement (New feature or request), question (Further information is requested)

    Projects

    Status

    Backlog

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests