How to add papers to an index manually #784
Comments
To manually add papers to an
Example:

    await search_index.add_document(
        {
            "title": "Document Title",
            "year": "2023",
            "file_location": "/path/to/document",
            "body": "The full text of the document goes here.",
        },
        document=your_document_object,  # Optional
    )

This method handles the addition of the document to the index, including writing the document to storage if needed [1].

For creating a reusable index from ZoteroDB, you can iterate over items in the Zotero library using the

    from paperqa import Docs
    from paperqa.contrib import ZoteroDB

    docs = Docs()
    zotero = ZoteroDB(library_type="user")  # "group" if group library
    for item in zotero.iterate(limit=20):
        if item.num_pages > 30:
            continue  # skip long papers
        docs.add(item.pdf, docname=item.key)

This method handles metadata by using the Zotero library to fetch and manage document details [2].

To continue talking to Dosu, mention @dosu.
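The two snippets above can be combined into one loop: iterate over the papers, skip the long ones, and pass each remaining paper's metadata to `add_document`. Below is a minimal, self-contained sketch of that pattern. Note that `InMemoryIndex` and `Paper` are hypothetical stand-ins written for this illustration, not PaperQA classes; only the `add_document(index_doc, document=...)` call shape and the dictionary keys are taken from the example above. With PaperQA installed, the real index object and Zotero items would presumably take their place.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any


@dataclass
class InMemoryIndex:
    """Hypothetical stand-in for a search index; NOT the PaperQA class."""

    docs: dict[str, dict[str, Any]] = field(default_factory=dict)

    async def add_document(
        self, index_doc: dict[str, Any], document: Any = None
    ) -> None:
        # Key on file_location so re-adding the same paper overwrites it.
        self.docs[index_doc["file_location"]] = {**index_doc, "document": document}


@dataclass
class Paper:
    """Hypothetical record; a Zotero item would supply similar fields."""

    title: str
    year: str
    path: str
    body: str
    num_pages: int


async def build_index(papers: list[Paper], max_pages: int = 30) -> InMemoryIndex:
    index = InMemoryIndex()
    for paper in papers:
        if paper.num_pages > max_pages:
            continue  # skip long papers, as in the Zotero loop above
        await index.add_document(
            {
                "title": paper.title,
                "year": paper.year,
                "file_location": paper.path,
                "body": paper.body,
            }
        )
    return index


papers = [
    Paper("Short paper", "2023", "/tmp/a.pdf", "text a", num_pages=12),
    Paper("Long paper", "2021", "/tmp/b.pdf", "text b", num_pages=45),
]
index = asyncio.run(build_index(papers))
print(sorted(index.docs))
```

Keying the stand-in on `file_location` is a design choice of this sketch only; it makes repeated runs idempotent, which is convenient when an index is rebuilt incrementally.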
I've noticed, agent queries on
What are the fundamental algorithmic differences between them?
Best regards.
In the example you gave:
Best regards.
The fundamental algorithmic difference between querying with an
Regarding your second question:
Is there a function to get the full text body of a PDF document?
Yes, PaperQA provides a function called
In this function:
Should I use
The
Hi,
A code snippet is provided in PaperQA's documentation to create a reusable document index:
This way we can create an index by adding papers automatically, but it seems it's not possible to add a "citation" and "docname" for each paper, as can be done when creating a Doc() object. See the following:
So I wonder if there is a way to add papers manually to an empty IndexSearch() object, iteratively, as can be done with Doc() objects.
Also, the documentation provides a snippet to fill a Doc() object with papers from ZoteroDB:
Is there a way to create a reusable index from ZoteroDB?
Best regards.
@dosu.