Add LLM fallback when Grobid returns no entries #15467

Draft
faneeshh wants to merge 7 commits into JabRef:main from faneeshh:fix-12700
@faneeshh
Contributor

@faneeshh faneeshh commented Apr 1, 2026

Related issues and pull requests

Closes #12700

PR Description

When Grobid returns no entries for a PDF, such as the Kaerlein bibliography, the online import action should fall back to the existing LLM-based extraction instead of opening an empty library.

So far I've added a CliPreferences overload to CitationsFromPdf so the GUI layer can call it without depending on the concrete JabRefCliPreferences class. I still have to test these changes. I think the next step is handling the case where Grobid throws a connection exception rather than returning an empty list.
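
The intended control flow can be sketched roughly as below. This is a minimal illustration, not the code in this PR: `CitationExtractor`, `extractWithFallback`, and the string-based entries are hypothetical stand-ins for the Grobid and LLM parsers, and treating a connection failure like an empty result is an assumption about the planned next step.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.List;

final class FallbackSketch {

    /** Hypothetical stand-in for a citation extractor; this is not JabRef's API. */
    interface CitationExtractor {
        List<String> extract(String pdfPath);
    }

    /**
     * Tries the primary extractor first and uses the fallback when the
     * primary returns no entries or fails with an I/O problem.
     */
    static List<String> extractWithFallback(String pdfPath,
                                            CitationExtractor primary,
                                            CitationExtractor fallback) {
        List<String> entries;
        try {
            entries = primary.extract(pdfPath);
        } catch (UncheckedIOException e) {
            // Assumption: a connection failure is handled like "no entries".
            entries = List.of();
        }
        if (entries != null && !entries.isEmpty()) {
            return entries;
        }
        return fallback.extract(pdfPath);
    }

    public static void main(String[] args) {
        CitationExtractor emptyPrimary = path -> List.of();
        CitationExtractor failingPrimary = path -> {
            throw new UncheckedIOException(new IOException("connection refused"));
        };
        CitationExtractor fallback = path -> List.of("entry from fallback");

        System.out.println(extractWithFallback("sample.pdf", emptyPrimary, fallback));
        System.out.println(extractWithFallback("sample.pdf", failingPrimary, fallback));
    }
}
```

Passing both extractors as interface-typed parameters mirrors the goal of the overload: the caller does not need to know which concrete implementation it is given.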

Steps to test

Checklist

  • I own the copyright of the code submitted and I license it under the MIT license
  • [.] I manually tested my changes in running JabRef (always required)
  • [.] I added JUnit tests for changes (if applicable)
  • [.] I added screenshots in the PR description (if change is visible to the user)
  • [.] I added a screenshot in the PR description showing a library with a single entry with me as author and as title the issue number
  • [.] I described the change in CHANGELOG.md in a way that can be understood by the average user (if change is visible to the user)
  • [.] I checked the user documentation for up to dateness and submitted a pull request to our user documentation repository

@jabref-machine
Collaborator

Note that your PR will not be reviewed/accepted until you have gone through the mandatory checks in the description and marked each of them exactly in the format of - [x] (done), - [ ] (yet to be done) or - [/] (not applicable). Please adhere to our pull request template.

@github-actions github-actions bot added the status: changes-required Pull requests that are not yet complete label Apr 1, 2026
@testlens-app

testlens-app bot commented Apr 1, 2026

✅ All tests passed ✅

🏷️ Commit: 50b3ebc
▶️ Tests: 7268 executed
⚪️ Checks: 58/58 completed


@faneeshh
Contributor Author

faneeshh commented Apr 4, 2026

During testing I'm getting an OverlappingFileLockException on the .mv AI storage files when the LLM fallback runs. It looks like CitationsFromPdf.extractCitationsUsingLLM creates a new AiService internally, which conflicts with the one the GUI already has open. I think the fix would be to thread the existing AiService through the constructor, like ClearEmbeddingsAction does, and add an overload that takes it directly instead of constructing a new one. Or is there a simpler approach I'm missing?
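
A small sketch of why a second instance conflicts and why passing the existing one avoids it. It uses a plain `java.nio` file lock as a stand-in for the storage lock. `LockingStore` and `PdfExtractor` are hypothetical names for illustration only, and nothing here reflects how AiService is actually implemented.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class SharedStoreSketch {

    /** Hypothetical service that holds an exclusive lock on its file while open. */
    static final class LockingStore implements AutoCloseable {
        private final FileChannel channel;
        private final FileLock lock;

        private LockingStore(FileChannel channel, FileLock lock) {
            this.channel = channel;
            this.lock = lock;
        }

        static LockingStore open(Path file) throws IOException {
            FileChannel channel = FileChannel.open(file,
                    StandardOpenOption.CREATE, StandardOpenOption.WRITE);
            try {
                // Throws OverlappingFileLockException if this JVM already holds the lock.
                FileLock lock = channel.tryLock();
                if (lock == null) {
                    throw new IOException("file is locked by another process");
                }
                return new LockingStore(channel, lock);
            } catch (IOException | RuntimeException e) {
                channel.close();
                throw e;
            }
        }

        boolean isOpen() {
            return lock.isValid();
        }

        @Override
        public void close() throws IOException {
            lock.release();
            channel.close();
        }
    }

    /** Hypothetical consumer that receives the store instead of creating its own. */
    static final class PdfExtractor {
        private final LockingStore store;

        PdfExtractor(LockingStore store) {
            this.store = store;
        }

        boolean canUseStore() {
            return store.isOpen();
        }
    }

    /** Opening a second store on the same file while the first is open. */
    static boolean secondOpenOverlaps(Path file) {
        try (LockingStore first = LockingStore.open(file)) {
            try (LockingStore second = LockingStore.open(file)) {
                return false;
            } catch (OverlappingFileLockException e) {
                return true;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Opening the store once and handing the same instance to every consumer. */
    static boolean sharedInstanceWorks(Path file) {
        try (LockingStore shared = LockingStore.open(file)) {
            PdfExtractor a = new PdfExtractor(shared);
            PdfExtractor b = new PdfExtractor(shared);
            return a.canUseStore() && b.canUseStore();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static Path tempFile() {
        try {
            return Files.createTempFile("store", ".lock");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path file = tempFile();
        System.out.println("second open overlaps: " + secondOpenOverlaps(file));
        System.out.println("shared instance works: " + sharedInstanceWorks(file));
    }
}
```

The constructor-injection variant keeps a single owner for the lock, which is the same idea as adding an overload that accepts the already-open service.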

@calixtus calixtus requested a review from InAnYan April 5, 2026 21:13
@github-actions
Contributor

The requested changes were not addressed for 10 days. Please follow-up in the next 10 days or your PR will be automatically closed. You can check the contributing guidelines for hints on the pull request process.

@github-actions github-actions bot added the status: stale Issues marked by a bot as "stale". All issues need to be investigated manually. label Apr 16, 2026
@InAnYan
Member

InAnYan commented Apr 16, 2026

Sorry for not responding for so long. While I'm not fully familiar with this part of the code, I think your remark is right. I think you could try passing the existing AI service instead of creating a new one.

@faneeshh
Contributor Author

> Sorry for not responding for so long. While I'm not fully familiar with this part of the code, I think your remark is right. I think you could try passing the existing AI service instead of creating a new one.

No worries. I can try that approach.

@github-actions github-actions bot removed the status: stale Issues marked by a bot as "stale". All issues need to be investigated manually. label Apr 18, 2026
@faneeshh
Contributor Author

I tested with the Kaerlein PDF. Grobid now fails and the LLM fallback runs without any errors, but the imported library is still empty. LlmPlainCitationParser.importDatabase returns nothing, and there's no log output at all after it runs. I think this is a lot more complicated than I expected.

@jabref-machine
Collaborator

Your code currently does not meet JabRef's code guidelines. We use Checkstyle to identify issues. You can see which checks are failing by locating the box "Some checks were not successful" on the pull request page. To see the test output, locate "Source Code Tests / Checkstyle (pull_request)" and click on it.

In case of issues with the import order, double check that you activated Auto Import. You can trigger fixing imports by pressing Ctrl+Alt+O to trigger Optimize Imports.

Please carefully follow the setup guide for the codestyle. Afterwards, please run checkstyle locally and fix the issues, commit, and push.

Labels

component: import-load status: changes-required Pull requests that are not yet complete

Development

Successfully merging this pull request may close these issues.

Import Kaerlein bibliography into JabRef