
Tc classification breaking use case  #53

@lfoppiano

Apparently, processing this document (document2.pdf) with the superconductors model using SciBERT breaks the Tc classification:

Jul 11 12:47:07 falcon docker[11065]: Traceback (most recent call last):
Jul 11 12:47:07 falcon docker[11065]: File "/opt/service/venv/lib/python3.7/site-packages/bottle.py", line 870, in _handle
Jul 11 12:47:07 falcon docker[11065]: return route.call(**args)
Jul 11 12:47:07 falcon docker[11065]: File "/opt/service/venv/lib/python3.7/site-packages/bottle.py", line 1750, in wrapper
Jul 11 12:47:07 falcon docker[11065]: rv = callback(*a, **ka)
Jul 11 12:47:07 falcon docker[11065]: File "/opt/service/grobid_superconductors/service.py", line 118, in process_link
Jul 11 12:47:07 falcon docker[11065]: result.append(self.process_single_sentence(sentence_input, link_types_as_list, skip_classification))
Jul 11 12:47:07 falcon docker[11065]: File "/opt/service/grobid_superconductors/service.py", line 143, in process_single_sentence
Jul 11 12:47:07 falcon docker[11065]: marked_tc_paragraph = self.temperature_classifier.mark_temperatures_paragraph(paragraph_input)
Jul 11 12:47:07 falcon docker[11065]: File "/opt/service/grobid_superconductors/linking/linking_module.py", line 561, in mark_temperatures_paragraph
Jul 11 12:47:07 falcon docker[11065]: return self.mark_temperatures(text_, tokens_, spans_)
Jul 11 12:47:07 falcon docker[11065]: File "/opt/service/grobid_superconductors/linking/linking_module.py", line 543, in mark_temperatures
Jul 11 12:47:07 falcon docker[11065]: doc = self.init_doc(words, spaces, spans_remapped)
Jul 11 12:47:07 falcon docker[11065]: File "/opt/service/grobid_superconductors/linking/linking_module.py", line 68, in init_doc
Jul 11 12:47:07 falcon docker[11065]: span = Span(doc=doc, start=s['token_start'], end=s['token_end'], label=s['type'])
Jul 11 12:47:07 falcon docker[11065]: File "spacy/tokens/span.pyx", line 99, in spacy.tokens.span.Span.__cinit__
Jul 11 12:47:07 falcon docker[11065]: IndexError: [E035] Error creating span with start 9 and end 6 for Doc of length 24.
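
The last frame suggests that one of the spans passed to `init_doc` has `token_start` greater than `token_end` (9 and 6 here), which spaCy rejects when constructing a `Span`. A minimal sketch of a guard that could run before the spans are turned into `Span` objects is below. It assumes spaCy requires `0 <= start <= end <= len(doc)`; the helper name `split_valid_spans` and the `type` values are hypothetical, and only the `token_start`, `token_end` and `type` keys come from the traceback. It does not explain why the offsets end up inverted for this document, it only isolates the offending spans.

```python
from typing import Dict, List, Tuple


def split_valid_spans(spans: List[Dict], doc_length: int) -> Tuple[List[Dict], List[Dict]]:
    """Separate spans whose token offsets can form a span from those that cannot.

    Hypothetical helper, not part of grobid_superconductors.
    """
    valid, invalid = [], []
    for s in spans:
        start, end = s["token_start"], s["token_end"]
        # Assumed spaCy constraint: 0 <= start <= end <= len(doc).
        # The span in the traceback (start 9, end 6) violates start <= end.
        if 0 <= start <= end <= doc_length:
            valid.append(s)
        else:
            invalid.append(s)
    return valid, invalid


# Illustrative input: the second span reuses the offsets from the traceback;
# the "type" values are made up for the example.
spans = [
    {"token_start": 2, "token_end": 4, "type": "material"},
    {"token_start": 9, "token_end": 6, "type": "tcValue"},
]
valid, invalid = split_valid_spans(spans, doc_length=24)
print(len(valid), len(invalid))
```

Logging the `invalid` list together with the sentence text would show which span in document2.pdf carries the inverted offsets, which is probably the quickest way to find where the token remapping goes wrong.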

Labels: bug
