Skip to content

Possible to get parse tree output similar to AllenNLP? #11

@julianvenhuizen

Description

@julianvenhuizen

Is it possible to have Alpino output the parse tree in the following format:

In: "Several theories about the higher prevalence in males have been investigated, but the cause of the difference is unconfirmed; one theory is that females are underdiagnosed."

Out: (S (S (S (NP (NP (JJ Several) (NNS theories)) (PP (IN about) (NP (NP (DT the) (JJR higher) (NN prevalence)) (PP (IN in) (NP (NNS males)))))) (VP (VBP have) (VP (VBN been) (VP (VBN investigated))))) (, ,) (CC but) (S (NP (NP (DT the) (NN cause)) (PP (IN of) (NP (DT the) (NN difference)))) (VP (VBZ is) (ADJP (JJ unconfirmed))))) (: ;) (S (NP (CD one) (NN theory)) (VP (VBZ is) (SBAR (IN that) (S (NP (NNS females)) (VP (VBP are) (ADJP (JJ underdiagnosed))))))) (. .))

This output is currently achieved through the use of AllenNLP and a minimal span-based neural constituency parser. However, as I'm also working with Dutch data I intend to use the Alpino parser. If the above output isn't conceivable I suspect I have to go over the XML output and work something out myself.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions