doc(core): clarify difference between chunker NPs and pattern NPs #2312
Issues
Seeks to resolve #2301
Description
Harper has had two definitions of a nominal phrase for some time now. I'd like to clarify why.
The first mention of nominal phrases in Harper was with the `NominalPhrasePattern`. It used a naive heuristic approach inspired by the kinds of syntax trees you find in formal programming languages. It was never effective enough, but it remains in the codebase because some rules rely on it.
The second was from my work on nominal phrase chunkers. These go through the document with a neural network and set each word's `np_member` field to true if it is a member of a nominal phrase. This is far more accurate and is thus the recommended way to detect nominal phrases in rules moving forward.
When reviewing this PR, keep in mind that the doc-comments in the code are what will persist. Are they complete? Are there any questions that need answering?
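For reviewers unfamiliar with the chunker approach, here is a rough sketch of how a rule might turn per-word `np_member` flags into phrase spans. The `Word` struct and the `nominal_phrase_spans` function are hypothetical stand-ins written for this illustration, not Harper's actual types or API, and the flag is simplified to a plain `bool`.

```rust
use std::ops::Range;

/// Hypothetical stand-in for a word token; not Harper's real type.
#[derive(Debug, Clone)]
struct Word {
    text: &'static str,
    /// Assumed to mirror the flag the chunker sets on each word.
    np_member: bool,
}

/// Group contiguous runs of flagged words into half-open index ranges.
fn nominal_phrase_spans(words: &[Word]) -> Vec<Range<usize>> {
    let mut spans = Vec::new();
    let mut start: Option<usize> = None;

    for (i, word) in words.iter().enumerate() {
        match (word.np_member, start) {
            // A flagged word opens a new span if none is open.
            (true, None) => start = Some(i),
            // An unflagged word closes the open span.
            (false, Some(s)) => {
                spans.push(s..i);
                start = None;
            }
            _ => {}
        }
    }

    // Close a span that runs to the end of the input.
    if let Some(s) = start {
        spans.push(s..words.len());
    }
    spans
}
```

Under these assumptions, a rule only needs to read the flags and never has to re-derive phrase boundaries from part-of-speech heuristics, which is the main practical difference from the pattern-based approach.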
How Has This Been Tested?
N/A
Checklist