Noun chunk info from token
Hello spaCy team!
It appears that there isn’t an option to determine whether any single token is part of a noun chunk (as determined from `doc.noun_chunks`), in the same way as `token.ent_iob`.
The main problem that I am trying to solve is merging noun_chunks in specific sentences.
Is this a feature that could be added? Or is there another solution?
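There is no built-in per-token attribute for this, but IOB-style labels can be derived from the chunk boundaries themselves. A minimal sketch over plain token indices; the helper name `noun_chunk_iob` is my own, and in real use the `(start, end)` pairs would come from `[(nc.start, nc.end) for nc in doc.noun_chunks]`:

```python
def noun_chunk_iob(n_tokens, chunk_spans):
    """Return an IOB tag per token index, mirroring token.ent_iob_.

    chunk_spans is a list of half-open (start, end) token-index pairs,
    e.g. [(nc.start, nc.end) for nc in doc.noun_chunks] in spaCy.
    """
    tags = ["O"] * n_tokens
    for start, end in chunk_spans:
        tags[start] = "B"                 # first token of the chunk
        for i in range(start + 1, end):   # remaining tokens inside it
            tags[i] = "I"
    return tags

# "The quick brown fox jumped over the lazy dog"
# noun chunks at tokens [0, 4) and [6, 9)
print(noun_chunk_iob(9, [(0, 4), (6, 9)]))
# → ['B', 'I', 'I', 'I', 'O', 'O', 'B', 'I', 'I']
```

Since spaCy guarantees noun chunks do not overlap, a single pass like this is enough; merging a chunk in a specific sentence then reduces to filtering these spans by sentence boundaries first.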
Issue Analytics
- Created 7 years ago
- Comments:13 (5 by maintainers)
Top GitHub Comments
Maybe we should have a `span2doc` function? I think this might take some pressure off the span objects.

Trying to tie noun_chunks to specific sentences is a feature I’m trying to build as well. I altered your code @owlas for those trying to run it live on a document.
I’m assuming this just needs to be changed to check that the leftmost and rightmost children of the noun chunk fall inside the sentence’s leftmost/rightmost boundaries.
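That boundary check amounts to a containment test on token indices. A sketch, assuming half-open `(start, end)` spans as in spaCy’s `Span.start`/`Span.end`; the helper name `chunk_in_sentence` is hypothetical:

```python
def chunk_in_sentence(chunk, sentence):
    """True if the chunk's leftmost and rightmost tokens lie inside the sentence.

    Both arguments are half-open (start, end) token-index pairs,
    e.g. (span.start, span.end) for a spaCy Span.
    """
    c_start, c_end = chunk
    s_start, s_end = sentence
    return s_start <= c_start and c_end <= s_end

print(chunk_in_sentence((2, 5), (0, 10)))   # chunk fully inside → True
print(chunk_in_sentence((8, 12), (0, 10)))  # chunk crosses the boundary → False
```

With the half-open convention, chunks that merely touch the sentence edge are still counted as inside, which matches how spaCy slices spans.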