Token classification weak labelling
As in the text classification task, the “Weak labeling” mode in token classification must allow tagging entities by defining a query and an entity label (the rule).
Given a rule, the Weak labeling mode for token classification will tag entities based on the matched tokens/words in the search results returned by the API.
How an entity is tagged from the matched tokens will be determined by a labeling function provided as an attribute of the rule. For now, only one labeling function will be supported, exact_match, where all matched tokens/words are tagged with the label provided by the rule.
For example, given a labeling rule with the query Par*, the label PLACE, and the matched record "Paris is the city of light", the labeling function will tag the token Paris as a PLACE.
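The behavior described above could be sketched roughly as follows. This is only an illustration, not the actual implementation: the names Rule, find_matched_tokens, exact_match and apply_rule are hypothetical, and the search API is replaced by a simple wildcard match over words so the example runs on its own. In the real feature, the matched tokens would come from the search results returned by the API.

```python
import re
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import List, Tuple

# (start char offset, end char offset, entity label)
Span = Tuple[int, int, str]


@dataclass
class Rule:
    """Hypothetical rule: a query, an entity label and a labeling function name."""
    query: str
    label: str
    labeling_function: str = "exact_match"


def find_matched_tokens(query: str, text: str) -> List[Tuple[int, int]]:
    """Stand-in for the search API (assumption): return the character
    offsets of every word in `text` that matches the wildcard `query`."""
    return [
        (m.start(), m.end())
        for m in re.finditer(r"\w+", text)
        if fnmatchcase(m.group(), query)
    ]


def exact_match(matches: List[Tuple[int, int]], label: str) -> List[Span]:
    """Tag every matched token, exactly as matched, with the rule label."""
    return [(start, end, label) for start, end in matches]


# Only one labeling function for now; others could be registered here later.
LABELING_FUNCTIONS = {"exact_match": exact_match}


def apply_rule(rule: Rule, text: str) -> List[Span]:
    """Look up the rule's labeling function and apply it to the matches."""
    labeling_function = LABELING_FUNCTIONS[rule.labeling_function]
    return labeling_function(find_matched_tokens(rule.query, text), rule.label)


rule = Rule(query="Par*", label="PLACE")
text = "Paris is the city of light"
spans = apply_rule(rule, text)
print([(text[start:end], label) for start, end, label in spans])
# prints [('Paris', 'PLACE')]
```

Keeping the labeling function as a named attribute of the rule means new strategies could be added to the lookup table later without changing how rules are applied.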
An important part of this feature is that the UI must visualize the tagged entities in the visible records.
Issue Analytics
- Created 2 years ago
- Comments: 6 (5 by maintainers)
Decision Notes
Record list
Module to set rules
Also, think about including docs and reference TextClassification use cases too #1986