Search-tag on labels, resets prior annotation on text classification hand labeling with multi_label=True
It appears that, when hand labeling a text classification dataset with multiple labels, using the search function on the labels clears all prior annotations once the search is closed. This is a major bug, because it is not immediately apparent that the prior annotation labels have been reset: they are out of the visible scope. The problem is even more pronounced if you are working with a large number of labels.
Steps to reproduce:
Create a DatasetForTextClassification from a list of records built as follows:
import rubrix as rb

def make_record(row):
    # one multi-label record per DataFrame row
    record = rb.TextClassificationRecord(
        text=row["text"],
        multi_label=True,
    )
    return record

# df is a DataFrame with a "text" column
records = []
for idx, row in df.iterrows():
    records.append(make_record(row))
dataset_rb = rb.DatasetForTextClassification(records)
Assign a large number of labels to the dataset:
settings = rb.TextClassificationSettings(label_schema=get_lots_of_labels())
# apply settings to new or already existing dataset
rb.configure_dataset("my_dataset_name", settings=settings)
# logging to the newly created dataset triggers the validation checks
rb.log(dataset_rb, "my_dataset_name")
Switch to the web app and start hand labeling. Use the search on the labels (not on the records) to find and toggle labels: try a few search strings, clearing the search string after making each selection. Only the most recently selected labels keep their state; all prior label toggles are reset.
Appears to be a state management issue.
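To illustrate the behavior one would expect, here is a minimal sketch in Python of a label picker whose selection state is kept separate from the search filter. It is not the web app's actual code; the `LabelSelector` class and its methods are hypothetical names used only for illustration.

```python
class LabelSelector:
    """Hypothetical model of a multi-label picker with a search box."""

    def __init__(self, labels):
        self.labels = list(labels)
        self.selected = set()  # selection state, independent of the search
        self.query = ""

    def search(self, query):
        # Changing or clearing the query only changes what is visible;
        # it never touches the selected labels.
        self.query = query

    def visible(self):
        q = self.query.lower()
        return [label for label in self.labels if q in label.lower()]

    def toggle(self, label):
        if label not in self.labels:
            raise ValueError(f"unknown label: {label}")
        if label in self.selected:
            self.selected.remove(label)
        else:
            self.selected.add(label)


selector = LabelSelector(["billing", "bug", "feature", "refund"])
selector.search("b")
selector.toggle("bug")
selector.search("ref")
selector.toggle("refund")
selector.search("")  # clearing the search must not reset earlier toggles
print(sorted(selector.selected))  # ['bug', 'refund']
```

The bug described above would correspond to the selection being derived from the currently visible labels instead of being stored on its own.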
Thanks for reporting @dhruvsakalley
We will take a look at this problem as soon as possible
Thanks for confirming. I would like to add that resetting prior annotations without confirmation can lead to lost work. It might be useful to have an undo for accidents like these. Some tools, like prodigy, keep track of the last n actions in the session and commit as a separate step. I find this very useful as a quick way to go back and change a label based on a new observation, or to undo a mistake, which makes the annotation flow faster.
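As a rough sketch of the undo idea, the following Python keeps the last n annotation actions in a bounded history and commits as a separate step. It is not how prodigy or this project implements it; `AnnotationSession` and its methods are hypothetical names.

```python
from collections import deque


class AnnotationSession:
    """Hypothetical session that buffers annotations until commit."""

    def __init__(self, history_size=10):
        self.pending = {}    # record id -> frozenset of labels, not yet committed
        self.committed = {}  # record id -> frozenset of labels, saved
        # Only the last `history_size` actions can be undone.
        self.history = deque(maxlen=history_size)

    def annotate(self, record_id, labels):
        # Remember the previous pending value so undo can restore it.
        self.history.append((record_id, self.pending.get(record_id)))
        self.pending[record_id] = frozenset(labels)

    def undo(self):
        if not self.history:
            return False
        record_id, previous = self.history.pop()
        if previous is None:
            self.pending.pop(record_id, None)
        else:
            self.pending[record_id] = previous
        return True

    def commit(self):
        # Committing is explicit; afterwards there is nothing left to undo.
        self.committed.update(self.pending)
        self.pending.clear()
        self.history.clear()


session = AnnotationSession(history_size=3)
session.annotate(1, {"bug"})
session.annotate(1, {"bug", "refund"})
session.undo()  # back to just {"bug"}
session.commit()
```

Using a `deque` with `maxlen` means the oldest actions are dropped automatically once the history is full.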