NLP (Text tracking and visualization)

🚀 Feature

I recently saw in your roadmap that you will be adding image tracking and visualization, so I thought a similar feature for textual data would be very helpful. Specifically, it would be useful to run substring/regexp searches and filters over the data during and after training, to visualize and track how the model performs on a particular subset of examples at each stage of training. (An example follows in the Motivation section.)

Motivation

During training on GLUE/SuperGLUE tasks (almost any NLP problem can be substituted here), there are times when I would like to see how my model performed (labels and other metadata I track) on a particular batch of examples.

Example: Imagine you are training a Question Answering model and you would like to see how it performs on questions that start with “What if…” or “Who is”. Maybe you even have a set of adversarial examples that are easy to filter with a regexp, such as a set of paragraphs (possibly reordered or rewritten, e.g. Adversarial SQuAD, Human Adversarial QA) that contain the sequence “and then he … but”.

It would be very useful to know and track when the model starts to get better at answering one type of question versus another. This would also show whether a particular wording is never understood by the model, and answer many similar questions.
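To make the motivation concrete, here is a minimal sketch of the kind of bookkeeping described above, done by hand in plain Python. It is not part of Aim; the record layout (`step`, `question`, `correct`), the `PATTERNS` table, and the `accuracy_by_pattern` helper are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Hypothetical question patterns to monitor; names and regexes are illustrative.
PATTERNS = {
    "what_if": re.compile(r"^what if\b", re.IGNORECASE),
    "who_is": re.compile(r"^who is\b", re.IGNORECASE),
}


def accuracy_by_pattern(records):
    """Group evaluation records by matching pattern and training step.

    `records` is an iterable of dicts with 'step', 'question' and 'correct'
    keys (an assumed layout). Returns {pattern_name: {step: accuracy}}.
    """
    # counts[name][step] holds [number correct, number seen]
    counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for rec in records:
        for name, pattern in PATTERNS.items():
            if pattern.search(rec["question"]):
                entry = counts[name][rec["step"]]
                entry[0] += int(rec["correct"])
                entry[1] += 1
    return {
        name: {step: c / t for step, (c, t) in sorted(steps.items())}
        for name, steps in counts.items()
    }


# Toy evaluation log from two checkpoints.
records = [
    {"step": 100, "question": "What if the sun vanished?", "correct": False},
    {"step": 100, "question": "Who is the author?", "correct": True},
    {"step": 200, "question": "What if the sun vanished?", "correct": True},
    {"step": 200, "question": "Who is the author?", "correct": True},
]
result = accuracy_by_pattern(records)
print(result)
```

A per-pattern curve like this shows when the model starts handling one type of question, but it has to be rebuilt for every new pattern, which is why built-in search in the tracking UI would help.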

Pitch

I want functionality that will allow me to filter the examples from TRAIN, TEST, or VAL, or any combination of them, by substring or regexp matching (see the example above). I would hope for a window, tab, or something similar that displays the filtered examples during and after the experiment.
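The requested behaviour could look roughly like the sketch below. This only illustrates the filtering semantics, not an existing Aim API: the `Example` record and the `filter_examples` function are hypothetical names, and case-insensitive matching is an assumption.

```python
import re
from dataclasses import dataclass


@dataclass
class Example:
    split: str  # "train", "val" or "test"
    text: str
    label: str


def filter_examples(examples, pattern, splits=("train", "val", "test"), regex=True):
    """Return the examples from the chosen splits whose text matches `pattern`.

    With regex=False the pattern is treated as a literal substring.
    """
    matcher = re.compile(pattern if regex else re.escape(pattern), re.IGNORECASE)
    wanted = set(splits)
    return [ex for ex in examples if ex.split in wanted and matcher.search(ex.text)]


examples = [
    Example("train", "And then he left, but nobody noticed.", "neutral"),
    Example("val", "and then he smiled but said nothing", "neutral"),
    Example("test", "She stayed home.", "neutral"),
]

# Regexp search restricted to the TRAIN and VAL splits.
matches = filter_examples(examples, r"and then he .* but", splits=("train", "val"))
for ex in matches:
    print(ex.split, "|", ex.text)
```

Passing `regex=False` turns the same call into a plain substring search, which covers both modes mentioned in the pitch.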

Alternatives

You can write the examples to a log file and somehow split it up to view in a text editor, but that is not very user-friendly, and you wouldn’t be able to properly track or search the text during training.

Issue Analytics

Created 2 years ago
Reactions: 6
Comments: 9 (6 by maintainers)

Top GitHub Comments

3 reactions

gorarakelyan commented, Dec 17, 2021

@osoblanco @LarsHill Aim v3.3.0 is out with text tracking and visualization! 📄 Regex search and markdown support will be added as enhancements in future releases. Thanks a lot for your feedback! 🙌

2 reactions

gorarakelyan commented, Jan 24, 2022

@osoblanco the text search enhancement is now available! 🎉

Please make sure to upgrade to aim>=3.4.0 to be able to use it: pip install aim --upgrade.

Live example here: http://play.aimstack.io:10004/runs/d9e89aa7875e44b2ba85612a/texts
