NLP (Text tracking and visualization)

🚀 Feature

I recently saw on your roadmap that you plan to add image tracking and visualization, so I thought a similar feature for textual data would be very helpful. Specifically, it would be useful to run substring/regexp searches and filters over the data during and after training, to see, visualize, and track how the model performs on a particular subset of examples at each stage of training. (An example follows in the Motivation section.)

Motivation

During training on the GLUE/SuperGLUE tasks (almost any NLP problem could be substituted here), there are times when I would like to see how my model performed (labels and other metadata I track) on a particular batch from the set of examples.

Example: Imagine you are training a question answering model and you would like to see how it performs on questions that start with “What if…” or “Who is”. Maybe you even have a set of adversarial examples that are easy to filter with a regexp, such as paragraphs (perhaps reordered or rewritten, e.g. Adversarial SQuAD, Human Adversarial QA) that contain the sequence “and then he … but”.

It would be very useful to know and track when the model started to get better at answering one type of question versus another. It would also show whether a particular wording is never understood by the model, and help answer many similar questions.
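To make the idea concrete, here is a minimal sketch of tracking accuracy per question type across training steps. It uses only the Python standard library and is not tied to any tracking tool's API; the pattern names, the `accuracy_by_pattern` helper, and the toy data are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical question types, taken from the examples in the text above.
PATTERNS = {
    "what_if": re.compile(r"^What if\b", re.IGNORECASE),
    "who_is": re.compile(r"^Who is\b", re.IGNORECASE),
}


def accuracy_by_pattern(questions, predictions, labels):
    """Return {pattern_name: accuracy} over the questions each pattern matches."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for question, pred, label in zip(questions, predictions, labels):
        for name, pattern in PATTERNS.items():
            if pattern.search(question):
                totals[name] += 1
                hits[name] += int(pred == label)
    # Patterns with no matching question are simply absent from the result.
    return {name: hits[name] / totals[name] for name in totals}


# Toy data: history[step] holds the per-pattern accuracy at that training step.
questions = ["What if the sun vanished?", "Who is the author?", "Where is Paris?"]
labels = ["a", "b", "c"]
history = {
    100: accuracy_by_pattern(questions, ["x", "b", "c"], labels),
    200: accuracy_by_pattern(questions, ["a", "b", "c"], labels),
}
```

Comparing `history[100]` with `history[200]` shows at which step the model started answering the "What if" questions correctly, which is the kind of signal the request is about.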

Pitch

I want functionality that lets me filter examples from TRAIN, TEST, or VAL, or any combination of them, by substring or regexp matching (see the example above). I would hope for a window or tab that displays the filtered examples during and after the experiment.
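The requested filtering could look roughly like the sketch below. It is an illustration in plain Python, not an existing API: the `filter_examples` function, the split names, and the sample texts are assumptions made for the example.

```python
import re


def filter_examples(datasets, pattern, splits=("train", "val", "test")):
    """Return (split, text) pairs whose text matches the regexp `pattern`.

    `datasets` maps a split name to a list of example strings; splits that
    are not present in `datasets` are skipped.
    """
    regex = re.compile(pattern)
    return [
        (split, text)
        for split in splits
        for text in datasets.get(split, [])
        if regex.search(text)
    ]


# Toy data standing in for real dataset splits.
datasets = {
    "train": ["and then he ran but fell", "a plain sentence"],
    "val": ["Who is she?"],
    "test": ["and then he stopped, but too late"],
}

# Search TRAIN and TEST only for the adversarial wording from the example.
matches = filter_examples(datasets, r"and then he .* but", splits=("train", "test"))
```

A plain substring search is the same call with `re.escape(substring)` passed as the pattern.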

Alternatives

You can write the examples to a log file and split it up to view in a text editor, but that is not very user-friendly, and you cannot properly track or search the text during training.

Issue Analytics

  • State: open
  • Created: 2 years ago
  • Reactions: 6
  • Comments: 9 (6 by maintainers)

Top GitHub Comments

gorarakelyan commented, Dec 17, 2021 (3 reactions)

@osoblanco @LarsHill Aim v3.3.0 is out with text tracking and visualization! 📄 Regex search and markdown support will be added as enhancements in future releases. Thanks a lot for your feedback! 🙌

gorarakelyan commented, Jan 24, 2022 (2 reactions)

@osoblanco the text search enhancement is now available! 🎉

Please make sure to upgrade to aim>=3.4.0 to be able to use it: pip install aim --upgrade.

Live example here: http://play.aimstack.io:10004/runs/d9e89aa7875e44b2ba85612a/texts
