Evaluate 3 different topic modeling algorithms
- OCTIS version:
- Python version: 3.7
- Operating System: Linux
Description
I am a PhD candidate and I need to evaluate the performance of three different topic modeling algorithms: LDA, LSI, and BERTopic (LDA and LSI were trained using the Gensim package). What relevance metrics should I use apart from the coherence score? I would like to include in my paper a table or graph that evaluates the models in terms of accuracy (coherence score) and relevance of topics (should I use the topic diversity metric?). Thank you.

Which diversity metric are you using? Can you also show the snippet of code in which you call the metric? In general, a metric in OCTIS expects as input the output of a model. Any topic model in OCTIS returns a dictionary with up to 4 fields; depending on the metric, the right field is used to compute the score (see here for the details on model_output). So if you want to use a metric that computes the diversity from the word-topic distribution, you construct your model_output accordingly and then pass it to the metric's score function.
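For example, a minimal sketch assuming LDA trained with Gensim's `LdaModel` (the toy corpus and variable names are illustrative; note that `TopicDiversity` itself reads the `topics` field, while distribution-based diversity metrics read `topic-word-matrix`):

```python
from gensim.corpora import Dictionary
from gensim.models import LdaModel
from octis.evaluation_metrics.diversity_metrics import TopicDiversity

# Toy corpus, just to have a trained model whose output we can score.
texts = [["human", "interface", "computer"],
         ["survey", "user", "computer", "system", "response", "time"],
         ["graph", "minors", "trees"],
         ["graph", "trees", "survey", "minors"]]
dictionary = Dictionary(texts)
corpus = [dictionary.doc2bow(text) for text in texts]
lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=2)

# An OCTIS model_output is a dictionary with up to four fields:
#   "topics"                     - top words of each topic
#   "topic-word-matrix"          - word-topic distribution
#   "topic-document-matrix"      - document-topic distribution
#   "test-topic-document-matrix" - distribution on held-out documents
model_output = {
    "topics": [[word for word, _ in lda.show_topic(t, topn=10)]
               for t in range(lda.num_topics)],
    # word-topic distribution, shape (num_topics, vocabulary_size)
    "topic-word-matrix": lda.get_topics(),
}

# TopicDiversity: fraction of unique words among the top-k words of all topics.
metric = TopicDiversity(topk=10)
print(metric.score(model_output))
```

A score close to 1 means the topics share few top words; distribution-based diversity metrics in the same module use the `topic-word-matrix` field instead.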
Let me know if it works.
Silvia
Hello, it depends on what your objective is. Each evaluation metric focuses on a specific aspect of a topic model. OCTIS includes different categories of evaluation metrics, among them topic coherence, topic diversity, topic significance, and classification metrics (a sketch of computing two of them across your three models follows below).
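For the table or graph mentioned in the question, a hedged sketch of scoring all three models with a coherence and a diversity metric; `tokenized_corpus` and the three `*_output` dictionaries are assumed to have been built already (as in the snippet above), and their names are illustrative:

```python
from octis.evaluation_metrics.coherence_metrics import Coherence
from octis.evaluation_metrics.diversity_metrics import TopicDiversity

# NPMI coherence needs the tokenized training documents as reference texts.
coherence = Coherence(texts=tokenized_corpus, topk=10, measure="c_npmi")
diversity = TopicDiversity(topk=10)

results = {}
for name, output in [("LDA", lda_output),
                     ("LSI", lsi_output),
                     ("BERTopic", bertopic_output)]:
    results[name] = (coherence.score(output), diversity.score(output))
    print(f"{name}: NPMI coherence={results[name][0]:.3f}, "
          f"diversity={results[name][1]:.3f}")
```

NPMI coherence is a common choice for the "accuracy" column of such a table, and topic diversity for the "relevance of topics" column.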
I am not sure whether BERTopic produces the document-topic and word-topic distributions (if it does not, you will not be able to compute the topic significance metrics). You may also want to consider Contextualized Topic Models (CTM), a topic model that, like BERTopic, uses pre-trained contextualized representations. CTM is part of OCTIS too.
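If CTM is an option, a minimal sketch of training it through OCTIS, assuming one of the datasets bundled with OCTIS (the dataset name and `num_topics` here are illustrative):

```python
from octis.dataset.dataset import Dataset
from octis.models.CTM import CTM

# Load one of the preprocessed datasets shipped with OCTIS
dataset = Dataset()
dataset.fetch_dataset("20NewsGroup")

# train_model returns the model_output dictionary described above,
# so the same coherence/diversity metrics can score it directly.
model = CTM(num_topics=10)
model_output = model.train_model(dataset)
print(model_output.keys())
```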
Let me know if you have further questions,
Silvia