Model's _fit should accept Dataset also, not just BatchVectorizer

_fit

Seems more natural for a model to fit on Dataset. Maybe better to use Union[artm.BatchVectorizer, topicnet.cooking_machine.Dataset] instead of just artm.BatchVectorizer (Union — for compatibility)?
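
A minimal sketch of what the proposed signature could look like. The classes below are hypothetical stand-ins for artm.BatchVectorizer and topicnet.cooking_machine.Dataset, so the snippet runs without either library installed. The get_batch_vectorizer method on the stand-in Dataset, the TopicModel class, and the parameter names are assumptions for illustration, not the actual TopicNet implementation.

```python
from typing import Optional, Union


class BatchVectorizer:
    """Hypothetical stand-in for artm.BatchVectorizer."""

    def __init__(self, num_batches: int) -> None:
        self.num_batches = num_batches


class Dataset:
    """Hypothetical stand-in for topicnet.cooking_machine.Dataset."""

    def __init__(self, num_batches: int) -> None:
        self._num_batches = num_batches

    def get_batch_vectorizer(self) -> BatchVectorizer:
        # Assumption: a Dataset can build a BatchVectorizer over its own data.
        return BatchVectorizer(self._num_batches)


class TopicModel:
    """Toy model; only the input handling in _fit matters here."""

    def __init__(self) -> None:
        self.fitted_on: Optional[BatchVectorizer] = None
        self.num_iterations = 0

    def _fit(
        self,
        data: Union[BatchVectorizer, Dataset],
        num_iterations: int = 1,
    ) -> None:
        # Accept either type and normalize to a BatchVectorizer,
        # so callers that already pass a BatchVectorizer keep working.
        if isinstance(data, Dataset):
            data = data.get_batch_vectorizer()
        elif not isinstance(data, BatchVectorizer):
            raise TypeError(
                "expected BatchVectorizer or Dataset, got "
                + type(data).__name__
            )
        # The real training loop would go here; the sketch only records inputs.
        self.fitted_on = data
        self.num_iterations = num_iterations


model = TopicModel()
model._fit(Dataset(num_batches=3), num_iterations=10)
```

Converting at the top of the method keeps the rest of _fit unchanged, which is the compatibility argument for Union rather than switching the parameter type outright.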

Issue Analytics

State:
Created 3 years ago
Comments: 6 (3 by maintainers)

Top GitHub Comments

1 reaction

bt2901 commented, May 25, 2020

I think you are moving the goalposts. We do not provide guarantees on _fit, but that does not forbid the user from using it. Making this method a bit more flexible does not change that.

Also, training a model without Cubes + Experiment overhead is exactly why one would consider using the method (e.g. for very dirty prototyping or perhaps for cases not covered by Cubes + Experiment yet).

0 reactions

Alvant commented, May 25, 2020

First, _fit is “protected” method, meaning we do not guarantee that it should work nice and easy for the user and that everything will work

Ok, but it doesn’t mean that we shouldn’t think about how to make the method better 🙂

