Regression: custom batches no longer supported
Describe the bug
In joeynmt<2.0.0, the TrainManager used to receive a batch_class argument to handle custom batches. In joeynmt 2.0.0 that argument is no longer needed: batching goes through make_data_iter, which calls a collate_fn that is responsible for building and returning the batch object. This created a regression. It used to be possible to use this repository as a library, but now one has to make hard-coded changes in the code in order to have a custom make_data_iter. The call to make_data_iter is hard-coded in both training and prediction.
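For context, here is a minimal, hypothetical sketch in plain PyTorch (not the actual joeynmt API) of the kind of custom batch class and collate_fn a downstream user would want to inject. SpeechBatch, speech_collate and make_data_iter_like are made-up names; the point is only that the DataLoader construction happens inside joeynmt's training and prediction code, so nothing like this can be supplied from the outside:

from typing import List

import torch
from torch.utils.data import DataLoader, Dataset


class SpeechBatch:
    # hypothetical custom batch that pads variable-length feature tensors
    def __init__(self, features: List[torch.Tensor]):
        self.lengths = torch.tensor([f.shape[0] for f in features])
        self.features = torch.nn.utils.rnn.pad_sequence(features, batch_first=True)


def speech_collate(samples: List[torch.Tensor]) -> SpeechBatch:
    # the collate_fn a downstream user would like to hand to make_data_iter
    return SpeechBatch(samples)


def make_data_iter_like(dataset: Dataset, batch_size: int) -> DataLoader:
    # in joeynmt 2.0.0 the equivalent DataLoader construction is done inside
    # the library, so this collate_fn cannot be injected without editing it
    return DataLoader(dataset, batch_size=batch_size, collate_fn=speech_collate)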
Possible solutions:

For training:
- pass a data_iter callable to the TrainManager, or
- give the TrainManager a method that subclasses can extend (see the sketch after this list):

  def make_data_iter(self, **kwargs):
      return make_data_iter(**kwargs)

For prediction:
- pass the data iterator callable as an argument to the predict method?
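As a sketch of the second training-time option: the base method simply delegates to the module-level make_data_iter (the comments below indicate it currently lives in data.py, hence the assumed import path), and a downstream project subclasses instead of patching joeynmt's source. CustomTrainManager, its argument names and speech_collate (from the sketch above) are hypothetical:

from torch.utils.data import DataLoader

from joeynmt.data import make_data_iter  # assumed location, per the comments below


class TrainManager:
    # proposed hook: by default, delegate to the library function,
    # so existing behaviour is unchanged
    def make_data_iter(self, **kwargs):
        return make_data_iter(**kwargs)


class CustomTrainManager(TrainManager):
    # downstream override; argument names are illustrative,
    # not joeynmt's real signature
    def make_data_iter(self, dataset, batch_size, **kwargs):
        return DataLoader(dataset, batch_size=batch_size,
                          collate_fn=speech_collate)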
Which of these solutions would you prefer/approve, if any?
Issue Analytics
- State:
- Created a year ago
- Comments: 17 (8 by maintainers)
Top GitHub Comments
I moved it to two files as @may- and @juliakreutzer both thought “three classes for data-related functions might be confusing”.
I’d argue for merging and releasing as is (to support the functionality), and if down the line some restructuring needs to be done, that’s fine.
@AmitMY Do we need to separate the load_data() func from data.py? I wonder if we really need both data.py and data_loader.py. If it's just because of the circular import problem, we could move make_data_iter() from data.py to datasets.py, maybe? make_data_iter() is now actually a member of the dataset class, and the other sampler funcs + the collate func belong to this make_data_iter(). What do you think about this?

I did some nasty hack to avoid a circular import before, and I regret it… 🤕 https://github.com/joeynmt/joeynmt/blob/38fcd3a2b19ee657f355cd9c6833c19cf29fb703/joeynmt/helpers.py#L29-L31 I should have put log_data_info() in the data.py file instead.
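A hypothetical layout for that reorganisation might look like the following; module, class and function names are illustrative, not joeynmt's actual file structure. Once make_data_iter() is a method of the dataset class in datasets.py, data.py only imports in one direction and the helpers.py workaround becomes unnecessary:

# datasets.py -- the dataset owns its iterator construction
from typing import Callable, List, Optional

from torch.utils.data import DataLoader, Dataset


class BaseDataset(Dataset):
    def __init__(self, examples: List):
        self.examples = examples

    def __len__(self) -> int:
        return len(self.examples)

    def __getitem__(self, idx: int):
        return self.examples[idx]

    def make_data_iter(self, batch_size: int,
                       collate_fn: Optional[Callable] = None) -> DataLoader:
        # samplers and the default collate_fn would also live here,
        # next to the data they operate on
        return DataLoader(self, batch_size=batch_size, collate_fn=collate_fn)


# data.py -- imports only from datasets.py, so the import graph stays one-way
def load_data(cfg) -> BaseDataset:
    # read the files described by cfg and return a BaseDataset (sketch only)
    ...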