skip_blank has different values during data preparation

I’m running a standard training on parallel data that contains empty source or target lines, and data_io.py raises an error while building buckets:

IndexError: index 897268 is out of bounds for axis 0 with size 897268

The parallel_iter() function in data_io.py is always called with the skip_blank argument set to True, except right here. This line keeps the sentence pairs containing “blanks”, which seems to cause the mismatch reflected in the error above. I no longer get the error when I set skip_blank to True (or when I remove the sentence pairs containing blanks from the data).
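
To illustrate the kind of mismatch described above, here is a minimal sketch in plain Python. It is not Sockeye's code: parallel_iter_sketch and fill_lengths are hypothetical stand-ins, and the assumption is that a first pass counts sentence pairs with blanks skipped, while a second pass fills preallocated storage with a different skip_blank value. A plain list is used instead of a NumPy array, so the error message differs from the one above, but the out-of-bounds write is the same idea.

```python
from typing import Iterable, Iterator, List, Tuple


def parallel_iter_sketch(sources: Iterable[str],
                         targets: Iterable[str],
                         skip_blank: bool = True) -> Iterator[Tuple[List[str], List[str]]]:
    """Hypothetical stand-in for a parallel reader; not Sockeye's implementation."""
    for src, tgt in zip(sources, targets):
        src_tokens, tgt_tokens = src.split(), tgt.split()
        if skip_blank and (not src_tokens or not tgt_tokens):
            continue  # drop pairs where either side is empty
        yield src_tokens, tgt_tokens


def fill_lengths(sources: List[str],
                 targets: List[str],
                 skip_blank_on_load: bool) -> List[int]:
    # Pass 1: count pairs with blanks skipped, then preallocate that many slots.
    num_pairs = sum(1 for _ in parallel_iter_sketch(sources, targets, skip_blank=True))
    lengths = [0] * num_pairs
    # Pass 2: fill the slots. If skip_blank differs from pass 1, this pass
    # yields more pairs than there are slots.
    pairs = parallel_iter_sketch(sources, targets, skip_blank=skip_blank_on_load)
    for i, (src_tokens, _) in enumerate(pairs):
        lengths[i] = len(src_tokens)  # IndexError once i reaches num_pairs
    return lengths


# Toy data (made up for this sketch): the second pair has an empty source line.
SOURCES = ["a b c", "", "d e"]
TARGETS = ["x y", "z", "w"]
```

With these toy inputs, fill_lengths(SOURCES, TARGETS, True) completes, while fill_lengths(SOURCES, TARGETS, False) raises an IndexError on the last pair, because two slots were allocated but three pairs are written.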

@mjpost This line came with this PR. Would it be a problem to keep the default skip_blank value here?

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 6 (6 by maintainers)

Top GitHub Comments

tdomhan commented, Feb 28, 2019

Thanks for reporting! This indeed doesn’t seem right.

mjpost commented, Feb 27, 2019

I should also have written, thanks for tracking this down and filing a perfect bug report. It seems you covered a real hole in my use cases 😃
