ValueError: cannot convert float NaN to integer

Thanks for this open source contribution!

I am seeing the same problem as in issue #37, although I am using the fix for that issue (commit c85ca3f5b1c00b785ca346882a8983d57287d75f) to generate my datalist.

Specifically, when I train on my own data (open source dataset FinTabNet) I see the following errors:

/home/usr/DAVAR-Lab-OCR/davarocr/davar_table/core/mask/lp_mask_target.py:55: RuntimeWarning: Mean of empty slice.
  middle_x, middle_y = round(np.where(box_text == 1)[1].mean()), round(np.where(box_text == 1)[0].mean())
/home/usr/DAVAR-Lab-OCR/davarocr/davar_table/core/mask/lp_mask_target.py:55: RuntimeWarning: Mean of empty slice.
Traceback (most recent call last):
  File "acx-tsr/scripts/train_lgpma.py", line 316, in <module>
    main()
  File "acx-tsr/scripts/train_lgpma.py", line 311, in main
    meta=meta,
  File "/home/usr/DAVAR-Lab-OCR/davarocr/davar_common/apis/train.py", line 174, in train_model
  middle_x, middle_y = round(np.where(box_text == 1)[1].mean()), round(np.where(box_text == 1)[0].mean())
    runner.run(data_loaders, cfg.workflow)
  File "/home/usr/mmcv/mmcv/runner/epoch_based_runner.py", line 125, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/home/usr/mmcv/mmcv/runner/epoch_based_runner.py", line 50, in train
    self.run_iter(data_batch, train_mode=True, **kwargs)
  File "/home/usr/mmcv/mmcv/runner/epoch_based_runner.py", line 30, in run_iter
    **kwargs)
  File "/home/usr/mmcv/mmcv/parallel/data_parallel.py", line 67, in train_step
    return self.module.train_step(*inputs[0], **kwargs[0])
  File "/home/usr/mmdetection/mmdet/models/detectors/base.py", line 247, in train_step
    losses = self(**data)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 918, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/usr/mmcv/mmcv/runner/fp16_utils.py", line 124, in new_func
    output = old_func(*new_args, **new_kwargs)
  File "/home/usr/mmdetection/mmdet/models/detectors/base.py", line 181, in forward
    return self.forward_train(img, img_metas, **kwargs)
  File "/home/usr/DAVAR-Lab-OCR/davarocr/davar_table/models/detectors/lgpma.py", line 132, in forward_train
    **kwargs)
  File "/home/usr/DAVAR-Lab-OCR/davarocr/davar_table/models/roi_heads/lgpma_roi_head.py", line 84, in forward_train
    gt_masks, img_metas, gt_bboxes)
  File "/home/usr/DAVAR-Lab-OCR/davarocr/davar_table/models/roi_heads/lgpma_roi_head.py", line 131, in _mask_forward_train
    self.train_cfg, gt_bboxes)
  File "/home/usr/DAVAR-Lab-OCR/davarocr/davar_table/models/roi_heads/mask_heads/lpma_mask_head.py", line 105, in get_targets
    gt_lpma_hor, gt_lpma_ver = get_lpmasks(gt_masks, gt_bboxes)
  File "/home/usr/DAVAR-Lab-OCR/davarocr/davar_table/core/mask/lp_mask_target.py", line 30, in get_lpmasks
    gt_masks_temp = list(gt_masks_temp)
  File "/home/usr/DAVAR-Lab-OCR/davarocr/davar_table/core/mask/lp_mask_target.py", line 55, in get_lpmask_single
    middle_x, middle_y = round(np.where(box_text == 1)[1].mean()), round(np.where(box_text == 1)[0].mean())
  ValueError: cannot convert float NaN to integer

Any help would be much appreciated!
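For readers following along, here is a minimal sketch of how the warning and the exception in the traceback fit together. It assumes only NumPy; the all-zero `box_text` array is a stand-in I made up for a text mask with no foreground pixels, not data from the actual pipeline.

```python
import warnings

import numpy as np

# Stand-in for a text-region mask that contains no pixels equal to 1.
box_text = np.zeros((4, 4), dtype=np.uint8)

# np.where returns (row_indices, col_indices); both are empty here.
ys, xs = np.where(box_text == 1)

with warnings.catch_warnings():
    # Taking the mean of an empty array emits "Mean of empty slice" and yields NaN.
    warnings.simplefilter("ignore", RuntimeWarning)
    mean_x = xs.mean()

try:
    round(mean_x)  # same call pattern as line 55 of lp_mask_target.py
    raised = False
    message = ""
except ValueError as err:
    raised = True
    message = str(err)

print(np.isnan(mean_x), raised, message)
```

So the `RuntimeWarning` is the first symptom: whenever `box_text == 1` matches nothing, the subsequent `round(...)` has nothing sensible to convert.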

Top GitHub Comments

qiaoliang6 commented, Mar 25, 2022 (1 reaction)

Could you provide the datalist you generated (you may send it by email)?

julianmack commented, Apr 14, 2022 (0 reactions)

An update in case anyone else hits this issue: I added a hack to work around it here: https://github.com/AccelexTechnology/DAVAR-Lab-OCR/commit/ce6d9ce766573362e3354ece5718ca722feaa4c4

This was only appropriate because it affected a small proportion of my dataset (~1/5000 samples). It will give bad results otherwise, as I am setting the index to an arbitrary (and hence likely incorrect) non-NaN value.
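To illustrate the general shape of such a workaround, here is a rough sketch of a guarded centre computation. This is not the code from the linked commit: `mask_center` is a hypothetical helper, and falling back to the centre of `bbox` assumes the box is given as `[x1, y1, x2, y2]` in the same coordinate frame as the mask, which I have not verified against the DAVAR code.

```python
import numpy as np


def mask_center(box_text, bbox):
    """Return (middle_x, middle_y) for the foreground pixels of box_text.

    Hypothetical helper: if the mask has no pixels equal to 1, fall back
    to the centre of bbox ([x1, y1, x2, y2]) instead of producing NaN.
    """
    ys, xs = np.where(box_text == 1)
    if xs.size == 0:
        # Fallback value is a guess, so targets built from it may be wrong.
        x1, y1, x2, y2 = bbox
        return round((x1 + x2) / 2), round((y1 + y2) / 2)
    return round(float(xs.mean())), round(float(ys.mean()))


# Empty mask: the fallback is used.
empty_result = mask_center(np.zeros((10, 10), dtype=np.uint8), [2, 4, 6, 8])

# Mask with one foreground pixel at row 2, column 3.
mask = np.zeros((10, 10), dtype=np.uint8)
mask[2, 3] = 1
single_result = mask_center(mask, [0, 0, 9, 9])
```

As with the hack above, any fallback only hides the symptom, so it is probably safer to log each sample that takes the fallback path and check how many there are.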

I spent a long time looking at one of the samples from FinTabNet that was causing the issue here:

{"pdf/RCL/2017/page_60_95895.png": {"height": 888, "width": 2065, "content_ann": {"bboxes": [[], [1195, 11, 1529, 43], [], [761, 177, 832, 209], [1013, 116, 1146, 209], [1325, 116, 1399, 209], [1608, 116, 1682, 209], [1855, 116, 2002, 209], [0, 233, 337, 273], [], [], [], [], [], [24, 292, 497, 334], [663, 293, 914, 334], [946, 293, 1197, 334], [1229, 293, 1480, 334], [1512, 293, 1763, 334], [1795, 293, 2046, 334], [24, 353, 475, 394], [752, 354, 914, 394], [1066, 354, 1197, 394], [1349, 354, 1480, 394], [1631, 354, 1763, 394], [1915, 354, 2046, 394], [24, 414, 147, 455], [783, 415, 914, 455], [1066, 415, 1197, 455], [1349, 415, 1480, 455], [1631, 415, 1763, 455], [1915, 415, 2046, 455], [0, 475, 326, 516], [915, 512, 917, 516], [], [], [], [], [24, 535, 472, 576], [732, 536, 914, 576], [1036, 536, 1197, 576], [1319, 536, 1480, 576], [1601, 536, 1763, 576], [1885, 536, 2046, 576], [0, 596, 337, 637], [915, 634, 917, 638], [], [], [], [], [24, 656, 497, 697], [752, 657, 914, 697], [1036, 657, 1197, 697], [1319, 657, 1480, 697], [1601, 657, 1763, 697], [1885, 657, 2046, 697], [24, 717, 452, 758], [803, 718, 914, 758], [1106, 718, 1197, 758], [1389, 718, 1480, 758], [1672, 718, 1763, 758], [1935, 718, 2046, 758], [24, 777, 147, 819], [803, 778, 914, 819], [1106, 778, 1197, 819], [1370, 778, 1480, 819], [1672, 778, 1763, 819], [2006, 778, 2046, 819], [0, 839, 83, 879], [663, 839, 914, 879], [946, 839, 1197, 879], [1229, 839, 1480, 879], [1512, 839, 1763, 879], [1795, 839, 2046, 879]], "texts": ["", "Payments due by period", "", "Total", "Less than 1 year", "1-3 years", "3-5 years", "More than 5 years", "Operating Activities:", "", "", "", "", "", "Operating lease obligations(1)", "$241,468", "$29,420", "$44,191", "$22,644", "$145,213", "Interest on long-term debt(2)", "1,275,346", "250,600", "415,000", "292,665", "317,081", "Other(3)", "879,206", "214,444", "282,570", "150,003", "232,189", "Investing Activities:", "0", "", "", "", "", "Ship purchase obligations(4)", 
"10,888,494", "2,368,806", "3,063,165", "4,089,153", "1,367,370", "Financing Activities:", "0", "", "", "", "", "Long-term debt obligations(5)", "7,506,312", "1,185,038", "2,047,882", "2,012,922", "2,260,470", "Capital lease obligations(6)", "33,139", "3,476", "7,210", "8,395", "14,058", "Other(7)", "21,552", "8,868", "11,217", "1,467", "\u2014", "Total", "$20,845,517", "$4,060,652", "$5,871,235", "$6,577,249", "$4,336,381"], "texts_tokens": [[], ["P", "a", "y", "m", "e", "n", "t", "s", " ", "d", "u", "e", " ", "b", "y", " ", "p", "e", "r", "i", "o", "d"], [], ["T", "o", "t", "a", "l"], ["L", "e", "s", "s", " ", "t", "h", "a", "n", " ", "1", " ", "y", "e", "a", "r"], ["1", "-", "3", " ", "y", "e", "a", "r", "s"], ["3", "-", "5", " ", "y", "e", "a", "r", "s"], ["M", "o", "r", "e", " ", "t", "h", "a", "n", " ", "5", " ", "y", "e", "a", "r", "s"], ["O", "p", "e", "r", "a", "t", "i", "n", "g", " ", "A", "c", "t", "i", "v", "i", "t", "i", "e", "s", ":"], [], [], [], [], [], ["O", "p", "e", "r", "a", "t", "i", "n", "g", " ", "l", "e", "a", "s", "e", " ", "o", "b", "l", "i", "g", "a", "t", "i", "o", "n", "s", "<sup>", "(", "1", ")", "</sup>"], ["$", "2", "4", "1", ",", "4", "6", "8"], ["$", "2", "9", ",", "4", "2", "0"], ["$", "4", "4", ",", "1", "9", "1"], ["$", "2", "2", ",", "6", "4", "4"], ["$", "1", "4", "5", ",", "2", "1", "3"], ["I", "n", "t", "e", "r", "e", "s", "t", " ", "o", "n", " ", "l", "o", "n", "g", "-", "t", "e", "r", "m", " ", "d", "e", "b", "t", "<sup>", "(", "2", ")", "</sup>"], ["1", ",", "2", "7", "5", ",", "3", "4", "6"], ["2", "5", "0", ",", "6", "0", "0"], ["4", "1", "5", ",", "0", "0", "0"], ["2", "9", "2", ",", "6", "6", "5"], ["3", "1", "7", ",", "0", "8", "1"], ["O", "t", "h", "e", "r", "<sup>", "(", "3", ")", "</sup>"], ["8", "7", "9", ",", "2", "0", "6"], ["2", "1", "4", ",", "4", "4", "4"], ["2", "8", "2", ",", "5", "7", "0"], ["1", "5", "0", ",", "0", "0", "3"], ["2", "3", "2", ",", "1", "8", "9"], ["I", "n", "v", "e", "s", "t", "i", "n", 
"g", " ", "A", "c", "t", "i", "v", "i", "t", "i", "e", "s", ":"], ["0"], [], [], [], [], ["S", "h", "i", "p", " ", "p", "u", "r", "c", "h", "a", "s", "e", " ", "o", "b", "l", "i", "g", "a", "t", "i", "o", "n", "s", "<sup>", "(", "4", ")", "</sup>"], ["1", "0", ",", "8", "8", "8", ",", "4", "9", "4"], ["2", ",", "3", "6", "8", ",", "8", "0", "6"], ["3", ",", "0", "6", "3", ",", "1", "6", "5"], ["4", ",", "0", "8", "9", ",", "1", "5", "3"], ["1", ",", "3", "6", "7", ",", "3", "7", "0"], ["F", "i", "n", "a", "n", "c", "i", "n", "g", " ", "A", "c", "t", "i", "v", "i", "t", "i", "e", "s", ":"], ["0"], [], [], [], [], ["L", "o", "n", "g", "-", "t", "e", "r", "m", " ", "d", "e", "b", "t", " ", "o", "b", "l", "i", "g", "a", "t", "i", "o", "n", "s", "<sup>", "(", "5", ")", "</sup>"], ["7", ",", "5", "0", "6", ",", "3", "1", "2"], ["1", ",", "1", "8", "5", ",", "0", "3", "8"], ["2", ",", "0", "4", "7", ",", "8", "8", "2"], ["2", ",", "0", "1", "2", ",", "9", "2", "2"], ["2", ",", "2", "6", "0", ",", "4", "7", "0"], ["C", "a", "p", "i", "t", "a", "l", " ", "l", "e", "a", "s", "e", " ", "o", "b", "l", "i", "g", "a", "t", "i", "o", "n", "s", "<sup>", "(", "6", ")", "</sup>"], ["3", "3", ",", "1", "3", "9"], ["3", ",", "4", "7", "6"], ["7", ",", "2", "1", "0"], ["8", ",", "3", "9", "5"], ["1", "4", ",", "0", "5", "8"], ["O", "t", "h", "e", "r", "<sup>", "(", "7", ")", "</sup>"], ["2", "1", ",", "5", "5", "2"], ["8", ",", "8", "6", "8"], ["1", "1", ",", "2", "1", "7"], ["1", ",", "4", "6", "7"], ["\u2014"], ["T", "o", "t", "a", "l"], ["$", "2", "0", ",", "8", "4", "5", ",", "5", "1", "7"], ["$", "4", ",", "0", "6", "0", ",", "6", "5", "2"], ["$", "5", ",", "8", "7", "1", ",", "2", "3", "5"], ["$", "6", ",", "5", "7", "7", ",", "2", "4", "9"], ["$", "4", ",", "3", "3", "6", ",", "3", "8", "1"]], "cells": [[0, 0, 0, 0], [0, 1, 0, 5], [1, 0, 1, 0], [1, 1, 1, 1], [1, 2, 1, 2], [1, 3, 1, 3], [1, 4, 1, 4], [1, 5, 1, 5], [2, 0, 2, 0], [2, 1, 2, 1], [2, 2, 2, 2], [2, 3, 2, 3], [2, 4, 2, 
4], [2, 5, 2, 5], [3, 0, 3, 0], [3, 1, 3, 1], [3, 2, 3, 2], [3, 3, 3, 3], [3, 4, 3, 4], [3, 5, 3, 5], [4, 0, 4, 0], [4, 1, 4, 1], [4, 2, 4, 2], [4, 3, 4, 3], [4, 4, 4, 4], [4, 5, 4, 5], [5, 0, 5, 0], [5, 1, 5, 1], [5, 2, 5, 2], [5, 3, 5, 3], [5, 4, 5, 4], [5, 5, 5, 5], [6, 0, 6, 0], [6, 1, 6, 1], [6, 2, 6, 2], [6, 3, 6, 3], [6, 4, 6, 4], [6, 5, 6, 5], [7, 0, 7, 0], [7, 1, 7, 1], [7, 2, 7, 2], [7, 3, 7, 3], [7, 4, 7, 4], [7, 5, 7, 5], [8, 0, 8, 0], [8, 1, 8, 1], [8, 2, 8, 2], [8, 3, 8, 3], [8, 4, 8, 4], [8, 5, 8, 5], [9, 0, 9, 0], [9, 1, 9, 1], [9, 2, 9, 2], [9, 3, 9, 3], [9, 4, 9, 4], [9, 5, 9, 5], [10, 0, 10, 0], [10, 1, 10, 1], [10, 2, 10, 2], [10, 3, 10, 3], [10, 4, 10, 4], [10, 5, 10, 5], [11, 0, 11, 0], [11, 1, 11, 1], [11, 2, 11, 2], [11, 3, 11, 3], [11, 4, 11, 4], [11, 5, 11, 5], [12, 0, 12, 0], [12, 1, 12, 1], [12, 2, 12, 2], [12, 3, 12, 3], [12, 4, 12, 4], [12, 5, 12, 5]], "labels": [[1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1]]}}}

But I couldn’t see anything wrong that would allow me to programmatically filter the bad samples out of the dataset. I think the issue arises in the DavarLoadAnnotations._poly2mask function, but I didn’t debug further than this.
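One thing that might be worth checking, purely as an untested guess, is whether the affected samples contain very small boxes: the sample above includes `[915, 512, 917, 516]`, which is only 2 pixels wide. The thread does not confirm this as the cause. The sketch below scans a datalist in the format shown above; `find_small_boxes` and the `min_side` threshold are hypothetical names and values chosen for illustration.

```python
def find_small_boxes(datalist, min_side=3):
    """List (image_name, index, bbox) for boxes narrower or shorter than min_side.

    Assumes the datalist layout shown in the sample: each image maps to a dict
    with content_ann -> bboxes, where a bbox is [x1, y1, x2, y2] or [] if empty.
    """
    flagged = []
    for image_name, ann in datalist.items():
        for idx, bbox in enumerate(ann["content_ann"]["bboxes"]):
            if len(bbox) != 4:
                continue  # empty cells are stored as []
            x1, y1, x2, y2 = bbox
            if (x2 - x1) < min_side or (y2 - y1) < min_side:
                flagged.append((image_name, idx, bbox))
    return flagged


# A three-box excerpt of the sample above, for illustration only.
excerpt = {
    "pdf/RCL/2017/page_60_95895.png": {
        "height": 888,
        "width": 2065,
        "content_ann": {
            "bboxes": [[], [1195, 11, 1529, 43], [915, 512, 917, 516]],
        },
    }
}

small = find_small_boxes(excerpt)
```

If the flagged samples turn out to match the ones that trigger the NaN, that would point to a usable filter; if not, the empty mask is being produced somewhere else.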