StatisticsGen does not recognize missing fields
For some reason, StatisticsGen is not recognizing the missing values in my file. I expected StatisticsGen to recognize the missing fields in the last records. When I generate the statistics using generate_statistics_from_csv directly, it works perfectly fine.
# Imports (context is an InteractiveContext and _data_root is the data directory, both defined earlier in the notebook)
from tfx.components import CsvExampleGen, StatisticsGen
from tfx.proto import example_gen_pb2

# Loading the Files
input_config = example_gen_pb2.Input(splits=[
    example_gen_pb2.Input.Split(name='train', pattern='train/*'),
    example_gen_pb2.Input.Split(name='eval', pattern='eval/*')
])
example_gen = CsvExampleGen(
    input_base=_data_root,
    input_config=input_config)
context.run(example_gen, enable_cache=False)

# Generating Stats
statistics_gen = StatisticsGen(examples=example_gen.outputs['examples'])
context.run(statistics_gen, enable_cache=False)

# Showing the Stats
context.show(statistics_gen.outputs['statistics'])
File being used:
index,inputSurname,label
0,BALLADARAS NATE RAE,0
1,LABRANCHE TRACIE SURIANO,0
2,VENTURES LLC TIERNAN RE,1
3,CHOU ABC,1
4,JENSEN DARREN RANEE,0
5,VANDERMOLEN DEBORA PATRICIA,0
6,ZAMBRANO YANGFANG SESE,0
7,IMAGE LLC DENTAL,1
8,OFFICE S BRUCE LAW,1
9,,
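To make the expectation concrete, here is a minimal sketch that counts the empty fields per column in the sample file above, using only the Python standard library. It is not how StatisticsGen or TFDV compute statistics; the helper name count_missing and the SAMPLE_CSV constant are hypothetical and introduced only for illustration.

```python
import csv
import io

# Sample file from the report above; the last record has two empty fields.
SAMPLE_CSV = """index,inputSurname,label
0,BALLADARAS NATE RAE,0
1,LABRANCHE TRACIE SURIANO,0
2,VENTURES LLC TIERNAN RE,1
3,CHOU ABC,1
4,JENSEN DARREN RANEE,0
5,VANDERMOLEN DEBORA PATRICIA,0
6,ZAMBRANO YANGFANG SESE,0
7,IMAGE LLC DENTAL,1
8,OFFICE S BRUCE LAW,1
9,,
"""


def count_missing(csv_text):
    """Return {column: number of rows whose value is empty or absent}.

    Hypothetical helper for illustration; not part of TFX or TFDV.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = {name: 0 for name in reader.fieldnames}
    for row in reader:
        for name in missing:
            value = row.get(name)
            # Treat both absent and whitespace-only values as missing.
            if value is None or value.strip() == "":
                missing[name] += 1
    return missing


missing_per_column = count_missing(SAMPLE_CSV)
print(missing_per_column)
```

For this file the count is one missing value each for inputSurname and label, and none for index, which is what the statistics view would be expected to report.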
Top GitHub Comments
I had the same problem, and this worked as a solution (I had only int and "missing" values):