datapkg_to_sqlite fails to load all of EPA CEMS

After doing a full ETL of all years and states in CEMS, the datapkg_to_sqlite script doesn’t seem to load all of that data into the SQLite database. Rather, it only loads the last year of data into the database. However, the process terminates quickly, so it’s probably not even attempting to load all the data. Suspect it’s an issue with the iteration and/or partitioning…
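One way to confirm the symptom described above is to count the rows per year that actually landed in the SQLite database. The following is a minimal sketch using only Python's standard `sqlite3` module. The function names, and whatever table and column names you pass in, are hypothetical and are not taken from the PUDL code base; it also assumes the datetime column holds ISO 8601 text.

```python
import sqlite3


def rows_per_year(conn, table, datetime_col):
    """Return {year: row_count} for a table with an ISO 8601 datetime column.

    `table` and `datetime_col` are hypothetical names supplied by the caller.
    """
    # Identifiers cannot be bound as SQL parameters, so they are quoted inline.
    query = (
        f"SELECT strftime('%Y', \"{datetime_col}\") AS yr, COUNT(*) "
        f"FROM \"{table}\" GROUP BY yr ORDER BY yr"
    )
    counts = {}
    for year, count in conn.execute(query):
        if year is not None:  # skip rows whose datetime could not be parsed
            counts[int(year)] = count
    return counts


def missing_years(conn, table, datetime_col, expected_years):
    """Return the sorted list of expected years with no rows in the table."""
    found = rows_per_year(conn, table, datetime_col)
    return sorted(set(expected_years) - set(found))
```

If the ETL covered several years but `rows_per_year` reports only the last one, the load step, rather than the ETL itself, is the likely place to look.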

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

1 reaction
cmgosnell commented, Nov 20, 2019

Hey @roll! This was totally just our mistake, so don’t worry about it. We are making a bunch of data-source-specific data packages and then squishing them together into one package. I set up a process for determining how to generate a new data package without duplicating elements of the metadata, but I messed up and it was only grabbing one of the CEMS resources. It was a very simple fix once we figured out what was happening.
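The thread does not show the actual fix. As an illustration only, here is a sketch of combining the resources from several data package descriptors while de-duplicating by resource name, so that every partition in a group survives the merge. The descriptor layout (plain dicts with a `resources` list whose entries carry `name` and an optional `group` key) and both function names are assumptions, not PUDL's real code.

```python
def merge_resources(descriptors):
    """Combine resources from several descriptors, de-duplicating by name.

    Keying on the unique resource "name" keeps every partition. Keying on a
    shared "group" instead would keep only one resource per group.
    """
    merged = {}
    for descriptor in descriptors:
        for resource in descriptor.get("resources", []):
            # Keep the first occurrence of each uniquely named resource.
            merged.setdefault(resource["name"], resource)
    return list(merged.values())


def resources_by_group(resources):
    """Map each group to the names of its resources, for a quick sanity check."""
    groups = {}
    for resource in resources:
        # Ungrouped resources are listed under their own name.
        key = resource.get("group", resource["name"])
        groups.setdefault(key, []).append(resource["name"])
    return groups
```

Comparing the output of `resources_by_group` against the partitions the ETL produced would show whether any resource was dropped during the merge.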

0 reactions
zaneselvans commented, Nov 19, 2019

No, still getting the error with the most recent versions of the datapackage libraries. A very simple version with a couple of resources in a group seems to work as expected, but the simplest PUDL output that tests the behavior doesn’t work. I’m trying to simplify that resource group output one step at a time until I get to a minimal example to share with you.


