SodaCL metadata/table discovery

soda-core will generate and push table information such as table names, columns, and database types.

The original proposition looks like this:

profiling basic:
  tables:
    - SODATEST_%
    - include SODATEST_%
    - exclude SODATEST_%
  schema: enabled

We are, however, thinking about another top-level name, based on a more explicit proposal from @janet-can, along the lines of:

discover tables:

I assume the rest of the controls are going to stay the same, meaning that users should configure column profiling via this canonical SodaCL spec:

discover tables:
  tables:
    - SODATEST_%
    - include SODATEST_%
    - exclude SODATEST_%
  schema: enabled

@tombaeyens can you confirm this makes sense language-wise? Also, can you explain the intention behind the schema key? What does it control?
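
For illustration, here is a rough Python sketch of how the include/exclude entries in the spec above might be resolved against a list of table names. This is not soda-core's implementation. The function names (like_to_regex, parse_entries, select_tables) are hypothetical, and the semantics are assumptions: a bare pattern counts as an include, % matches any run of characters as in SQL LIKE, _ is treated literally, matching is case-sensitive, and an exclude always wins over an include.

```python
import re


def like_to_regex(pattern: str) -> re.Pattern:
    """Translate a LIKE-style pattern into a regex.

    Assumption: only "%" is a wildcard (any run of characters);
    every other character, including "_", is matched literally.
    """
    parts = (re.escape(part) for part in pattern.split("%"))
    return re.compile(".*".join(parts))


def parse_entries(entries):
    """Split spec entries into include and exclude patterns.

    Assumption: an entry without a keyword is treated as an include.
    """
    includes, excludes = [], []
    for entry in entries:
        words = entry.split(None, 1)
        if len(words) == 2 and words[0] == "exclude":
            excludes.append(like_to_regex(words[1]))
        elif len(words) == 2 and words[0] == "include":
            includes.append(like_to_regex(words[1]))
        else:
            includes.append(like_to_regex(entry.strip()))
    return includes, excludes


def select_tables(table_names, entries):
    """Keep tables matching at least one include and no exclude."""
    includes, excludes = parse_entries(entries)
    return [
        name
        for name in table_names
        if any(p.fullmatch(name) for p in includes)
        and not any(p.fullmatch(name) for p in excludes)
    ]


# Hypothetical table names, for illustration only.
tables = ["SODATEST_CUSTOMERS", "SODATEST_TMP_LOAD", "ORDERS"]
entries = ["SODATEST_%", "exclude SODATEST_TMP_%"]
selected = select_tables(tables, entries)
```

Under these assumptions, listing the same pattern as both an include and an exclude (as in the spec above) selects nothing, which is one reason the precedence rule would be worth spelling out in the docs.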

Issue Analytics

State:
Created a year ago
Comments:6 (6 by maintainers)

Top GitHub Comments

tombaeyens commented, Apr 6, 2022

Agreed. Blessing given for naming the SodaCL top-level entry discover tables:

tombaeyens commented, Apr 19, 2022

@mathissedestrooper @bastienboutonnet collecting table samples is indeed also done at the table level, just like table discovery. But I think it’s important that the sets of tables are distinct for those 2 things. I’ll explain:

Querying a data source for all its tables and ensuring that all tables are available in the Soda Cloud UI is something that you typically want for all tables. There is no real performance problem.

Capturing samples for all tables is much more demanding in terms of compute and storage requirements. So that is something that you want to limit to a specific subset of tables. At least that is my guess.

wdyt?