Describe the feature

dbt has the not_null test to monitor for null values, but in certain datasets blank values are also problematic. However, I found no not_blank test when searching utils or dbt-expectations.

Describe alternatives you’ve considered

Creating a custom test that errors if there are blanks.

Additional context

I think blank entries are a possibility on all databases.

Who will this benefit?

Anyone needing to check data integrity to ensure there are no blank values in their dataset.

Are you interested in contributing this feature?

Yes plz 🙏🏻

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 6 (4 by maintainers)

Top GitHub Comments

1 reaction
joellabes commented, Aug 8, 2022

@tigitz thanks for the feedback and the links to prior art!

> Which raises the question of how the rule should behave when encountering a blank array for databases that support them.

I would be surprised if we could validate both strings and arrays in the same test without knowing their type in advance (which I think is your point in recommending _string and _array variants?)

> My suggestion would be to be explicit in the rule's name and avoid grouping different kinds of tests into a single one using parameter flags. That would cause confusion.

Can you say more about the confusion element? I’m pretty sure I disagree with you on this one, at least on tests of the same column type (e.g. empty string vs blank string). I consider it to be analogous to the inclusive param on accepted_range, and think that it’d be tidier/less cognitive overhead to have a single test whose behaviour you tweak as opposed to having two almost identical tests with suspiciously similar names.

(Accidental proving of my point: I wrote a lot of this response with blank and empty the wrong way around, and only discovered my mistake when I scrolled back up to your original comment before posting).

What shall we name the ~baby~ test?

[Image: a meme that gets in the way of my point]

I would prefer that we go with not_empty_string, which conforms to the Java definition of empty (apart from the null stuff) and makes semantic sense at a glance. Then I’d have an optional parameter on it, trim_whitespace [1], which is false by default.

[1] In an earlier version, I suggested ignore_whitespace. Coming to this with fresh eyes, that’s unclear - what’s doing the ignoring and how does ignore manifest?
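
To make the proposed behaviour concrete, here is a minimal sketch in Python with SQLite. It is not the dbt implementation. The helper `failing_rows_sql`, the `customers` table, and the `name` column are hypothetical names made up for illustration. It only shows how a `trim_whitespace` flag that defaults to false could change which rows count as failures.

```python
import sqlite3


def failing_rows_sql(table: str, column: str, trim_whitespace: bool = False) -> str:
    """Build a query that returns the rows a not_empty_string-style test would flag.

    Hypothetical helper: identifiers are interpolated directly, which is fine
    for a sketch but not for untrusted input.
    """
    expr = f"trim({column})" if trim_whitespace else column
    return f"select {column} from {table} where {expr} = ''"


conn = sqlite3.connect(":memory:")
conn.execute("create table customers (name text)")
conn.executemany(
    "insert into customers (name) values (?)",
    [("alice",), ("",), ("   ",), (None,)],
)

# Default behaviour: only the truly empty string is flagged.
strict = conn.execute(failing_rows_sql("customers", "name")).fetchall()

# With trim_whitespace=True, the whitespace-only value is flagged as well.
trimmed = conn.execute(
    failing_rows_sql("customers", "name", trim_whitespace=True)
).fetchall()

print(strict)
print(sorted(trimmed))
```

Null values are not flagged in either mode, which matches the idea of leaving "the null stuff" to not_null. SQLite's `trim` strips spaces only, so tabs or newlines would need extra handling, and other databases may behave differently.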

Arrays

I could go either way on building a test to check whether arrays are empty or not - if either you or @epapineau wants to build one then I wouldn’t be against it, but I’m also happy to punt for a while.

0 reactions
epapineau commented, Aug 9, 2022

Thanks for all the incredible thought poured into this @joellabes and @tigitz. A quick first pass at the not_empty_string(trim_whitespace) version of the test is here.

Is there a preference on using length = 0 over column_name = ''?
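
As a rough way to compare the two predicates, the sketch below evaluates both against a few sample values in SQLite. This is an illustration under the assumption of SQLite semantics only. Other databases can differ (for example, some treat an empty string as null, or ignore trailing spaces when comparing), so the result should be checked against the target warehouse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# For each sample value, evaluate both candidate predicates.
# SQLite returns 1 for true, 0 for false, and None for SQL NULL.
results = {}
for value in ["", " ", "a", None]:
    by_length, by_equality = conn.execute(
        "select length(?) = 0, ? = ''", (value, value)
    ).fetchone()
    results[value] = (by_length, by_equality)

for value, outcome in results.items():
    print(repr(value), outcome)
```

In SQLite the two predicates agree on every sample here, including returning NULL for a null input, so the choice between them is mostly a portability question.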

I can add an array counterpart + tests to round out the PR in the next day or two 😃
