Reorganize rule names

Search before asking

  • I searched the issues and found no similar issues.

Description

Let’s think about a nice way to rename rules/codes, so that we don’t have to look up what the rule code is every time we think about them 😵

Here’s an example of what pylint messages look like:

$ pylint etl.py 
************* Module etl
etl.py:20:61: C0303: Trailing whitespace (trailing-whitespace)
etl.py:1:0: C0114: Missing module docstring (missing-module-docstring)
etl.py:7:0: W0622: Redefining built-in 'compile' (redefined-builtin)
etl.py:7:0: E0401: Unable to import 'parse' (import-error)

------------------------------------------------------------------
Your code has been rated at 5.56/10 (previous run: 9.95/10, -4.40)

You can see that they have “common sense” names at the end of each linting message.

Here’s an example of what the current output looks like for sqlfluff lint:

$ sqlfluff lint models/output/
=== [dbt templater] Sorting Nodes...                                                       
=== [dbt templater] Compiling dbt project...                                               
=== [dbt templater] Project Compiled.                                                      
== [models/output/redcap_import.sql] FAIL                                                  
L:   6 | P:   5 | L019 | Found leading comma. Expected only trailing.                      
L:   6 | P:   6 | L008 | Commas should be followed by a single whitespace unless
                       | followed by a comment.
L:   9 | P:  40 | L012 | Implicit/explicit aliasing of columns.
L:  17 | P:   6 | L003 | Expected 1 indentations, found 1 [compared to line 16]

For the record, I think that on the whole our output is better organized and nicer to look at than pylint’s. But I think it would be greatly improved if we could implement common sense names like this:

$ sqlfluff lint models/output/
=== [dbt templater] Sorting Nodes...                                                       
=== [dbt templater] Compiling dbt project...                                               
=== [dbt templater] Project Compiled.                                                      
== [models/output/redcap_import.sql] FAIL                                                  
L:   6 | P:   5 | L019 | Found leading comma. Expected only trailing. (wrong-comma-style)                  
L:   6 | P:   6 | L008 | Commas should be followed by a single whitespace unless
                       | followed by a comment. (comma-missing-whitespace)
L:   9 | P:  40 | L012 | Implicit/explicit aliasing of columns. (wrong-column-alias-style)
L:  17 | P:   6 | L003 | Expected 1 indentations, found 1 [compared to line 16] (unmatched-indentation)

With this change, I could ideally also disable/enable rules according to their readable names and/or their codes, such as

select
  field_1,
field_2, --noqa: unmatched-indentation
from my_table
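
A minimal sketch of how an inline noqa comment could accept either form. Everything here is hypothetical: `NAME_TO_CODE`, `NOQA_PATTERN` and `parse_noqa` are illustrative names rather than sqlfluff's API, and the symbolic names are only the ones proposed above.

```python
import re

# Hypothetical lookup from symbolic name to rule code. The names are the ones
# proposed in this issue, not anything sqlfluff ships.
NAME_TO_CODE = {
    "wrong-comma-style": "L019",
    "comma-missing-whitespace": "L008",
    "wrong-column-alias-style": "L012",
    "unmatched-indentation": "L003",
}

# Matches "--noqa: a, b" and captures the comma-separated rule references.
NOQA_PATTERN = re.compile(r"--\s*noqa\s*:\s*(?P<rules>[\w\-, ]+)", re.IGNORECASE)


def parse_noqa(line: str) -> set[str]:
    """Return the rule codes disabled by an inline noqa comment on this line."""
    match = NOQA_PATTERN.search(line)
    if match is None:
        return set()
    codes = set()
    for token in match.group("rules").split(","):
        token = token.strip()
        if token:
            # A known symbolic name becomes its code; anything else is kept
            # as-is and treated as a raw rule code.
            codes.add(NAME_TO_CODE.get(token, token))
    return codes
```

Normalising names to codes at parse time means the rest of the linter would only ever compare codes, so the two spellings cannot drift apart.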

You’ll also notice from the above example that pylint codes have different prefixes (W0622, E0401, etc)

Here are their definitions for those, but basically: E = Error, W = Warning, R = Refactor, C = Convention

And this brings us to the second part of this issue: what the codes should be. I think we can follow pylint’s example, and break up the rules into categories. Here are a few to start:

R = readability

  • Operators should follow a standard for being before/after newlines (L007 --> R001)
  • Inconsistent capitalisation of keywords (L010 --> R002)

C = Convention (AKA Best Practices)

  • Implicit/explicit aliasing of table (L011 --> C001)
  • Table aliases should be unique within each clause (L020 --> C002)
  • Trailing commas within select clause. (L038 --> C003)

W = Whitespace

  • Indentation not consistent with previous lines (L003 --> W001)
  • Operators should be surrounded by a single whitespace (L006 --> W002)

D = dialect specific

  • SP_ prefix should not be used for user-defined stored procedures in T-SQL. (L056 --> D001)

Use case

  • Give all the rules “symbolic” names or “short” names
  • Include these in the lint messages
  • Allow these to be disabled in the config by short name and/or rule code
  • Consider if rules should be disabled by default (I’m looking at you L052. And yes I’m not going to say what rule that is here in order to prove a point on why this is an important change to make 😜 )
  • Should we more tightly integrate these with rule groups? Should these replace rule groups, and all just become implicit rule groups?

Dialect

All

Are you willing to work on and submit a PR to address the issue?

  • Yes I am willing to submit a PR!

Issue Analytics

  • State: open
  • Created 10 months ago
  • Comments: 16 (13 by maintainers)

Top GitHub Comments

tunetheweb commented, Nov 7, 2022 (1 reaction)

Another question for you – do we continue to support the old LXXX syntax in the 2.0.0 release? If so, for how long? Or do we just do a clean break from it? We can just spit out a warning that there are unused config values; it doesn’t have to stop the linting from actually running (the same way as if I had --noqa: L100 in the code).

A lot of linters (pylint included?) support both. I don’t see why we wouldn’t do the same? Maybe for 3.0.0 we drop the older ones, but I wouldn’t for 2.0.0 - that would be quite a big change IMHO.
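
As a rough illustration of supporting both for a transition period, old codes could be resolved to new ones with a warning instead of an error. This is a sketch under assumptions: `LEGACY_ALIASES` and `resolve_rule_code` are hypothetical names, and the mapping reuses the example codes from the issue description rather than any agreed scheme.

```python
import warnings

# Hypothetical old-to-new mapping, taken from the examples in the issue
# description (L003 --> W001 and so on). Not an agreed naming scheme.
LEGACY_ALIASES = {
    "L003": "W001",
    "L006": "W002",
    "L007": "R001",
    "L010": "R002",
}


def resolve_rule_code(code: str) -> str:
    """Map a legacy code to its new code, warning rather than failing."""
    new_code = LEGACY_ALIASES.get(code)
    if new_code is None:
        # Already a new-style code (or unknown): pass it through unchanged.
        return code
    warnings.warn(
        f"Rule code {code} is deprecated; use {new_code} instead.",
        DeprecationWarning,
        stacklevel=2,
    )
    return new_code
```

Dropping the old codes in a later major release would then amount to emptying the alias table.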

pwildenhain commented, Dec 14, 2022 (0 reactions)

Been looking at the code today and getting some initial thoughts down

New selection syntax should support

  • Rule codes (LS001)
  • Old rule codes (L001)
  • Rule names (layout.spacing)
  • Rule namespaces (layout)
  • Rule groups (core, format)

And it should support nested configs, meaning if we have a .sqlfluff file in the project root AND in one of the child directories, then we carry over selected rules from the root directory and further include/exclude rules based off configuration in the child directory.

On a separate note, I don’t think that nesting rules/exclude_rules in the .sqlfluff currently works super well – though I might be approaching it the wrong way. I can talk more about this at the next maintainers meeting

Here’s one potential approach:

We’ll still keep rules and exclude_rules config values.

We support comma-separated selection/exclusion e.g. core,LB005,layout,capitalisation.literals

Before any filtering is done, the selection input is expanded into a list of rule codes e.g. "layout.spacing,L009" --> [LS001,LS002,LS003,LB001]

It then replaces any selection configuration that came before it (nested configs)

Now we have our final rules selection

Follow the same process for exclude_rules (including exclusion configuration that comes before it)

Then filter the rules being included against the rules being excluded, and you have your rules to run for a given file(s).
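
The steps above could be sketched roughly as follows. This is not sqlfluff code: `RuleMeta`, `RULES`, `expand` and `rules_to_run` are hypothetical, the rule metadata is invented for illustration, and the sketch assumes that a child config's `rules` or `exclude_rules` value replaces the parent's outright.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RuleMeta:
    code: str                # new-style code, e.g. "LS001"
    name: str                # dotted name, e.g. "layout.spacing"
    groups: frozenset        # rule groups, e.g. {"core"}
    legacy_codes: frozenset  # old codes folded into this rule

    def matches(self, selector: str) -> bool:
        """True if the selector is this rule's code, name, namespace, group or old code."""
        namespace = self.name.split(".")[0]
        return selector in {self.code, self.name, namespace} | self.groups | self.legacy_codes


# Invented metadata for illustration only.
RULES = (
    RuleMeta("LS001", "layout.spacing", frozenset({"core"}), frozenset({"L005", "L006"})),
    RuleMeta("LB001", "layout.end_of_file", frozenset({"core"}), frozenset({"L009"})),
    RuleMeta("CP001", "capitalisation.literals", frozenset({"format"}), frozenset()),
)


def expand(selection: str) -> list[str]:
    """Expand a comma-separated selection into concrete rule codes."""
    selectors = [s.strip() for s in selection.split(",") if s.strip()]
    return [rule.code for rule in RULES if any(rule.matches(s) for s in selectors)]


def rules_to_run(config_chain: list[dict]) -> list[str]:
    """Resolve the rules for a file from its configs, ordered root first."""
    rules, exclude = "", ""
    for config in config_chain:
        # A value set in a child config replaces the one inherited so far.
        rules = config.get("rules", rules)
        exclude = config.get("exclude_rules", exclude)
    excluded = set(expand(exclude))
    return [code for code in expand(rules) if code not in excluded]
```

For example, `rules_to_run([{"rules": "core"}, {"exclude_rules": "L009"}])` selects the `core` group from the root config and then drops the rule that the child directory excludes by its old code.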

I think in order to do proper expansion from criteria to individual rules we’ll need to collect some sort of rule metadata at the beginning of linting that we can reference while doing the selection (similar to how dbt looks at the manifest.json to help with model selection). It might even be smart to follow their lead and include a manifest when sqlfluff is installed that has all this metadata. We could auto-generate it whenever rules are added/re-categorized, and include it as an asset in the package.
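
One possible shape for such a manifest is an inverted index from every selector (code, name, namespace, group, old code) to the rule codes it expands to. The sketch below is an assumption about the format, not an existing sqlfluff file; `build_manifest` and the rule dictionaries are hypothetical.

```python
import json
from collections import defaultdict


def build_manifest(rules: list[dict]) -> str:
    """Serialise a selector -> rule codes index as JSON."""
    index = defaultdict(set)
    for rule in rules:
        selectors = {
            rule["code"],
            rule["name"],
            rule["name"].split(".")[0],  # namespace, e.g. "layout"
            *rule["groups"],
            *rule["legacy_codes"],
        }
        for selector in selectors:
            index[selector].add(rule["code"])
    # Sort keys and values so regenerating the file gives a stable diff.
    return json.dumps({key: sorted(codes) for key, codes in sorted(index.items())}, indent=2)
```

With an index like this, expanding a selection at lint time becomes a dictionary lookup per selector instead of a scan over every rule.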

Lastly, this approach would no longer allow as much fine-tuning. For example, under this proposal if someone only wants to run L005, it’s going to be re-mapped to LS001, which actually includes L001, L005, L006, L008, L015, L017, L023, L024, L039, L048 and L050. I think this is actually a big improvement, though it is a breaking change. But if someone only wants L005 and not all these other rules, they’ll be disappointed.

I know this was a little rambly, but I’m happy to take any feedback
