Reorganize rule names

Search before asking

I searched the issues and found no similar issues.

Description

Let’s think about a nice way to rename rules/codes, so that we don’t have to look up what the rule code is every time we think about them 😵

Here’s an example of what pylint messages look like:

$ pylint etl.py 
************* Module etl
etl.py:20:61: C0303: Trailing whitespace (trailing-whitespace)
etl.py:1:0: C0114: Missing module docstring (missing-module-docstring)
etl.py:7:0: W0622: Redefining built-in 'compile' (redefined-builtin)
etl.py:7:0: E0401: Unable to import 'parse' (import-error)

------------------------------------------------------------------
Your code has been rated at 5.56/10 (previous run: 9.95/10, -4.40)

You can see that they have “common sense” names at the end of each linting message.

Here’s an example of what the current output looks like for sqlfluff lint:

$ sqlfluff lint models/output/
=== [dbt templater] Sorting Nodes...                                                       
=== [dbt templater] Compiling dbt project...                                               
=== [dbt templater] Project Compiled.                                                      
== [models/output/redcap_import.sql] FAIL                                                  
L:   6 | P:   5 | L019 | Found leading comma. Expected only trailing.                      
L:   6 | P:   6 | L008 | Commas should be followed by a single whitespace unless
                       | followed by a comment.
L:   9 | P:  40 | L012 | Implicit/explicit aliasing of columns.
L:  17 | P:   6 | L003 | Expected 1 indentations, found 1 [compared to line 16]

For the record, I think that on the whole our output is better organized and nicer to look at than pylint’s. But I think it would be greatly improved if we could implement common sense names like this:

$ sqlfluff lint models/output/
=== [dbt templater] Sorting Nodes...                                                       
=== [dbt templater] Compiling dbt project...                                               
=== [dbt templater] Project Compiled.                                                      
== [models/output/redcap_import.sql] FAIL                                                  
L:   6 | P:   5 | L019 | Found leading comma. Expected only trailing. (wrong-comma-style)                  
L:   6 | P:   6 | L008 | Commas should be followed by a single whitespace unless
                       | followed by a comment. (comma-missing-whitespace)
L:   9 | P:  40 | L012 | Implicit/explicit aliasing of columns. (wrong-column-alias-style)
L:  17 | P:   6 | L003 | Expected 1 indentations, found 1 [compared to line 16] (unmatched-indentation)
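
As a rough illustration of the proposed output, here is a minimal Python sketch that appends a symbolic name to a lint message. The names are the ones suggested in the example above; the `SYMBOLIC_NAMES` table and the `format_violation` helper are hypothetical and are not part of sqlfluff.

```python
# Hypothetical lookup table. The names are the ones proposed in the
# example output above; sqlfluff does not define them.
SYMBOLIC_NAMES = {
    "L003": "unmatched-indentation",
    "L008": "comma-missing-whitespace",
    "L012": "wrong-column-alias-style",
    "L019": "wrong-comma-style",
}


def format_violation(line, pos, code, description):
    """Format one lint message, appending the symbolic name when one is known."""
    name = SYMBOLIC_NAMES.get(code)
    suffix = f" ({name})" if name else ""
    return f"L:{line:4d} | P:{pos:4d} | {code} | {description}{suffix}"


print(format_violation(6, 5, "L019", "Found leading comma. Expected only trailing."))
```

Codes without a known name are printed unchanged, so the names could be introduced rule by rule.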

With this change, I could ideally also disable/enable rules by their readable names and/or their codes, such as:

select
  field_1,
field_2, --noqa: unmatched-indentation
from my_table
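
One way the inline directive could accept either form is sketched below in Python. This is not sqlfluff's actual noqa parser: the regular expression, the `NAME_TO_CODE` table and the `suppressed_codes` helper are assumptions made for illustration, and the names are the ones proposed earlier in this issue.

```python
import re

# Hypothetical mapping from the proposed readable names to existing codes.
NAME_TO_CODE = {
    "unmatched-indentation": "L003",
    "comma-missing-whitespace": "L008",
    "wrong-column-alias-style": "L012",
    "wrong-comma-style": "L019",
}

# Matches "--noqa: a, b" with optional whitespace after the dashes.
NOQA_PATTERN = re.compile(r"--\s*noqa:\s*(?P<rules>[\w\-, ]+)")


def suppressed_codes(sql_line):
    """Return the set of rule codes disabled by a noqa comment on this line."""
    match = NOQA_PATTERN.search(sql_line)
    if not match:
        return set()
    codes = set()
    for reference in match.group("rules").split(","):
        reference = reference.strip()
        if reference:
            # Readable names are translated; anything else is kept as a code.
            codes.add(NAME_TO_CODE.get(reference, reference))
    return codes


print(suppressed_codes("field_2, --noqa: unmatched-indentation"))
```

Normalising names to codes up front would let the rest of the linter keep comparing codes only.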

You’ll also notice from the above example that pylint codes have different prefixes (W0622, E0401, etc.).

Here are their definitions for those, but basically E = Error, W = Warning, R = Refactor, C = Convention.

And this brings us to the second part of this issue: what the codes should be. I think we can follow pylint’s example, and break up the rules into categories. Here are a few to start:

R = readability

Operators should follow a standard for being before/after newlines (L007 --> R001)
Inconsistent capitalisation of keywords (L010 --> R002)

C = Convention (AKA Best Practices)

Implicit/explicit aliasing of table (L011 --> C001)
Table aliases should be unique within each clause (L020 --> C002)
Trailing commas within select clause. (L038 --> C003)

W = Whitespace

Indentation not consistent with previous lines (L003 --> W001)
Operators should be surrounded by a single whitespace (L006 --> W002)

D = dialect specific

SP_ prefix should not be used for user-defined stored procedures in T-SQL. (L056 --> D001)
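
To make the category idea concrete, here is a small Python sketch of how a prefixed code could carry its category. The prefixes and the example renames are taken from the lists above (the Convention renames are left out of the table); the `CATEGORIES` and `OLD_TO_NEW` tables and the `describe` helper are hypothetical.

```python
# Category prefixes as proposed in the lists above.
CATEGORIES = {
    "R": "Readability",
    "C": "Convention",
    "W": "Whitespace",
    "D": "Dialect specific",
}

# A few of the renames suggested above, used only as examples.
OLD_TO_NEW = {
    "L007": "R001",
    "L010": "R002",
    "L003": "W001",
    "L006": "W002",
    "L056": "D001",
}


def describe(old_code):
    """Show the proposed new code and its category for an existing code."""
    new_code = OLD_TO_NEW[old_code]
    category = CATEGORIES[new_code[0]]  # the first letter selects the category
    return f"{old_code} -> {new_code} ({category})"


print(describe("L056"))
```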

Use case

Give all the rules “symbolic” names or “short” names
Include these in the lint messages
Allow these to be disabled in the config by short name and/or rule code
Consider if rules should be disabled by default (I’m looking at you, L052. And yes, I’m not going to say what rule that is here in order to prove a point on why this is an important change to make 😜)
Should we more tightly integrate these with rule groups? Should these replace rule groups, and all just become implicit rule groups?

Dialect

All

Are you willing to work on and submit a PR to address the issue?

Yes I am willing to submit a PR!

Code of Conduct

I agree to follow this project’s Code of Conduct

Issue Analytics

State:
Created 10 months ago
Comments:16 (13 by maintainers)

Top GitHub Comments

tunetheweb commented, Nov 7, 2022

Another question for you – do we continue to support the old LXXX syntax in the 2.0.0 release? If so, for how long? Or do we just do a clean break from it? We can just spit out a warning that there are unused config values; it doesn’t have to stop the linting from actually running (the same way as if I had --noqa: L100 in the code).

A lot of linters (pylint included?) support both. I don’t see why we wouldn’t do the same? Maybe for 3.0.0 we drop the older ones, but I wouldn’t for 2.0.0 - that would be quite a big change IMHO.
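
Supporting both spellings for a release or two could look roughly like the Python sketch below, which resolves an old code to its replacement and emits a deprecation warning without stopping the run. The `LEGACY_ALIASES` table reuses example renames from the issue description and `resolve_code` is a hypothetical helper, not an existing sqlfluff function.

```python
import warnings

# Hypothetical alias table, using renames suggested in the issue description.
LEGACY_ALIASES = {
    "L003": "W001",
    "L006": "W002",
}


def resolve_code(code):
    """Return the current code, warning (not failing) when an old one is used."""
    if code in LEGACY_ALIASES:
        new_code = LEGACY_ALIASES[code]
        warnings.warn(
            f"Rule code {code} is deprecated; use {new_code} instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return new_code
    return code


print(resolve_code("W002"))
```

Dropping the old codes in a later major release would then amount to emptying the alias table.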

pwildenhain commented, Dec 14, 2022

Been looking at the code today and getting some initial thoughts down

New selection syntax should support

Rule codes (LS001)
Old rule codes (L001)
Rule names (layout.spacing)
Rule namespaces (layout)
Rule groups (core, format)

And it should support nested configs, meaning if we have a .sqlfluff file in the project root AND in one of the child directories, then we carry over selected rules from the root directory and further include/exclude rules based off configuration in the child directory.

On a separate note, I don’t think that nesting rules/exclude_rules in the .sqlfluff currently works super well – though I might be approaching it the wrong way. I can talk more about this at the next maintainers meeting

Here’s one potential approach:

We’ll still keep rules and exclude_rules config values.

We support comma separated selection/exclusion e.g. core,LB005,layout,capitalisation.literals

Before any filtering is done, the selection input is expanded into a list of individual rule codes e.g. "layout.spacing,L009" --> [LS001,LS002,LS003,LB001]

It then replaces any selection configuration that came before it (nested configs)

Now we have our final rules selection

Follow the same process for exclude_rules (including exclusion configuration that comes before it)

Then filter the rules being included against the rules being excluded, and you have your rules to run for a given file(s).
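
The steps above can be sketched in Python as follows. This is only one reading of the proposal: the metadata in `RULES` is hypothetical (only the expansions "layout.spacing,L009" to LS001, LS002, LS003, LB001 and L005 to LS001 come from this comment), and it assumes that for both rules and exclude_rules the innermost config that sets the key replaces the outer ones.

```python
# Hypothetical rule metadata. The groups and the LB001 name are made up;
# LS001 covers more old codes than the single one listed here.
RULES = [
    {"code": "LS001", "name": "layout.spacing", "groups": ["core"], "legacy": ["L005"]},
    {"code": "LS002", "name": "layout.spacing", "groups": ["core"], "legacy": []},
    {"code": "LS003", "name": "layout.spacing", "groups": [], "legacy": []},
    {"code": "LB001", "name": "layout.end_of_file", "groups": ["core"], "legacy": ["L009"]},
]


def selectors_for(rule):
    """Every string that may be used to select this rule."""
    namespace = rule["name"].split(".")[0]
    return {rule["code"], rule["name"], namespace, *rule["groups"], *rule["legacy"]}


def expand(selection):
    """Expand a comma separated selection into an ordered list of rule codes."""
    tokens = {token.strip() for token in selection.split(",") if token.strip()}
    return [rule["code"] for rule in RULES if tokens & selectors_for(rule)]


def last_defined(config_chain, key, default):
    """Return the value of `key` from the innermost config that sets it."""
    value = default
    for config in config_chain:  # ordered from project root to child directory
        if key in config:
            value = config[key]
    return value


def rules_to_run(config_chain):
    """Apply selection, then exclusion, for a chain of nested configs."""
    all_codes = ",".join(rule["code"] for rule in RULES)
    selected = expand(last_defined(config_chain, "rules", all_codes))
    excluded = set(expand(last_defined(config_chain, "exclude_rules", "")))
    return [code for code in selected if code not in excluded]


print(expand("layout.spacing,L009"))
print(rules_to_run([{"rules": "core"}, {"exclude_rules": "L009"}]))
```

Because every selector is expanded to codes before filtering, the final include/exclude step is a plain comparison of codes, whichever way the user spelled the selection.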

I think in order to do proper expansion from criteria to individual rules we’ll need to collect some sort of rule metadata at the beginning of linting that we can reference while doing the selection (similar to how dbt looks at the manifest.json to help with model selection). It might even be smart to follow their lead and include a manifest when sqlfluff is installed that has all this metadata. We could auto-generate it whenever rules are added/re-categorized, and include it as an asset in the package.
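
A generated manifest might look something like the Python sketch below, which builds a selector index from rule metadata and serialises it to JSON. The metadata, the manifest layout and the `build_manifest` / `dump_manifest` helpers are all hypothetical; nothing here reflects an existing sqlfluff file format.

```python
import json

# Hypothetical sample metadata; a real manifest would be generated from
# the rule classes whenever rules are added or re-categorized.
RULES = [
    {"code": "LS001", "name": "layout.spacing", "groups": ["core"], "legacy": ["L005"]},
    {"code": "LB001", "name": "layout.end_of_file", "groups": ["core"], "legacy": ["L009"]},
]


def build_manifest(rules):
    """Index every selector (code, name, namespace, group, old code) to rule codes."""
    index = {}
    for rule in rules:
        namespace = rule["name"].split(".")[0]
        selectors = {rule["code"], rule["name"], namespace, *rule["groups"], *rule["legacy"]}
        for selector in sorted(selectors):
            index.setdefault(selector, []).append(rule["code"])
    return {"rules": {rule["code"]: rule for rule in rules}, "index": index}


def dump_manifest(rules):
    """Serialise the manifest so it can be shipped as a package asset."""
    return json.dumps(build_manifest(rules), indent=2, sort_keys=True)


print(dump_manifest(RULES))
```

With such an index, expanding a selection becomes a dictionary lookup per selector instead of a scan over every rule.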

Lastly, this approach would no longer allow as much fine-tuning. For example, under this proposal if someone only wants to run L005, it’s going to be re-mapped to LS001, which actually includes L001,L005,L006,L008,L015,L017,L023,L024,L039,L048,L050. I think this is actually a big improvement, though it is a breaking change. But if someone only wants L005 and not all these other rules, they’ll be disappointed.

I know this was a little rambly, but I’m happy to take any feedback
