get_query_table_aliases(query) is far from being final


While testing on Oracle SQL, I found that the function:

  • missed LEFT JOINs (this was easy to add to the code)
  • missed table names prefixed with a schema name (SCHEMA.TABLE)
  • missed aliases without AS (this is mentioned in the code but not handled)
  • did not recognize subselects (e.g. FROM (SELECT … ) )

I changed the code to overcome these issues and found solutions for them. I tested it with a huge SQL statement, but it may need some more testing.

New code:

from typing import Dict

from sqlparse.tokens import Name

# get_query_tokens() is the existing token generator from this module


def get_query_table_aliases(query: str) -> Dict[str, str]:
    """
    Returns tables aliases mapping from a given query

    E.g. SELECT a.* FROM users1 AS a JOIN users2 AS b ON a.ip_address = b.ip_address
    will give you {'a': 'users1', 'b': 'users2'}
    """
    aliases = dict()
    last_keyword_token = None
    last_table_name = None
    prev_token = None

    for token in get_query_tokens(query):
        if last_table_name:
            # elif chain: only one branch may fire per token, otherwise an
            # alias stored after AS is overwritten by the Name branch below
            if token.value == '.':
                # schema separator (SCHEMA.TABLE): add the dot
                last_table_name = last_table_name + token.value
            elif token.value == ',' or (token.is_keyword and token.value.upper() != 'AS'):
                # there is no alias; the table is recorded under the empty key
                aliases[''] = last_table_name
                last_table_name = None
            elif prev_token.value.upper() == 'AS':
                # previous keyword was AS, so this token is the alias
                aliases[token.value] = last_table_name
                last_table_name = None
            elif token.ttype is Name:
                if prev_token.value == '.':
                    # add the name after the dot to last_table_name
                    last_table_name = last_table_name + token.value
                else:
                    # "FROM foo alias" syntax (the AS keyword is missing)
                    aliases[token.value] = last_table_name
                    last_table_name = None

        # a table name follows FROM / JOIN unless a subselect starts with "("
        if last_keyword_token:
            if last_keyword_token.value.upper() in ["FROM", "JOIN", "INNER JOIN", "LEFT JOIN"] and token.value != '(':
                last_table_name = token.value

        last_keyword_token = token if token.is_keyword else None
        prev_token = token

    return aliases
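
The four cases listed in the issue can also be pinned down with a small standalone sketch. It is not the sql_metadata implementation and does not use sqlparse: `tokenize`, `is_name`, `table_aliases` and the `KEYWORDS` set are hypothetical helpers made up for illustration, and the regex tokenizer ignores string literals, comments and comma-separated FROM lists. It only shows the intended outcome for LEFT JOIN, SCHEMA.TABLE, aliases without AS, and subselects.

```python
import re
from typing import Dict, List

# Hypothetical, simplified tokenizer: identifiers, dots, commas, parentheses.
TOKEN_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_$#]*|[.,()]")

# Assumed keyword list, just large enough for the examples below.
KEYWORDS = {
    "SELECT", "FROM", "JOIN", "LEFT", "RIGHT", "INNER", "OUTER", "FULL",
    "CROSS", "ON", "WHERE", "AS", "GROUP", "ORDER", "BY", "AND", "OR",
}


def tokenize(query: str) -> List[str]:
    """Split a query into the few token kinds this sketch cares about."""
    return TOKEN_RE.findall(query)


def is_name(token: str) -> bool:
    """True for an identifier that is not one of the assumed keywords."""
    return (token[0].isalpha() or token[0] == "_") and token.upper() not in KEYWORDS


def table_aliases(query: str) -> Dict[str, str]:
    """Map alias -> table name for FROM/JOIN clauses (illustrative only)."""
    aliases: Dict[str, str] = {}
    tokens = tokenize(query)
    i = 0
    while i < len(tokens):
        # "LEFT JOIN", "INNER JOIN", ... all end in JOIN, so JOIN is enough.
        if tokens[i].upper() not in ("FROM", "JOIN"):
            i += 1
            continue
        i += 1
        # Subselect ("FROM (SELECT ...") or a stray keyword: no table name here.
        if i >= len(tokens) or not is_name(tokens[i]):
            continue
        # Table name, optionally schema-qualified (SCHEMA.TABLE).
        name = tokens[i]
        i += 1
        while i + 1 < len(tokens) and tokens[i] == ".":
            name += "." + tokens[i + 1]
            i += 2
        # The AS keyword is optional.
        if i < len(tokens) and tokens[i].upper() == "AS":
            i += 1
        # Whatever identifier follows is the alias; a keyword means no alias.
        if i < len(tokens) and is_name(tokens[i]):
            aliases[tokens[i]] = name
            i += 1
    return aliases


print(table_aliases(
    "SELECT * FROM SCHEMA.CRM_LOAN L "
    "LEFT JOIN SCHEMA.CRM_PRODUCT P ON P.PRODUCT = L.LOAN_CODE2"
))
# {'L': 'SCHEMA.CRM_LOAN', 'P': 'SCHEMA.CRM_PRODUCT'}
```

In this sketch an unaliased table is simply left out of the mapping, and the alias of a subselect is not recorded; only the tables inside it are.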


Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 7 (4 by maintainers)

Top GitHub Comments

bandrisca commented, Feb 19, 2021

The SQL I pasted in looks awful, sorry. I also tried pasting it in as ‘code’.

When I ran the modified module on that SQL, I got this dictionary:

{'C': 'EBH_DM_CRM.CRM_CUSTOMER', 'RB_ACCT': 'KMR_FINANCIAL_INDEX_STAGE_01', 'OD_FAC': 'KMR_FINANCIAL_INDEX_STAGE_07', 'CARD': 'KMR_FINANCIAL_INDEX_STAGE_02', 'CL_LOAN': 'KMR_FINANCIAL_INDEX_STAGE_03', 'CL_INV': 'KMR_FINANCIAL_INDEX_STAGE_05', 'BEF_ZRT': 'KMR_FINANCIAL_INDEX_STAGE_04', 'AR': 'SCHEMA.CRM_ARRANGEMENT', 'L': 'SCHEMA.CRM_LOAN', 'CF': 'SCHEMA.CRM_CASH_FLOW', 'CA': 'SCHEMA.CRM_REL_CUSTOMER_ARRANGEMENT', 'LPRE': 'SCHEMA.CRM_LOAN', 'P': 'SCHEMA.CRM_PRODUCT', 'PD': 'SCHEMA.CRM_PAST_DUE', 'DAT': 'SCHEMA.CRM_DATE', 'BLL': 'SCHEMA.CRM_BALANCE_LOAN_LOAN', 'E': 'SCHEMA.CRM_EXCHANGE_RATE', 'LE': 'SCHEMA.CRM_LEASING', 'BLE': 'SCHEMA.CRM_BALANCE_LEASING', 'SCB': 'SCHEMA.CRM_RL_SUBSIDIARY_COMPANIE_BAT', 'SZCH': 'KMR_FINANCIAL_INDEX_STAGE_06'}

I checked it, and it looks correct.

This was the test .py file:

import sqlparse
import sql_metadata2  # the modified module

sql_filename = 'test.sql'
with open(sql_filename, 'r', encoding='latin-1') as content_file:
    content = content_file.read()
# comments need to be removed first
content = sqlparse.format(content, strip_comments=True).strip()
print(content)
print(sql_metadata2.get_query_table_aliases(content))

bandrisca commented, Feb 19, 2021

@macbre, I added a pull request. I don’t know where to share the test SQL, so let me paste it here.

SELECT *

FROM EBH_DM_CRM.CRM_CUSTOMER C

LEFT JOIN KMR_FINANCIAL_INDEX_STAGE_01 RB_ACCT ON RB_ACCT.CUSTOMER_ID = C.CUSTOMER_ID

LEFT JOIN KMR_FINANCIAL_INDEX_STAGE_07 OD_FAC ON OD_FAC.CUSTOMER_ID = C.CUSTOMER_ID

LEFT JOIN KMR_FINANCIAL_INDEX_STAGE_02 CARD ON CARD.CUSTOMER_ID = C.CUSTOMER_ID

LEFT JOIN KMR_FINANCIAL_INDEX_STAGE_03 CL_LOAN ON CL_LOAN.CUSTOMER_ID = C.CUSTOMER_ID

LEFT JOIN KMR_FINANCIAL_INDEX_STAGE_05 CL_INV ON CL_INV.CUSTOMER_ID = C.CUSTOMER_ID

LEFT JOIN KMR_FINANCIAL_INDEX_STAGE_04 BEF_ZRT ON BEF_ZRT.CUSTOMER_SYMBOLS_SID = C.SYMBOLS_CUSTOMER_SID

LEFT JOIN (SELECT CA.CUSTOMER_ID, SUM(CASE WHEN P.KMR_TYPE_GROUP IN ('CRED_MORTG', 'CRED_HOUSE_NORM', 'CRED_HOUSE_SUBV') AND LPRE.RELATED_LOAN IS NULL AND L.ROTATION_CODE = 'N' AND CF.INVOICE_TYPE = 'PRI' THEN CF.PAID_AMT ELSE 0 END) AS FIN_CREDIT_MORTGAGE_AP_AMT, SUM(CASE WHEN P.KMR_TYPE_GROUP IN ('CRED_PERS', 'CRED_OTHER_LOMB', 'CRED_OTHER_GAR', 'CRED_OTHER_GAR_FRPR', 'CRED_OTHER_INVEST', 'CRED_OTHER_CURR', 'CRED_OTHER_SUBVINV', 'CRED_OTHER_OTHER') AND LPRE.RELATED_LOAN IS NULL AND L.ROTATION_CODE = 'N' AND CF.INVOICE_TYPE = 'PRI' THEN CF.PAID_AMT ELSE 0 END) AS FIN_CREDIT_SHORT_AP_AMT FROM SCHEMA.CRM_ARRANGEMENT AR

           JOIN SCHEMA.CRM_LOAN L
             ON L.LOAN_ID = AR.LOAN_ID
            AND L.START_OF_VALIDITY <= DATE'2020-01-01'
            AND L.END_OF_VALIDITY > DATE'2020-01-01'

           JOIN SCHEMA.CRM_CASH_FLOW CF
             ON CF.LOAN_ID = L.LOAN_ID
            AND CF.RECEIPT_DATE BETWEEN TRUNC(&GLOBAL_P_EFFECTIVE_LOAD_DATE, 'MONTH') AND &GLOBAL_P_EFFECTIVE_LOAD_DATE
            AND CF.EFFECTIVE_LOAD_DATE BETWEEN TRUNC(&GLOBAL_P_EFFECTIVE_LOAD_DATE, 'MONTH') AND DATE'2020-01-01'

           JOIN SCHEMA.CRM_REL_CUSTOMER_ARRANGEMENT CA
             ON CA.ARRANGEMENT_ID = AR.ARRANGEMENT_ID
            AND CA.START_OF_VALIDITY <= DATE'2020-01-01'
            AND CA.END_OF_VALIDITY > DATE'2020-01-01'
            AND CA.RELATION_TYPE = 'PRIMARY_CUSTOMER'

           LEFT JOIN SCHEMA.CRM_LOAN LPRE
             ON LPRE.LOAN_SID1 = L.RELATED_LOAN
            AND LPRE.START_OF_VALIDITY <= DATE'2020-01-01'
            AND LPRE.END_OF_VALIDITY > DATE'2020-01-01'

           LEFT JOIN SCHEMA.CRM_PRODUCT P
             ON P.PRODUCT = L.LOAN_CODE2
            AND P.DML <> 'D'
            AND P.START_OF_VALIDITY <= DATE'2020-01-01'
            AND P.END_OF_VALIDITY > DATE'2020-01-01'

          WHERE 1 = 1
            AND AR.START_OF_VALIDITY <= DATE'2020-01-01'
            AND AR.END_OF_VALIDITY > DATE'2020-01-01'
          GROUP BY CA.CUSTOMER_ID) QRM_ELOTORL
ON QRM_ELOTORL.CUSTOMER_ID = C.CUSTOMER_ID

LEFT JOIN (SELECT PD.CUSTOMER_ID, CASE WHEN MAX(PD.DAYS_PAST_DUE) > 30 THEN MAX(PD.DAYS_PAST_DUE) - 30 ELSE 0 END AS EXP_CRED_DAYS_OVR_30_DAYS, SUM(CASE WHEN PD.DAYS_PAST_DUE > 30 THEN PD.PAST_DUE_AMOUNT ELSE 0 END) AS EXP_CRED_HUF_OVR_30_DAYS_AMT, CASE WHEN MAX(PD.DAYS_PAST_DUE) > 90 THEN MAX(PD.DAYS_PAST_DUE) - 90 ELSE 0 END AS EXP_CRED_DAYS_OVR_90_DAYS, SUM(CASE WHEN PD.DAYS_PAST_DUE > 90 THEN PD.PAST_DUE_AMOUNT ELSE 0 END) AS EXP_CRED_HUF_OVR_90_DAYS_AMT, MAX(PD.DAYS_PAST_DUE) AS EXP_CRED_DAYS, SUM(PD.PAST_DUE_AMOUNT) AS EXP_CRED_AMT

           FROM SCHEMA.CRM_PAST_DUE PD

          WHERE 1 = 1
            AND PD.PAST_DUE_SID4 = 'FACILITY_MAX_DPD'
            AND PD.DAYS_PAST_DUE > 0
            AND PD.PAST_DUE_AMOUNT > 0
            AND PD.EFFECTIVE_LOAD_DATE = &GLOBAL_P_EFFECTIVE_LOAD_DATE
          GROUP BY PD.CUSTOMER_ID) KESEDELMEK
ON KESEDELMEK.CUSTOMER_ID = C.CUSTOMER_ID

LEFT JOIN (SELECT BLL.CUSTOMER_ID, SUM(CASE WHEN P.TMO_HIER_LEVEL_1 = 'HITEL' AND P.TMO_HIER_LEVEL_2 = 'MIKROVÁLLALATI HITEL' AND (P.TMO_HIER_LEVEL_4 = 'FAKTOR' OR P.TMO_HIER_LEVEL_3 = 'KÉNYSZERHITEL' OR AR.ESTIMATED_END_DATE - L.FIRST_DISBURSEMENT_DATE <= 365) AND AR.ESTIMATED_END_DATE > DAT.CALENDAR_MONTH_LAST_WORKDAY THEN NVL(BLL.OUTSTANDING_AMT * E.EXCHANGE_RATE_VALUE, 0) ELSE 0 END) AS FIN_CRED_OTH_SHORT_BAL,

                SUM(CASE
                      WHEN (P.TMO_HIER_LEVEL_1 = 'HITEL' AND P.TMO_HIER_LEVEL_2 = 'MIKROVÁLLALATI HITEL' AND
                           (AR.ESTIMATED_END_DATE - L.FIRST_DISBURSEMENT_DATE > 365 OR L.FIRST_DISBURSEMENT_DATE IS NULL)) AND AR.ESTIMATED_END_DATE > DAT.CALENDAR_MONTH_LAST_WORKDAY THEN
                       NVL(BLL.OUTSTANDING_AMT * E.EXCHANGE_RATE_VALUE, 0)
                      ELSE
                       0
                    END) AS FIN_CRED_OTH_LONG_BAL

           FROM SCHEMA.CRM_ARRANGEMENT AR

           JOIN SCHEMA.CRM_LOAN L
             ON L.LOAN_ID = AR.LOAN_ID
            AND L.START_OF_VALIDITY <= DATE'2020-01-01'
            AND L.END_OF_VALIDITY > DATE'2020-01-01'

           JOIN (SELECT CALENDAR_MONTH_LAST_WORKDAY
                  FROM SCHEMA.CRM_DATE DAT
                 WHERE DAT.REFERENCE_DAY = &GLOBAL_P_EFFECTIVE_LOAD_DATE
                   AND DAT.START_OF_VALIDITY <= DATE'2020-01-01'
                   AND DAT.END_OF_VALIDITY > DATE'2020-01-01'
                ) DAT
             ON 1 = 1

           LEFT JOIN SCHEMA.CRM_BALANCE_LOAN_LOAN BLL
             ON BLL.LOAN_ID = L.LOAN_ID
            AND BLL.EFFECTIVE_LOAD_DATE = DATE'2020-01-01'

           LEFT JOIN SCHEMA.CRM_EXCHANGE_RATE E
             ON E.EXCHANGE_RATE_DATE = DATE'2020-01-01'
            AND E.EFFECTIVE_LOAD_DATE = DATE'2020-01-01'
            AND E.TARGET_CURRENCY = BLL.CCY
            AND E.EXCHANGE_RATE_CODE = 'FT0'

           LEFT JOIN SCHEMA.CRM_PRODUCT P
             ON P.PRODUCT = L.LOAN_CODE2
            AND P.MODUL = 'CL'
            AND NVL(P.RB_FORCED_LOAN, '#') = '#'
            AND P.DML <> 'D'
            AND P.START_OF_VALIDITY <= DATE'2020-01-01'
            AND P.END_OF_VALIDITY > DATE'2020-01-01'

          WHERE 1 = 1
            AND AR.START_OF_VALIDITY <= DATE'2020-01-01'
            AND AR.END_OF_VALIDITY > DATE'2020-01-01'
            AND AR.ARRANGEMENT_TYPE = 'LOAN'
          GROUP BY BLL.CUSTOMER_ID) SHORT_LONG
ON SHORT_LONG.CUSTOMER_ID = C.CUSTOMER_ID

LEFT JOIN (SELECT BLE.CUSTOMER_ID, SUM(BLE.FUTURE_CAPITAL) AS FI_CRED_LEASING_BAL, COUNT(LE.LEASING_ID) AS PR_CRED_LEASING_CNT

           FROM SCHEMA.CRM_LEASING LE

           JOIN SCHEMA.CRM_BALANCE_LEASING BLE
             ON BLE.LEASING_ID = LE.LEASING_ID
            AND BLE.CUSTOMER_ID IS NOT NULL
            AND BLE.EFFECTIVE_LOAD_DATE = DATE'2020-01-01'

          WHERE 1 = 1
            AND LE.START_OF_VALIDITY <= DATE'2020-01-01'
            AND LE.END_OF_VALIDITY > DATE'2020-01-01'

          GROUP BY BLE.CUSTOMER_ID) DEALS_AND_INTERESTS
ON DEALS_AND_INTERESTS.CUSTOMER_ID = C.CUSTOMER_ID

LEFT JOIN (SELECT SCB.SYMBOLS_ID, COUNT(DISTINCT SCB.CONTRACT_ID) AS BPR_INS_LIFE_CNT, SUM(SCB.AMOUNT) AS FIN_INS_LIFE_BAL FROM SCHEMA.CRM_RL_SUBSIDIARY_COMPANIE_BAT SCB WHERE SCB.SOURCE_TCH = 'BIZTOSITO' AND SCB.EFFECTIVE_LOAD_DATE = &GLOBAL_P_EFFECTIVE_LOAD_DATE GROUP BY SCB.SYMBOLS_ID) LIFE_INSUR ON LIFE_INSUR.SYMBOLS_ID = C.SYMBOLS_CUSTOMER_SID

LEFT JOIN (SELECT PD.CUSTOMER_ID, MAX(PD.DAYS_PAST_DUE) AS EXP_CRED_DAYS_MAX, ROUND(SUM(PD.PAST_DUE_AMOUNT)) AS EXP_CRED_AMT_MAX

           FROM SCHEMA.CRM_PAST_DUE PD

           JOIN SCHEMA.CRM_ARRANGEMENT AR
             ON AR.ARRANGEMENT_ID = PD.ARRANGEMENT_ID
            AND AR.START_OF_VALIDITY <= DATE'2020-01-01'
            AND AR.END_OF_VALIDITY > DATE'2020-01-01'
            AND AR.ARRANGEMENT_TYPE = 'LOAN'

           JOIN SCHEMA.CRM_LOAN L
             ON L.LOAN_ID = AR.LOAN_ID
            AND L.START_OF_VALIDITY <= DATE'2020-01-01'
            AND L.END_OF_VALIDITY > DATE'2020-01-01'
            AND L.LOAN_CODE1 NOT LIKE 'T%'

          WHERE 1 = 1
            AND PD.PAST_DUE_SID4 = 'FACILITY_MAX_DPD'
            AND PD.EFFECTIVE_LOAD_DATE = &GLOBAL_P_EFFECTIVE_LOAD_DATE
          GROUP BY PD.CUSTOMER_ID) EXPIRED_LOAN
ON EXPIRED_LOAN.CUSTOMER_ID = C.CUSTOMER_ID

LEFT JOIN KMR_FINANCIAL_INDEX_STAGE_06 SZCH ON SZCH.CUSTOMER_ID = C.CUSTOMER_ID

WHERE 1 = 1 AND C.START_OF_VALIDITY <= DATE'2020-01-01' AND C.END_OF_VALIDITY > DATE'2020-01-01'
