DataFrame.query() could be vulnerable to SQL-injection-like attacks if the user doesn't know what they are doing

Currently, DataFrame.query() takes a string and uses it to query a DataFrame, much as an SQL query reads data from a database. If a user builds that string with Python’s standard string formatting or with regex substitution, the code is vulnerable in much the same way as SQL queries that are built with string replacement and regex.

Python’s sqlite3 documentation says:

# Never do this -- insecure!
symbol = 'RHAT'
c.execute("SELECT * FROM stocks WHERE symbol = '%s'" % symbol)

# Do this instead
t = ('RHAT',)
c.execute('SELECT * FROM stocks WHERE symbol=?', t)
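To make the difference concrete, here is a minimal, self-contained sketch of the two styles quoted above. The in-memory database, the `stocks` schema, the sample rows, and the `payload` string are all assumptions made up for illustration; only the two `execute` patterns come from the sqlite3 documentation snippet.

```python
import sqlite3

# Hypothetical in-memory table, invented for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stocks (symbol TEXT, price REAL)")
conn.executemany(
    "INSERT INTO stocks VALUES (?, ?)",
    [("RHAT", 35.14), ("IBM", 45.0)],
)

# Attacker-controlled input that closes the quote and adds a condition.
payload = "RHAT' OR '1'='1"

# Insecure: the payload becomes part of the SQL text and is parsed as SQL.
leaked = conn.execute(
    "SELECT * FROM stocks WHERE symbol = '%s'" % payload
).fetchall()

# Parameter binding: the payload is treated purely as a value to compare.
bound = conn.execute(
    "SELECT * FROM stocks WHERE symbol = ?", (payload,)
).fetchall()

print(len(leaked), len(bound))
```

With string formatting the extra `OR` condition matches every row, while the bound version looks for a symbol literally equal to the payload and finds none.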

I think it would be valuable to provide a similar API for DataFrame.query(). In most cases, DataFrame.query() will likely be used in a scientific computing context where injection attacks aren’t a concern, but sooner or later someone will write code with DataFrame.query() where they are.

Issue Analytics

  • State: closed
  • Created 7 years ago
  • Comments: 11 (7 by maintainers)

Top GitHub Comments

1 reaction
alexmojaki commented, Mar 10, 2022

If var1 is an actual string, that’s safe. If users can pass arbitrary objects with an overridden __eq__, then it might be a problem; I haven’t checked.

@ already seems to be the SQL-like parameter binding you want. The problem as originally described arises if you use something like %s instead, which makes it easy to inject code into the string passed to query.
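A rough sketch of that contrast in pandas follows. The DataFrame, the `owner` and `balance` columns, the helper names `unsafe_lookup` and `safer_lookup`, and the payload are all hypothetical and chosen only for illustration; the sketch assumes the input really is a plain string, which is the case described as safe above.

```python
import pandas as pd

# Hypothetical data, invented for this illustration.
df = pd.DataFrame(
    {"owner": ["alice", "bob", "carol"], "balance": [10, 20, 30]}
)


def unsafe_lookup(frame, wanted_owner):
    # String formatting splices the input into the expression text,
    # so the input is parsed as part of the query.
    return frame.query("owner == '%s'" % wanted_owner)


def safer_lookup(frame, wanted_owner):
    # @wanted_owner is looked up as a local variable and compared as a
    # value; its contents are not parsed as part of the expression.
    return frame.query("owner == @wanted_owner")


# Input that closes the quote and appends an extra condition.
payload = "alice' or owner != 'alice"

leaked = unsafe_lookup(df, payload)
matched = safer_lookup(df, payload)

print(len(leaked), len(matched))
```

Here the formatted query ends up as `owner == 'alice' or owner != 'alice'` and returns every row, whereas the `@` version compares the column against the whole payload string and matches nothing.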

0 reactions
do-me commented, Mar 10, 2022

Thanks for your immediate reply!

How about the query variable substitution, e.g.

var1 = "evil string"
df.query("col1 == @var1")  # or with engine='python'

Is there room for your former exploit or similar (also with respect to different engines)?

This might lead to a general discussion about SQL-like parameter binding. I would suggest adding a side note about security to the documentation for DataFrame.query().
