Python SQL Injection not being detected for CWE-089 #16353

leviaurizon · 2024-04-29T05:44:51Z

In one of our projects we have identified a python SQL Injection Vulnerability for CWE-089 which doesn't appear to be being identified by the python SqlInjection.ql found here:

https://github.com/github/codeql/tree/main/python/ql/src/Security/CWE-089/SqlInjection.ql

Here is an example of the code which has vulnerabilities. Note that running bandit -r -t B608 [script name] does seem to work to identify the vulnerability, however, Code QL does not detect it.

def contract_sql_query(
contract_name: str,
col: list = ["*"],
cancel: bool = False,
risk_cat: str = "Aurizon",
current_month: bool = True,
):
if cancel:
table_name = "[interface.batch.yield].[YieldVisCancellation]"
else:
table_name = "[interface.batch.yield].[YieldVisAddition]"
current_month = int(current_month)

sql_string = (
    "SELECT "
    + ", ".join(col)
    + f" FROM {table_name} WHERE ContractName = '{contract_name}'  AND MonthCorrect = {current_month}"
)
sql_string += f" AND  {risk_cat} = 1" if cancel else ""
sql_string += f" AND TimeStored = (SELECT MAX(TimeStored) FROM  {table_name} WHERE ContractName = '{contract_name}')"
# print(sql_string)
return sql_string

The text was updated successfully, but these errors were encountered:

leviaurizon · 2024-04-29T05:50:45Z

Also, here are the results from the bandit scan for reference:

`bandit -r -t B608 src/router/additions.py
[main] INFO profile include tests: None
[main] INFO profile exclude tests: None
[main] INFO cli include tests: B608
[main] INFO cli exclude tests: None
[main] INFO running on Python 3.12.3
Run started:2024-04-29 05:08:23.372323

Test results:

Issue: [B608:hardcoded_sql_expressions] Possible SQL injection vector through string-based query construction.
Severity: Medium Confidence: Low
CWE: CWE-89 (https://cwe.mitre.org/data/definitions/89.html)
More Info: https://bandit.readthedocs.io/en/1.7.8/plugins/b608_hardcoded_sql_expressions.html
Location: .\src/router/additions.py:146:22
145
146 sql_query = f"SELECT {', '.join(coladd)} FROM [interface.batch.yield].[YieldVisAddition] WHERE ContractName = '{contract_name}'"
147 if len(corridor_names) > 0:

Code scanned:
Total lines of code: 143
Total lines skipped (#nosec): 0

Run metrics:
Total issues (by severity):
Undefined: 0
Low: 0
Medium: 1
High: 0
Total issues (by confidence):
Undefined: 0
Low: 1
Medium: 0
High: 0
Files skipped (0):`

tausbn · 2024-04-30T12:42:44Z

Thank you for your report. I think we need more information in order to determine why CodeQL isn't flagging this.

The code you've supplied just creates and returns a string, and so to the eyes of the CodeQL Python analysis, that code in isolation is completely safe. If this string is subsequently used in a way so that it is executed as a SQL query, only then does it become a potential SQL injection.

In other words, the CodeQL Python analysis does not consider the construction of a string that looks like a SQL injection to be a vulnerability. That string has to flow to a place where it gets executed in order to be counted as a vulnerability.

Moreover, there also has to be some user-controlled piece of data that gets interpolated into the string, otherwise it still isn't considered a vulnerability. (For instance, if you're just using the above function to create SQL queries with strings that are hard-coded in the program, then that's also considered safe.)

Thus, a SQL injection in the eyes of the CodeQL Python analysis requires a data-flow path that

Starts in some user-controlled piece of data (an example of this would be a HTTP request), and
Ends in a place where we either construct or execute a SQL query (the assumption being that if we've called a library function to construct a query, then we're likely going to execute it as well).

There are a variety of reasons why we may not be detecting the vulnerability:

We may not have a model for the relevant kind of user-controlled data.
We may not have a model for the particular SQL library that is used.
The string may flow through some other function in a way that we are not able to model properly (and so the flow simply gets stuck at that point).

We would need to know more about how your code works in order to determine why the injection is not found.

leviaurizon · 2024-05-01T01:36:27Z

Thank you for your report. I think we need more information in order to determine why CodeQL isn't flagging this.

The code you've supplied just creates and returns a string, and so to the eyes of the CodeQL Python analysis, that code in isolation is completely safe. If this string is subsequently used in a way so that it is executed as a SQL query, only then does it become a potential SQL injection.

In other words, the CodeQL Python analysis does not consider the construction of a string that looks like a SQL injection to be a vulnerability. That string has to flow to a place where it gets executed in order to be counted as a vulnerability.

Moreover, there also has to be some user-controlled piece of data that gets interpolated into the string, otherwise it still isn't considered a vulnerability. (For instance, if you're just using the above function to create SQL queries with strings that are hard-coded in the program, then that's also considered safe.)

Thus, a SQL injection in the eyes of the CodeQL Python analysis requires a data-flow path that

Starts in some user-controlled piece of data (an example of this would be a HTTP request), and

Ends in a place where we either construct or execute a SQL query (the assumption being that if we've called a library function to construct a query, then we're likely going to execute it as well).

There are a variety of reasons why we may not be detecting the vulnerability:

We may not have a model for the relevant kind of user-controlled data.

We may not have a model for the particular SQL library that is used.

The string may flow through some other function in a way that we are not able to model properly (and so the flow simply gets stuck at that point).

We would need to know more about how your code works in order to determine why the injection is not found.

Thanks, this is an example of code that was detected as a vulnerability by bandit, but not CodeQL, where coladd is a string array, and contract_name is a string

sql_query = f"SELECT {', '.join(coladd)} FROM [interface.batch.yield].[YieldVisAddition] WHERE ContractName = '{contract_name}'"
if len(corridor_names) > 0:
sql_query += " AND Corridor IN ('" + "', '".join(corridor_names) + "')"
df_add_hauls = pd.DataFrame.from_records(cursor.execute(sql_query).fetchall())

tausbn · 2024-05-06T13:28:40Z

Thank you for the additional information. As far as I can tell, we don't currently have any modelling of Pandas DataFrames, so that's at least one reason why we wouldn't find this result.

My guess is that bandit has a much more heuristic approach, and will flag things that simply look like they might be SQL injections (so, more results but probably also more false positives).

leviaurizon added the question Further information is requested label Apr 29, 2024

leviaurizon changed the title ~~General issue~~ Python SQL Injection not being detected for CWE-089 Apr 29, 2024

sidshank added the Python label Apr 30, 2024

sidshank assigned tausbn May 6, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Python SQL Injection not being detected for CWE-089 #16353

Python SQL Injection not being detected for CWE-089 #16353

leviaurizon commented Apr 29, 2024 •

edited

leviaurizon commented Apr 29, 2024

tausbn commented Apr 30, 2024

leviaurizon commented May 1, 2024 •

edited

tausbn commented May 6, 2024

Python SQL Injection not being detected for CWE-089 #16353

Python SQL Injection not being detected for CWE-089 #16353

Comments

leviaurizon commented Apr 29, 2024 • edited

leviaurizon commented Apr 29, 2024

tausbn commented Apr 30, 2024

leviaurizon commented May 1, 2024 • edited

tausbn commented May 6, 2024

leviaurizon commented Apr 29, 2024 •

edited

leviaurizon commented May 1, 2024 •

edited