Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Taint Tracking to a LocalVariable #16438

Open
mies47 opened this issue May 6, 2024 · 2 comments
Open

Taint Tracking to a LocalVariable #16438

mies47 opened this issue May 6, 2024 · 2 comments
Assignees
Labels
C++ question Further information is requested

Comments

@mies47
Copy link

mies47 commented May 6, 2024

Hello,

I'm trying my query on a simple code before moving to my main codebase. Basically, I would like to track all the local taints from all the function arguments to the LocalVariables in that function. This is the simplest version but eventually I want to find all such taints that are coming from an assignment in a loop.

Here's my simple C code:

#include <stdlib.h>
#include <stdio.h>

void call1(char* in) {
        char buff[10];
        char a;
        a = in[3];
        for (int i = 0; i < 10; i++) {
                buff[i] = in[i];
        }
        printf("%s\n", buff);
}

int main() {
        char *input = "Hello!!!!";
        call1(input);
        return 0;
}

Here's my simple CodeQL query:

from DataFlow::Node source, DataFlow::Node sink, LocalVariable lv, Function f
where
        f.getAParameter() = source.asParameter() and
        lv.getAnAccess() = sink.asExpr() and
        lv.getFunction() = f and
        TaintTracking::localTaint(source, sink)
select source, sink

I want to find the following taints:
in -> buff, in -> a

For now the query returns nothing. But if I comment the TaintTracking::localTaint(source, sink) line it would return the following:

Result set: edges
| a | b |
+---+---+

Result set: nodes
| n | key | val |
+---+-----+-----+

Result set: subpaths
| arg | par | ret | out |
+-----+-----+-----+-----+

Result set: #select
| source | sink |
+--------+------+
| in     | a    |
| in     | i    |
| in     | i    |
| in     | buff |
| in     | i    |
| in     | i    |
| in     | buff |

I'm not really sure why this happens and if I should probably define an additional taint step. I'd appreciate any help.

@mies47 mies47 added the question Further information is requested label May 6, 2024
@MathiasVP
Copy link
Contributor

Hi @mies47

Thanks for your question!

There are a couple of problems with your query. I'll first list the issues and then suggest how you may rewrite your query to fix these.

  1. Your specify that source is the value of in. However, remember that in[3] really means *(in + 3). The fact that there is a dereference there means you need to specify that your source is whatever the parameter points to so that there's flow through the dereference.

  2. There's not really flow from the right-hand side of an assignment to the left-hand side since the left-hand side is an address (i.e., the address of a that's being written to), and the right-hand side is the value that ends up being stored in a. So sink.asExpr is probably the wrong predicate to use in this case.

Those are the two issues in your query. Now, let me propose how you might fix these issues:

  1. In order to specify that your source is what the parameter points to you can use source.asParameter(1) to specify that it's not in that's your source, but rather whatever in points to (i.e., *in, and *(in + 1), etc).

  2. In order to select the assignment to a as your sink you can use a somewhat esotetic predicate sink.asDefinition(). That predicate gives you the node that represents an assignment operation (or initializer of a declaration in case your code instead was char a = in[3]). In this case, this will give you back a AssignExpr, and you can ask for the left-hand side of this assignment to get a.

Finally, since you need to modify your sink to use sink.asDefinition you still need to account for the fact that you also want to reach buff in printf("%s\n", buff); as that's not a definition. So that needs to use sink.asIndirectExpr() to match the fact that you're selecting what the source points to.

Combining all of these points, your query becomes:

from DataFlow::Node source, DataFlow::Node sink, LocalVariable lv, Function f
where
  f.getAParameter() = source.asParameter(1) and
  lv.getAnAccess() = [sink.asDefinition().(AssignExpr).getLValue(), sink.asIndirectExpr()] and
  lv.getFunction() = f and
  TaintTracking::localTaint(source, sink)
select source, sink

I hope that helps! Let me know if you have any more questions :)

@mies47
Copy link
Author

mies47 commented May 8, 2024

@MathiasVP Thank you so much for your detailed explanation. This definitely works.

I was just wondering if there are any resources that explain these concepts in detail like you did :)
I explored the guides and I mainly use the API reference that doesn't go into much detail.

I also used the following yesterday and added isAdditionalFlowStep which gave me buff:

module FuncParamToBuffTaint implements DataFlow::ConfigSig {
        predicate isSource(DataFlow::Node source) {
                exists(Function f, Type t |
                        f.getAParameter() = source.asParameter() and
                        source.getType() = t and
                        t instanceof PointerType
                )
        }

        predicate isSink(DataFlow::Node sink) {
                exists(ArrayExpr ae |
                        ae.getArrayBase() = sink.asExpr()
                ) or
                exists(PointerArithmeticOperation pao |
                        pao.getLeftOperand() = sink.asExpr()
                )
        }

        predicate isAdditionalFlowStep(DataFlow::Node pred, DataFlow::Node succ) {
                exists(LoadInstruction l |
                        l.getSourceAddressOperand() = pred.asOperand()
                )
        }
}

module FuncParamToBuffTaintFlow = TaintTracking::Global<FuncParamToBuffTaint>;

from DataFlow::Node source, DataFlow::Node sink
where
        source.getFunction() = sink.getFunction() and
        FuncParamToBuffTaintFlow::flow(source, sink)

select source, sink, source.getFunction()

I am trying to take source and sink as pointers. Of course this doesn't count a as sink which is fine since I'm more interested in buff. Would this work the same way as your version?

@mbg mbg added the C++ label May 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C++ question Further information is requested
Projects
None yet
Development

No branches or pull requests

3 participants