CodeQL run time increased from mins to hours #16448

asreehari-splunk · 2024-05-07T20:10:59Z

The code analysis run duration increased from minutes to hours starting with 2.16.4. I've attached the runtime options as PDFs for both versions below:
2.16.4.pdf
2.16.3.pdf

It was consistently in the minute range before and has only increased since 2.16.4. The change seems to date from March 12th, 2024. The latest version we are using is "2.17.1".

mbg · 2024-05-07T22:15:01Z

Hi @asreehari-splunk 👋

We shipped some updates to our Go support in 2.16.4 which mean that we better understand large Go repositories and analyse more code than we did in previous versions of CodeQL. For large repositories, that can lead to big increases in the time needed for the analysis because we perform a lot more work than before.

We shipped some performance improvements in CodeQL 2.17.0 that should have improved the time again. Can you confirm that you've tried 2.17.0 or later? If so, has that made any difference at all?

asreehari-splunk · 2024-05-07T23:39:47Z

Hi @mbg, thank you for the clarification. Yes, it looks like we are currently using 2.17.1 and still seeing a run time of over an hour. Here is a screenshot.

asreehari-splunk · 2024-05-08T00:22:36Z

One follow-up question: is there a metric I can look at, such as the number of lines of code, files, or packages, that qualifies a repository as large? To my knowledge there hasn't been a big change in the repo.

aibaars · 2024-05-08T08:30:56Z

@mbg These are two runs at almost the same time; the main difference is the switch from 2.16.3 to 2.16.4. The source code should be mostly identical. It's the autobuilding phase that is taking an hour more. Was there any change in the autobuilding strategy, or is there some performance problem in the extractor? I also see that CodeQL is running the Go tests. These are not needed for analysis, so perhaps test running can be skipped to speed things up.

aibaars · 2024-05-08T08:36:57Z

Looking at those logs, it could be that @mbg is right about 2.16.4 scanning more code:

CodeQL scanned 155 out of 917 Go files in this invocation. Check the status page for overall coverage information: https://github.com/open-telemetry/opentelemetry-collector/security/code-scanning/tools/CodeQL/status/
CodeQL scanned 488 out of 917 Go files in this invocation. Check the status page for overall coverage information: https://github.com/open-telemetry/opentelemetry-collector/security/code-scanning/tools/CodeQL/status/

Spending 6 times longer on 3 times as many files sounds like there may still be a performance issue. It could also be that the test cases for the additional files simply take a long time. Could you try a run without running any tests? A test pull request with a quick-n-dirty tweak to the Makefile should work.
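For reference, the file counts in the two log lines above can be turned into coverage percentages and a growth factor. The following is only a sketch of that arithmetic; the helper name `coverage` is made up for illustration, and the "6 times longer" figure is an estimate from the comment above, not something this code measures.

```go
package main

import "fmt"

// coverage returns the percentage of total files that were scanned.
// Hypothetical helper, used only to illustrate the arithmetic.
func coverage(scanned, total int) float64 {
	return 100 * float64(scanned) / float64(total)
}

func main() {
	const total = 917              // Go files reported in both log lines
	const before, after = 155, 488 // scanned-file counts from the two log lines

	fmt.Printf("first run:  %.1f%% of Go files scanned\n", coverage(before, total))
	fmt.Printf("second run: %.1f%% of Go files scanned\n", coverage(after, total))
	fmt.Printf("growth in scanned files: %.2fx\n", float64(after)/float64(before))
}
```

That works out to roughly 17% and 53% coverage, or a little over 3 times as many files scanned.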

mbg · 2024-05-08T09:00:35Z

Thanks @aibaars for having a look.

Like I said, we shipped big changes to the Go autobuilder in 2.16.4 which result in better support for repositories with multiple go.mod files. In CodeQL 2.16.3 and earlier, we didn't really support that well and our analysis of such repositories was just a best effort. As @aibaars notes for your repository, that meant we only extracted 155 out of 917 Go files then.

As of 2.16.4 and above, we do support repositories with multiple go.mod files properly and that often explains an increase in extraction/analysis time for such repositories, since we actually extract more code now. As @aibaars notes, that's now 488 out of 917 Go files.

I have had a look over your logs to determine why there is such a big increase in the time needed for this now and why this has not improved with 2.17.0 or above (when we shipped performance improvements to dependency extraction that improved the times for most users).

For mainly historical reasons, we run make if there is a Makefile in the repository before we begin extraction of the source code. Your Makefile in particular seems to build and test all of your code.

When we implemented the changes to the Go autobuilder in 2.16.4, we kept the part that invokes make before extraction to ensure that CodeQL would not suddenly break for repositories which relied on this behaviour. However, it seems that this now gets erroneously invoked for every go.mod file in your repository. I will look into getting this fixed ASAP.

In the meantime, to avoid this issue until it is fixed, you can either revert to 2.16.3 (but fewer Go source files will get extracted) or switch to a custom build. The latter would involve replacing the autobuild step in your workflow with a step that invokes the right build commands for your repository (possibly just make).

asreehari-splunk · 2024-05-08T17:07:12Z

Thank you @aibaars and @mbg for the quick follow-ups. I really appreciate you helping me understand the root cause. I will take this back and see what the team has to say. Do we need to change the label on this to something other than a question?

mbg · 2024-05-08T18:46:23Z

I have updated the labels, and we are also tracking this internally.

asreehari-splunk added the question Further information is requested label May 7, 2024

asreehari-splunk mentioned this issue May 7, 2024: CodeQL build times increased to over an hour, open-telemetry/opentelemetry-collector#10064 (Open)

mbg added bug Something isn't working acknowledged GitHub staff acknowledges this issue Go and removed question Further information is requested labels May 8, 2024

mbg self-assigned this May 8, 2024

mbg mentioned this issue May 13, 2024: Go Autobuild failure reason unclear, #16469 (Closed)
