New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
C#/C++: Convert C# code to use paket
package manager
#16376
Conversation
371fadc
to
2511f2c
Compare
b3447d7
to
95355d2
Compare
95355d2
to
a2dcb59
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very good work and thorough explanation and description! Thank you!
This looks plausible to me, but I would also like one more person from the C# team to take a look.
In the meantime here are some questions :-)
- What is the workflow for generating
paket.lock
andPaket.Restore.targets
. - Today we use a VS code plugin for updating NuGet packages (NuGet Package Manager GUI). Is there any tool that you can recommend for updating
paket.dependencies
? - Could you be persuaded to update our documentation for the “workflow” for updating the dependencies (basically the questions above): https://github.com/github/codeql-csharp-team/blob/master/UpdatingRoslyn.md.
Hi, thanks for the review! I agree we should also get a second set of eyes on this.
See above - I think by relaxing version requirements, we'd reduce the effort to update dependencies to
As converting the build to bazel introduces an additional step in updating the dependencies (the paket lockfile needs to be converted in a separate step to a bazel file), I'll provide a script to execute all the necessary commands in one go. |
|
Pinning Roslyn more makes a lot of sense! I'd like to merge this PR as-is, and then we can improve the dependency version constraints as follow-up, so that the internal tests on the external PRs correctly run - right now, they need the internal PR to be merged that's waiting on merging this PR. |
Internal CI now passes on this PR :) |
Some comments
I think this is not a real advantage over sln/csproj files. We can achieve the same with csproj files too by adding a top level
This can be achieved with nuget too. See for example here.
I think we should be able to work around this limitation by adding a
Aren't there other drawbacks? For example,
First experience with paketI've tried the following:
This results in
How should I configure my The above
It looks like we try to parse/restore all Second experience with paketIf I rerun Also, after deleting
Other times I'm getting more errors:
And sometimes the build succeeds. IDE integrationI'm using VS Code with C# Dev Kit. It is working as expected:
Note that it relies on Others on the team might be using other IDEs, it would be worth checking if basic navigational features work there too. Overall, paket seems to be working. I need to learn a bit more about it to understand what happens under the hood, especially, where some of the caches are stored and how it chooses initially the files to parse/restore, but I don't see anything major that would be blocking this change. |
cpp/autobuilder/Semmle.Autobuild.Cpp.Tests/Semmle.Autobuild.Cpp.Tests.csproj
Outdated
Show resolved
Hide resolved
Thanks Tamas! I'll address your questions in multiple rounds, as I need to do some research first. Re deduplication of version numbers/lockfile: Yes, I'm not claiming this is the only way to achieve this, but it's part of the PR. Unfortunately, right now this is the only way to achieve those goals while also supporting a bazel build. Re
Yes I'm sad about that, too. I don't really know what to do. This is maybe connected with the
I'll try to find the correct incantation in
see above, also: https://fsprojects.github.io/Paket/paket-folder.html says
which means we can ignore sub-folders, but (unless there's another feature I've not found) only by creating a file in them. Imo that's acceptable for sub-folders inside
Either we manage to pass more specific command line options to
https://fsprojects.github.io/Paket/caches.html#NuGet-machine-cache mentions using the nuget machine cache.
I'll investigate. Getting different errors looks like a concurrency thing to me.
Which would those be? I was hoping that Visual Studio would be working, if the IDE integration for VSCode picks up the dependencies. Apropos IDE: There's some IDE support for paket files and commands available here: https://fsprojects.github.io/Paket/editor-support.html. |
We're not using the full blown VS, because we're all on MacOS. So I think Rider, and the discontinued VS for Mac are the two other IDEs that might be relevant. cc @michaelnebel @hvitved which IDE are you using? |
@criemen In the meantime I have an answer too: The |
This is a necessary preparation for moving the C# dependency management to `paket`, which in turn is a necessary preparation for moving the C# build to bazel. As we discovered in #16376, `paket` tries to restore all projects recursively from the root folder. If we support building C# code under both `ql/csharp` and `ql/cpp`, we need to have a single lockfile under `ql`, as both codebases share the same set of dependencies (and utilities from `ql/csharp/extractor`). Then, `paket` will also try to restore things that look like "C# projects" in other languages' folders, which is not what we want. Therefore, we address this by moving all C# code into a common root directory, `ql/csharp`. This needs an internal PR to adjust the buildsystem to look for the autobuilder in the new location.
This is a necessary preparation for moving the C# dependency management to `paket`, which in turn is a necessary preparation for moving the C# build to bazel. As we discovered in #16376, `paket` tries to restore all projects recursively from the root folder. If we support building C# code under both `ql/csharp` and `ql/cpp`, we need to have a single lockfile under `ql`, as both codebases share the same set of dependencies (and utilities from `ql/csharp/extractor`). Then, `paket` will also try to restore things that look like "C# projects" in other languages' folders, which is not what we want. Therefore, we address this by moving all C# code into a common root directory, `ql/csharp`. This needs an internal PR to adjust the buildsystem to look for the autobuilder in the new location.
I filed #16487 to move the C++ Windows autobuilder into |
Do you want explicit feedback from the C/C++ team? Most recent changes have been driven by the C# team, and we've generally assumed that they know what they were doing, not requiring explicit approvals from the C/C++ team. I'd assume the same thing here? |
@jketema I'd like explicit approval from the C++ team (and C# too) on #16487, and for everything else I'll rely on the C# team and Paolo for the bazel bits (coming in the future). |
I use VS Code with C# Dev kit. |
9fa7878
to
80f2d63
Compare
I've worked some more on this, now that the C++ autobuilder has been moved.
I got this working in so far as |
80f2d63
to
0b82896
Compare
0b82896
to
950e8c8
Compare
I haven't checked this, but I think we could added a |
I'll check that out! I also played with deleting the paket cache file, and I couldn't reproduce the paket errors you reported above. Besides playing with the before target for local development, is there anything I've missed from your comments Tamas? How are you feeling about the state of the PR now? |
Let's check this small issue, and then I think the PR can be merged. |
This is not a general fix, as we not always build the solution file, but this should improve the DX for local developers that use the solution file.
Great, I made it work with the before file, good find! This only works for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes look reasonable to me. Have you checked the performance of the build before and after this change?
I'm seeing very high CPU usage during restores, so high that my VSCode and other windows keeps freezing. At the same time, there's significant slowdown in the build locally.
Running dotnet build
once to do the restore, and then measuring the exec time of a second run results in the below. (Only the first one should be doing a restore, the latter one should just build.)
On main
:
❯ time dotnet build /t:rebuild
Determining projects to restore...
All projects are up-to-date for restore.
...
Time Elapsed 00:00:17.10
dotnet build /t:rebuild 8.69s user 1.77s system 59% cpu 17.717 total
On criemen/dotnet-paket
:
❯ time dotnet build /t:rebuild
...
Time Elapsed 00:01:17.71
dotnet build /t:rebuild 13.14s user 17.39s system 38% cpu 1:18.43 total
It's quite a big change. Also, I think the only difference should be the package restore, but the run with paket also lists vulnerable package messages to the console, such as warning NU1903: Package 'System.Text.RegularExpressions' 4.3.0 has a known high severity vulnerability, https://github.com/advisories/GHSA-cmhx-cq75-c4mj
, so it's definitely doing more work than the restore with nuget.
Can you try once more with one more repetition of the rebuild? What machine do you have? On my local machine (Mac M1) I get the following three timings (in sec), Nuking all caches I could find (including
So indeed this is slower, but not egregiously so for repeated builds. |
I think the high CPU usage is caused by I'm getting the below on the
And on
I'm on an intel based mac (2.4 GHz 8-Core Intel Core i9, 64GB memory). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's merge this, and then we'll figure out if we can improve the performance.
Running with diagnostics logs (dotnet build /t:rebuild /v:diag
), the below targets are the ones that take longer than 1s:
1002 ms LocalToolRestore 2 calls
1273 ms GenerateBuildDependencyFile 16 calls
1820 ms CoreClean 16 calls
2793 ms CoreCompile 16 calls
5398 ms _CreateAppHost 6 calls
6685 ms CleanReferencedProjects 16 calls
7735 ms _GetProjectReferenceTargetFrameworkProperties 16 calls
7746 ms _GenerateRestoreGraph 1 calls
14759 ms _GitRepositoryUrl 10 calls
16429 ms _GitBranch 10 calls
16686 ms _GitCommitDate 10 calls
17930 ms _GitBaseVersionTagExists 10 calls
18609 ms _EnsureGit 10 calls
26560 ms Rebuild 17 calls
33449 ms _GitCommit 10 calls
44216 ms _GitBaseVersionTag 10 calls
80897 ms _GitRoot 10 calls
119636 ms PaketRestore 32 calls
I don't yet understand why we spend this much time in PaketRestore
on an already restored solution. But I couldn't quickly figure this out from the logs. There are other performance optimizations that we should look into, such as is it worth calling all the _Git*
targets 10 times? Are those calls always returning the same data? Why are we calling LocalToolRestore
twice? ...
I tried to fix that by 74e446e, but that didn't work. Something in I'm surprised that Thanks for the approval, I also have a follow-up that gets rid of some transitive dependencies, hopefully that'll speed up the |
This PR converts our C# codebase to use the
paket
package manager.Why am I doing this?
Bazel's
rules_dotnet
require the use ofpaket
to interface withnuget
dependencies. Therefore, we need to maintain apaket.dependencies
andpaket.lock
file for our (future) bazel build.What are the design choices I made?
Integrated
paket
into ourdotnet
CLI buildWe could, in theory, keep the existing solution files as-is, but then we'd have to duplicate our external dependencies in two disjoint systems. I don't believe that's a good way forward, and we'd risk our external dependencies going slightly out-of-sync. As we're currently using the dotnet solution files to build an extractor pack for some of our CI, as well as for local development/debugging, I don't think having (potential) differences between those builds is advisable.
One lockfile for all C# code
You can see that the dependency list and lockfile is in the top-level folder. This is because
paket
requires those files to be in the top-level folder of the "project", i.e. the common folder of all sub-projects. You'd think that'sql/csharp
, but as we have C# code inql/cpp
(namely, the Windows autobuilder) that depends onql/csharp
code, I had to move the files to the top level.An alternative might be to split up the paket dependency files into two, one file for C++, and one for C#. As the C++ projects depend on some of the C# projects, though, that imo made less sense. I've also not explored how the Bazel integration would work if we have two disjoint lock files, but a shared dependency tree.
What are the drawbacks?
I believe the main drawback is having to run
dotnet tool restore
beforedotnet restore/dotnet build
will work. Unfortunately that's adotnet
limitation. There'll be a good error message from thedotnet
tooling, though, telling you to rundotnet tool restore
to installpaket
if you haven't.Also, all build/CI scripts have to be adjusted to call
dotnet tool restore
before attempting to build our code (this is mainly relevant until we switch over to a Bazel-based build).What are the advantages?
Deduplication of version numbers
Currently, we've duplicated the version numbers of all direct dependencies in all projects that reference those dependencies. Now, with
paket
, we have a single place (thepaket.dependencies
file) that lists all our external dependencies version, without (by accident) introducing the same dependency at multiple versions.Reproducible builds with a lockfile
By using a lockfile, the entire set of transitive dependencies is fixed. In practice, I believe this was not a big problem before, as
nuget
(to my big surprise) always chooses the minimal version number for all transitive dependencies that satisfy the version constraints in dependency resolution. That way, a newly published package version will not start randomly breaking builds, as we know it from other ecosystems that go for the maximum version that satisfies all constraints.paket
supports both dependency resolution algorithms, and makes taking the maximum version safe by providing lockfile support - now version upgrades happens only when the user chooses to do so.Usability
I've been using
paket
now for a few days, and it's pretty easy to use, with good documentation. Once we have bazel support as well, I'll provide a script to run when introducing/updating new dependencies that'll update the lockfile and the bazel interpretation of that.