Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider using a package manager to build third party dependencies #893

Open
jherico opened this issue Feb 6, 2024 · 18 comments
Open

Consider using a package manager to build third party dependencies #893

jherico opened this issue Feb 6, 2024 · 18 comments
Labels
build This is relevant to the build system feature request Request for a new feature or improvement framework This is relevant to the framework

Comments

@jherico
Copy link
Contributor

jherico commented Feb 6, 2024

The integration of third part dependencies as submodules (or otherwise fetched and built within the main project source tree) presents a number of pain points in my experience.

  • Longer build times when building from scratch (either on CI or after you perform a "clean")
  • More complicated project workspaces
  • Useless warning messages from third party headers that are too burdensome to get fixed upstream. For example:

image

An alternative to that is to use a package manager like vcpkg to build your dependencies. I've used this directly from CMake in a number of projects and I've found it generally solves the above issues.

My approach puts the vcpkg build folder outside of the main project source / build folders and builds the dependencies at CMake configure time, not at project build time. It removes the third-party targets themselves from the resulting CMake generated project.

This also means when you clean the project, or are starting from scratch, as long as the dependencies haven't changed from what you've built in the past, they aren't rebuilt, so that shaves off a significant amount of build time. On CI servers this corresponds to configuring them to cache the resulting vcpkg folder, so you also gain savings on CI build times.

Finally, because the third party libraries and the corresponding headers are now outside of the build tree, their headers are now treated by CMake as SYSTEM headers, meaning warnings from them will generally be suppressed automatically by the compiler.

I've been working on a similar project on the Cesium-Native project with this PR, and I'm curious if it's something that might be of interest to this project.

The intent would be to have it be transparent to users. If you can build and run the project now using only CMake and your chosen generator, you'd still be able to do so without any perceived change in the setup / build process for end users (other than a longer CMake configure time the first time you build the dependencies).

@jherico
Copy link
Contributor Author

jherico commented Feb 6, 2024

Looking at the current set of third party dependencies, all of these are already present in VCPKG

  • catch2
  • cli11
  • fmt
  • glfw
  • glm
  • glslang
  • imgui
  • ktx
  • spdlog
  • spirv-cross
  • stb
  • tinygltf
  • vma
  • volk
  • opencl (OpenCL-Headers)
  • vulkan (Vulkan-Headers)

These are not currently in vcpkg and could either be skipped or added upstream (I had to fix one upstream package in vcpkg for the cesium-native work, they're pretty responsive to updated versions and new ports).

  • astc
  • ctpl
  • hwcpipe

@SaschaWillems
Copy link
Collaborator

SaschaWillems commented Feb 6, 2024

One thing to consider: We also develop samples in private for things that are in development and not ready for public. So stuff like the Vulkan headers, volk, glslang and opencl can't be moved to package managers.

We also have IHVs building on non-standard platforms, I'm unsure if vcpkg is available there.

@jherico
Copy link
Contributor Author

jherico commented Feb 6, 2024

@SaschaWillems Thanks for pointing out that issue. There's two approaches to dealing with it that I can think of off the top of my head.

First is to simply not include those libraries in any potential migration to vcpkg. The bulk of the third party libraries should still be eligible (I'm actually going to start by looking at adding the 3 missing items to vcpkg, although I'm not sure if it's worth the effort for astc given your comment on another issue about the possibility of moving to basis for all compressed assets). Obviously this is the easiest path

The second would be having a fork of the vcpkg repository that references the private versions of the pre-release materials. The vcpkg fork itself would not necessarily need to be private, but it could reference private repositories and commits in the corresponding portfiles, such as this one for volk. All you have to do is change the REPO REF and SHA512 values, as long as the build process for the library itself hasn't changed.

The second approach is more maintenance intensive and I imagine that if you're building against a constantly changing HEAD of a private version of Vulkan-Headers or Volk it would be a pain to be constantly updating the fork, and also negate much of the advantage. The first approach is probably better in that case.

On the other hand if it's more of an issue of "We have a pre-release for Version 1.3.XXXX" and the people working on the examples have a reasonably stable target for the libraries, the second approach isn't too bad. I guess it corresponds to how often you're having to do a submodule update in the current process.

@jherico
Copy link
Contributor Author

jherico commented Feb 6, 2024

Just saw your edit about the IHV targets. Again, thanks.

vcpkg supports a variety of "triplets" expressed as <ARCH>-<OS>[-<LINKTYPE>], both an officially maintained set focused on the main desktop and mobile (with the exception of iOS) targets as well as a set of community triplets which includes iOS as well as esoteric architectures and less used OS targets.

Unless the IHVs are targeting some kind of bespoke CPU / OS, I think it's unlikely to be a problem.

@gpx1000
Copy link
Contributor

gpx1000 commented Feb 6, 2024

vcpkg isn't a bad solution. However, for the purposes of answering the specific issues pointed out:

  • Longer build times when building from scratch (either on CI or after you perform a "clean")

    I'd debate that this is a possibly solved issue within Vulkan samples with our current support of CCache. CCache does mean that if you don't have CCache installed or are building from pure scratch (i.e. very first time downloaded); then it'd be less fast. However, we could host the build files from the CI generation; and even that first time could be faster.

  • More complicated project workspaces
    This is questionable. We can equally use cmake hunter, or brew, etc. While it does reduce the downloaded dependency need, vcpkg suffers from the same issues as the other package managers I can think of. Very few if any exist without a different tool also being downloaded. Some of those tools don't exist on all platforms (embedded).

  • Useless warning messages from third party headers that are too burdensome to get fixed upstream. For example:

    Completely agreed. Something that is doable is to make a fork that addresses the warning messages when they are egregious; then submit upstream as we fix locally via PR. Losing the "inspiration" to contribute back upstream makes us less good consumers of open source libraries.

All that playing devil's advocate aside, at a bare minimum, we should check if a dependency exists at the system level, use that if it does and not bother with building the dependency locally. I'm all for having a vcpkg config files for the project to make use of it when it exists, however, I think maybe at this time it's better to have it as an optional path while the main path is still use the system libs with fall back on the source dependencies.

Thanks for bringing this issue up; it's definitely something we need to think about and plan out.

@tomadamatkinson
Copy link
Collaborator

Hey @jherico,

I don't think our build time issues are coming from our third parties

I threw together this. lib.cpp doesn't contain code but I needed a non-interface target to trigger a compile.

Screenshot 2024-02-06 at 21 39 47

Ran a clean build against this target and got

cmake --build build --target vendor -- -j 8  132.61s user 7.57s system 503% cpu 27.861 total

Although this example is crude and not a complete picture. I think the frameworks compile issues are deeper routed in its architecture. Something I've been trying to break apart for a long time 😅

We have a single mega target for the framework which everything depends on. This means that compilation serializes at points. CCache does help here. But the root cause is how the project is structured. There is likely more benefit in cleaning up the frameworks architecture and completing our current goals than to invest in vcpkg.

@jherico
Copy link
Contributor Author

jherico commented Feb 6, 2024

@gpx1000 thanks for the feedback. Although I agree with it being a good instinct to be motivated to submit fixes to upstream libraries, that should also be balanced against the primary use case of this repository, which is to be a set of good example code for developers who want to use Vulkan, and getting to a zero-warnings build of this project should take precedence for their benefit.

That said it would not be unreasonable to include some kind of configure time option that allows a developer to force the build to treat the vcpkg dependencies as non-system headers (the Cesium-Native codebase was actually doing the opposite, building packages as submodules and then using CMake target property manipulation to force the headers to be considered SYSTEM, eliminating warnings that way). That should provide the best outcome for people who just want to use the Vulkan samples without the burden of upstream library maintenance while also allowing people to set that flag and retain the impetus to be good open source consumers.

Also, to my knowledge "brew" isn't really a viable tool on Windows, and I don't know to what extent it would be useful on the kinds of IHV environments Sascha mentioned earlier. As for CMake hunter, I haven't used it but it does look like it's doing the same thing I'm proposing. However, it also looks like it's a CMake-only tool. vcpkg is a command line tool that can be used from CMake, but which also allows me to work directly from the command line to test out changes to portfiles and debug build failures pretty easily.

As for the tool download issue, the mechanism I use to incorporate vcpkg into cmake builds only requires three things: a functioning git executable, a functioning cmake executable and a development environment of some kind that works with CMake. You can use it in embedded environments as well as in cross-compilation environments.

@jherico
Copy link
Contributor Author

jherico commented Feb 6, 2024

@tomadamatkinson Also thanks for the feedback. I'm not trying to suggest that the improvement for build times would be huge, just that it was a potential advantage, and even then I wasn't really considering the usage of CCache (I should probably look into that). However, it's worth noting that the vcpkg approach benefits users whether or not they have CCache installed.

Actually, looking into the use of compiler caching in the CMakeLists.txt I just noticed...

if(CCACHE_FOUND AND NOT SCCACHE_FOUND)
    message("setting CCACHE to ${CCACHE_FOUND}")
    set(CMAKE_C_COMPILER_LAUNCHER ${CCACHE_FOUND})
    set(CMAKE_CXX_COMPILER_LAUNCHER ${CCACHE_FOUND})
elseif(SCCACHE_FOUND AND NOT CCACHE_FOUND)
    message("setting CCACHE to ${CCACHE_FOUND}")
    set(CMAKE_C_COMPILER_LAUNCHER ${SCCACHE_FOUND})
    set(CMAKE_CXX_COMPILER_LAUNCHER ${SCCACHE_FOUND})
endif(CCACHE_FOUND AND NOT SCCACHE_FOUND)

If both ccache and sccache are found you don't use either? That seems like a bug. Also, if no launcher is set, you might want to consider an else() clause that prints out a STATUS or WARNING level message that build times could be improved with ccache (or sccache, but apparently not both).

As for the tractability of the framework target, it's a little off-topic, and I doubt it would solve the problem of bottlenecking, but I'm currently working on a refactor of my own fork of Sascha's old examples and one thing I found helpful was to break down the framework into a number of components by areas of focus. I'm curious if you're considering the same.

@tomadamatkinson
Copy link
Collaborator

As for the tractability of the framework target, it's a little off-topic, and I doubt it would solve the problem of bottlenecking, but I'm currently working on a refactor of my own fork of Sascha's old examples and one thing I found helpful was to break down the framework into a number of components by areas of focus. I'm curious if you're considering the same.

This is exactly our approach. Please see the components folder. We are slowly breaking things apart. My approach for the shader compiler also moves more things into this folder.

@gpx1000 will be able to explain this logic better than me. I believe it is due to which ccache you need to install if you are on windows or not.

@gpx1000
Copy link
Contributor

gpx1000 commented Feb 6, 2024

FWIW, if we were better about our approach, we'd create a fork, fix it in the fork and point samples at the fork; which achieves the same result of keeping us inspired to push fixes upstream as well as giving users the ability to not see warnings. Also, in the odd case where it's a security issue; our users would have a "fixed" version. To the end users, they'd still get a zero-warning experience.

I think it's reasonable to approach it as an option instead of the core methodology of getting dependencies. I'm remembering that Windows is the only system which lacks a real packaging option. It's better in most cases to not reinvent the wheel as it were.

I don't think it makes a lot of sense to do brew; (chocolatey I believe is the windows equivalent or nuget etc) but using cpack to target various builds would make some sense to me.

I think it's smart to use the system libs if they exist, then fall back to the submodule from git as we currently do. If they have vcpkg support, then use that in place of the system libs. However, I think we should definitely do something.

@gpx1000
Copy link
Contributor

gpx1000 commented Feb 6, 2024

Yeah, that logic was entirely due to how github allows CI to work for sccache and ccache. I think I should revisit that as you're right; if they have both installed, then neither would be used I think. Getting the windows CI to work with sccache on github has been, a challenge.

@jherico
Copy link
Contributor Author

jherico commented Feb 6, 2024

I don't think it makes a lot of sense to do brew; (chocolatey I believe is the windows equivalent or nuget etc) but using cpack to target various builds would make some sense to me.

I think it's important to avoid conflating the two kinds of "package managers". There's the developer-centric package manager where what you want is link-libraries, potentially shared-libraries, header files, etc, vs things like brew and choco which are "package managers" in the sense that they are installing binary tools. Like, neither brew nor choco have an imgui package, because it doesn't provide any end-user binaries.

Platforms like Linux tend to blur the line because they use a single package manager, but then you get two completely different types of packages... libfoo and libfoo-dev which are targeting different audiences.

In my experience, build systems as a whole are moving more and more towards a project declaring its dependencies, then having those dependencies locked down to specific versions, in the name of guaranteed repeatable builds. In the case of language compiling to native binaries, they also tend specifically towards grabbing all of one's dependencies directly as source and building them specifically for the required top level target.

Go and Rust are alike in this way, and Java, Python and JS/Node have all been veering strongly in this direction as well. I've been favoring vcpkg as a tool for build time dependencies because it allows me to most closely approximate this approach, and lately the addition of manifest mode is bringing it even closer. CMake hunter looks like it's on a similar path, but I don't know if they support anything like manifest mode, and as I mentioned, I strongly prefer something I can mess around with directly from the command line for debugging purposes.

I think it's smart to use the system libs if they exist, then fall back to the submodule from git as we currently do. If they have vcpkg support, then use that in place of the system libs.

My main issue with that is you end up with the developer version of "DLL-hell", especially if a given developer needs a specific version of a library for project A and a different version for project B.

This has a the potential to become a huge roadblock for users who don't understand why a build is failing, or a huge support burden for repository managers who have to try to help users figure out which path was taken for each dependency and why it might be failing in some edge case environment. I would strongly prefer either submodule builds or vcpkg to any system that had conditional logic to decide where it got these dependencies. Just my two cents.

@gpx1000
Copy link
Contributor

gpx1000 commented Feb 7, 2024

I understand, there are plenty of examples of using a local build style environment built from the local directory style. Most other languages that do this have reasons that make a lot of sense. It nicely supports having more than one version ultimately installed to assist in development etc. This has a draw back in the sense that a use case for the samples is to provide an example of how projects can work and provide a known good implementation to people wishing to debug. So, it's actually more useful to use the system libraries that are installed for all projects as that gives developers the ability to see the same setup work in the same system etc.

However, it's actually easier with CMake to not bother with difficulties (i.e. use system path and then fall back to local builds). find_package natively looks for the system path first. It then has specific rules for falling back. One can also use the config version of find_package to update per project or per system. As it's the normal method for how CMake works, it nicely avoids dll-hell.

You are making really good points however, we'll have to discuss internally during the next samples call @marty-johnson59 mind putting this in the agenda?

@jherico
Copy link
Contributor Author

jherico commented Feb 9, 2024

So after messing around with the builds for a few more days I wanted to touch on a couple things I was reminded of in terms of what I see as the "burden" of using submodules for dependencies.

First, although the compile times themselves on the CI servers are pretty good due to ccache, the CI projects still spend typically between 1 and 2 minutes just checking out the submodules. Now that I have a proposed fix for the lack of ccache usage on the Windows builds, that 1-2 minutes is a not insignificant fraction of each build job's total time. Additionally, it's worth pointing out that I suspect the bulk of the target audience using this repository on Windows is going to be using MSVC and therefore simply can't get the benefit of caching like that.

Second, there's a maintenance overhead involved. Both the necessity of a VKBP_DISABLE_WARNINGS construct in the source code as well as the size of the third_party/CMakeLists.txt file. In a local branch where I attempted to remove the need for the VKBP_DISABLE_WARNINGS macros, I stumbled on the realization that two of the third party projects both include different versions of the nlohmann json.hpp and they have different versions of the name used for the include guard, so some files end up getting both versions included, and all the "X is already defined" warnings are apparently suppressed by the macro. That's not a "signed/unsigned comparison" warning... that's a serious danger to the stability of the code IMO.

This is a pattern I've seen elsewhere and ultimately you end up with an ever growing third party CMakeLists.txt that converges on being a subset of the features that vcpkg already provides. For instance, vcpkg in general forbids this kind of embedding of library A into the output artifacts of library B, so the vcpkg version of tinygltf explicitly depends on the vcpkg version of nlohmann json, and doesn't embed the header in its own package artifacts. Since Tom is adding an explicit copy of nlohmann json to the offline shader compilation branch, that means that there would potentially be three different versions of the json.hpp header available for inclusion in the sample code, while the vcpkg solution would ensure there's only one, and that everything (tinygltf, hwcpip, and this repo) all depend on a single version of the header.

@SaschaWillems
Copy link
Collaborator

SaschaWillems commented Feb 9, 2024

One thing to consider too are the people that maintain this repo (both over here in public and in private at a different git repo). I feel fine and comfortable wit the way it's setup now (submodules anc cmake) and I have never used e.g. vcpkg. So I would probably have to adopt a new/different build system on top of all the other things that are already quite involved when working on samples esp. for new / unreleased Vulkan functionality. And as noted above, we can't move Vulkan/Volk/OpenCL to that new package manager, which I fear could make things actually worse for some of us.

And tbh. we rarely update thrid party stuff anyway, so from my perspective I'm not sure of changing all of this would be that much beneficial. There are areas with much higher priority, e.g the framework rework.

@jherico
Copy link
Contributor Author

jherico commented Feb 9, 2024

Bear in mind that in the path I'm suggesting, developers who don't care about the third party dependencies and versions wouldn't have to do anything at all. Nobody has to install anything or change their tooling and the CMake script transparently manages using vcpkg.

Additionally, some of the third party dependencies actually seem to need updating. VMA is pointing to a random 4 year old commit that isn't any actual released version of VMA. TinyGLTF could be updated so that at least the include guard would match the one in hwcpipe, preventing any file from ending up including them both.

The only people who would need to know anything about the specifics of vcpkg are people who want to add or update a third party dependency and that's really only if that dependency isn't already in vcpkg. And even for the current projects that aren't, I'm literally offering to do all the work.

As for the issue with unreleased content, there's no reason those can't stay as they are now. There are 19 current packages (22 in Tom's branch) and only 3 that should stay as submodules (plus another 3 that are currently not available in vcpkg). That leaves 13 (16 in Tom's branch) packages that could be moved.

Why don't I make a branch that migrates the existing packages (with the exclusions), get it to pass the CI tests using vcpkg, and you can try it out and tell me if it disrupts your workflow in any way?

@SaschaWillems SaschaWillems added the feature request Request for a new feature or improvement label Feb 10, 2024
@SaschaWillems SaschaWillems added framework This is relevant to the framework build This is relevant to the build system labels Feb 12, 2024
@gary-sweet
Copy link
Contributor

Unless the IHVs are targeting some kind of bespoke CPU / OS, I think it's unlikely to be a problem.

We have our own compiler toolchains for cross-compiling for our set-top-boxes, so I suspect this might actually be a problem.

@SaschaWillems
Copy link
Collaborator

I'm for closing this. I don't think the effort to get this up and running on all platforms is justified.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
build This is relevant to the build system feature request Request for a new feature or improvement framework This is relevant to the framework
Projects
None yet
Development

No branches or pull requests

5 participants