Permit mounting via volumes-from by passing orchestrator ID #924
base: master
We are facing the same issue: Our application has no access to the Docker host, only to the daemon itself via the remote API. With this approach we could create a codeclimate container, copy all files to 👍 Would love to see this PR merged in the near future.
@efueger I think this is worth looking at, WDYT?
@cheald I can't find
I guess the image was built locally from this PR's branch and tagged
c1c0b00 to e5d3483
@cheald Could you rebase?
Hi @floh96 👋🏼! Sorry, and yes, I can make some room to review this one once it's rebased.
tl;dr This helps CodeClimate engines not need intimate Docker host knowledge.

In contexts like self-hosted Gitlab, we sometimes have an invoking runner, such as Gitlab CI running the Docker executor, which exposes the Docker socket to the running job so that the job may invoke its own Docker jobs on the host. Gitlab's top-level job will set up some filesystem context (/builds, mounted as a Docker volume, in the Gitlab case).

Right now, Gitlab can only support CodeClimate in a Docker-in-Docker runner, because CodeClimate performs volume mounting for the individual engines via Docker's --volume flag, which mounts not the path from the invoking container, but rather a path on the Docker host. This requires that the path passed to CodeClimate as the CODECLIMATE_CODE variable match the real host path, and in the Gitlab CI case, we don't want that, so we have to "hide" the host with a DinD approach. However, this means that we also don't get any layer caching between jobs, which makes running CodeClimate prohibitively expensive, as all the engines etc. have to be downloaded for each job.

By supporting Docker's `volumes-from` mounting option, we can instead tell the engines to inherit any mounts from the invoking orchestrator. This permits CodeClimate to let the top-level context set up a Docker volume, bind it to the orchestrator, and then let the orchestrator pass that to invoked children. This sidesteps the issue of the engines needing to know the actual host path; as long as the orchestrator's /code directory is mounted, the children can just presume to use it as-is.

To accomplish this, we just a) name the top-level container, and b) pass that name via the CODECLIMATE_ORCHESTRATOR env var:

```shell
docker run \
  --interactive --tty --rm \
  --name codeclimate_orchestrator \
  --env CODECLIMATE_ORCHESTRATOR="codeclimate_orchestrator" \
  --env CODECLIMATE_CODE="/code" \
  --volume "$PWD":/code \
  --volume /var/run/docker.sock:/var/run/docker.sock \
  --volume /tmp/cc:/tmp/cc \
  codeclimate/codeclimate-wrapped analyze
```

In the bare-metal case, this doesn't change anything - we're mounting the real host path, which then gets passed to the individual children mounted on the /code mount.

While not immediately pertinent to the CodeClimate PR, in Gitlab, we can invoke the Gitlab codequality image like so:

```yaml
script:
  - CONTAINER_ID=$(docker ps -q -f "label=com.gitlab.gitlab-runner.job.id=${CI_JOB_ID}")
  - BUILDS_VOLUME_ID=$(docker inspect $CONTAINER_ID --format '{{ range .Mounts }}{{ if eq .Destination "/builds" }}{{ .Name }}{{ end }}{{ end }}')
  - docker run --rm
      --name "codeclimate_orchestrator_${CI_JOB_ID}"
      --env SOURCE_CODE="/code"
      --env CODECLIMATE_VERSION="volumes-from"
      --env ORCHESTRATOR_ID="codeclimate_orchestrator_${CI_JOB_ID}"
      --volume /var/run/docker.sock:/var/run/docker.sock
      --volume "${BUILDS_VOLUME_ID}":/code
      codequality:orch /code
```

("volumes-from" is my local Docker image for the altered CodeClimate build, and "codequality:orch" is my altered Gitlab codequality image)

Because this job _must_ be executed in a context that is visible to Docker, we can query Docker to get the current job's container ID, and from there get the volume ID mounted as `/builds`. We then volume mount that volume as /code, and specify /code as the "host" location of our code to be evaluated. The orchestrator will use the passed volume as /code, which is then passed on to the engine jobs, allowing the entire process to run against an ephemeral Docker volume rather than requiring a known path on the host.
e5d3483 to ee7c1d9
Heh, wow, not often I see a 3 year old PR get necroed. Sure, I went ahead and rebased my branch onto master. It did rebase cleanly, but I didn't run any tests locally - I guess we'll see if CI still likes it!
fyi @fede-moya it's rebased
tl;dr This helps CodeClimate engines not need intimate Docker host knowledge, which permits the usage of CodeClimate outside of docker-in-docker setups. In particular, this makes it easy to run CodeClimate checks in Gitlab while retaining Docker layer caching, vastly improving the runtime of each build.
In contexts like self-hosted Gitlab, we sometimes have an invoking runner, such as Gitlab CI running the Docker executor, which exposes the Docker socket to the running job so that the job may invoke its own Docker jobs on the host. Gitlab's top-level job will set up some filesystem context (/builds, mounted as a Docker volume, in the Gitlab case).
Right now, Gitlab can only support CodeClimate in a Docker-in-Docker runner, because CodeClimate performs volume mounting for the individual engines via Docker's --volume flag, which mounts not the path from the invoking container, but rather a path on the docker host. This requires that the path passed to CodeClimate as the CODECLIMATE_CODE variable match the real host path, and in the Gitlab CI case, we don't want that, so we have to "hide" the host with a DinD approach. However, this means that we also don't get any layer caching between jobs, which makes running CodeClimate prohibitively expensive, as all the engines etc have to be downloaded for each job.
By supporting Docker's `volumes-from` mounting option, we can instead tell the engines to inherit any mounts from the invoking orchestrator. This permits CodeClimate to let the top-level context set up a Docker volume, bind it to the orchestrator, and then let the orchestrator pass that to invoked children. This sidesteps the issue of the engines needing to know the actual host path; as long as the orchestrator's /code directory is mounted, the children can just presume to use it as-is.
To accomplish this, we just a) name the top-level container, and b) pass that name via the CODECLIMATE_ORCHESTRATOR env var, as in the `docker run` command shown earlier in this thread.
In the bare-metal case, this doesn't change anything - we're mounting the real host path, which then gets passed to the individual children mounted on the /code mount.
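The choice between the two mount styles can be sketched as a small shell function. This is only an illustration under stated assumptions, not the code from this PR: the helper name `engine_mount_args` is hypothetical, and it assumes engines read the source tree from /code, as described above.

```shell
#!/bin/sh
# Hypothetical helper (not part of CodeClimate): prints the docker mount
# flags an engine container would be started with.
# Assumption: engines expect the code under /code.
engine_mount_args() {
  if [ -n "${CODECLIMATE_ORCHESTRATOR:-}" ]; then
    # Orchestrator name given: inherit all mounts of that container,
    # so no host path needs to be known.
    printf '%s\n' "--volumes-from ${CODECLIMATE_ORCHESTRATOR}"
  else
    # No orchestrator name: bind-mount a path as the Docker host sees it.
    printf '%s\n' "--volume ${CODECLIMATE_CODE}:/code"
  fi
}
```

With CODECLIMATE_ORCHESTRATOR unset, the function falls back to the host-path bind mount, which matches the bare-metal behaviour described above. The printed flags could then be spliced into a `docker run` call for an engine image.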
While not immediately pertinent to the CodeClimate PR, in Gitlab we can invoke the Gitlab codequality image from the job's script, as in the snippet shown earlier in this thread.
Because this job must be executed in a context that is visible to Docker, we can query Docker to get the current job's container ID, and from there get the volume ID mounted as `$CI_BUILDS_DIR`. We then volume mount that volume as /code, and specify /code as the "host" location of our code to be evaluated. The orchestrator will use the passed volume as /code, which is then passed on to the engine jobs, allowing the entire process to run against an ephemeral Docker volume rather than requiring a known path on the host.
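If the volume lookup comes back empty, the later `docker run` would receive an empty volume name, so it may be worth failing early. A minimal sketch of such a check follows; the helper name `volume_for_destination` and its input format (one "destination name" pair per line) are assumptions for illustration and are not part of this PR.

```shell
#!/bin/sh
# Hypothetical helper: reads "<destination> <volume-name>" lines on stdin
# and prints the name of the volume mounted at the destination given as $1.
# Exits non-zero when no named volume is mounted there (bind mounts have
# an empty name and are skipped).
volume_for_destination() {
  awk -v dest="$1" '
    $1 == dest && $2 != "" { print $2; found = 1; exit }
    END { exit !found }
  '
}
```

The input lines could be produced with something like `docker inspect "$CONTAINER_ID" --format '{{ range .Mounts }}{{ println .Destination .Name }}{{ end }}'`, and a job script could then abort when `volume_for_destination /builds` returns a non-zero status.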