Dockerfile: install build dependencies #423
pjonsson wants to merge 3 commits into davidfrantz:develop
Building in one image and running in a different image can lead to strange errors, so base them on the same internal base image to ensure that the build image and the result image always come from the same image.
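A minimal sketch of that layout, not the actual Dockerfile from this PR. The `BASE_IMAGE` argument, its default value, the stage names `internal_base`, `force_builder` and `force`, and the build command and install path are all hypothetical placeholders:

```dockerfile
# Hypothetical base image reference; the default value is a placeholder.
ARG BASE_IMAGE=example/base_image:latest

# Internal stage that both later stages derive from.
FROM ${BASE_IMAGE} AS internal_base

# Build stage: compiles on top of the shared base.
FROM internal_base AS force_builder
WORKDIR /build
COPY . .
# Placeholder build command and install prefix, not FORCE's real ones.
RUN make && make install PREFIX=/opt/force

# Result stage: same base as the build stage, so the shared libraries
# present at run time are the ones the binaries were linked against.
FROM internal_base AS force
COPY --from=force_builder /opt/force /opt/force
```

Because both stages start from `internal_base`, a change to the base image reference cannot update one stage without the other.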
Copy the list of build dependencies from base_image and install them in the build container. This will enable removing the build dependencies from base_image, and that will eliminate hundreds of complaints from Trivy, including the current CVE-2026-23112 which is marked as high severity. Refs davidfrantz#415
    ARG build=all

    # Refresh package list & upgrade existing packages
    RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
This step takes 10 seconds right now, but will take a bit longer if more packages have to be installed. For local builds you normally hit the docker cache so this step rarely executes, and when it does the odds are still quite good that many of the latest packages will be present in the cache-mount.
I'm fairly sure this list can also be trimmed if it's just FORCE's build dependencies we need to install, but it's easier to do that after the build dependencies are no longer part of base_image.
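For reference, a sketch of how an apt cache mount is commonly set up with BuildKit, following the pattern in Docker's own documentation. The `ubuntu:24.04` base and the `gcc g++` package list are assumptions for illustration; the step in this PR installs a longer list on a different base:

```dockerfile
# syntax=docker/dockerfile:1
FROM ubuntu:24.04

# Ubuntu images ship an apt hook (docker-clean) that deletes downloaded
# packages after each install. Remove it so the cache mount keeps them.
RUN rm -f /etc/apt/apt.conf.d/docker-clean && \
    echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' \
      > /etc/apt/apt.conf.d/keep-cache

# The cache mounts persist across builds but are not part of the image,
# so a re-run of this step can reuse already downloaded .deb files.
RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
    --mount=type=cache,target=/var/lib/apt,sharing=locked \
    apt-get update && \
    apt-get upgrade -y && \
    apt-get install -y --no-install-recommends gcc g++
```

`sharing=locked` makes concurrent builds wait for each other instead of using the apt cache at the same time.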
The output from a Docker build might contain outdated Ubuntu packages, so get the latest packages installed. Refs davidfrantz#415
Hi Peter, I don't fully understand the nature of this PR. Why would I want to re-install packages in the FORCE container? If we reinstall everything here, would there be any point in using the base image at all? The list you copied includes tools that need to be available in the final FORCE container (e.g., rename, parallel, lockfile-progs). Also, some of the libraries (e.g., gsl or jansson) are dynamically linked, so removing them again won't work. Is the main purpose to avoid storing certain packages in any of the containers? Which ones are the crucial packages?
I'm trying to only install runtime dependencies (in a wide sense) of FORCE and the Python/R packages in the FORCE image that people end up running. My personal motivation is that Trivy keeps shouting about security vulnerabilities in the development packages, so if a real security vulnerability shows up it's easy to miss that since the Trivy report is always orange/red. People who don't have a security scanner also benefit since they get a smaller image.
The base image contains the Python packages, the R packages, and OpenCV. All of those take a fair amount of time to build, so I think the base image still serves a purpose. But if your point was that the base image offers less benefit when we have to install development packages in the build image here anyway, I completely agree with you.
For the particular examples, it sounds like the base image should install rename/parallel/lockfile-progs as well as the libgsl and libjansson packages, and the build stage for FORCE would then install the libgsl-dev and libjansson-dev packages to get the headers/static libraries required to build FORCE. I haven't yet sorted out exactly which dependencies are required where, though. I just copied the entire list so this CI would continue to work, and unnecessary packages can then be trimmed from the base image gradually.
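A rough sketch of that split, under stated assumptions: the `ubuntu:24.04` base and the stage names are hypothetical, and `libgsl27` and `libjansson4` are assumed to be the runtime package names for the Ubuntu release in use (the versioned names differ between releases and should be checked):

```dockerfile
# Stands in for base_image: tools and shared libraries needed at run time.
FROM ubuntu:24.04 AS base
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
      rename parallel lockfile-progs \
      libgsl27 libjansson4 && \
    rm -rf /var/lib/apt/lists/*

# Build stage: adds compilers plus the -dev packages that provide the
# headers and static libraries. None of this reaches the final image.
FROM base AS build
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
      gcc g++ make libgsl-dev libjansson-dev

# Final image: derives from base only, so it carries the runtime
# libraries but no development packages.
FROM base AS runtime
```

The `-dev` packages depend on the matching runtime libraries, so installing them in the build stage does not conflict with what the base already provides.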
I'm guessing you know that better than I do, but I can partially answer the inverse question: the |
OK, I think I mostly understand the reasoning now. Would it help if I annotate how and where each installed package is being used?
We actually have that kind of annotation in our internal Dockerfiles, at least for the non-obvious stuff (gcc/g++ are obviously required for compiling anything in C/C++, but knowing everything that requires libpq-dev to build is a bit less obvious, and sometimes things build just fine without a particular dev library, but then the feature is missing in the binary). My experience is that the annotations tend to get outdated over time, so they aren't always 100% correct, but they at least give an idea of where something initially came from. The annotations would give me some guidance on which packages should come from the base image and don't need to be installed in the build container here. Unfortunately, I don't have a clever way to provide the development tools installed in the base image without them also ending up in the runner image here.
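One possible shape for such annotations, as an illustration only. The base image and stage name are hypothetical, and the stated reasons simply restate what was said earlier in this thread rather than a verified dependency audit:

```dockerfile
FROM ubuntu:24.04 AS build
# Comment lines inside a continued RUN are stripped by the Dockerfile
# parser, so each package group can carry its own note.
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
      # gcc/g++: compile the C/C++ sources
      gcc g++ \
      # libgsl-dev: GSL headers; the library is linked dynamically
      libgsl-dev \
      # libjansson-dev: jansson headers; the library is linked dynamically
      libjansson-dev
```

Keeping each note directly above the package it describes makes it more likely that the note is updated or deleted together with the package.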
Copy the list of build dependencies from base_image and install them in the build container. This will enable removing the build dependencies from base_image, and that will eliminate hundreds of complaints from Trivy, including the current CVE-2026-23112 which is marked as high severity.

Edit: and do an `apt-get upgrade` for the packages in the output image as well to get the latest security fixes.

Refs #415