
Gentoo installation - Binary dependency tracking

Although I stopped being a Gentoo developer a few years ago, I haven't completely stopped using Gentoo or thinking about some of the problem domains around it.

One of the more annoying issues I run into is the fairly frequent need for revdep-rebuild, especially on long-lived, slowly updated, maintenance-only systems. Fixing this seems to require more than scanning for breakage after the fact and rebuilding; we need to work around it preemptively.

Library changes and dependency changes are a fact of life. There is no getting around it: no matter what we do or where we go, things change. Maintaining a static system isn't interesting to people using Gentoo anyway; there's Debian stable and RedHat/CentOS for that.

However, this doesn't mean that we can't do things to alleviate this situation, _without_ adding more requirements for developer QA.

A first step would of course be to implement and have a working --as-needed build environment. This is mostly a matter of developer buy-in rather than anything else. My suggestion, however, goes beyond this:

Taking a hint from how rpm does it: after the src.rpm build (the install step), the list of files for an rpm is passed through "find-requires", which scans them for ELF binaries and .so files (using file/ldd/objdump) and extracts the glibc symbol versions required ("Version References" in objdump -p), the .so-file dependencies and the binary dependencies.
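As a rough illustration of this kind of extraction, here is a minimal Python sketch that parses the text printed by objdump -p and pulls out the NEEDED entries and the "Version References" section. The function names and the sample output are made up for illustration; this is not how rpm's find-requires is actually written.

```python
import subprocess


def parse_objdump_p(text):
    """Return (needed_libs, {lib: [required versions]}) from `objdump -p` output."""
    needed = []
    versions = {}
    current = None       # library whose version list we are reading
    in_verref = False    # True while inside "Version References:"
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("NEEDED"):
            needed.append(stripped.split()[1])
        elif stripped == "Version References:":
            in_verref = True
        elif in_verref and stripped.startswith("required from "):
            current = stripped[len("required from "):].rstrip(":")
            versions[current] = []
        elif in_verref and current and stripped:
            # e.g. "0x09691a75 0x00 02 GLIBC_2.2.5": the name is the last field
            versions[current].append(stripped.split()[-1])
        elif in_verref and not stripped:
            # a blank line ends the section
            in_verref = False
            current = None
    return needed, versions


def scan_file(path):
    """Hypothetical helper: run objdump -p on one file and parse the result."""
    out = subprocess.run(["objdump", "-p", path],
                         capture_output=True, text=True, check=True).stdout
    return parse_objdump_p(out)


# Hand-written sample in the shape objdump -p prints (abbreviated).
SAMPLE = """\
Dynamic Section:
  NEEDED               libz.so.1
  NEEDED               libc.so.6
  SONAME               libfoo.so.1

Version References:
  required from libc.so.6:
    0x09691a75 0x00 02 GLIBC_2.2.5
    0x0d696914 0x00 03 GLIBC_2.4
"""

needed, versions = parse_objdump_p(SAMPLE)
print(needed)
print(versions)
```

Parsing tool output like this is fragile; a real implementation would more likely read the ELF dynamic section directly, as scanelf does.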

We can add a similar step just after src_install() in Gentoo, scanning $D for ELF files with scanelf/readelf to build a two-step list for our needs.

First, we build a pair list of the form "file dependency_file" for our package; this list gets shipped inside the installation path list. (Premature optimization: reduce it to only a list of dependencies. Personally, I think we could do more interesting things with data of higher granularity.) Second, we also generate a backwards-resolved list of "dependency file => currently installed package owning it", to be stored for QA reasons.
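The two lists can be sketched in a few lines of Python. This is only a sketch under assumptions: the input dictionaries stand in for the results of the ELF scan and for the package manager's file ownership database, and all names and paths here are hypothetical.

```python
def build_pair_list(deps_by_file):
    """Flatten {file: [dependency_file, ...]} into sorted "file dependency_file" lines."""
    return sorted(f"{f} {d}" for f, deps in deps_by_file.items() for d in deps)


def resolve_owners(deps_by_file, owner_of):
    """Map every dependency file to the installed package owning it (None if unowned)."""
    wanted = {d for deps in deps_by_file.values() for d in deps}
    return {d: owner_of.get(d) for d in sorted(wanted)}


# Stand-in for what scanning $D might produce for a package.
deps_by_file = {
    "/usr/bin/foo": ["/lib/libc.so.6", "/usr/lib/libz.so.1"],
    "/usr/lib/libfoo.so.1": ["/lib/libc.so.6"],
}
# Stand-in for the installed-files database.
owner_of = {
    "/lib/libc.so.6": "sys-libs/glibc",
    "/usr/lib/libz.so.1": "sys-libs/zlib",
}

for line in build_pair_list(deps_by_file):
    print(line)
print(resolve_owners(deps_by_file, owner_of))
```

Keeping the per-file pairs rather than a flat dependency list is what makes the later per-file checks possible.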

** Extra feature **
If we wish to provide an additional QA tool, we can then build an empty graph (system => current package) and map the dependencies we just extracted onto it; if any package that shows up as a dependency is not in the graph, we fail the QA tests due to a hidden dependency.
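A minimal sketch of that check, assuming the declared dependencies are available as a plain {package: [dependencies]} mapping and the owners come from the backwards-resolved list. The function names and package data are hypothetical.

```python
def reachable(dep_graph, roots):
    """All packages reachable from the roots by following declared dependencies."""
    seen = set()
    stack = list(roots)
    while stack:
        pkg = stack.pop()
        if pkg in seen:
            continue
        seen.add(pkg)
        stack.extend(dep_graph.get(pkg, ()))
    return seen


def hidden_dependencies(pkg, dep_graph, system_set, owners):
    """Packages we link against that are neither declared nor part of system."""
    allowed = reachable(dep_graph, [pkg, *system_set])
    return sorted({o for o in owners.values()
                   if o is not None and o not in allowed})


dep_graph = {
    "app-misc/foo": ["sys-libs/zlib"],   # declared dependencies only
    "sys-libs/zlib": [],
}
system_set = ["sys-libs/glibc"]
owners = {
    "/lib/libc.so.6": "sys-libs/glibc",
    "/usr/lib/libz.so.1": "sys-libs/zlib",
    "/usr/lib/libpng.so.3": "media-libs/libpng",   # linked but never declared
}

print(hidden_dependencies("app-misc/foo", dep_graph, system_set, owners))
```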
** Back to normal **

At the same time we also scan through our installed package and extract the _provided_ ELF libraries (and perhaps even function names).

Now, what use does all this extra work give us? Well, for both binary and source installation, we can check at a pre-install step whether our package (binary or source) is unsupportable by the current system: concatenate the list of libraries installed on the system with the list of libraries to be installed, and match this against the list of required libraries. If something is missing, we're in an inconsistent state and would break the to-be-installed package. We then fall back to automatic resolution, or fail to the operator.
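In set terms the pre-install check is a union followed by a difference. A small sketch, with hypothetical names and library lists:

```python
def missing_for_install(required, installed_provided, incoming_provided):
    """Libraries the new package requires that nothing on the system will provide."""
    available = set(installed_provided) | set(incoming_provided)
    return sorted(set(required) - available)


installed = {"libc.so.6", "libz.so.1"}     # global list of provided libraries
incoming = {"libfoo.so.1"}                 # provided by the package being installed
required = {"libc.so.6", "libfoo.so.1", "libpng.so.3"}

missing = missing_for_install(required, installed, incoming)
if missing:
    print("would be broken after install, missing:", missing)
```

An empty result means the package is supportable; a non-empty one is the point at which to try automatic resolution or hand the decision to the operator.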

Likewise, in pkg_prerm() we scan the libraries that are about to be removed (we have that cached since installation time), remove them from the global list of provided libraries, and compare the result with the concatenated list of total dependencies. If we find a mismatch, removing this library would break things, and we can fail neatly.
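The removal check can be sketched the same way. The per-package "provides" and "requires" mappings below are assumed to be the cached data from installation time; their shape and all names are hypothetical.

```python
def broken_by_removal(pkg, provided_by, required_by):
    """Return {consumer: [libraries it would lose]} if pkg were removed."""
    # Everything still provided once pkg is gone.
    remaining = set()
    for other, libs in provided_by.items():
        if other != pkg:
            remaining |= set(libs)
    # Libraries that only pkg provides.
    going = set(provided_by.get(pkg, ())) - remaining
    broken = {}
    for consumer, libs in required_by.items():
        if consumer == pkg:
            continue
        lost = set(libs) & going
        if lost:
            broken[consumer] = sorted(lost)
    return broken


provided_by = {
    "sys-libs/zlib": {"libz.so.1"},
    "sys-libs/glibc": {"libc.so.6"},
}
required_by = {
    "app-misc/foo": {"libz.so.1", "libc.so.6"},
    "app-misc/bar": {"libc.so.6"},
}

print(broken_by_removal("sys-libs/zlib", provided_by, required_by))
```

Only libraries that no other installed package provides are counted as lost, so a second provider of the same library does not trigger a failure.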

For portage to protect against more subtle breakage, we can expand this to run the checks before we overwrite any libraries in pkg_preinst(): scan the current system's dependencies against the files that we are about to overwrite, then do a header comparison across this backwards-expanded list to make sure that we neither break any functions used by existing programs nor overwrite things with incompatible versions. This would then defend us against binary incompatibilities on systems.
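As a simplified stand-in for the header comparison, the sketch below compares exported symbol names only: it reports which existing consumers import a symbol that the replacement library no longer exports. It says nothing about changed signatures or struct layouts, and all names and symbol lists are hypothetical.

```python
def abi_breakage(old_exports, new_exports, imports_by_consumer):
    """Return {consumer: [symbols it uses that the new library drops]}."""
    removed = set(old_exports) - set(new_exports)
    report = {}
    for consumer, symbols in imports_by_consumer.items():
        lost = removed & set(symbols)
        if lost:
            report[consumer] = sorted(lost)
    return report


old_exports = {"foo_init", "foo_run", "foo_old"}   # library currently installed
new_exports = {"foo_init", "foo_run", "foo_new"}   # library about to overwrite it
imports_by_consumer = {
    "/usr/bin/a": {"foo_init", "foo_old"},
    "/usr/bin/b": {"foo_run"},
}

print(abi_breakage(old_exports, new_exports, imports_by_consumer))
```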

So, the costs are paid at build and installation time, and the result works more like a global cache of our installed files. It leaves us with more tools to protect running systems, and both users and developers, against breakage, while also giving us the opportunity to improve QA tooling and availability.

Some references:

rpm: find-requires php:
scripts/find-requires.php
( same tree has mono, perl and python scripts)

And this is the rpm.org main ELF scanner for dependency generation/resolution:
autodeps/linux.req

This has been filed as Gentoo bug #327809
