releng: better dynamic library verification #10691
After looking through all my old PRs/branches I pieced together why I wanted to move verify-libraries back to build-and-test: I had intended to save a bunch of time in releng using the then-new I still want to propose this change as-is, to fix #7149, and then figure out if we should kick
papertigers
left a comment
It took me a bit to fully follow what was happening here, since I have never looked at the releng job system before. Just so I follow completely: this always checks the recovery and host libraries, but only verifies debug binaries if `--verify-debug-libraries` is passed?
Otherwise this looks good to me, thanks for doing this work.
Closes #7149.
Prior to this change we ran the `verify-libraries` xtask on release binaries. This seemed like the optimal choice at the time, but I realized that we were no longer testing that debug binaries required only the expected dynamic libraries (which ended up regressing for a bit), and because of the differences in how omicron-package and the `verify-libraries` xtask compiled the software, we weren't even testing the actual binaries we shipped. More details in #7149.
This adds two new tasks to releng, `host-libraries` and `recovery-libraries`, which run the verify-libraries check on all the binaries produced by omicron-package. The `verify-libraries` releng task now compiles and checks debug binaries after all of the `cargo build` tasks needed for the repo are done.
When approaching this I was initially going to move `cargo xtask verify-libraries` back to the build-and-test job, because I was annoyed that local runs of `cargo xtask releng` were running a lengthy check I didn't care about. However, that would have added about 6-8 minutes to the runtime of that job (the bins can't be built at the same time as the tests, otherwise the bins end up being linked against test-only libraries that pull in dynamic libraries we explicitly don't want). The whole reason it was moved to this task in the first place is that it was basically "free": the second half-ish of the releng process is building OS images, which is mostly network/disk IO and barely uses any CPU. So instead of moving it I added an off-by-default option to enable this check, and added it to the Buildomat job definition.
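For readers unfamiliar with this kind of check, here is a rough sketch of the two ideas described above: comparing a binary's required dynamic libraries against an allowlist, and gating the debug check behind an off-by-default flag. This is not the code in this PR. The `Options` struct, the function names, and the library names are all hypothetical, and the set of checks that always run assumes the reviewer's reading (host and recovery always, debug only with `--verify-debug-libraries`) is correct.

```rust
use std::collections::BTreeSet;

/// Hypothetical options for a releng-style run. Only the flag name comes
/// from the discussion; the struct and parsing are illustrative.
#[derive(Debug, Default, PartialEq)]
struct Options {
    /// Off by default, so local runs skip the lengthy debug check.
    verify_debug_libraries: bool,
}

/// Minimal flag parsing: anything other than the one flag is ignored.
fn parse_options<I, S>(args: I) -> Options
where
    I: IntoIterator<Item = S>,
    S: AsRef<str>,
{
    let mut opts = Options::default();
    for arg in args {
        if arg.as_ref() == "--verify-debug-libraries" {
            opts.verify_debug_libraries = true;
        }
    }
    opts
}

/// Names of the library checks to run for the given options.
/// Assumption: the host and recovery checks always run, and the debug
/// check runs only when explicitly requested.
fn checks_to_run(opts: &Options) -> Vec<&'static str> {
    let mut checks = vec!["host-libraries", "recovery-libraries"];
    if opts.verify_debug_libraries {
        checks.push("verify-libraries");
    }
    checks
}

/// Return every library a binary needs that is not on the allowlist,
/// sorted and de-duplicated. An empty result means the binary passes.
fn unexpected_libraries(needed: &[&str], allowed: &[&str]) -> Vec<String> {
    let allowed: BTreeSet<&str> = allowed.iter().copied().collect();
    let extra: BTreeSet<&str> = needed
        .iter()
        .copied()
        .filter(|lib| !allowed.contains(lib))
        .collect();
    extra.into_iter().map(String::from).collect()
}
```

In a sketch like this, a CI job definition would pass `--verify-debug-libraries` explicitly, while a plain local run would only get the two always-on checks.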