Weird build dependency issue causing missing symbols
Bills, Jason M
jason.m.bills at linux.intel.com
Fri Jul 3 05:58:43 AEST 2020
Hi All,
We are hitting a weird build dependency issue with Yocto and
phosphor-dbus-interfaces and are looking for any help or insight anyone
may have on how to fix it. We have not been able to pinpoint exactly
when the issue started, but we believe it has come up since the dunfell
update.
The symptom is an undefined symbol error at runtime:
[ 101.733677] Jul 02 10:37:48 intel-obmc phosphor-ledcontroller[461]: phosphor-ledcontroller: symbol lookup error: phosphor-ledcontroller: undefined symbol: _ZN9sdbusplus3xyz15openbmc_project3Led6server8Physical17setPropertyByNameERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEERKSt7variantIJtNS4_7PaletteEhNS4_6ActionEEEb
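For anyone reading the symbol: the order of the variant alternatives is encoded in the mangled name (the "St7variantIJt...7Palette...h...6Action" part). A caller and a library built against differently ordered copies of server.hpp would therefore refer to different symbols. A minimal sketch for demangling it, assuming binutils' c++filt is installed on the build host (demangle_symbol is only an illustrative helper name):

```shell
#!/bin/sh
# Hypothetical helper: demangle one C++ symbol with c++filt (binutils),
# which reads mangled names on stdin and prints the readable signature.
demangle_symbol() {
    printf '%s\n' "$1" | c++filt
}

# The missing symbol from the log line above.
sym='_ZN9sdbusplus3xyz15openbmc_project3Led6server8Physical17setPropertyByNameERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEERKSt7variantIJtNS4_7PaletteEhNS4_6ActionEEEb'

# Only attempt the call when c++filt is actually present.
if command -v c++filt >/dev/null 2>&1; then
    demangle_symbol "$sym"
fi
```

The demangled signature lists the std::variant alternatives in the order the caller was compiled with, which can be compared against the PropertiesVariant order in the installed server.hpp.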
Once we hit this error, it persists across rebuilds until we delete the
Yocto build directory (likely something in the cache) and start a fresh
build.
We have narrowed this down to being caused by two separate issues:
1. When phosphor-dbus-interfaces is rebuilt it will sometimes change the
order of the PropertiesVariant in server.hpp.
2. When the order of PropertiesVariant changes on a rebuild, the recipes
that already have an old copy of server.hpp are not triggered to rebuild
and are left with the old copy of server.hpp.
I have a system that is in this state and have found that if I taint
phosphor-dbus-interfaces by running "bitbake -C fetch
phosphor-dbus-interfaces", many components rebuild and the symbol
issue goes away. If I then remove the taint by running "bitbake -c
clean phosphor-dbus-interfaces" and rebuild, only
phosphor-dbus-interfaces and the components in my devtool status list
rebuild, and the symbol issue comes back.
We ran an experiment where we compared the contents of
".../Led/Physical/server.hpp" between components by running this command
(where the base file came from an existing build):
for fname in $(find . -iname server.hpp | grep -i "led/physical"); do
    echo "$fname"
    diff "./tmp/work/arm1176jzs-openbmc-linux-gnueabi/phosphor-network/1.0+gitAUTOINC+d0679f9bb4-r1/recipe-sysroot/usr/include/xyz/openbmc_project/Led/Physical/server.hpp" "$fname"
done
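A checksum-based variant of the same check may be easier to read on a large build tree, since it groups identical copies instead of printing one diff per file. This is only a sketch: it assumes GNU find and sha256sum are available, and count_header_versions is a hypothetical helper name, not part of any OpenBMC or Yocto tooling.

```shell
#!/bin/sh
# Hypothetical helper: hash every Led/Physical/server.hpp copy below the
# directory given as $1 and print "<count> <sha256>" per distinct content.
count_header_versions() {
    find "$1" -type f -ipath '*/led/physical/server.hpp' \
        -exec sha256sum {} + |
        awk '{ print $1 }' |      # keep only the hash column
        sort | uniq -c |          # count identical hashes
        awk '{ print $1, $2 }'    # normalise to "<count> <hash>"
}
```

Running "count_header_versions ./tmp/work" from the build directory prints one line per distinct version of the header. More than one line means some recipe sysroots still hold a different copy than the others.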
With the tainted phosphor-dbus-interfaces, there is no diff in any of
the server.hpp files.
After cleaning the taint and rebuilding, I get the following results:
./tmp/work/arm1176jzs-openbmc-linux-gnueabi/phosphor-sel-logger/0.1+gitAUTOINC+761bf202ba-r0/recipe-sysroot/usr/include/xyz/openbmc_project/Led/Physical/server.hpp
./tmp/work/arm1176jzs-openbmc-linux-gnueabi/intel-ipmi-oem/0.1+gitAUTOINC+e4f710d7d9-r0/recipe-sysroot/usr/include/xyz/openbmc_project/Led/Physical/server.hpp
./tmp/work/arm1176jzs-openbmc-linux-gnueabi/phosphor-dbus-interfaces/1.0+gitAUTOINC+26ff1c8446-r1/sysroot-destdir/usr/include/xyz/openbmc_project/Led/Physical/server.hpp
66,67c66
< Action,
< uint16_t,
---
> uint8_t,
69c68,69
< uint8_t>;
---
> Action,
> uint16_t>;
./tmp/work/arm1176jzs-openbmc-linux-gnueabi/phosphor-ipmi-ipmb/0.1+gitAUTOINC+a86059348f-r0/recipe-sysroot/usr/include/xyz/openbmc_project/Led/Physical/server.hpp
./tmp/work/arm1176jzs-openbmc-linux-gnueabi/phosphor-host-postd/0.1+gitAUTOINC+bf002b46d5-r1/recipe-sysroot/usr/include/xyz/openbmc_project/Led/Physical/server.hpp
./tmp/work/arm1176jzs-openbmc-linux-gnueabi/phosphor-network/1.0+gitAUTOINC+d0679f9bb4-r1/recipe-sysroot/usr/include/xyz/openbmc_project/Led/Physical/server.hpp
./tmp/work/arm1176jzs-openbmc-linux-gnueabi/x86-power-control/1.0+gitAUTOINC+b0c613aa88-r0/recipe-sysroot/usr/include/xyz/openbmc_project/Led/Physical/server.hpp
66,67c66
< Action,
< uint16_t,
---
> uint8_t,
69c68,69
< uint8_t>;
---
> Action,
> uint16_t>;
./tmp/work/arm1176jzs-openbmc-linux-gnueabi/phosphor-post-code-manager/1.0+gitAUTOINC+9d91a39a3a-r0/recipe-sysroot/usr/include/xyz/openbmc_project/Led/Physical/server.hpp
./tmp/work/arm1176jzs-openbmc-linux-gnueabi/obmc-ikvm/1.0+gitAUTOINC+861337e8ec-r0/recipe-sysroot/usr/include/xyz/openbmc_project/Led/Physical/server.hpp
./tmp/work/arm1176jzs-openbmc-linux-gnueabi/service-config-manager/0.1+gitAUTOINC+83241c09ec-r0/recipe-sysroot/usr/include/xyz/openbmc_project/Led/Physical/server.hpp
./tmp/work/arm1176jzs-openbmc-linux-gnueabi/phosphor-ipmi-kcs/1.0+gitAUTOINC+d8594e9a62-r1/recipe-sysroot/usr/include/xyz/openbmc_project/Led/Physical/server.hpp
./tmp/sysroots-components/arm1176jzs/phosphor-dbus-interfaces/usr/include/xyz/openbmc_project/Led/Physical/server.hpp
66,67c66
< Action,
< uint16_t,
---
> uint8_t,
69c68,69
< uint8_t>;
---
> Action,
> uint16_t>;
The order of the variant changed in server.hpp in
phosphor-dbus-interfaces. I had x86-power-control in my devtool status
list, so it rebuilt and got the new copy of server.hpp, but everything
else still had the old copy.
Does anyone have any ideas on what could be happening or if we're
missing something to properly trigger the rebuilds?
Thanks for your help!
-Jason