Weird build dependency issue causing missing symbols

Bills, Jason M jason.m.bills at linux.intel.com
Fri Jul 3 05:58:43 AEST 2020


Hi All,

We are hitting a weird build dependency issue with Yocto and 
phosphor-dbus-interfaces and are looking for any help or insight anyone 
may have on how to fix it.  We have not been able to pinpoint exactly 
when the issue started, but we believe it has come up since the dunfell 
update.

The symptom is an undefined symbol error at runtime:
[  101.733677] Jul 02 10:37:48 intel-obmc phosphor-ledcontroller[461]: phosphor-ledcontroller: symbol lookup error: phosphor-ledcontroller: undefined symbol: _ZN9sdbusplus3xyz15openbmc_project3Led6server8Physical17setPropertyByNameERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEERKSt7variantIJtNS4_7PaletteEhNS4_6ActionEEEb

Once we hit this error, it persists across rebuilds until we delete the 
Yocto build directory (likely something in the cache) and start a fresh 
build.

We have narrowed this down to two separate issues:
1. When phosphor-dbus-interfaces is rebuilt it will sometimes change the 
order of the PropertiesVariant in server.hpp.
2. When the order of PropertiesVariant changes on a rebuild, the recipes 
that already have an old copy of server.hpp are not triggered to rebuild 
and are left with the old copy of server.hpp.
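
The order of the variant's alternatives is part of the function signature, so it is also encoded in the mangled symbol name. Demangling the symbol from the error shows which order the caller was compiled against. A minimal sketch, assuming c++filt from binutils is available on the build host (the function name demangle_symbol is only for illustration):

```shell
#!/bin/sh
# Demangle a C++ symbol so the std::variant alternative order is readable.
# Assumes c++filt (binutils) is installed on the build host.
demangle_symbol() {
    if ! command -v c++filt >/dev/null 2>&1; then
        echo "c++filt not found" >&2
        return 1
    fi
    printf '%s\n' "$1" | c++filt
}
```

Passing it the symbol from the log above prints the full setPropertyByName signature, including the variant alternatives in the order the caller expects.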

I have a system that is in this state and have found that if I taint 
phosphor-dbus-interfaces by running "bitbake -C fetch 
phosphor-dbus-interfaces", I see many components rebuild and the symbol 
issue goes away.  If I then remove the taint by running "bitbake -c 
clean phosphor-dbus-interfaces", only phosphor-dbus-interfaces and any 
components in my devtool status list rebuild, and the symbol issue comes 
back.
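
The two states described above can be reproduced with a pair of helper functions. This is only a sketch: the IMAGE default is a placeholder for whatever target you normally build, and the function names are made up for illustration.

```shell
#!/bin/sh
# IMAGE is a placeholder (assumption); set it to the target you normally build.
IMAGE="${IMAGE:-obmc-phosphor-image}"

# Taint phosphor-dbus-interfaces by forcing it to rerun from do_fetch, then
# rebuild the image. In this state many dependents were seen to rebuild.
build_tainted() {
    bitbake -C fetch phosphor-dbus-interfaces &&
    bitbake "$IMAGE"
}

# Clean the recipe (which, per the description above, removes the taint) and
# rebuild the image. In this state only phosphor-dbus-interfaces and the
# devtool-managed recipes were seen to rebuild.
build_clean() {
    bitbake -c clean phosphor-dbus-interfaces &&
    bitbake "$IMAGE"
}
```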

We ran an experiment where we compared the contents of 
".../Led/Physical/server.hpp" between components by running this command 
(where the base file came from an existing build):
for fname in $(find . -iname server.hpp | grep -i "led/physical"); do
    echo "$fname"
    diff "./tmp/work/arm1176jzs-openbmc-linux-gnueabi/phosphor-network/1.0+gitAUTOINC+d0679f9bb4-r1/recipe-sysroot/usr/include/xyz/openbmc_project/Led/Physical/server.hpp" "$fname"
done

With the tainted phosphor-dbus-interfaces, there is no diff in any of 
the server.hpp files.
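
The comparison loop above can also be written as a small function that prints one status line per copy, which is easier to scan than raw diff output. This is a sketch, not what we actually ran: the function name find_stale_headers is hypothetical, and it assumes a find that supports -ipath (GNU find does).

```shell
#!/bin/sh
# Report every copy of Led/Physical/server.hpp under a build tree and whether
# it matches a reference copy. Arguments: <reference file> [search root].
find_stale_headers() {
    ref=$1
    root=${2:-.}
    find "$root" -type f -ipath '*/Led/Physical/server.hpp' | sort |
    while IFS= read -r fname; do
        if cmp -s "$ref" "$fname"; then
            echo "same   $fname"
        else
            echo "DIFFER $fname"
        fi
    done
}
```

Run it from the build directory with the reference header as the first argument; any line marked DIFFER points at a recipe sysroot holding a different copy of server.hpp.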

After cleaning the taint and rebuilding, I get the following results:
./tmp/work/arm1176jzs-openbmc-linux-gnueabi/phosphor-sel-logger/0.1+gitAUTOINC+761bf202ba-r0/recipe-sysroot/usr/include/xyz/openbmc_project/Led/Physical/server.hpp
./tmp/work/arm1176jzs-openbmc-linux-gnueabi/intel-ipmi-oem/0.1+gitAUTOINC+e4f710d7d9-r0/recipe-sysroot/usr/include/xyz/openbmc_project/Led/Physical/server.hpp
./tmp/work/arm1176jzs-openbmc-linux-gnueabi/phosphor-dbus-interfaces/1.0+gitAUTOINC+26ff1c8446-r1/sysroot-destdir/usr/include/xyz/openbmc_project/Led/Physical/server.hpp
66,67c66
<                 Action,
<                 uint16_t,
---
>                 uint8_t,
69c68,69
<                 uint8_t>;
---
>                 Action,
>                 uint16_t>;
./tmp/work/arm1176jzs-openbmc-linux-gnueabi/phosphor-ipmi-ipmb/0.1+gitAUTOINC+a86059348f-r0/recipe-sysroot/usr/include/xyz/openbmc_project/Led/Physical/server.hpp
./tmp/work/arm1176jzs-openbmc-linux-gnueabi/phosphor-host-postd/0.1+gitAUTOINC+bf002b46d5-r1/recipe-sysroot/usr/include/xyz/openbmc_project/Led/Physical/server.hpp
./tmp/work/arm1176jzs-openbmc-linux-gnueabi/phosphor-network/1.0+gitAUTOINC+d0679f9bb4-r1/recipe-sysroot/usr/include/xyz/openbmc_project/Led/Physical/server.hpp
./tmp/work/arm1176jzs-openbmc-linux-gnueabi/x86-power-control/1.0+gitAUTOINC+b0c613aa88-r0/recipe-sysroot/usr/include/xyz/openbmc_project/Led/Physical/server.hpp
66,67c66
<                 Action,
<                 uint16_t,
---
>                 uint8_t,
69c68,69
<                 uint8_t>;
---
>                 Action,
>                 uint16_t>;
./tmp/work/arm1176jzs-openbmc-linux-gnueabi/phosphor-post-code-manager/1.0+gitAUTOINC+9d91a39a3a-r0/recipe-sysroot/usr/include/xyz/openbmc_project/Led/Physical/server.hpp
./tmp/work/arm1176jzs-openbmc-linux-gnueabi/obmc-ikvm/1.0+gitAUTOINC+861337e8ec-r0/recipe-sysroot/usr/include/xyz/openbmc_project/Led/Physical/server.hpp
./tmp/work/arm1176jzs-openbmc-linux-gnueabi/service-config-manager/0.1+gitAUTOINC+83241c09ec-r0/recipe-sysroot/usr/include/xyz/openbmc_project/Led/Physical/server.hpp
./tmp/work/arm1176jzs-openbmc-linux-gnueabi/phosphor-ipmi-kcs/1.0+gitAUTOINC+d8594e9a62-r1/recipe-sysroot/usr/include/xyz/openbmc_project/Led/Physical/server.hpp
./tmp/sysroots-components/arm1176jzs/phosphor-dbus-interfaces/usr/include/xyz/openbmc_project/Led/Physical/server.hpp
66,67c66
<                 Action,
<                 uint16_t,
---
>                 uint8_t,
69c68,69
<                 uint8_t>;
---
>                 Action,
>                 uint16_t>;

The order of the variant changed in server.hpp in 
phosphor-dbus-interfaces.  I had x86-power-control in my devtool status 
list, so it rebuilt and got the new copy of server.hpp, but everything 
else still had the old copy.

Does anyone have any ideas on what could be happening or if we're 
missing something to properly trigger the rebuilds?

Thanks for your help!
-Jason

