The incomplete result of mapper GetSubTree/Paths

Ed Tanous edtanous at google.com
Tue May 24 02:19:34 AEST 2022


On Thu, May 19, 2022 at 8:14 PM Lei Yu <yulei.sh at bytedance.com> wrote:

> This email is to describe an issue in mapper that the
> GetSubTree/GetSubTreePaths could return incomplete results when it's
> doing introspect.
>
> Steps to reproduce the issue:
> 1. Configure phosphor-logging to get max 1000 entries. (with
> -Derror_info_cap=1000 meson option)
> 2. Create 1000 logging entries.
> 3. Call GetSubTreePaths and make sure it gets the correct 1000 entries:
>     # busctl call "xyz.openbmc_project.ObjectMapper"
> "/xyz/openbmc_project/object_mapper"
> "xyz.openbmc_project.ObjectMapper" GetSubTreePaths sias
> /xyz/openbmc_project/logging/entry 0 1
> xyz.openbmc_project.Logging.Entry | awk '{print $2;}'
>     1000
> 4. Restart logging service
>     # systemctl restart xyz.openbmc_project.Logging.service
> 5. After the service is restarted, call GetSubTreePaths for multiple
> times in the short time (e.g. within 10 seconds)
>     # busctl call "xyz.openbmc_project.ObjectMapper"
> "/xyz/openbmc_project/object_mapper"
> "xyz.openbmc_project.ObjectMapper" GetSubTreePaths sias
> /xyz/openbmc_project/loggiz.openbmc_project.Logging.Entry | awk
> '{print $2;}'
>     47
>     # busctl call "xyz.openbmc_project.ObjectMapper"
> "/xyz/openbmc_project/object_mapper"
> "xyz.openbmc_project.ObjectMapper" GetSubTreePaths sias
> /xyz/openbmc_project/loggiz.openbmc_project.Logging.Entry | awk
> '{print $2;}'
>     375
>     # busctl call "xyz.openbmc_project.ObjectMapper"
> "/xyz/openbmc_project/object_mapper"
> "xyz.openbmc_project.ObjectMapper" GetSubTreePaths sias
> /xyz/openbmc_project/loggiz.openbmc_project.Logging.Entry | awk
> '{print $2;}'
>     851
>     # busctl call "xyz.openbmc_project.ObjectMapper"
> "/xyz/openbmc_project/object_mapper"
> "xyz.openbmc_project.ObjectMapper" GetSubTreePaths sias
> /xyz/openbmc_project/loggiz.openbmc_project.Logging.Entry | awk
> '{print $2;}'
>     1000
>
> We can see that the result of GetSubTreePaths is increasing until it gets
> 1000.
> This actually happens when mapper is doing introspect of the logging
> service, and getting more and more objects.

The above "incomplete" behavior will impact the logic that depends on
> the result of GetSubTreePaths.
> E.g. in ipmid, the "cached SEL" feature depends on the reliable result
> of GetSubTreePath, to get the number of current logging entries.


I'm not really following why this is a problem, and sounds like a bug in
the cached SEL feature.  Dbus objects can get added or removed at any time,
including logging objects, if the cached sel feature isn't handling all the
added/removed signals as it should, that really should be fixed.  Can you
go into more detail about what this actually causes?  Is it that logging
objects are getting added out of order?


> If
> it's not correct, ipmid will not know the "missed" entries.
>

What is a "missed" entry in this context?


>
> The question is, should we make sure mapper returns the "stable"
> result in the above case?
>

I don't think so, at least, when it was built, it wasn't designed to return
atomically created results, but given that ANY object on dbus can be added
or removed at any time, it's not clear why this is a problem in the mapper
itself.


> When it's doing introspect of a service (e.g. nameOwnerChanged), it
> could throw if the service is not fully introspected, and only return
> the "correct" result after the service is fully introspected.


> If mapper could not guarantee the stable result, the service calling
> mapper will have to add more complex logic to make sure it gets the
> "full and correct" result.
>

I left comments on your code review as well, but please don't do this.
This is going to force retry code into EVERY service that relies on the
mapper, at EVERY mapper call site.  If the current behavior is truly a
problem, I would much rather the mapper support multi-versioned atomically
create copies of the dbus-map, and have the mapper expose the "reliable"
interface, rather than spreading that throughout the system.  For context,
bmcweb alone has dozens of call sites to the mapper that would need to be
fixed if it were made to be possible for it to fail in this way.


>
> --
> BRs,
> Lei YU
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.ozlabs.org/pipermail/openbmc/attachments/20220523/ecf5b307/attachment-0001.htm>


More information about the openbmc mailing list