sdbusplus: asio: Hang problem with asio::connection using new_method_call
Zhang Jian
zhangjian.3032 at bytedance.com
Tue Nov 29 00:55:53 AEDT 2022
Hi team,
I ran into a problem when using the synchronous dbus call
`new_method_call` via `asio::connection`: the dbus server sometimes
fails to receive and respond to a request.
I found that this happens when a dbus request arrives while the `asio`
service is in the middle of a synchronous call. In that case the dbus
server does not receive the request. If another dbus request then
arrives, the dbus server handles both requests at the same time.
After debugging, I found that the `socket.async_read_some` callback[0]
is not called, but I don't know why.
[0]: https://github.com/openbmc/sdbusplus/blob/master/include/sdbusplus/asio/connection.hpp#L324
I wrote a simple test case to reproduce this problem.
https://github.com/zhangjian3032/bug_simple/blob/master/case1/README.md
1. Build this case
```
meson build
ninja -C build
```
2. Run the dbus server `fake_server_bar`
This bug happens while a dbus server is making a synchronous call, so we
need to run a second dbus server that only provides a service to be
called. To keep the caller inside the synchronous call for a while, this
server sleeps for 1 second on every dbus request (the problem can
actually happen at any time; 1 second just makes it easy to reproduce).
```
~# ./build/case1/fake_server_bar
```
3. Run the dbus server `fake_server`
This server has a timer that makes a synchronous call to the dbus server
`fake_server_bar` every 5 seconds.
```
# using a new terminal.
~# ./build/case1/fake_server
```
4. Run the dbus client `fake_client`
This client sends a dbus request to the dbus server `fake_server` every
10 ms. Wait about a minute and you will see the bug: the client hangs
and then gets a timeout exception.
```
# using a new terminal.
~# ./build/case1/fake_client
```
5. Optional: Run the dbus client `busctl --user get-property com.foo
/com/foo com.foo Foo`
When the bug happens, running this command gets the client out of the
hang and stops the timeout exceptions.
```
# using a new terminal.
busctl --user get-property com.foo /com/foo com.foo Foo
```
# Others
Because the socket callback `socket.async_read_some` is not called, I
tried running `read_immediate` after the sync call as a workaround.
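To illustrate why running `read_immediate` after the sync call might
help, here is a minimal stand-alone sketch. It does not use sdbusplus or
sd-bus at all. It models one possible explanation, which is an
assumption and not something confirmed in the library: the blocking call
reads the incoming request off the socket while waiting for its own
reply and parks it in an internal queue, so the fd is no longer readable
and a readiness-based callback never fires. `FakeBus`, `fdReadable`,
`syncCall`, `dispatchPending` and `runDemo` are hypothetical names, and
a pipe stands in for the bus socket.

```cpp
#include <poll.h>
#include <unistd.h>

#include <cstddef>
#include <deque>

// Hypothetical stand-in for a bus connection: one fd plus a queue of
// messages already read from the fd but not yet dispatched.
struct FakeBus
{
    int readFd;
    std::deque<char> pending;
};

// True if poll() reports the fd as readable right now (zero timeout).
bool fdReadable(int fd)
{
    pollfd p{};
    p.fd = fd;
    p.events = POLLIN;
    return poll(&p, 1, 0) > 0 && (p.revents & POLLIN) != 0;
}

// Simulated blocking call: read until the reply byte 'R' shows up and
// park every other message in the queue instead of dispatching it.
bool syncCall(FakeBus& bus)
{
    char c = 0;
    while (read(bus.readFd, &c, 1) == 1)
    {
        if (c == 'R')
        {
            return true;
        }
        bus.pending.push_back(c);
    }
    return false;
}

// Analogue of the workaround: after the blocking call, process what was
// parked instead of waiting for the fd to become readable again.
std::size_t dispatchPending(FakeBus& bus)
{
    std::size_t n = bus.pending.size();
    bus.pending.clear();
    return n;
}

struct DemoResult
{
    bool pipeOk;
    bool readableAfterCall;
    std::size_t queuedAfterCall;
    std::size_t dispatchedByWorkaround;
};

DemoResult runDemo()
{
    DemoResult r{false, false, 0, 0};
    int fds[2];
    if (pipe(fds) != 0)
    {
        return r;
    }
    FakeBus bus{fds[0], {}};
    // A client request 'Q' arrives just before the reply 'R'.
    if (write(fds[1], "QR", 2) == 2 && syncCall(bus))
    {
        r.pipeOk = true;
        r.readableAfterCall = fdReadable(bus.readFd);
        r.queuedAfterCall = bus.pending.size();
        r.dispatchedByWorkaround = dispatchPending(bus);
    }
    close(fds[0]);
    close(fds[1]);
    return r;
}
```

In this model, after `syncCall` returns the fd is not readable even
though one request is still queued, so nothing wakes up until the next
message arrives. That would match what happens when `busctl` is run in
step 5, but whether the real connection behaves this way is untested
here.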