<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:DengXian;
panose-1:2 1 6 0 3 1 1 1 1 1;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:"Segoe UI Emoji";
panose-1:2 11 5 2 4 2 4 2 2 3;}
@font-face
{font-family:Consolas;
panose-1:2 11 6 9 2 2 4 3 2 4;}
@font-face
{font-family:"\@DengXian";
panose-1:2 1 6 0 3 1 1 1 1 1;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
font-size:11.0pt;
font-family:"Calibri",sans-serif;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
pre
{mso-style-priority:99;
mso-style-link:"HTML Preformatted Char";
margin:0in;
font-size:10.0pt;
font-family:"Courier New";}
span.HTMLPreformattedChar
{mso-style-name:"HTML Preformatted Char";
mso-style-priority:99;
mso-style-link:"HTML Preformatted";
font-family:Consolas;}
p.m5398798417703686960m-3000132491391917046msipfooter90245289, li.m5398798417703686960m-3000132491391917046msipfooter90245289, div.m5398798417703686960m-3000132491391917046msipfooter90245289
{mso-style-name:m_5398798417703686960m-3000132491391917046msipfooter90245289;
mso-margin-top-alt:auto;
margin-right:0in;
mso-margin-bottom-alt:auto;
margin-left:0in;
font-size:11.0pt;
font-family:"Calibri",sans-serif;}
p.m5398798417703686960msipfooter90245289, li.m5398798417703686960msipfooter90245289, div.m5398798417703686960msipfooter90245289
{mso-style-name:m_5398798417703686960msipfooter90245289;
mso-margin-top-alt:auto;
margin-right:0in;
mso-margin-bottom-alt:auto;
margin-left:0in;
font-size:11.0pt;
font-family:"Calibri",sans-serif;}
span.EmailStyle22
{mso-style-type:personal-reply;
font-family:"Calibri",sans-serif;
color:windowtext;}
p.msipfooter90245289, li.msipfooter90245289, div.msipfooter90245289
{mso-style-name:msipfooter90245289;
mso-margin-top-alt:auto;
margin-right:0in;
mso-margin-bottom-alt:auto;
margin-left:0in;
font-size:11.0pt;
font-family:"Calibri",sans-serif;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="EN-US" link="blue" vlink="purple" style="word-wrap:break-word">
<div class="WordSection1">
<p class="MsoNormal">Awesome. I tested it in my environment too and there are no more crashes
<span style="font-family:'Segoe UI Emoji',sans-serif">😊</span>.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Thanks<o:p></o:p></p>
<p class="MsoNormal">Dipinder<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<div style="border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b>From:</b> William Kennington &lt;wak@google.com&gt; <br>
<b>Sent:</b> Friday, September 8, 2023 3:38 PM<br>
<b>To:</b> Chhabra, DipinderSingh &lt;Dipinder_Chhabra@Dell.com&gt;<br>
<b>Cc:</b> openbmc@lists.ozlabs.org<br>
<b>Subject:</b> Re: phosphor-network terminated due to SIGBUS<o:p></o:p></p>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<p><span style="color:#CE1126">[EXTERNAL EMAIL] <o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal">It should be fixed now :) <a href="https://gerrit.openbmc.org/c/openbmc/phosphor-networkd/+/66533">https://gerrit.openbmc.org/c/openbmc/phosphor-networkd/+/66533</a><o:p></o:p></p>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<div>
<p class="MsoNormal">On Thu, Sep 7, 2023 at 6:24 PM Chhabra, DipinderSingh &lt;<a href="mailto:Dipinder.Chhabra@dell.com">Dipinder.Chhabra@dell.com</a>&gt; wrote:<o:p></o:p></p>
</div>
<blockquote style="border:none;border-left:solid #CCCCCC 1.0pt;padding:0in 0in 0in 6.0pt;margin-left:4.8pt;margin-top:5.0pt;margin-right:0in;margin-bottom:5.0pt">
<div>
<div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">It seems to have something to do with the callback running in the timer context. As a temporary workaround, I have removed the inline implementation of reloadConfigs and moved the complete block of code
from inside reload.setCallback directly into reloadConfigs (including reloadPreHooks, the actual dbus call and reloadPostHooks). This works pretty well and there is no more SIGBUS.<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Depending on the scenario, this will cause multiple Reload calls to systemd-networkd (unlike the timer case, where it would always be a single call), but I guess that may be OK in the
interim.<o:p></o:p></p>
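<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">The trade-off can be illustrated with a minimal sketch. This is not the phosphor-networkd code: DeferredCall, ReloadCounter and the two helper functions are hypothetical stand-ins, the timer is fired by hand instead of by an event loop, and no dbus call is made. It only shows why calling Reload directly may send one request per config write, while a deferred callback sends a single one.<o:p></o:p></p>

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical stand-in for the timer that defers the reload. A real event
// loop would fire it after a delay; here fire() is called by hand.
class DeferredCall
{
  public:
    explicit DeferredCall(std::function<void()> cb) : callback(std::move(cb))
    {
        assert(callback); // an empty callback would throw when invoked
    }

    // Arm the timer. Re-arming while already pending has no extra effect.
    void schedule()
    {
        pending = true;
    }

    // Simulate timer expiry: run the callback once if anything was scheduled.
    void fire()
    {
        if (pending)
        {
            pending = false;
            callback();
        }
    }

  private:
    std::function<void()> callback;
    bool pending = false;
};

// Counts how many times the simulated Reload request is sent.
struct ReloadCounter
{
    int calls = 0;

    void reload()
    {
        ++calls;
    }
};

// Workaround path: every configuration write triggers Reload immediately.
int reloadsWhenCalledDirectly(int configWrites)
{
    ReloadCounter networkd;
    for (int i = 0; i < configWrites; ++i)
    {
        networkd.reload();
    }
    return networkd.calls;
}

// Timer path: writes only arm the timer; Reload runs once when it fires.
int reloadsWhenDeferred(int configWrites)
{
    ReloadCounter networkd;
    DeferredCall timer([&networkd] { networkd.reload(); });
    for (int i = 0; i < configWrites; ++i)
    {
        timer.schedule();
    }
    timer.fire();
    return networkd.calls;
}
```

<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">With two config writes, as in the log where both the end0 and end1 files are written, reloadsWhenCalledDirectly(2) counts two Reload requests and reloadsWhenDeferred(2) counts one.<o:p></o:p></p>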
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Will continue investigating further from my end too.<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Thanks<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Dipinder<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
<div style="border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><b>From:</b> William Kennington &lt;<a href="mailto:wak@google.com" target="_blank">wak@google.com</a>&gt;
<br>
<b>Sent:</b> Thursday, September 7, 2023 5:07 PM<br>
<b>To:</b> Chhabra, DipinderSingh &lt;<a href="mailto:Dipinder_Chhabra@Dell.com">Dipinder_Chhabra@Dell.com</a>&gt;<br>
<b>Cc:</b> <a href="mailto:openbmc@lists.ozlabs.org" target="_blank">openbmc@lists.ozlabs.org</a><br>
<b>Subject:</b> Re: phosphor-network terminated due to SIGBUS<o:p></o:p></p>
</div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">We are investigating the same issue on our side. I'm trying some other tests to figure out why the references aren't working as expected.<o:p></o:p></p>
</div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
<div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">On Thu, Sep 7, 2023 at 1:27 PM Chhabra, DipinderSingh &lt;<a href="mailto:Dipinder.Chhabra@dell.com" target="_blank">Dipinder.Chhabra@dell.com</a>&gt; wrote:<o:p></o:p></p>
</div>
<blockquote style="border:none;border-left:solid #CCCCCC 1.0pt;padding:0in 0in 0in 6.0pt;margin-left:4.8pt;margin-top:5.0pt;margin-right:0in;margin-bottom:5.0pt">
<div>
<div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Yes.<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
<div style="border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><b>From:</b> William Kennington &lt;<a href="mailto:wak@google.com" target="_blank">wak@google.com</a>&gt;
<br>
<b>Sent:</b> Thursday, September 7, 2023 2:55 PM<br>
<b>To:</b> Chhabra, DipinderSingh &lt;<a href="mailto:Dipinder_Chhabra@Dell.com" target="_blank">Dipinder_Chhabra@Dell.com</a>&gt;<br>
<b>Cc:</b> <a href="mailto:openbmc@lists.ozlabs.org" target="_blank">openbmc@lists.ozlabs.org</a><br>
<b>Subject:</b> Re: phosphor-network terminated due to SIGBUS<o:p></o:p></p>
</div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Do you happen to be using aarch64?<o:p></o:p></p>
</div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
<div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;margin-left:23.1pt">
On Thu, Sep 7, 2023 at 12:52 PM Chhabra, DipinderSingh &lt;<a href="mailto:Dipinder.Chhabra@dell.com" target="_blank">Dipinder.Chhabra@dell.com</a>&gt; wrote:<o:p></o:p></p>
</div>
<blockquote style="border:none;border-left:solid #CCCCCC 1.0pt;padding:0in 0in 0in 6.0pt;margin-left:4.8pt;margin-top:5.0pt;margin-right:0in;margin-bottom:5.0pt">
<div>
<div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Hi There<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Recently we updated our OpenBMC distro to tag 2.14.0 (phosphor-network SRCREV f78a415e154bac274e1d07ce8128c69e9d1cd710).<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Since then, the phosphor-network service has been crashing with SIGBUS after a configuration change.<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
<pre><span style="color:black">Sep 07 09:51:45 bmc phosphor-network-manager[627]: Wrote networkd file: /etc/systemd/network/00-bmc-end1.network</span><o:p></o:p></pre>
<pre><span style="color:black">Sep 07 09:51:45 bmc phosphor-network-manager[627]: Wrote networkd file: /etc/systemd/network/00-bmc-end0.network</span><o:p></o:p></pre>
<pre><span style="color:black">Sep 07 09:51:49 bmc systemd[1]: xyz.openbmc_project.Network.service: Main process exited, code=dumped, status=7/BUS</span><o:p></o:p></pre>
<pre><span style="color:black">Sep 07 09:51:49 bmc systemd[1]: xyz.openbmc_project.Network.service: Failed with result 'core-dump'.</span><o:p></o:p></pre>
<pre><span style="color:black">Sep 07 09:51:49 bmc systemd[1]: xyz.openbmc_project.Network.service: Consumed 1.365s CPU time.</span><o:p></o:p></pre>
<pre><span style="color:black">Sep 07 09:51:50 bmc systemd[1]: xyz.openbmc_project.Network.service: Scheduled restart job, restart counter is at 1.</span><o:p></o:p></pre>
<pre><span style="color:black">Sep 07 09:51:50 bmc systemd[1]: Stopped Phosphor Network Manager.</span><o:p></o:p></pre>
<pre><span style="color:black">Sep 07 09:51:50 bmc systemd[1]: xyz.openbmc_project.Network.service: Consumed 1.365s CPU time.</span><o:p></o:p></pre>
<pre><span style="color:black">Sep 07 09:51:50 bmc systemd[1]: Starting Phosphor Network Manager...</span><o:p></o:p></pre>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Based on my debugging, I can confirm that the timer gets scheduled correctly after the config write and that the registered callback does get invoked. The crash happens in the
dbus call below, in network_manager.cpp.<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span style="font-size:10.0pt;font-family:'Courier New'"> try</span><o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span style="font-size:10.0pt;font-family:'Courier New'"> {</span><o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span style="font-size:10.0pt;font-family:'Courier New'"> bus.get()</span><o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span style="font-size:10.0pt;font-family:'Courier New'"> .new_method_call("org.freedesktop.network1",</span><o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span style="font-size:10.0pt;font-family:'Courier New'"> "/org/freedesktop/network1",</span><o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span style="font-size:10.0pt;font-family:'Courier New'"> "org.freedesktop.network1.Manager", "Reload")</span><o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span style="font-size:10.0pt;font-family:'Courier New'">
<span style="color:black;background:yellow">.call();</span></span><o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span style="font-size:10.0pt;font-family:'Courier New'"> lg2::info("Reloaded systemd-networkd");</span><o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span style="font-size:10.0pt;font-family:'Courier New'"> }</span><o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">I have looked for a fix for this in the later commits but did not find one.<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">I also tried changing it to call_noreply, but that does not help and I get the same BUS error.<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
<pre><span style="color:black"> try</span><o:p></o:p></pre>
<pre><span style="color:black"> {</span><o:p></o:p></pre>
<pre><span style="color:black"> lg2::info("Try systemd-networkd reload...");</span><o:p></o:p></pre>
<pre><span style="color:black"> auto method = bus.get().new_method_call(NETWORKD_BUSNAME, NETWORKD_PATH,</span><o:p></o:p></pre>
<pre><span style="color:black"> NETWORKD_INTERFACE, "Reload");</span><o:p></o:p></pre>
<pre><span style="color:black"> bus.get().call_noreply(method);</span><o:p></o:p></pre>
<pre><span style="color:black"> lg2::info("Reloaded systemd-networkd");</span><o:p></o:p></pre>
<pre><span style="color:black"> }</span><o:p></o:p></pre>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">When I manually invoke this from the shell, it seems to work fine.<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
<pre><span style="color:black">root@bmc:~# busctl call org.freedesktop.network1 /org/freedesktop/network1 org.freedesktop.network1.Manager Reload </span><o:p></o:p></pre>
<pre><span style="color:black">root@bmc:~# echo $?</span><o:p></o:p></pre>
<pre><span style="color:black">0</span><o:p></o:p></pre>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Is anyone else seeing this issue with phosphor-network, or does anyone have an idea why this could be happening?<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Thanks<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Dip<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
<p style="margin:0in"><span style="font-size:7.0pt;color:#737373">Internal Use - Confidential</span><o:p></o:p></p>
</div>
</div>
</div>
</blockquote>
</div>
</div>
</div>
</div>
</blockquote>
</div>
</div>
</div>
</div>
</blockquote>
</div>
</div>
</body>
</html>