<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html;
      charset=windows-1252">
  </head>
  <body>
    <p>No, I haven't. How can I get it?<br>
    </p>
    <div class="moz-cite-prefix">On 06.04.2021 16:00, Daniel M Crowell
      wrote:<br>
    </div>
    <blockquote type="cite"
cite="mid:OF3A5A2C8C.6340E7C7-ON862586AF.004CC354-862586AF.004CEA27@notes.na.collabserv.com">
      <meta http-equiv="content-type" content="text/html;
        charset=windows-1252">
      <p><font size="2">Have you attempted to get a complete scom trace
          from the original Hostboot code and compare it to your new
          code? That is a pretty typical debug strategy on our side when
          migrating from the initial hardware bringup scripts into the
          firmware implementation.</font><br>
        <font size="2"><br>
          --<br>
          Dan Crowell<br>
          Senior Software Engineer - Power Systems Enablement Firmware<br>
          IBM Rochester: t/l 553-2987<br>
          <a class="moz-txt-link-abbreviated" href="mailto:dcrowell@us.ibm.com">dcrowell@us.ibm.com</a></font><br>
        <br>
        <br>
        <font size="2" color="#5F5F5F">From: </font><font size="2">Krystian
          Hebel <a class="moz-txt-link-rfc2396E" href="mailto:krystian.hebel@3mdeb.com"><krystian.hebel@3mdeb.com></a></font><br>
        <font size="2" color="#5F5F5F">To: </font><font size="2">Daniel
          M Crowell <a class="moz-txt-link-rfc2396E" href="mailto:dcrowell@us.ibm.com"><dcrowell@us.ibm.com></a></font><br>
        <font size="2" color="#5F5F5F">Cc: </font><font size="2"><a class="moz-txt-link-abbreviated" href="mailto:firmware@3mdeb.com">firmware@3mdeb.com</a>,
          <a class="moz-txt-link-abbreviated" href="mailto:openpower-firmware@lists.ozlabs.org">openpower-firmware@lists.ozlabs.org</a></font><br>
        <font size="2" color="#5F5F5F">Date: </font><font size="2">04/06/2021
          07:45 AM</font><br>
        <font size="2" color="#5F5F5F">Subject: </font><font size="2">[EXTERNAL]
          Re: [OpenPower-Firmware] Problem with CCS</font><br>
      </p>
      <hr style="color:#8091A5; " width="100%" size="2"
        noshade="noshade" align="left"><br>
      <br>
      <br>
      <p>Update: I have dealt with the write leveling issue. I accidentally
        shifted a bit twice when trying to set PAR_A17_MASK in
        SEQ_CONTROL0, so it was left unmasked.
      </p>
      <p>Now I'm back to the initial issue with the loop in CCS. This time,
        however, I see a difference between the original code (refresh on):
      </p>
      <p>    0x0000000000000000 - APB_ERROR_STATUS0<br>
            0x0000000000001000 - RC_ERROR_STATUS0<br>
            0x0000000000000000 - SEQ_ERROR_STATUS0<br>
            0x0000000000000000 - WC_ERROR_STATUS0<br>
            0x0000000000000400 - PC_ERROR_STATUS0<br>
            0x0000000000002008 - PC_INIT_CAL_ERROR<br>
            0x0000000000000688 - DDRPHY_PC_INIT_CAL_STATUS<br>
            0x0000000000000080 - IOM_PHY0_DDRPHY_FIR_REG
      </p>
      <p>and after setting DDRPHY_PC_INIT_CAL_CONFIG1_P0 as in the previous
        mail:<br>
        <br>
            0x0000000000000000 - APB_ERROR_STATUS0<br>
            0x0000000000001000 - RC_ERROR_STATUS0<br>
            0x0000000000000000 - SEQ_ERROR_STATUS0<br>
            0x0000000000000000 - WC_ERROR_STATUS0<br>
            0x0000000000000000 - PC_ERROR_STATUS0<br>
            0x0000000000000000 - PC_INIT_CAL_ERROR<br>
            0x0000000000000608 - DDRPHY_PC_INIT_CAL_STATUS<br>
            0x0000000000000000 - IOM_PHY0_DDRPHY_FIR_REG
      </p>
      <p>PC_INIT_CAL_ERROR no longer reports an error, but
        DDRPHY_PC_INIT_CAL_STATUS still doesn't report success. No
        DQ/DQS bits are disabled, either with or without refresh.
      </p>
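      <p>To make the comparison between the two dumps above easier to
        read, here is a minimal Python sketch that parses both and
        reports which registers changed and which bits flipped. The
        helper names (parse_dump, diff_dumps) are made up for
        illustration; the values are copied from the two dumps above.
      </p>
      <pre>
```python
def parse_dump(text):
    """Parse lines of the form '0x... - NAME' into a {NAME: value} dict."""
    regs = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        value, name = line.split(" - ", 1)
        regs[name.strip()] = int(value, 16)
    return regs


def diff_dumps(before, after):
    """Return {NAME: (old, new, flipped_bits)} for registers that differ."""
    changes = {}
    for name, old in before.items():
        new = after.get(name)
        if new is not None and new != old:
            changes[name] = (old, new, old ^ new)
    return changes


# Values copied from the mail: original code with refresh on.
REFRESH_ON = """
0x0000000000000000 - APB_ERROR_STATUS0
0x0000000000001000 - RC_ERROR_STATUS0
0x0000000000000000 - SEQ_ERROR_STATUS0
0x0000000000000000 - WC_ERROR_STATUS0
0x0000000000000400 - PC_ERROR_STATUS0
0x0000000000002008 - PC_INIT_CAL_ERROR
0x0000000000000688 - DDRPHY_PC_INIT_CAL_STATUS
0x0000000000000080 - IOM_PHY0_DDRPHY_FIR_REG
"""

# Values copied from the mail: after changing DDRPHY_PC_INIT_CAL_CONFIG1_P0.
REFRESH_OFF = """
0x0000000000000000 - APB_ERROR_STATUS0
0x0000000000001000 - RC_ERROR_STATUS0
0x0000000000000000 - SEQ_ERROR_STATUS0
0x0000000000000000 - WC_ERROR_STATUS0
0x0000000000000000 - PC_ERROR_STATUS0
0x0000000000000000 - PC_INIT_CAL_ERROR
0x0000000000000608 - DDRPHY_PC_INIT_CAL_STATUS
0x0000000000000000 - IOM_PHY0_DDRPHY_FIR_REG
"""

changes = diff_dumps(parse_dump(REFRESH_ON), parse_dump(REFRESH_OFF))
for name, (old, new, flipped) in changes.items():
    print(f"{name}: {old:#018x} to {new:#018x}, flipped bits {flipped:#x}")
```
      </pre>
      <p>The sketch only reports raw bit differences; what each bit
        means still has to be looked up in the register documentation.
      </p>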
      <p>On 06.04.2021 12:28, Krystian Hebel wrote:
      </p>
      <ul>
        <ul>
          Hi Daniel,
          <p>Thanks for the quick and informative response.
          </p>
          <ul>
            <ul>
              <font size="2" face="Arial">I got these answers from one
                of our memory experts.</font><br>
              <font size="2" face="Arial"> </font><br>
              <font size="2" face="Arial">Hi Krystian,</font>
              <ul>
                <font size="2">1. </font><font size="2" face="Arial">IBM
                  mostly uses x4 DIMMs. Is it possible to run with an x4
                  DIMM for debug purposes to see if the problem
                  persists? This will help debug configuration issues
                  with the x8 DIMMs.</font>
              </ul>
            </ul>
          </ul>
          This may be difficult due to remote work, but I'll see what
          can be done.
          <ul>
            <ul>
              <ul>
                <font size="2">2. </font><font size="2" face="Arial">Have
                  you tried disabling refresh to see if the issues go
                  away?</font>
              </ul>
            </ul>
          </ul>
          Is it enough to just modify DDRPHY_PC_INIT_CAL_CONFIG1_P0? If
          so, I changed all of REFRESH_COUNT, REFRESH_CONTROL and
          REFRESH_ALL_RANKS to all 0's and REFRESH_INTERVAL to all 1's.
          It still fails the same way, but a few microseconds faster
          than before.
          <ul>
            <ul>
              <ul>
                <font size="2">3. </font><font size="2" face="Arial">For
                  calibration fails (which it looks like you are
                  experiencing), I would recommend dumping the following
                  registers for rank 0<br>
                  DQS disable bits<br>
                  0x8000007d0701103f<br>
                  0x8000047d0701103f<br>
                  0x8000087d0701103f<br>
                  0x80000c7d0701103f<br>
                  0x8000107d0701103f<br>
                  <br>
                  DQ disable bits<br>
                  0x8000007c0701103f<br>
                  0x8000047c0701103f<br>
                  0x8000087c0701103f<br>
                  0x80000c7c0701103f<br>
                  0x8000107c0701103f<br>
                  <br>
                  If calibration is passing on a given DRAM, all of the
                  bits should be 0. Fails are noted by 1s in the
                  register. As with all PHY registers, only the rightmost
                  16 bits matter.</font>
              </ul>
            </ul>
          </ul>
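          <p>As a rough illustration of the rule quoted above (only the
            rightmost 16 bits matter, and a 1 marks a fail), a small
            Python sketch for decoding a value read from one of these
            disable-bit registers could look like this. The function
            name is hypothetical, and numbering the bits 0..15 from the
            most significant of the 16 bits is an assumption, not
            something stated in this thread.
          </p>
          <pre>
```python
def failed_bits(scom_value):
    """Return (low16, positions) for a 64-bit disable-bit register value.

    low16 is the rightmost 16 bits of the value; positions lists the
    bits that are set (1 = fail). Positions are counted 0..15 starting
    from the most significant of those 16 bits, which is an assumed
    numbering chosen for this sketch.
    """
    low16 = scom_value & 0xFFFF
    positions = [i for i in range(16) if (low16 >> (15 - i)) & 1]
    return low16, positions


# Example inputs: the DQS patterns mentioned in the reply (0xc300, 0x3c00)
# and an all-zero value, which would mean every bit passed.
for value in (0x000000000000C300, 0x0000000000003C00, 0x0):
    low16, positions = failed_bits(value)
    status = "pass" if not positions else f"fail on bits {positions}"
    print(f"{value:#018x}: low16={low16:#06x} {status}")
```
          </pre>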
          Here I can see some fails: all DQ bits on the first and second
          DP16 and all configured DQS bits (0xc300 for the first and 0x3c00
          for the second, which is consistent with the settings from [1]). The
          rest of the DP16s pass. This DIMM works with Hostboot, so I think
          the clock bits are selected properly.
          <p>I hadn't realized that these are updated by hardware and
            then used as input for the next steps. Now I know that what I
            thought was a successful write leveling was actually skipping
            bad bits. I was misled by the fact that the second attempt
            took more time than the first one, but it makes sense, as it
            starts from a higher initial delay and has a longer way to
            go down and up again, if I understand this step correctly.
          </p>
          <p>I went a step further and dumped all
            WR_DELAY_VALUE_x_RP0_REG registers. For passed bits the value
            is somewhere in the range 0x1900-0x2b00, where every set of 8
            DQ bits and its accompanying DQS bit have the same value,
            which I believe is expected for x8 memory. For failed bits
            this value is always 0x3a00 for DQ bits (and for whatever is
            in DELAY_VALUE_16-22 that isn't configured as a DQS), but
            0x4200 for DQS bits. Unlike the passing DP16s, these values
            don't change between boots. They can change slightly when I
            modify DDRPHY_WC_CONFIG1_P0, but still no pass.
          </p>
          <ul>
            <ul>
              <ul>
                <font size="2">4. </font><font size="2" face="Arial">To
                  my knowledge, there should not be an issue sending the
                  RCW commands via i2c.</font><br>
                <font size="2">5. </font><font size="2" face="Arial">Running
                  in our test environment, I am seeing the following
                  scoms for DQS align: </font>
                <ul>
                  <font size="2" face="Arial">CRONUSDEBUG(30807) :
                    PUTSCOM   : p9n.mcbist:k0:n0:s0:p01:c1 :
                    070123A5             4000000000000000 # Stop CCS<br>
                    CRONUSDEBUG(30818) : PUTSCOM   :
                    p9n.mcbist:k0:n0:s0:p01:c1 : 07012315            
                    000000F0CC0000C0 # Configure init calibration<br>
                    CRONUSDEBUG(30823) : PUTSCOM   :
                    p9n.mcbist:k0:n0:s0:p01:c1 : 07012335            
                    0000000000000041 # Go to instruction 1<br>
                    CRONUSDEBUG(30826) : PUTSCOM   :
                    p9n.mcbist:k0:n0:s0:p01:c1 : 07012316            
                    000008F0CC000000 # don't do anything<br>
                    CRONUSDEBUG(30831) : PUTSCOM   :
                    p9n.mcbist:k0:n0:s0:p01:c1 : 07012336            
                    0000000000000020 # End CCS<br>
                    CRONUSDEBUG(30839) : PUTSCOM   :
                    p9n.mcbist:k0:n0:s0:p01:c1 : 070123DB            
                    0400000000000000 # Configure the port to run<br>
                    CRONUSDEBUG(30848) : PUTSCOM   :
                    p9n.mcbist:k0:n0:s0:p01:c1 : 070123A5            
                    8000000000000000 # Kick off CCS<br>
                    <br>
                    I hope that this trace helps.</font>
                </ul>
              </ul>
            </ul>
          </ul>
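          <p>For comparing a reference trace like the one above against
            the writes issued by new code, a minimal Python sketch along
            these lines might help. It assumes the PUTSCOM line layout
            shown in the trace above; the helper names (parse_putscoms,
            first_mismatch) are made up for illustration and are not
            part of Cronus or Hostboot.
          </p>
          <pre>
```python
import re
from itertools import zip_longest

# Matches "PUTSCOM : target : 8-hex-digit address  16-hex-digit data".
# The layout is taken from the trace quoted in this thread.
PUTSCOM_RE = re.compile(
    r"PUTSCOM\s*:\s*(\S+)\s*:\s*([0-9A-Fa-f]{8})\s+([0-9A-Fa-f]{16})"
)


def parse_putscoms(trace):
    """Return the (address, data) pairs of all PUTSCOM lines, in order."""
    return [
        (int(addr, 16), int(data, 16))
        for _target, addr, data in PUTSCOM_RE.findall(trace)
    ]


def first_mismatch(expected, actual):
    """Return (index, expected_write, actual_write) of the first
    difference between two write sequences, or None if they are equal.
    A missing write on either side is reported as None."""
    pairs = zip_longest(expected, actual)
    for index, (exp, act) in enumerate(pairs):
        if exp != act:
            return index, exp, act
    return None


# Reference trace copied from the mail above.
REFERENCE = """
CRONUSDEBUG(30807) : PUTSCOM : p9n.mcbist:k0:n0:s0:p01:c1 : 070123A5 4000000000000000 # Stop CCS
CRONUSDEBUG(30818) : PUTSCOM : p9n.mcbist:k0:n0:s0:p01:c1 : 07012315 000000F0CC0000C0 # Configure init calibration
CRONUSDEBUG(30823) : PUTSCOM : p9n.mcbist:k0:n0:s0:p01:c1 : 07012335 0000000000000041 # Go to instruction 1
CRONUSDEBUG(30826) : PUTSCOM : p9n.mcbist:k0:n0:s0:p01:c1 : 07012316 000008F0CC000000 # don't do anything
CRONUSDEBUG(30831) : PUTSCOM : p9n.mcbist:k0:n0:s0:p01:c1 : 07012336 0000000000000020 # End CCS
CRONUSDEBUG(30839) : PUTSCOM : p9n.mcbist:k0:n0:s0:p01:c1 : 070123DB 0400000000000000 # Configure the port to run
CRONUSDEBUG(30848) : PUTSCOM : p9n.mcbist:k0:n0:s0:p01:c1 : 070123A5 8000000000000000 # Kick off CCS
"""

reference_writes = parse_putscoms(REFERENCE)
for addr, data in reference_writes:
    print(f"{addr:08X} {data:016X}")
```
          </pre>
          <p>Feeding first_mismatch the list parsed from the reference
            and a list of (address, data) pairs logged by the new code
            would point at the first write that differs, including any
            extra or missing writes at the end.
          </p>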
          So, DDR_CAL_RANK in ARR1 is a number, and not a bit map of
          selected ranks? That was my initial understanding, but then I
          changed the code to treat it as a bit map. Still, fixing the
          code doesn't help, even though now it is identical to the
          trace above.
          <p>[1] <a
href="https://git.raptorcs.com/git/talos-hostboot/tree/src/import/chips/p9/procedures/hwp/memory/lib/phy/dp16.C#n1963"
              moz-do-not-send="true"><u><font color="#0000FF">https://git.raptorcs.com/git/talos-hostboot/tree/src/import/chips/p9/procedures/hwp/memory/lib/phy/dp16.C#n1963</font></u></a><br>
            <tt>-- <br>
              Krystian Hebel<br>
              Firmware Engineer<br>
            </tt><a href="https://3mdeb.com" moz-do-not-send="true"><tt><u><font
                    color="#0000FF">https://3mdeb.com</font></u></tt></a><tt> |
              @3mdeb_com</tt></p>
        </ul>
      </ul>
      <tt>-- <br>
        Krystian Hebel<br>
        Firmware Engineer<br>
      </tt><a href="https://3mdeb.com" moz-do-not-send="true"><tt><u><font
              color="#0000FF">https://3mdeb.com</font></u></tt></a><tt> |
        @3mdeb_com</tt><br>
      <br>
      <br>
    </blockquote>
    <pre class="moz-signature" cols="72">-- 
Krystian Hebel
Firmware Engineer
<a class="moz-txt-link-freetext" href="https://3mdeb.com">https://3mdeb.com</a> | @3mdeb_com</pre>
  </body>
</html>