[PATCH v14 00/15] phy: Add support for Lynx 10G SerDes

Tue Aug 22 09:39:15 AEST 2023

On 8/21/23 18:48, Vladimir Oltean wrote:
> On Mon, Aug 21, 2023 at 05:06:46PM -0400, Sean Anderson wrote:
>> On 8/21/23 15:58, Vladimir Oltean wrote:
>> > On Mon, Aug 21, 2023 at 02:46:53PM -0400, Sean Anderson wrote:
>> >> After further review, it seems the reason 28g can get away without this
>> >> is because there's a one-to-one mapping between protocol controllers and
>> >> lanes. Unfortunately, that regularity is not present for 10g.
>> >> 
>> >> --Sean
>> > 
>> > There are some things I saw in your phy-fsl-lynx-10g.c driver and device
>> > tree bindings that I don't understand (the concept of lane groups)
>> 
>> Each lane group corresponds to a phy device (struct phy). For protocols
>> like PCIe or XAUI which can use multiple lanes, this lets the driver
>> coordinate configuring all lanes at once in the correct order.
> 
> For PCIe I admit that I don't know. I don't even know if there's any
> valid (current or future) use case for having the PCIe controller have a
> phandle to the SerDes. Everything else below in my reply assumes Ethernet.

One use case could be selecting PCIe/SATA as appropriate for an M.2
card. Another use case could be allocating lanes to different slots
based on card presence (e.g. 1 card use x4, 2+ cards use x1). I recently
bought a motherboard whose manual says

| PCI_E4 [normally x4] will run x1 speed when installing devices in PCI_E2/
| PCI_E3/ PCI_E5 slot [all x1].
which implies exactly this sort of design.

>> > and
>> > I'm not sure if they're related with what you're saying here, so if you
>> > could elaborate a bit more (ideally with an example) on the one-to-one
>> > mapping and the specific problems it causes, it would be great.
>> 
>> For e.g. the LS2088A, SerDes1 lanes H-A use SG1-8 and XFI1-8. SerDes2
>> lanes A-H use SG9-16 and XFI9-16. Each lane has its own controller, and
>> the mapping is 1-to-1. In the PCCRs, each controller uses the same
>> value, and is mapped in a regular way. So you can go directly from the
>> lane number to the right value to mask in the PCCR, with a very simple
>> translation scheme.
>> 
>> For e.g. the LS1046A, SerDes1 lane D uses XFI.9 (aka XFIA) and lane C
>> uses XFI.10 (aka XFIB). This is the opposite of how SerDes1 lanes A-D
>> use SGMII.9, .10, .5, and .6 (aka SGMIIA-D).
>> 
>> For e.g. the T4240, SerDes1 lanes A-H use sg1.5, .6, .10, .9, .1, .2,
>> .3, .4 (aka SGMII E, F, H, G, A, B, C, D).
>> 
>> For e.g. the B4860, SerDes lanes A uses sgmii1 or sgmii5 and B uses
>> sgmii2 or sgmii6. The MAC selected is determined based on the value
>> programmed into PCCR2.
> 
> It's so exceedingly unlikely that B4860 will gain support for anything
> new, that it's not even worth talking about it, or even considering it
> in the design of a new driver. Just forget about it.
> 
> Let's concentrate on the Layerscapes, and on the T series to the extent
> that we're not going out of our way to support them with a fairly simple
> design.

Well, these are just easy ways to show the oddities in the PCCRs. I
could have made the same points with the various Layerscapes.

> In the Lynx 10G block guide that I have, PCCR2 is a register that does
> something completely different from Ethernet. I'm not sure if B4860 has
> a Lynx 10G block and not something else.

T-series SoCs use a different PCCR layout than Layerscape, despite using
the same registers in the rest of the SerDes. The LS1021 also uses this
scheme, so it's not just T-series (but it's close enough). This is why I
had an option for different PCCR configuration functions in earlier
versions of this series, so if someone wanted they could easily add
T-series support.

>> While I appreciate that your hardware engineers did a better job for
>> 28g, many 10g serdes arbitrarily map lanes to protocol controllers.
>> I think the mapping is too irregular to tame, and it is better to say
>> "if you want this configuration, program this value".
> 
> Ok, but that's a lateral argument (or I'm not understanding the connection).

To restate, there's no systemic organization to the PCCRs. The driver
configuration should reflect this and allow arbitrary mappings.

> Some maintainers (Mark Brown for sure, from my personal experience) prefer
> that expert-level knowledge of hardware should be hardcoded into the kernel
> driver based on known stuff such as the SoC-specific compatible string.
> I certainly share the same view.
> 
> With your case, I think that argument is even more pertinent, because
> IIUC, the lane group protocols won't be put in the SoC .dtsi (so as to
> be written only once), but in the board device trees (since the
> available protocols invariably depend upon the board provisioning).
> So, non-expert board device tree writers will have to know what's with
> the PCCR stuff. Pretty brain-intensive.

The idea is to stick all the PCCR stuff in the SoC devicetree, and the
board-level devicetrees just set up the clocks and pick which MACs use
which phys. Have a look at patches 10 and 15 of this series for some
typical board configurations. I think it's pretty simple, since all the
complexity is in the SoC dt.

That said, I originally had the driver set up so that everything was
based on the compatible, but I had to change that because it was nacked
by the devicetree people.

>> > I may be off with my understanding of the regularity you are talking about,
>> > but the LX2160 (and Lynx 28G block) also has multi-lane protocols like 40G,
>> > 100G, assuming that's what you are talking about. I haven't started yet
>> > working on those for the mtip_backplane driver, but I'm not currently
>> > seeing a problem with the architecture where a phy_device represents a
>> > single lane that's part of a multi-lane port, and not an entire group.
>> 
>> Resetting one lane in a group will reset the rest, which could confuse
>> the driver. Additionally, treating the lanes as one phy lets us set the
>> reset direction and first lane bits correctly.
> 
> Yeah, in theory that is probably correct, but in practice can't we hide
> our head in the sand and say that the "phys" phandles have to have the
> lanes in the same order as LNmGCR0[1STLANE] expects them (which is also
> the "natural" order as the SoC RM describes it)? With a "for" loop
> implementation in the MAC, that would work just fine 100% of the time.
> Doing more intricate massaging of the "phys" in the consumer, and you're
> just asking for trouble. My 2 cents.

Yeah, but from the perspective of the driver, if we have one phy per
lane we don't get any indication from the subsystem that we are doing a
multi-lane configuration. So we need to stick this sort of thing into
the phy handle. But IMO we can do all this in the driver and the board
integrator never has to worry about it. 

> Sure, it's the same kind of ask of a board device tree writer as "hey,
> please give me a good PCCR value", but I honestly think that the head
> scratching will be much less severe.
> 
>> > In my imagination, there are 2 cases:
>> > - all 4 lanes are managed by the single dpaa2-mac consumer (which has 4
>> >   phandles, and iterates over them with a "for" loop)
>> > - each of the 4 lanes is managed by the respective backplane AN/LT core,
>> >   and thus, there's one phandle to each lane
>> 
>> By doing the grouping in the driver, we also simplify the consumer
>> implementation. The MAC can always use a single phy, without worrying
>> about the actual number of lanes. This matches the hardware, since the
>> MAC is going to talk XGMII (or whatever) to the protocol controller
>> anyway.
> 
> XGMII is the link between the MAC and the PCS, but the "phys" phandle to
> the SerDes gives insight to the MAC driver way beyond the PCS layer.
> That kinda invalidates the idea that "you don't need to worry about the
> actual number of lanes". When you're a MAC driver with an XLAUI link and
> you need insight into the PMA/PMD layer, you'd better not be surprised
> about the fact that there are 4 lanes, at the very least?

Well, this is why they are condensed into a "lane group". From the MAC's
perspective the whole thing is opaque, and gets handled by the phy
subsystem.

>> I think it will be a lot easier to add multi-lane support with this
>> method because it gives the driver more information about what's going
>> on. The driver can control the whole configuration/reset process and the
>> timing.
> 
> Also, I'm thinking that if you support multi-lane (which dpaa2-mac currently
> doesn't, even in LSDK), you can't avoid multiple "phys" phandles in dpaa2-mac,
> and a "for" loop, eventually, anyway. That's because if your lanes have protocol
> retimers on them, those are going to be modeled as generic PHYs too. And those
> will not be bundled in these "groups", because they might be one chip per lane.
> 
> The retimer thing isn't theoretical, but, due to reasons independent of NXP,
> we lack the ability to provide an upstream user of the "lane retimer as
> generic PHY" functionality in dpaa2-mac. So it stays downstream for now.
> https://cas5-0-urlprotect.trendmicro.com:443/wis/clicktime/v1/query?url=https%3a%2f%2fgithub.com%2fnxp%2dqoriq%2flinux%2fcommit%2f627c5f626a13657f46f68b90882f329310e0e22f&umid=54bd2b00-07e7-4f55-8e2a-ed7b7cf7d142&auth=d807158c60b7d2502abde8a2fc01f40662980862-a6780f8e227e850b7785d5afbae1abed18ded9d9
> 
> So, if you're thinking of easing the work of the consumer side, I'm not
> sure that the gains will be that high.

Well, FWIW those serdes don't need good coordination. That said, I
think it might be interesting to have some subsystem-level support for
this, much like clock groups.

> Otherwise, I will repeat the idea that lynx-10g and lynx-28g should be
> treated in unison, because some drivers (dpaa2-mac, mtip_backplane) will
> have to interface with both, and I don't really believe that major deviations
> in software architecture between the 2 SerDes drivers are justifiable in
> any way (multi-protocol handled differently, for example).

Well, I think we should take the opportunity to think about the hardware
which exists and how we plan to model it. IMO grouping lanes into a
single phy simplifies both the phy driver and the mac driver.

--Sean