[PATCH] mlx4_en: fix transmit of packages when blue frame is enabled

Eli Cohen eli at dev.mellanox.co.il
Sun Oct 9 18:35:46 EST 2011


On Sun, Oct 09, 2011 at 09:25:18AM +0200, Benjamin Herrenschmidt wrote:
> On Thu, 2011-10-06 at 15:57 +0200, Eli Cohen wrote:
> > On Wed, Oct 05, 2011 at 10:15:02AM +0200, Eli Cohen wrote:
> > 
> > How about this patch - can you give it a try?
> > 
> > 
> > >From dee60547aa9e35a02835451d9e694cd80dd3072f Mon Sep 17 00:00:00 2001
> > From: Eli Cohen <eli at mellanox.co.il>
> > Date: Thu, 6 Oct 2011 15:50:02 +0200
> > Subject: [PATCH] mlx4_en: Fix blue flame on powerpc
> > 
> > The source buffer used for copying into the blue flame register is already in
> > big endian. However, when copying to device on powerpc, the endianess is
> > swapped so the data reaches th device in little endian which is wrong. On x86
> > based platform no swapping occurs so it reaches the device with the correct
> > endianess. Fix this by calling le32_to_cpu() on the buffer. On LE systems there
> > is no change; on BE there will be a swap.
> 
> That looks wrong.
Not sure I understand: are you saying that on ppc, when you call
__iowrite64_copy, it will not reach the device swapped?

The point is that we must always have the buffer ready in big endian
in memory. In the case of blue flame, we must also copy it to the
device registers in pci memory space. So if we use the buffer we
already prepared, we must have another swap. I can think of a nicer
way to implement this functionality but the question is do you think
my observation above is wrong and why.

> 
> What is this __iowrite64_copy... oh I see
> 
> Nice, somebody _AGAIN_ added a bunch of "generic" IO accessors that are
> utterly wrong on all archs except x86 (ok, -almost-). There isn't a
> single bloody memory barrier in there !
> 
> So, __iowrite64_copy is doing raw_writel which will -not- swap, so your
> buffer is going to have the same endianness in the destination than it
> has in the source. This is _NOT_ the right place to do a swap.
> 
> It's the original construction of the descriptor that needs change. The
> data itself should never need to be affected accross a copy operation
> (unless your HW is terminally busted).
> 
> Cheers,
> Ben.
> 
> > Signed-off-by: Eli Cohen <eli at mellanox.co.il>
> > ---
> >  drivers/net/mlx4/en_tx.c |   10 ++++++++++
> >  1 files changed, 10 insertions(+), 0 deletions(-)
> > 
> > diff --git a/drivers/net/mlx4/en_tx.c b/drivers/net/mlx4/en_tx.c
> > index 16337fb..3743acc 100644
> > --- a/drivers/net/mlx4/en_tx.c
> > +++ b/drivers/net/mlx4/en_tx.c
> > @@ -601,6 +601,16 @@ u16 mlx4_en_select_queue(struct net_device *dev, struct sk_buff *skb)
> >  
> >  static void mlx4_bf_copy(unsigned long *dst, unsigned long *src, unsigned bytecnt)
> >  {
> > +	int i;
> > +	__le32 *psrc = (__le32 *)src;
> > +
> > +	/*
> > +	 * the buffer is already in big endian. For little endian machines that's
> > +	 * fine. For big endain machines we must swap since the chipset swaps again
> > +	 */
> > +	for (i = 0; i < bytecnt / 4; ++i)
> > +		psrc[i] = le32_to_cpu(psrc[i]);
> > +
> >  	__iowrite64_copy(dst, src, bytecnt / 8);
> >  }
> >  
> > -- 
> > 1.7.7.rc0.70.g82660
> > 
> > 
> > 
> > > On Tue, Oct 04, 2011 at 05:26:20PM -0300, Thadeu Lima de Souza Cascardo wrote:
> > > 
> > > I believe we have an endianess problem here. The source buffer is in
> > > big endian - in x86 archs, it will rich the pci device unswapped since
> > > both x86 and pci are little endian. In ppc, it wil be swapped by the
> > > chipset so it will reach the device in little endian which is wrong.
> > > So, in mlx4_bf_copy, you could loop over the buffer and swap32 the all
> > > the dwords before the call to __iowrite64_copy. Of course which should
> > > fix this in an arch independent manner. Let me know this works for
> > > you.
> > > 
> > > > On Tue, Oct 04, 2011 at 08:02:12AM +0200, Benjamin Herrenschmidt wrote:
> > > > > On Mon, 2011-10-03 at 17:53 -0300, Thadeu Lima de Souza Cascardo wrote:
> > > > > 
> > > > >  .../...
> > > > > 
> > > > > > > Can you also send me the output of ethtool -i?
> > > > > > > It seems that there is a problem with write combining on Power processors, we will check this issue.
> > > > > > > 
> > > > > > > Yevgeny
> > > > > > 
> > > > > > Hello, Yevgeny.
> > > > > > 
> > > > > > You will find the output of ethtool -i below.
> > > > > > 
> > > > > > I am copying Ben and powerpc list, in case this is an issue with Power
> > > > > > processors. They can provide us some more insight into this.
> > > > > 
> > > > > May I get some background please ? :-)
> > > > > 
> > > > > I'm not aware of a specific issue with write combining but I'd need to
> > > > > know more about what you are doing and the code to do it to comment on
> > > > > whether it should work or not.
> > > > > 
> > > > > Cheers,
> > > > > Ben.
> > > > > 
> > > > > 
> > > > 
> > > > Hello, Ben.
> > > > 
> > > > Sorry for that. I am testing mlx4_en driver on a POWER. Yevgeny has
> > > > added blue frame support, that does not require writing to the device
> > > > memory to indicate a new packet (the doorbell register as it is called).
> > > > 
> > > > Well, the ring is getting full with no interrupt or packets transmitted.
> > > > I simply added a write to the doorbell register and it works for me.
> > > > Yevgeny says this is not the right fix, claiming there is a problem with
> > > > write combining on POWER. The code uses memory barriers, so I don't know
> > > > why there is any problem.
> > > > 
> > > > I am posting the code here to show better what the situation is.
> > > > Yevgeny can tell more about the device and the driver.
> > > > 
> > > > The code below is the driver as of now, including a diff with what I
> > > > changed and had resulted OK for me. Before the blue frame support, the
> > > > only code executed was the else part. I can't tell much what the device
> > > > should be seeing and doing after the blue frame part of the code is
> > > > executed. But it does send the packet if I write to the doorbell
> > > > register.
> > > > 
> > > > Yevgeny, can you tell us what the device should be doing and why you
> > > > think this is a problem on POWER? Is it possible that this is simply a
> > > > problem with the firmware version?
> > > > 
> > > > Thanks,
> > > > Cascardo.
> > > > 
> > > > ---
> > > >         if (ring->bf_enabled && desc_size <= MAX_BF && !bounce &&
> > > > !vlan_tag) {
> > > >                 *(u32 *) (&tx_desc->ctrl.vlan_tag) |=
> > > > ring->doorbell_qpn;
> > > >                 op_own |= htonl((bf_index & 0xffff) << 8);
> > > >                 /* Ensure new descirptor hits memory
> > > >                 * before setting ownership of this descriptor to HW */
> > > >                 wmb();
> > > >                 tx_desc->ctrl.owner_opcode = op_own;
> > > > 
> > > >                 wmb();
> > > > 
> > > >                 mlx4_bf_copy(ring->bf.reg + ring->bf.offset, (unsigned
> > > > long *) &tx_desc->ctrl,
> > > >                      desc_size);
> > > > 
> > > >                 wmb();
> > > > 
> > > >                 ring->bf.offset ^= ring->bf.buf_size;
> > > >         } else {
> > > >                 /* Ensure new descirptor hits memory
> > > >                 * before setting ownership of this descriptor to HW */
> > > >                 wmb();
> > > >                 tx_desc->ctrl.owner_opcode = op_own;
> > > > -               wmb();
> > > > -               writel(ring->doorbell_qpn, ring->bf.uar->map +
> > > > MLX4_SEND_DOORBELL);
> > > >         }
> > > > 
> > > > +       wmb();
> > > > +       writel(ring->doorbell_qpn, ring->bf.uar->map +
> > > > MLX4_SEND_DOORBELL);
> > > > +
> > > > ---
> 


More information about the Linuxppc-dev mailing list