[Cbe-oss-dev] [RFC 1/9] AXON - common driver core

Wed Dec 27 07:00:57 EST 2006

On Friday 22 December 2006 17:38, Jean-Christophe Dubois wrote:
> On Thursday 21 December 2006 22:13, Arnd Bergmann wrote:
> > > +#include <axon_utils.h>
> > > +#include <axon_io.h>
> > > +#include <axon_dbg.h>
> > > +#include <axon_defs.h>
> > > +#include <axon_bitops.h>
> > > +#include <axon_dma_ops_dma_aux.h>
> >
> > [style] Your own headers should move to a more appropriate place.
> > Ideally, this is the local directory, with '#include "axon_foo.h"'
> > lines. If that doesn't work, put them into include/asm.
> 
> This can't be "include/asm" because it need to be available to all PCI-E host 
> architecture. So it need to be something "common". I didn't want to go an 
> pollute "include/linux" directly so I am actually seeking advice on where 
> architecture independent header files related to the Axon should go.

Yes, good point.

This depends a lot in what way the header is shared. The interesting cases
are:

1. only in one directory: easy, all headers go to that dir
2. arch specific, kernel only, but multiple locations: should
   go to include/asm.
3. kernel only, but multiple architectures, few headers: include/linux
4. kernel only, multiple architectures, many headers: include/linux/foo/
   or include/foo/.
5. shared with user space, linux specific: include/linux/{,foo/}, but
   needs to be announced/discussed on linux-kernel at vger.kernel.org.
6. shared with user space and other OSs: include/foo, definitely needs
   discussion.

I guess that we're talking about 3 or 4, so first put them into
include/linux/axon_*.h.

> > > +static int      axon_aux_dma_is_broken = 1;
> > > +
> > > +module_param(axon_aux_dma_is_broken, int, 0644);
> > > +MODULE_PARM_DESC(axon_aux_dma_is_broken,
> > > +		 "On FCAB the AuxDMA is broken and can't cross 256 bytes bondaries
> > > without corrupting data. If you know your AuxDMA is not broken you can
> > > set this parameter to 0");
> >
> > This should be detected from a device tree property of the dma
> > device.
> 
> On FCAB, the device tree is very minimal and doesn't have that much in it. I 
> am not in control of the device tree. It is provided by IBM and it sounds 
> like the FCAB device tree is stuck forever at that point. I'll see if I can 
> come up with my own device tree using DTC and the flattened device tree.

That's an interesting idea, using DTC should vastly simplify the round-trip
time even for other devices where the firmware is not final. If you need
kernel patches to integrate the flat device tree in this, please share them
so we can use the same mechanism.

In the long run, is FCAB relevant at all? My understanding was that this
was a prototype produced in extreme small quantities.
> >
> > Using 'volatile' in a variable declaration is practically always
> > a bug. Why do you think you need it here?
> 
> This could be a pointer accross the PCI-E bus to the other side memory space. 
> It has to be volatile when/if you were to write to it. Actually it is 
> assigned but not used so I guess I could remove the "volatile" thing even if 
> it sounds correct to me.

If it goes across a PCI{,-E} bus, it needs to be __iomem, not volatile,
and you can't dereference it at all, except with functions like ioread32,
readl or in_be32 (depending on your needs). These functions internally
cast to volatile as needed, so declaring the variable itself volatile
is pointless and can only cause the compiler to skip some optimizations.

readl/writel should be the right abstractions here, they also do
the correct endianess conversion.

> > > +	volatile u8    *p_dma_channel;
> >
> > volatile seems wrong here.
> 
> it is a pointer to an IO space where live the DMA control registers. It can be 
> on the PLB or on DCR depending on the DMA (Aux_DMA, DMAX).

s/volatile u8/u8 __iomem/ again. Since this is on PLB directly, not on PCI,
you should use in_be32/out_be32 and friends to access them.

> > __raw_writel and __raw_readl are somewhat underspecified, so I'd
> > rather avoid them. out_be32 and in_be32 are probably what you
> > want.
> 
> remember, it also work on LE platform ...

Exactly. __raw_writel accesses the data in the native endianess.
{in,out}_{be,le}{16,32,64} encode the endianess of the register, so
the CPU can do the right conversion.

It may still get tricky if the registers themselves appear different
depending on whether they are accessed though PLB5 or PCIe mappings.
In that case, you should probably abstract them to use either out_be32
(on PLB) or writel (on PCIe).

With the current code, I expect that you get the endianess wrong if
you plug the CAB card into a big-endian PCIe host.

> > This is probably part of the comments you removed, but why do you need
> > to map anything physically? This should all be taken care of by
> > the firmware.
> 
> On the PCI side you (typically) have 2 x 128 MB bar to map 1TB Axon space. So 
> you need to "allocate" bars/PIM, set them to target the requested PLB 
> address ... It is not a  1 to 1 mapping. It is a lot more restricted over 
> PCI-E (from the host).

You mean you need to remap the register area at run time in order to access
the whole space? What limits the BAR size on PCIe? I guess we'll have
to live with this hack then, but it looks like a potentially huge
performance bottleneck.

> Well we are not sure how the public Axon spec will be but in the one that was 
> provided to us, this is how things were defined. So we preffer to stick to 
> the spec definition and come up with translation rules rather than hard to 
> reconcile values between the spec and the code.
> 
> I guess this is a choice. Is it really unacceptable?

IMHO, it's fine to stay as close as possible to the spec if that one
is available and you can have a comment pointing to the original
definition next to it.

If your source code has to serve as a replacement for the actual
specification, I would prefer it to be readable instead.

> >
> > You can't hardcode these numbers, that immediately breaks with
> > newer kernels. Instead, read the numbers from the device tree.
> 
> Remember that I still don't have a "new" OF tree and for the time being I need 
> to support the CAB/FCAB I have. As soon as I get the new device tree and I 
> can move to 2.6.19 or later this could go I guess (assuming I come up with a 
> solution for FCAB).

ok
> > Same for the offsets. The kernel must not care about them.
> 
> Same reason ...
> 
> In addition the host side of thing (opteron based for example) has no OF tree 
> to play with. So things need to be hardcoded a bit. Note that these are Axon 
> internal addresses and should always be valid (from the host or the Cell).

Interesting point. I guess on the host side, the device is uniquely identified
at run time though its PCI ID, so having the offsets there would be common
practice and we can't do much better than that.

> > > +static inline void
> > > +dma_aux_write(struct axon_dma_ops_dma_aux_t *p_axon_dma_ops_dma_aux,
> > > +	      u32 reg, u32 value)
> > > +{
> > > +	axon_writel(value, p_axon_dma_ops_dma_aux->p_regs + reg);
> > > +	wmb();
> > > +}
> >
> > better use out_be32
> 
> Think LE ..

Ok, in this case I was referring to the wmb(), which is already
implied by out_be32. axon_writel() itself then should be either
out_be32 or writel, depending on whether it is compiled to go
to PLB or PCI mappings.

	Arnd <><