[Cbe-oss-dev] [RFC 1/9] AXON - common driver core

Sat Dec 23 03:38:49 EST 2006

On Thursday 21 December 2006 22:13, Arnd Bergmann wrote:
> On Wednesday 20 December 2006 12:13, jdubois at mc.com wrote:
> > @@ -0,0 +1,111 @@
> > +/******************************************************************
> > + * Copyright (C) 2006  Mercury Computer Systems, Inc.
> > + * 199 Riverneck Road
> > + * Chelmsford, MA 01824-2820
> > + * (978) 256-1300
> > + * webinfo at mc.com
>
> It's common to list the name and address of the actual person
> responsible for the code instead of a info at example.com style address.
> Unless that would violate your company rules, please change this.

Well, I'll investigate, but I have been asked to put this header. I'll see if 
I can change it.

> Generally, you should follow the rules from Documentation/CodingStyle
> religiously when submitting kernel code. There's nothing wrong with
> your style, we just like to keep everything looking the same in the
> kernel. I'll list my comments about this marked with [style].

OK, I'll try to convert to the religion ...

> > +static struct class *axon_class = NULL;
>
> [style] don't initialize static or global variables to NULL/0,
> the compiler will do this automatically.

OK

> > +int
> > +axon_create_class()
>
> [style] write this as
> int axon_create_class(void)

Wow, I'll have to change this type of thing everywhere.

>
> if possible, make the function static.

We tried (I believe) to make them static whenever possible. This particular 
one is global (used elsewhere in pci and bei).

> > +	if (axon_class == NULL) {
> > +		axon_class = class_simple_create(THIS_MODULE, "axon");
> > +
> > +		if (axon_class != NULL)
> > +			return 0;
> > +		else
> > +			return -1;
> > +	} else {
> > +		return 1;
> > +	}
> > +
> > +}
>
> I prefer to keep the if() statements like 'if (axon_class)'
> or 'if (!axon_class)' when testing for a NULL pointer. Benh
> disagrees with me, so do as you like ;-)
>
> Regarding return values, errors should always be reported as
> negative errno.h numbers, not '-1'.

Well, here 0 mean "created/success", 1 means "already there" and -1 means 
error. I guess I could return 0 for "already there".

> > +EXPORT_SYMBOL(axon_create_class);
> > +EXPORT_SYMBOL(axon_destroy_class);
> > +EXPORT_SYMBOL(axon_get_class);
>
> The EXPORT_SYMBOL should follow the symbol definition, not
> come as a separate block. When adding new exports, please
> use EXPORT_SYMBOL_GPL.

I'll look into it.

> > +#include <linux/module.h>
> > +#include <linux/kernel.h>
> > +#include <linux/vermagic.h>
> > +#include <linux/version.h>
> > +
> > +#include <linux/device.h>
> > +#include <linux/dmapool.h>
>
> [style] by convention, order includes alphabetically.

Ah ... OK

> > +#include <axon_utils.h>
> > +#include <axon_io.h>
> > +#include <axon_dbg.h>
> > +#include <axon_defs.h>
> > +#include <axon_bitops.h>
> > +#include <axon_dma_ops_dma_aux.h>
>
> [style] Your own headers should move to a more appropriate place.
> Ideally, this is the local directory, with '#include "axon_foo.h"'
> lines. If that doesn't work, put them into include/asm.

This can't be "include/asm" because it need to be available to all PCI-E host 
architecture. So it need to be something "common". I didn't want to go an 
pollute "include/linux" directly so I am actually seeking advice on where 
architecture independent header files related to the Axon should go.

> > +static int      axon_use_aux_dma = 1;
> > +
> > +module_param(axon_use_aux_dma, int, 0644);
> > +MODULE_PARM_DESC(axon_use_aux_dma,
> > +		 "This allows to define if the driver should try to use the Aux DMA if
> > present. By default it will try to discover and use it but you can
> > request to skip it by setting this parameter to 0");
>
> You should make this separate lines, like
>
> MODULE_PARM_DESC(axon_use_aux_dma,
> 	"This allows to define if the driver should try to use\n"
> 	"the Aux DMA if present. By default it will try to\n"
> 	"discover and use it but you can request to skip it\n"
> 	"by setting this parameter to 0");

OK

> > +static int      axon_aux_dma_is_broken = 1;
> > +
> > +module_param(axon_aux_dma_is_broken, int, 0644);
> > +MODULE_PARM_DESC(axon_aux_dma_is_broken,
> > +		 "On FCAB the AuxDMA is broken and can't cross 256 bytes bondaries
> > without corrupting data. If you know your AuxDMA is not broken you can
> > set this parameter to 0");
>
> This should be detected from a device tree property of the dma
> device.

On FCAB, the device tree is very minimal and doesn't have that much in it. I 
am not in control of the device tree. It is provided by IBM and it sounds 
like the FCAB device tree is stuck forever at that point. I'll see if I can 
come up with my own device tree using DTC and the flattened device tree.

> > +static int      axon_aux_dma_use_ben_bit = 1;
> > +
> > +module_param(axon_aux_dma_use_ben_bit, int, 0644);
> > +MODULE_PARM_DESC(axon_aux_dma_use_ben_bit,
> > +		 "Should we use the BEN bit on the DMA engine (required on FCAB)");
>
> same here.

Same comment. 

> > +static void
> > +link(struct axon_dma_ops_t *dma_ops,
> > +     struct axon_dma_pkt_t *p_dma_pkt_prev,
> > +     struct axon_dma_pkt_t *p_dma_pkt)
>
> It would be better to name the function uniquely, like 'axon_dma_link',
> even if it is static, to help interpret stack traces.

OK

> > +	*pp_dma_ops = p_dma_ops;
> > +
> > +	goto no_err;
> > +
> > +
> > +      err_dma_ops_dma_soft:
> > +	kfree(p_dma_ops);
> > +
> > +      err_dma_ops:
> > +
> > +
> > +      no_err:
> > +
> > +	return ret;
> > +
> > +}
>
> [style] Labels should not be indented at all, they go to the first column.

OK

> > +static int
> > +axon_dma_request_pkt_alloc_P(struct axon_dma_req_t *p_dma_req,
> > +			     struct axon_dma_pkt_t **pp_dma_pkt,
> > +			     int *p_index)
> > +{
> > +	int             ret = 0;
>
> It would be more common to make this return the axon_dma_pkt
> or NULL instead of 0 or error and pass the result in an argument.
>
> It's not clear what 'P' means (pointer, packet, ... ?). Maybe
> write the full word, and don't use capital letters.

the intend was to mean "Private" (static). As it is not respected in other 
files it should go.

> > +static int
> > +axon_dma_request_push_mbox_by_write(struct axon_dma_req_t *p_dma_req,
> > struct axon_dma_req_mbox_t +				    *p_dma_req_mbox)
> > +{
> > +	int             ret = 0;
> > +
> > +	struct axon_dma_pkt_t *p_dma_pkt;
> > +	struct axon_mbox_t *p_mbox = p_dma_req_mbox->p_dst_mbox;
> > +
> > +	ret = axon_dma_request_pkt_dat_alloc(p_dma_req, &p_dma_pkt);
> > +
> > +	if (ret >= 0) {
> > +
> > +		void           *p_mbox_msg = p_dma_pkt->addr;
> > +		plb_addr_t      mbox_plb_bus_dst;
> > +		dma_addr_t      mbox_cpu_bus_dst;
> > +		volatile void  *mbox_cpu_dst;
>
> Using 'volatile' in a variable declaration is practically always
> a bug. Why do you think you need it here?

This could be a pointer accross the PCI-E bus to the other side memory space. 
It has to be volatile when/if you were to write to it. Actually it is 
assigned but not used so I guess I could remove the "volatile" thing even if 
it sounds correct to me.

> > +
> > +struct axon_dma_req_t;
> > +
> > +struct axon_dmax_t {
> > +
>
> What does the _t stand for? By convention, this is often used
> for typedefs, which this isn't, so it can probably better be
> left out.

_t is for type. Hum this one is not "typedefs". I'll look into it.

> > +	struct addr_xltr_t *p_addr_xltr;
> > +
> > +
> > +	struct axon_t  *p_axon;
> > +
> > +
> > +	struct axon_mbox_t *p_axon_mbox_self;
> > +
> > +
> > +	struct axon_mbox_t *p_axon_mbox_peer;
> > +
> > +
> > +	struct axon_sms_t *p_axon_sms;
>
> Same for the p_ prefix.

pointer ...

> > +	volatile u8    *p_dma_channel;
>
> volatile seems wrong here.

it is a pointer to an IO space where live the DMA control registers. It can be 
on the PLB or on DCR depending on the DMA (Aux_DMA, DMAX).

> > +static inline void
> > +mbox_write_reg(struct axon_mbox_hw_t *p_mbox, u32 value, u32 dcr_addr)
> > +{
> > +	__raw_writel(value, (p_mbox->p_regs) + dcr_addr);
> > +
> > +}
> > +
> > +static inline   u32
> > +mbox_read_reg(struct axon_mbox_hw_t *p_mbox, u32 dcr_addr)
> > +{
> > +	return __raw_readl((p_mbox->p_regs) + dcr_addr);
> > +
> > +}
>
> __raw_writel and __raw_readl are somewhat underspecified, so I'd
> rather avoid them. out_be32 and in_be32 are probably what you
> want.

remember, it also work on LE platform ...

> > +static int
> > +map_on_local_bus(struct axon_pio_pci_t *p_axon_pio_pci, plb_addr_t dst,
> > +		 dma_addr_t * p_bus_addr, size_t size)
> > +{
> > +	int             ret = 0;
> > +
> > +	dma_addr_t      dst_bus_addr;
> > +
> > +
> > +	if (p_axon_pio_pci->p_pim != NULL) {
> > +
> > +		axon_pim_map(p_axon_pio_pci->p_pim, dst, size);
> > +	}
> > +
> > +
> > +
> > +	dst_bus_addr =
> > +	    p_axon_pio_pci->p_addr_xltr->from_plb(p_axon_pio_pci->
> > +						  p_addr_xltr, dst);
> > +
> > +
> > +	if (dst_bus_addr != AXON_INVALID_BUS_ADDR) {
> > +
> > +		*p_bus_addr = dst_bus_addr;
> > +
> > +	} else {
> > +
> > +		ret = -EIO;
> > +		dbg_log("Unable to find an existing PCI address for 0x%016"
> > +			AXON_PLB_ADDR_FMT_T " \n", dst);
> > +		*p_bus_addr = 0x0;
> > +	}
> > +
> > +	return ret;
> > +
> > +}
>
> This is probably part of the comments you removed, but why do you need
> to map anything physically? This should all be taken care of by
> the firmware.

On the PCI side you (typically) have 2 x 128 MB bar to map 1TB Axon space. So 
you need to "allocate" bars/PIM, set them to target the requested PLB 
address ... It is not a  1 to 1 mapping. It is a lot more restricted over 
PCI-E (from the host).

> > +#define AXON_READ(size,type) \
> > +static int \
> > +axon_pio_pci_read##type(struct axon_pio_t* p_axon_pio, plb_addr_t src,
> > u##size * p_data) \ +{ \
> > +	int ret = 0; \
> > +	struct axon_pio_pci_t* p_axon_pio_pci = p_axon_pio -> context; \
> > +	void* src_cpu_addr; \
> > + \
> > +	 \
> > +	src_cpu_addr = do_ioremap( p_axon_pio_pci, src, size / 8); \
> > +	if ( src_cpu_addr != NULL ) { \
> > +\
> > +		*p_data = read##type( src_cpu_addr ); \
> > +\
> > +		do_iounmap( p_axon_pio_pci, src, src_cpu_addr ); \
> > +\
> > +	} else {\
> > +		dbg_err("Unable to map IO space located on the bus 0x%p(%lld bytes)
> > \n", src_cpu_addr, (unsigned long long)size / 8);\ +		ret = -EIO;\
> > +	}\
> > +\
> > +	return ret;\
> > +}
>
> This seems like a _huge_ overhead for accessing a single register.
> If this is on the PCI bus, you should really be able to use
> the normal PCI subsystem.

Same as above, you need to allocate, set, unset, free the PIM on the PCI side.

> > +inline void
> > +axon_ring_msg_alloc(struct axon_ring_t *p_axon_ring,
> > +		    dma_addr_t * p_bus_addr, void **pp_cpu_addr)
>
> Inline functions should practically always be 'static inline'
> with gcc. As with other 'alloc' type functions, you should
> return the resulting pointer.
>
> > +static          irqreturn_t
> > +sys_int_handler(void *data, int irq, void *dev_id, struct pt_regs *regs)
>
> A function name should only start with 'sys_' if it is a system call
> handler. Better make this 'axon_sys_int_handler'.

OK

> > +#if defined( __powerpc__)
> > +
> > +#	 define BE_HI32_FROM_BE64(x) ((__u32)(  (x) >> 32))
> > +#	 define BE_LO32_FROM_BE64(x) ((__u32)(  (x) & ((1ULL<<32)-1) ))
> > +#        define LE_HI32_FROM_LE64(x) ((__u32)(  (x) & ((1ULL<<32)-1) ))
> > +#        define LE_LO32_FROM_LE64(x) ((__u32)(  (x) >> 32))
>
> I guess you could define these in an architecture neutral
> way as
>
> static inline u32 be_hi32_from_be64(u64 val)
> {
> 	return *((u32 *)&val);
> }
>
> and so on.
>
> I'm also not sure why you need this in the first place. The low-level
> I/O accessors should already care about endianess.

here we are accessing the Axon (BE) from the host (potentially LE). So we need 
to be very carefull about endianess. This is true all around the code.

> > +#	 define GET_BIT_MASK_FROM_IBM_DOC(nr,type) (1ULL << ( (sizeof(type)*8
> > - 1) - nr ))
>
> Nice one ;-)
>
> I usually avoid confusing the numbers from the docs by always specifying
> bit masks instead of bit numbers, e.g.
>
> #define FOO_BIT 10
>
> I write
>
> #define FOO 0x0040

Well we are not sure how the public Axon spec will be but in the one that was 
provided to us, this is how things were defined. So we preffer to stick to 
the spec definition and come up with translation rules rather than hard to 
reconcile values between the spec and the code.

I guess this is a choice. Is it really unacceptable?

> > +#define BExx_clr_bit(size) \
> > +static inline u##size \
> > +be##size##_clr_bit(u##size reg##size,int nr) \
> > +{ \
> > +	reg##size = reg##size & ~(GET_BIT_MASK_FROM_IBM_DOC(nr,u##size)); \
> > +	return reg##size; \
> > +}
>
> If you really need to work with bit numbers, it's better to
> use the functions from asm/bitops.h instead of defining your own.
>
> > +
> > +#if defined(AXON_DEBUG_MODULE) || defined(AXON_DEBUG)
> > +#define dbg_log(...) printk(KERN_INFO  AXON_DRIVER_NAME "=>" __VA_ARGS__
> > ) +#else
> > +#define dbg_log(...) do { } while(0)
> > +#endif
> > +
> > +#define dbg_inf(...) printk(KERN_INFO AXON_DRIVER_NAME "=>" __VA_ARGS__
> > ) +
> > +#define dbg_err(...) printk(KERN_EMERG AXON_DRIVER_NAME "=>" __VA_ARGS__
> > ) +
> > +#endif
>
> [style] Don't define these yourself, just use pr_debug, pr_info or
> (better) dev_dbg, dev_info, dev_err, which are provided by the kernel.

I'll look at this.

> > +
> > +#define AXON_PLB_XDR_CHAN0_OFFSET 0x6000006000000000ULL
> > +#define AXON_PLB_XDR_CHAN0_MASK   0xFFFFFFF000000000ULL
> > +
> > +
> > +#define AXON_PLB_MASK_FROM_BE 	  0x000000FFFFFFFFFFULL
> > +#define AXON_PLB_OFFSET_FROM_BE   0x0000010000000000ULL
> > +
> > +#define AXON_PLB_NIBBLE_MASK 	  0x000000F000000000ULL
> > +#define AXON_PLB_NIBBLE_SHIFT     24
> > +
> > +#define AXON_MPCI_INTR_COUNT 	 128
> > +
> > +
> > +#define AXON_BEI_INTR_COUNT 	 (AXON_MPCI_INTR_COUNT + 128)
> > +#define AXON_BEI_INTR_MBOX_DIRECT 	 136
> > +#define AXON_BEI_INTR_MBOX_INDIRECT 	 116
> > +#define AXON_BEI_INTR_MBOX_OVERFLOW 	 117
> > +#define AXON_BEI_INTR_AUX_DMA_CH0_ERR 	 12
> > +#define AXON_BEI_INTR_AUX_DMA_CH1_ERR 	 13
> > +#define AXON_BEI_INTR_AUX_DMA_CH2_ERR 	 14
> > +#define AXON_BEI_INTR_AUX_DMA_CH3_ERR 	 15
> > +#define AXON_BEI_INTR_DMAX_CH0           119
> > +#define AXON_BEI_INTR_DMAX_CH1           120
> > +#define AXON_BEI_INTR_DMAX_CH2           121
> > +#define AXON_BEI_INTR_DMAX_CH3           122
> > +#define AXON_BEI_INTR_DMAX_CH0_ERR       123
> > +#define AXON_BEI_INTR_DMAX_CH1_ERR       124
> > +#define AXON_BEI_INTR_DMAX_CH2_ERR       125
> > +#define AXON_BEI_INTR_DMAX_CH3_ERR       126
> > +#define AXON_BEI_INTR_DMAX_SLAVE_ERR     127
> > +
>
> You can't hardcode these numbers, that immediately breaks with
> newer kernels. Instead, read the numbers from the device tree.

Remember that I still don't have a "new" OF tree and for the time being I need 
to support the CAB/FCAB I have. As soon as I get the new device tree and I 
can move to 2.6.19 or later this could go I guess (assuming I come up with a 
solution for FCAB).

> > +#define AXON_PLB_DDR_OFFSET  	 0x0000000000000000ULL
> > +#define AXON_PLB_DDR_SIZE  	 0x0000001000000000ULL
> > +
> > +#define AXON_PLB_UTL_REGS_OFFSET  0xA00000A000004000ULL
> > +#define AXON_PLB_UTL_REGS_SIZE    0x1000
> > +
> > +#define AXON_PLB_CFG_REGS_OFFSET  0xA00000A000005000ULL
> > +#define AXON_PLB_CFG_REGS_SIZE    0x1000
> > +
> > +#define AXON_PLB_DCR_REGS_OFFSET  0x8000008000000000ULL
> > +#define AXON_PLB_DCR_REGS_SIZE    (0x240 * 4)
> > +#define AXON_PLB_PCIE_OFFSET      0xC00000C000000000ULL
> > +#define AXON_PLB_PCIE_MASK        0x00000007FFFFFFFFULL
> > +
> > +#define AXON_PLB_REGS_SLV_OFFSET  0x4000004400000000ULL
> > +#define AXON_PLB_REGS_SLV_SIZE    0x5000
> > +
> > +#define AXON_PLB_PCIE_REGS_OFFSET 0xA00000A000000000ULL
>
> Same for the offsets. The kernel must not care about them.

Same reason ...

In addition the host side of thing (opteron based for example) has no OF tree 
to play with. So things need to be hardcoded a bit. Note that these are Axon 
internal addresses and should always be valid (from the host or the Cell).

> > +
> > +typedef enum {
> > +	AXON_DMA_CHANNEL_INVALID,
> > +	AXON_DMA_CHANNEL_0,
> > +	AXON_DMA_CHANNEL_1,
> > +	AXON_DMA_CHANNEL_2,
> > +	AXON_DMA_CHANNEL_3,
> > +
> > +} axon_dma_channel_t;
>
> [style] No need for a typedef here, just make this 'enum axon_dma_channel'
>
> > +typedef enum {
> > +	AXON_WIDTH_BYTE = 0,
> > +	AXON_WIDTH_HALFWORD,
> > +	AXON_WIDTH_WORD,
> > +	AXON_WIDTH_DOUBLEWORD,
> > +	AXON_WIDTH_QUADWORD
> > +} axon_dma_xfer_width;
> > +
> > +
> > +typedef enum {
> > +
> > +	DMA_NO_INTR,
> > +
> > +
> > +	DMA_INTR_WAIT,
> > +
> > +
> > +	DMA_INTR_NOWAIT
> > +} axon_dma_intr_t;
> > +
>
> same for these
>
> > +static inline void
> > +dma_aux_write(struct axon_dma_ops_dma_aux_t *p_axon_dma_ops_dma_aux,
> > +	      u32 reg, u32 value)
> > +{
> > +	axon_writel(value, p_axon_dma_ops_dma_aux->p_regs + reg);
> > +	wmb();
> > +}
>
> better use out_be32

Think LE ..

> > +static inline   u32
> > +dma_aux_read(struct axon_dma_ops_dma_aux_t *p_axon_dma_ops_dma_aux,
> > +	     u32 reg)
> > +{
> > +	return axon_readl(p_axon_dma_ops_dma_aux->p_regs + reg);
> > +}
>
> and in_be32

think LE ...

> > +#ifndef __powerpc__
> > +#define axon_virt_to_bus(x) virt_to_bus( (x) )
> > +#else
> > +#define axon_virt_to_bus(x) virt_to_phys( (x) )
> > +#endif
>
> virt_to_bus is inherently broken, you can't use that. Instead, you
> need to use the dma mapping functions.

Yes, I need to work on this.

> > +#define AXON_MAX_PIM_REG 4
> > +
> > +struct axon_pim_reg_t {
> > +
> > +	u64             mask;
> > +	u64             base;
> > +	int             users;
> > +	u32             reg;
> > +
> > +	u32             bar_id;
> > +	u32             bar_len;
> > +};
>
> What do you need these for? Shouldn't the firmware set them all up
> correctly?

See above ... from the host you need to go through limited bars and change the 
setting of these bars to get to the required location in the Axon PLB 
address.

> > +typedef u64     plb_addr_t;
> > +typedef u64     axon_size_bus_t;
> > +
> > +#if defined(__x86_64__) || defined(__powerpc64__)
> > +typedef u64     axon_addr_ptr_t;
> > +typedef u64     axon_size_t;
> > +#elif defined(__i386__)  || ( defined(__powerpc__) &&
> > !defined(__powerpc64__) ) +
> > +typedef u32     axon_addr_ptr_t;
> > +typedef u32     axon_size_t;
> > +
> > +#else
> > +
> > +#error "Unsupported platform. Please add the right type mapping"
> > +
> > +#endif
>
> Can't you just check BITS_PER_LONG instead of the arch specific
> defines?

I'll look into it.

> > +
> > +#define MIN(i,j)               ((i) < (j) ? (i) : (j))
> > +
>
> This one is broken when the arguments have side effects.
> Please use the min() and max() macros provided by the kernel
> instead.

OK

> > +#if defined(__i386__) && !defined(__x86_64__)
> > +#	 define AXON_PLB_ADDR_FMT_T 	 "llx"
> > +#	 define AXON_PADDR_FMT_T 	 "x"
> > +#	 define AXON_VADDR_FMT_T 	 "x"
> > +#	 define AXON_DMA_ADDR_FMT_T 	 "x"
> > +#	 define AXON_U64_FMT_T 	 	 "llx"
> > +#elif defined(__powerpc64__)
> > +#	 define AXON_PLB_ADDR_FMT_T 	 "lx"
> > +#	 define AXON_PADDR_FMT_T 	 "lx"
> > +#	 define AXON_VADDR_FMT_T 	 "lx"
> > +#	 define AXON_DMA_ADDR_FMT_T 	 "lx"
> > +#	 define AXON_U64_FMT_T 	 	 "lx"
> > +#elif defined(__x86_64__)
> > +#      define AXON_PLB_ADDR_FMT_T      "llx"
> > +#      define AXON_PADDR_FMT_T         "llx"
> > +#      define AXON_VADDR_FMT_T         "llx"
> > +#      define AXON_DMA_ADDR_FMT_T      "llx"
> > +#      define AXON_U64_FMT_T           "llx"
> > +#else
> > +#	 error "Unsupported platform"
> > +#endif
>
> I think there was a better way to do this, but don't remember
> right now.

If we can make this cleaner, I am all for it.

> 	Arnd <><