[Cbe-oss-dev] [patch 04/16] Common Core of the Axon driver

Jean-Christophe Dubois jdubois at mc.com
Fri May 25 18:47:05 EST 2007


On Friday 25 May 2007 02:13:10 Arnd Bergmann wrote:
> On Thursday 24 May 2007, jdubois at mc.com wrote:
> > This is the main part of the driver and is common to the host
> > and the Cell attached to the Axon chip.
>
> It's very sad that much of this patch is duplicating code
> that is already there, or not using the right abstractions.
>
> If you get the feeling that you need to write infrastructure
> code, it's almost always wrong, because others have already
> done the parts you need. Please just ask other people before
> you create your own interface layers, and post your code
> early and often, that's the only way we can avoid pointless
> work and frustration from rejected code.

Well, I do feel I needed to write this infrastructure code. Let me explain. 

This core part of the Axon driver can be seen as a "nexus" driver: it 
attaches to all the hardware resources, abstracts them behind a common API 
regardless of which side (PCI/BEI) of the Axon you are on, and provides a set 
of higher-level, shared services to "client" drivers so that these clients 
don't have to deal with hardware details (PCI or BEI) as much. The client 
drivers (Ethernet over PCI, soft reset, buffer driver, ...) are not attached 
directly to the hardware resources; they share them (MBX, DMA, mapping, ...) 
through the Axon nexus driver. This allows more than one "client" driver to 
use the DMA, MBX, ... from the kernel.

Now, when a "client" driver is loaded into the kernel, I need to notify it 
about all the Axon nexus devices known so far. Along the same lines, if an 
Axon "nexus" device is added to or removed from the kernel, I need to notify 
all the already registered "client" drivers that an Axon "nexus" device has 
been found or has gone away.

> > +static struct list_head axon_list;
> > +static struct list_head driver_list;
>
> You shouldn't need any global lists like these, drivers and
> devices are maintained by the driver core in Linux.

There is nothing I could find for this "nexus" driver concept with attached 
"client" drivers. I need to store this kind of information somewhere.

> > +struct axon_mbox_t *axon_peer_mbox_get(axon_t * p_axon)
> > +{
> > +	return p_axon->peer_mbox_get(p_axon);
> > +}
> > +
> > +EXPORT_SYMBOL(axon_peer_mbox_get);
>
> Why not EXPORT_SYMBOL_GPL? Shouldn't this be in the mailbox driver?

OK, I'll make everything EXPORT_SYMBOL_GPL.

About the location: it needs to be common. There are two types of mailboxes, 
the Axon hardware MBX on the Cell side and the software emulation on the host 
side. This code allows pointing to the peer MBX regardless of the MBX type.

> > +static int axon_finish_reset(axon_t * p_axon)
> > +{
> > +	struct list_head *p_cursor;
> > +	struct list_head *p_next;
> > +	axon_driver_t *driver_tmp;
> > +
> > +
> > +	list_for_each_safe(p_cursor, p_next, &driver_list) {
> > +		driver_tmp = list_entry(p_cursor, axon_driver_t, list);
> > +
> > +		if (driver_tmp->start)
> > +			driver_tmp->start(p_axon);
> > +	}
> > +
> > +	return 0;
> > +}
>
> What are the drivers that you are iterating over? I would think that
> you should iterate over the child devices instead.

Here I am iterating over the "client" drivers to tell them that the ongoing 
soft reset is done and that they can start using the "nexus" driver again, 
knowing that the "remote" side has been completely reset.

> > +int axon_driver_register(axon_driver_t * p_driver)
> > +{
> > +	struct list_head *p_cursor;
> > +	struct list_head *p_next;
> > +	axon_t *axon_tmp;
> > +
> > +
> > +	list_add_tail(&p_driver->list, &driver_list);
> > +
> > +
> > +	list_for_each_safe(p_cursor, p_next, &axon_list) {
> > +		axon_tmp = list_entry(p_cursor, axon_t, list);
> > +
> > +		if (p_driver->probe)
> > +			p_driver->probe(axon_tmp);
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +EXPORT_SYMBOL(axon_driver_register);
>
> The ->probe function should be called by the driver core when
> you call driver_register(&p_driver->driver).

Once again, this is for the "clients" of the nexus driver. When a "client" 
driver registers, I need to tell it about all the Axons (PCI and/or BEI) 
known so far.

> > +#if LINUX_VERSION_CODE > KERNEL_VERSION(2,6,9)
>
> huh?

Support for RH4 on the host ...

> > +static struct class *axon_class = NULL;
> > +
> > +int axon_create_class(void)
> > +{
> > +	if (axon_class == NULL) {
> > +		axon_class = class_create(THIS_MODULE, "axon");
> > +
> > +		if (axon_class != NULL)
> > +			return 0;
> > +		else
> > +			return -1;
> > +	} else {
> > +		return 1;
> > +	}
> > +
> > +}
> > +
> > +int axon_destroy_class(void)
> > +{
> > +	if (axon_class != NULL) {
> > +		class_destroy(axon_class);
> > +		axon_class = NULL;
> > +	}
> > +
> > +	return 0;
> > +}
> > +
>
> this file looks entirely superfluous. It creates an abstraction
> for a case that is guaranteed not to happen.

The problem was the change of API between 2.6.9 and the present kernel, so I 
needed to "unify" it in some way. I guess this is not required for inclusion 
in the mainline kernel tree, but when I build for "antique" hosts, I need 
something.

> > +static axon_dma_pkt_t *axon_dma_request_pkt_alloc_P(axon_dma_req_t *
> > +						    p_dma_req, int *p_index)
> > +{
> > +	axon_dma_pkt_t *p_dma_pkt;
> > +	int index = p_dma_req->current_pkt;
> > +
> > +
> > +	if (index >= MAX_CMD_PKT) {
> > +		dbg_err("A DMA req cannot have more than %d packets\n",
> > +			MAX_CMD_PKT);
> > +		return NULL;
> > +	}
> > +
> > +
> > +	p_dma_pkt = &(p_dma_req->pkts[index]);
> > +
> > +
> > +	p_dma_pkt->addr =
> > +	    dma_pool_alloc(p_dma_req->p_dma_pool, GFP_ATOMIC,
> > +			   &p_dma_pkt->cpu_bus_addr);
>
> Do you plan to port this to an ISA based system? Seriously, on a modern
> system you don't want to use dma_pool interfaces. Just use the dma-mapping
> API.

This is to allocate the command packets for the DMA; it seemed appropriate. 
For the data transferred by the DMA we are using the dma-mapping API. Do you 
think I should create and manage my own pool of DMA command packets?

> > +	if ((type != AXON_SYS_SLAVE) && (create_proc_read_entry
> > +					 ("dmax_regs", 0,
> > +					  axon_get_device_dir(p_axon),
> > +					  axon_dma_ops_dmax_reg_red_proc,
> > +					  p_dma_ops_dmax) == NULL)) {
> > +		dbg_err("Failed to create dmax_regs proc entry\n");
> > +	}
>
> Don't add new files to procfs, it's already messy enough.
> Instead add attributes with one value per file to the sysfs
> representation of your device. Of course, that requires the
> use of device abstraction in the first place.

I'll look into sysfs. This is for debugging help.

> > +static inline void axon_mbox_hw_write_reg(axon_mbox_hw_t * p_mbox,
> > +					  u32 value, u32 dcr_addr)
> > +{
> > +#ifdef __powerpc__
> > +	out_be32((p_mbox->p_regs) + dcr_addr, value);
> > +	wmb();
> > +#endif
> > +}
>
> If it's mapped on DCR, you should use the dcr_map/dcr_write/dcr_read
> interfaces instead of inventing your own.

I'll look at this.

> > +#if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,12)
> > +static void cancel_rearming_delayed_workqueue(struct workqueue_struct *wq,
> > +					      struct work_struct *work)
> > +{
> > +	while (!cancel_delayed_work(work))
> > +		flush_workqueue(wq);
> > +}
> > +#endif
>
> Kill this. Where did this come from in the first place? There was never
> a kernel running on AXON before 2.6.18!

Well, I ran Linux 2.6.16 on CAB before, but that is still newer than 2.6.12. 
As I told you, I need to support ancient kernels (RH4 = 2.6.9). It should 
certainly be removed before inclusion in the actual tree. This code is for 
the PCI host only, and it is just a copy/paste of the same code from the 
current kernel, for antique kernels that lack it.

> > +static void axon_mbox_hw_stop_delayed_work(axon_mbox_hw_t * p_axon_mbox_hw)
> > +{
> > +	if (!p_axon_mbox_hw->irq_poller_wq)
> > +		return;
> > +
> > +	cancel_rearming_delayed_workqueue
> > +	    (p_axon_mbox_hw->irq_poller_wq, &p_axon_mbox_hw->irq_poller);
> > +
> > +	destroy_workqueue(p_axon_mbox_hw->irq_poller_wq);
> > +
> > +	p_axon_mbox_hw->irq_poller_wq = NULL;
> > +}
>
> You shouldn't destroy the workqueue when stopping a work item in it.
> Creating a workqueue is a very heavyweight operation that you should
> not do during normal operation. Do you really need your own workqueue?

This is not a "normal" operation. The MBX driver is going away, so I need to 
clean up everything.

> > +	if (create_proc_read_entry
> > +	    ("HwMbx", 0, axon_get_device_dir(p_axon),
> > +	     axon_mbox_hw_reg_read_proc, p_axon_mbox_hw) == NULL) {
> > +		dbg_err("Failed to create reg /proc file for HwMbx\n");
> > +	}
>
> Another proc entry that shouldn't be there.

I'll look into sysfs. This is for debugging help.

> > +static int map_on_local_bus(axon_pio_pci_t * p_axon_pio_pci,
> > +			    plb_addr_t dst, dma_addr_t * p_bus_addr,
> > +			    size_t size)
> > +{
> > +	int ret = 0;
> > +
> > +	dma_addr_t dst_bus_addr;
> > +
> > +	dbg_log("begin: dst = 0x%016" AXON_PLB_ADDR_FMT_T ", size = %d \n",
> > +		dst, (int)size);
> > +
> > +
> > +	p_axon_pio_pci->p_axon->pim_map(p_axon_pio_pci->p_pim, dst, size);
> > +
> > +	dbg_log("pim are set\n");
> > +
> > +
> > +	dst_bus_addr =
> > +	    p_axon_pio_pci->p_addr_xltr->from_plb(p_axon_pio_pci->
> > +						  p_addr_xltr, dst);
> > +
> > +
> > +	if (dst_bus_addr != AXON_INVALID_BUS_ADDR) {
> > +
> > +		dbg_log("processor address = 0x%016" AXON_PLB_ADDR_FMT_T
> > +			" \n", dst_bus_addr);
> > +
> > +		*p_bus_addr = dst_bus_addr;
> > +
> > +	} else {
> > +
> > +		ret = -EIO;
> > +		dbg_log("Unable to find an existing PCI address for 0x%016"
> > +			AXON_PLB_ADDR_FMT_T " \n", dst);
> > +		*p_bus_addr = 0x0;
> > +	}
> > +
> > +	return ret;
> > +}
>
> What does this do that can't be done by the firmware?

This is common code used on the host and on the Cell. On the host you get 
2 x 128MB PCI BARs to map the 1TB Axon memory map, so you need to reprogram 
the mapping on the fly, depending on your target address on the PLB, when 
doing PIO. We don't do much PIO (we try to use the DMA as much as possible), 
but when it is required you need the support.

> > +#define AXON_READ(size,type) \
> > +static int axon_pio_pci_read##type(axon_pio_t* p_axon_pio, plb_addr_t src, u##size * p_data) \
> > +{ \
> > +	int ret = 0; \
> > +	axon_pio_pci_t* p_axon_pio_pci = p_axon_pio -> context; \
> > +	void* src_cpu_addr; \
> > + \
> > +	 \
> > +	src_cpu_addr = do_ioremap( p_axon_pio_pci, src, size / 8); \
> > +	if ( src_cpu_addr != NULL ) { \
> > +\
> > +		*p_data = read##type( src_cpu_addr ); \
> > +\
> > +		do_iounmap( p_axon_pio_pci, src, src_cpu_addr ); \
> > +\
> > +	} else {\
> > +		dbg_err("Unable to map IO space located on the bus 0x%p(%lld bytes) \n", src_cpu_addr, (unsigned long long)size / 8);\
> > +		ret = -EIO;\
> > +	}\
> > +\
> > +	return ret;\
> > +}
>
> Why do you need to access the I/O space? Are there registers that are not
> in memory BARs? Note that on powerpc, we ioremap the whole I/O space
> at boot time, so you can simply call the out{b,w,l} macros to write into
> them, and on x86, you don't ioremap I/O ports in the first place.

See above. You need to reprogram the PIM to target the requested address when 
running on the PCI host.

> > Index: linux-2.6.21/drivers/axon/common/axon_proc.c
>
> Just kill this whole file please.
>
> > Index: linux-2.6.21/drivers/axon/common/axon_sys.c~
> > ===================================================================
> > --- /dev/null
> > +++ linux-2.6.21/drivers/axon/common/axon_sys.c~
>
> This looks like a backup file that you accidentally added to your
> quilt, please remove.

Yes, sorry. There are 2 or 3 such instances in my patches; I was not careful 
enough.

> 	Arnd <><




