Xilinx devicetrees

Wed Nov 28 11:28:33 EST 2007

> -----Original Message-----
> From: John Williams [mailto:jwilliams at itee.uq.edu.au] 
> Sent: Tuesday, November 27, 2007 3:55 PM
> To: Stephen Neuendorffer
> Cc: David H. Lynch Jr.; linuxppc-embedded; Michal Simek
> Subject: Re: Xilinx devicetrees
> 
> 
> MicroBlaze is a highly configurable CPU in terms of its 
> instruction set, 
> features and so on.  To make use of this, it is essential that each 
> kernel image be compiled to match the CPU configuration.  While a 
> generic kernel, making use of no features (MUL, DIV, Shift, 
> MSR ops etc) 
> would run on any CPU, performance would be abysmal.

I think the userspace is actually much more critical than the kernel for
most of these things (with the exception of msrset/msrclr, and the
barrel shifter perhaps).  Unfortunately, even if you implement an
alternatives-style mechanism for the kernel, you're still hosed for
userspace.  Once I have a big enough system, it's just infeasible to
recompile everything anyway.  I think this is where autoconfig starts to
break down.

> In my view it's not acceptable to present these as options 
> for the user 
> to select at kernel config time. With N yes/no parameters, there is 1 
> correct configuration, and 2^N-1 incorrect ones.  The odds of 
> the user 
> falling upon a configuration that at worst fails to boot, or 
> at best is 
> not optimally matched to the hardware, are high.

Yes.  Autoconfig does handle this in a fairly nice way.
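
To put a number on the 2^N argument above, here is a minimal sketch that
enumerates every yes/no combination for a small feature set and counts how
many differ from the real hardware.  The feature names and the example
hardware configuration are made up for illustration; they are not actual
Kconfig symbols.

```python
from itertools import product

# Hypothetical feature list; names are illustrative, not real Kconfig symbols.
FEATURES = ["mul", "div", "barrel_shift", "msr_ops"]


def count_mismatches(hardware):
    """Return (total configurations, configurations that do not match hardware)."""
    all_configs = list(product([False, True], repeat=len(hardware)))
    wrong = [c for c in all_configs if c != tuple(hardware)]
    return len(all_configs), len(wrong)


# One particular (assumed) CPU build: MUL, barrel shifter and MSR ops, no DIV.
hardware = [True, False, True, True]
total, wrong = count_mismatches(hardware)
print(f"{len(FEATURES)} options: {total} configurations, {wrong} incorrect")
```

With only four options a user choosing by hand already has 15 ways to get
it wrong, and each extra option doubles the total.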

> This same issue also applies to C libraries and apps - they must be 
> compiled with prior knowledge of the CPU.  This is why our 
> microblaze-uclinux-gcc toolchain, with multilib'd uClibc, is 
> almost 400meg!
> 
> Wrapping every mul, div, shift etc in a function call is clearly not 
> feasible.  Things like the msrset/msrclr ops have a modest but 
> measurable impact on kernel code size and performance - it's just not 
> reasonable to add any level of indirection in there.
> 
> I have thought about dynamic (boot-time) code re-writing as one 
> possibility here, but it very quickly gets nasty.  All of the 
> "optimised" opcodes (MUL/DIV/Shift etc) are smaller than 
> their emulated 
> counterparts, so in principle we could re-write the text segment with 
> the optimised opcode, then NOP pad, but that's still inefficient.  As 
> soon as we start talking about dynamic code relocation, re-writing 
> relative offsets in loops, ... yuck..  We'd be putting half of mb-ld 
> into the arch early boot code (or bootloader...)
> 
> The opposite approach, to build with all instructions enabled and 
> install exception handlers to deal with the fixups, is also 
> pretty awful.

It's not nice, I agree.  I think the key principle should be that I
should be able to get a system working as quickly as possible, and I
might optimize things later.  One thing that device trees will allow is
for *all* the drivers to get compiled in to the kernel, and only as a
late stage operation does a designer need to start throwing things away.
Using traps I can easily start with a 'kitchen sink' design, and start
discarding processor features, relying on the traps.  When I get low
enough down on the performance curve, I can use an autoconfig-type
mechanism to regain a little of what I lost by optimizing away the trap
overhead. 
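
For comparison, the boot-time rewriting idea quoted above (replace the
emulated sequence with the hardware opcode, then NOP pad) can be sketched
on a toy instruction list.  This is only an illustration under simplifying
assumptions: instructions are modelled as strings, and the three-entry
"emulated multiply" sequence and the patch table are hypothetical, not
actual MicroBlaze compiler output.

```python
NOP = "nop"

# Hypothetical patch table: emulated sequence -> single hardware opcode.
PATCHES = {
    ("save_regs", "call __mulsi3", "restore_regs"): "mul",
}


def patch_text(text, patches):
    """Rewrite emulated sequences in place, NOP padding to preserve length."""
    out = list(text)
    for seq, opcode in patches.items():
        n = len(seq)
        i = 0
        while i + n <= len(out):
            if tuple(out[i:i + n]) == seq:
                # Same number of slots before and after, so no relative
                # branch offset elsewhere in the text needs fixing up.
                out[i:i + n] = [opcode] + [NOP] * (n - 1)
                i += n
            else:
                i += 1
    return out


text = ["load a", "load b", "save_regs", "call __mulsi3",
        "restore_regs", "store c"]
patched = patch_text(text, PATCHES)
print(patched)
```

Keeping the length fixed is what avoids relocation work, and it is also
why the result stays inefficient: the padding NOPs are still fetched and
executed on every pass through the patched code.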

Personally, I think the easiest way out of all this is to just have less
configurability.  For microblaze in general, this is too much of a
restriction, but for microblaze used as a control processor running linux,
there are probably just a few design points that really make sense
(probably size optimized: no options except maybe msrset/msrclr, and the
kitchen sink).  If we go that far, we don't really need people to ever
run autoconfig, or rebuild kernels, or anything.  Especially considering there
is no easy way of selecting which of the 2^N design points I want
*anyway*. :)

> I find myself asking the question - for what use cases does 
> the current 
> static approach used in MicroBlaze (with the PetaLinux BSP / 
> Kconfig.auto) *not* work?
> 
> One compromise approach might be to have a script in 
> arch/microblaze/scripts, called by the arch Makefile, that 
> cracks open 
> the DT at build time and extracts appropriate cpu flags.

Hmm... interesting idea, although parsing the source is likely
difficult...  It's probably not worth going this far, I think.  As
you say, why doesn't today's autoconfig work fine for this?
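
As a rough illustration of the build-time script idea, the sketch below
pulls single-cell properties out of .dts source text with a regular
expression and maps them to compiler options.  The property names, the
mapping table, and the example CPU node are assumptions made for
illustration, not an agreed binding.  It also sidesteps the hard part of
parsing the source: it ignores node structure, includes, and anything other
than simple "name = <value>;" lines.

```python
import re

# Assumed mapping from CPU-node properties to GCC MicroBlaze options.
PROP_TO_FLAG = {
    "xlnx,use-barrel": "-mxl-barrel-shift",
    "xlnx,use-hw-mul": "-mno-xl-soft-mul",
    "xlnx,use-div": "-mno-xl-soft-div",
    "xlnx,use-pcmp-instr": "-mxl-pattern-compare",
}

# Matches lines of the form:  name = <value>;
PROP_RE = re.compile(
    r"^\s*([\w,\-]+)\s*=\s*<\s*(0x[0-9a-fA-F]+|\d+)\s*>\s*;",
    re.MULTILINE,
)


def parse_cell(value):
    """Convert a decimal or 0x-prefixed hex cell to an int."""
    return int(value, 16) if value.lower().startswith("0x") else int(value)


def cpu_flags(dts_text):
    """Return sorted compiler flags for every enabled, known property."""
    flags = set()
    for name, value in PROP_RE.findall(dts_text):
        if name in PROP_TO_FLAG and parse_cell(value) != 0:
            flags.add(PROP_TO_FLAG[name])
    return sorted(flags)


# Hypothetical CPU node used only to exercise the function.
EXAMPLE_DTS = """
cpu@0 {
    xlnx,use-barrel = <1>;
    xlnx,use-div = <0>;
    xlnx,use-hw-mul = <1>;
};
"""

print(" ".join(cpu_flags(EXAMPLE_DTS)))
```

An arch Makefile could capture the printed string and append it to the
compiler flags, but a real tree would need a proper parser (or the compiled
blob) rather than a regular expression over the source.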

> Finally, what is the LKML position on DT files going into the kernel 
> source tree?

Source .dts go in and get compiled to binary blobs at compile time.  The
'big' recent controversy is whether the source->binary compiler dtc
should be mirrored in the Linux tree or not.

Steve