Xilinx devicetrees
John Williams
jwilliams at itee.uq.edu.au
Wed Nov 28 10:55:28 EST 2007
Hi folks,
Stephen Neuendorffer wrote:
> > Binding it to a kernel, is a non-starter for us.
>
> I agree that this is not the best way of leveraging the power of device
> trees. The point is that by using a device tree, you haven't lost
> anything you can do currently. In the future we might have one kernel
> which supports all versions of all our IP, along with all flavors of
> microblaze and powerpc... You would only ever need to recompile this
> kernel as a final optimization, if at all.
I strongly support the OF / device tree work being done, both on its own
merits and as a path to MicroBlaze/PPC unification. However, there is
one critical difference that I have not seen adequately addressed yet.
MicroBlaze is a highly configurable CPU in terms of its instruction set,
features and so on. To make use of this, it is essential that each
kernel image be compiled to match the CPU configuration. While a
generic kernel that uses none of the optional features (MUL, DIV, Shift,
MSR ops etc.) would run on any CPU, performance would be abysmal.
In my view it's not acceptable to present these as options for the user
to select at kernel config time. With N yes/no parameters, there is 1
correct configuration, and 2^N-1 incorrect ones. The odds of the user
stumbling upon a configuration that at worst fails to boot, or at best
is not optimally matched to the hardware, are high.
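
To make the 2^N-1 figure concrete, here is a minimal sketch that enumerates
every yes/no combination and counts those matching a given CPU. The option
names and the example hardware configuration are illustrative assumptions,
not a list taken from any real Kconfig.

```python
from itertools import product

def count_configs(options, hardware):
    """Enumerate all yes/no settings; count the ones that match the hardware."""
    total = 0
    matching = 0
    for values in product([False, True], repeat=len(options)):
        total += 1
        if dict(zip(options, values)) == hardware:
            matching += 1
    return total, matching

# Hypothetical option names and hardware build, for illustration only.
OPTIONS = ["mul", "div", "barrel_shift", "msr_ops"]
HARDWARE = {"mul": True, "div": False, "barrel_shift": True, "msr_ops": True}

total, matching = count_configs(OPTIONS, HARDWARE)
print(f"{total} possible, {matching} correct, {total - matching} incorrect")
# prints "16 possible, 1 correct, 15 incorrect"
```

With only four options a hand-picked configuration already has 15 ways to be
wrong, and each extra option doubles the total.
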
This same issue also applies to C libraries and apps - they must be
compiled with prior knowledge of the CPU. This is why our
microblaze-uclinux-gcc toolchain, with multilib'd uClibc, is almost 400 MB!
Wrapping every mul, div, shift etc in a function call is clearly not
feasible. Things like the msrset/msrclr ops have a modest but
measurable impact on kernel code size and performance - it's just not
reasonable to add any level of indirection in there.
I have thought about dynamic (boot-time) code re-writing as one
possibility here, but it very quickly gets nasty. All of the
"optimised" opcodes (MUL/DIV/Shift etc) are smaller than their emulated
counterparts, so in principle we could re-write the text segment with
the optimised opcode, then NOP pad, but that's still inefficient. As
soon as we start talking about dynamic code relocation, re-writing
relative offsets in loops, ... yuck. We'd be putting half of mb-ld
into the arch early boot code (or bootloader...).
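
The "re-write, then NOP pad" idea can be sketched on a toy model. This is
only an illustration under loud assumptions: instructions are symbolic
strings rather than real MicroBlaze encodings, and the three-slot "emulated"
sequences in PATTERNS are invented for the example. It shows why padding
keeps every later instruction at the same address, so relative branches need
no fixing, and also why the result still wastes space and cycles.

```python
NOP = "nop"

# Hypothetical emulated sequences mapped to the single native opcode
# that could replace them. The sequences are invented for this sketch.
PATTERNS = {
    ("save_args", "call __mulsi3", "load_result"): "mul",
    ("save_args", "call __divsi3", "load_result"): "idiv",
}

def patch_text(text, native_ops):
    """Replace emulated sequences with a native opcode plus NOP padding.

    text       -- list of symbolic instructions (the toy text segment)
    native_ops -- set of opcodes the running CPU actually implements
    The returned list has the same length as the input.
    """
    out = list(text)
    i = 0
    while i < len(out):
        for seq, native in PATTERNS.items():
            n = len(seq)
            if native in native_ops and tuple(out[i:i + n]) == seq:
                out[i:i + n] = [native] + [NOP] * (n - 1)
                i += n - 1  # skip over the padding just written
                break
        i += 1
    return out

text = ["add", "save_args", "call __mulsi3", "load_result", "bne -4",
        "save_args", "call __divsi3", "load_result"]
patched = patch_text(text, {"mul"})  # CPU with MUL but no DIV
print(patched)
```

In this run the multiply sequence becomes `mul, nop, nop` while the divide
sequence is left alone, and the `bne -4` stays at index 4. Dropping the NOPs
instead would shift everything after them, which is where the relocation
work described above begins.
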
The opposite approach, to build with all instructions enabled and
install exception handlers to deal with the fixups, is also pretty awful.
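
A toy model of the exception-handler approach, again purely as a sketch: the
ToyCpu class, its trap counter and the shift-and-add routine are hypothetical
stand-ins, not MicroBlaze or kernel code. The point is that every use of a
missing instruction costs a trap plus a software routine.

```python
class IllegalOpcode(Exception):
    """Raised by the toy CPU when an instruction is not implemented."""
    def __init__(self, op, a, b):
        super().__init__(op)
        self.op, self.a, self.b = op, a, b

def soft_mul(a, b):
    """Shift-and-add multiply for non-negative ints (emulation stand-in)."""
    result = 0
    while b:
        if b & 1:
            result += a
        a <<= 1
        b >>= 1
    return result

class ToyCpu:
    def __init__(self, has_mul):
        self.has_mul = has_mul
        self.traps = 0  # number of times the fixup handler ran

    def execute(self, op, a, b):
        if op == "add":
            return a + b
        if op == "mul":
            if not self.has_mul:
                raise IllegalOpcode(op, a, b)
            return a * b
        raise ValueError(f"unknown op {op!r}")

    def run(self, op, a, b):
        """Execute one instruction, emulating it in the handler on a trap."""
        try:
            return self.execute(op, a, b)
        except IllegalOpcode as exc:
            self.traps += 1
            return soft_mul(exc.a, exc.b)

cpu = ToyCpu(has_mul=False)
results = [cpu.run("mul", i, 3) for i in range(100)]
print(results[:4], "traps:", cpu.traps)
# prints "[0, 3, 6, 9] traps: 100"
```
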
I find myself asking the question - for what use cases does the current
static approach used in MicroBlaze (with the PetaLinux BSP /
Kconfig.auto) *not* work?
One compromise approach might be to have a script in
arch/microblaze/scripts, called by the arch Makefile, that cracks open
the DT at build time and extracts appropriate cpu flags.
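
A rough sketch of what such a script might look like, assuming the DT is
available as .dts source text. The property names in FLAG_MAP are
assumptions about how the cpu node might describe its features, and the
mapping is illustrative; the compiler options themselves are the MicroBlaze
GCC options of those names. The regex only handles simple single-cell
`name = <value>;` properties and does not restrict itself to the cpu node.

```python
import re

# Assumed property name -> (flag if enabled, flag if disabled or absent).
# None means "emit nothing". Property names are illustrative assumptions.
FLAG_MAP = {
    "xlnx,use-barrel": ("-mxl-barrel-shift", None),
    "xlnx,use-hw-mul": ("-mno-xl-soft-mul", "-mxl-soft-mul"),
    "xlnx,use-div": ("-mno-xl-soft-div", "-mxl-soft-div"),
    "xlnx,use-pcmp-instr": ("-mxl-pattern-compare", None),
}

PROP_RE = re.compile(
    r"^\s*([\w,\-]+)\s*=\s*<\s*(0x[0-9a-fA-F]+|\d+)\s*>\s*;", re.MULTILINE
)

def parse_cell(value):
    """Convert a decimal or 0x-prefixed cell value to an int."""
    return int(value, 16) if value.lower().startswith("0x") else int(value)

def cpu_flags(dts_text):
    """Return compiler flags derived from single-cell properties in dts_text.

    A missing property is treated as "feature absent", which is the safe
    choice because code built without a feature runs on any CPU.
    """
    props = {name: parse_cell(val) for name, val in PROP_RE.findall(dts_text)}
    flags = []
    for prop, (on_flag, off_flag) in FLAG_MAP.items():
        flag = on_flag if props.get(prop, 0) else off_flag
        if flag is not None:
            flags.append(flag)
    return flags

# Hypothetical cpu node used only to exercise the parser.
SAMPLE_DTS = """
cpus {
    cpu@0 {
        device_type = "cpu";
        xlnx,use-barrel = <1>;
        xlnx,use-hw-mul = <0x1>;
        xlnx,use-div = <0>;
    };
};
"""

print(" ".join(cpu_flags(SAMPLE_DTS)))
# prints "-mxl-barrel-shift -mno-xl-soft-mul -mxl-soft-div"
```

The arch Makefile could then capture the printed string into its CPU flags
variable, so the user never selects these options by hand.
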
Finally, what is the LKML position on DT files going into the kernel
source tree?
Regards,
John
More information about the Linuxppc-embedded
mailing list