[RFC 0/3] extend kexec_file_load system call

Vivek Goyal vgoyal at redhat.com
Wed Jul 13 00:18:11 AEST 2016

On Tue, Jul 12, 2016 at 04:02:46PM +0200, Arnd Bergmann wrote:
> On Tuesday, July 12, 2016 8:25:48 AM CEST Eric W. Biederman wrote:
> > AKASHI Takahiro <takahiro.akashi at linaro.org> writes:
> > 
> > > Device tree blob must be passed to a second kernel on DTB-capable
> > > archs, like powerpc and arm64, but the current kernel interface
> > > lacks this support.
> > >   
> > > This patch extends kexec_file_load system call by adding an extra
> > > argument to this syscall so that an arbitrary number of file descriptors
> > > can be handed out from user space to the kernel.
> > >
> > > See the background [1].
> > >
> > > Please note that the new interface looks quite similar to the current
> > > system call, but that it won't always mean that it provides the "binary
> > > compatibility."
> > >
> > > [1] http://lists.infradead.org/pipermail/kexec/2016-June/016276.html
> > 
> > So this design is wrong.  The kernel already has the device tree blob;
> > you should not be extracting it from the kernel, munging it, and then
> > reinserting it in the kernel if you want signatures and everything to
> > pass.
> > 
> > What x86 does is pass its equivalent of the device tree blob from one
> > kernel to another directly and behind the scenes.  It does not go
> > through userspace for this.
> > 
> > Until a persuasive case can be made for going around the kernel and
> > probably adding a feature (like code execution) that can be used to
> > defeat the signature scheme I am going to nack this.
> > 
> > Nacked-by: "Eric W. Biederman" <ebiederm at xmission.com>
> > 
> > I am happy to see support for other architectures, but for the sake of
> > not moving some code in the kernel let's not build an attackable
> > infrastructure.
> > 
> For historic context, the flattened devicetree format that we now use
> to pass data about the system from boot loader to kernel was initially
> introduced specifically for the purpose of enabling kexec:
> On Open Firmware, the DT is extracted from running firmware and copied
> into dynamically allocated data structures. After a kexec, the runtime
> interface to the firmware is not available, so the flattened DT format
> was created as a way to pass the same data in a binary blob to the new
> kernel in a format that can be read from the kernel by walking the
> directories in /proc/device-tree/*.

So this DT is available inside the kernel, and the running kernel can
still retrieve it and pass it to the second kernel?
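
For reference, the flattened DT blob is self-describing: it begins with a
big-endian header whose first 32-bit word is the magic 0xd00dfeed and whose
second word is the total size of the blob. Below is a minimal sketch of
checking a buffer for that header. The helper names be32_at() and
fdt_blob_size() are made up for illustration and are not kernel or libfdt
API; the 40-byte minimum assumes the current (version 17) header layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FDT_MAGIC       0xd00dfeedu
#define FDT_HEADER_SIZE 40u   /* assumed: size of a version 17 header */

/* Read a big-endian 32-bit value from a byte buffer. */
static uint32_t be32_at(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/*
 * Hypothetical helper: return the total size recorded in the header,
 * or 0 if the buffer does not look like a flattened device tree.
 */
static uint32_t fdt_blob_size(const unsigned char *buf, size_t len)
{
    uint32_t total;

    assert(buf != NULL);
    if (len < FDT_HEADER_SIZE)
        return 0;
    if (be32_at(buf) != FDT_MAGIC)
        return 0;
    total = be32_at(buf + 4);          /* totalsize field */
    if (total < FDT_HEADER_SIZE || total > len)
        return 0;
    return total;
}
```

Real code would use libfdt for this rather than open-coding the header
parsing; the sketch only shows why the blob can be handed from one kernel
to the next as a single opaque buffer.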

> There are a couple of reasons for modifying the devicetree:
> - For kboot/petitboot, you can have a kernel that is not booted through
>   DT at all but hardwired to a particular machine, and that passes
>   a DT for the entire hardware to the kernel that you actually want to
>   run.
> - for kdump, you need to tell the new kernel about the modified location
>   of the memory, so the dump kernel doesn't overwrite the contents
>   it wants to dump

On x86 we do this with the help of kernel command line options.
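
As a rough illustration of the command line approach: to my understanding,
the kexec-tools kexec_load path on x86 has traditionally appended
memmap=exactmap, memmap=<size>K@<start>K and elfcorehdr=<addr>K to the dump
kernel's command line. The sketch below only formats such a string. The
function name kdump_cmdline() and its parameters are hypothetical, a single
usable region is assumed, and all values are in KiB.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: build a dump-kernel command line that restricts
 * the new kernel to one memory region and tells it where the ELF core
 * header of the crashed kernel lives. Returns the string length, or -1
 * if the result does not fit in out.
 */
static int kdump_cmdline(char *out, size_t out_len, const char *base,
                         unsigned long long start_kb,
                         unsigned long long size_kb,
                         unsigned long long elfcorehdr_kb)
{
    int n;

    assert(out != NULL && base != NULL);
    n = snprintf(out, out_len,
                 "%s memmap=exactmap memmap=%lluK@%lluK elfcorehdr=%lluK",
                 base, size_kb, start_kb, elfcorehdr_kb);
    if (n < 0 || (size_t)n >= out_len)
        return -1;                     /* error or truncated output */
    return n;
}
```

For example, kdump_cmdline(buf, sizeof buf, "root=/dev/sda1", 65536, 131072,
196000) would yield a line ending in "memmap=131072K@65536K
elfcorehdr=196000K".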

> - we typically ship devicetree sources for embedded machines with the
>   kernel sources. As more hardware of the system gets enabled, the
>   devicetree gains extra nodes and properties that describe the hardware
>   more completely, so we need to use the latest DT blob to use all
>   the drivers
> - in some cases, kernels will fail to boot at all with an older version
>   of the DT, or fail to use the devices that were working on the
>   earlier kernel. This is usually considered a bug, but it's not rare
> - In some cases, the kernel can update its DT at runtime, and the new
>   settings are expected to be available in the new kernel too, though
>   there are cases where you actually don't want the modified contents.

I am assuming that both the modified DT and the unmodified one are
accessible to the kernel. And if user space can decide which modified
fields to use for the new kernel and which ones not, then the same can be
done in the kernel too?

> 	Arnd
