[PATCH] powerpc/pseries: Add papr-platform-dump character driver for dump retrieval

Haren Myneni haren at linux.ibm.com
Thu Dec 5 13:14:06 AEDT 2024


On Wed, 2024-12-04 at 17:57 +0100, Michal Suchánek wrote:
> On Mon, Dec 02, 2024 at 08:40:05PM -0800, Haren Myneni wrote:
> > On Wed, 2024-11-27 at 10:11 +0100, Michal Suchánek wrote:
> > > On Tue, Nov 26, 2024 at 12:40:20PM -0800, Haren Myneni wrote:
> > > > On Wed, 2024-11-27 at 00:42 +0530, Mahesh J Salgaonkar wrote:
> > > > > On 2024-11-23 21:20:39 Sat, Haren Myneni wrote:
> > > > > [...]
> > > > > > +static ssize_t papr_platform_dump_handle_read(struct file
> > > > > > *file,
> > > > > > +		char __user *buf, size_t size, loff_t *off)
> > > > > > +{
> > > > > > +	struct ibm_platform_dump_params *params = file-
> > > > > > > private_data;
> > > > > > +	u64 total_bytes;
> > > > > > +	s32 fwrc;
> > > > > > +
> > > > > > +	/*
> > > > > > +	 * Dump already completed with the previous read calls.
> > > > > > +	 * In case if the user space issues further reads,
> > > > > > returns
> > > > > > +	 * -EINVAL.
> > > > > > +	 */
> > > > > > +	if (!params->buf_length) {
> > > > > > +		pr_warn_once("Platform dump completed for dump
> > > > > > ID
> > > > > > %llu\n",
> > > > > > +			(u64) (((u64)params->dump_tag_hi << 32)
> > > > > > +				params->dump_tag_lo));
> > > > > > +		return -EINVAL;
> > > > > > +	}
> > > > > > +
> > > > > > +	if (size < SZ_1K) {
> > > > > > +		pr_err_once("Buffer length should be minimum
> > > > > > 1024
> > > > > > bytes\n");
> > > > > > +		return -EINVAL;
> > > > > > +	}
> > > > > > +
> > > > > > +	/*
> > > > > > +	 * The hypervisor returns status 0 if no more data
> > > > > > available to
> > > > > > +	 * download. Then expects the last RTAS call with NULL
> > > > > > buffer
> > > > > > +	 * to invalidate the dump which means dump will be
> > > > > > freed in the
> > > > > > +	 * hypervisor.
> > > > > > +	 */
> > > > > > +	if (params->status == RTAS_IBM_PLATFORM_DUMP_COMPLETE)
> > > > > > {
> > > > > > +		params->buf_length = 0;
> > > > > > +		fwrc = rtas_ibm_platform_dump(params, 0, 0);
> > > > > > +		/*
> > > > > > +		 * Returns 0 (success) to the user space so
> > > > > > that user
> > > > > > +		 * space read stops.
> > > > > 
> > > > > Does this implicitly invalidates the dump irrespective of
> > > > > whether
> > > > > userspace has requested or not ?
> > > > 
> > > > No, the RTAS call from the last read() will invalidate the
> > > > dump.
> > > > Expect
> > > > the user space make the read() until returns 0 which means the
> > > > last
> > > > read() will return 0 after invalidate the dump. 
> > > > 
> > > > size_t read() 
> > > > {
> > > >    if (params->status == RTAS_IBM_PLATFORM_DUMP_COMPLETE) {
> > > >            RTAS call to invalidate dump
> > > >            return 0;
> > > >    }
> > > >    get the data from RTAS call
> > > >    If the RTAS call return status is DUMP_COMPLETE
> > > >              params->status = RTAS_IBM_PLATFORM_DUMP_COMPLETE
> > > >    return bytes_written
> > > > }
> > > > 
> > > > If the RTAS call returns DUMP_COMPLETE, the hypervisor expects
> > > > one
> > > > more
> > > > RTAS call to invalidate the dump which is done as part of the
> > > > last
> > > > read()
> > > > 
> > > > > Copy-pasting bellow code snippet from librtas side patch
> > > > > posted
> > > > > by
> > > > > you to
> > > > > librtas-devel mailing list:
> > > > > + /*
> > > > > + * rtas_platform_dump() is called with buf = NULL and length
> > > > > = 0
> > > > > + * for "dump complete" RTAS call to invalidate dump.
> > > > > + * For kernel interface, read() will be continued until the
> > > > > + * return value = 0. Means kernel API will return this value
> > > > > only
> > > > > + * after issued "dump complete" call. So nothing to do
> > > > > further
> > > > > + * for the last RTAS call.
> > > > > + */
> > > > > + if (buffer == NULL)
> > > > > + return 0;
> > > > > 
> > > > > If I understand this correctly, it looks like calling
> > > > > rtas_platform_dump() with buf = NULL and length = 0, now does
> > > > > nothing.
> > > > Following the same read() ABI - expects to make calls until
> > > > returns
> > > > size 0.
> > > > 
> > > > The current usage of rtas_platform_dump() in ppc64-
> > > > diag/rtas_errd/extract_platdump.c
> > > > 
> > > > dump_complete = rtas_platform_dump(dump_tag, 0, dump_buf,
> > > > DUMP_BUF_SZ,
> > > >                                         &seq_next, &bytes);
> > > > 
> > > > while (dump_complete) {
> > > > 
> > > >    dump_complete = rtas_platform_dump(dump_tag, seq, dump_buf,
> > > >  				DUMP_BUF_SZ, &seq_next,
> > > > &bytes);
> > > >                                 
> > > > }
> > > > 
> > > > ret = rtas_platform_dump(dump_tag, seq, NULL, 0, 
> > > >                                 &seq_next, &bytes);
> > > > 
> > > > we need to support both new and old interfaces and not changing
> > > > the
> > > > above code which uses librtas API.
> > > > 
> > > > So the new rtas_platform_dump() interface
> > > > {
> > > >  if the buffer == NULL 
> > > >      return - nothing to do here. Dump is invalidated with the
> > > > previous
> > > > rtas_platform_dump()   
> > > >  
> > > >  size = read()
> > > >  if size == 0 
> > > >       dump complete and invalidate the dump
> > > >       return 0
> > > > 
> > > >   return 1;
> > > > }
> > > 
> > > No EOF?
> > 
> > read() returns size, 0 or < 0. Returns 0 is like EOF. 
> 
> Indeed, looks like the special EOF value is only added by libc, sorry
> about the noise.
> 
> > > So no standard file handling code can use this FD?
> > 
> > Yes, providing support for FD from ioctl for the following reasons:
> > 
> > - get-vpd char driver is providing only for FD from ioctl. So
> > thought
> > of using same interface for platform-dump so that having consitent
> > interface for all RTAS function char drivers. 
> > 
> > - file->private_data is assigned to miscdevice in misc_register and
> > also assigned to some other miscdevice struct in driver specific
> > code.
> > So was thinking of not following semantics in the existing code if
> > I
> > private_data to save internal param struct.
> > 
> > Please let me know if you prefer to use FD from open() for
> > platform-
> > dump read().
> 
> There is no way to specify the dump id with open() of the character
> device. So for open() other mechanism would have to be used, such as
> sysfs/debugfs which can list the IDs as filenames, or something
> similar.

you mentioned about standard file handling code using FD from open. So
I was referring to passing dump ID with ioctl as in the current patch,
but using FD returned from open() for read() calls like:
fd = open()
ret =  ioctl(fd,.. dumpid)
read(fd, buf, size)

As in get-vpd, we will have one other RTAS stream function get-indices. 
I will add this support later. So I am proposing same interfaces for
RTAS functions as provided by papr-vpd so that same type of interfaces
will be used across all RTAS functions

fd = open()
devfd = ioctl(fd,.. dumpid)
read(devfd, buf, size)    --> will just returns 0 if the previous RTAS
call returns dump complete status. But will not invalidate the dump.
ret = ioctl(devfd, .. dumpid) - can be used to invalidate dump as you
suggested.  

> 
> > > But also the size 0 read both indicates the EOF and invalidates
> > > the
> > > dump, these should be separate.
> > > 
> > > Which differs from the hypervisor API that makes it impossible to
> > > save
> > > the dump without deleting it, and introduces a regression.
> > 
> > The hypervisor API says to invalidate the dump after saving it. In
> > the
> > current interface it does - The user space makes the last read()
> > (for
> > return 0) after saving the complete dump. 
> > 
> > All read() calls return size (> 0) == RTAS calls for dump
> > read() expects return 0 == same RTAS invalidate dump
> > 
> > So the only difference is if the user does not call to invalidate
> > dump
> > explicitly even though saved the dump, but we do not have the
> > current
> > usage, Only the extract-dump command is the only usage now and
> > please
> > note that this command is used for non-HMC manages system. It is
> > not
> > used on HMC managed system which has its own command gets the dump
> > directly from the hypervisor.    
> 
> However, the hypervisor does provide the option to read the dump
> without
> invalidating it.
> 
> At the very least this option is useful for testing. Not making it a
> separate call makes it needlessly difficult for people testing the
> feature.
> 
> If as you say this is used exclusively by extract-dump, and it
> already
> calls an IOCTL to obtain a FD it can call another IOCTL on it to do
> the
> invalidation. There is hte corner case that the kernel might need to
> finish the reading of the dump on its own if the userspace did not to
> perform the IOCTL, or return an error if the dump is not fully read
> yet.

kernel should not read the remaining dump if the user space did not
issue or returns an error if the complete dump is not read. should
leave it to the user space as it is doing now. As Mahesh pointed
before, what if the user space gets exited for some reason in the
middle of dump collection. 

should be same before but one more ioctl to invaldate the dump

read()
{
   if the status is 'dump complete' 
         return 0;

   if the status is 'dump invalidated'
         return -EINVAL  -- means user space issued ioctl to invalidate
before. should not expect this but in case

   RTAS call to get the dump
   return size;
}

ioctl(devfd, ...) 
{
  if the status is 'dump complete'
      RTAS call to invalidate dump
      return 0;

  return -EINVAL --- expects the user space to issue this ioctl after
read() returns 0
}
      
> 
> > > If you are doing IOCTLs anyway the invalidation could be an
> > > IOCTL. Or
> > > you could really follow the RTAS ABI and only incalidate if the
> > > user
> > > passes NULL and 0.
> > 
> > I could use this ioctl interface to invalidate the dump.
> > 
> > devfd = ioctl(fd ..) for dump ID
> > read(devfd)
> > ret = ioctl(devfd ...) to invalidate dump 
> > 
> > I will make changes if you are OK with this interface
> 
> Yes, for the ioctl interface this looks like a better option.

Thanks for your suggestion. As I mentioned above, will add this change.
> 
> > > But more generally the previously added RTAS wrappers do not
> > > closely
> > > follow the RTAS ABI, and do accumulation of read data in the
> > > kernel,
> > > passing the whole buffer to userspace afterwards.
> > > 
> > > Why cannot it be done in this case?
> > 
> > platform-dump RTAS returns more data unlike in get-vpd. In one case
> > noticed 3G data which is not the ideal case to save in the kernel
> > buffer within ioctl. 
> > 
> > Also platform-dump RTAS function supports interleave calls for
> > multiple
> > dump IDs at the same time unlike in get-vpd case.
> 
> It would be nice to document why this call is incompatible with the
> existing implementations somewhere in the commit message or a
> comment.

I think mentioned about interleave calls. Sure will update in comments.

> 
> > > Even more generally if the dump IDs are stable these could be
> > > listed
> > > in
> > > some sysfs or debugfs directory, and provide standard file
> > > operations,
> > > including unlink() to remove the dump.
> > 
> > dump IDs are not stable. The hypervisor can have several dumps with
> > different IDs at the same time. so unlink() can not be used.
> 
> I mean stable in the sense that same dump corresponds always to the
> same
> ID. Not in the sense that new dumps cannot be addded, and old
> removed.
> 
> It is indeed unclear what is the lifecycle of these dump IDs, and how
> they are created, and how the userspace is obtaining them.

Dump IDs can be 64bit value and so I think they may not be reused.
Whenever the hypervisor has the dump ot if the user initiates the dump
with BMC interface, notifies to the partition with the dump ID and
automatically executes extract_platdump command.  

We should be OK Since using ioctl to invalidate

fd = open()
devfd = ioctl(fd,.. dumpid)
read(devfd, buf, size)
ret = ioctl(devfd, .. dumpid)

Will implement the above change. Please let me know if you have any
concerns. 

Thanks
Haren

> 
> Thanks
> 
> Michal
> 
> > > With the bonus that at least one of these filesystems has some
> > > provisions for confidentiality lockdown. The implementation could
> > > use
> > > that to make the dumps unavailable in confidentiality lockdown
> > > level
> > > if
> > > the dumps are considered confidential without reimplementing the
> > > check.
> > 
> > We are adding new driver (interfaces) to support lockdown.
> > Otherwise
> > the current usage is sufficient. But we could modify to restrict
> > the
> > interface for confidentiality lockdown in future if we have that
> > restriction.
> > 
> > Thanks
> > Haren
> > 
> > > Thanks
> > > 
> > > Michal



More information about the Linuxppc-dev mailing list