[RFC PATCH 2/3] discover: Add support for device-mapper snapshots

Cyril Bur cyril.bur at au1.ibm.com
Thu Apr 30 04:46:26 AEST 2015


On Wed, 2015-04-29 at 10:16 +1000, Samuel Mendoza-Jonas wrote:
> On 29/04/15 02:25, Cyril Bur wrote:
> > On Tue, 2015-04-28 at 10:51 +1000, Samuel Mendoza-Jonas wrote:
> >> On 28/04/15 02:09, Cyril Bur wrote:
> >>> On Mon, 2015-04-27 at 15:52 +1000, Samuel Mendoza-Jonas wrote:
> >>>> Device Mapper allows the creation of CoW snapshots backed by ramdisks.
> >>>> This will allow Petitboot to perform potentially destructive actions
> >>>> such as filesystem recovery without affecting the backing disk.
> >>>> This commit adds several functions to create and destroy these snapshots
> >>>> and to keep track of available ramdisks.
> >>>>
> >>> Hey Sam,
> >>>
> >>> Been waiting to see how you did this, looks nice!
> >>>
> >>> Got a few comments.
> >>>
> >>> Firstly, is there a real advantage to the in_use flag? It looks like you
> >>> create and destroy the DM device regardless, all you're saving is a few
> >>> allocations? Could you do it in such a way that you could save the
> >>> DM_DEVICE_CREATE/DM_DEVICE_REMOVE actions, I feel like they're the most
> >>> costly.
> >>
> >> Not strictly required, more an artifact of some older checking. I'm not sure I
> >> follow you on avoiding the DM_DEVICE_* actions - there are two things we
> >> essentially keep track of:
> > 
> > Going to admit right off the bat that I don't really know how any of the
> > DM really works.
> > 
> >> - Available /dev/ramX ramdisks - these are not the snapshots themselves just
> >> the backing memory device. Petitboot boots up with 16 of these (IIRC) so we
> >> use these first before making new nodes.
> > 
> > If you've got 16 existing /dev/ramX that you use as backing devices,
> > shouldn't there be code somewhere which inits handler->ramdisks with 16
> > entries?
> 
> More accurately, the kernel that Petitboot ships with in the current FSP firmware(s)
> happens to initialise 16 ramdisks at boot - Petitboot itself doesn't actually
> know this. What happens instead is that Petitboot receives a udev event for each
> one (which leads to device_handler_add_ramdisk()).
> 
Ah yes over in 3/3. I'm with you now.

> > 
> >> - A snapshot for every eligible disk: We DM_DEVICE_CREATE a snapshot when a
> >> new valid disk is recognised, and DM_DEVICE_REMOVE it when the disk is unmounted.
> >> There's not a way to 'avoid' the actions since the snapshot exists for the life
> >> of the disk as far as Petitboot is concerned.
> >>
> > Right, what I was getting at is that in the event of someone repeatedly
> > mounting and unmounting you're going to create and remove
> > the /dev/mapper/%s a bunch of times, can you optimise this to keep
> > the /dev/mapper/%s around?
> > 
> > If not (which may be entirely possible... I'll take your word for it),
> > it looks like ramdisk.in_use corresponds to ramdisk.snapshot being NULL
> > or not... (provided you do ramdisk->snapshot = NULL before the out: in
> > device_handler_create_snapshot() ).
> 
> Strictly speaking, mounting/unmounting the snapshot doesn't require us to destroy
> the snapshot (indeed we keep them around when remounting for read-write).
> 
> However if we assume Petitboot is the only actor unmounting things the only other
> two cases are:
> 	- When the device/snapshot is unmountable (no fs) and we throw it away.
> 	- When we drop all discover_devices and reinit.
> 
> I'm open to suggestion on the second idea, but the problem is that when we reinit
> we free all our discover_devices, and any knowledge we had of the snapshots goes
> out the window. We *could* check that a snapshot already exists when trying to
> make one, but this gets a little messy, and arguably reinits are rare enough
> that this won't be a performance hit.
> 
Yeah sounds like we're heading down the path of work for no real gain...

> > 
> >> However I don't think I've read your question right so let me know :) 
> >>
> > I believe I have a reputation for bad expression, probably my bad.
> >>>
> >>>> No functionality change in this patch.
> >>>>
> >>>> Signed-off-by: Samuel Mendoza-Jonas <sam.mj at au1.ibm.com>
> >>>> ---
> >>>>  discover/Makefile.am      |   4 ++
> >>>>  discover/device-handler.c | 176 ++++++++++++++++++++++++++++++++++++++++++++++
> >>>>  discover/device-handler.h |   6 ++
> >>>>  3 files changed, 186 insertions(+)
> >>>>
> >>>> diff --git a/discover/Makefile.am b/discover/Makefile.am
> >>>> index 7808110..c6bc8af 100644
> >>>> --- a/discover/Makefile.am
> >>>> +++ b/discover/Makefile.am
> >>>> @@ -58,6 +58,10 @@ discover_pb_discover_LDADD = \
> >>>>  	$(core_lib) \
> >>>>  	$(UDEV_LIBS)
> >>>>  
> >>>> +discover_pb_discover_LDFLAGS = \
> >>>> +	$(AM_LDFLAGS) \
> >>>> +	-ldevmapper
> >>>> +
> >>>>  discover_pb_discover_CPPFLAGS = \
> >>>>  	$(AM_CPPFLAGS) \
> >>>>  	-DLOCAL_STATE_DIR='"$(localstatedir)"' \
> >>>> diff --git a/discover/device-handler.c b/discover/device-handler.c
> >>>> index f411d9f..69cd699 100644
> >>>> --- a/discover/device-handler.c
> >>>> +++ b/discover/device-handler.c
> >>>> @@ -10,6 +10,8 @@
> >>>>  #include <sys/wait.h>
> >>>>  #include <sys/mount.h>
> >>>>  
> >>>> +#include <libdevmapper.h>
> >>>> +
> >>>>  #include <talloc/talloc.h>
> >>>>  #include <list/list.h>
> >>>>  #include <log/log.h>
> >>>> @@ -46,6 +48,12 @@ enum default_priority {
> >>>>  	DEFAULT_PRIORITY_DISABLED	= 0xff,
> >>>>  };
> >>>>  
> >>>> +struct ramdisk_device {
> >>>> +	const char	*path;
> >>>> +	bool		in_use;
> >>>> +	char		*snapshot;
> >>>> +};
> >>>> +
> >>>>  struct device_handler {
> >>>>  	struct discover_server	*server;
> >>>>  	int			dry_run;
> >>>> @@ -57,6 +65,9 @@ struct device_handler {
> >>>>  	struct discover_device	**devices;
> >>>>  	unsigned int		n_devices;
> >>>>  
> >>>> +	struct ramdisk_device	**ramdisks;
> >>>> +	unsigned int		n_ramdisks;
> >>>> +
> >>>>  	struct waitset		*waitset;
> >>>>  	struct waiter		*timeout_waiter;
> >>>>  	bool			autoboot_enabled;
> >>>> @@ -747,6 +758,171 @@ void device_handler_add_device(struct device_handler *handler,
> >>>>  		network_register_device(handler->network, device);
> >>>>  }
> >>>>  
> >>>> +void device_handler_add_ramdisk(struct device_handler *handler,
> >>>> +		const char *path)
> >>>> +{
> >>>> +	struct ramdisk_device *dev;
> >>>> +	unsigned int i;
> >>>> +
> >>>> +	if (!path)
> >>>> +		return;
> >>>> +
> >>>> +	for (i = 0; i < handler->n_ramdisks; i++)
> >>>> +		if (!strcmp(handler->ramdisks[i]->path, path))
> >>>> +			return;
> >>>> +
> >>>> +	dev = talloc_zero(handler, struct ramdisk_device);
> >>>> +	dev->path = talloc_strdup(handler, path);
> >>>
> >>> You've not checked for NULL when using many of the talloc_* functions?
> >>> Is there a good reason for this? I feel like you might be opening
> >>> yourself up to mystery segfault.
> >>> Of course on the other hand, I'll probably end up agreeing that there
> >>> really isn't much that can be done if they ever fail.
> >>
> >> I choose to plead RFC :P But as you say this one in particular is tricky - 
> >> if we fail we probably can't make a snapshot, so we *could* mount the
> >> disk itself - but then we throw away any read-only guarantee.
> > 
> > Is throwing away the read-only guarantee (without explicit user consent)
> > ever a good idea? Could petitboot prompt? Or have a "I really know what
> > I'm doing here" mode.
> 
> No, it's a terrible idea :)
> Something like a pre-boot popup that says "Want to save changes to disk?" for
> GRUB updates etc, could work, but
> a) Autoboot is a problem
> b) Someone will undoubtedly have an aneurysm when they first see that pop-up
> and I'll get a bug report about bootloaders never touching disks
> 
> > 
> >> Maybe something like this (later on in the mounting code)
> >> 	if (snapshot available)
> >> 		mount(snapshot)
> >> 	else
> >> 		mount(disk) with -norecovery
> >>>
> >>>> +
> >>>> +	i = handler->n_ramdisks++;
> >>>> +
> >>>> +	handler->ramdisks = talloc_realloc(handler, handler->ramdisks,
> >>>> +				struct ramdisk_device *, handler->n_ramdisks);
> >>>> +	handler->ramdisks[i] = dev;
> >>>
> >>> So could be just me, these three lines caused me a minifreakout, it took
> >>> me a second to see that it's correct, any reason you're incrementing
> >>> n_ramdisks before you resize the allocation? What if the allocation
> >>
> >> Faith in my memory allocator? :)
> >>
> > I'm going to go and take (almost) all the ram out of a Tuleta, then
> > we'll see :p.
> > 
> > 
> >>> fails? Why even use i? Might I suggest:
> >>>
> >>> handler->ramdisks = talloc_realloc(handler, handler->ramdisks,
> >>> 		struct ramdisk_device *, handler->n_ramdisks + 1);
> >>> if (!hander->ramdisks) {
> >>> 	/* I still think you should freak out */
> >>> }
> >>>
> >>> handler->ramdisks[handler->n_ramdisks++] = dev;
> >>
> >> Yep, recovers better.
> >>
> >>>
> >>>> +}
> >>>> +
> >>>> +void device_handler_create_snapshot(struct device_handler *handler,
> >>>> +		struct discover_device *device)
> >>>> +{
> >>>> +	unsigned int sectors, i;
> >>>> +	struct dm_task *task;
> >>>> +	char *ttype, *params;
> >>>> +	uint32_t cookie = 0;
> >>>> +	uint16_t flags = 0;
> >>>> +	const char *tmp;
> >>>> +	
> >>>> +	tmp = discover_device_get_param(device, "ID_PART_ENTRY_SIZE");
> >>>> +	if (!tmp) {
> >>>> +		pb_log("Could not retrieve sector size for %s\n",
> >>>> +		       device->device_path);
> >>>> +		return;
> >>>> +	}
> >>>> +
> >>>> +	sectors = strtoul(tmp, NULL, 0);
> >>>> +	if (!sectors) {
> >>>> +		pb_log("Error reading sector count for %s\n",
> >>>> +		       device->device_path);
> >>>> +		return;
> >>>> +	}
> >>>> +
> >>>> +	pb_log("Creating a snapshot for %s, %u sectors, named %s\n",
> >>>> +	       device->device_path, sectors, device->device->id);
> >>>> +
> >>>> +	/* Check if free ramdisk exists */
> >>>> +	for (i = 0; i < handler->n_ramdisks; i++)
> >>>> +		if (!handler->ramdisks[i]->in_use)
> >>>> +			break;
> >>>> +
> >>>> +	/* Otherwise create a new one */
> >>>> +	if (i == handler->n_ramdisks) {
> >>>> +		char *name = talloc_asprintf(handler, "/dev/ram%d",
> >>>> +				handler->n_ramdisks);
> >>>> +		dev_t id = makedev(1, handler->n_ramdisks);
> >>>> +		if (mknod(name, S_IFBLK, id)) {
> >>>> +			if (errno == EEXIST) {
> >>>> +				/* We haven't yet received updates for existing
> >>>> +				 * ramdisks - add and use this one */
> >>>> +				pb_debug("Using untracked ramdisk %s\n", name);
> >>>> +			} else {
> >>>> +				pb_log("Failed to create new ramdisk %s: %s\n",
> >>>> +				       name, strerror(errno));
> >>>> +				return;
> >>>> +			}
> >>>> +		}
> >>>> +		device_handler_add_ramdisk(handler, name);
> >>>> +		talloc_free(name);
> >>>> +	}
> >>>> +
> >>> Perhaps have the above code be something along the lines of
> >>> get_backing_ramdisk(...)
> >>
> >> Could do although this doesn't get called anywhere else - snapshot-merges
> >> may change that however.
> >>
> > I just think device_handler_create_snapshot() would flow better, having
> > said that, its not bad as is... but if you do envisage a scenario where
> > such a new function would get called from two places...
> > 
> 
> And I suspect it probably will, I'll try to minimise any code duplication when
> bringing in snapshot-merge - most likely I'll split the devmapper stuff out
> into a separate file.
> 
> > 
> > Cyril.
> > 
> >>>
> >>>
> >>>> +	ttype = talloc_asprintf(handler,  "snapshot");
> >>>> +	params = talloc_asprintf(handler, "%s %s N 1",
> >>>> +		 device->device_path, handler->ramdisks[i]->path);
> >>>> +
> >>>> +	task = dm_task_create(DM_DEVICE_CREATE);
> >>>> +	if (!task) {
> >>>> +		pb_log("Error creating new dm-task\n");
> >>>> +		goto err1;
> >>>> +	}
> >>>> +
> >>>> +	if (!dm_task_set_name(task, device->device->id))
> >>>> +		goto err2;
> >>>> +
> >>>> +	/* Set the table for this dm-device */
> >>>> +	if (!dm_task_add_target(task, 0, sectors, ttype, params))
> >>>> +		goto err2;
> >>>> +
> >>>> +	if (!dm_task_set_add_node(task, DM_ADD_NODE_ON_CREATE))
> >>>> +		goto err2;
> >>>> +
> >>>> +	/* Petitboot's libdm isn't compiled with --enable-udev_sync, so we set
> >>>> +	 * empty cookie and flags */
> >>>> +	if (!dm_task_set_cookie(task, &cookie, flags))
> >>>> +		goto err2;
> >>>> +
> >>>> +	if (!dm_task_run(task)) {
> >>>> +		pb_log("Error executing dm-task\n");
> >>>> +		goto err2;
> >>>> +	}
> >>>> +
> >>>> +	/* This is the magic incantation to get /dev/mapper/sdaX appearing */
> >>>> +	dm_udev_wait(cookie);
> >>>> +
> >>>> +	device->ramdisk = handler->ramdisks[i];
> >>>> +	handler->ramdisks[i]->in_use = true;
> >>>> +	handler->ramdisks[i]->snapshot = talloc_asprintf(handler, "/dev/mapper/%s",
> >>>> +						device->device->id);
> >>>> +
> >>>> +	pb_log("Snapshot created for device %s\n", device->device->id);
> >>>> +
> >>>> +err2:
> >>>> +	dm_task_destroy(task);
> >>>> +err1:
> >>>> +	talloc_free(params);
> >>>> +	talloc_free(ttype);
> >>>> +}
> >>>> +
> >>>> +int discover_device_destroy_snapshot(struct discover_device *device)
> >>>> +{
> >>>> +	/* Assume we've already unmounted the snapshot */
> >>>> +	struct dm_task *task;
> >>>> +	uint32_t cookie = 0;
> >>>> +	uint16_t flags = 0;
> >>>> +	int rc = -1;
> >>>> +
> >>>> +	if (!device->ramdisk)
> >>>> +		return 0;
> >>>> +
> >>>> +	task = dm_task_create(DM_DEVICE_REMOVE);
> >>>> +	if (!task) {
> >>>> +		pb_log("Could not create dm_task DM_DEVICE_REMOVE\n");
> >>>> +		return -1;
> >>>> +	}
> >>>> +
> >>>> +	if (!dm_task_set_name(task, device->device->id)) {
> >>>> +		pb_log("No dm-snapshot named '%s'\n", device->device->id);
> >>>> +		goto out;
> >>>> +	}
> >>>> +
> >>>> +	/* Petitboot's libdm isn't compiled with --enable-udev_sync, so we set
> >>>> +	 * empty cookie and flags */
> >>>> +	if (!dm_task_set_cookie(task, &cookie, flags))
> >>>> +		goto out;
> >>>> +
> >>>> +	if (!dm_task_run(task)) {
> >>>> +		pb_log("Unable to remove snapshot '%s'\n", device->device->id);
> >>>> +		goto out;
> >>>> +	}
> >>>> +
> >>>> +	/* Wait for /dev/mapper/ entries to be removed */
> >>>> +	dm_udev_wait(cookie);
> >>>> +
> >>>> +	device->ramdisk->in_use = false;
> >>>> +	device->ramdisk = NULL;
> >>>> +	rc = 0;
> >>>> +out:
> >>>> +	dm_task_destroy(task);
> >>>> +	return rc;
> >>>> +}
> >>>> +
> >>>>  /* Start discovery on a hotplugged device. The device will be in our devices
> >>>>   * array, but has only just been initialised by the hotplug source.
> >>>>   */
> >>>> diff --git a/discover/device-handler.h b/discover/device-handler.h
> >>>> index b592c46..2425b7b 100644
> >>>> --- a/discover/device-handler.h
> >>>> +++ b/discover/device-handler.h
> >>>> @@ -26,6 +26,7 @@ struct discover_device {
> >>>>  
> >>>>  	char			*mount_path;
> >>>>  	const char		*device_path;
> >>>> +	struct ramdisk_device	*ramdisk;
> >>>>  	bool			mounted;
> >>>>  	bool			mounted_rw;
> >>>>  	bool			unmount;
> >>>> @@ -72,6 +73,11 @@ struct discover_device *discover_device_create(struct device_handler *handler,
> >>>>  		const char *id);
> >>>>  void device_handler_add_device(struct device_handler *handler,
> >>>>  		struct discover_device *device);
> >>>> +void device_handler_add_ramdisk(struct device_handler *handler,
> >>>> +		const char *path);
> >>>> +void device_handler_create_snapshot(struct device_handler *handler,
> >>>> +		struct discover_device *device);
> >>>> +int discover_device_destroy_snapshot(struct discover_device *device);
> >>>>  int device_handler_discover(struct device_handler *handler,
> >>>>  		struct discover_device *dev);
> >>>>  int device_handler_dhcp(struct device_handler *handler,
> >>>
> >>
> >>
> > 
> 
> 




More information about the Petitboot mailing list