[ccan] help with gracefully dealing with alloc failure in a recursive function

Fri Oct 7 11:40:24 EST 2011

[resent to list]

I have a crazy-talk idea inspired by your question.

If your parser can take ownership of the XML data in memory for its whole
lifetime, then it might be possible to hack up the XML in place (by replacing
the quotes at the end of strings with \0's, as an example) and then use it as
a "pre-allocated" hunk of memory that already contains all your data. You
just need to find the locations of this data (the attribs and tag data) and
store pointers to them in the data structures that you're using to represent
the XML tree.  It doesn't solve unwinding a failed alloc through a recursive
call, but it reduces the likelihood of an alloc failing, as you'll only be
alloc'ing the space to hold your XML node data structures, which will only
contain pointers and other scalars to describe the pointed-to data.
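
To make the in-place idea concrete, here is a minimal sketch in C. It handles
only a run of name="value" attributes, and the names struct attr and
split_attrs are made up for illustration; they are not from ccan or from any
existing parser. It assumes double-quoted values, no entity decoding, and a
writable buffer owned by the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical attribute record: nothing but pointers into the buffer. */
struct attr {
    char *name;
    char *value;
};

/*
 * Split a writable buffer such as: id="42" lang="en"
 * into at most `max` attributes, in place. The '=' after each name and the
 * closing quote of each value are overwritten with '\0', so name and value
 * become ordinary C strings living inside the original buffer.
 * Returns the number of attributes found; stops at the first malformed one.
 * No memory is allocated.
 */
static size_t split_attrs(char *buf, struct attr *out, size_t max)
{
    size_t n = 0;
    char *p = buf;

    assert(buf != NULL && out != NULL);

    while (n < max) {
        char *eq, *val, *endq;

        p += strspn(p, " \t\r\n");      /* skip leading whitespace */
        if (*p == '\0')
            break;                      /* end of input */

        eq = strchr(p, '=');
        if (eq == NULL || eq[1] != '"')
            break;                      /* malformed: give up */

        val = eq + 2;                   /* first byte after the opening quote */
        endq = strchr(val, '"');
        if (endq == NULL)
            break;                      /* unterminated value */

        *eq = '\0';                     /* terminate the name */
        *endq = '\0';                   /* terminate the value */
        out[n].name = p;
        out[n].value = val;
        n++;

        p = endq + 1;                   /* continue after the closing quote */
    }
    return n;
}
```

Called on a writable copy of id="42" lang="en" with room for four entries,
this fills in two attributes whose name and value pointers all point into
the caller's buffer. The only storage needed beyond the document itself is
the array of struct attr.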

Does anyone want to contemplate the relationship between the size of an XML
document and the equivalent representation in data structures + data that
they point to?  I suspect that with 4/8 byte pointers pointing to everything
vs. the verbosity of XML syntax it probably comes out close to even.  I ask
because I wonder if one could reason about the maximum amount of memory
required to store an arbitrary XML document of a given size.
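
One rough way to reason about that bound, sketched in C under stated
assumptions: the shortest possible element is <a/> at 4 bytes, so a document
of N bytes holds at most N/4 elements, and a cheap pre-scan can give a
tighter upper bound for a particular document. The struct xml_node layout and
the function names below are hypothetical, chosen only to illustrate the
arithmetic.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node layout: pointers into the document plus tree links. */
struct xml_node {
    char *name;
    char *text;
    struct xml_node *first_child;
    struct xml_node *next_sibling;
};

/*
 * Upper bound on the number of elements in `len` bytes of XML.
 * Every element starts with '<' followed by something other than
 * '/' (end tag), '?' (processing instruction) or '!' (comment, CDATA,
 * DOCTYPE). A '<' inside a comment or CDATA section is counted too, so the
 * result can over-estimate, but for well-formed input it does not
 * under-estimate.
 */
static size_t max_elements(const char *doc, size_t len)
{
    size_t n = 0;
    size_t i;

    assert(doc != NULL);

    for (i = 0; i + 1 < len; i++) {
        if (doc[i] == '<' &&
            doc[i + 1] != '/' && doc[i + 1] != '?' && doc[i + 1] != '!')
            n++;
    }
    return n;
}

/* Bytes needed to hold one node per element, by the bound above. */
static size_t max_node_bytes(const char *doc, size_t len)
{
    return max_elements(doc, len) * sizeof(struct xml_node);
}
```

With a bound like this, all the nodes could be requested in a single alloc
before the recursion starts, which leaves one failure point to check instead
of one per node. Note the worst case is not "close to even": a document made
of nothing but 4-byte elements needs one node per 4 bytes, and a node made
of four pointers is 16 or 32 bytes.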

BG

On Thu, Oct 6, 2011 at 2:54 PM, Daniel Burke <dan.p.burke at gmail.com> wrote:

> I'm wondering what a commonly acceptable method of handling this failure
> would be, my Google-Fu's not giving me answers I like, so I'm turning to
> the collective wisdom of this list. I suspect my knowledge of other
> languages is poisoning my thought process.
>
> I'm parsing XML in a recursive function, with a structure that contains the
> relevant state of the task. My initial plan is to add a variable to the
> structure named "failed", and if an alloc fails I set it, and then test this
> after every function call that can fail, trying to bail out to the head
> function ASAP, where I call the free function on the partial tree I've
> created so far.
>
> This puts a lot of ugly checking code in what is presently on the clean
> side of what I typically write. Most other languages I'd raise an exception
> and deal with the failure once.
>
> I've a few existing Linux Kernel style Goto-Exceptions to keep all the
> error code together, and not spread throughout the meat of the functions,
> however my understanding is that it's a Bad Thing (tm) to goto across
> functions, as depending on compiler/flags there's going to have to be some
> stack twiddling, and while my inner assembly programmer says just store SI
> in the data structure, every other bone in my body is telling me this is a
> capital offense.
>
> Should I bite the bullet and turn my pretty 1 page function into a 3 page
> function with lots of checking, or is there a clever/easy way to quickly
> bail?
>
>
> regards,
>
> dan
> --
> "Within C++, there is a much smaller and cleaner language struggling to get
> out"
> --Bjarne Stroustrup