[ccan] tok_itr pull request
Rusty Russell
rusty at rustcorp.com.au
Wed Apr 3 11:03:25 EST 2013
anto canto <ar.cantor at gmail.com> writes:
> Hello Rusty,
Hi Anto!
> I created a pull request on github for a small module which iterates over
> token strings. If you don't want to pull it that's fine, but since I
> haven't gotten a response from you on github I wanted to point it out here
> on the mailing list.
Hmm, I missed another one. Ah, github doesn't put my email address in
the to: header, which was defeating my filters. Fixed, thanks.
Now, as to your module. Since it's your first contribution, I'll give
as much advice as I can. You don't need to follow *any* of it if you
don't want to, though.
> +++ b/ccan/tok_itr/_info
> @@ -0,0 +1,41 @@
> +#include <string.h>
> +#include "config.h"
> +
> +/**
> + * tok_itr - iterate over tokens in a string
> + *
> + * The list header contains routines for manipulating double linked lists.
> + * It defines two types: struct list_head used for anchoring lists, and
> + * struct list_node which is usually embedded in the structure which
> is place> d
> + * in the list.
This is the wrong description :)
> diff --git a/ccan/tok_itr/tok_itr.c b/ccan/tok_itr/tok_itr.c
> new file mode 100644
> index 0000000..63db8f6
> --- /dev/null
> +++ b/ccan/tok_itr/tok_itr.c
> @@ -0,0 +1,49 @@
> +/* Licensed under LGPL - see LICENSE file for details */
> +#include "tok_itr.h"
> +
> +void tok_itr_init(struct tok_itr *itr, const char *list, char delim) {
> +
> + itr->delim = delim;
> + itr->curr = list;
> +}
> +
> +int tok_itr_end(const struct tok_itr *itr) {
> + return (itr->curr == NULL);
> +}
Shouldn't this return bool?
> +size_t tok_itr_val(const struct tok_itr *itr, char *val, size_t size) {
> + char *next;
> + size_t amt, len;
> +
> + if(size < 1)
> + return 0;
I thought this function was supposed to return the length of the token,
but here it's returning 0.
> + next = strchr(itr->curr, itr->delim);
> + if(next == NULL)
> + len = strlen(itr->curr);
> + else if(next == itr->curr)
> + len = 0;
> + else /* next > itr->curr */
> + len = next - itr->curr;
Note that the "if(next == itr->curr)" is redundant, as the else
statement does the right thing.
> + /* only want to copy at most size - 1 bytes because
> + * we need to save the last spot for '\0' */
> + size--;
> + amt = (len > size)? size : len;
> +
> + if(amt > 0)
> + strncpy(val, itr->curr, amt);
This can be a memcpy, since you know amt is < strlen(itr->curr).
> +
> + /* just copied bytes 0->(amt-1) for total of *amt* bytes.
> + * now add the null byte to pos at *amt* */
> + val[amt] = '\0';
> +
> + return len;
I'm uncomfortable with this, because (as your examples show) it usually
means people will truncate tokens. I think we can do better.
How about we allocate if the token doesn't fit in the buffer, and cache
the pointer in struct tok_itr?
struct tok_itr {
char delim;
const char *curr;
size_t alloc_len;
char *alloc_buf;
};
static inline void tok_itr_init(struct tok_itr *itr,
const char *list, char delim)
{
itr->delim = delim;
itr->curr = list;
itr->alloc_len = 0;
itr->alloc_buf = NULL;
}
Now we can make tok_itr the fundamental function:
/* Returns buf, or an allocated buffer if it doesn't fit. Returns NULL
* on no-more-tokens or malloc failure. */
char *tok_itr(struct tok_itr *itr, char *buf, size_t buflen)
{
char *next;
size_t toklen;
if (!itr->curr)
goto cleanup;
next = strchr(itr->curr, itr->delim);
if (next == NULL)
toklen = strlen(itr->curr);
else
toklen = next - itr->curr;
/* Can't fit in their buffer (incl nul terminator)? */
if (buflen < toklen + 1) {
/* Resize/allocate our internal buffer */
if (itr->alloc_len < toklen + 1) {
char *a = realloc(itr->alloc, toklen + 1);
if (!a)
goto cleanup;
itr->alloc = a;
itr->alloc_len = toklen + 1;
}
buf = itr->alloc;
}
memcpy(buf, itr->curr, toklen);
buf[toklen] = '\0';
itr->curr = next;
return buf;
cleanup:
free(itr->alloc);
return NULL;
}
If you really want a foreach macro, it becomes:
#define TOK_ITR_FOREACH(buf, buf_size, ptr, list, delim, tok_itr_ptr) \
for (tok_itr_init(tok_itr_ptr, list, delim), \
ptr = tok_itr(tok_itr_ptr, buf, buf_size); \
ptr; \
ptr = tok_itr(tok_itr_ptr, buf, buf_size))
There need to be more brackets in here to protect complex expressions,
though, at minimum around (ptr) (which, admittedly, is unlikely to be a
complex expression):
for (tok_itr_init(tok_itr_ptr, list, delim), \
(ptr) = tok_itr(tok_itr_ptr, buf, buf_size); \
(ptr); \
(ptr) = tok_itr(tok_itr_ptr, buf, buf_size))
Though I generally prefer my arguments in "context first" order, with
trivia at the end, which means tok_itr_ptr and list are probably needed
at the front:
TOK_ITR_FOREACH(tok_itr_ptr, list, delim, ptr, buf, buf_size)
Example usage:
TOK_ITR_FOREACH(&itr, "this is a test", ' ', p, NULL, 0) {
This is still a bit messy, so maybe the buffer should be in token setup,
not in the iteration. We have space in the structure, so this is easy:
struct tok_itr {
char delim;
bool allocated;
const char *curr;
size_t buf_len;
char *buf;
};
static inline void tok_itr_init(struct tok_itr *itr, char *buf, size_t buflen,
const char *list, char delim)
{
itr->delim = delim;
itr->allocated = false;
itr->curr = list;
itr->buf_len = buflen;
itr->buf = buf;
}
This makes usage pretty simple, with no looping macro required.
static void print_split_by_colon(const char *string)
{
struct tok_itr itr;
char buf[32], *p;
tok_itr_init(&itr, buf, sizeof(buf), string, ';');
while (p = tok_itr(&itr))
printf("%s\n", p);
}
Cheers,
Rusty.
PS. Completely untested, of course!
More information about the ccan
mailing list