[ccan] htable license change request for Ruby

Wed May 21 18:32:06 EST 2014

Rusty Russell <rusty at rustcorp.com.au> wrote:
> Eric Wong <normalperson at yhbt.net> writes:
> > Hello Rusty!  I am interested in using htable with mainline Ruby.
> > Ruby is 2-clause BSD-licensed nowadays[1].
> >
> > My preliminary switch of one big st_table to htable beat out my proposed
> > container_of-based hash implementation (ihash) in memory usage:
> >
> >     ihash:  ~100K reduction
> >     htable: ~150K reduction
> >     https://bugs.ruby-lang.org/issues/9841#change-46823
> 
> Do you have performance measurements?  It's easy to make a smaller hash
> table, by filling it more :)

Thanks for the response!

I haven't looked too hard at performance, yet.  Most of the paths I'm
modifying are not extremely speed-critical and not easy to measure for
speed.  However, reducing overall allocations (and fragmentation) should
help speed in some cases.

The current hash table, st, is order-preserving(!) and requires a
48-byte allocation for every entry in "unpacked" tables with more
than 5 entries.

My proposed ihash implementation is chained using a singly linked-list
via container_of, so this adds 8-byte overhead to some elements and
requires no extra malloc in the Ruby method/constant tables.

st and ihash only double in size when load factor is 5(!) or more.  I'm
new to htable, but it looks like htable cannot have a load factor beyond
one, so htable even uses more space for the table itself, but introduces
no chaining pointer overhead for each element.

> The other considerations with htable:
> - It grows atomically, which can suck if you're after RT response.
>   This one is possible to fix, but tradeoffs abound.  You may also have
>   simply hit a sweet spot for htable: do you have reliable memory
>   reports for a range of uses?

st and ihash have the same growth behavior.  I just got a rough idea of
the savings based on using time(1).  However, I know the data structures
of Ruby and the numbers of what I'm changing (see below).

> - It doesn't shrink.

Also true of st and ihash.  Not an issue as I'm using it for tables that
are mostly mostly static and long-lived.

> - Assuming you use a nicely randomly seeded universal hash function :)

Yep, we're currently on SipHash-2-4 for strings.  This was designed to
be hash well even when given malicious inputs.

For ID/symbol lookups, I already needed to do some tuning of the hash
functions to get acceptable performance with st when we switched to
power-of-two sizing.

> > I noticed in commit d4941bf8047d16007f19a3b5b2211e1e7571f068
> > (htable: relicense under LGPL), there is a chance of relicensing
> > htable under public domain or BSD:
> >
> >     So this is the simplest fix.  I might relicense under PD or BSD later,
> >     since the likely module should probably have an even more liberal
> >     license.
> >
> > Therefore I kindly request the htable license be changed to a permissive
> > license so it may be considered for use with Ruby.
> 
> Sure, but let's make sure it's actually a good idea first :)
> 
> Thanks!
> Rusty.

I'll work more on getting parts converted to htable and performance
measurements soonish.  However, I know some numbers off the top of my
head.  (All my numbers here are based on 64-bit systems).

Here's a table of per-entry overheads in Ruby with the various
hash table implementations:

table \ impl.| st.unpacked  | st.packed  | ihash       | htable
-------------+--------------+------------+-------------+-----------
fstring      | malloc(48)   | never      | malloc(24)  | 0
casefold     | malloc(48)   | never      | 8           | 0
^=^=^=^=^=^=^=^=^= the above are order agnostic =^=^=^=^=^=^=^=^=^=
=v=v=v=v=v=v=v=v= the below may require ordering =v=v=v=v=v=v=v=v=v
symbols      | malloc(48)*2 | never      | malloc(40)  | malloc(16)
methods      | malloc(48)   | irrelevant | 8           | 0
constants    | malloc(48)   | irrelevant | 8           | 0

Notes:
* malloc(X) denotes additional malloc call + allocation size required,
  this does not account for any internal overheads of malloc.
* bare numbers mean the overhead is added to an existing allocated
  struct with no additional calls to malloc
* I've only implemented fstring for htable so far
* The fstring table I converted is around 1000 entries at minimum,
  and possibly >10K on some larger code bases.
* st.packed only applies to st tables with <=5 entries, where each entry
  is packed with 24-bytes/entry overhead into the core hash table
  itself.  This only affects method/constant tables, as the bigger
  tables do not get packed.
* symbols have bidirectional mapping, so there are two tables for
  symbols: id=>sym and sym=>id
  ihash and htable may share the same allocated space for both tables.
* fstring and symbols are typically >=3K entries
* casefold tables are static, the big ones add up to ~2.5K entries
* I am unsure of the order-preservation requirements for symbols,
  methods and constants.
  If order-preservation is required, ihash and htable require an
  additional 16 bytes per-entry for a struct list_node.
  (ref: http://bugs.ruby-lang.org/issues/9614)
* method and constant tables always has per-entry allocation overhead
  for each entry, thus ihash may embed its chaining node into that
  allocation without additional malloc calls.  Thus st.packed is
  not relevant to method or constant tables.
* There are several method and constant tables (typically tens of them,
  maybe hundreds).  Some of these tables may have zero entries, some may
  have <=5 (where they may be packed if st), but some may have tens of
  entries (but typically not hundreds).
* st.packed probably benefits instance/class variable tables right now,
  but I haven't checked those very closely.