On 19/12/2007, <b class="gmail_sendername">Michael Cohen</b> <<a href="mailto:scudette@gmail.com">scudette@gmail.com</a>> wrote:<div><span class="gmail_quote"></span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
Hi Rusty,<br> Thanks for taking the initiative...<br><br>> 1) Domain name. <a href="http://ccan.org">ccan.org</a> would be ideal: seems to point to a generic ISP<br>> page, whois says Community Care Ambulance Network in Ohio, which seems
<br>> legit at first glance. <a href="http://ccan.com">ccan.com</a> seems to be for sale by a squatter,<br>> perhaps we could do some deal where we buy <a href="http://ccan.com">ccan.com</a> for the ambos and get
<br>> <a href="http://ccan.org">ccan.org</a>?<br><br>What about <a href="http://ccan.com.au">ccan.com.au</a>? or maybe a hyphenated version (<a href="http://ccan-repo.org">ccan-repo.org</a><br>for example?)<br><br>> 2) Namespacization. If someone can't use (say) container_of because their
<br>> existing project uses that name, it'd be nice to rename it to<br>> ccan_container_of. Of course, this means any other ccan modules which use<br>> this need changing too.<br>I wonder if that is an integration issue? Clearly linking in C will
<br>become an issue due to name clashes. We should aim to have all ccan<br>modules linked internally with strict symbol exportation controls<br>(most things static etc). Things like macros and exported functions<br>will need to be prefixed by something. (maybe ccan_ ).
</blockquote><div><br>I suspect this will be one of the trickiest elements of the design.<br><br>Keep in mind that one of the nicest features of this sort of repository is that once it becomes useful, network effects kick in and you often suck in other existing code.
<br><br>So what if libpng needs to be a CCAN package?<br><br>As package numbers increase, so does the length of the package names.<br><br><a href="http://search.cpan.org/recent">http://search.cpan.org/recent</a><br><br>That is the upload stream for CPAN modules; note that some of them have names that are very, very long. Will it be feasible to have names like
<br><br>foo__bar__baz__openoffice__plugins__etc<br><br></div>But my C knowledge is lacking here, so keeping the namespaces and the amount of typing under control is an area where C experts will need to come up with ideas.<br><br>My naive solution would be something like an implicit macro that maps "
namespace.variable" to ccan__foo__bar__baz_.....__variable or something.<br><br>Code outside of the "current namespace" would use the explicit version...<br><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
> I thought about making the ccan_ versions canonical and providing macros<br>> to map to the normal names, but it breaks the golden rule:<br>><br>> This code must not be ugly.<br>><br>> So I chose the harder road of supplying a C program to actually rewrite the
<br>> module and anything which depends on it. Not quite finished yet, but it<br>> has the merit of placing the burden of ugliness where it belongs.<br>This might produce un-ugly code but would make it difficult to
<br>integrate future versions of ccan into your project. This is because<br>the version in your project would be different than the version in<br>ccan and its hard to track these differences. I think it would be less<br>ugly if the user ultimately drops in newer versions of ccan files
<br>instead of having to re-generate them.</blockquote><div><br>CXAN projects tend to lead to fairly rich installation clients, due to issues like mirror selection, dependency resolution and automated testing.<br><br>So having some amount of code auto-munging is not too big a deal.
<br><br>For example, JSAN strips all the documentation out at install time and compiles it into separate files, because the code has to go over the internet and the docs add a lot of bloat.<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
> 3) Minimum compiler requirements. Seems like Microsoft's C compiler doesn't<br>> do vararg macros, let alone any other C99 stuff. This makes it impossible<br>> to have nice code for some things, so for the moment I'm still sticking
<br>> with my intention of requiring (most of) C99, and Windows coders will have<br>> to hope they catch up, or we do some horrible mangling at the user end<br>> (like namespacization).<br>I dont think thats a great issue since gcc is also available for
<br>windows - I somehow dont expect hard core windows coders to be using<br>ansi c (if they use MSVC they are more likely to use c++).</blockquote><div><br>I concur. MinGW support should be all any Windows people will need.
<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">> 4) Licensing: so far I'm restricting it to with BSD no advertising, or GPLv2+.
<br>> More than that makes life too tricky for people wanting to use the code.<br>We need a consistent license for everything - unfortunately for<br>testing etc we ourselves will be linking to our own code so if we
<br>have a mix of licenses we might be breaking them ourselves.</blockquote><div><br>This is going to be very tricky... it does depend to some degree on what a CCAN package provides. If you are building libraries from CCAN packages instead of inlined stuff, different issues apply.
<br><br>My gut instinct is to make sure that each uploaded package HAS a license defined in metadata, and that the license falls within a set we accept, and from there implement programmatic license tracking into the installer client.
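<br><br>A minimal sketch of the first step (checking that a package's declared license is in the accepted set), written in C. The accepted_licenses table, the license_is_accepted name and the identifier strings are all assumptions for illustration, not an existing CCAN interface, and the sketch does not attempt to model compatibility between licenses.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical set of accepted license identifiers. The strings are
 * illustrative only, loosely named after the licenses mentioned in
 * this thread. */
static const char *const accepted_licenses[] = { "BSD", "GPLv2+" };

/* Return true if the license string read from package metadata is one
 * of the accepted identifiers. A missing (NULL) or empty license is
 * rejected, since every uploaded package must declare one. */
static bool license_is_accepted(const char *license)
{
    size_t n = sizeof(accepted_licenses) / sizeof(accepted_licenses[0]);
    size_t i;

    if (license == NULL || license[0] == '\0')
        return false;

    for (i = 0; i < n; i++) {
        if (strcmp(license, accepted_licenses[i]) == 0)
            return true;
    }
    return false;
}
```

An installer client could call license_is_accepted() on the metadata field before unpacking anything, and refuse any package for which it returns false.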
<br><br>I'm unsure how difficult it would be to model license compatibility programmatically, but it would seem to be within the realm of the possible.<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
> 5) Uploading, automated testing, etc. No idea here, plan to steal as much of<br>> CPAN as possible for this.<br>I think we need to use something bzr or darcs where people can send<br>patches to the maintainer. These need to be tested before added for
<br>inclusion into the repo. There need to be some requirements before a<br>module goes in (like tests, license etc).</blockquote><div><br>This is one area that I would really like to be quite insistent on.<br><br>The official, canonical and authoritative packages should be tarballs and only tarballs.
<br><br>This achieves any number of benefits. For example:<br><br>1. It makes the mirroring and distribution of code easy.<br><br>Both CPAN and JSAN at their core can be expressed as a single FTP server with files in it.<br>
<br>CPAN has even had to largely abandon mirroring via rsync, as the load is too intensive on the master mirror servers.<br><br>It also makes the packages stable over time.<br><br>2. It allows choice.<br><br>Developers care a lot about their development tools. It is a REALLY bad idea to try to force the user to use someone else's tools.
<br><br>You may want to know ABOUT a repo, and that's totally fine.<br><br>But you describe the repository type and location in package metadata; you don't actually release anything this way.<br><br>3. It simplifies the server indexing and management enormously.
<br><br>By just opening up the tarball and looking inside it, you keep the indexer much more sane.<br><br>It also makes it much easier for third-party systems to do analysis and support tasks on the code.<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
> 6) I have documentation-extraction code, but it doesn't do any xml-style guff<br>> (the Linux kernel's system on which this is loosely based is perl based, so<br>> maybe we should just steal that for the documentation, as it does various
<br>> different output forms).<br>I personally dont like automated documentation generation - most of<br>the time it tells you exactly what you would be able to read by just<br>looking at the code and it allows people to be sloppy about writing
<br>documentation (hmm i can just run doxygen on this later - no need to<br>document). I think reams of doxygen generated code from a thinly<br>documented source tree is worse than no documentation at all.</blockquote><div>
<br>If you don't like automated documentation generation then you may just not have seen it done well.<br><br>For example, here's the web version of the generated documentation for one of my CPAN modules.<br><br><a href="http://search.cpan.org/perldoc?PPI">
http://search.cpan.org/perldoc?PPI</a><br><br>That is generated from the inline docs of the top namespace... here is the full list of namespaces for the package.<br><br><a href="http://search.cpan.org/~adamk/PPI-1.201/">
http://search.cpan.org/~adamk/PPI-1.201/</a><br><br>Here is the typical documentation for a few of the more thinly documented namespaces.<br><br><a href="http://search.cpan.org/~adamk/PPI-1.201/lib/PPI/Statement/Include.pm">
http://search.cpan.org/~adamk/PPI-1.201/lib/PPI/Statement/Include.pm</a><br><br><a href="http://search.cpan.org/~adamk/PPI-1.201/lib/PPI/Element.pm">http://search.cpan.org/~adamk/PPI-1.201/lib/PPI/Element.pm</a><br><br>The same source documentation is also used to compile man pages for each CPAN namespace.
<br><br>The problem of bad standards of documentation is one that I think needs to be solved culturally and by convention.<br><br>If the kernel standards provide a reasonably well-known standard for per-function documentation, then perhaps we need to look at extending it to allow all the additional man-like data to be added in there as well.
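<br><br>As a rough illustration of what that might look like, here is a C sketch of a per-function comment in the general shape of the kernel's kernel-doc convention (name line, @parameter lines, description, Return section). The function sum_array and the "Example:" and "See also:" sections are hypothetical, added only to show where man-like data could live; whether an extraction tool would accept them is an assumption.

```c
#include <stddef.h>

/**
 * sum_array() - Add up the elements of an integer array.
 * @values: Pointer to the first element; may be NULL only if @count is 0.
 * @count:  Number of elements in @values.
 *
 * Walks the array once and accumulates the elements into a long.
 *
 * Example: (hypothetical section, man-page style usage)
 *	int v[] = { 1, 2, 3 };
 *	long total = sum_array(v, 3);
 *
 * See also: (hypothetical section; related functions would be listed here)
 *
 * Return: The sum of the elements, or 0 when @count is 0.
 */
long sum_array(const int *values, size_t count)
{
    long total = 0;
    size_t i;

    /* With count == 0 the loop body never runs, so values is not read. */
    for (i = 0; i < count; i++)
        total += values[i];
    return total;
}
```

The point of the sketch is only that usage examples and cross-references can sit next to the per-function documentation, so one source comment could feed both the web pages and the man pages.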
</div></div><br>Adam K<br>