<div dir="ltr">Hi Gregory,<div><br></div><div>Quick question of clarification: Are the contents of these files the serialized witness program input to the evaluator?</div><div><br></div><div>I am currently (although infrequently, sadly) working on a test harness that compares the Jet implementations and the Haskell EDSL spec for the purposes of verification. It seems like I could also pull these in and test them against the Haskell tools and see if they succeed/fail in the same ways. That said, the segfault failures will not be replicable in the Haskell tools, but perhaps the segmentation faults will be eliminated in the C implementation in the future. My naive first take is that you are right about the "failing cleanly" assessment, but perhaps Russell has a reason for the current implementation being the way it is.</div><div><br></div><div>If we want to do more intelligent fuzzing (well-typed programs), it may be easier to set those kinds of tests up in Haskell's QuickCheck and send them over the FFI boundary to the C evaluator.</div><div><br></div><div>Cool to see more people interested in this project! Thanks for the data!</div><div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, Nov 11, 2019 at 8:30 PM Gregory Maxwell <<a href="mailto:gmaxwell@gmail.com">gmaxwell@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Greetings,<br>
<br>
Now that there is a fast C implementation I thought I would throw some<br>
coverage-guided fuzzing at it.<br>
<br>
I made a trivial test harness that uses<br>
elements_simplicity_execSimplicity and testTx1 from test.c as the<br>
context, just like the ordinary tests in test.c.<br>
<br>
I grinded on it with a 1 second execution upper time bound starting<br>
from a few hundred thousand randomly generated short inputs for about<br>
a cpu-core-year.<br>
<br>
The result is a collection of 7371 inputs, extracted from a few billion<br>
attempts, that collectively hit as broad a coverage footprint as I<br>
observed in this test run:<br>
<br>
<a href="https://people.xiph.org/~greg/simplicityC_tests.tar.xz" rel="noreferrer" target="_blank">https://people.xiph.org/~greg/simplicityC_tests.tar.xz</a> (78kb, expands to 7.3mb)<br>
<br>
The files are named based on the result of the execution, e.g.<br>
successes vs various failures and roughly where they failed.<br>
<br>
Collections like this are useful for comparing other implementations,<br>
quickly testing new changes, or as a starting point for new<br>
coverage-directed fuzzing runs.<br>
<br>
While setting up testing I found that the non-elements harness (e.g.<br>
test.c:test_program) segfaults when elements operations are used and<br>
no transaction context is provided, which is kind of surprising. I<br>
would have expected that either a null tx pointer should always fail<br>
(api misuse) or the elements operations should become cleanly<br>
unavailable. This might just be me misunderstanding the intended<br>
interface. But I would have guessed that there would be different<br>
entry points with different required arguments for different kinds of<br>
contexts, and use of the wrong opcodes for your context would just<br>
make execution fail like an executed assertLR node.<br>
<br>
I also found two distinct assertion failures which probably trace back<br>
to a single faulty elements operation. See issues #21 and #22.<br>
<br>
I was surprised by the rather large percentage of inputs that made it<br>
through type inference. I have no idea how efficient the serialization<br>
will be for actual programs people want to use, but it's at least<br>
efficient for something. :)<br>
<br>
All in all I was pretty impressed by how robust the implementation<br>
appears to be even at this early stage of development.<br>
<br>
I don't, however, think this fuzzing effort is particularly good at<br>
this point. I'm aware of the following shortcomings of my efforts:<br>
<br>
* The initial starting points would ideally include a large number of<br>
well formed programs. It would be good to start from a manually<br>
constructed set of tiny tests that check each opcode and combinator.<br>
Unfortunately the provided examples are currently mostly too slow to<br>
use as good fuzzing vectors, so I just used random ones.<br>
<br>
* The bulk of the fuzzing should take care to avoid making stupid<br>
serialization syntax errors that waste time without testing anything<br>
useful. This includes, for example, not emitting reserved or<br>
unimplemented opcodes and not running the reader past the end of the<br>
file.<br>
<br>
* The implementation really needs a static runtime bound for this<br>
testing. With that, I could set it reasonably low, testing would run<br>
much faster, and any test case which still took a long time would be<br>
worth investigating.<br>
<br>
* There are probably a number of trivial optimizations that might make<br>
sense for the sake of fuzzing performance that wouldn't otherwise be<br>
worthwhile for real code, e.g. observing during type inference that<br>
the program is trivial in various ways and skipping execution. Though<br>
it would be good to also fuzz with those optimizations off, right now<br>
the fuzzer seems to waste a lot of time on various trivial programs<br>
that aren't very interesting from a fuzzing perspective.<br>
<br>
* I didn't fuzz with additional tools like ubsan or by adding any<br>
additional instrumentation beyond the existing asserts.<br>
<br>
* My collection doesn't include any of the test cases that don't<br>
complete in a second. Manually inspecting some of them suggests that<br>
many would run fine but would just take an eternity to complete.<br>
<br>
* The success/failure state of each of the vectors in the collection<br>
should be at least checked against another implementation, which I<br>
didn't do (though someone could do it for me, e.g. with the Haskell<br>
tools?).<br>
<br>
I'll be happy to give it another go later (e.g. once static runtime<br>
analysis is implemented), but I thought even with the above<br>
limitations this collection might be useful to people.<br>
<br>
Cheers,<br>
-- <br>
Simplicity mailing list<br>
<a href="mailto:Simplicity@lists.ozlabs.org" target="_blank">Simplicity@lists.ozlabs.org</a><br>
<a href="https://lists.ozlabs.org/listinfo/simplicity" rel="noreferrer" target="_blank">https://lists.ozlabs.org/listinfo/simplicity</a><br>
</blockquote></div>