[Simplicity] Translating a Simplicity Dialect to MAST-enabled Bitcoin SCRIPT

Thu Dec 6 14:44:10 AEDT 2018

I've taken the liberty of replying on the mailing list to your question (we
went off list only because of my failure to reply-all before).

The way I imagine working with Simplicity is as follows.

Someone develops a Simplicity program that they want to commit funds to
(let's use Bitcoin as our example Blockchain here, but it could be
Elements, or Liquid, etc.).
This Simplicity program can be represented compactly as a DAG, where
identical subexpressions are shared.  This sharing is unrelated to purity
of Simplicity (in fact, outside of core Simplicity there are many impure
Simplicity combinators), rather sharing is possible simply by virtue of the
fact that identical subexpressions are identical.  All this development
work happens off-chain.

Once the agent is satisfied with their Simplicity program, they fill the
witness values with dummy data and compute the program's "Commitment Merkle
Root".   (The Commitment Merkle Root excludes witness values from its
computation.) This is a recursive computation over the abstract syntax
tree. However because identical subexpressions have identical commitment
merkle roots (because the merkle root of a subexpression is only a function
of the subexpression, and not of its overall position in the syntax tree),
the results of these intermediate merkle roots can be shared during
comptuation.  This means that the time it takes to compute the program's
Commitment Merkle Root is proportional to the size of the program as a DAG
(as opposed to the size of the program as a tree).

The agent constructs a Bitcoin transaction that sends funds to the
Commitment Merkle Root, which is embedded in the output's scriptPubKey via
a new Segwit version, or indirectly via a Taproot version.

Later, when the agent wishes to redeem these funds, they must create a
transaction, and take their Simplicity program and

1) fill in suitable witness data in the program's witness nodes,
2) give the program a test run to make sure it passes successfully when
given their prospective transaction,
3) prune any unused branches from their program that was not used in the
above test execution (by converting case expressions into either assertl or
assertr expressions).  Note: due to sharing it is possible that both
branches of a case expression are executed, each one during different
passes.  Such case expressions are not pruned.

Once this is done they take that pruned "witness program" and compute its
Witness Merkle Root and run the static analysis to determine the programs
cost (in terms of program size, and bounds on the time and memory use).
This data is added as the witness data to the transaction.  Unlike the
Commitment Merkle Root, the Witness Merkle root includes the witness values.

The agent broadcasts this transaction to their peers.  Before accepting
this data, their peers need the "witness program" to validate it.  The
agent serialize the witness program as a DAG and transmits it to their
peers, possibly abbreviating common sub-expressions that they have
negotiated with their peers as ones they are both familiar with.  The
agent's peers receive the witness program and computes its Commitment
Merkle Root, Witness Merkle Root, and the static analysis of the execution
costs and validates that the Witness Merkle Root and static analysis
results match the transaction's witness data, and the Commitment Merkle
root matches the value in the UTXO.  (The peers also make sure that the
static analysis results are within acceptable/consensus required bounds and
other anti-DoS checks).

Lastly the peers execute the witness program to make sure it completes with
success.  This execution can be done by traversing the DAG representing the
witness program, and the presence of shared sub-expressions doesn't pose a
problem, since this "program code" is read-only data.

There are many variations of the above protocol that we could do instead,
and the protocol I describe above is among the more aggressive variants I
am considering.

On Wed, Dec 5, 2018 at 6:43 AM ZmnSCPxj <ZmnSCPxj at protonmail.com> wrote:

> Good morning Russell,
>
> I suppose I should try to read the technical report in more detail....
>
> Let me try to see if my understanding below is correct.
>
> 1.  A Simplicity program is an abstract syntax tree.
> 2.  The abstract syntax tree can be Merklized (MAST).
> 3.  Branches can be shared, since Simplicity is a pure total functional
> language.
> 4.  When a Simplicity program is executed offchain, it generates as
> proof-of-correct-execution the Merkle paths through the MAST that were
> evaluated.
> 5.  When the blockchain verifies the above proof-of-correct-execution, it
> reexecutes the parts of the program that executed in the first place.
> 6.  Due to purity, if a sub-expression is shared, the blockchain needs to
> only verify it once; the same subexpression when evaluated in a different
> part of the program will yield the same result. <--- is this correct?  Or
> will verification re-execute it again for each time it is shared, and the
> "only" saving here is saving the space for the proof-of-correct-execution?
> 7.  `witness` primitives cannot be shared since their Merklization depends
> on the exact values that will be loaded into the program.
>
> Regards,
> ZmnSCPxj
>
>
> Sent with ProtonMail <https://protonmail.com> Secure Email.
>
> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> On Wednesday, December 5, 2018 5:50 AM, Russell O'Connor <
> roconnor at blockstream.io> wrote:
>
> In more complex Simplicity programs, even one specific code path used
> during redemption might traverse identical subexpressions multiple times.
> With Simplicity's sharing, the you pay the program size cost (which will
> likely be the dominant cost) for identical subexpressions only once (you
> will also pay the time cost for those multiple traversals, multiple times,
> but the time cost will probably be less steep than the program size costs,
> meaning you would need to traverse the same expensive expression many many
> times before it starts becoming a significant factor).  If this path were
> instead compiled to Bitcoin Script, it would mean duplicating the
> expression as many times as it is traversed unless you augment Bitcoin
> Script even further by adding some operations to allow for subroutines.
>
> I'm sorry that I haven't yet written up details about how I imagine the
> weight cost of Simplicity to be determined, but roughly speaking there are
> 3 types of costs:
>
> * Program size cost: the size of the Simplicity's expressions DAG (with
> sharing).
> * Program time cost: roughly the number of Simplicity combinators
> encountered during evaluation, counting the same node visited with
> multiplicity.
> * Program memory cost: the maximum resident memory required during
> evaluation.
>
> These factors would be combined to assigned an overall "weight" cost of a
> Simplicity expression, in some sort of arbitrary way.  The idea is that we
> want to exclude abusive programs, while permitting useful programs.
> Hopefully there is a wide enough gap between these two classes of programs
> that we can create an arbitrary division between the two, whose exact
> details, while being consensus critical, isn't all that important.  Also,
> rather than instrumenting the interpreter to count these costs, I'm
> inclined to just use the bounds determined by static analysis for the cost,
> but this is design choice is debatable.
>
> On Mon, Dec 3, 2018 at 11:01 PM ZmnSCPxj <ZmnSCPxj at protonmail.com> wrote:
>
>> Good morning Russell,
>>
>> > A big problem is that translating Simplicity to Bitcoin Script would
>> miss out on a whole host of Simplicity features.  In particular, we
>> wouldn't have sub-expression sharing, which is likely going to be quite
>> important for many sophisticated programs.  Sub-expression sharing lets you
>> "unroll" bounded loops without multiplying the size of your program.
>>
>> My understanding is the value of sub-expression sharing is visible in
>> storing the Simplicity contract offchain --- onchain, only the specific
>> path that the contract took is visible, and sub-expression sharing is not
>> in effect.
>> Is my understanding wrong?
>>
>> Regards,
>> ZmnSCPxj
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.ozlabs.org/pipermail/simplicity/attachments/20181205/7f7d212d/attachment-0001.html>