Verification in gWASM
Our latest release, 0.20.2, introduces a verification-by-redundancy scheme for gWASM. In this blog post, we explain what this verification scheme is about and how it benefits the network's actors.
When your computations are performed by someone else on your behalf (as they are in Golem), one of the main challenges is verifying the results. Can you trust that the provider has really performed the computation? You don't know whether the provider simply faked the work and sent you a bogus result, or attempted to cheat in a more clever way. How can we make sure the requestor is secure in a trustless setup?
For the first Golem use case, rendering, we found some clever methods of performing such verification by analyzing the contents of the results (more details).
However, with the introduction of the gWASM meta-use case, the ability to perform (almost) arbitrary computations also makes it difficult to evaluate the work performed by providers. In one of our previous posts, we described our approach to billing, briefly mentioning that it works well with our approach to verification. The time has come to describe the latter in more detail.
Of course, this problem is neither new nor specific to Golem. All platforms outsourcing computations to untrusted providers have to tackle it somehow. Platforms such as BOINC, Truebit, and iExec, to name just a few, feature sophisticated verification algorithms. The most common approach to the problem is computational redundancy, in other words, cross-checking results delivered by multiple providers.
However, these algorithms have significant disadvantages that make them unsuitable for Golem. First, they require a high redundancy factor, which means high costs for requestors (some also require providers to pay a deposit before computing a task). Second, while the aforementioned approaches are ingenious, they are also very complex: some involve performing parts of the computation on the blockchain, elaborate multi-party games, or invalidating computation results long after they have been obtained. All of this translates into long delays before requestors get confirmed results, which in turn delays payments for honest providers.
Our goal is to design a simple yet effective verification scheme that does not incur excessive costs for requestors, does not introduce long delays, and does not require providers to pay for the 'privilege' of computing tasks.
The simplest verification scenario we found is to let two providers compute the same task and cross-check the results: if they match, we deem the result valid. To improve reliability (at the expense of higher costs), one can increase the redundancy factor and let three or more providers compute the same task. Note that what makes this verification method feasible is the fact that WASM computations are deterministic.
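To make this concrete, here is a minimal sketch (in Python, not Golem's actual code) of what cross-checking deterministic results can look like; the function names and the use of SHA-256 are our own illustrative choices:

```python
import hashlib

def digest(result: bytes) -> str:
    # WASM execution is deterministic, so honest providers return
    # byte-identical outputs; comparing cryptographic hashes suffices.
    return hashlib.sha256(result).hexdigest()

def results_match(results: list[bytes]) -> bool:
    # Accept the computation only if all redundant copies agree.
    # The redundancy factor is len(results): 2 in the basic scheme.
    return len({digest(r) for r in results}) == 1
```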
This happy path is simple enough, but, as usual, the interesting case is when something goes wrong. What should we do when the results differ? The simplest, though rather crude, approach is to declare both results invalid. A slightly more refined scheme is to ask a third provider to act as a tie-breaker by computing the same task. These approaches generalize to higher redundancy factors: look for a majority among the results and, if there is none, ask another provider to act as a tie-breaker.
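Generalizing to higher redundancy factors, a majority check might look like the following sketch (again illustrative, under the same assumptions as above):

```python
import hashlib
from collections import Counter
from typing import Optional

def majority_result(results: list[bytes]) -> Optional[bytes]:
    # Group identical outputs and look for a strict majority.
    counts = Counter(hashlib.sha256(r).hexdigest() for r in results)
    top, votes = counts.most_common(1)[0]
    if 2 * votes > len(results):
        return next(r for r in results
                    if hashlib.sha256(r).hexdigest() == top)
    # No majority: ask one more provider to compute the task
    # and act as a tie-breaker.
    return None
```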
A natural question to ask in this situation is: do we really have to compute every job twice? Or is there a way to lower the overhead? In fact, there is. Based on historical performance, we may verify jobs performed by more reliable providers less frequently. This seems intuitive enough, but the devil is in the details: how to relate verification frequency to reliability, and how to condense performance history into a reliability factor. One approach is to verify a task with a probability that depends on the provider's reputation: for a provider with reputation t, verification is performed with probability p = ν·(1−t). The factor ν in this formula can either be a constant with 0 < ν < 1, or depend on the overall verification failure ratio.
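In code, the sampling rule is a one-liner. The sketch below assumes a constant ν (called NU here); the name should_verify and the exact value of NU are our own illustrative choices:

```python
import random

NU = 0.5  # assumed constant with 0 < NU < 1; it could instead be
          # derived from the overall verification failure ratio.

def should_verify(reputation: float) -> bool:
    # p = NU * (1 - t): a provider with reputation t close to 1 is
    # cross-checked rarely; a new provider (t = 0) with probability NU.
    return random.random() < NU * (1.0 - reputation)
```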
In conclusion, we have designed and implemented a simple yet effective verification scheme. It has been deployed on our testnet along with the gWASM meta-use case and will soon be available on mainnet, together with the usage market. This, however, is not the end of the story, as there are still plenty of interesting research problems to explore.
Thanks for reading our blog. Feel free to ask us any questions, including about this verification scheme, or to send us feedback!