Model 0: low-level defence measures for Sybil attacks in P2P networks

Authors: Marcin Benke, Jakub Konka, Łukasz Gleń

Thanks to María Paula Fernández

With Golem, you can exchange computational power as a commodity or as a service for GNT (Golem Network Tokens): the requestors rent the providers’ computation power in order to compute their tasks, and their interactions constitute the market transactions. These transactions and users hence form a small economy. Through this article, we will present how such an economy can be secured from malicious parties with a limited set of tools.

P2P network security

Picture a pure distributed p2p network. By definition, anyone can join a p2p network at any given time with no restrictions whatsoever. A participating node only reveals its network identity, while keeping its personal information secret. Since the network is decentralized — and by its distributed nature — not governed or regulated by any of its parts, the participating party cannot revoke transactions or call for arbitrage in case they believe they are being deceived by a malicious counterparty. In other words, nodes assume the full risk. Lastly, the network is untrusted — there is no ‘source of truth’, meaning that any interaction with another node is risky.

In general, security in p2p networks can be enhanced by using distributed reputation systems, other distributed algorithms, by blockchains, ‘oracles’, by providing Proof of Work, personalization, deposits or impeding the creation of new identities, etc. In this blog post, we are studying raw networks with no augmentations — a node has access only to the history of its own transactions. Based on this setup we aim to find a way to secure the transactions in the system if such a way exists. Through this article, we shall refer to this problem as Model 0.

Network security, at Golem, is a top priority we will always strive towards. While Model 0 is the first step towards it, it is by no means the end of our work on this front; in the final version, we will use potentially more advanced solutions. Model 0 is more of a subject of research than a fundamental solution.

Threats

Model 0 contemplates two main threats: 1) a malicious party creates lots of identities and attacks the network (commonly known as a Sybil attack), and 2) rational parties find it profitable to cheat. In the first case, the malicious party floods the network with fake transactions and aims to destabilize it. After one identity is used, it then creates a new one to replace it with, causing the attacker to remain undetected The second case, on the other hand, depends on the network rules, if the rational users are more likely to be honest.

One approach to these problems is to secure every transaction. Another one, the one we are adopting in this blog post, is to minimize losses in the long term. The loss rate needs to be small enough to make operating in the network profitable for honest nodes. It is considered acceptable that some transactions fail or some nodes continuously try to defraud the system. Compare this to a retail store — the business is doing okay as long as the losses attributed to theft do not exceed some preset threshold. Such an approach relies on statistics and repetitiveness which give it a sort of predictability. Note that isolated and extremely high transactions are killers for distributed economies because the risk cannot be balanced by other transactions.

Known and unknown

In Golem, the providers first send valuable data (computation results) and then the requestors make the payments. Thus, it is the providers who face most of the risk and therefore should be secured. Moreover, this solution works in the opposite case as well.

A provider interacting with a requestor first estimates the risk. As mentioned above, it relies on local reputation. Reputation is an extract from the history of past interactions with other nodes. If there were 2, 3 maybe 5 transactions with a particular requestor, the provider would have then enough information to estimate the risk and the requestor therefore considered known. If there are no or a handful of past transactions, the requestor is considered unknown. Unknown requestors pose the biggest threat to the network and the solution presented in this article focuses on them.

The intuition behind

A provider cannot interact only with known requestors because the more requestors it interacts with, the bigger its revenue will be. This is a form of exploration-exploitation problem. Moreover, it is natural that there is a slow rotation of nodes in the market, and the number of known active requestors shrinks over time. When several consecutive transactions with unknown requestors result in losses, the provider will be unlikely to interact with unknown requestors for some time and will rely on its known requestors. That way the provider is somehow pessimistic about unknown requestors upon suspicion of a Sybil attack.

With this intuition in mind, we approached the problem in the following way: the provider assumes all unknown requestors to be one requestor, who is then treated according to certain rules described next.

The solution

We will present here, for the sake of brevity, a simplified sketch of the solution — especially since the core idea between the two approaches is the same. The original algorithm is more complex and differs a little from the one presented (if you would like to explore the full, unabridged solution, see the paper).

The algorithm starts when the provider receives a request to compute a task from a requestor and needs to decide whether to interact with the requestor or not.

  1. The provider determines if the requestor is known or unknown. If the requestor is known, then the provider accepts the task and sends an offer. If the requestor is unknown, then the algorithm continues its execution.
  2. The provider draws in a probabilistic manner(depending on the interaction history) whether to reject or accept the task. In case of acceptance, it sends an offer to the requestor and the algorithm continues.
  3. If the requestor selects the offer, then the provider performs computations for the requestor.
  4. The provider updates its success rate (beta) information according to the outcome of the task; the rate is updated using an exponential moving average so that the past events have a lesser impact than the recent ones. The actual equations describing the beta parameter are rather complicated, so they will not be introduced here (if interested in learning more, please click here). Nevertheless, the basic mechanism is that beta grows with the number of successes and falls with the number of failures.
  5. Regardless of the transaction outcome, the provider waits for a period of time equal to dt/beta (dt =computation time of the task). During that time, the provider rejects all task requests from all unknown requestors.

From the algorithm’s description, you will be able to notice there are periods when the provider can accept task requests from unknown requestors — and periods when the provider always rejects task requests from the aforementioned group. In the latter case, the length of this time period depends on the ratio of successes and failures.

Model 0 secures the parties against malicious counterparties albeit in a rather basic way, as it uses very limited tools. Nonetheless, it is worth considering as it demonstrates that a party is able to manage its risks reasonably well on its own in a decentralized p2p environment.

Anyhow, as we continue our research on distributed economies and start introducing more advanced, “fancier” models into play, stay tuned for more is to come!