Fission: A Provably Fast, Scalable, and Secure Permissionless Blockchain

Ke Liang

arXiv:1812.05032·cs.CR·December 17, 2018

Fission: A Provably Fast, Scalable, and Secure Permissionless Blockchain

Ke Liang

PDF

Open Access

TL;DR

Fission introduces a scalable, secure, and decentralized permissionless blockchain using innovative pipelining, adaptive partitioning, and a hybrid network to improve throughput, reduce confirmation time, and maintain core blockchain values.

Contribution

The paper presents Fission, a blockchain system that combines novel pipelining, adaptive partitioning, and a hybrid network to enhance scalability and security while preserving decentralization.

Findings

01

Achieves high throughput via block pipelining.

02

Reduces transaction confirmation time with a relay network.

03

Ensures security with a provably secure consensus protocol.

Abstract

We present Fission, a new permissionless blockchain that achieves scalability in both terms of system throughput and transaction confirmation time, while at the same time, retaining blockchain's core values of equality and decentralization. Fission overcomes the system throughput bottleneck by employing a novel Eager-Lazy pipeling model that achieves very high system throughputs via block pipelining, an adaptive partitioning mechanism that auto-scales to transaction volumes, and a provably secure energy-efficient consensus protocol to ensure security and robustness. Fission applies a hybrid network which consists of a relay network, and a peer-to-peer network. The goal of the relay network is to minimize the transaction confirmation time by minimizing the information propagation latency. To optimize the performance on the relay network in the presence of churn, dynamic network…

Equations131

N_{i + 1}^{p} = ⎩ ⎨ ⎧ N_{i}^{p} + 1, N_{i}^{p} - 1, N_{i}^{p}, if \frac{N _{i}^{e}}{N _{i}^{p}} \geq δ N_{m a x}^{e} if \frac{N _{i}^{e}}{N _{i}^{p}} \leq (1 - δ) N_{m a x}^{e} otherwise,

N_{i + 1}^{p} = ⎩ ⎨ ⎧ N_{i}^{p} + 1, N_{i}^{p} - 1, N_{i}^{p}, if \frac{N _{i}^{e}}{N _{i}^{p}} \geq δ N_{m a x}^{e} if \frac{N _{i}^{e}}{N _{i}^{p}} \leq (1 - δ) N_{m a x}^{e} otherwise,

F (k; s_{i}, p)

F (k; s_{i}, p)

(ha s h_{i}^{e}, π_{i}^{e}) = V R F (s k_{i}, see d^{e}, t y p e),

(ha s h_{i}^{e}, π_{i}^{e}) = V R F (s k_{i}, see d^{e}, t y p e),

t i c k e t_{i} = V R F (s k_{i}, see d^{e}, ‘‘ leader ")

t i c k e t_{i} = V R F (s k_{i}, see d^{e}, ‘‘ leader ")

X_{h} \sim N (μ_{h}, σ_{h}^{2}), μ_{h} = S_{h} p, σ_{h}^{2} = S_{h} p (1 - p)

X_{h} \sim N (μ_{h}, σ_{h}^{2}), μ_{h} = S_{h} p, σ_{h}^{2} = S_{h} p (1 - p)

X_{a} \sim N (μ_{a}, σ_{a}^{2}), μ_{a} = S_{a} p, σ_{a}^{2} = S_{a} p (1 - p)

X_{a} \sim N (μ_{a}, σ_{a}^{2}), μ_{a} = S_{a} p, σ_{a}^{2} = S_{a} p (1 - p)

Pr (Y \leq 0) = Pr (Z \leq - \frac{μ _{y}}{σ _{y}}) \approx Pr (Z \leq - \frac{( 3 h - 2 ) α τ}{( 4 - 3 h ) α τ}) .

Pr (Y \leq 0) = Pr (Z \leq - \frac{μ _{y}}{σ _{y}}) \approx Pr (Z \leq - \frac{( 3 h - 2 ) α τ}{( 4 - 3 h ) α τ}) .

τ > \frac{40.5 ( 4 - 3 h )}{( 3 h - 2 ) ^{2} α}

τ > \frac{40.5 ( 4 - 3 h )}{( 3 h - 2 ) ^{2} α}

Pr (X_{a} \geq θ τ) Pr (X_{h} < θ τ) = Pr (Z \geq \frac{θ τ - μ _{a}}{σ _{a}}) < 1 0^{- 10} = Pr (Z < \frac{θ τ - μ _{h}}{σ _{h}}) < 1 0^{- 10},

Pr (X_{a} \geq θ τ) Pr (X_{h} < θ τ) = Pr (Z \geq \frac{θ τ - μ _{a}}{σ _{a}}) < 1 0^{- 10} = Pr (Z < \frac{θ τ - μ _{h}}{σ _{h}}) < 1 0^{- 10},

(1 - h) α + 6.36 \frac{( 1 - h ) α}{τ} < θ < h α - 6.36 \frac{h α}{τ}

(1 - h) α + 6.36 \frac{( 1 - h ) α}{τ} < θ < h α - 6.36 \frac{h α}{τ}

θ > (1 - h) α + 6.36 \frac{( 1 - h ) α}{τ} > \frac{1}{4} + \frac{3.18}{τ} .

θ > (1 - h) α + 6.36 \frac{( 1 - h ) α}{τ} > \frac{1}{4} + \frac{3.18}{τ} .

h α - 6.36 \frac{h α}{5000} > 0.3,

h α - 6.36 \frac{h α}{5000} > 0.3,

: E (r) = \frac{w}{∣ V ∣} m_{i} \in M \sum \frac{l _{i}^{2}}{u _{i}},

: E (r) = \frac{w}{∣ V ∣} m_{i} \in M \sum \frac{l _{i}^{2}}{u _{i}},

: m_{i} \in M \sum l_{i} = ∣ V ∣,

c_{i} (a_{i}, a_{- i}) \leq c_{i} (a_{i}^{'}, a_{- i}),

c_{i} (a_{i}, a_{- i}) \leq c_{i} (a_{i}^{'}, a_{- i}),

c_{i} (a_{i}, a_{- i}) \leq c_{i} (a_{j}, a_{- i}) \Rightarrow Φ (a_{i}, a_{- i}) \leq Φ (a_{j}, a_{- i}) .

c_{i} (a_{i}, a_{- i}) \leq c_{i} (a_{j}, a_{- i}) \Rightarrow Φ (a_{i}, a_{- i}) \leq Φ (a_{j}, a_{- i}) .

a^{*} = a \in A argmin Φ (a) .

a^{*} = a \in A argmin Φ (a) .

Φ (r) = r_{i} \in r \sum ∣ M ∣ (r_{i} - \overline{r})^{2},

Φ (r) = r_{i} \in r \sum ∣ M ∣ (r_{i} - \overline{r})^{2},

\displaystyle\Pr(j,k)=\left\{\begin{array}[]{l l}\frac{u_{i}}{|U|}\left(1-\frac{r_{k}}{r_{j}}\right)&\mbox{if $r_{j}>r_{k}$,}\\ 0&\mbox{if $j\neq k,r_{j}\leq r_{k}$,}\\ 1-\sum_{j\neq k}\Pr(j,k)&\mbox{if $j=k$.}\end{array}\right.

\displaystyle\Pr(j,k)=\left\{\begin{array}[]{l l}\frac{u_{i}}{|U|}\left(1-\frac{r_{k}}{r_{j}}\right)&\mbox{if $r_{j}>r_{k}$,}\\ 0&\mbox{if $j\neq k,r_{j}\leq r_{k}$,}\\ 1-\sum_{j\neq k}\Pr(j,k)&\mbox{if $j=k$.}\end{array}\right.

E [Φ (r_{t + 1})]

E [Φ (r_{t + 1})]

E [Φ (r_{t + 1})]

E [Φ (r_{t + 1})]

E [Φ (r_{t + 1})]

E [Φ (r_{t + 1})]

= i = 1 \sum m Var [r_{t + 1}^{i}] \leq (m Φ (r_{t}))^{\frac{1}{2}} .

E [Φ (r_{t + 1})]

E [Φ (r_{t + 1})]

f (t + 1) \leq \frac{lo g m}{2} + \frac{f ( t )}{2} .

f (t + 1) \leq \frac{lo g m}{2} + \frac{f ( t )}{2} .

f (t + λ)

f (t + λ)

\leq lo g m + \frac{1}{2 ^{λ}} f (t) .

E [Φ (r_{t + λ})] \leq m E [Φ (r_{t})]^{2^{- λ}} .

E [Φ (r_{t + λ})] \leq m E [Φ (r_{t})]^{2^{- λ}} .

E [Φ (r_{t + λ})] \leq 2 m,

E [Φ (r_{t + λ})] \leq 2 m,

Pr (Φ (r_{t + λ}) \geq 4 m)

Pr (Φ (r_{t + λ}) \geq 4 m)

\Pr(\tau=i\lambda)\left\{\begin{array}[]{l l l}\geq p&\mbox{if $i=1$,}\\ \leq p(1-p)^{i-1}&\mbox{if $i\geq 2$,}\end{array}\right.

\Pr(\tau=i\lambda)\left\{\begin{array}[]{l l l}\geq p&\mbox{if $i=1$,}\\ \leq p(1-p)^{i-1}&\mbox{if $i\geq 2$,}\end{array}\right.

E [τ]

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsBlockchain Technology Applications and Security · Caching and Content Delivery · Distributed systems and fault tolerance

Full text

Fission: A Provably Fast, Scalable, and Secure Permissionless Blockchain

Version 1.0

Ke Liang

The Fission Project

[email protected]

Abstract.

We present Fission, a new permissionless blockchain that achieves scalability in both terms of system throughput and transaction confirmation time, while at the same time, retaining blockchain’s core values of equality and decentralization.

Fission overcomes the system throughput bottleneck by employing a novel Eager-Lazy pipeling model that achieves very high system throughputs via block pipelining, an adaptive partitioning mechanism that auto-scales to transaction volumes, and a provably secure energy-efficient consensus protocol to ensure security and robustness.

Fission applies a hybrid network which consists of a relay network, and a peer-to-peer network. The goal of the relay network is to minimize the transaction confirmation time by minimizing the information propagation latency. To optimize the performance on the relay network in the presence of churn, dynamic network topologies, and network heterogeneity, we propose an ultra-fast game-theoretic relay selection algorithm that achieves near-optimal performance in a fully distributed manner. Fission’s peer-to-peer network complements the relay network and provides a very high data availability via enabling users to contribute their storage and bandwidth for information dissemination (with incentive). We propose a distributed online data retrieval strategy that optimally offloads the relay network without degrading the system performance.

By re-innovating all the core elements of the blockchain technology - computation, networking, and storage - in a holistic manner, Fission aims to achieve the best balance among scalability, security and decentralization.

1. Introduction

Due to the phenomenal success of cryptocurrencies (nakamoto2008bitcoin, ; wood2014ethereum, ) in the last few years, blockchain technology has gained massive interests, and recently emerged with a promise to streamline interactions in a wide range of settings. A blockchain, also called a distributed ledger, is a decentralized and incorruptible ledger used to record transactions across many nodes111We use the terms node, peer, and user interchangeably. which do not need to fully trust each other. By applying certain consensus mechanisms, all the nodes in a blockchain network agree on an ordered set of blocks, each of which could contain multiple transactions. It should be noted that we only discuss permissionless or public blockchains in this paper, where anyone can join and participate in the process of transaction verification and block proposing.

The key feature of blockchains is decentralization. That is, instead of relying on central trusted authorities or infrastructures, blockchains are built on top of a global peer-to-peer (P2P) network, where messages (*e.g., *transactions, blocks) are disseminated to the whole network in a gossip-like manner. Moreover, everyone can verify all the transactions and propose new blocks which are supposed to be appended to the blockchain via consensus mechanisms. Note that these transactions are not just financial transactions (*e.g., *cryptocurrencies, tokens), but virtually everything of value. By applying new applications, like smart contracts (ethereum2016, ), blockchains have been leveraged to many sectors, *e.g., *financial services, Internet-of-Things, insurance, shifting the landscapes of the industries that worth trillions of dollars.

Despite the fact that the blockchain technology brings significant opportunities and disruptive potential in many industries, scalability has been a key issue severely that limits the adoptions of the blockchain technology, causing poor user experience, congested network, and skyrocketing transaction fees. The scalability of blockchains is fundamentally hindered by the challenges of distributed system design (*e.g., *consensus protocols), and limitations of the underlying P2P networks. More specifically, the former results from the core problem of all blockchains, *i.e., *double-spending. It is difficult to prevent double-spending by achieving consensus on an ordered list of transactions in a public setting where anyone, including malicious users, can participate, anytime, anywhere. The latter is due to the fact that P2P network protocols (*e.g., *Kademlia DHT (maymounkov2002kademlia, ; anderson2016new, )) that most existing blockchains employ are not designed for blockchains, where messages (*i.e., *transactions and blocks) need to be disseminated constantly in a many-to-many manner. Therefore, there are two important challenges that a scalable blockchain should address: 1) high throughput (measured by the transaction rate, *i.e., *TPS) and 2) fast confirmation times222A transaction is confirmed if it is included in a confirmed block..

To address the first challenge, a considerable amount of Proof-of-Work (PoW) based consensus protocols (Sompolinsky2016SPECTREAF, ; sompolinsky2018phantom, ; eyal2016bitcoin, ; li2018scaling, ) and Proof-of-Stake (PoS) based consensus protocols (bentov2016snow, ; david2018ouroboros, ; Gilad:2017:ASB:3132747.3132757, ; buterin2017casper, ) have been proposed to increase the system throughput without compromising the security guarantees and decentralization. However, all of the above new PoS-based consensus protocols can only achieve sub-optimal performance in terms of system throughput and confirmation times due to 1) the constrained resources (computation, bandwidth, memory, etc.) of nodes that are selected for transaction verification and block proposing, and 2) the limitations of underlying P2P networks which results in high information propagation latency due to dynamic network topologies and huge blocks with a large number of transactions. To further improve the system throughput, sharding technology (kokoris2018omniledger, ; hsiao-wei2017, ; zamani2018rapidchain, ) has been proposed to split up the task of consensus among multiple, smaller concurrently operating sets of nodes, thus reducing per-node processing and storage requirements. However, creating a secure sharding solution that is capable of making cross-shard (or inter-shard) transactions (especially atomic synchronous transactions) is non-trivial, and existing sharding-based consensus protocols either make security/performance trade-offs or rely on strong assumptions (*e.g., *trusted authorities) that defeats decentralization.

To address the second challenge, some centralized relay networks has been applied (FIBRE333http://bitcoinfibre.org/ and Falcon444https://www.falcon-net.org/ for Bitcoin) or proposed (klarmanbloxroute, ) to reduce the information (*e.g., *blocks) propagation latency. However, all the existing relay networks (either centralized or distributed) for blockchains have paid little attention to the system dynamics (which can be categorized into churn, *i.e., *users leave and join the system), and network heterogeneity (*i.e., *relayers have different and time-varying bandwidth capacities). Without addressing the system dynamics and network heterogeneity, the performance of relay networks may not only be far from optimal, but it may also be detrimental in practice.

In this paper, we present Fission, which achieves both high system throughput and fast confirmation times without compromising security and decentralization. Fission improves upon the scalability limitations in several ways. First, we propose an Eager-Lazy pipeling model that separates every atomic transaction into two successive and independent sub-transactions, which can be processed separately without any communication cost (which significantly deteriorates scalability with sharding based solutions). This enables optimal parallelization while maintaining consistency. Second, we propose an adaptive partitioning mechanism that groups sub-transactions into different partitions according to the latest transaction volumes. As a result, an optimal tade-off between system throughput and confirmation times can be achieved. Last but not least, we employ a PoS-based consensus protocol to reach consensus on transactions in every partition with high security guarantees while maintaining decentralization. We investigate and prove the effect of system activity and honest threshold to the security guarantees, and then propose an online algorithm that enables consensus to be reached in a distributed and non-interactive manner.

Fission employs a hybrid network to minimize the information propagation latency, as a result, both system throughput and confirmation times can be significantly improved. The proposed hybrid network consists of a relay network and a P2P network. To achieve a stable and near-optimal the information propagation latency in the presence of unpredictable system dynamics and network heterogeneity, we propose an ultra-fast probabilistic relayer selection algorithm which converges in $O(\operatorname*{loglog}N^{\text{r}})$ steps, where $N^{\text{r}}$ is total number of nodes in the relay network. The P2P network of Fission complements the relay network such that blocks and even the whole blockchain data can be retrieved from nodes, thus dramatically increases the data availability. We propose an online data retrieval strategy that enables any node to retrieve information from both the relay network and the P2P network in an efficient and cost-effective manner.

The rest of paper is organized as follows. Section 2 provides an overview of Fission fundamentals, including the data structures and cryptographic technologies used in Fission blockchain. We then provide the details of the Eager-Lazy transaction model, the adaptive partitioning mechanism and the consensus protocol in Section 5. In Section 6, we investigate the problem of minimizing the information propagation, and describe our relay selection algorithm, followed by the convergence analysis of the distributed algorithm. Section 7 details the P2P storage layer, the proposed data retrieval strategy and its efficiency. Section 9 will conclude the paper.

2. Overview of Fission

Fission consists of three layers, each of which is designed to fulfill the requirements of three core components of a blockchain: 1) the computation layer than enables all the nodes agree on the common view of a blockchain in presence of Byzantine failures, 2) the network layer that delivers transactions and blocks to all the nodes to create consensus, and 3) the storage layer that form an append-only, temper-proof distributed ledger with cryptographic data structures.

The native currency in Fission, denoted by FIT, is an utility token that enables users to participate in consensus and pay for the transaction processing, smart contract execution, etc.. Without loss of generality, any reference to amount, value, balance or payment in Fission should be counted in FIT.

2.1. Definitions

2.1.1. Account

Fission is a account-based blockchain. Each account has a pair of private and public keys. The private key, denoted by $sk$ , is generated by secp256k1 curve (sec20002, ) and it should be always kept secret. The public key, denoted by $pk$ , is derived from the private key with Elliptic Curve Digital Signature Algorithm (ECDSA) (johnson2001, ). The public key is also referred to as the address of an account, and it can be safely shared in public as it is almost impossible to derive the private key from a public key. One user may control many accounts, but only one public key may exist per account.

2.1.2. Transaction

A transaction is essentially a digitally signed message, where the data includes: 1) the transaction type, 2) a token transfer from one account (*i.e., *the sender) to another (*i.e., *the receiver), 3) a scalar value to be transfered from the sender, 4) a nonce that indicates the number of transactions sent by the sender, 5) a hash of the additional data that can be stored in the P2P storage layer, and 6) a signature of the transaction that used to determine the sender of the transactions. The unique identifier of a transaction is the SHA3-256 hash of the transaction.

2.1.3. Block

A block in Fission consists of 1) a body that contains a list of ordered transactions to be confirmed, and 2) a header that contains Merkle root arrays of these transactions, account table, and transaction logs, as well as the metadata (*e.g., *signatures, votes) needed for the consensus. The account table is a mapping between accounts and their states which contains the information like balances, nonces, etc.., and the transaction logs contains the post-transaction states and logs created through execution of these transactions.

2.1.4. Chain

Fission can be viewed as a transaction-based state machine where every block contains a set of states that can include such information as account balances, transaction logs, etc.. To apply the novel Eager-Lazy pipeling model (see Section 3) that significantly improve the scalability, Fission introduces two types of blocks that are appended to the blockchain in an alternating manner: 1) the interim block that confirms the eager sub-transactions, and 2) the main block that confirms the corresponding lazy sub-transactions.

2.1.5. Merkle root array

Every block has a set of Merkle root arrays (*e.g., *txRoot[]), each of which is an array of Merkle roots of all the shards. A Merkle tree is a binary tree in which every leaf node is labeled with the hash of a data block and every non-leaf node is labeled with the cryptographic hash of the labels of its child nodes.

2.1.6. Shard

A shard is a subset of transactions and states that are grouped by applying modulo function on their public keys. Let $pk_{i}$ denote the public key of node $v_{i}$ , then all the transactions sent by $v_{i}$ and the states of $v_{i}$ will be assigned to the shard with index $j\equiv pk_{i}\mod N_{\text{shard}}$ , where $N_{\text{shard}}$ is the total number of shards. The value of $N_{\text{shard}}$ is contained in every block header, and it is increased based on re-sharding algorithm which is detailed in Section 4.2.

2.1.7. Partition

A partition consists of one or more shards. Fission’s consensus is operating at the partition level such that a small group of nodes will be randomly selected (with probability proportional to their stakes) for each partition to reach a consensus on all the transactions in shards of the partition. Similarly, as detailed in Section 4, shards are assigned to partitions by applying modulo function on their shard indexes.

2.2. The Computation Layer

The goal of the computation layer is to verify transactions, reach consensus on new blocks, and append blocks to the blockchain. To improve the system throughput, sharding, a commonly used scalability scheme in databases decades ago, has been applied to parallelize transaction processing via splitting the overheads of operation among multiple, smaller groups of nodes (*i.e., *shards). However, all existing sharding-based solutions are far from optimal, as they impose a huge burden for the network due to cross-shard communication and synchronization. This significantly deteriorates both the system throughput and transaction confirmation times.

To overcome the limitations of scalability, we propose a novel Eager-Lazy pipeling model, an adaptive partitioning mechanism and a PoS-based consensus protocol in Fission’s computation layer, which significantly improve the scalability while maintaining security and decentralization.

2.2.1. Eager-Lazy pipeling model

We propose a Eager-Lazy pipeling model (see Section 3) that maximizes the parallelization by block pipeling without cross-shard communication. The basic idea is to separate each atomic transaction into two types of independent sub-transactions that can be processed sequentially while ensures consistency of the transaction. As a result, Fission’s blockchain use two types of blocks that are appended to Fission’s blockchain in an alternating manner such that each block contains a set of confirmed sub-transactions with the same type.

2.2.2. Adaptive partitioning mechanism

To the best of our knowledge, none of existing sharding-based solutions adapts the time-varying transactions volumes. More specifically, all the transactions will be distributed to a fixed number of shards. If the transaction volume decreases (during non-peak hours), shards will be underutilized in terms of computation, but both the intra-shard and the inter-shard communication cost remain the same. To this end, we propose an adaptive partitioning mechanism (see Section 4) that can accommodate transaction volumes by dynamically grouping shards in partitions and processing transactions in partition basis. As a result, shards’ resources can be optimally utilized and the network burden can be minimized.

2.2.3. PPAP consensus protocol

Designing a consensus protocol for a scalable blockchain is very challenging. First, the consensus protocol should be able to avoid Sybil attacks (douceur2002sybil, ) - a common attack in open, decentralized environments where an adversary (*i.e., *a malicious user) can create multiple identities to influence the protocol. Moreover, the consensus protocol must be scale to high transaction volumes in an energy-efficient way. To address these challenges, we propose probably secure PoS-based consensus protocol, named Parallel Proof-of-Active-Participation (PPAP, see Section 5), where active (or online) participates (*i.e., *nodes) are randomly selected (with probability proportional to their tokens) reach the consensus on partitions in parallel. We investigate the relationship between security guarantees of PPAP and the system activity and honest threshold.

2.3. The Hybrid Network

Due to the huge communication cost of broadcasting messages from one node to all the other nodes in the system and reaching consensus in partitions, we propose a hybrid network to minimize the information propagation latency, which is defined as the combination of transmission time and the local verification of the message.

As shown in Fig. 1, the proposed hybrid network consists of : 1) a relay network, which consists of a small set of relay nodes that forward messages (*e.g., *transactions, blocks, etc.) in a many-to-many manner, and 2) a P2P network, where every node only broadcast its own messages to its neighbors, while broadcasting or forwarding the hashes of messages signed by other nodes in a gossip manner.

2.3.1. The relay network

The purpose of the relay network is to minimize the information propagation latency via reducing the number of hops that messages traverse before they reach all the online nodes. The relay network consists of a set of *inter-connected * relayers (servers with high network capacity and good hardware specifications) such that every relayer is incentivized (via collecting a portion of transaction fees and block rewards) to deliver every message to a large number of nodes simultaneously.

Specifically, each node will select one relayer to broadcast messages, and update, synchronize its local copy of the blockchain. To avoid garbage messages that may overwhelm the relay network, every relayer must validate messages (based on their signatures) before relaying them to the whole network. Furthermore, the relay network will forward all the hashes of messages, but only deliver the corresponding messages when nodes request. Due to the system dynamics that incurs dynamic and unpredictable loads on the relayer, it is plausible that nodes can effectively select or change their relayers, such that the overall system will not be vulnerable to those relayers that are under-performing or under DDoS attacks.

Instead of deploying the relay network and managing the relayer selection in a centralized way, we design and implement the relay network in a strongly distributed setting. Specifically, Fission enables every node to selfishly independently and concurrently select relayer that maximizes its own profit (*i.e., *lower propagation latency) without any central control. We prove in Section 6.4 that the expected information propagation latency is minimized once the loads on relayers are balanced proportional to their capacities, in which case a Nash equilibrium is achieved.

It is important to note that without accommodating the system dynamics and network heterogeneity in the relay network, the performance of any relay selection strategy may be detrimental in practice. To this end, we then propose a game-theoretic distributed algorithm, named Probabilistic Relay Selection (PRS) that leads the system to $\epsilon$ -Nash equilibrium in an ultra-fast555We follow the common use of the superlative “ultra” for double-logarithmic bounds (cohen2003scale, ; fountoulakis2012ultra, ) $O(\operatorname*{loglog}N^{\text{r}})$ convergence time from any prior system state (*i.e., *load distribution in the relay network), where $N^{\text{r}}$ is the number of relayers.

2.3.2. The P2P network

Fission’s P2P network is based on the Kademlia DHT (maymounkov2002kademlia, ), which is designed to be an efficient means for storing and finding content in a P2P network. Unlike other blockchains like Ethereum, where the Kademlia DHT is used only for node discovery, Fission fully utilizes the DHT-based P2P network as a highly available distributed storage system, similar to IPFS (benet2014ipfs, ).

Similar to other blockchain P2P networks, transactions are disseminated in Fission’s P2P network in a gossip-like manner. Besides, all the nodes will send their transactions to their relayers who help them to reach the whole network in 2 hops. Unlike existing blockchains, Fission’s nodes do not forward blocks to their neighbors, although the block hashes are gossiped in the P2P network. In such a way, block information (*i.e., *block hash) can be propagated (with help of the relay network) to the whole network as soon as possible.

Upon receiving a block hash, a node will choose to retrieve the block if it is selected as a committee member for the block. Otherwise it will ignore it (to effectively utilize the network resource). In Fission, every node can retrieve a block (based on its hash) from a content provider, defined as the node that has a local copy of data, with a lower fee, or from the relay network with a higher fee. In addition, if a node is a committee member for a block, it needs to retrieve the block within a latency constraint (measured in seconds), otherwise it may miss the voting process, and as a result, lost the chance to get the block rewards. To minimize the cost of data retrieval with time constraints, we propose an online and light-weight data retrieval strategy that achieves a good trade-off between performance and complexity.

3. Eager-Lazy Pipeling

Although sharding promises to improve the throughput and reduce per-node processing and storage requirements, existing sharding-based blockchains still require a linear amount of communication per transaction (to sync up every transaction among different shards), and thus attain only partially benefits of sharding.

To this end, we introduce a novel Eager-Lazy pipeling model to improve the system throughput via transaction pipeling. Specifically, each atomic transaction in Fission is divided into two successive and independent sub-transactions: 1) the eager sub-transaction (*e.g., *withdrawing tokens from the payer), and 2) the corresponding lazy sub-transaction (*e.g., *depositing tokens to the payee), as shown in Fig. 2.

The proposed Eager-Lazy pipeling overcomes the limitations of sharding-base solution via avoiding *i.e., *high cross-shard communication. Fission confirms each atomic transaction via confirming the two sub-transactions separately and independently in the two successive blocks. In other words, all the eager sub-transactions will be confirmed in a block, referenced by another block containing all the corresponding lazy sub-transactions. As a result, Fission introduces two types of blocks: 1) the interim block that stores all the confirmed eager sub-transactions, and 2) the main block that stores all the confirmed lazy sub-transactions.

Fission blockchain proceeds in fixed time periods called epochs. In every epoch, a new block that contains a set of sub-transactions is appended to the blockchain. Therefore, the blockchain consists of a sequence of concatenated blocks $<B_{0},B_{1},\cdots>$ , where $i\geq 0$ indicates the epoch number. As shown in Fig. 3, blocks are appended to the blockchain in an alternating way such that $B_{i}$ is a main block if and only if $i\mod 2=0$ , otherwise it is an interim block.

3.1. Micro Block

Note that all the sub-transactions will be processed in parallel with the proposed adaptive partitioning mechanism (explained in Section 4). Upon reaching a consensus, a micro block that contains all the confirmed sub-transactions in a partition will be generated and broadcast to all the partition committee members. Otherwise, the corresponding transactions will be processed in the next round.

3.2. Interim Block

An interim block is a combination of all the micro blocks that contains the eager sub-transactions. The transactions, the account table, and the transaction logs are distributed in shards (not partitions), simply based on the public keys of the senders. As shown in Fig. 3 and Fig. 4, each interim block contains three Merkle root arrays, each of which is a array of Merkle root of each shard. It should be noted that not all the eager sub-transactions in a partition can be processed within $\Delta_{\texttt{micro}}$ (*i.e., *, the time constraint for a micro block) due to both the heterogeneous complexity of transactions (*e.g., *some of them are one-to-many fund transfers, or smart contracts), and the heterogeneous computation capacities of committee members in a partition. An interim block will be generated by the (interim) block committee by combining all the micro blocks from all the partitions within a time constraint $\Delta_{\texttt{interim}}$ . Once a consensus is reached, the interim block will be appended to the blockchain, otherwise an empty interim block will be appended.

3.3. Main Block

Once an interim block is appended to the blockchain, a (main) block committee will be selected to reach a consensus on a main block that contains the corresponding lazy sub-transactions within a time constraint $\Delta_{\texttt{main}}$ . Considering that lazy sub-transactions are basically credit operations based on logs of the eager sub-transactions (*e.g., *txLogRoot[]) in the interim block, they are supposed to be processed much faster than eager sub-transactions. Furthermore, all the corresponding lazy sub-transactions in the interim block need to be confirmed in the main block which reference the interim block by interimHash in the block header. Otherwise, an empty main block will be appended.

4. Adaptive Partitioning

Allocation of resources to processing transactions in parallel is challenging for blockchains of all sizes. Carving out or allocating parts of the system to run tasks without interfering with each other is commonly referred to as “partitioning”. Partitioning, in general, is the ability to divide up system resources into groups of parts in order to facilitate particular functions. Recently, some partitioning scheme have been proposed which partition a blockchain into persistent or static partitions (*i.e., *shards). However, the capacities of the parallel partitions would be underutilized if only a fixed number of transactions were to be proceeded on the fixed number of shards. In order to perform efficient scheduling of resources, we propose an adaptive partitioning mechanism to process transactions parallelly in different partitions, instead of shards. System performance can be further improved by adaptively determining the number of shards allocated to a partition based on the transaction volumes.

Each partition consists of one or multiple shards, each of which consists of eager or lazy sub-transactions, account table, and transaction logs assigned to it based on the public keys of senders and accounts, respectively. The sub-transactions in each partition will be processed and a micro block (see Section 3.1) will be generated and agreed by a small group of nodes, called partition committee, randomly selected using PPAP protocol based on the latest block and the partition index. Specifically, let $j$ be a shard index, then it will be assigned to a partition with index $k\equiv j\mod N^{p}$ , where $N^{p}\leq N^{s}$ is the total number of partitions. The value of $N^{p}$ is contained in every block header, and it is adjusted based on the latest transaction volumes.

4.1. Number of Partitions

The purpose of the proposed adaptive partitioning mechanism is to reach consensus on the final block as soon as possible respecting the fluctuation of transaction volumes. The larger the number of transactions is, more partitions are required such that partitions are not overloaded. In Fission, a simple auto-scaling strategy is applied to determine the number of partitions as follows: Let $N^{e}_{i}$ and $N^{p}_{i}$ be the number of confirmed eager sub-transactions, and the number of partitions in the latest block $B_{i}$ , respectively. The number of partitions for the next block $B_{i+1}$ , denoted by $N^{p}_{i+1}$ is derived as following:

[TABLE]

where $N^{e}_{\max}$ is the maximum number of sub-transactions a partition can process, and $\delta\in(0,1)$ is a scale factor. Both $N^{e}_{\max}$ and $\delta$ are pre-determined in Fission, and they can be adjusted for the best practice.

4.2. Re-Sharding

Note that the state and storage information of nodes are organized using Merkle tree structure, which is essentially a binary tree. It is straightforward and efficient to increase the number of shards by splitting a Merkle tree into two or more Merkle trees. Re-sharding will be triggered automatically if there are a number of successive main blocks, say $(B_{i},B_{i+1},\dots,B_{i+N^{rs}})$ , we have $N^{p}_{j}=N^{s}_{j}$ , for all $j\in[i,i+N^{rs}]$ , where $N^{s}_{j}$ denotes the number of shards in block $B_{j}$ , and $N^{rs}$ is a pre-determined number.

5. PPAP Consensus

Byzantine agreement protocols, *e.g., *PBFT(castro2002practical, ), have been used to sync up states among a relatively small group of servers, and it has been used in other blockchains to reach consensus. However, PBFT requires a fixed number of servers acting like bootstrapping nodes thus those blockchains may be vulnerable to Sybil attacks. Moreover, it does not scale to a large number of nodes (say over 100,000 nodes) who are participating in the consensus process.

To scale the Fission’s consensus process to many online nodes, a small group of nodes are selected (see Section 5.2) as the committee for each partition and block at each epoch. Once the nodes are selected and their voting power are determined, a Byzantine agreement protocol is executed by every selected node (*i.e., *every committee member) to reach a consensus on the new block.

Fission’s consensus process is purely decentralized such that every node can be selected as a committee member. To avoid Sybil attacks, where adversaries may create millions of nodes with negligible tokens to increase the probability of being selected, and thus increase the probability of appending malicious blocks, Fission enables nodes to be selected randomly with probability proportional to their tokens, and their voting power are proportional to their tokens. As a result, Fission’s consensus protocol is a PoS-based protocol.

5.1. Active Participation

Furthermore, Fission’s consensus protocol relies on the active participation of nodes, who improve the security and performance of the system by verifying transactions and voting for the new blocks to be appended to the blockchain. It is worth to note that all the blockchains require active participation from their users. Without the participation of users, the security of blockchains cannot be guaranteed. Therefore, Fission’s consensus protocol is named as Parallel Proof-of-Active-Participation, PPAP in short, where the security of Fission is guaranteed by online nodes who are actively participating the transactions/blocks verification and block producing.

5.2. Committee Selection

In Fission, committees are randomly selected from all the online nodes at each epoch $e$ in a non-interactive manner, with probability proportional to their tokens (*i.e., *stakes). Let $V_{e}=\cup_{i\in\mathbb{Z}^{+}}\{v_{i}\}$ and $S_{e}=\cup_{i\in\mathbb{Z}^{+}}\{s_{i}\}$ denote the set of all the active nodes, and the set of their tokens at epoch $e\in\mathbb{Z}^{+}$ , respectively. Consider every node, say $v_{i}$ performs a Bernoulli trial on every token it has. Let $p$ be the probability of success in the Bernoulli trial. Then the probability of being selected is $1-(1-p)^{s_{i}}$ , and the expected voting power is $p\times s_{i}$ . Therefore, the voting power of node $v_{i}$ , defined as $o_{i}$ , is a random variable follows Binomial distribution, i.e., $o_{i}\sim\mathcal{B}(s_{i},p)$ . The cumulative distribution function (CDF) for $o_{i}$ is defined as the following:

[TABLE]

To generate the $o_{i}$ given $s_{i},p$ for each $v_{i}\in V_{e}$ , we use a simple algorithm based on the inverse transform method (gentle2006random, ), as shown in Fig. 5 and Algorithm 1.

Specifically, every node will generate a random number, say $x$ from a uniform distribution on the interval $(0,1)$ (Line 1 of Algorithm 1). Then it compares $x$ with $F(0;s_{i},p)=(1-p)^{s_{i}}$ (*i.e., *the probability that none of $s_{i}$ tokens is selected) and stop and return [math] if $x$ is not greater, as shown in Lines 2-3 of Algorithm 1. The inverse CDF (i.e., $F^{-1}(x)$ ) is used to calculate the voting power of $o_{i}$ given the random number $x$ , and $s_{i},p$ .

Every node in Fission uses VRFs (micali1999verifiable, ) to generate random numbers to calculate their voting power. Because VRFs allows other node to verify those numbers efficiently. Furthermore, it enables nodes to calculate their voting power in a non-interactive and independent manner. Let $B_{e}$ denote the block at epoch $e$ , and let $seed_{e}$ denote the seed information used by all online nodes to calculate their voting power for the next block $B_{e+1}$ . Every node, say $v_{i}\in V_{e}$ , will generate a tuple using VRF as follows:

[TABLE]

where $hash_{i}^{e}$ is a 256-bit long pseudo-random value that is uniquely determined by $sk_{i}$ (private key of node $v_{i}$ ), $seed^{e}$ (random seed of block $B_{e}$ ), and the type of committee (*e.g., *partition or block committee), but is indistinguishable from random to any node that does not know $sk_{i}$ . $\pi_{i}^{e}$ is the proof that anyone can verify that $hash_{i}^{e}$ is valid with the knowledge of $pk_{i}$ , the public key of node $v_{i}$ . Therefore, the random number that is used to calculate voting power of node $v_{i}$ given $s_{i},p$ is $hash_{i}^{e}/2^{256}$ .

Considering that it is impractical to assume $V_{e}$ or $n=|V_{e}|$ is known as a prior, we choose $p$ as a pre-determined value which does not depend on dynamic, unpredictable information. Specifically, $p=\tau/K$ , where $K$ is the total number of tokens, and $\tau$ is the expected number of tokens selected. Choosing a good $\tau$ is very important as it affects both the performance and the security guarantee of the consensus algorithm, and we will discuss it in Section 5.4.

5.3. Byzantine Agreement

Fission’s Byzantine agreement works roughly as follows: for each block committee, a block proposer will be elected to propose a new block. Specifically, at the beginning of every epoch $e$ , every block committee member $v_{i}\in V_{e}$ generates a ticket as follows:

[TABLE]

Then block committee members then gossip their tickets with each other for a time $\Delta_{\text{leader}}$ , after which they elect the valid ticket with the lowest value they have seen and accept the corresponding node as the block proposer, i.e., leader. If this leader is or becomes unavailable, leadership passes to the next node in ascending order of tickets. Then the block proposer will broadcast the hash of the candidate block to the whole network. Upon received this hash, every block committee member will retrieve the block (from its relayer or other nodes) before verifying all the transactions in the block, and then broadcast its votes with its signature. Once the block proposer receive enough votes (i.e., $\geq\theta$ , see Section 5.4 for more details) on the new block, it will gossip the confirmed block to the whole network, where each node will append this block to its local copy of the blockchain.

5.4. Security Guarantee

Let $S_{h}$ , and $S_{a}$ denote the total number of tokens owned by the online honest nodes, and adversaries (*i.e., *malicious users), respectively. Let $h$ be the honesty threshold, indicating the fraction of tokens that are owned by honest nodes, i.e., $S_{h}=h(S_{h}+S_{a})$ . Let $\alpha$ be the activity of system, defined as the fraction of tokens that are owned by the online nodes. Therefore, we have $(S_{h}+S_{a})=\alpha K$ .

Let $X_{h}$ and $X_{a}$ denote the number of tokens selected (via executing Algorithm 1) from honest nodes, and adversaries, respective. As discussed in Section 5.2, we have $X_{h}\sim\mathcal{B}(S_{h},p)$ , and $X_{a}\sim\mathcal{B}(S_{a},p)$ , that is both $X_{h}$ and $X_{a}$ follow the binomial distribution. Note that $S_{h}$ and $S_{a}$ are sufficient large666Which means $S_{h}p>5,S_{a}p>5$ , thus both $\mathcal{B}(S_{h},p)$ and $\mathcal{B}(S_{a},p)$ can be approximated with Normal distribution as follows:

[TABLE]

Define $Y=X_{h}-2X_{a}$ , and it also follows Normal distribution, i.e., $Y\sim N(\mu_{y},\sigma_{y}^{2})$ , where $\mu_{y}=(3h-2)\alpha\tau$ , and $\sigma_{y}^{2}=(4-3h)\alpha\tau$ . To ensure the consensus is reached on valid blocks, the fraction of selected tokens are owned by adversaries should be less than 1/3. Therefore, the probability $\Pr(Y\leq 0)$ should be negligible given $h$ and $\alpha$ . We transform $Y$ to a random variable $Z$ , which follows a standard normal distribution,i.e., $Z\sim N(0,1)$ , then we have the following:

[TABLE]

Considering $\Pr(Z\leq-6.36)\leq 10^{-10}$ , hence we have

[TABLE]

To ensure that a consensus on a valid block can be reached in a non-interactive manner, the fraction of weights (*i.e., *votes), denoted by $\theta$ , needs to be pre-determined such that a block is valid with high probability (i.e., $>1-10^{-10}$ ) if there are more than $\theta\tau$ votes are received on the block. Hence $\Pr(X_{a}\geq\theta\tau)$ and $\Pr(X_{h}<\theta\tau)$ should be negligible, even the activity of system (i.e., $\alpha$ ) is relative low, as shown below:

[TABLE]

where $Z\sim N(0,1)$ is a random variable follows standard normal distribution. Considering $\Pr(Z>6.36)<10^{-10}$ and $\Pr(Z<-6.36)<10^{-10}$ , hence we have

[TABLE]

Note that one of most secure and well-adopted blockchain, Bitcoin, is still vulnerable to attacks (*e.g., *selfish mining(eyal2018majority, )) if adversaries control over $25\%$ of hashrate. We assume that the honest threshold of the system, $h$ , should be larger than $0.75$ . Otherwise Fission will be vulnerable to attacks by adversaries. As the right part of Eq. 7 is a monotone and non-increasing function over $h$ , we have $\tau>\frac{1134}{\alpha}$ given $h\geq 0.75$ . To satisfy Eq. 9 under the same assumption of $h\geq 0.75$ , $\tau$ should be relative large to be adaptive to different activity of system, i.e., $\alpha$ . However, a larger $\tau$ results in a higher communication cost, thus $\tau$ needs to be determined considering the trade-off between the communication cost and security. Specifically, based on Eq. 9 and Eq. 7, we have the lower bound of $\theta$ given $h\geq 0.75$ and $\alpha\in[0,1]$ :

[TABLE]

In Fission we set $\tau=5000$ , and $\theta=0.3$ . Therefore, to satisfy the right inequation of Eq. 9, we have

[TABLE]

and thus $\alpha>0.53$ . Therefore, given $h\geq 0.75$ , $\alpha>0.53$ , and $\tau=5000$ , a consensus on a valid block can be reached *w.h.p. *if over $1500$ votes are received.

6. The Relay Network

The relay network of Fission is a mesh network consists of a group of relayers who contribute their resources (*e.g., *bandwidth, memory and computation) to propagate messages (*e.g., *transactions, blocks, etc.). With relayers, all messages can be delivered within at most 3 hops, thus the information propagation latency is significantly reduced.

The relay network in Fission is implemented in a strong practical and distributed settings: 1) there are no oracle or centralized authorities managing or controlling the relayers, 2) each relayer has no knowledge (*e.g., *load, number of connections, etc.) about other relayers, and 3) relayers have heterogeneous bandwidth capacities and hardware specifications, and both can vary over time.

To achieve the optimal propagation latency in the distributed settings, nodes need to change their relayers from overloaded ones to underloaded ones independently and concurrently (see Section 6.1). However, It can be observed that if all nodes behave greedily at each step (*i.e., *, they select those relayers with minimum load deterministically), the system may not converge to a steady state (Berenbrink:2006:DSL:1109557.1109597, ). To guarantee that the system will converge rapidly, some probability rules need to be imposed for all nodes when they decide to change their strategies at each step. In Fission , we proposed a probabilistic relayer selection algorithm, called PRS (see Section 6.4), which leads the system converge to an $\epsilon$ -Nash equilibrium (see Section 6.2) where a near-optimal propagation latency is achieved.

6.1. Relay Selection and Update

After joining the system, every node $v_{i}\in V$ will randomly select a relayer, denoted by $m_{j}\in M$ with probability proportional to their advertised bandwidth capacities, and then it will invoke a periodic relay update process to choose a better relayer every $\Delta_{\texttt{relay}}$ seconds (the update interval).

At each step, node $v_{i}$ randomly selects a different relayer, defined as the relay candidate, in the system. Node $v_{i}$ will potentially replace the current relayer with the relay candidate using Algorithm 2, detailed in Section 6.4. It is worth noting that every node will incur a certain load on its relayer. The load of $m_{j}\in M$ , denoted by $l_{j}$ , is defined as the number of nodes that choose $m_{j}$ as their relayer, and the load ratio of $m_{j}$ , denoted by $r_{j}$ , is defined as $l_{j}/u_{j}$ .

6.2. Minimizing Propagation Delay

The key performance metric for information propagation in bandwidth constrained networked environments is the averaged propagation latency. In Fission , every node will send the message (transactions and blocks) and its hash to the relayer, who then forward the message to all the connected nodes who did not receive the message yet. Therefore, the distribution of loads on the relayers will significantly affect the propagation delay.

Considering that relayers have heterogeneous and time-varying resources in practice, we investigate the problem of minimizing expected information propagation delay respecting the heterogeneous bandwidth constraints, as shown in Theorem 6.1. We then propose a game-theoretic relay selection algorithm, *i.e., *PRS, to minimize the averaged information propagation delay in a fully distributed manner.

Theorem 6.1.

Let $V=\cup_{i\in\mathbb{Z}^{+}}\{v_{i}\}$ be the set of online nodes. Let $M=\cup_{i\in\mathbb{Z}^{+}}\{m_{i}\}$ and $U=\cup_{i\in\mathbb{Z}^{+}}\{u_{i}\}$ be the set of relayers and their advertised upload capacities, respectively. Given any distribution of load ratios on the relayers, $\mathbf{r}$ , the expected averaged information propagation delay, denoted by $\operatorname*{E}(\mathbf{r})$ , is minimized if and only if $\forall r_{i}\in\mathbf{r},r_{i}=\overline{r}=\frac{|V|}{|U|}$ .

Proof.

As every node selects a relayer u.a.r., the number of messages to be propagated by a relayer is proportional to its load (*i.e., *number of nodes that select it as the relayer). Therefore, the expected number of message to be propagated by the relayer $l_{j}/|V|$ . For every message, $m_{j}$ is supposed to forward to $l_{j}$ nodes, thus the propagation delay incurred on relayer $m_{j}$ is $\frac{\overline{w}l_{i}^{2}}{u_{i}|V|}$ , where $\overline{w}$ denotes the averaged message size. Furthermore, the problem of minimizing the expected propagation delay for any node $v_{i}\in V$ is formulated as the following

[TABLE]

By the method of Lagrange multipliers, $\operatorname*{E}(\mathbf{r})$ is minimized if $\frac{l_{i}}{u_{i}}=\frac{l_{j}}{u_{j}},\forall m_{i},m_{j}\in M$ , and this completes the proof. ∎

Theorem 6.1 implies that to minimize the expected propagation delay, the loads on the relayers must be balanced proportionally to their advertised upload bandwidth. To achieve a scalable system, our goal is to distribute and balance the loads on the relayers in a fully distributed manner. We then formulate the distributed load balancing problem as an asymmetric congestion game, where each player (*i.e., *node) in the game can change his/her strategy (*i.e., *relayer) individually and concurrently.

6.3. Congestion Game Preliminaries

The classical congestion games have been investigated for many years. A congestion game, denoted by $G$ , can be defined as a tuple $(V,M,(A_{i})_{i\in V},(f_{e})_{e\in M})$ , where $V$ denotes the set of players and $M$ denotes a set of facilities. Each player $i\in V$ is assigned a finite set of strategies $A_{i}$ and a cost function $f_{e}$ is associated with facility $e\in M$ .

To play the game, each player $i\in V$ selects a strategy $a_{i}\subseteq A_{i}$ , where $A_{i}\subseteq M$ is the strategy set of player $i$ . The strategy profile, denoted by $\mathbf{a}=(a_{i})_{i\in V}$ , is defined as a vector of strategies selected by all the players. Similarly we use the notation $\mathcal{A}=\times_{i\in V}A_{i}$ to denote the set of all possible strategy profiles. A congestion game is symmetric if all the players have the same strategy set, i.e., $\forall i,j\in V,A_{i}=A_{j}$ ; otherwise it is asymmetric. A congestion game is weighted if each player $i$ is specified a weight $w_{i}$ . The cost of player $i$ for the strategy profile $\mathbf{a}$ is given by $c_{i}(\mathbf{a})=f_{a_{i}}(\mathbf{a},w_{i})$ .

The goal of each player in congestion games is to minimize her own cost without trying to optimize the global situation. That is, all players will try to lower their own cost by changing their strategies individually. A pure Nash equilibrium is defined as a steady state in which no players have an incentive to change their strategies.

Definition 6.2 (Pure Nash equilibrium (nisan2007algorithmic, )).

A strategy profile $\mathbf{a}\in\mathcal{A}$ is said to be a pure Nash equilibrium of $G$ if for all players $i\in V$ and each alternate strategy $a_{i}^{\prime}\in A_{i}$ ,

[TABLE]

where $\mathbf{a}_{-i}=(a_{j})_{j\in V\setminus\{i\}}$ denotes the list of strategies of the strategy profile $\mathbf{a}$ for all players except $i$ .

Congestion games have a fundamental property that a pure Nash equilibrium always exists (fabrikant2004complexity, ). To analyze convergence properties of congestion games (i.e., the time from any state to a pure Nash equilibrium), we introduce the definition of potential function for congestion games.

Definition 6.3 (Potential function (even2005fast, )).

A function $\Phi:\mathcal{A}\rightarrow R$ is a potential for game $G$ if $\forall\mathbf{a}\in\mathcal{A}$ , $\forall a_{i},a_{j}\in A_{i}$ ,

[TABLE]

Therefore, $\mathbf{a}^{*}$ is a Nash equilibrium if and only if

[TABLE]

An $\epsilon$ -Nash equilibrium is an approximate Nash equilibrium, which is defined as a state in which no player can reduce her cost by a multiplicative factor of less than $1-\epsilon$ by changing her strategy.

We define a potential function to associate the expected propagation delay with $\mathbf{r}=(r_{1},r_{2},\dots)$ , which is the load ratio vector on the relayers:

[TABLE]

where $\overline{r}=|V|/|U|$ is the optimal load ratio. The potential function has the property that if node $v_{i}$ switches from a relayer with a high load ratio to another relayer with a low load ratio, $\Phi(\mathbf{r})$ will decrease accordingly.

Note that every node is selfish and will change its strategy to lower its own cost at each step. This will result in Nash dynamics – i.e., $\Phi(\mathbf{r})$ will fluctuate over time. To guarantee that the system converges to a steady state rapidly, a set of probability distributions over $A_{i}$ (i.e., strategy set) needs to be assigned to each node $v_{i}\in V$ . That is, all the nodes’ strategies are nondeterministic and are regulated by a probabilistic rule.

6.4. Probabilistic Relayer Selection (PRS)

We now present the probabilistic relay selection (PRS) algorithm based on an asymmetric congestion game. PRS enables each node to update its relayer as follows (shown in Algorithm 2). At each step, node $v_{i}$ that selects a relayer $m_{j}$ will contact a relay candidate $m_{k}$ with probability proportional to its advertised upload bandwidth. At the same time it finds the load and the capacity of $m_{k}$ (i.e., the relay candidate). Let $r_{j}$ and $r_{k}$ be the load ratios of $m_{j}$ and $m_{k}$ , respectively. If $r_{j}>r_{k}$ , then $v_{i}$ replaces $m_{j}$ with $m_{k}$ with a probability, denoted by $\Pr(j,k)$ , shown as below:

[TABLE]

where $u_{k}$ is the advertised upload capacity of $m_{k}$ , and $|U|$ is the total advertised upload capacities of all the relayers.

For very relayer $m_{k}\in M$ , the probability of being selected as a candidate relayer is $\frac{u_{k}}{|U|}$ . Our solution is to maintain an identifier space, which can be managed either by a distributed scheme (e.g., using DHT), or by a centralized scheme (i.e., using a server to store all the identifiers). Each identifier consists of a relayer ID (*i.e., *, its name or IP address) and a randomly generated hash key. To let a relayer to be chosen probabilistically proportional to its upload capacity, each relayer $m_{k}$ will maintain $\frac{u_{k}}{\mu}$ identifiers, where $\mu$ is a scalar value measured by Kbps. Therefore, at each step of the relayer update process, node $v_{i}$ will choose an identifier from the identifier space u.a.r. (uniformly at random), hence, the probability of relayer $m_{k}$ to be contacted by $v_{i}$ , is proportional to its upload capacity $u_{k}$ .

An important criteria to evaluate distributed algorithms is the convergence time, which is a measure of how fast the algorithm leads the system reaches a steady state. Otherwise system dynamics may significantly degrade the performance. We analyze the upper bound of the convergence time of PRS, where all the nodes behave simultaneously at each step.

Let $\mathbf{r}_{t}$ and $\mathbf{r}_{t+1}$ be the load ratio vectors of the current step $t$ and the next step $t+1$ , respectively. Let $\mathbf{r}_{t}^{i}=l_{i}/u_{i}$ denote the load ratio of relayer $m_{i}$ at time $t$ . Since we consider a highly dynamic and heterogeneous environment where nodes can join, leave and make transactions at will, we are especially interested to show how fast PRS can lead the system to converge to a $\epsilon$ -Nash equilibrium. We observe that $\max\Phi_{\epsilon\text{-Nash}}=\frac{\epsilon^{2}m}{4}$ , where $m$ is the total number of relayers. Hence we say that a $\epsilon$ -Nash equilibrium is reached at time $t$ if $\Phi(\mathbf{r}_{t})=O(m)$ .

Theorem 6.4.

Let $m$ be the number of relayers, and let $\Phi(\mathbf{r}_{t})$ be the potential function of the relayers at any step $t$ . Then, the upper bound on the number of steps that $\Phi(\mathbf{r}_{t})$ requires to reach a $\epsilon$ -Nash equilibrium is $O(\log\log m)$ .

Proof.

To better structure the proof, we introduce two supporting lemmas provided in the Appendix. Given any step $t$ , we observe from Lemma A.1 that the expected load ratio of any node at step $t+1$ is the optimal load ratio $\overline{r}$ . Lemma A.2 provides an upper bound of the variance of load ratios at step $t+1$ . Consider $\operatorname*{E}\left[\Phi(\mathbf{r}_{t+1})\right]$ , which is the expected value of the potential function at step $t+1$ :

[TABLE]

Note that all the nodes behave independently at each step (*i.e., *select a relayer with lower load ratio), hence $\mathbf{r}_{t+1}$ is independent of $\mathbf{r}_{t}$ . It follows that

[TABLE]

Lemma A.1 indicates that the expectation of each relayer $m_{i}$ , $\operatorname*{E}\left[\mathbf{r}_{t+1}^{i}\right]$ , is $\overline{r}$ at each step, hence we have

[TABLE]

We observe that the square-root function is concave. Therefore, by Jensen’s inequality, the expected value of the potential function at step $t+1$ is

[TABLE]

Since $\operatorname*{E}\left[\Phi(\mathbf{r}_{t})\right]>0$ for any step $t\geq 0$ , we define a function $f(t)=\log(\operatorname*{E}\left[\Phi(\mathbf{r}_{t})\right])$ , and we have the following

[TABLE]

After $\lambda\in\mathbb{Z}$ steps from any step $t\geq 0$ ,

[TABLE]

Therefore, the expected value of the potential function after $\lambda$ steps from any step $t\geq 0$ is

[TABLE]

The upper bound of $\Phi(\mathbf{r})$ is $O(m^{2})$ , which occurs in the case when every node $v_{i}$ selects one specific relayer $m_{j}\in M$ . Hence, the upper bound of $f(t)$ is $O(\log m)$ , *i.e., *there exists an integer $a\in\mathbb{Z}$ such that $\sup f(t)\leq a\log m$ . Therefore, it suffices to show that there exists a $\lambda=\lceil\log(a\log m)\rceil\sim O(\log\log m)$ , such that $\frac{f(\lambda)}{2^{\lambda}}\leq 1$ . Hence, for any $t\geq 0$ , it suffices to show that

[TABLE]

where $\lambda=O(\log\log m)$ . By Markov’s inequality, we have

[TABLE]

Let $\tau$ be the first time such that $\Phi(\mathbf{r}_{t+\tau})\leq 4m$ occurs from any step $t$ . Flip a coin after each of $\lambda$ steps until $\Phi(\mathbf{r}_{t+\tau})\leq 4m$ :

[TABLE]

where $p=\Pr(\Phi(\mathbf{r}_{t+\lambda})\leq 4m)$ , thus $0.5\leq p<1$ . Therefore,

[TABLE]

By Markov’s inequality, $\Pr(\tau\geq 60\lambda)\leq\frac{\operatorname*{E}\left[\tau\right]}{60\lambda}=0.025$ . Conversely, $\Pr(\Phi(\mathbf{r}_{t+60\lambda})\leq 4m)\geq 0.975$ . As a result, from any state, the system can rapidly (within $O(\log\log m)$ steps) converge to a $\epsilon$ -Nash equilibrium, where the load ratios of the nodes are approximately balanced.

∎

6.5. Incentive

Note that non-cooperative or malicious behaviors in may significantly adversely affect the entire network via flooding the relay network with invalid messages. To this end, relayers will verify every message they received before propagating it to other relayers or nodes. This also enables relayers to identify malicious nodes, and then limit the effect of such attacks. Besides, It incurs bandwidth costs to deliver large traffic volumes to a large number of nodes.

To ensure Fission’s sustainability and robustness, relayers are incentivized to relay messages as quickly as possible, helping partition or block committees to win the race for block rewards (deterministically, the first $\theta$ votes for blocks). One possible approach is to apply a subscription-based model that every node will pay a rental fee to its relayer. Another approach is to open a payment channel between every node and its relayer such that each payment on a single message is paid in an on-demand basis. We will detail the incentive mechanism in our future work and implementations.

7. The P2P Network

Unlike other blockchains’ P2P network, where messages are delivered in a gossip-like manner, Fission’s P2P network takes advantage of P2P content-sharing capabilities of BitTorrent or IPFS. Combined with the relay network and the P2P network, Fission provides an efficient, cost-effective, and highly available distributed information storage and dissemination solution that scales to high transactions volumes.

Specifically, only hashes of messages are gossiped in Fission’s P2P network. Hence Fission scales to large blocks. However, it may require every node to retrieve the whole content from either the relayer, or a node in the P2P network immediately after it receives a hash from its neighbor or relayer. For instance, a block committee member needs to retrieve the to-be-verified block and vote for it within a latency constraint (e.g., $\Delta_{\texttt{interim}}$ or $\Delta_{\texttt{main}}$ ) in order to receive the block rewards. To incentivize nodes to contribute their resources (*e.g., *storage, bandwidth, etc.) to serve others, a small amount of fee (in FIT) will be charged by the content provider, which are the nodes having the required data in their local storage. It should be noted that nodes should pay less fees to the content providers than the relayers (usually with much higher network capacities and hardware specifications) which provide a better QoS (*i.e., *Quality-of-Service) in terms of latency.

Therefore, the challenge is to minimize nodes’ costs of retrieving content while respecting 1) the latency constraints of content, 2) the heterogeneous resource (*e.g., *bandwidth) distribution on nodes, and 3) the system churn (nodes can join and leave the system anytime). To address this challenge, we propose an online algorithm that minimize the cost by optimally utilizing the bandwidth of nodes in a fully decentralized manner.

7.1. Problem Formulation

Let $V$ and $W$ denote the set of online nodes and the set of data (*e.g., *transactions, blocks, etc.) that is required by these nodes, respectively. Let $P_{k}\in V$ denote the set of content providers who are online nodes that has data $k$ in their storage. The size of each data $i\in W$ , denoted by $w_{i}$ , is measured in units of kilobytes (*i.e., *, KB). It should be noted that all the data is splitted into chunks, in order to improve the data availability and efficiency of retrieval. Let $Q_{i}=\cup_{j\in V}\{q_{i}^{j}\}$ be the set of requests sent by all nodes for data $i$ , where $q_{i}^{j}$ denotes the request sent by node $j$ for content $i$ . The weight of the request $q_{i}^{j}$ be the size of the data (*i.e., *, $w_{i}$ ). The latency constraint for each data is denoted by $d=\min(\Delta_{\texttt{eager}},\Delta_{\texttt{lazy}})$ , measured in seconds. Without loss of generality, we assume online nodes are heterogeneous and have different upload capacities. Let $u_{i}$ denote the advertised upload capacity of node $i$ , measured in kilobytes per second, *i.e., *, KB/s.

Since the amount of data that can be delivered by node $i$ within the latency constraint is $d\cdot u_{i}$ , and thus the relayers will disseminate the rest of the amount of data, if any. Therefore, the relayers will send $\max\{l_{i}-d\cdot u_{i},0\}$ of data due to the inadequate upload capacity of node $i$ . Our goal is to maximize the data sent by nodes while respecting the upload capacities of heterogeneous nodes and the latency constraints of data, which is formulated as follows:

[TABLE]

where $x_{k}^{j,i}$ is an indicator variable indicating whether the request $q_{k}^{j}$ is sent to node $i$ . Specifically, if $q_{k}^{j}$ is sent to node $i$ , $x_{k}^{j,i}=1$ ; otherwise $x_{k}^{j,i}=0$ . It can be observed that the objective (33) is a multidimensional knapsack problem (puchinger2010multidimensional, ), which is NP-complete (garey1979computers, ) in general. Many approaches have been proposed to solve it, such as LP-relaxation (schrijver1998theory, ), the primal-dual method (buchbinder2009design, ) and dynamic programming (schrijver2003combinatorial, ). However, these centralized approaches are not desirable and practical since a global connectivity information among nodes and the request information is required to obtain a feasible solution. Moreover, there are existing decentralized (panconesi2008fast, ) and online (buchbinder2009design, ; emek2010online, ) algorithms that compute approximate solutions within poly-logarithmic communication rounds, but they are based on the assumption that each node has the same capacity and is aware of global information, *i.e., *, it knows about loads and upload capacities of nodes, and the connectivity information between any two of them. Note that the dual problem is to minimize the data sent by relayers, given the same request set, which can be formally stated as follows:

[TABLE]

Considering every request has only one provider. The load of node $i$ is $l_{i}=\sum_{Q_{k}\neq\emptyset}\sum_{i\in Q_{k}}x_{k}^{j,i}w_{k}$ . Therefore, the objective (36) can be simplified to

[TABLE]

where $\max\{l_{i}-d\cdot u_{i},0\}$ is the amount of bandwidth consumed by the relayers due to the inadequate upload capacity of node $i$ . Therefore, we transform the problem of maximizing the data sent by nodes into an asymmetric load balancing problem: how to distribute the loads on overloaded nodes ( $l_{i}>d\cdot u_{i}$ ) to underloaded nodes ( $l_{i}<d\cdot u_{i}$ ).

The objective of Fission’s P2P storage layer is to design a P2P data retrieval strategy that can be implemented in a decentralized and online fashion. To this end, we model the P2P data retrieval problem as a congestion game and thus inherit its practical and decentralized nature, which has been successfully used to model load balancing problems in P2P networks as well as many other real world applications due to its conceptual simplicity.

7.2. Data Retrieval Strategy (DRS)

The objective (39) can be formulated as an asymmetric weighted congestion game, de=noted by $G$ , which is defined as

[TABLE]

where $Q=\cup_{k\in W}Q_{k}$ and $P=\cup_{k\in P}P_{k}$ correspond to the set of requests and the set of content providers, respectively. Each request $i\in Q$ has a weight $w_{i}$ (*i.e., *, the size of the requested data), and each node $i\in P$ has an upload capacity $u_{i}$ . The strategy set of request $i$ , denoted by $A_{i}$ , is the set of nodes $P_{i}$ that have stored the requested data in their caches. Since every request is sent to only one node at a time, $G$ is a singleton congestion game, i.e., $\forall i\in Q$ , $a_{i}\in A_{i}$ .

The strategy profile of $G$ , denoted by $\mathbf{a}=(a_{i})_{i\in Q}$ , corresponds to the DRS777That is, choose which node to retrieve data from., and $\mathcal{A}=\times_{i\in Q}A_{i}$ corresponds to the set of all possible DRSs. The cost of request $i$ given a node selection profile $\mathbf{a}$ is defined as

[TABLE]

where $j=a_{i}$ is the node selected by request $i$ and $h_{i}$ is the height 888All nodes serve retrieval requests in a FIFO manner. of request $i$ at node $j$ . Clearly, for each node $j\in S$ , the sum of the cost of requests that select node $j$ is

[TABLE]

Recall that $\max\{l_{j}-d\cdot u_{j},0\}$ is the amount of data that the relayers will disseminate to the corresponding nodes due to the upload capacity limitation of node $j$ . We define the potential function as the sum of the cost of all requests:

[TABLE]

Therefore, the problem of maximizing the data sent by nodes can be solved by minimizing the potential function of the corresponding congestion game.

It has been shown that finding a pure Nash equilibrium in a congestion game is PLS-complete (fabrikant2004complexity, ), hence the number of changes of the strategy profile required from one state to any pure Nash equilibrium is exponentially large. Therefore, our goal is to quickly reach an approximate Nash equilibrium from any state by enabling each node to change its strategies for its requests according to a bounded jump rule (see lines 3 and 3 in Algorithm 3), which will be discussed later.

It is worth to point out that the offline data retrieval strategies are undesirable in practice due to both the additional delay they incur to find appropriate content providers for all nodes, and the inefficient utilization of node bandwidth (since the node bandwidth cannot be used to delivery data during the process of finding a good node selection strategy). Furthermore, blockchains are highly dynamic environments such that churn and time-varying node capacities will significantly degrade the feasibility and efficiency of offline data retrieval strategies.

We introduce an online, light-weight data retrieval strategy which maximizes the data sent by nodes by enabling each node to repeatedly update its strategy for each data that it requests. More specifically, all requests are sent by nodes in a repeated fashion, *i.e., *, the strategy of each request can be changed if (1) the current content provider of the request is overloaded, and (2) the corresponding node finds an underloaded provider within the latency constraint.

Our algorithm is online such that each node performs data retrieval independently. Each request is sent to a content provider u.a.r. once it is generated (line 3 in Algorithm 3). Each node $i$ can change its content provider for each request in a repeated fashion if the bounded jump rule is satisfied.

Our algorithm is light-weight such that each node $i$ needs to contact only one content provider in $P_{k}(d^{\prime})$ for each request $q_{k}^{i}$ with latency constraint $d^{\prime}$ at one time, where $P_{k}(d^{\prime})=\{i\in P_{k}:l_{i}<d^{\prime}u_{i}\}$ . The current latency constraint (i.e., $d^{\prime}=d-t+t_{k}$ ) of the request is calculated at the requesting node, where $t$ and $t_{k}$ are the current time and the time when the request is generated, respectively. It is worth noting that every node in $P_{k}$ has the same probability being selected, although some nodes have failed to satisfy the bounded jump rule in previous rounds.

The key idea of the bounded jump rule is to restrict the change of content providers such that the potential of the system decreases monotonically. Since all requests are processed in a FIFO fashion, the nodes that have the requests which can be processed within their latency constraints will not consider changing their current strategies (see lines 3 and 3 in Algorithm 3). From the perspective of congestion games, those requesting nodes have no incentives to change their strategies since their costs cannot be reduced (for pure Nash equilibrium) or not significantly reduced (for $\epsilon$ -Nash equilibrium). More formally, for a given request $q_{k}^{i}$ (sent by node $i$ for data $k$ ) at node $p_{i}^{k}$ with a weight $w_{k}$ and a height $h_{k}^{i}(t)$ , by Eq. 41, the cost of $q_{k}^{i}$ at time $t$ is

[TABLE]

where $\mathbf{a}(t)$ is the strategy profile of all nodes at $t$ , and $(d-t+t_{k})$ is the latency constraint of data $k$ at $t$ . This implies

[TABLE]

Therefore, a node will consider to change its strategy for its requests only when the costs of requests are higher than their weights, otherwise it will stop contacting other nodes (line 3 in Algorithm 3). If the requester of $q_{k}^{i}$ (i.e., node $i$ ) finds another content provider $j^{\prime}$ at $t$ (line 3 in Algorithm 3) during the repeated update process (the loop from line 3 to line 3 in Algorithm 3) such that $l_{j^{\prime}}(t)\leq(d-t+t_{k})u_{j^{\prime}}$ , it will send $q_{k}^{i}$ to node $j^{\prime}$ . It is worth noting that multiple nodes may find the content provider $j^{\prime}$ and send requests to it at the same time. In the worst case, only the first request can be processed by node $j^{\prime}$ since nodes process requests in a FIFO manner. Finally, the request will be sent to the relayers if $c_{q_{k}^{i}}(\mathbf{a}(t))=w_{k}$ for any $t\in[t_{k},t_{k}+d]$ . It should be noted that no requesting node will change its content provider mid-way during a download, since it will stop contacting other content providers if it can download the request data from the current content provider (lines 3 and 3 in Algorithm 3).

With the bounded jump rule, it is easy to observe that the potential function of game $G$ is monotonically decreasing such that for a given set of requests and $\forall t_{1},t_{2}$ , we have

[TABLE]

Furthermore, our algorithm applies to a fully distributed and concurrent setting such that all nodes can change their content providers at the same time without a centralized coordination.

7.3. Analysis

With the proposed bounded jump rule, a steady state is a $\epsilon$ -Nash equilibrium of the congestion game. Therefore, a critical question to investigate is how fast does the system reach a $\epsilon$ -Nash equilibrium from any state? In what follows, we seek to investigate the upper bound of the convergence time of our data retrieval strategy.

Recall that the potential function of the game $G$ with the bounded jump rule is monotonically decreasing in the presence of concurrent changes of nodes’ strategies (nodes update their content providers individually), because nodes only consider to change the overloaded content providers. More specifically, node $i$ will change the content provider of request $q_{k}^{i}$ , if and only if $h_{k}^{i,j}-(d-t+t_{k})u_{j}\geq w_{k}$ (*i.e., *Eq. 46 and Line 3 in Algorithm 3), where $t$ and $t_{k}$ are the local time and the time when the request is generated, respectively. The equivalent condition is $h_{k}^{i,j}+(t-t_{k})u_{j}-w_{k}\geq d\cdot u_{j}$ . Therefore, we can analyze the convergence rate of the proposed node selection strategy via analyzing the convergence rate of the corresponding congestion game.

Theorem 7.1.

Given a game $G=(Q,P,(A_{i})_{i\in Q},(f_{e})_{e\in P})$ that satisfies the bounded jump rule, where $Q=\cup_{k\in W}Q_{k}$ and $P=\cup_{k\in W}P_{k}$ is the set of requests and content providers, respectively. Then a $\epsilon$ -Nash equilibrium can be reached within $O(\log n)$ rounds, where $n=|V|$ is the number of online nodes.

Proof.

Let $\Phi(t)$ be the potential of the game at time $t$ . According to Eq. 44, we have $\Phi(t)=\sum_{k\in Q(t)}w_{k}$ . Similarly, let $P(t)=\cup_{k\in Q(t)}P_{k}(d)$ be the set of underloaded nodes at time $t$ , where $P_{k}(d)=\{i\in S_{k}:l_{i}<d\cdot u_{i}\}$ denotes the set of underloaded nodes that have data $k$ in their caches. Let $n=|V|$ and $m_{t}=|P(t)|$ be the number of online nodes and underloaded content providers at time $t$ , respectively. Note that a pure equilibrium is reached at time $t$ if (1) $\Phi(t)=0$ (all requests can be delivered by nodes within latency constraints), or (2) $m_{t}=0$ (no more underloaded content providers). Therefore, we introduce a non-increasing function $\Omega(t)=m_{t}\Phi(t)$ to analyze the convergence rate of DRS.

At time $t+1$ , let $\mathbf{x}=\{x_{1},x_{2},...,x_{m_{t}}\}$ be a vector, where $x_{i}$ denote the total size of requests that migrate to content provider $i\in P(t)$ . Obviously, $\Phi(t)=\sum_{i=1}^{m_{t}}x_{i}$ . By applying Cauchy-Schwarz inequality to $\operatorname*{E}[\Omega(t+1)]$ , which is the expected value of $\Omega(t+1)$ , we have the following:

[TABLE]

Noting that the square root function $f:x\rightarrow\sqrt{x}$ and the function $g:x\rightarrow x^{2}$ are convex functions, we apply Jensen’s inequality to Eq. 48 and have the following:

[TABLE]

where $m$ is the number of underloaded content providers in $P(t+1)$ at time $t+1$ . The expected value of the potential function given $m$ is

[TABLE]

Combining Eq. 50 and Eq. 49, we have the following:

[TABLE]

Let $\tau$ be the number of rounds that a $\epsilon$ -Nash equilibrium (i.e., $\Omega(t+\tau)=O(1)$ ) is reached from time $t$ . By Lemma 7.2, we have the following

[TABLE]

Note that the maximum value of of $\Omega(t)$ is $|W|n^{2}$ ( $|W|$ is the total number of data), which occurs in the case when every node requests all the data in the region. Therefore, $\operatorname*{E}[\tau]\leq 2\log(|W|n)$ . By Markov’s inequality, we have that

[TABLE]

This implies that $\Pr(\tau\leq 40\log(|W|n))\geq 0.95$ . As a result, with a high probability, a $\epsilon$ -Nash equilibrium can be reached from any state within $O(\log n)$ rounds. ∎

Lemma 7.2.

Let $X_{1},X_{2},...$ denote a sequence of non-negative random variables and assume that for all $i\geq 0$

[TABLE]

for some constant $\alpha\in(0,1)$ . Furthermore, fix some constant $x^{*}\in(0,x_{0}]$ and let $\tau$ be the random variable that describes the smallest $t$ such that $X_{t}\leq x^{*}$ . Then,

[TABLE]

Proof.

The complete proof is shown in (fischer2008approximating, ). ∎

Theorem 7.1 indicates that the proposed DRS can rapidly reach a $\epsilon$ -Nash equilibrium within $O(\log n)$ communication rounds from any state, where $n=|V|$ is the total number of online nodes in the system.

8. Future Work

At the time of writing, Fission is still a Proof of Concept, and it has limitations that we want to address in future work. As described in Section 7, nodes are contributing their resources (*e.g., *storage and bandwidth) to speed up information propagation and improve data availability. We leave to future work the exploration of incentive mechanisms that encourages nodes to share their resources, similar to Filecoin999https://filecoin.io/. We also leave to future work the use of advanced cryptography, such as BLS signature scheme (boneh2001short, ) for performance improvements on PPAP consensus. Last but not least, we see in Section 5.4 that the system activity (*i.e., *online users participating the consensus process) has a great impact to security guarantees. Considering that most of users may not be online 24x7, we will investigate some delegation mechanisms that improve the security of our consensus protocol without significantly degrading the decentralization nature of Fission .

9. Conclusion

In this paper, we present Fission that distinguishes itself from all existing permissionless blockchains that it achieves scalability in both terms of system throughput and transaction confirmation time through maximizing the parallelization (Eager-Lazy pipeling) and efficiency (adaptive partitioning and PoS-based consensus protocol) in the computation layer, and minimizing the information propagation latency via a hybrid topology.

Appendix A Lemmas for Proof of Theorem 6.4

Let $J:=\{j\mid r_{j}<r_{i}\}$ be the set of relay nodes that have a smaller load ratio than $m_{i}$ . Similarly let $H:=\{h\mid r_{h}>r_{i}\}$ be the set of relay nodes that have a larger load ratio than $m_{i}$ . Let $m:=|M|$ be the total number of relay nodes.

Lemma A.1.

Let $\mathbf{r}_{t}$ be the load ratio vector at time $t$ . Then, the expected load ratio of any relay node $m_{i}$ at time $t+1$ , denoted by $\mathbf{r}_{t+1}^{i}$ , is the optimal load vector $\overline{r}$ , i.e., $\forall r_{i}\in\mathbf{r},\ \operatorname*{E}\left[\mathbf{r}_{t+1}^{i}\mid\mathbf{r}_{t}\right]=\overline{r}$ .

Proof.

For any relay node $m_{k}\in M$ , the nodes that choose $m_{k}$ as a helper at time $t$ will choose $m_{i}$ as the relay node with probability $\Pr(k,j)$ (see Eq. 21), hence we have

[TABLE]

For each node $k\not\in H\cup J$ , $r_{k}=r_{i}$ . Hence $\frac{u_{i}}{u}\sum_{k\not\in H\cup J}l_{k}-\frac{l_{i}}{u}\sum_{k\not\in H\cup J}u_{k}=0$ . As a result, $\operatorname*{E}\left[\mathbf{r}_{t+1}^{i}\mid\mathbf{r}_{t}\right]=\frac{l}{u}=\overline{r}$ . ∎

Lemma A.2.

$\sum_{i=1}^{m}\operatorname*{Var}\left[\mathbf{r}_{t+1}^{i}\mid\mathbf{r}_{t}\right]\leq\left(m\Phi(\mathbf{r}_{t})\right)^{\frac{1}{2}}$ .

Proof.

Let $(Y_{i,1},\dots,Y_{i,m})$ be a random variable drawn from a multinomial distribution with the constraint $\sum_{j=1}^{m}Y_{i,j}=l_{i}$ . For each relay node $m_{i}$ , the variance of the load ratio of $m_{i}$ at time $t+1$ is shown as below:

[TABLE]

where $u_{i}$ is the upload capacity of relay node $m_{i}$ , and we assume $u_{i}\geq 2$ . Therefore, we sum up the variance of the load ratio at each relay node and we have

[TABLE]

For all $r_{k},r_{i}\in\mathbf{r}$ , $|r_{k}-r_{i}|\leq|r_{k}-\overline{r}|+|r_{i}-\overline{r}|$ . Using this, we can simplify the above inequality:

[TABLE]

Note that the multivariable function $f(u_{1},...,u_{m})=\sum_{i=1}^{m}\frac{1}{u_{i}}$ is a Schur-concave[27] function, and the maximum of $f$ , denoted by $\sup f$ , is achieved if and only if $\forall i,j\in[1,m],\ u_{i}=u_{j}$ (by Karamata’s inequality[27]). That is $\sup f=f(\frac{u}{m},...,\frac{u}{m})$ . Hence we have

[TABLE]

Since $u_{i}\geq 2$ for each $m_{i}\in M$ , we have $\sum_{i=1}^{m}\frac{1}{u_{i}}|r_{i}-\overline{r}|\leq\frac{1}{2}|r_{i}-\overline{r}|$ . In addition, $2mu_{k}\leq u^{2}$ for any $u_{k}\geq 2$ and $m\geq 2$ . Combining all the inequations above, we have

[TABLE]

Then, by Cauchy-Schwarz inequality, we obtain

[TABLE]

∎

Bibliography41

The reference list from the paper itself. Each links out to its DOI / PubMed record.

1[1] L. Anderson, R. Holz, A. Ponomarev, P. Rimba, and I. Weber. New kids on the block: an analysis of modern blockchains. ar Xiv preprint ar Xiv:1606.06530 , 2016.
2[2] J. Benet. Ipfs-content addressed, versioned, p 2p file system. ar Xiv preprint ar Xiv:1407.3561 , 2014.
3[3] I. Bentov, R. Pass, and E. Shi. Snow white: Provably secure proofs of stake. IACR Cryptology e Print Archive , 2016:919, 2016.
4[4] P. Berenbrink, T. Friedetzky, L. A. Goldberg, P. Goldberg, Z. Hu, and R. Martin. Distributed selfish load balancing. In Proceedings of the Seventeenth Annual ACM-SIAM Symposium on Discrete Algorithm , SODA ’06, pages 354–363, Philadelphia, PA, USA, 2006. Society for Industrial and Applied Mathematics.
5[5] D. Boneh, B. Lynn, and H. Shacham. Short signatures from the weil pairing. In International Conference on the Theory and Application of Cryptology and Information Security , pages 514–532. Springer, 2001.
6[6] N. Buchbinder, J. S. Naor, et al. The design of competitive online algorithms via a primal–dual approach. Foundations and Trends® in Theoretical Computer Science , 3(2–3):93–263, 2009.
7[7] V. Buterin and V. Griffith. Casper the friendly finality gadget. ar Xiv preprint ar Xiv:1710.09437 , 2017.
8[8] M. Castro and B. Liskov. Practical byzantine fault tolerance and proactive recovery. ACM Transactions on Computer Systems (TOCS) , 20(4):398–461, 2002.