September 11, 2018

So They're Selling You a Blockchain

Blockchain is certainly one of the hot technology topics today. Yes, it’s probably in part a hype bubble – but from my perspective as a long-time researcher in decentralized systems, I believe there is also a lot of potential value in the concept, which is why my DEDIS lab at EPFL has been investing years of effort in building next-generation blockchain architectures. I also believe the value of the blockchain concept is relatively independent of the unpredictable and scam-riddled financial market for cryptocurrencies.

However, the genuine promise of blockchain technology will be realized only if blockchains are designed properly and deployed cautiously. Several companies evaluating blockchain technologies have asked me to help them figure out what to look for and what questions to ask in considering alternatives, which prompted me to create a list of questions that I thought I’d share in case it is useful to others. This is by no means a complete list, and is certainly not “unbiased”: in particular it is strongly oriented towards security- and privacy-related questions and issues, which is natural since I’m a security/privacy researcher. I’m happy to take suggestions to extend the list further, and will consider adding them if I agree on their importance – although given the mountain of (often blockchain-related) e-mail I’m constantly buried under, I can’t promise how promptly I’ll be able to respond, let alone edit and update this post with suggestions. So this list is provided purely “without warranty”: take it for whatever it’s worth, and of course your mileage may vary.

Blockchain Architecture Questions

Has the technical architecture and design been validated through peer review in independently peer-reviewed publication venues?

Which venues, and how reputable and selective are they? (For reference, the top systems venues are SOSP and OSDI, the top security/privacy venues are IEEE Security & Privacy and USENIX Security, and the top cryptography venue is CRYPTO.)

Who runs consensus nodes in the blockchain architecture, and how is membership decided?

Is the blockchain open to anyone (permissionless) or run among a fixed set of participants (permissioned)? What is the process for adding or removing members, and how is that process secured?

How is the consensus group membership governed, organizationally and technically?

Who decides when and how to deploy blockchain upgrades that might not be backward-compatible, thus requiring a hard-fork? How are software update roll-outs managed and secured?

How decentralized will the intended deployment be in practice?

How many independent organizations will be expected to run consensus nodes, initially and in the long term? Stated more bluntly, are we decentralized yet?

Does the architecture rely on energy-intensive proof-of-work mining?

Who deploys and operates the mining nodes? What are the energy and environmental costs of this mining, and are the trends sustainable? How decentralized is the distribution of mining power: e.g., how many of the largest miners or pools would need to collude to execute a 51% attack, or a 33% attack, on the blockchain?
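
One way to make the collusion question concrete is to sort the miners or pools by their share of mining power and count how many of the largest are needed to cross a given threshold. The sketch below does that; the function name and the share figures are hypothetical and purely illustrative, not measurements of any real blockchain.

```python
def min_colluders(shares, threshold):
    """Smallest number of the largest holders whose combined share strictly
    exceeds `threshold` (a fraction of total power), or None if impossible."""
    total = sum(shares)
    running = 0
    for count, share in enumerate(sorted(shares, reverse=True), start=1):
        running += share
        if running > threshold * total:
            return count
    return None


# Hypothetical mining-power shares in percent (illustration only).
pool_shares = [35, 20, 15, 10, 8, 7, 5]

majority = min_colluders(pool_shares, 1 / 2)  # pools needed for a 51% attack
one_third = min_colluders(pool_shares, 1 / 3)  # pools needed for a 33% attack
print(majority, one_third)
```

With these made-up shares, two pools suffice for a majority and a single pool already exceeds one third. The smaller these numbers are for a real deployment, the less decentralized its mining power is in practice.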

Security and Robustness Questions

Does the system have an open, independent security and privacy review and analysis?

If so, where may that independent security/privacy analysis be found?

Does the blockchain system implement true Byzantine consensus?

Is the consensus scheme fully robust not just to failed (crashed) but also fully-compromised (hacked, adversarial) consensus group participants? Or does the system rely on non-Byzantine consensus algorithms like Paxos that assume all parties are well-behaved and cannot ensure integrity if even just one participant is compromised?

How many compromised participants can the consensus protocol tolerate?

What number or percentage of consensus group members can be fully compromised (e.g., hacked, colluding, or under adversarial control) without compromising the integrity of the blockchain or the confidentiality of the data entrusted to it?
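
As a reference point when evaluating the answer: classical Byzantine consensus protocols in the PBFT family need at least 3f + 1 participants to tolerate f compromised ones, while crash-only protocols such as Paxos need 2f + 1 to tolerate f crashes and offer no integrity guarantee against compromise. The sketch below just evaluates those two bounds; the function names are mine, and a particular system may have a different threshold, especially for confidentiality as opposed to integrity.

```python
def max_byzantine_faults(n):
    """Largest f with n >= 3f + 1: the classical Byzantine consensus bound."""
    if n < 1:
        raise ValueError("need at least one participant")
    return (n - 1) // 3


def max_crash_faults(n):
    """Largest f with n >= 2f + 1: the bound for crash-only protocols."""
    if n < 1:
        raise ValueError("need at least one participant")
    return (n - 1) // 2


for n in (4, 7, 10, 100):
    print(n, max_byzantine_faults(n), max_crash_faults(n))
```

A vendor claiming to tolerate more compromised nodes than the first bound allows should be asked what additional assumptions (synchrony, trusted hardware, economic incentives) the claim depends on.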

Are there single points of failure or compromise outside of consensus?

For example: Does the system depend on a centralized Certificate Authority (CA) for naming and membership, which if compromised could allow the attacker to impersonate any and all of the blockchain participants simultaneously? Does the system rely on centralized “oracles” to obtain and report information from the outside world that smart contracts might need to depend on, such that smart contracts can be misled if that oracle service is hacked? Does the system rely on centralized off-chain data- or key-storage services to ensure the confidentiality of data or encryption keys entrusted to the blockchain, such that the compromise of that storage service could make all the stored data and/or keys readable and decryptable in bulk by an attacker without necessarily leaving access records on the blockchain? Does the system rely on centralized cloud-based login, control, and provisioning services that, if compromised, could enable an attacker to gain control of most or all of the blockchain nodes at once?

How can I know the blockchain will be available and have adequate transaction processing capacity when my business needs it?

What are the risks that other parties using the blockchain, acting independently of me, might innocently or deliberately consume enough resources to deny service or limit availability for my business’s needs, as occurred in the CryptoKitties incident, for example?

What provisions prevent misbehaving consensus nodes from censoring transactions?

Do relying parties (e.g., mobile apps accessing the blockchain) maintain connections to multiple consensus nodes or just one, which could be a single point of failure, compromise, or censorship? How can relying parties detect and recover if their access attempts are manipulated or censored?

Can offline or poorly-connected devices securely verify blockchain transactions?

Do edge devices designed to scan and verify blockchain-secured credentials become inoperative or insecure if they cannot connect to the consensus nodes? Or can edge devices such as credential scanners cryptographically verify transactions offline?
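
One common building block for offline verification is a Merkle inclusion proof: a device that already holds a trusted root hash (obtained earlier, for example from a block header it has validated) can check that a given transaction is included without contacting any node. The sketch below is a simplified illustration with hypothetical function names, not any particular blockchain's format; among other simplifications it omits the domain separation between leaf and interior hashes that a production design would want.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root_and_proof(leaves, index):
    """Build a Merkle tree over `leaves` and return (root, proof) for
    leaves[index]. The proof is a list of (sibling_hash, sibling_is_left)."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd levels
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof


def verify_offline(leaf, proof, trusted_root):
    """Recompute the root from the leaf and its proof; needs no network."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == trusted_root


txs = [b"tx0", b"tx1", b"tx2", b"tx3", b"tx4"]
root, proof = merkle_root_and_proof(txs, 2)
print(verify_offline(b"tx2", proof, root))     # genuine transaction
print(verify_offline(b"forged", proof, root))  # tampered transaction
```

The proof grows only logarithmically with the number of transactions, which is what makes it practical for a credential scanner or other edge device. How the device obtains and refreshes its trusted root is the part to ask the vendor about.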

What are the financial costs of blockchain transactions, currently and historically?

How can I mitigate the risk that unexpected future increases in transaction costs, whether temporary or permanent, could impose economically prohibitive costs on my blockchain-dependent business?

What are the energy and network bandwidth costs to resource-constrained edge devices needing to interact with the blockchain?

Do resource-constrained edge devices or mobile apps need to maintain connections and gossip with multiple consensus group members, or with just one? In the former case, what are the energy and bandwidth costs of this gossip? In the latter case, can that one reference node, if compromised, potentially censor the relying party’s access to the blockchain or trick it into using an “alternate-reality” blockchain inconsistent with the public one?

Performance and Scalability Questions

How scalable is the architecture to many consensus participants?

What is the maximum number of widely-distributed participants for which the blockchain system has been validated and performance-tested? Tens, hundreds, thousands, tens of thousands?

How long does it take to commit a transaction securely?

How are transaction latencies affected by consensus group size? When can I be overwhelmingly certain that a transaction can’t be reverted or rewritten after commitment? Were these transaction latencies tested on a wide-area deployment with global speed-of-light delays between consensus nodes, or in a local-area environment where all participants are close together?
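
For proof-of-work chains specifically, "overwhelmingly certain" is a probabilistic statement. The sketch below evaluates the attacker-success estimate from the original Bitcoin whitepaper: the chance that an attacker holding a fraction q of the mining power eventually overtakes the honest chain after a transaction is z blocks deep. It is a simplified model that assumes q stays below one half, and it does not apply to Byzantine consensus protocols that offer deterministic finality.

```python
from math import exp, factorial


def attacker_success_probability(q, z):
    """Whitepaper estimate of the chance that an attacker with mining-power
    fraction q (0 <= q < 0.5) catches up from z blocks behind."""
    if not 0 <= q < 0.5:
        raise ValueError("model assumes 0 <= q < 0.5")
    p = 1.0 - q
    lam = z * (q / p)
    total = 1.0
    for k in range(z + 1):
        poisson = exp(-lam) * lam ** k / factorial(k)
        total -= poisson * (1 - (q / p) ** (z - k))
    return total


for z in (1, 3, 6, 12):
    print(z, attacker_success_probability(0.10, z))
```

The probability drops roughly exponentially with z, but each extra confirmation adds a full block interval of latency, so security and commit time trade off directly in such systems.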

What is the system’s maximum demonstrated transaction throughput?

Was this throughput demonstrated in a wide-area or local-area deployment? What processing power and network bandwidth was required?

What does the system’s maximum transaction throughput depend on?

Can the system’s transaction-processing capacity scale gracefully as more participants and hardware resources are added to the system (e.g., via sharding), or is processing throughput limited to the slowest consensus group member?

Privacy and Confidentiality Questions

How can the blockchain handle private, confidential transaction data?

Do all the consensus group members see all transaction data, or just the participants that are directly involved in a given transaction? If confidential transaction data is encrypted, how are the secret encryption keys managed, and who holds those keys? Are the encryption keys held by a centralized party that, if compromised, could leak confidential data in bulk to an attacker?

Can the blockchain enforce global invariants on confidential transaction data?

If the amounts of coin or token in a transaction are encrypted and only known to the direct parties to the transaction, for example, can those parties collude to create fake coins or tokens “out of thin air” without the knowledge of the other validators or consensus group members? Or can the system somehow ensure that “the books balance” globally without revealing transaction amounts?
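
One known technique for making "the books balance" without revealing amounts is an additively homomorphic commitment, such as a Pedersen commitment: validators multiply the commitments of a transaction's inputs and outputs and check that the products match, without learning any amount. The sketch below is a toy illustration only. The parameters P, Q, G and H are tiny and insecure, chosen by me so the arithmetic is easy to follow, and a real system would also need range proofs to stop negative or wrapped-around amounts.

```python
# Toy parameters (insecure, illustration only): P = 2*Q + 1 with P, Q prime;
# G and H are squares mod P, so both generate the subgroup of order Q.
P, Q, G, H = 1019, 509, 4, 9


def commit(value, blinding):
    """Pedersen-style commitment G^value * H^blinding mod P."""
    return (pow(G, value % Q, P) * pow(H, blinding % Q, P)) % P


def product(commitments):
    result = 1
    for c in commitments:
        result = (result * c) % P
    return result


def books_balance(input_commitments, output_commitments):
    """Validator check: sees only commitments, never the amounts."""
    return product(input_commitments) == product(output_commitments)


# Parties spend inputs of 30 and 12 and create outputs of 40 and 2.
# Output blinding factors are chosen so the blinding sums also match.
inputs = [commit(30, 101), commit(12, 202)]
honest_outputs = [commit(40, 250), commit(2, 53)]
inflated_outputs = [commit(41, 250), commit(2, 53)]  # one coin from thin air

print(books_balance(inputs, honest_outputs))
print(books_balance(inputs, inflated_outputs))
```

The honest transaction passes and the inflated one fails, even though the validator never sees 30, 12, 40 or 2. Whether a given blockchain offers anything of this kind, and at what performance cost, is exactly what the question above is probing.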

Can the blockchain hide data that smart contracts depend on?

Does all the data that a smart contract uses in its calculations and decisions need to be exposed to all the blockchain miners or validators? Is there any approach available if a smart contract needs to make a decision based on sensitive confidential data?

Does the blockchain provide any form of metadata privacy for transactions?

Can the blockchain hide the identities of parties involved in transactions, or does it effectively reveal who’s transacting with whom even if the actual content of the transaction is encrypted? Can the blockchain hide or obscure the timing, pattern, or rate of transactions in any way, limiting information leakage to an observer who might, for example, infer that a burst of activity signifies an upcoming product release, merger, major trade, or other announcement?

Smart Contract Questions

Which language(s) are smart contracts written in?

Does the language have a clear and rigorous specification? What measures does the language take for mitigating risks of critical smart contract bugs? Has the specification been validated through independent security analysis and/or publication in peer-reviewed venues?

What are the costs of running simple, or complex, smart contracts?

What is the financial cost and latency to run a simple “if-then” type contract? What is the financial cost and latency to run complex smart contract code that, for example, needs to include cryptographic algorithms to verify non-native signatures or zero-knowledge proofs?

What is the recourse if something goes wrong due to a smart contract bug?

Does the blockchain architecture provide a systematic means to roll back or correct for the effects of smart contract bugs? How costly are those measures, e.g., do they require a global hard-fork, as was used to address the DAO incident? If there are less-costly measures, how are they activated, and how does the architecture protect against abuse (e.g., via costly over-use or “crying wolf” attacks)?

Does the smart contract virtual machine enforce strict determinism in the smart contract code?

Or can buggy or maliciously-written smart contract code potentially inject random or other nondeterministic results into its output, thereby preventing the validators from agreeing on the results of a transaction?
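
To see why determinism matters, consider validators that each execute the same contract on the same transaction and then compare a hash of the resulting state. The sketch below simulates this with hypothetical contract functions of my own; it is not modeled on any particular smart contract virtual machine.

```python
import hashlib
import json
import random


def deterministic_contract(state, tx):
    """Credits the recipient; depends only on its inputs."""
    new_state = dict(state)
    new_state[tx["to"]] = new_state.get(tx["to"], 0) + tx["amount"]
    return new_state


def nondeterministic_contract(state, tx, rng):
    """Same transfer, plus a value drawn from validator-local randomness."""
    new_state = deterministic_contract(state, tx)
    new_state["lottery"] = rng.random()
    return new_state


def result_digest(state):
    encoded = json.dumps(state, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()


def validators_agree(digests):
    return len(set(digests)) == 1


state = {"alice": 10}
tx = {"to": "bob", "amount": 5}
# Each simulated validator has its own local source of randomness.
validator_rngs = [random.Random(seed) for seed in range(4)]

good = [result_digest(deterministic_contract(state, tx)) for _ in validator_rngs]
bad = [result_digest(nondeterministic_contract(state, tx, rng))
       for rng in validator_rngs]

print(validators_agree(good))
print(validators_agree(bad))
```

The deterministic contract yields identical digests on every validator, while the one that consults local randomness yields a different digest on each, so no quorum can form around a single result. Local clocks, floating-point quirks, and unordered data structures can cause the same problem, which is why it is worth asking whether the virtual machine rules them out by construction.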