Blockchain - The Nuts & Bolts of Hashes, Merkle Trees, Private & Public Keys, and Smart Contracts
Database Design
Overview
Concept
A blockchain is a mechanism for storing data with a few properties that make it unusual.
-
It is peer-to-peer. There is no need for a central server or a central organization to pass messages among computers on the network.
-
It operates in a trustless environment, yet the data held within it is trustworthy.
-
It depends upon some form of reward for those who help maintain it.
-
The data held within it is immutable.
-
All data is public and verifiable.
Creation myth
The true identity of the inventor of blockchain is either unknown or disputed.
-
On October 31st, 2008, a user by the name of Satoshi Nakamoto published the Bitcoin white paper on a cryptography-focused email list.
-
A few likely suspects include Hal Finney (inventor of ‘reusable proof of work’), Wei Dai (inventor of b-money), Nick Szabo (inventor of bit gold), and Craig Wright (fiery Australian polymath and businessman).
-
Finney, Dai, and Szabo have all denied being Satoshi Nakamoto.
-
Wright claims that he is Satoshi Nakamoto and that he created Bitcoin with the help of Dave Kleiman, although he has evaded requests to provide proof of this claim.
Kleiman’s family is currently embroiled in a farcical legal dispute with Wright.
- Several notable crypto-celebrities, including Vitalik Buterin (founder of Ethereum), CZ (CEO of Binance), and Roger Ver (self-proclaimed ‘Bitcoin Jesus’) have called Wright a “fraud”.
Bitcoin basics
Example code
Some of the basic cryptography concepts mentioned in this section, including hashes and digital signatures, are demonstrated in Python code in a companion set of Jupyter Notebooks.
Hash
A cryptographic hashing function is an algorithm that can take data of an arbitrary length and produce a unique fixed-length ‘digest’ form of the data - the hash of the data.
-
The same data, when run multiple times through the same hashing function, will always produce the same hash. These functions are deterministic.
-
It is infeasible to find two different pieces of data that, when run through the same hashing function, produce the same hash. Hashes are thus effectively unique to the data from which they are derived.
-
It is infeasible to be able to discover the original data from its hash. Hashing functions are one-way.
-
Hash functions are computationally simple and fast.
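The properties listed above can be observed with Python's standard `hashlib` module. This is a minimal sketch that assumes SHA-256 as the hashing function; the helper name `sha256_hex` is made up for illustration.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()

short = sha256_hex(b"hello")
long_ = sha256_hex(b"hello" * 10_000)   # a much larger input
changed = sha256_hex(b"hellp")          # one character differs

# Deterministic: hashing the same data again gives the same digest.
print(short == sha256_hex(b"hello"))    # True

# Fixed length: 64 hex characters (256 bits) regardless of input size.
print(len(short), len(long_))           # 64 64

# A tiny change in the data yields a completely different-looking digest.
print(short)
print(changed)
```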
Hash (continued)
Hashes have two important use cases:
- Authenticity: a sender can send both a message and the hash of the message to a recipient.
The recipient can then run the message through the hashing function, and compare the resulting hash to the hash received from the sender. If the two match, then the message has not been tampered with by a third party.
- Proof of work: it is not possible to predict what the hash of any given data will look like.
Thus, if trying to find data that produces a hash with specific attributes, a brute force approach must be taken, where a message is modified and hashed repeatedly until a hash with the desired attributes is encountered.
Proof of work systems require that a message sender have performed a particular amount of computational work by rejecting messages that do not contain a hash meeting specific requirements.
The first such system, created in 1997, was called Hashcash, and was intended to make spam unprofitable by requiring emails to show proof of work.
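The authenticity check described above might look like the following sketch. It assumes SHA-256 and that the hash reaches the recipient intact; the function names `digest` and `is_untampered` and the example messages are hypothetical.

```python
import hashlib
import hmac

def digest(message: bytes) -> str:
    """Hash a message with SHA-256 and return the hex digest."""
    return hashlib.sha256(message).hexdigest()

def is_untampered(message: bytes, claimed_hash: str) -> bool:
    """Re-hash the received message and compare it to the hash that came with it."""
    # compare_digest compares in constant time, a common precaution.
    return hmac.compare_digest(digest(message), claimed_hash)

# The sender transmits both the message and its hash.
sent_message = b"Pay Alice 5 coins"
sent_hash = digest(sent_message)

print(is_untampered(b"Pay Alice 5 coins", sent_hash))    # True
print(is_untampered(b"Pay Mallory 5 coins", sent_hash))  # False
```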
Nonce
A nonce is arbitrary data added to a message in order to change the hash of that message so as to hopefully meet the hash requirements of a proof of work system.
-
The word, nonce, originated in the 13th century, meaning “occurring, used, or made only once or for a special occasion”.
-
If the message plus the nonce do not produce a hash with the desired attributes, a different nonce must be used, thus expending work.
PS:
- nonce in this usage is not to be confused with the British prison slang acronym, nonce, meaning “not on normal courtyard exercise.”
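A nonce search can be sketched in a few lines. The requirement used here (a SHA-256 hex digest starting with a given run of zeros) and the function name `find_nonce` are assumptions for illustration; real systems define their own requirements.

```python
import hashlib
from itertools import count

def find_nonce(message: bytes, prefix: str = "000") -> tuple[int, str]:
    """Try nonces 0, 1, 2, ... until hash(message + nonce) starts with `prefix`."""
    for nonce in count():
        candidate = message + str(nonce).encode()
        h = hashlib.sha256(candidate).hexdigest()
        if h.startswith(prefix):
            return nonce, h

nonce, h = find_nonce(b"hello world")
print(f"nonce={nonce} hash={h}")
```

Finding the nonce takes many attempts, but anyone can check the result with a single hash.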
Mining
Mining is the process of trying out different nonces over and over again until one is found which, when added to the data in the message, produces a hash that meets the requirements of the proof of work system.
-
The difficulty of mining depends upon the difficulty of the requirements of the proof of work scheme.
-
In cryptocurrencies that follow a proof of work system, miners that find a successful nonce are rewarded with newly-minted cryptocurrency coins.
-
Thus, while mining requires continual use of computing power and energy resources, it offers a chance of a reward.
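To see how the difficulty of the requirements drives the amount of work, the sketch below counts attempts at several difficulty levels. It assumes "difficulty" means the number of leading zero hex digits required in a SHA-256 digest, which is a simplification; `mine` is a hypothetical helper.

```python
import hashlib

def mine(data: bytes, difficulty: int) -> tuple[int, int]:
    """Return (nonce, attempts) for a hash with `difficulty` leading zero hex digits."""
    target_prefix = "0" * difficulty
    nonce = 0
    while True:
        h = hashlib.sha256(data + nonce.to_bytes(8, "big")).hexdigest()
        if h.startswith(target_prefix):
            return nonce, nonce + 1   # nonces start at 0, so attempts = nonce + 1
        nonce += 1

# Each extra zero digit multiplies the expected work by 16.
for difficulty in range(1, 5):
    nonce, attempts = mine(b"block data", difficulty)
    print(f"difficulty={difficulty}: {attempts} attempts "
          f"(expected about {16 ** difficulty})")
```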
Public key cryptography
Public key cryptography is a set of technologies for proving ownership of digital data. Two mathematically interrelated keys (text strings) are involved:
-
a private key, which should be kept private.
-
a public key, which can be shared publicly without concern.
It is derived from the private key by a one-way mathematical function.
Public key cryptography (continued)
There are two main uses of public key cryptography:
- Privacy - any digital message can be encrypted with the intended recipient’s public key.
Only the recipient, who holds the corresponding private key, will be able to decrypt the message.
- Authentication - a sender of a message can use their own private key to “sign” the message, producing a digital signature that can safely be sent with the message.
The recipient can use the sender’s public key to verify the authenticity of that signature, and thus the message.
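Both uses can be illustrated with textbook RSA on deliberately tiny numbers. This is a toy sketch only: it assumes RSA as the scheme (Bitcoin itself signs with ECDSA), uses primes far too small to be secure, and omits the padding that real libraries apply. All names are made up for illustration.

```python
import hashlib

# Toy key generation from two small primes (real keys use primes hundreds of digits long).
p, q = 61, 53
n = p * q                   # modulus shared by both keys
phi = (p - 1) * (q - 1)
e = 17                      # public exponent
d = pow(e, -1, phi)         # private exponent: modular inverse of e (Python 3.8+)

public_key, private_key = (e, n), (d, n)

def apply_key(value: int, key: tuple[int, int]) -> int:
    """Raise `value` to the key's exponent modulo n."""
    exponent, modulus = key
    return pow(value, exponent, modulus)

def message_hash(message: bytes) -> int:
    """Reduce a SHA-256 digest into the range of the tiny modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n

def sign(message: bytes) -> int:
    return apply_key(message_hash(message), private_key)

def verify(message: bytes, signature: int) -> bool:
    return apply_key(signature, public_key) == message_hash(message)

# Privacy: encrypt with the recipient's public key, decrypt with their private key.
ciphertext = apply_key(65, public_key)
print(apply_key(ciphertext, private_key))   # 65

# Authentication: sign with the private key, verify with the public key.
signature = sign(b"I owe Bob 5 coins")
print(verify(b"I owe Bob 5 coins", signature))            # True
print(verify(b"I owe Bob 5 coins", (signature + 1) % n))  # False
```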
Privacy
Authentication
Addresses
In blockchains, each user has an address from which they can send or receive cryptocurrency transactions.
- This address is nothing but a shortened form of the user’s public key. The user keeps the corresponding private key secret.
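One way to shorten a public key into an address is to hash it and keep part of the digest. The sketch below is a simplified stand-in, not any chain's actual scheme: it assumes SHA-256 and a 20-byte truncation, whereas Bitcoin and Ethereum each use their own hash functions and encodings. The key bytes are a placeholder, not a real key.

```python
import hashlib

def toy_address(public_key: bytes) -> str:
    """Hash the public key and keep the last 20 bytes, shown as hex."""
    digest = hashlib.sha256(public_key).digest()
    return "0x" + digest[-20:].hex()

# Placeholder bytes standing in for a 65-byte public key.
public_key = bytes.fromhex("04" + "ab" * 64)
address = toy_address(public_key)
print(address)
print(len(public_key), "key bytes ->", len(address) - 2, "hex characters")
```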
Merkle tree
A Merkle Tree, or hash tree, is a hash of hashes, where the top-most hash can be considered a fixed-length digest of the data beneath it.
-
The data in the leaf nodes is hashed, and those hashes are hashed, until all hashes converge at the top-most hash, the Merkle Root.
-
It is possible to verify that a given piece of data in a leaf node is present in a given hash tree without re-hashing all data in the tree.
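The following sketch builds a Merkle Root and checks one leaf using only its sibling hashes along the path to the root. It assumes SHA-256 and that an odd-sized level duplicates its last hash (one common convention); the function names are hypothetical.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root_and_proof(leaves: list[bytes], index: int):
    """Return (root, proof); proof is a list of (sibling_hash, sibling_is_left)."""
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])          # duplicate the last hash on odd levels
        sibling = index ^ 1                  # the neighbour within the same pair
        proof.append((level[sibling], sibling < index))
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof

def verify_leaf(leaf: bytes, proof, root: bytes) -> bool:
    """Recompute the path from one leaf up to the root using only the proof."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root

leaves = [b"tx1", b"tx2", b"tx3", b"tx4", b"tx5"]
root, proof = merkle_root_and_proof(leaves, index=2)

print(len(proof))                           # 3 hashes instead of all 5 leaves
print(verify_leaf(b"tx3", proof, root))     # True
print(verify_leaf(b"txX", proof, root))     # False
```

The proof grows with the height of the tree rather than with the number of leaves, which is why verification stays cheap for large data sets.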
Block
At last, we come to the foundational concept of the blockchain - the block! Each block includes:
-
the hash of the previous block, thus creating a chain where each block references the previous block.
-
a nonce that, when combined with the other data in the block, successfully produces a hash that meets specific requirements in a proof of work scheme.
-
a Merkle Root that is the top of a hash tree.
-
a set of individual transactions, each digitally signed by the sender using their private key, which together make up the data in the leaf nodes of the hash tree.
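The four ingredients listed above can be assembled into a toy block. This is a sketch under simplifying assumptions: transactions are plain strings rather than signed structures, the proof of work requirement is a run of leading zero hex digits, and the class and function names are invented for illustration.

```python
import hashlib
from dataclasses import dataclass

def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode()).hexdigest()

def merkle_root(transactions: list[str]) -> str:
    """Hash pairs of hashes upward until a single root remains."""
    level = [sha256_hex(tx) for tx in transactions] or [sha256_hex("")]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])          # duplicate the last hash on odd levels
        level = [sha256_hex(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]

@dataclass
class Block:
    previous_hash: str
    transactions: list[str]   # stand-ins for signed transactions
    merkle_root: str
    nonce: int = 0

    def block_hash(self) -> str:
        # The hash covers the previous hash, the Merkle Root, and the nonce.
        return sha256_hex(f"{self.previous_hash}{self.merkle_root}{self.nonce}")

def mine_block(previous_hash: str, transactions: list[str], difficulty: int = 3) -> Block:
    """Increment the nonce until the block hash meets the requirement."""
    block = Block(previous_hash, transactions, merkle_root(transactions))
    while not block.block_hash().startswith("0" * difficulty):
        block.nonce += 1
    return block

genesis = mine_block("0" * 64, ["coinbase -> alice: 50"])
second = mine_block(genesis.block_hash(), ["alice -> bob: 5", "alice -> carol: 1"])
print(second.previous_hash == genesis.block_hash())   # True
print(second.nonce, second.block_hash())
```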
Block (continued once more)
Since each new block includes the hash of the previous block, any tampering with data in previous blocks (which would change their hashes) would invalidate subsequent blocks.
- Any computer running the full blockchain software validates all blocks and detects fraudulent blocks.
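A simplified picture of how a node could detect tampering: recompute every block's hash and check each link to the previous block. The sketch leaves out proof of work and signatures, and the dictionary layout and function names are assumptions.

```python
import hashlib

GENESIS_PREVIOUS_HASH = "0" * 64

def block_hash(previous_hash: str, data: str) -> str:
    return hashlib.sha256((previous_hash + data).encode()).hexdigest()

def build_chain(items: list[str]) -> list[dict]:
    chain, prev = [], GENESIS_PREVIOUS_HASH
    for data in items:
        block = {"previous_hash": prev, "data": data,
                 "hash": block_hash(prev, data)}
        chain.append(block)
        prev = block["hash"]
    return chain

def is_valid(chain: list[dict]) -> bool:
    prev = GENESIS_PREVIOUS_HASH
    for block in chain:
        if block["previous_hash"] != prev:
            return False      # link to the previous block is broken
        if block["hash"] != block_hash(block["previous_hash"], block["data"]):
            return False      # stored hash no longer matches the data
        prev = block["hash"]
    return True

chain = build_chain(["alice pays bob 5", "bob pays carol 2", "carol pays dan 1"])
print(is_valid(chain))        # True

# Tamper with the first block's data.
chain[0]["data"] = "alice pays mallory 5"
print(is_valid(chain))        # False

# Recomputing that block's own hash does not help: the next block's link breaks.
chain[0]["hash"] = block_hash(chain[0]["previous_hash"], chain[0]["data"])
print(is_valid(chain))        # False
```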
Block (continued yet again)
Since the proof of work scheme requires that a block’s hash meet certain specific requirements which are time-consuming to achieve, there is an intentional latency to the system.
-
In the Bitcoin system, it takes on average about 10 minutes for one of the computers on the network to produce a viable block.
-
The difficulty of the proof of work requirements is automatically adjusted every 2 weeks to keep this pace, even as the number of computers on the network and the strength of their processing power are in continual flux.
-
A typical computer on the network will solve the hashing problem once every 2 years, on average.
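The periodic adjustment can be sketched as a simple proportional rule: if the last period's blocks arrived faster than planned, raise the difficulty, and vice versa. The constants below follow from the 10-minute pace and 2-week period stated above; the cap on how far a single adjustment can move is an assumption of this sketch, as is the function name `retarget`.

```python
TARGET_BLOCK_SECONDS = 10 * 60
BLOCKS_PER_PERIOD = 2016          # two weeks of 10-minute blocks
MAX_ADJUSTMENT = 4.0              # assumed cap on a single adjustment

def retarget(old_difficulty: float, actual_period_seconds: float) -> float:
    """Scale difficulty by how much faster or slower the last period was than planned."""
    expected = TARGET_BLOCK_SECONDS * BLOCKS_PER_PERIOD
    ratio = expected / actual_period_seconds
    ratio = max(1 / MAX_ADJUSTMENT, min(MAX_ADJUSTMENT, ratio))
    return old_difficulty * ratio

two_weeks = 14 * 24 * 60 * 60
print(retarget(100.0, two_weeks))        # 100.0 (on pace, no change)
print(retarget(100.0, two_weeks / 2))    # 200.0 (blocks came twice as fast)
print(retarget(100.0, two_weeks * 2))    # 50.0  (blocks came half as fast)
```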
Transactions
Each transaction stored in a block includes details about the sender, the recipient, and the amount of funds to transfer.
-
A finite number of transactions fit into the memory space of each block.
-
Each transaction is signed by the sender, using their private key. This signature is included as part of the transaction data.
The authenticity of this signature can be validated by anyone, by simply using the sender’s public key.
-
All computers running the blockchain software can validate each transaction.
-
The data for a transaction shows inputs (addresses from which funds will be taken) and outputs (addresses to which those funds will be sent).
-
If the sum of the input funds is more than the sum being sent to the outputs, then the sender can choose to return the excess balance to themselves, or provide it as a tip for the miners.
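The relationship between inputs, outputs, and the miner's tip comes down to simple arithmetic. In this sketch the `Transaction` class, the address labels, and the integer amounts are all hypothetical, and signatures are left out.

```python
from dataclasses import dataclass

@dataclass
class Transaction:
    inputs: dict[str, int]    # address -> amount taken from it
    outputs: dict[str, int]   # address -> amount sent to it

    def tip(self) -> int:
        """Whatever the outputs leave unclaimed goes to the miner."""
        total_in = sum(self.inputs.values())
        total_out = sum(self.outputs.values())
        if total_out > total_in:
            raise ValueError("outputs exceed inputs")
        return total_in - total_out

# Alice spends 10: 7 to Bob, 2 back to herself as change, 1 left as a tip.
tx = Transaction(inputs={"alice": 10}, outputs={"bob": 7, "alice": 2})
print(tx.tip())   # 1
```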
Mining (continued)
Computers running the full version of the blockchain software (a.k.a. ‘full nodes’) compete with one another to solve the hashing challenge of the proof of work system.
- Each full node picks a set of transactions from a shared list of pending transactions (the mempool) that all nodes pass around the network.
They tend to pick those transactions that offer them the highest tip, if any are available.
-
The first computer to produce a hash that meets the proof of work requirements is able to publish their block by sending it around to the other full nodes on the network, and is rewarded with newly-minted cryptocurrency.
-
The other full nodes will accept the new block once they validate that its hash meets the proof of work requirements.
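How a node might choose transactions from the mempool can be sketched as a greedy selection. Ranking by tip per unit of size, rather than by raw tip, is an assumption of this sketch, as are the field names and the size limit.

```python
def select_transactions(mempool: list[dict], max_block_size: int) -> list[dict]:
    """Greedily pick the transactions paying the highest tip per unit of size."""
    chosen, used = [], 0
    ranked = sorted(mempool, key=lambda tx: tx["tip"] / tx["size"], reverse=True)
    for tx in ranked:
        if used + tx["size"] <= max_block_size:   # only a finite number fit
            chosen.append(tx)
            used += tx["size"]
    return chosen

mempool = [
    {"id": "a", "tip": 5,  "size": 250},
    {"id": "b", "tip": 50, "size": 500},
    {"id": "c", "tip": 1,  "size": 100},
    {"id": "d", "tip": 30, "size": 400},
]

block_txs = select_transactions(mempool, max_block_size=1000)
print([tx["id"] for tx in block_txs])      # ['b', 'd', 'c']
print(sum(tx["tip"] for tx in block_txs))  # 81
```

Transaction `a` pays more per unit of size than `c`, but it no longer fits once `b` and `d` are in the block.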
Smart Contracts
Concept
While Bitcoin included a limited scripting language that allowed some programmable logic to be placed inside of any transaction, Ethereum is the blockchain that took this concept to the next level.
-
A “smart contract”, a term coined by Nick Szabo, is a program.
-
In Ethereum, smart contracts have addresses, just like user accounts, and can send or receive cryptocurrency to/from other accounts.
-
Like everything on the blockchain, smart contracts are public and validated and executed by all full nodes on the network.
The entire network is considered a single virtual machine execution environment.
-
Any account - be it a user account or a smart contract - can send cryptocurrency to a smart contract to trigger one of its functions to be executed.
-
Functions in a smart contract typically transfer cryptocurrency or tokens to and from addresses on the blockchain.
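The idea of a program that has its own address and moves funds when triggered can be simulated in plain Python. This is not Ethereum code: real contracts are typically written in languages such as Solidity and executed by every full node. `ToyLedger`, `SplitterContract`, and the addresses are invented for illustration.

```python
class ToyLedger:
    """A stand-in for the blockchain's record of account balances."""

    def __init__(self, balances: dict[str, int]):
        self.balances = dict(balances)

    def transfer(self, sender: str, recipient: str, amount: int) -> None:
        if amount < 0 or self.balances.get(sender, 0) < amount:
            raise ValueError("insufficient funds")
        self.balances[sender] = self.balances.get(sender, 0) - amount
        self.balances[recipient] = self.balances.get(recipient, 0) + amount

class SplitterContract:
    """Forwards each payment it receives in equal shares to fixed recipients."""

    def __init__(self, address: str, ledger: ToyLedger, recipients: list[str]):
        self.address = address        # the contract has an address, like a user
        self.ledger = ledger
        self.recipients = recipients

    def receive(self, sender: str, amount: int) -> None:
        """Triggered when an account sends funds to the contract."""
        self.ledger.transfer(sender, self.address, amount)
        share = amount // len(self.recipients)
        for recipient in self.recipients:
            self.ledger.transfer(self.address, recipient, share)

ledger = ToyLedger({"alice": 100})
contract = SplitterContract("contract-1", ledger, ["bob", "carol"])
contract.receive("alice", 10)
print(ledger.balances)
# {'alice': 90, 'contract-1': 0, 'bob': 5, 'carol': 5}
```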
Dapps
Decentralized applications, similar to the apps that most people are familiar with on the web, can be hosted on a blockchain, in combination with related decentralized platforms.
-
From a user’s perspective, dapps appear to operate just like regular apps.
-
Decentralized file storage services, like IPFS, allow the files that make up the app to be stored across the entire network rather than on a single server.
-
Smart contracts allow the coin- or token-related functions to be hosted on a blockchain, rather than on a single server.
-
Integration with cryptocurrency wallets allows users of dapps to digitally sign transfers of cryptocurrency or tokens to authenticate them.
-
Check out some examples
Tokens
Tokens are arbitrary representations of value.
-
Tokens can be created (i.e. ‘minted’) by smart contracts and sent and received to or from any address on the blockchain.
-
Ethereum’s ERC-20, ERC-721, and other standards attempt to standardize which functions and behaviors should be included in smart contracts that issue fungible and non-fungible tokens.
This standardization allows for interoperability among smart contracts and even between different blockchains.
-
The value of a token is determined by what you can do with that token or by what else you might trade that token for.
-
Some tokens act like reputation points within social networks and gaming environments.
Other tokens act as symbolic representations of ownership or possession of physical goods.
Still other tokens are useless except for the fact that you can trade them for other tokens or fiat currency.
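At its core, a fungible token contract is a table of balances plus rules for changing it. The Python sketch below is loosely modeled on the balance and transfer functions that ERC-20 standardizes; it is not an implementation of that standard, and the class name, the explicit `caller` and `sender` arguments, and the minting rule are assumptions.

```python
class ToyToken:
    """A fungible-token ledger: who holds how many tokens."""

    def __init__(self, minter: str):
        self.minter = minter
        self.total_supply = 0
        self.balances: dict[str, int] = {}

    def balance_of(self, address: str) -> int:
        return self.balances.get(address, 0)

    def mint(self, caller: str, to: str, amount: int) -> None:
        """Create new tokens; only the designated minter may do so."""
        if caller != self.minter:
            raise PermissionError("only the minter may create tokens")
        self.balances[to] = self.balance_of(to) + amount
        self.total_supply += amount

    def transfer(self, sender: str, to: str, amount: int) -> None:
        """Move tokens between addresses without changing the total supply."""
        if amount < 0 or self.balance_of(sender) < amount:
            raise ValueError("insufficient balance")
        self.balances[sender] = self.balance_of(sender) - amount
        self.balances[to] = self.balance_of(to) + amount

token = ToyToken(minter="issuer")
token.mint("issuer", "alice", 100)
token.transfer("alice", "bob", 30)
print(token.balance_of("alice"), token.balance_of("bob"), token.total_supply)
# 70 30 100
```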
DeFi
The world of decentralized finance (DeFi) - finance on the blockchain - has become extremely popular within the last year.
-
DeFi is growing to become a parallel financial system, with all kinds of financial instruments, old and new, on offer: borrowing, lending, crowdfunding, buying shares, taking long or short positions on the market, trading on “foreign currency” exchanges, providing liquidity, farming, and more.
-
Cryptocurrency, as “programmable money”, allows people, funds, smart contracts and bots to perform any financial operation imaginable with little-to-no regulation.
-
Warning: this environment is rife with speculators trying to make quick fortunes by inventing, betting on, and shilling cryptocurrencies or tokens with little-to-no inherent value. Beware of rug pulls.
Concerns
Concept
Given the relative newness of blockchain, the incessant touting of its disruptive capabilities, and its explosive growth as a means of financial speculation and perhaps exploitation, many concerns about it have cropped up, including:
- Does blockchain have a use outside of cryptocurrency?
Listen to crypto explainer Andreas Antonopoulos consider “Blockchain vs. Bullshit”.
- Isn’t blockchain a giant waste of energy that is killing the environment?
View the Cambridge Centre for Alternative Finance’s comparisons of Bitcoin’s energy consumption to other common energy draws.
- Isn’t blockchain designed to let criminals evade the government?
Watch self-proclaimed Satoshi, Craig Wright, denounce decentralization as “a lie” and the Brookings Institution tout its crime-fighting properties.
- Can’t the government just kill it once it becomes too big?
Nobody is really sure if this is possible.
Some nations, like China, are trying to subvert it.
Conclusions
Thank you. Bye.