Ethereum is a deterministic but practically unbounded state machine, consisting of a globally accessible singleton state and a virtual machine that applies changes to that state.

It is an open source, globally decentralized computing infrastructure that executes programs called smart contracts.

It uses a blockchain to synchronize and store the system’s state changes, along with a cryptocurrency called ether to meter and constrain execution resource costs.

Ethereum’s purpose is not primarily to be a digital currency payment network. While the digital currency ether is both integral to and necessary for the operation of Ethereum, ether is intended as a utility currency to pay for use (running smart contracts) of the Ethereum platform as the world computer.

Ethereum is designed to be a general-purpose programmable blockchain that runs a virtual machine capable of executing code of arbitrary and unbounded complexity. Where Bitcoin’s Script language is, intentionally, constrained to simple true/false evaluation of spending conditions, Ethereum’s language is Turing complete, meaning that Ethereum can straightforwardly function as a general-purpose computer.

Ethereum was conceived at a time when people recognized the power of the Bitcoin model, and were trying to move beyond cryptocurrency applications. But developers faced a conundrum: they either needed to build on top of Bitcoin or start a new blockchain.

  • Building upon Bitcoin meant living within the intentional constraints of the network and trying to find workarounds. The limited set of transaction types, data types, and sizes of data storage seemed to limit the sorts of applications that could run directly on Bitcoin; anything else needed additional off-chain layers, and that immediately negated many of the advantages of using a public blockchain.
  • For projects that needed more freedom and flexibility while staying on-chain, a new blockchain was the only option. But that meant a lot of work: bootstrapping all the infrastructure elements, exhaustive testing, etc.

The idea behind Ethereum was that by using a general-purpose blockchain, a developer could program their particular application without having to implement the underlying mechanisms of peer-to-peer networks, blockchains, consensus algorithms, etc. The Ethereum platform was designed to abstract these details and provide a deterministic and secure programming environment for decentralized blockchain applications.

Like Bitcoin, Ethereum is a distributed state machine. Instead of tracking only the state of currency ownership, Ethereum tracks the state transitions of a general-purpose data store that can hold any data expressible as a key–value tuple.

Ethereum has memory that stores both code and data, and it uses the Ethereum blockchain to track how this memory changes over time. Ethereum loads code into its state machine and runs that code, storing the resulting state changes in its blockchain.

Two of the critical differences from most general-purpose computers are that Ethereum state changes are governed by the rules of consensus and the state is distributed globally.

Ethereum’s components:

  • Ethereum transactions are network messages that include (among other things) a sender, recipient, value, and data payload.
  • Ethereum state transitions are processed by the Ethereum Virtual Machine (EVM), a stack-based virtual machine that executes bytecode (machine-language instructions). EVM programs, called “smart contracts,” are written in high-level languages (e.g., Solidity) and compiled to bytecode for execution on the EVM.
  • Ethereum’s state is stored locally on each node as a database (usually Google’s LevelDB), which contains the transactions and system state in a serialized hashed data structure called a Merkle Patricia Tree.
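  • Ethereum originally used Bitcoin’s consensus model, Nakamoto Consensus, which uses sequential single-signature blocks, weighted in importance by proof of work (PoW) to determine the longest chain and therefore the current state. Since the Merge in September 2022, Ethereum mainnet has used proof of stake instead.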
  • A game-theoretically sound incentivization scheme (e.g., proof-of-work costs plus block rewards) to economically secure the state machine in an open environment.

Ethereum’s groundbreaking innovation is to combine the general-purpose computing architecture of a stored-program computer with a decentralized blockchain, thereby creating a distributed single-state (singleton) world computer that produces a common state secured by the rules of consensus.

Ethereum’s development culture is characterized by rapid innovation, rapid evolution, and a willingness to deploy forward-looking improvements, even if this is at the expense of some backward compatibility.

Developers can’t simply “upgrade” smart contracts. They must be prepared to deploy new ones, migrate users, apps, and funds, and start over.

In order to “evolve” the platform, developers have to be ready to scrap and restart smart contracts, which means they have to retain a certain degree of control over them.

Ether Currency Units

Ethereum’s currency unit is called ether, also identified as “ETH”.

Ethereum is the system; ether is the currency.

Ether is subdivided into smaller units, down to the smallest possible unit, named wei. One ether is 1 quintillion ($$10^{18}$$) wei. The value of ether is always represented internally in Ethereum as an unsigned integer value denominated in wei.
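
The integer-only representation can be illustrated with a small conversion sketch. The helper names `ether_to_wei` and `wei_to_ether` are made up for this example and are not part of any Ethereum library; the code uses plain integer arithmetic so that no precision is lost.

```python
WEI_PER_ETHER = 10**18  # 1 ether = 10^18 wei


def ether_to_wei(amount: str) -> int:
    """Convert a non-negative decimal ether amount (as a string) to integer wei."""
    whole, _, frac = amount.partition(".")
    if not (whole + frac).isdigit():
        raise ValueError("amount must be a non-negative decimal number")
    if len(frac) > 18:
        raise ValueError("ether has at most 18 decimal places")
    # Pad the fractional part to exactly 18 digits so it counts wei directly.
    return int(whole or "0") * WEI_PER_ETHER + int(frac.ljust(18, "0"))


def wei_to_ether(wei: int) -> str:
    """Format an integer wei amount as a decimal ether string with 18 places."""
    whole, frac = divmod(wei, WEI_PER_ETHER)
    return f"{whole}.{frac:018d}"


print(ether_to_wei("1.5"))
print(wei_to_ether(1))
```

Amounts are passed as strings rather than floats because a binary float cannot represent every 18-decimal value exactly.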

Two Types of Accounts

The type of account that a wallet creates is called an externally owned account (EOA). Externally owned accounts are those that have a private key; having the private key means control over access to funds or contracts.

The other type of account is a contract account.

  • A contract (account) has smart contract code, which a simple EOA can’t have.
  • A contract does not have a private key.
    • Contracts only run if they are called by a transaction.
    • A contract can call another contract that can call another contract, and so on, but the first contract in such a chain of execution will always have been called by a transaction from an EOA.
  • A contract is owned (and controlled) by the logic of its smart contract code: the software program recorded on the Ethereum blockchain at the contract account’s creation and executed by the EVM.
    • It has both associated code and data storage.
  • Contracts are not executed “in parallel” in any sense—the Ethereum world computer can be considered to be a single-threaded machine.

Contracts have addresses.

Contracts can also send and receive ether.
Contracts can send and receive ERC20 tokens only if they are designed to handle them (for example, using the approve and transferFrom workflow).

Interacting with Contracts

Registering a contract on the blockchain involves creating a special transaction with an empty recipient field. This destination is often described as the zero address, 0x0000000000000000000000000000000000000000, but the two are not the same at the protocol level: an empty recipient tells Ethereum that you want to register a contract, whereas a transaction sent to the literal zero address is an ordinary transfer.

To deploy a smart contract, you merely send an Ethereum transaction containing the compiled bytecode of the smart contract without specifying any recipient.

Once a contract is created on Ethereum, it has an Ethereum address.

Anytime someone sends a transaction to a contract address, it causes the contract to run in the EVM, with the transaction as its input. Transactions sent to contract addresses may have ether or data or both.

  • If they contain ether, it is “deposited” to the contract balance.
  • If they contain data, the data can specify a named function in the contract and call it, passing arguments to the function.
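
The cases above can be sketched as a toy model. This is only an illustration of the rules described in these notes, not the real transaction format: the `Transaction` class, the `describe` function, and the placeholder addresses are all invented for the example, and a missing recipient is modeled as `None`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Transaction:
    """Toy transaction holding only the fields named in these notes."""
    sender: str
    recipient: Optional[str]  # None models "no recipient" (contract creation)
    value: int = 0            # amount in wei
    data: bytes = b""         # bytecode or encoded function call


def describe(tx: Transaction, contract_addresses: set) -> str:
    """Classify a transaction according to the rules described above."""
    if tx.recipient is None:
        return "contract creation"
    if tx.recipient in contract_addresses:
        effects = []
        if tx.value:
            effects.append("deposit ether")
        if tx.data:
            effects.append("call function")
        return "contract call: " + (" and ".join(effects) or "no value, no data")
    return "payment to EOA"


contracts = {"0xc0ffee"}  # placeholder, not a real address
print(describe(Transaction("0xa11ce", None, data=b"\x60\x80"), contracts))
print(describe(Transaction("0xa11ce", "0xc0ffee", value=1, data=b"\x01"), contracts))
```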

When a contract function sends ether (for example, by running a transfer function), it generates an internal transaction (also called a message).

Ethereum Clients

Ethereum is defined by a formal specification called the “Yellow Paper” (periodically updated). This formal specification, in addition to various Ethereum Improvement Proposals, defines the standard behavior of an Ethereum client.

As a result of Ethereum’s clear formal specification, there are a number of independently developed, yet interoperable, software implementations of an Ethereum client.

Ethereum has a great diversity of client implementations running on the network. This diversity has proven to be an excellent defense against attacks on the network: exploiting a particular client’s implementation strategy merely inconveniences its developers while they patch the exploit, while other clients keep the network running almost unaffected.

Light clients validate block headers and use Merkle proofs to validate the inclusion of transactions in the blockchain and determine their effects, giving them a similar level of security to a full node.

Ethereum Address

In Ethereum, public keys may be represented as a serialization of 130 hexadecimal characters (65 bytes), using a standard serialization format documented in Standards for Efficient Cryptography (SEC1).

  • Ethereum uses uncompressed public keys (prefixed with hex 04);
  • The serialization concatenates the $$x$$ and $$y$$ coordinates of the public key: 04 + x-coordinate (32 bytes, or 64 hex) + y-coordinate (32 bytes, or 64 hex).

Ethereum addresses are hexadecimal numbers, identifiers derived from the last 20 bytes (least significant bytes) of the Keccak-256 hash of the public key.

  • Most often you will see Ethereum addresses (0x001d3f1ef827552ae1114027bd3ecf1f086ba0f9) with the prefix 0x that indicates they are hexadecimal-encoded.
  • The hex prefix 04 is not included when the address is calculated; only the 64 bytes of the x and y coordinates are hashed.
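
The serialization layout and the "last 20 bytes" rule can be sketched with plain string slicing. This example deliberately does not compute the Keccak-256 hash itself; it assumes the digest has been produced elsewhere. The function names and the input values are placeholders chosen for illustration, not a real key or digest.

```python
def split_uncompressed_pubkey(pubkey_hex: str) -> tuple:
    """Split a SEC1 uncompressed public key (130 hex chars) into (x, y)."""
    if len(pubkey_hex) != 130 or not pubkey_hex.startswith("04"):
        raise ValueError("expected prefix 04 followed by 128 hex characters")
    body = pubkey_hex[2:]  # drop the 04 prefix; only x || y is hashed
    return body[:64], body[64:]


def address_from_digest(digest_hex: str) -> str:
    """Keep the last 20 bytes (40 hex chars) of a 32-byte Keccak-256 digest."""
    if len(digest_hex) != 64:
        raise ValueError("expected a 32-byte digest as 64 hex characters")
    return "0x" + digest_hex[-40:]


# Placeholder inputs for illustration only.
x, y = split_uncompressed_pubkey("04" + "11" * 32 + "22" * 32)
print(x, y)
print(address_from_digest("ab" * 12 + "cd" * 20))
```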

Ethereum addresses are presented as raw hexadecimal without any checksum.

  • The rationale behind that decision was that Ethereum addresses would eventually be hidden behind abstractions (such as name services) at higher layers of the system and that checksums should be added at higher layers if necessary.
  • In reality, these higher layers were developed too slowly and this design choice led to a number of problems in the early days of the ecosystem, including the loss of funds due to mistyped addresses and input validation errors.

Ethereum uses the Keccak-256 cryptographic hash function in many places.

  • Keccak-256 was designed as a candidate for the SHA-3 Cryptographic Hash Function Competition held in 2007 by the National Institute of Standards and Technology (NIST). Keccak was the winning algorithm, which became standardized as Federal Information Processing Standard (FIPS) 202 in 2015.

    • During the period when Ethereum was developed, the NIST standardization was not yet finalized. NIST adjusted some of the parameters of Keccak after the completion of the standards process, allegedly to improve its efficiency.
    • At the same time, Edward Snowden revealed documents implying that NIST may have been improperly influenced by the National Security Agency to intentionally weaken the Dual_EC_DRBG random-number generator standard, effectively placing a backdoor in the standard random number generator. The result of this controversy was a backlash against the proposed changes and a significant delay in the standardization of SHA-3.
    • At the time, the Ethereum Foundation decided to implement the original Keccak algorithm, as proposed by its inventors, rather than the SHA-3 standard as modified by NIST.
  • Ethereum uses Keccak-256, even though it is often called SHA-3 in the code.

    # https://go.dev/play/p/hRLywsOHp5U
    Keccak256("") = c5d2460186f7233c927e7db2dcc703c0e500b653ca82273b7bfad8045d85a470
         SHA3("") = a7ffc6f8bf1ed76651c14756a061d662f580ff4de43b49fa82d80a4b80f8434a
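
The same distinction shows up in Python. As a quick check, the standard library's `hashlib.sha3_256` implements the finalized NIST standard, so it reproduces the SHA3 value above rather than the Keccak-256 value that Ethereum uses. The two constants are copied from the listing above; the helper name is arbitrary.

```python
import hashlib

# Digests of the empty string, copied from the listing above.
KECCAK256_EMPTY = "c5d2460186f7233c927e7db2dcc703c0e500b653ca82273b7bfad8045d85a470"
SHA3_256_EMPTY = "a7ffc6f8bf1ed76651c14756a061d662f580ff4de43b49fa82d80a4b80f8434a"


def stdlib_sha3_256(data: bytes) -> str:
    """Hash with the standard library's SHA3-256 (FIPS 202), not Keccak-256."""
    return hashlib.sha3_256(data).hexdigest()


digest = stdlib_sha3_256(b"")
print(digest == SHA3_256_EMPTY)   # matches the NIST-standard value
print(digest == KECCAK256_EMPTY)  # does not match Ethereum's Keccak-256
```

Computing Ethereum-compatible hashes in Python therefore needs a separate Keccak-256 implementation; `hashlib` alone is not enough.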
    

EIP-55

EIP-55 offers a backward-compatible checksum for Ethereum addresses. By modifying the capitalization of the alphabetic characters in the hexadecimal address, we can convey a checksum that protects the integrity of the address against typing or reading mistakes. Wallets that do not support EIP-55 checksums simply ignore the fact that the address contains mixed capitalization, but those that do support it can validate it and detect errors with 99.986% accuracy.

Encoding:

  1. Hash the lowercase address, without the 0x prefix;

    
    Keccak256("001d3f1ef827552ae1114027bd3ecf1f086ba0f9") =
    23a69c1653e4ebbb619b0b2cb8a9bad49892a8b9695d9a19d8f673ca991deae1
    
  2. Capitalize each alphabetic address character if the corresponding hex digit of the hash is greater than or equal to 0x8.

    
    Address: 001d3f1ef827552ae1114027bd3ecf1f086ba0f9
    Hash   : 23a69c1653e4ebbb619b0b2cb8a9bad49892a8b9...
    Result : 001d3F1ef827552Ae1114027BD3ECF1f086bA0F9
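
The two encoding steps can be sketched as follows. Because Python's `hashlib` does not provide the original Keccak-256, this sketch assumes the hash from step 1 has already been computed and takes it as an argument; the function name `eip55_checksum` is chosen for this example only.

```python
def eip55_checksum(address: str, keccak_hex: str) -> str:
    """Apply EIP-55 capitalization to a 40-hex-character address.

    keccak_hex is assumed to be the Keccak-256 digest (as hex) of the
    lowercase address without the 0x prefix, computed elsewhere.
    """
    address = address.lower().removeprefix("0x")
    if len(address) != 40 or len(keccak_hex) < 40:
        raise ValueError("expected a 40-character address and its hash")
    out = []
    for ch, h in zip(address, keccak_hex):
        # Uppercase a letter when the matching hash digit is >= 0x8.
        out.append(ch.upper() if ch.isalpha() and int(h, 16) >= 8 else ch)
    return "".join(out)


print(eip55_checksum(
    "001d3f1ef827552ae1114027bd3ecf1f086ba0f9",
    "23a69c1653e4ebbb619b0b2cb8a9bad49892a8b9695d9a19d8f673ca991deae1",
))
# prints 001d3F1ef827552Ae1114027BD3ECF1f086bA0F9
```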
    

Detecting an error:

  1. Misread an F as an E;

    
                                                     F
               001d3F1ef827552Ae1114027BD3ECF1f086bA0E9
    
    Keccak256("001d3f1ef827552ae1114027bd3ecf1f086ba0e9") =
    5429b5d9460122fb4b11af9cb88b7bb76d8928862e0a57d46dd18dd8e08a6927
    
  2. Check the capitalization.

    
    001d3F1ef827552Ae1114027BD3ECF1f086bA0E9
    5429b5d9460122fb4b11af9cb88b7bb76d892886...
    
    • Several of the alphabetic characters are incorrectly capitalized.
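
The capitalization check can be sketched in the same spirit. As before, the Keccak-256 hash is assumed to be computed elsewhere and passed in, and `eip55_mismatches` is a name invented for this example. It returns the positions where the capitalization disagrees with the hash, so an empty list means the checksum passes.

```python
def eip55_mismatches(address: str, keccak_hex: str) -> list:
    """Return positions whose capitalization disagrees with the hash.

    address is the mixed-case address without the 0x prefix; keccak_hex is
    assumed to be the Keccak-256 digest (hex) of its lowercase form.
    """
    bad = []
    for i, (ch, h) in enumerate(zip(address, keccak_hex)):
        if ch.isalpha():
            should_be_upper = int(h, 16) >= 8
            if ch.isupper() != should_be_upper:
                bad.append(i)
    return bad


# The correct address with its own hash: no mismatches.
print(eip55_mismatches(
    "001d3F1ef827552Ae1114027BD3ECF1f086bA0F9",
    "23a69c1653e4ebbb619b0b2cb8a9bad49892a8b9695d9a19d8f673ca991deae1",
))

# The misread address with the hash of its lowercase form: mismatches found.
print(eip55_mismatches(
    "001d3F1ef827552Ae1114027BD3ECF1f086bA0E9",
    "5429b5d9460122fb4b11af9cb88b7bb76d8928862e0a57d46dd18dd8e08a6927",
))
```

A single changed character alters the whole hash, which is why the error shows up as mismatches scattered across the address rather than only at the mistyped position.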

References