Ethereum Transactions

Transactions are atomic:

Transactions execute in their entirety, with any changes in the global state (contracts, accounts, etc.) recorded only if all execution terminates successfully.
- Successful termination means that the program executed without an error and reached the end of execution.
- If execution fails due to an error, all of its effects (changes in state) are “rolled back” as if the transaction never ran.
A failed transaction is still recorded as having been attempted, and the ether spent on gas for the execution is deducted from the originating account, but it has no other effect on contract or account state.
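
The all-or-nothing rule can be illustrated with a toy state machine. This is only a sketch under simplifying assumptions: the state is a plain dictionary of balances, the gas fee is a fixed number, and `apply_transaction` and `ExecutionError` are hypothetical names invented for this illustration, not part of any Ethereum client.

```python
import copy


class ExecutionError(Exception):
    """Raised by a toy transaction body to signal a failed execution."""


def apply_transaction(state, sender, gas_fee, body):
    """Run `body` on a scratch copy of `state` and commit only on success.

    The gas fee is charged to `sender` in both cases, mirroring the rule
    that a failed transaction still pays for the work it consumed.
    """
    scratch = copy.deepcopy(state)
    try:
        body(scratch)
        succeeded = True
    except ExecutionError:
        scratch = copy.deepcopy(state)  # discard every partial change
        succeeded = False
    scratch[sender] -= gas_fee
    return scratch, succeeded


def good_transfer(balances):
    balances["alice"] -= 30
    balances["bob"] += 30


def failing_transfer(balances):
    balances["alice"] -= 30
    balances["bob"] += 30
    raise ExecutionError("error after partial updates")


state = {"alice": 100, "bob": 0}
after_ok, ok_flag = apply_transaction(state, "alice", 2, good_transfer)
after_fail, fail_flag = apply_transaction(state, "alice", 2, failing_transfer)
print(after_ok, ok_flag)
print(after_fail, fail_flag)
```

In the failing case only the gas fee leaves the sender's balance; the partial transfer is discarded.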

Transaction Structure

A transmitted transaction contains the following fields:

Nonce: A sequence number, issued by the originating EOA, used to prevent message replay;
Gas price: The price of gas (in wei) the originator is willing to pay;
Gas limit: The maximum amount of gas the originator is willing to buy for this transaction;
Recipient: The destination Ethereum address;
Value: The amount of ether to send to the destination;
Data: The variable-length binary data payload;
v,r,s: The three components of an ECDSA digital signature of the originating EOA.

Information that is derived from a transaction rather than transmitted in it:
- The EOA’s public key can be derived from the v,r,s components of the ECDSA signature. The sender address can, in turn, be derived from the public key;
- Block number;
- Transaction ID.

The transaction message’s structure is serialized using the Recursive Length Prefix (RLP) encoding scheme.

RLP’s length prefix is used to identify the length of each field. Anything beyond the defined length belongs to the next field in the structure.
Field labels (to, gas limit, etc.) are not part of the transaction serialized data.
In general, RLP does not contain any field delimiters or labels.

All numbers in Ethereum are encoded as big-endian integers, of lengths that are multiples of 8 bits.
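
A minimal sketch of the RLP rules may make the "length prefix, no labels" idea concrete. It assumes items are byte strings, non-negative integers (encoded as minimal big-endian bytes), or lists of those. The function names and the sample field values, including the recipient of 20 repeated 0x11 bytes, are made up for illustration, and the field list omits the signature values.

```python
def to_big_endian(number: int) -> bytes:
    """Minimal big-endian byte string for a non-negative integer (0 -> b'')."""
    return number.to_bytes((number.bit_length() + 7) // 8, "big")


def _length_prefix(length: int, offset: int) -> bytes:
    """Short form for lengths below 56, otherwise length-of-length form."""
    if length < 56:
        return bytes([offset + length])
    length_bytes = to_big_endian(length)
    return bytes([offset + 55 + len(length_bytes)]) + length_bytes


def rlp_encode(item) -> bytes:
    """Encode bytes, non-negative ints, or (nested) lists of them."""
    if isinstance(item, int):
        item = to_big_endian(item)
    if isinstance(item, (bytes, bytearray)):
        if len(item) == 1 and item[0] < 0x80:
            return bytes(item)  # a single small byte is its own encoding
        return _length_prefix(len(item), 0x80) + bytes(item)
    if isinstance(item, (list, tuple)):
        payload = b"".join(rlp_encode(element) for element in item)
        return _length_prefix(len(payload), 0xC0) + payload
    raise TypeError(f"cannot RLP-encode {type(item).__name__}")


# Fields appear in a fixed order; no field names are serialized.
fields = [
    0,                        # nonce
    20 * 10**9,               # gas price in wei (illustrative)
    21000,                    # gas limit
    bytes.fromhex("11" * 20), # recipient (made-up address)
    10**18,                   # value in wei
    b"",                      # data
]
encoded = rlp_encode(fields)
print(encoded.hex())
```

A decoder recovers the fields purely from the prefixes and their position in the list, which is why labels are unnecessary.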

Nonce

The nonce is an attribute of the originating address; it only has meaning in the context of the sending address.

The nonce is stored as part of the account’s state on the blockchain. For an EOA, it equals the number of confirmed (i.e., on-chain) transactions that have originated from the address.

If at any point the gas supply is reduced to zero we get an “Out of Gas” (OOG) exception; execution immediately halts and the transaction is abandoned. No changes to the Ethereum state are applied, except for the sender’s nonce being incremented anyway and their ether balance going down to pay the block’s beneficiary for the resources used to execute the code to the halting point.

The nonce provides important features:

Transactions being included in the order of creation;
Transaction duplication protection.

When you create a new transaction, you assign the next nonce in the sequence. But until it is confirmed, it will not count toward the getTransactionCount total.

When you build an application that constructs transactions, it cannot rely on getTransactionCount for pending transactions. Only when the pending and confirmed counts are equal (all outstanding transactions are confirmed) can you trust the output of getTransactionCount to start your nonce counter.
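
One way to follow this advice is to keep a local counter in the application. The sketch below assumes the starting value was read from getTransactionCount at a moment when nothing was pending; `NonceTracker` is a hypothetical helper, not part of any library.

```python
class NonceTracker:
    """Hands out consecutive nonces for one sending address."""

    def __init__(self, confirmed_count: int):
        # Seed with the confirmed count, taken while no transactions
        # were pending, so the first allocated nonce is the next in sequence.
        self._next_nonce = confirmed_count

    def allocate(self) -> int:
        """Return the nonce for the next transaction and advance the counter."""
        nonce = self._next_nonce
        self._next_nonce += 1
        return nonce


tracker = NonceTracker(confirmed_count=40)
nonces = [tracker.allocate() for _ in range(3)]
print(nonces)
```

The counter advances as soon as a transaction is built, so several transactions can be created in a row without waiting for each to confirm.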

The Ethereum network processes transactions sequentially, based on the nonce.

If you create several transactions in sequence and one of them does not get officially included in any blocks, all the subsequent transactions will be “stuck,” waiting for the missing nonce.

A transaction can create an inadvertent “gap” in the nonce sequence because it is invalid or has insufficient gas.

To get things moving again, you have to transmit a valid transaction with the missing nonce.

Once a transaction with the “missing” nonce is validated by the network, all the broadcast transactions with subsequent nonces will incrementally become valid. Note that it is not possible to “recall” a transaction once it has been broadcast.

If you accidentally duplicate a nonce, for example by transmitting two transactions with the same nonce but different recipients or values, then one of them will be confirmed and one will be rejected.

Which one is confirmed will be determined by the sequence in which they arrive at the first validating node that receives them—i.e., it will be fairly random.
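
The gap and duplicate behavior described above can be modeled with a small queue for a single sender. This is a toy sketch, not how any real client is implemented: `SenderQueue` is a hypothetical class, and it assumes the simple first-seen-wins rule for duplicated nonces.

```python
class SenderQueue:
    """Toy model of one node's handling of a single sender's transactions."""

    def __init__(self, confirmed_count: int):
        self.next_nonce = confirmed_count
        self.waiting = {}    # nonce -> payload, held until the gap closes
        self.included = []   # payloads in the order they became valid

    def receive(self, nonce: int, payload: str) -> bool:
        """Accept a transaction unless its nonce is already used or claimed."""
        if nonce < self.next_nonce or nonce in self.waiting:
            return False  # duplicate nonce: the earlier arrival wins
        self.waiting[nonce] = payload
        self._drain()
        return True

    def _drain(self) -> None:
        """Include every waiting transaction that is now next in sequence."""
        while self.next_nonce in self.waiting:
            self.included.append(self.waiting.pop(self.next_nonce))
            self.next_nonce += 1


queue = SenderQueue(confirmed_count=0)
queue.receive(0, "tx-a")
queue.receive(2, "tx-c")            # stuck: nonce 1 is missing
queue.receive(3, "tx-d")            # stuck behind the same gap
stuck_snapshot = list(queue.included)
queue.receive(1, "tx-b")            # fills the gap, releasing 2 and 3
duplicate_accepted = queue.receive(1, "tx-b-again")
print(stuck_snapshot, queue.included, duplicate_accepted)
```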

Gas

Gas is not ether—it’s a separate virtual currency with its own exchange rate against ether.

Gas is separate from ether in order to protect the system from the volatility that might arise along with rapid changes in the value of ether, and also as a way to manage the important and sensitive ratios between the costs of the various resources that gas pays for (namely, computation, memory, and storage).

The gasPrice field in a transaction allows the transaction originator to set the price they are willing to pay in exchange for gas.

The price is measured in wei per gas unit.
The minimum value that gasPrice can be set to is zero, which means a fee-free transaction.
- That means that wallets can generate completely free transactions.
- During periods of low demand for space in a block, such transactions might get mined.

gasLimit gives the maximum number of units of gas the transaction originator is willing to buy in order to complete the transaction.

For simple payments (transactions that transfer ether from one EOA to another EOA), the gas amount needed is fixed at 21000 gas units.
If the transaction’s destination address is a contract, then the amount of gas needed can be estimated but cannot be determined with accuracy. That’s because a contract can evaluate different conditions that lead to different execution paths, with different total gas costs.

When you transmit your transaction, one of the first validation steps is to check that the account it originated from has enough ether to pay the gasPrice * gasLimit fee.

The amount is not actually deducted from your account until the transaction finishes executing.
You are only billed for gas actually consumed by your transaction, but you must have enough balance for the maximum amount you are willing to pay before you send your transaction.
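
The arithmetic behind these two rules is sketched below. The function name and the sample numbers are illustrative assumptions, and the sketch ignores the ether value being transferred, which would also have to be covered by the balance.

```python
def check_and_bill(balance_wei: int, gas_price_wei: int,
                   gas_limit: int, gas_used: int):
    """Return (max_fee, actual_fee, remaining_balance), all in wei."""
    max_fee = gas_price_wei * gas_limit
    if balance_wei < max_fee:
        raise ValueError("balance cannot cover gasPrice * gasLimit")
    if gas_used > gas_limit:
        raise ValueError("gas used cannot exceed the gas limit")
    actual_fee = gas_price_wei * gas_used  # billed only for gas consumed
    return max_fee, actual_fee, balance_wei - actual_fee


max_fee, actual_fee, remaining = check_and_bill(
    balance_wei=10**16,        # 0.01 ether
    gas_price_wei=20 * 10**9,  # 20 gwei per gas unit (illustrative)
    gas_limit=100_000,
    gas_used=21_000,
)
print(max_fee, actual_fee, remaining)
```

Here the sender must hold at least 2,000,000,000,000,000 wei up front, but only 420,000,000,000,000 wei is actually billed.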

Recipient

The recipient of a transaction is specified in the to field containing a 20-byte Ethereum address.

The address can be an EOA or a contract address.
Ethereum does no further validation of this field (there’s no way to validate). Any 20-byte value is considered valid.
Sending a transaction to the wrong address will probably burn the ether sent, rendering it forever inaccessible (unspendable).
- Specially designated burn address: 0x000000000000000000000000000000000000dEaD

Value and Data

The main “payload” of a transaction is contained in two fields: value and data.

A transaction with only value is a payment.
A transaction with only data is an invocation.
A transaction with both value and data is both a payment and an invocation.
A transaction with neither value nor data is probably just a waste of gas.
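
The four cases can be written as a small decision function. This is only an illustration of the classification above; `classify_payload` is a hypothetical name.

```python
def classify_payload(value_wei: int, data: bytes) -> str:
    """Name the kind of transaction implied by its value and data fields."""
    has_value = value_wei > 0
    has_data = len(data) > 0
    if has_value and has_data:
        return "payment and invocation"
    if has_value:
        return "payment"
    if has_data:
        return "invocation"
    return "neither (probably a waste of gas)"


print(classify_payload(10**18, b""))
print(classify_payload(0, bytes.fromhex("2e1a7d4d")))
print(classify_payload(10**18, bytes.fromhex("2e1a7d4d")))
print(classify_payload(0, b""))
```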

Payment transactions behave differently depending on whether the destination address is a contract or not.

For EOA addresses, or rather for any address that isn’t flagged as a contract on the blockchain, Ethereum will record a state change, adding the value you sent to the balance of the address.
- If the address has not been seen before, it will be added to the client’s internal representation of the state and its balance initialized to the value of the payment.
If the destination address is a contract:
- If there is data in the transaction, the EVM will execute the contract and will attempt to call the function named in the data payload of the transaction.
- If there is no data in the transaction:
  - If there is a fallback function, the EVM will call it and, if that function is payable, will execute it to determine what to do next.
  - If there is no fallback function, then the effect of the transaction will be to increase the balance of the contract, exactly like a payment to a wallet.
- A contract can reject incoming payments by throwing an exception immediately when a function is called, or as determined by conditions coded in a function.
- If the function terminates successfully (without an exception), then the contract’s state is updated to reflect an increase in the contract’s ether balance.

When the transaction contains data, it is most likely addressed to a contract address. It is allowed to send a data payload to an EOA, but the payload is ignored by the Ethereum protocol. Any interpretation of the data payload by an EOA is not subject to Ethereum’s consensus rules, unlike a contract execution.

The data payload sent to an ABI-compatible contract is a hex-serialized encoding of:

A function selector: The first 4 bytes of the Keccak-256 hash of the function’s prototype.
- The prototype of a function is defined as the string containing the name of the function, followed by the data types of each of its arguments, enclosed in parentheses and separated by commas.
```
// prototype: withdraw(uint256)
// uint is an alias for uint256
function withdraw(uint withdraw_amount) public {
    // function body omitted
}
```
The function arguments: The function’s arguments, encoded according to the rules for the various elementary types defined in the ABI specification.
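
The selector and argument encoding can be reproduced in a few lines. Python's standard library has no Keccak-256 (hashlib's sha3_256 uses different padding), so this sketch carries a compact pure-Python Keccak; in practice a vetted library would be used instead. The helper names are illustrative, and the argument of 10**16 wei (0.01 ether) is just an example value.

```python
MASK64 = (1 << 64) - 1


def _rol64(value: int, shift: int) -> int:
    shift %= 64
    return ((value << shift) | (value >> (64 - shift))) & MASK64


def _keccak_f1600(lanes) -> None:
    """Apply the 24-round Keccak-f[1600] permutation in place (lanes[x][y])."""
    lfsr = 1
    for _ in range(24):
        # theta: mix each column's parity into neighboring columns
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4]
             for x in range(5)]
        d = [c[(x + 4) % 5] ^ _rol64(c[(x + 1) % 5], 1) for x in range(5)]
        for x in range(5):
            for y in range(5):
                lanes[x][y] ^= d[x]
        # rho and pi: rotate lanes and move them to new positions
        x, y = 1, 0
        current = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x][y] = lanes[x][y], _rol64(current, (t + 1) * (t + 2) // 2)
        # chi: non-linear step along each row
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        # iota: add the round constant, generated by an LFSR
        for j in range(7):
            lfsr = ((lfsr << 1) ^ ((lfsr >> 7) * 0x71)) % 256
            if lfsr & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)


def keccak256(data: bytes) -> bytes:
    """Keccak-256 as used by Ethereum (padding byte 0x01, rate 136 bytes)."""
    rate = 136
    padded = bytearray(data) + bytes(rate - len(data) % rate)
    padded[len(data)] ^= 0x01
    padded[-1] ^= 0x80
    lanes = [[0] * 5 for _ in range(5)]
    for offset in range(0, len(padded), rate):
        block = padded[offset:offset + rate]
        for i in range(rate // 8):
            lanes[i % 5][i // 5] ^= int.from_bytes(block[8 * i:8 * i + 8], "little")
        _keccak_f1600(lanes)
    return b"".join(lanes[i % 5][i // 5].to_bytes(8, "little") for i in range(4))


def function_selector(prototype: str) -> bytes:
    """First 4 bytes of the Keccak-256 hash of the prototype string."""
    return keccak256(prototype.encode("ascii"))[:4]


def encode_uint256(number: int) -> bytes:
    """ABI encoding of a uint256: 32 bytes, big-endian, left-padded with zeros."""
    return number.to_bytes(32, "big")


payload = function_selector("withdraw(uint256)") + encode_uint256(10**16)
print("0x" + payload.hex())
```

The resulting payload is 36 bytes long: the 4-byte selector followed by one 32-byte argument.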

Contract creation transactions are sent to a special destination address called the zero address; the to field in a contract creation transaction contains the address 0x0. At the protocol level, this is serialized as an empty to field rather than as 20 zero bytes.

The zero address represents neither an EOA (there is no corresponding private–public key pair) nor a contract. It can never spend ether or initiate a transaction. It is only used as a destination, with the special meaning “create this contract.”

A contract creation transaction need only contain a data payload that contains the compiled bytecode which will create the contract. The only effect of this transaction is to create the contract.

You can include an ether amount in the value field if you want to set the new contract up with a starting balance, but that is entirely optional.
If you send a value (ether) to the contract creation address without a data payload (no contract), then the effect is the same as sending to a burn address—there is no contract to credit, so the ether is lost.
The contract creator doesn’t get any special privileges at the protocol level (although you can explicitly code them into the smart contract).

Transaction Propagation

The Ethereum network uses a “flood routing” protocol. Each Ethereum client acts as a node in a peer-to-peer (P2P) network, which (ideally) forms a mesh network.

Transaction propagation:

The originating Ethereum node receives a signed transaction.
The transaction is validated and then transmitted to all the other Ethereum nodes that are directly connected to the originating node.
- On average, each Ethereum node maintains connections to at least 13 other nodes, called its neighbors.
Each neighbor node validates the transaction as soon as it receives it. If it agrees that the transaction is valid, it stores a copy and propagates it to all its neighbors (except the one it came from).
As a result, the transaction ripples outwards from the originating node, flooding across the network, until all nodes in the network have a copy of the transaction.
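
The propagation steps above amount to a breadth-first flood over the peer graph. The sketch below assumes a tiny hypothetical network of five nodes and skips validation entirely; `flood` and the node names are made up for illustration.

```python
from collections import deque


def flood(peers, origin):
    """Return {node: hop count at which it first stored the transaction}."""
    first_seen = {origin: 0}
    queue = deque([(origin, None)])
    while queue:
        node, came_from = queue.popleft()
        for neighbor in peers[node]:
            # Do not send back to the peer it came from, and a node that
            # already holds a copy does not store or forward it again.
            if neighbor == came_from or neighbor in first_seen:
                continue
            first_seen[neighbor] = first_seen[node] + 1
            queue.append((neighbor, node))
    return first_seen


# Hypothetical peer connections (each node lists its neighbors).
peers = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
hops = flood(peers, "A")
print(hops)
```

Every node ends up with a copy, and a node only learns which neighbor relayed the transaction to it, not where the transaction started.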

From the perspective of each node, it is not possible to discern the origin of the transaction. To be able to track the origins of transactions, or interfere with propagation, an attacker would have to control a significant percentage of all nodes.

This is part of the security and privacy design of P2P networks.

References

Mastering Ethereum by Andreas M. Antonopoulos and Dr. Gavin Wood (O’Reilly). Copyright 2019 The Ethereum Book LLC and Gavin Wood, 978-1-491-97194-9
