Ethereum Source Code Walkthrough

Published 2018-06-13.
Time to read: 6 minutes.

This page is part of the posts collection, categorized under Blockchain, Ethereum, Go, Open Source.

April 27, 2020 Update

This was a sample work in progress as part of a proposal to the Ethereum Foundation Grants committee. They declined to fund this activity, but gave no reason and no feedback as to what might be acceptable. The preparation of my proposal took weeks, and the preliminary feedback that I had received from many knowledgeable people was that it had great value.

I felt that the evaluation process was broken, in fact the entire organization was broken, and there was little hope that Ethereum would ever become a professional organization. Two years later, I believe history has proved me right. This was the last blockchain-related initiative I participated in.

I no longer maintain web3j-scala, an Ethereum-related project for Scala programmers. I created that open source project and worked on it for free for 3 years. It has been forked and I’ve been told it is or was used in production. However, I see no reason to continue working for free on it. Others can carry the project forward if they want.

This is a brief walkthrough of some of the core source files for smart contracts in the official Go language Ethereum implementation, which includes the geth command-line Ethereum client program, along with many other programs. An Ethereum client includes an implementation of the Ethereum Virtual Machine (EVM); it can parse and verify the Ethereum blockchain, including smart contracts, and it provides interfaces to create transactions and mine blocks.

I’ve added some suggestions for how the source code might be improved. If there is general agreement that these suggestions make sense (tell me in the comments!) then I’ll create a pull request.

License

This Ethereum client project was released under the GNU Lesser General Public License, version 3 or later, which permits use of the code as a library in proprietary programs.

Source Files

The gocloc program counted the following source files and lines:

Language	Files	Blank Lines	Comment Lines	Code Lines
Go	1,824	58,134	81,861	639,435
C	55	17,257	30,909	84,719
C Header	97	2,559	6,318	15,083
Markdown	88	3,152	0	9,175
JavaScript	13	1,845	4,495	7,986
Assembly	39	557	957	3,783
JSON	17	0	0	2,065
Protocol Buffers	2	113	40	1,030
Plain Text	11	217	0	954
C++	4	132	102	937
BASH	10	178	315	931
Perl	10	268	1,289	879
JSX	11	119	245	722
XML	9	0	0	651
M4	4	79	99	649
YAML	20	77	42	581
NSIS	5	86	154	446
Java	4	143	187	438
Makefile	11	101	84	381
Python	6	154	250	339
HTML	3	15	9	245
Solidity	7	56	171	213
Bourne Shell	6	23	25	119
CMake	1	9	0	35
Awk	1	4	4	17
TOML	1	0	0	3
Total	2,260	85,278	127,556	771,825

Packages

I used the following incantation to discover that geth defines 244 packages:

Shell

$ grep -rh "^package" | grep -v "not installed" | \
  tr -d ';' | sed 's^//.*^^' | awk '{$1=$1};1' | \
  sort | uniq | wc -l

I won't list them all. The godoc for the project contains much of the following documentation for the top-level packages. I provided the rest of the information from disparate sources, including reading the source code:

accounts implements high-level Ethereum account management.

bmt provides a binary Merkle tree implementation.

cmd

Contains the following command-line tools. Most tools support the --help option.

abigen	source code generator to convert Ethereum contract definitions into easy to use, compile-time type-safe Go packages. It operates on plain Ethereum contract ABIs with expanded functionality if the contract bytecode is also available. However it also accepts Solidity source files, making development much more streamlined. Please see the Native DApps wiki page for details.
bootnode	runs a bootstrap node for the Ethereum Discovery Protocol. This is a stripped-down version of `geth` that only takes part in the network node discovery protocol, and does not run any of the higher level application protocols. It can be used as a lightweight bootstrap node to aid in finding peers in private networks.
clef	a standalone signer that manages keys across multiple Ethereum-aware apps such as Geth, Metamask, and cpp-ethereum. Alpha quality, not released yet.
ethkey	a key/wallet management tool for Ethereum keys. Allows user to add, remove and change their keys, and supports cold wallet device-friendly transaction inspection and signing. This documentation was written for the C++ Ethereum client implementation, but it is probably suitable for the Go implementation as well.
evm	a version of the EVM (Ethereum Virtual Machine) for running bytecode snippets within a configurable environment and execution mode. Allows isolated, fine-grained debugging of EVM opcodes. Example usage: evm --code 60ff60ff --debug
faucet	an Ether faucet backed by a light client.
geth	official command-line client for Ethereum. It provides the entry point into the Ethereum network (main-, test- or private net), capable of running as a full node (default), an archive node (retaining all historical state), or a light node (retrieving data live). It can be used by other processes as a gateway into the Ethereum network via JSON RPC endpoints exposed on top of HTTP, WebSocket and/or IPC transports. For more information see the CLI Wiki page.
p2psim	a simulation HTTP API. Docs are here.
puppeth	assembles and maintains private networks.
rlpdump	a pretty-printer for RLP data. RLP (Recursive Length Prefix) is the data encoding used by the Ethereum protocol. Sample usage: rlpdump --hex CE0183FFFFFFC4C304050583616263
swarm	provides the `bzzhash` command, which computes a swarm tree hash, and implements the swarm daemon and tools. See the swarm documentation for more information.
wnode	simple Whisper node. It could be used as a stand-alone bootstrap node. Also could be used for different test and diagnostics purposes.

common contains various helper functions worth checking out

consensus implements different Ethereum consensus engines (which must conform to the Engine interface): clique implements proof-of-authority consensus, and ethash implements proof-of-work consensus.

console Ethereum implements a JavaScript runtime environment (JSRE) that can be used in either interactive (console) or non-interactive (script) mode. Ethereum's JavaScript console exposes the full web3 JavaScript Dapp API and the admin API. More documentation is here. This package implements JSRE for the geth console and geth attach subcommands.

containers

contracts

core implements the Ethereum consensus protocol, implements the Ethereum Virtual Machine, and other miscellaneous important bits

crypto cryptographic implementations

dashboard

eth implements the Ethereum protocol

ethclient provides a client for the Ethereum RPC API

ethdb

ethstats implements the network stats reporting service

event deals with subscriptions to real-time events

internal Debugging support, JavaScript dependencies, testing support

les implements the Light Ethereum Subprotocol

light implements on-demand retrieval capable state and chain objects for the Ethereum Light Client

log provides an opinionated, simple toolkit for best-practice logging that is both human and machine readable

metrics port of Coda Hale's Metrics library. Unclear why this was not implemented as a separate library, like this one.

miner implements Ethereum block creation and mining

mobile contains the simplified mobile APIs to go-ethereum

node sets up multi-protocol Ethereum nodes

p2p implements the Ethereum p2p network protocols: Node Discovery Protocol, RLPx v5 Topic Discovery Protocol, Ethereum Node Records as defined in EIP-778, common network port mapping protocols, and p2p network simulation.

params

rlp implements the RLP serialization format

rpc provides access to the exported methods of an object across a network or other I/O connection

signer

swarm

tests implements execution of Ethereum JSON tests

trie implements Merkle Patricia Tries

vendor contains a minimal framework for creating and organizing command line Go applications, and a rich testing extension for Go's testing package

whisper implements the Whisper protocol

I used the following incantation to list the top-level directories, most of which are packages:

Shell

find . -maxdepth 1 -type d | sed 's^\./^^' | sed '/\..*/d'

The build/ directory does not contain a Go source package; instead, it contains scripts and configurations for building the package in various environments.

Smart Contract Source Code

The core/vm directory contains the files that implement the EVM. These files are part of the vm package. Let's look at two of them:

contract.go, which defines smart contract behavior.
contracts.go, responsible for executing smart contracts on the EVM.

Referenced Types

Two of the types used in the source files that we would like to understand are defined in common/types.go. Let's look at them first.

Address is defined as an array of 20 bytes:

Go

const (
    HashLength    = 32
    AddressLength = 20
)

// Address represents the 20 byte address of an Ethereum account.
type Address [AddressLength]byte

Hash is defined as an array of 32 bytes:

Go

// Hash represents the 32 byte Keccak256 hash of arbitrary data.
type Hash [HashLength]byte
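Because both types are fixed-size arrays rather than slices, they are copied by value and can be compared with == or used as map keys. The sketch below re-declares the two types locally to show this; bytesToAddress and Hex are hypothetical helpers written for this example, not the functions from common/types.go.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

const (
	HashLength    = 32
	AddressLength = 20
)

type Address [AddressLength]byte
type Hash [HashLength]byte

// bytesToAddress (hypothetical) copies b into an Address, keeping the
// rightmost 20 bytes if b is longer and left-padding with zeros if shorter.
func bytesToAddress(b []byte) Address {
	var a Address
	if len(b) > AddressLength {
		b = b[len(b)-AddressLength:]
	}
	copy(a[AddressLength-len(b):], b)
	return a
}

// Hex (hypothetical) renders the address as a 0x-prefixed hex string.
func (a Address) Hex() string { return "0x" + hex.EncodeToString(a[:]) }

func main() {
	a := bytesToAddress([]byte{0x01})
	b := a      // arrays are values: this copies all 20 bytes
	b[0] = 0xff // changes b only
	fmt.Println(a.Hex())
	fmt.Println(a == b, len(a), len(Hash{}))

	// Arrays are comparable, so an Address can be a map key.
	book := map[Address]string{a: "alice"}
	fmt.Println(book[bytesToAddress([]byte{0x01})])
}
```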

The opcodes for each version of the EVM are defined in jump_table.go. The operation struct defines the properties of each opcode:

Go

type operation struct {
    // execute is the operation function
    execute executionFunc
    // gasCost is the gas function and returns the gas required for execution
    gasCost gasFunc
    // validateStack validates the stack (size) for the operation
    validateStack stackValidationFunc
    // memorySize returns the memory size required for the operation
    memorySize memorySizeFunc

    halts   bool // indicates whether the operation should halt further execution
    jumps   bool // indicates whether the program counter should not increment
    writes  bool // determines whether this a state modifying operation
    valid   bool // indication whether the retrieved operation is valid and known
    reverts bool // determines whether the operation reverts state (implicitly halts)
    returns bool // determines whether the operations sets the return data content
}

Notice the jumps property, a Boolean, which if set indicates that the program counter should not increment after executing any form of jump opcode.
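A toy interpreter loop shows why such a flag is useful. This is a sketch under stated assumptions, not the go-ethereum interpreter: the op struct, the table, and the run function are all hypothetical, only four opcodes are modelled, and jump targets are not validated. The opcode byte values are the real EVM ones.

```go
package main

import "fmt"

// Real EVM opcode values for the instructions this sketch understands.
const (
	STOP     = 0x00
	JUMP     = 0x56
	JUMPDEST = 0x5b
	PUSH1    = 0x60
)

// op is a cut-down, hypothetical stand-in for the operation struct.
type op struct {
	execute func(pc *uint64, stack *[]uint64, code []byte)
	jumps   bool // true: execute sets pc itself, so the loop must not increment it
}

var table = map[byte]op{
	PUSH1: {execute: func(pc *uint64, st *[]uint64, code []byte) {
		*pc++ // move onto the immediate byte that follows PUSH1
		if *pc < uint64(len(code)) {
			*st = append(*st, uint64(code[*pc]))
		}
	}},
	JUMP: {jumps: true, execute: func(pc *uint64, st *[]uint64, code []byte) {
		s := *st
		if len(s) == 0 {
			*pc = uint64(len(code)) // nothing to pop: fall off the end and halt
			return
		}
		*pc, *st = s[len(s)-1], s[:len(s)-1]
	}},
	JUMPDEST: {execute: func(*uint64, *[]uint64, []byte) {}},
}

// run executes code until STOP, an unmodelled opcode, or the end of the code.
func run(code []byte) (steps int, stack []uint64) {
	for pc := uint64(0); pc < uint64(len(code)); steps++ {
		o, known := table[code[pc]]
		if !known {
			break
		}
		o.execute(&pc, &stack, code)
		if !o.jumps {
			pc++ // ordinary opcodes advance to the next instruction
		}
	}
	return steps, stack
}

func main() {
	// PUSH1 5, JUMP, PUSH1 0xff (skipped), JUMPDEST, PUSH1 0x2a, STOP
	code := []byte{PUSH1, 0x05, JUMP, PUSH1, 0xff, JUMPDEST, PUSH1, 0x2a, STOP}
	fmt.Println(run(code))
}
```

If the loop incremented pc after JUMP as it does for other opcodes, execution would resume one byte past the intended target.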

The destinations type maps the hash of a smart contract's code to a bit vector computed from that code. The EVM consults it to decide whether a jump target is legitimate: the target must be a JUMPDEST opcode that is real code, not a byte that merely looks like JUMPDEST inside the data of a PUSH instruction. analysis.go defines the destinations type like this:

Go

// destinations stores one map per contract (keyed by hash of code).
// The maps contain an entry for each location of a JUMPDEST instruction.
type destinations map[common.Hash]bitvec
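One way such an analysis can work is sketched below. It is an illustration only, not the code in analysis.go: the helper names set, isData, analyse and has are hypothetical, and the cache is keyed by SHA-256 because Keccak-256 is not in the Go standard library. The idea is to mark every byte that is the immediate data of a PUSH instruction, then accept a jump target only if it holds the JUMPDEST opcode and is not marked.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// Real EVM opcode values.
const (
	JUMPDEST = 0x5b
	PUSH1    = 0x60
	PUSH32   = 0x7f
)

// bitvec has one bit per byte of code; a set bit marks PUSH data.
type bitvec []byte

func (b bitvec) set(pos int)         { b[pos/8] |= byte(0x80) >> uint(pos%8) }
func (b bitvec) isData(pos int) bool { return b[pos/8]&(byte(0x80)>>uint(pos%8)) != 0 }

// analyse marks the immediate bytes that follow each PUSH1..PUSH32.
func analyse(code []byte) bitvec {
	bits := make(bitvec, len(code)/8+1)
	for pc := 0; pc < len(code); pc++ {
		if c := code[pc]; c >= PUSH1 && c <= PUSH32 {
			n := int(c-PUSH1) + 1 // PUSHn carries n immediate bytes
			for i := 1; i <= n && pc+i < len(code); i++ {
				bits.set(pc + i)
			}
			pc += n // skip over the data just marked
		}
	}
	return bits
}

// destinations caches one analysis per contract, keyed by a hash of the code.
// SHA-256 is a stand-in here; Ethereum uses Keccak-256.
type destinations map[[32]byte]bitvec

// has reports whether dest is a valid jump target in code.
func (d destinations) has(code []byte, dest int) bool {
	key := sha256.Sum256(code)
	bits, ok := d[key]
	if !ok {
		bits = analyse(code)
		d[key] = bits
	}
	return dest >= 0 && dest < len(code) && code[dest] == JUMPDEST && !bits.isData(dest)
}

func main() {
	// PUSH1 0x5b, JUMPDEST: the byte at index 1 is data, the one at index 2 is code.
	code := []byte{PUSH1, 0x5b, JUMPDEST}
	d := destinations{}
	fmt.Println(d.has(code, 1), d.has(code, 2), len(d))
}
```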

contract.go

This file defines smart contract behavior.

Imports

This comment applies to all of the Go source files in the entire project. I think the following absolute import would have been better specified as a relative import:

Go

"github.com/ethereum/go-ethereum/common"

The relative import would look like this instead:

Go

"../../common"

If relative imports were used instead of absolute imports that point to the GitHub repository, local changes to the project made by a developer would automatically be picked up. As currently written, absolute imports cause local changes to be ignored, in favor of the version on GitHub. It might take a software developer a while to realize that their changes are ignored by most of the code base because absolute imports were used. It would then be painful for the developer to modify the affected source files throughout the project so that they used relative imports.

Types

The publicly visible AccountRef type is defined as:

Go

// Account references are used during EVM initialisation and
// it's primary use is to fetch addresses. Removing this object
// proves difficult because of the cached jump destinations which
// are fetched from the parent contract (i.e. the caller), which
// is a ContractRef.
type AccountRef common.Address

The same file defines a method that converts an AccountRef to an Address:

Go

// Address casts AccountRef to a Address
func (ar AccountRef) Address() common.Address { return (common.Address)(ar) }

The ContractRef interface is used by the Contract struct, which we'll see in a moment. This interface consists of a single method, Address, which returns an Address.

Go

// ContractRef is a reference to the contract's backing object
type ContractRef interface {
    Address() common.Address
}
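Because AccountRef has an Address method, it satisfies ContractRef without any explicit declaration. The following self-contained sketch re-declares the types locally, with a plain [20]byte standing in for common.Address. The describe function is hypothetical and exists only to show a ContractRef being consumed.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// Address stands in for common.Address in this sketch.
type Address [20]byte

// ContractRef mirrors the interface shown above.
type ContractRef interface {
	Address() Address
}

// AccountRef mirrors the type shown above.
type AccountRef Address

// Address converts the AccountRef back to an Address.
func (ar AccountRef) Address() Address { return Address(ar) }

// Compile-time check that AccountRef satisfies ContractRef.
var _ ContractRef = AccountRef{}

// describe (hypothetical) accepts anything that can report an address.
func describe(ref ContractRef) string {
	a := ref.Address()
	return hex.EncodeToString(a[:])
}

func main() {
	caller := AccountRef(Address{19: 0x01})
	fmt.Println(describe(caller))
}
```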

The Contract struct defines the behavior of Ethereum smart contracts, and is central to the topic, so here it is in all its glory:

Go

type Contract struct {
    CallerAddress common.Address
    caller    ContractRef
    self      ContractRef

    jumpdests destinations // result of JUMPDEST analysis.

    Code     []byte
    CodeHash common.Hash
    CodeAddr *common.Address
    Input    []byte

    Gas   uint64
    value *big.Int

    Args []byte

    DelegateCall bool
}

CallerAddress is a publicly visible Address of the caller. caller and self are private ContractRefs, which as we know are really just Addresses.

jumpdests, a private field, has type destinations. As its comment says, it holds the result of JUMPDEST analysis, which tells the EVM which positions in the code are valid targets for a jump.

Code is a publicly visible byte slice. It holds the compiled EVM bytecode of the contract, since the EVM executes bytecode rather than source code.

CodeHash is the publicly visible hash of the Code, while CodeAddr is a publicly visible pointer to the Address (of the code, presumably).

Gas is the publicly visible amount of Ethereum gas allocated by the user for executing this smart contract, stored as an unsigned 64-bit integer.

value is a private pointer to a big integer. It holds the amount of Ether, in wei, sent to the contract by its caller; it is not the result of executing the contract.

Args is a publicly visible byte slice; I am not sure what it is for.

DelegateCall is a publicly visible Boolean value. It is unclear to me whether this means the smart contract was invoked using delegatecall. From the documentation: "This means that a contract can dynamically load code from a different address at runtime. Storage, current address and balance still refer to the calling contract, only the code is taken from the called address. This makes it possible to implement the “library” feature in Solidity: Reusable library code that can be applied to a contract’s storage, e.g. in order to implement a complex data structure."

contracts.go

This file is responsible for executing smart contracts on the EVM.

Imports

The following imports are used:

crypto/sha256, from the Go standard library, implements the SHA224 and SHA256 hash algorithms as defined in FIPS 180-4.
errors, also from the standard library, provides Go's simple error handling primitives, such as errors.New.
math/big implements arbitrary-precision arithmetic (big numbers).

Other packages in this project (go-ethereum):

Go

"github.com/ethereum/go-ethereum/common"
"github.com/ethereum/go-ethereum/common/math"
"github.com/ethereum/go-ethereum/crypto"
"github.com/ethereum/go-ethereum/crypto/bn256"
"github.com/ethereum/go-ethereum/params"

Again, I think the above imports would have been better specified as relative imports:

Go

"../../common"
"../../common/math"
"../../crypto"
"../../crypto/bn256"
"../../params"

The file also imports ripemd160, from golang.org/x/crypto, which implements the RIPEMD-160 hash algorithm, a secure replacement for the MD4 and MD5 hash functions. These hashes are also termed RIPE message digests.

Type PrecompiledContract

PrecompiledContract is the interface for native Go smart contracts. This interface is used by precompiled contracts, as we will see next. Contract is a struct defined in contract.go.
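Going by the descriptions of RequiredGas and Run later in this article, the interface has roughly the shape sketched below. The dataCopy implementation is my own illustration of the identity contract, not the code from contracts.go, and the two gas constants are assumptions based on the commonly cited identity pricing of 15 gas plus 3 gas per 32-byte word.

```go
package main

import "fmt"

// PrecompiledContract sketches the interface described in this section.
type PrecompiledContract interface {
	RequiredGas(input []byte) uint64  // gas needed to process input
	Run(input []byte) ([]byte, error) // executes the contract on input
}

// Assumed pricing for the identity contract; treat these values as illustrative.
const (
	identityBaseGas    uint64 = 15
	identityPerWordGas uint64 = 3
)

// dataCopy is an illustrative identity contract: it returns its input.
type dataCopy struct{}

func (dataCopy) RequiredGas(input []byte) uint64 {
	words := uint64(len(input)+31) / 32 // round up to whole 32-byte words
	return identityBaseGas + words*identityPerWordGas
}

func (dataCopy) Run(input []byte) ([]byte, error) {
	out := make([]byte, len(input))
	copy(out, input) // return a copy so the caller cannot alias the input
	return out, nil
}

func main() {
	var p PrecompiledContract = dataCopy{}
	in := []byte("hello")
	out, err := p.Run(in)
	fmt.Println(p.RequiredGas(in), string(out), err)
}
```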

Pre-Compiled Contract Maps

These maps specify various types of cryptographic hashes and utility functions, accessed via their address.

PrecompiledContractsHomestead contains the default set of pre-compiled contract addresses used in the Frontier and Homestead releases of Ethereum: ecrecover, sha256hash, ripemd160hash and dataCopy.

PrecompiledContractsByzantium contains the default set of pre-compiled contract addresses used in the Byzantium Ethereum release. All of the previously defined pre-compiled contract addresses are provided in Byzantium, plus: bigModExp, bn256Add, bn256ScalarMul and bn256Pairing.

I’m not happy about the code duplication, whereby the contents of PrecompiledContractsHomestead are incorporated into PrecompiledContractsByzantium by listing the values again; this would be better expressed by referencing the values of PrecompiledContractsHomestead instead of duplicating them.
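One possible shape for that refactoring is sketched here. Everything in it is a stand-in: the stub type replaces the real contract implementations, extend and addr are hypothetical helpers, and the addresses 1 through 8 are my assumption about how the contracts are numbered. The point is only that the Byzantium map can be derived from the Homestead map instead of repeating its entries.

```go
package main

import "fmt"

type Address [20]byte

type PrecompiledContract interface {
	RequiredGas(input []byte) uint64
	Run(input []byte) ([]byte, error)
}

// stub is a placeholder implementation that only carries a name.
type stub string

func (stub) RequiredGas([]byte) uint64        { return 0 }
func (stub) Run(in []byte) ([]byte, error)    { return in, nil }

// addr (hypothetical) builds an address whose last byte is b.
func addr(b byte) Address {
	var a Address
	a[len(a)-1] = b
	return a
}

// extend (hypothetical) returns a new map holding base plus extra,
// leaving base untouched.
func extend(base, extra map[Address]PrecompiledContract) map[Address]PrecompiledContract {
	out := make(map[Address]PrecompiledContract, len(base)+len(extra))
	for k, v := range base {
		out[k] = v
	}
	for k, v := range extra {
		out[k] = v
	}
	return out
}

var homestead = map[Address]PrecompiledContract{
	addr(1): stub("ecrecover"),
	addr(2): stub("sha256hash"),
	addr(3): stub("ripemd160hash"),
	addr(4): stub("dataCopy"),
}

// byzantium lists only what is new; the Homestead entries are inherited.
var byzantium = extend(homestead, map[Address]PrecompiledContract{
	addr(5): stub("bigModExp"),
	addr(6): stub("bn256Add"),
	addr(7): stub("bn256ScalarMul"),
	addr(8): stub("bn256Pairing"),
})

func main() {
	fmt.Println(len(homestead), len(byzantium), byzantium[addr(2)])
}
```

A trade-off of this approach is that the Byzantium map is built at package initialization rather than written out as a literal, which makes it slightly less obvious at a glance which contracts it contains.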

Contract Evaluator Function

The RunPrecompiledContract function runs and evaluates the output of a precompiled contract. It accepts three parameters:

A PrecompiledContract instance.
A byte slice of input data.
A pointer to a Contract, defined in contract.go, discussed above.

The function returns:

A byte slice containing the output of the contract.
An error value, which could be nil.
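The parameters and return values above suggest a charge-then-run pattern, sketched below as a self-contained program. This is an illustration, not the body of the real function: the Contract here has only a Gas field, the UseGas method and ErrOutOfGas error are names assumed for this sketch, and the gas formula for the sample sha256hash contract (60 plus 12 per 32-byte word) is likewise an assumption.

```go
package main

import (
	"crypto/sha256"
	"errors"
	"fmt"
)

type PrecompiledContract interface {
	RequiredGas(input []byte) uint64
	Run(input []byte) ([]byte, error)
}

// Contract is reduced to the one field this sketch needs.
type Contract struct{ Gas uint64 }

// UseGas (assumed name) deducts gas if enough is available.
func (c *Contract) UseGas(gas uint64) bool {
	if c.Gas < gas {
		return false
	}
	c.Gas -= gas
	return true
}

// ErrOutOfGas (assumed name) signals that the contract could not pay.
var ErrOutOfGas = errors.New("out of gas")

// RunPrecompiledContract charges the required gas, then runs the contract.
func RunPrecompiledContract(p PrecompiledContract, input []byte, contract *Contract) ([]byte, error) {
	if contract.UseGas(p.RequiredGas(input)) {
		return p.Run(input)
	}
	return nil, ErrOutOfGas
}

// sha256hash is a sample contract with assumed pricing.
type sha256hash struct{}

func (sha256hash) RequiredGas(input []byte) uint64 {
	return 60 + 12*uint64((len(input)+31)/32)
}

func (sha256hash) Run(input []byte) ([]byte, error) {
	h := sha256.Sum256(input)
	return h[:], nil
}

func main() {
	rich := &Contract{Gas: 100}
	out, err := RunPrecompiledContract(sha256hash{}, []byte("abc"), rich)
	fmt.Printf("%x %v remaining=%d\n", out, err, rich.Gas)

	poor := &Contract{Gas: 10}
	_, err = RunPrecompiledContract(sha256hash{}, []byte("abc"), poor)
	fmt.Println(err, poor.Gas)
}
```

Charging before running means a caller that cannot afford the contract never triggers the computation at all.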

Other Functions

RunPrecompiledContract – runs and evaluates the output of a precompiled contract; returns the output as a byte slice and an error.
RequiredGas (implemented by each precompiled contract type) – computes the gas required for the input data, specified as a byte slice, and returns a uint64.
Run (implemented by each precompiled contract type) – executes the precompiled contract on the input data, specified as a byte slice, and returns the result as a byte slice, left-padded where the contract requires it, and an error.
newCurvePoint – unmarshals a binary blob into a bn256 elliptic curve point. BN curves are elliptic curves suitable for cryptographic pairings, which enable cryptographic schemes that are both secure and efficient. See the IETF paper on Barreto-Naehrig Curves for more information.
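Left-padding, mentioned in the description of Run, means prefixing a result with zero bytes until it reaches a fixed width, typically the EVM's 32-byte word. A minimal sketch follows; leftPad is a hypothetical helper written for this example, not a function from the project.

```go
package main

import "fmt"

// leftPad (hypothetical) returns b prefixed with zero bytes so that the
// result is size bytes long. Inputs already that long are returned as-is.
func leftPad(b []byte, size int) []byte {
	if len(b) >= size {
		return b
	}
	out := make([]byte, size)
	copy(out[size-len(b):], b)
	return out
}

func main() {
	// A 20-byte value, such as a RIPEMD-160 digest, widened to 32 bytes.
	digest := make([]byte, 20)
	digest[0], digest[19] = 0xde, 0xad
	padded := leftPad(digest, 32)
	fmt.Printf("%d bytes: %x\n", len(padded), padded)
}
```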

Mainframe image; Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 License by PekoeBlaze

© Copyright 1994-2025 Michael Slinn. All rights reserved.
For requests to use this copyright-protected work in any manner, email mslinn@mslinn.com.
