The principle and proof of Solidity Merkle trees
what is merkle trees
Definition
Merkle trees, named after Ralph Merkle, are a fundamental data structure used in cryptography and computer science. They are commonly employed in blockchain systems to ensure data integrity and enable efficient verification.
The principle behind Merkle trees is based on the concept of hash functions. A hash function is a mathematical algorithm that takes an input (data) and produces a fixed-size output, known as a hash value or digest. The key properties of a hash function are collision resistance and the avalanche effect.
how to generate merkle tree data via solidity
we can calculate hash value through solidity keccak256 function:
1 | keccak256(abi.encodePacked(toHashValue) |
After performing the hash operation on the values of the leaf nodes, the adjacent nodes are then hashed together until only a single root node remains.
Suppose there are two adjacent nodes, A and B. The order of the hash operation, whether it is hash(A+B) or hash(B+A), is determined by the sizes of A and B. In the corresponding Merkle code in OpenZeppelin, we can find the following code snippet:
1 | function _hashPair(bytes32 a, bytes32 b) private pure returns (bytes32) { |
In summary, the smaller value is typically placed in front to determine the order for computing the hash values. This may cause some confusion when performing the calculations manually.
In practical projects, it is common to store only the final result, the root hash value, in the contract. Storing a large number of addresses in the contract would consume a significant amount of gas fees. By using a Merkle tree calculation, the amount of data that needs to be stored is greatly reduced.
Let’s demonstrate how to calculate and store the root hash value using a setup example from Foundry:
1 |
|
For demonstration purposes, we have provided only four addresses. In actual projects, the number of addresses can be significantly larger.
how to proof merkle data
Once we have stored the root hash value in the contract, how do we verify if a client-provided address is a valid address or belongs to a whitelist?
Firstly, we need to hash the address as the third parameter. Then, we pass the hash value of the address, along with the adjacent hash values, as the proof to the verification function.
The proof list corresponds to the red-marked area in the image below.
test proof function:
1 | function testVerify() public { |
Here is the complete Foundry testing code from github:https://gist.github.com/coffiasd/96022abd7f9a8a2189bda14bf9e755dc
Applications in Real-World Projects
- airdrop
- whitelist
Common Vulnerabilities in Smart Contract Audits
1 | function parentHash(bytes32 a, bytes32 b) public pure returns (bytes32) { |
The statement “abi.encode(address, uint)” will produce 64 bytes. Since “abi.encode(bytes32, bytes32)” also yields 64 bytes, hash collisions may potentially occur between leaf nodes and parent nodes.
Follow me on twitter :https://twitter.com/coffiasse
TG:@coffiasd