OneSwap Series 6 — Expensive Storage

OneSwap
9 min read · Sep 30, 2020

Storage overview

Gas fees on Ethereum are notoriously high, and with the recent boom in DeFi they have climbed high enough to make anyone sigh.

Let’s take a random liquidity-removing transaction from Uniswap to see how high the Gas fee can be:

Tx Fee:   0.09569346 Ether ($36.32)
Gas Price: 0.000000505 Ether (505 Gwei)
Gas Used: 189492

Fees this high raise the barrier to entry. Large accounts and institutions generate most of the network traffic, while retail investors can only look on and are effectively shut out of Ethereum.

The Gas fee is calculated as Gas Fee = Gas Price * Gas Used, so two factors drive a high fee. One is Gas Price, the price of a unit of Gas; the other is Gas Used, the amount of Gas required to execute the contract code.
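As a quick check of the formula against the transaction quoted above, here is a minimal Python sketch. The function names are illustrative and not part of any Ethereum library; the arithmetic is done in wei so that it stays exact.

```python
from decimal import Decimal

WEI_PER_GWEI = 10**9
WEI_PER_ETHER = 10**18


def gas_fee_wei(gas_price_gwei: int, gas_used: int) -> int:
    """Gas Fee = Gas Price * Gas Used, expressed in wei."""
    return gas_price_gwei * WEI_PER_GWEI * gas_used


def wei_to_ether(wei: int) -> Decimal:
    """Convert an integer amount of wei to Ether."""
    return Decimal(wei) / Decimal(WEI_PER_ETHER)


# Figures from the Uniswap transaction above: 505 Gwei, 189492 Gas.
fee = gas_fee_wei(gas_price_gwei=505, gas_used=189492)
print(wei_to_ether(fee))  # 0.09569346
```

The result matches the Tx Fee shown for the transaction, which confirms that only Gas Used is under the developer's control once the market has set Gas Price.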

GasPrice is affected by market supply and demand which cannot be controlled by contract developers, while GasUsed is what developers can control. Well-optimized contract code can function with extremely low gas consumption. That also reduces the Gas fee, bringing retail investors a glimmer of hope.

Each operation in a smart contract, that is, each line of working code, consumes a certain amount of Gas, and most of the expensive operations involve storage. The smarter the use of storage, the less Gas is consumed during execution.

Let’s first look at the gas consumption of storage-related operations:

create new slot: 20000 gas
change slot content for the first time: 5000 gas
change slot content again: 800 gas
load slot content: 800 gas
delete slot: refund 10000 gas

Storage costs so much Gas for two reasons. First, it is kept on every full node of the blockchain and occupies resources such as disk space. Second, storage is part of the world state: when mining a new block, miners must fill in the block header with the Merkle root of the world state, so every storage change triggers additional computation.

In contrast, variables declared with the memory keyword in Solidity cost much less Gas. Their life cycle is limited to a single contract execution; once execution ends the memory is reclaimed, and neither on-chain storage nor the consensus process is involved.

The state variables of a contract cannot live in memory. A contract deployed on Ethereum is shared across calls, and its state variables must stay up to date from one call to the next. This is similar to a distributed database: users need to be able to change these state variables and read them back later, which memory variables cannot offer.

Storage optimization

Storage holds the data that we modify through transactions and must be able to read back later. Because it is so expensive, we need to be careful about what goes into it if we want to save Gas for users.

Not all data needs to be on the chain

On-chain storage is expensive, so keep redundant or unnecessary information, and anything that can be computed off-chain, off the chain. For example, all buy and sell orders are stored on the chain, but there is no need to also store the number of orders, even though a DApp would find that figure convenient. A sensible alternative is to obtain it by synchronizing on-chain events off-chain and storing the result in a traditional database such as MySQL that serves the DApp.

For another example, information like the time when an order is placed can be obtained through querying the transaction itself, so we do not need to keep this information in the world state.

Not all state variables need to be set to the storage type

Some state variables of a contract need to be written during contract calls, but others serve only as global constants. Values known at compile time are usually declared constant. Values that can only be determined at deployment can be declared immutable: they are assigned in the constructor and then stored in the contract's bytecode. Reading contract code is very cheap compared with reading storage.

For example, in OneSwap’s OneSwapPairProxy contract, there is a code snippet as below:

uint internal immutable _immuFactory;
uint internal immutable _immuMoneyToken;
uint internal immutable _immuStockToken;
uint internal immutable _immuOnes;
uint internal immutable _immuOther;

These values are only known when the factory contract creates the pair, so they cannot be marked constant. But they never need to change after creation, so it makes the most sense to mark them immutable. They are stored in the contract code instead of in on-chain storage, which is a simple way to save Gas that you can apply to your own contracts.

Shorten your data length as much as possible

Since the EVM reads and writes storage in 256-bit units, we should try to fit the data accessed together into 256 bits. For example:

Orders in OneSwap are stored in the following two arrays:

uint[1<<22] private _sellOrders;
uint[1<<22] private _buyOrders;

An order contains the price, quantity, creator, and id of the order. This information may be organized as follows:

struct Order {
    address sender;
    uint price;
    uint amount;
    uint nextID;
}

Storing and reading such a struct would be expensive: it occupies four storage slots (256 bits each), so every access takes four read or write operations. Specifically, it costs 80,000 Gas (4 * 20,000) to create a new order and 3,200 Gas (4 * 800) to read one.
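The arithmetic behind these figures can be sketched in a few lines of Python, together with what a layout that fits in a single slot would cost by comparison. The constants are the Gas figures quoted earlier in this article; the function names are illustrative only.

```python
# Gas figures quoted earlier in this article; they may change with EVM upgrades.
GAS_CREATE_SLOT = 20_000
GAS_LOAD_SLOT = 800


def create_cost(slots: int) -> int:
    """Gas needed to write `slots` brand-new storage slots."""
    return slots * GAS_CREATE_SLOT


def read_cost(slots: int) -> int:
    """Gas needed to load `slots` storage slots."""
    return slots * GAS_LOAD_SLOT


# The four-field struct with one 256-bit slot per field.
print(create_cost(4), read_cost(4))  # 80000 3200

# The same information squeezed into a single slot.
print(create_cost(1), read_cost(1))  # 20000 800
```

Under these figures, a single-slot layout cuts both the creation and the read cost to a quarter.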

So let’s take a look at how the developers of OneSwap organize their orders:

// compress an order into a 256-bit integer
function _order2uint(Order memory order) internal pure returns (uint) {
    uint n = uint(order.sender);
    n = (n<<32) | order.price;
    n = (n<<42) | order.amount;
    n = (n<<22) | order.nextID;
    return n;
}

// extract an order from a 256-bit integer
function _uint2order(uint n) internal pure returns (Order memory) {
    Order memory order;
    order.nextID = uint32(n & ((1<<22)-1));
    n = n >> 22;
    order.amount = uint64(n & ((1<<42)-1));
    n = n >> 42;
    order.price = uint32(n & ((1<<32)-1));
    n = n >> 32;
    order.sender = address(n);
    return order;
}

They compressed the four pieces of information an order needs into a single uint while preserving the significant bits of each field, and wrapped the storage access in functions, so an order can be created or read with a single storage write or read. Compression reduces the Gas fee at the cost of readability and ease of programming, so the developers also defined a more readable Order struct that lives in memory and maps to the uint:

struct Order { //total 256 bits
    address sender; //160 bits, sender creates this order
    uint32 price; // 32-bit decimal floating point number
    uint64 amount; // 42 bits are used, the stock amount to be sold or bought
    uint32 nextID; // 22 bits are used
}
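To see that the 160 + 32 + 42 + 22 bit layout round-trips without loss, here is a Python model of the same shifts and masks. It is a sketch for experimenting off-chain, not the contract code: the names are illustrative, and the range check is an addition that the Solidity functions above do not perform.

```python
from typing import NamedTuple

# Field widths from the Order struct comments: together they fill 256 bits.
SENDER_BITS, PRICE_BITS, AMOUNT_BITS, NEXT_ID_BITS = 160, 32, 42, 22
WIDTHS = (SENDER_BITS, PRICE_BITS, AMOUNT_BITS, NEXT_ID_BITS)


class Order(NamedTuple):
    sender: int   # address as a 160-bit integer
    price: int    # 32 bits
    amount: int   # 42 bits used
    next_id: int  # 22 bits used


def order_to_uint(order: Order) -> int:
    """Pack an order into one 256-bit integer, sender in the high bits."""
    # Added safety check (not in the original): reject values that overflow.
    for value, bits in zip(order, WIDTHS):
        if not 0 <= value < (1 << bits):
            raise ValueError(f"value {value} does not fit in {bits} bits")
    n = order.sender
    n = (n << PRICE_BITS) | order.price
    n = (n << AMOUNT_BITS) | order.amount
    n = (n << NEXT_ID_BITS) | order.next_id
    return n


def uint_to_order(n: int) -> Order:
    """Unpack the fields in reverse order, lowest bits first."""
    next_id = n & ((1 << NEXT_ID_BITS) - 1)
    n >>= NEXT_ID_BITS
    amount = n & ((1 << AMOUNT_BITS) - 1)
    n >>= AMOUNT_BITS
    price = n & ((1 << PRICE_BITS) - 1)
    n >>= PRICE_BITS
    return Order(sender=n, price=price, amount=amount, next_id=next_id)


order = Order(sender=(1 << 160) - 1, price=123456, amount=(1 << 42) - 1, next_id=7)
packed = order_to_uint(order)
assert packed < (1 << 256) and uint_to_order(packed) == order
```

Because the widths sum to exactly 256, even the largest value of every field still fits in one slot, which is why the amount and id are limited to 42 and 22 bits.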

Only keep hash of the information

OneSwap developers take another approach to compress storage space. Let us look at the following example in the LockSend contract:

mapping(bytes32 => uint) public lockSendInfos;

This table stores the information of a user’s locked transfers: the initiator of the transfer, the recipient, the transferred token, and the unlock time. When a user initiates an unlock, the contract needs to verify that the initiator, recipient, token, and unlock time match the locked information. Storing these four parameters in a struct would take up four storage slots. So how can we save storage space while still achieving this goal?

OneSwap developers gave the following answer:

keccak256(abi.encodePacked(from, to, token, unlockTime))

First, the four parameters are packed with abi.encodePacked, and the resulting byte sequence is hashed with keccak256. The hash is used as the key into the lockSendInfos table above. When a user unlocks a transfer, the supplied parameters are hashed in the same way; if the resulting key matches an existing entry, they correspond to that locked transfer.
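The idea of keeping only a hash as the key can be modelled in Python. This is a sketch under several assumptions: hashlib.sha3_256 stands in for keccak256 (the two use different padding and produce different digests), the unlock time is assumed to be encoded as 32 bytes, the mapped value is assumed to be the locked amount, and the class and function names are hypothetical.

```python
import hashlib


def lock_key(sender: bytes, recipient: bytes, token: bytes, unlock_time: int) -> bytes:
    """Hash the four parameters into a single 32-byte key.

    Stand-in for keccak256(abi.encodePacked(from, to, token, unlockTime));
    sha3_256 is NOT keccak256, so the digests differ from the on-chain ones.
    """
    for addr in (sender, recipient, token):
        if len(addr) != 20:
            raise ValueError("addresses must be 20 bytes")
    # Assumption: unlockTime is packed as a 32-byte big-endian integer.
    packed = sender + recipient + token + unlock_time.to_bytes(32, "big")
    return hashlib.sha3_256(packed).digest()


class LockSend:
    """Toy model: one table entry per locked transfer, keyed by the hash."""

    def __init__(self) -> None:
        self.lock_send_infos: dict[bytes, int] = {}  # key -> locked amount (assumed)

    def lock(self, sender, recipient, token, unlock_time, amount) -> None:
        key = lock_key(sender, recipient, token, unlock_time)
        self.lock_send_infos[key] = self.lock_send_infos.get(key, 0) + amount

    def unlock(self, sender, recipient, token, unlock_time) -> int:
        # Re-hash the caller's input; a mismatch in any field yields another key.
        key = lock_key(sender, recipient, token, unlock_time)
        return self.lock_send_infos.pop(key, 0)
```

The four parameters never need to be stored: any change to one of them produces a different key, so a lookup with wrong parameters simply finds nothing.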

Cache your storage

If some storage variables are used frequently in your contract, reading and writing them repeatedly will inevitably drive up Gas consumption. Cache such variables in memory instead of touching storage every time.

For example, during the execution of the contract, some variables, such as the optimal buy order id and sell order id in the order book and the number of tokens in the order book, need to be read and modified multiple times. To this end, OneSwap developers defined a Context struct which contains the context during the contract execution process and is always located in the memory, as shown below:

struct Context {
    // this order is a limit order
    bool isLimitOrder;
    // the new order's id, it is only used when a limit order is not fully dealt
    uint32 newOrderID;
    // for buy-order, it's remained money amount; for sell-order, it's remained stock amount
    uint remainAmount;
    // it points to the first order in the opposite order book against current order
    uint32 firstID;
    // it points to the first order in the buy-order book
    uint32 firstBuyID;
    // it points to the first order in the sell-order book
    uint32 firstSellID;
    // the amount goes into the pool, for buy-order, it's money amount; for sell-order, it's stock amount
    uint amountIntoPool;
    // the total dealt money and stock in the order book
    uint dealMoneyInBook;
    uint dealStockInBook;
    // cache these values from storage to memory
    uint reserveMoney;
    uint reserveStock;
    uint bookedMoney;
    uint bookedStock;
    // reserveMoney or reserveStock is changed
    bool reserveChanged;
    // the taker has dealt in the orderbook
    bool hasDealtInOrderBook;
    // the current taker order
    Order order;
    // the following data come from proxy
    uint64 stockUnit;
    uint64 priceMul;
    uint64 priceDiv;
    address stockToken;
    address moneyToken;
    address ones;
    address factory;
}

The Context struct has many fields, and they are read and written many times within a single contract call. At the entry of each external function, these fields are initialized from storage and calldata. The contract then reads and writes them throughout the rest of the execution and, just before execution ends, writes some of these fields back to storage.

From the storage reading at the beginning of the contract execution, to the storage writing in the end, all operations are performed in memory at a low cost.
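A rough Python cost model shows why this pattern pays off for a slot that is touched several times in one call. The storage figures are the ones quoted earlier in this article; the 3 Gas per memory access is an assumption based on the base cost of MLOAD/MSTORE and ignores memory expansion. All names are illustrative.

```python
# Storage figures quoted earlier in this article.
GAS_SLOAD = 800
GAS_SSTORE_FIRST = 5_000   # first change to an existing slot
GAS_SSTORE_AGAIN = 800     # further changes to the same slot
# Assumption: base cost of one memory read or write, ignoring expansion.
GAS_MEMORY_ACCESS = 3


def uncached_cost(reads: int, writes: int) -> int:
    """Every read and write goes straight to storage."""
    cost = reads * GAS_SLOAD
    if writes > 0:
        cost += GAS_SSTORE_FIRST + (writes - 1) * GAS_SSTORE_AGAIN
    return cost


def cached_cost(reads: int, writes: int) -> int:
    """Load once, work in memory, write back once if anything changed."""
    cost = GAS_SLOAD
    cost += (reads + writes) * GAS_MEMORY_ACCESS
    if writes > 0:
        cost += GAS_SSTORE_FIRST
    return cost


# A slot read five times and modified three times during one call.
print(uncached_cost(5, 3))  # 10600
print(cached_cost(5, 3))    # 5824
```

In this model the saving grows with every additional access, since each one costs a few Gas in memory instead of 800 in storage.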

It is worth mentioning that this context design also works around the "stack too deep" error in the Solidity compiler, which occurs when a function has too many local variables. In a contract as complex as OneSwap, you will quickly run out of available local variables. A context struct is a good choice in that case, because it keeps information in memory and saves space on the stack.

Gas Token

Deleting a storage slot refunds Gas (10,000 Gas according to the figures listed earlier). A special ERC20 token called the gas token takes advantage of this. It is minted by creating new storage slots when the Gas price on Ethereum is low; when the Gas price is high, it is burned by deleting those slots to collect the refund. The refunded Gas offsets part of the Gas consumed by the contract call, so the fee is reduced when the Gas price is high.

OneSwap does not directly use Gas Token. However, when placing an order, a Maker needs to pay 20,000 Gas to create a new storage slot to save his order; when trading with the order, a Taker deletes the storage slot and is refunded with 10,000 Gas. This is equivalent to the Maker preparing a Gas Token for the Taker. Therefore, even if the Taker trades with several orders of the Maker in a row, the Gas consumption can be controlled at a low level.
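The economics of a gas token can be sketched with the two figures used in this article: 20,000 Gas to create a slot and 10,000 Gas refunded when it is deleted. The Python model below is a simplification with an illustrative function name; it ignores the overhead of the mint and burn transactions themselves and any limit the protocol places on refunds.

```python
# Figures used in this article; real costs depend on the current EVM rules.
GAS_CREATE_SLOT = 20_000
GAS_DELETE_REFUND = 10_000


def gas_token_profit_gwei(mint_price_gwei: int, burn_price_gwei: int, slots: int = 1) -> int:
    """Fee saved when burning minus fee paid when minting, in Gwei."""
    cost = slots * GAS_CREATE_SLOT * mint_price_gwei
    saving = slots * GAS_DELETE_REFUND * burn_price_gwei
    return saving - cost


# Mint at 20 Gwei, burn at the 505 Gwei seen in the transaction above.
print(gas_token_profit_gwei(20, 505))  # 4650000

# Break-even: the burn price must be twice the mint price.
print(gas_token_profit_gwei(10, 20))   # 0
```

With these figures the token only pays off when the Gas price at burn time is more than double the price at mint time. In OneSwap's case the Maker bears the creation cost, so the Taker receives the refund without having paid for the slot.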

Using Gas to limit storage operations is not a reliable solution

Storage itself is safe and reliable, but logic that depends on the Gas cost of storage operations is not, because the EVM's gas rules can change.

Let’s look at an example in OneSwap:

to.call{value: value, gas: 9000}(new bytes(0))

This code transfers the amount value of ETH to the address to, and allocates only 9,000 Gas to the call. In other words, if the called address is a contract, its logic can consume at most 9,000 Gas; beyond that, an out-of-gas error occurs. The developer's intent is to let the called address read, update, and compute, but not create new storage.

Suppose the Ethereum rules change and modifying storage costs 10,000 Gas. Then 9,000 Gas is no longer enough, and the called address will hit an out-of-gas error. If the calling contract happens to require that this transfer call returns true before continuing, it will stop working.

Such problems and risks require prudent consideration by contract developers based on their own contract logic. It is advised to prepare an upgrade mechanism so the contract can be modified according to changes in the rules of Ethereum.

Summary

Storage operations account for the bulk of the Gas consumed by contract calls. To reduce transaction costs, the most important thing is to arrange and operate the contract's storage properly. Carefully select the data that really needs to be on the chain, then compress it into as few storage slots as possible. If storage variables are accessed frequently in your program, caching them in memory is a good choice. Last but not least, don’t design your program around the concrete Gas cost of individual storage operations, or at least be aware of the risks.


OneSwap

A fully decentralized exchange protocol on Smart Contract, with permission-free token listing and automated market making.