Building a world-class L1 chain

from inception to mainnet

Flavius Burca
#blockchain
#web3

Let's start with a few stats at the time of writing this article, taken from https://explorer.arthera.net:

  • Average TPS in 2024: 250k 
  • Total addresses created: 208.671K
  • Completed transactions: 50.557M
  • Total blocks: 19.481M
  • Total contracts: 1.391K
  • Verified contracts: 239
  • Active validators: 23
  • ERC tokens launched on the chain: 358
  • Number of dApps that deployed on our chain: 134

It all started in 2023 with an idea described in one of my previous articles, Introducing BlazeDAG. After months of research on DAG consensus mechanisms, I finally compiled a paper describing BlazeDAG: the first DAG-based blockchain consensus mechanism to reach 830k TPS with 20 validators distributed over 5 continents.

I immediately started looking for options to implement this over an EVM chain. After researching existing chains, I found a paper about Lachesis, another DAG-based consensus, and the source code for its reference implementation. After digging through the codebase, I decided it was a good starting point: it was a DAG, and it would be easy to integrate with Geth, since a working integration already exists in the Fantom blockchain.

I started pitching the idea to various people (both tech and non-tech) to get support for building BlazeDAG. After a couple of months of asking around, I stumbled upon a team that wanted to launch their own EVM chain, and BlazeDAG seemed to be just what they needed to differentiate themselves from other EVMs. The team seemed serious, with people from around the globe, but they had no technical expertise in building an L1. So I went on. This is where the Arthera journey started, and I was in charge of building the whole blockchain infrastructure.

The team decided to launch a simple EVM chain with a neat feature—gasless transactions. This would allow people and dApp creators to purchase gas passes and transact with no gas fees for some time.

Replacing the Geth IBFT consensus with Lachesis

Starting from Fantom would have been ideal, as it had everything I needed. The problem was Fantom was built on an old version of Geth (1.10.8). So, I went the other way around. I started from the latest pre-POS release of Geth and added Lachesis on top of it. It was daunting, but I had the first version running after a couple of months.

I also added a way to create a new genesis by serializing the blocks, epoch, and EVM state to a single file and loading the chain's initial state from that file.

Adding support for staking and slashing

Lachesis is a POS-based leaderless protocol where block votes are weighted by each validator's stake. But Lachesis needs a way to manage the actual staking process: delegations, rewards, slashing, withdrawals, etc. Where should this be implemented, in native code or in a smart contract? Native code is hard to upgrade: if you need to fix a bug or change a config param, you must convince all validators to upgrade to the new version. This was inappropriate for a fast-paced, dynamic L1 that was still in its infancy, so I decided to write it as a smart contract. And I did. I wrote a beautiful contract with all the features needed to manage validators, staking, unstaking, slashing, delegations, rewards, claiming, re-staking, withdrawals, and foundation-delegated stakes to validators.

Now, the problem was wiring this into Lachesis and Geth. Staking needed to be part of Genesis, where I also added a way to configure the initial set of validators and pre-mined accounts. 

After extensive investigations in Geth and Lachesis, I realized that staking was not the only feature I would need in the future. Gasless passes, gas rebates and various chain configuration parameters could be implemented using a set of smart contracts. 

So, I started implementing the concept of system smart contracts.

System smart contracts

These are special contracts that are deployed and initialized in the genesis. They are called by the chain node from native code, don't consume gas, must be upgradable, and must be under strict access control. Again, more coding and hacking into Geth got me the final result. Right now, this capability is used extensively for many system features.

Adding gas-less transactions on top of Geth

This task is twofold: both EOA and smart contract accounts must support this feature. Regular users (EOAs) must be able to interact with the chain based on the gas pass. This means a user with a zero balance and a valid gas pass should still be able to send transactions. A user with some balance and a valid gas pass should consume the pass first and only then the balance. Things are different for smart contracts: if a contract has a valid gas pass, users must be able to interact with it without paying gas fees.

Geth is strict about gas fees and zero-balance accounts, so making this work was challenging. I ended up hacking the TX pool and reimplementing gas estimation, calculations, and refunds. I wrote all gas pass logic in several smart contracts and heavily used the previously developed system contracts capability to wire these contracts inside Geth. All gas calculations had to consider the logic from the gas pass contracts. Again, after a couple of months, this was live.
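To make the rules above concrete, here is a minimal Python sketch of the "who pays for this transaction" decision. It is only a model of the behavior described in this section, not the actual Geth changes or the gas pass contracts. The names `Account`, `Payer`, `has_valid_pass`, and `resolve_payer` are hypothetical, and checking the contract's pass before the sender's pass is an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Payer(Enum):
    CONTRACT_PASS = "contract_pass"
    SENDER_PASS = "sender_pass"
    SENDER_BALANCE = "sender_balance"


@dataclass
class Account:
    balance: int = 0              # native balance, in the smallest unit
    has_valid_pass: bool = False  # hypothetical flag: a usable gas pass exists


def resolve_payer(sender: Account, fee: int,
                  target_contract: Optional[Account] = None) -> Payer:
    """Decide what covers `fee`; raise if nothing can."""
    # Assumption: a contract's pass covers any user calling that contract.
    if target_contract is not None and target_contract.has_valid_pass:
        return Payer.CONTRACT_PASS
    # The sender's own pass is consumed before the sender's balance,
    # so a zero-balance sender with a pass can still transact.
    if sender.has_valid_pass:
        return Payer.SENDER_PASS
    # No pass anywhere: the standard rule applies and the sender pays.
    if sender.balance >= fee:
        return Payer.SENDER_BALANCE
    raise ValueError("insufficient funds and no valid gas pass")
```

In a real node this decision has to be applied consistently in the TX pool, in gas estimation, and in refunds, which is why all three had to be touched.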

Dummy balances

The gas pass is nice and useful, and dApps started showing a lot of interest in testing it. But there was a problem. Many wallets don't allow zero-balance accounts, so users with a valid gas pass but zero balance couldn't work with the chain.

The only solution I could think of here was to report a dummy balance of 1 AA if the user had a valid gas pass. The idea was well received, and implementing it was not difficult; I had to change the RPC methods slightly.
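As an illustration, the rule can be sketched in a few lines of Python. This is a simplified model, not the node's code: the function names are hypothetical, 1 AA is assumed to have 18 decimals like ETH, and the dummy value is assumed to apply only when the real balance is zero. A balance query such as the standard `eth_getBalance` JSON-RPC method would be a natural place for this kind of adjustment.

```python
# 1 AA in the smallest unit, assuming 18 decimals (an assumption, as on Ethereum).
DUMMY_BALANCE = 10**18


def reported_balance(real_balance: int, has_valid_pass: bool) -> int:
    """Balance to report to wallets under the dummy-balance rule."""
    # Assumption: only zero-balance accounts need the dummy value;
    # accounts that hold funds keep reporting their real balance.
    if real_balance == 0 and has_valid_pass:
        return DUMMY_BALANCE
    return real_balance


def to_rpc_quantity(value: int) -> str:
    """Encode an integer as a JSON-RPC quantity: 0x-prefixed hex, no padding."""
    return hex(value)
```

The dummy value is only a display-level answer; the actual state and the gas accounting stay untouched.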

Creating the wallet and other software pieces

As we were approaching testnet, many things started to pop up: the need for a web wallet where users can manage their gas passes and staking, a place to see validators, an admin console to quickly tune the configuration of system contracts (staking params, network params, etc.), and of course a block explorer. So I went back to coding and came up with a few products.

I chose Angular for all our dApps. People are used to React, but I was very familiar with Angular, which helped me deliver stuff fast. For example, the wallet took me 1 week to develop, and the validators page was live in 2 days.

The Block Explorer

Blockscout was easy to do. It's a fantastic block explorer that competes with Etherscan. Since it was my first deployment and their new front-end had just come out, I contacted their team to ask for directions on a few problems I was having. They responded fast and were very supportive. Kudos to the Blockscout team for that!

The Arthera block explorer is available at https://explorer.arthera.net and https://explorer-test.arthera.net 

Deploying Testnet

It was time to give the new toy to the community to test. Several dApps had already lined up, and I started planning the deployment. My first go-to cloud for getting VMs is Hetzner. So I went on to create an account in the name of Arthera, but to my surprise, it didn't get approved. I wrote to their support and was told that they don't allow any cryptocurrency-related businesses. Disappointing…

Next on the list was OVH. By the way, their user interface is so terrible and slow… but it got the job done. I got several VMs as follows:

  • 5 VMs to act as validator nodes, all of them in Europe
  • 3 VMs to act as RPC nodes, 1 in Canada, 1 in Europe, and 1 in Singapore
  • 1 VM for Blockscout

I chose AWS S3 and Cloudfront to deploy web apps like the wallet, faucet, and validator page. Next, I needed to make users reach the closest RPC node. I used AWS Route53 and Geolocation DNS routing for the 3 RPC nodes. 

I created the genesis file, registered the validators using the node CLI, and fired up all services. The network started to produce blocks!

Performance testing

Now, this was a hassle. I wanted to test the chain's performance to see how much TPS it could handle. Looking for tools on the net, I found one written in Go and another in NodeJS. I got dedicated VMs for them and started to fire transactions to the chain—simple transfer transactions at first and then smart contract calls. 

The first problem was nonce handling. Neither of the two tools did that correctly, and performance tests started to fail randomly with 'nonce too low.' Again, I tried different strategies to manage nonces ahead of time, and after a few days, it started to work.
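For readers hitting the same wall, here is one common way to manage nonces ahead of time, sketched in Python. It is not necessarily the strategy I ended up with, and `NonceManager` and `fetch_pending_nonce` are hypothetical names. The idea is to ask the node for the account's next nonce once (in practice via the standard `eth_getTransactionCount` call with the `pending` tag), then hand out consecutive values locally so concurrent senders never reuse one.

```python
import threading
from typing import Callable, Dict


class NonceManager:
    """Hands out consecutive nonces per address without querying the node each time."""

    def __init__(self, fetch_pending_nonce: Callable[[str], int]):
        # fetch_pending_nonce is supplied by the caller, e.g. a thin wrapper
        # around an eth_getTransactionCount(address, "pending") request.
        self._fetch = fetch_pending_nonce
        self._next: Dict[str, int] = {}
        self._lock = threading.Lock()

    def reserve(self, address: str) -> int:
        """Return the next unused nonce for `address` and advance the counter."""
        with self._lock:
            if address not in self._next:
                self._next[address] = self._fetch(address)
            nonce = self._next[address]
            self._next[address] = nonce + 1
            return nonce

    def resync(self, address: str) -> None:
        """Forget local state so the next reserve() refetches from the node."""
        with self._lock:
            self._next.pop(address, None)
```

Calling `resync` after a rejected transaction (for example on 'nonce too low') lets the load generator recover instead of failing every subsequent send.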

Initial tests were disappointing: 300 TPS. After much debugging and investigation, it turned out that Lachesis was not handling event emissions properly, kernel params for RPC nodes were not tuned, and VMs were not appropriately sized. After one week of hacking, I raised the number to 3800 TPS. This was a satisfying number for now, at least until BlazeDAG arrives.

When users and dApps started using the testnet, it was stable. However, I did have an issue with OVH, which stopped all our services because somebody forgot to pay the bill :)) After settling the outstanding amount quickly, everything started working again. This was a good DR test, and I was delighted that the whole network could recover after such an event. It was also a wake-up call to keep our infrastructure distributed among several cloud providers.

Planning for mainnet

The team set the mainnet launch for 25th December 2023. I started to plan for a proper deployment. Security was paramount, so I sent everything to audit with a professional company: blockchain code, system contracts, and web apps. Several issues were found, and I fixed them. The codebase was ready to go live.

For the infrastructure, I chose three cloud providers: AWS, OVH, and GCP. After careful consideration, I laid out the following architecture:

  • 3 bootnodes in AWS, 3 in GCP, and 2 in OVH
  • 2 RPC nodes in the US, 2 in Europe, and 2 in Singapore. I also installed nginx on each RPC node to immediately fail over requests to the second node in the region and, if that is unavailable too, to nodes in other regions.
  • 2 RPC archive VMs with a lot of storage, load-balanced by DNS round-robin
  • 2 Blockscout VMs, load-balanced by nginx
  • AWS Route53 for DNS, with geolocation routing and weighted balancing between nodes in the same region.
  • AWS S3 and Cloudfront for the wallet and validator apps
  • Netdata for monitoring the infrastructure
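The RPC failover order from the list above can be modeled in a few lines of Python. This is only an illustration of the ordering (local node first, then its regional peer, then other regions), not the nginx configuration itself; the node names and both function names are hypothetical.

```python
from typing import Callable, Dict, List, Optional


def failover_order(local_node: str, local_region: str,
                   nodes_by_region: Dict[str, List[str]]) -> List[str]:
    """Order upstreams: this node, then same-region peers, then other regions."""
    order = [local_node]
    order += [n for n in nodes_by_region.get(local_region, []) if n != local_node]
    for region, nodes in nodes_by_region.items():
        if region != local_region:
            order += nodes
    return order


def first_available(order: List[str], is_up: Callable[[str], bool]) -> Optional[str]:
    """Return the first healthy node in failover order, or None if all are down."""
    for node in order:
        if is_up(node):
            return node
    return None


# Hypothetical node names matching the 2 + 2 + 2 layout.
NODES = {"us": ["us-1", "us-2"], "eu": ["eu-1", "eu-2"], "sg": ["sg-1", "sg-2"]}
```

For example, `failover_order("eu-1", "eu", NODES)` tries `eu-1`, then `eu-2`, and only then the US and Singapore nodes, so a request leaves its region only when both local nodes are down.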

I updated the VMs to the latest stable kernel and upgraded all system packages. I also enabled the firewall to allow only needed ports. I changed the SSH ports, enabled only key authentication, disabled root login, and enforced all kinds of rules. SSH is only allowed from trusted IPs. I configured the bootnodes in the genesis, pre-mined some accounts following tokenomics, and started everything.

The network started to produce blocks. We did not release this to the public yet because I insisted on testing it properly. We invited some dApps to deploy their contracts and some users to make transactions. Everything was working.

Note that even though the mainnet went live, the AA native coin was not yet available. The pre-mined supply of AA was stored in the genesis wallets, and no AA was sent to anyone. Everybody was working with gas passes, and it was beautiful! Users started to create NFTs, ERC-20 tokens, telegram mini-games, etc. The ecosystem was booming without its native coin!

While this was nice, validators who wanted to join the network needed the AA native coin. I couldn't send them any AA, so I had to devise another mechanism allowing Arthera to delegate AA to a validator. The validator would not own the delegated AA; only the rewards would be theirs. I locked withdrawals for all validators, and we started to onboard them. Again, everything went super smoothly.

The mainnet went live on 25 December 2023, and everything worked perfectly. Minimal interventions were required to increase storage capacity for VMs, archive logs, and perform other everyday tasks. 

We had days with 500k actual TPS generated by various dApps, contests, and users that did all kinds of quests, tasks, and other stuff. 

The TGE was on 3rd December 2024 when the native AA coin went into circulation. Again, everything went smoothly.

Final thoughts

Building and launching an L1 chain is a massive task if you are serious about it. While you can always take an existing chain and run your own network, I strongly advise against it. Unless you understand that chain very well, and I mean every line of code from it, you are leaving things to chance and gambling with people's assets.

This post is for anyone who wants to launch their own L1. Even the most battle-tested chains have their issues. Solana, Ethereum, Bitcoin: it doesn't matter. If you have a critical issue regarding consensus, data corruption, EVM, networking, etc., you must know how to fix it, or you and your users can lose a lot!

Now, it's time to start architecting BlazeDAG.