Scaling Chainlink in 2020
Everything within this article is a collection of my own thoughts, research, and findings. It does not reflect any official stance taken by the Chainlink team. If you find any factual errors in this article, please let me know.
A Little Background
2019 was quite a monumental year for Chainlink, to say the least. Throughout the year there were countless integrations of the oracle framework from a wide array of different entities: crypto startups, large enterprises, blockchain/DLT projects, data providers, node operators, ecosystem builders and much more. I don’t blame you if you couldn’t keep up with the flurry of activity that's been taking place! Just one look at the chart below, and it becomes immediately clear that interest in the Chainlink protocol has been growing exponentially since 2017.
The Chainlink mainnet officially went live on May 30th (after three independent security audits) with just three security reviewed node operators (Chainlink, LinkPool, and Fiews). For most of 2019 there was just a single price feed reference contract that aggregated the ETH/USD market price every five minutes (update frequency later adjusted). While the Chainlink network was technically live on the Ethereum mainnet, you could think of it more as an extended testing period, as the Chainlink team carefully analyzed the reliability and security of their sole Aggregator smart contract (also audited) before they began expanding to include a wider array of different oracle networks.
Before you can run, you must first learn how to walk, especially when your value proposition is building new infrastructure that facilitates trust. Similar to raising a child, nurturing the early outgrowths of trust in the network is crucial to forming a foundation that can support long term success.
Over the course of the second half of 2019, the team successfully scaled the ETH/USD price feed from just three initial nodes (and three data sources) to their stated goal of 21 independent, security reviewed, and Sybil resistant nodes (and ten data sources) with zero issues or downtime. This proved that not only does Chainlink have scalable security but it can do so in an elegant manner wherein developers don’t have to redeploy the entire Aggregator contract each time they want to add more nodes (or remove nodes). This flexibility is important as developers can choose and pay for the exact level of decentralization/security required for their smart contract today, while simultaneously having the capability of easily scaling that security up as needed as the value of that contract rises.
Since launch, the Chainlink ecosystem has grown substantially from just three nodes to over 27 security reviewed nodes that are live today feeding off-chain data to the 29 various Chainlink Price Reference Data contracts currently being consumed by several Ethereum mainnet DeFi protocols such as Synthetix, Loopring, Ampleforth, and Aave. This means that Chainlink Oracle Networks are already securing over $150M worth of collateral locked in the above DeFi projects (now over $740M with the addition of Celsius). Oracle security is absolutely paramount for contracts that lock up and direct user funds. Without a secure oracle solution, DeFi protocols would be putting users’ collateral at risk of loss by malicious actors manipulating the oracle system that delivers the data that triggers and ultimately determines the action a smart contract takes. As the saying goes, “garbage in, garbage out.”
There are also 100+ independent community nodes listed on the open marketplace market.link, as well as 21+ Chainlink-connected data sources and 23 Chainlink-powered dApps on the API marketplace honeycomb.market. The pace of growth has been nothing short of mind-blowing and showcases a real demand by smart contracts to be connected to external sources of data (usually market prices) from outside the blockchain (off-chain), securely and reliably delivered to the smart contract by time-tested and security-hardened oracle networks.
LINK Token Usage
While these decentralized oracle networks are vastly more secure and even cheaper than trying to develop and run your own centralized oracle system, the current costs of Chainlink oracle networks are still quite real and primarily fall into two buckets. Firstly, there are the costs associated with paying oracle nodes in the ERC677 token LINK to access their off-chain data/APIs/compute. Secondly, there are the gas costs, which are paid in the native currency of the Ethereum blockchain, ETH, each time an on-chain transaction is created. Specifically, ETH is needed when requesting data and when nodes deliver and aggregate such data.
As to why Chainlink needs its own token, I wrote about it in a thread here and The Crypto Oracle wrote about it here. In short,
- It enables TransferAndCall functionality (allows requesters to send LINK payments and request data from nodes in a single transaction)
- Isolates the security and economics of the overall network from unrelated external factors (incentives are aligned by having active nodes hold a share of the network and prevent security from being fragmented into many disparate networks)
- Forces nodes to have “skin-in-the-game” (LINK will be staked by nodes as collateral which can be slashed for malicious behavior or non-performance)
- Allows the Chainlink team to subsidize early oracle networks to make sure operating a node in the early days of the network is economically feasible (solving the chicken or egg problem of building a new network)
The amount of LINK paid to nodes for their off-chain data depends on the value of the contract being triggered, the use of free/paid APIs, what rates the nodes are willing to accept, and what the requester is willing to pay. Thus, before a network can be established, a price equilibrium needs to be found. The LINK payments need to be high enough to prevent bribery attacks, but low enough to be economically sustainable in the long term. This cost is unique for every integrating contract and oracle network and should be evaluated on a case by case basis.
For example, each node feeding data to the decentralized ETH/USD (21 nodes) and BTC/USD (20 nodes) price reference networks is paid quite handsomely at 0.333333333333333333 LINK per update per data feed, with each feed updating every twenty minutes (or on a 0.5% deviation from the last update). This amounts to ~7 LINK in total to update just one price feed network. This means a node that delivers data to just one of these price networks is currently paid ~8,760 LINK per year ($40k at current prices) and double that if the node is delivering data to both (ETH and BTC) price feed networks. This puts the total LINK cost of securing just one of these price networks at ~180,000 LINK per year (over $800,000 at current prices) and double that for securing both networks. These numbers are calculated using a conservative scenario where feeds only update every twenty minutes and the price never moves more than 0.5% within that time period (which is unlikely, so nodes are actually paid more than this).
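The arithmetic behind these figures can be reproduced with a short sketch. The payment, node count, and interval come from the paragraph above; the function and variable names are only illustrative, and deviation-triggered updates are deliberately ignored, so the results are a floor rather than a forecast.

```python
# Figures taken from the paragraph above; names are illustrative only.
LINK_PER_NODE_PER_UPDATE = 1 / 3   # 0.333... LINK
NODES = 21                         # ETH/USD network
UPDATE_INTERVAL_MINUTES = 20       # time-based updates only, no deviation triggers


def yearly_link_costs(link_per_update, nodes, interval_minutes):
    """Return (LINK per round, LINK per node per year, LINK per network per year)."""
    updates_per_year = (24 * 60 // interval_minutes) * 365
    per_round = link_per_update * nodes
    per_node_year = link_per_update * updates_per_year
    return per_round, per_node_year, per_node_year * nodes


per_round, per_node_year, network_year = yearly_link_costs(
    LINK_PER_NODE_PER_UPDATE, NODES, UPDATE_INTERVAL_MINUTES
)
print(f"{per_round:.0f} LINK per update round")
print(f"{per_node_year:,.0f} LINK per node per year")
print(f"{network_year:,.0f} LINK per network per year")
```

The network total comes out slightly above the rounded ~180,000 LINK quoted in the text, because 8,760 LINK multiplied by 21 nodes is 183,960 LINK.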
The parameters above have since been updated from when I initially wrote this article. As of April, the ETH/USD and BTC/USD feeds (now both 21 nodes each) update at every 1% deviation threshold since the last update and a heartbeat of every two hours. Node payments have also been bumped up to 2 LINK per node per update per data feed. This means node income is now more variable as these networks will trigger updates (and node payments) still incredibly often during times of high volatility, but much less often during times of low volatility.
It should also be noted that, down the line, the amount of tokens paid to each node operator can vary depending on a predetermined function (e.g. first nodes to respond get paid the most), and/or be pegged to a certain dollar amount of LINK tokens (e.g. $0.30 of LINK paid per node for each update) by leveraging a LINK/USD price feed network.
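As a rough illustration of the dollar-pegged option, the LINK amount per update would simply be the dollar target divided by the current LINK/USD price. The function name and the example price below are hypothetical; in practice the price would come from a LINK/USD feed rather than a constant.

```python
def link_payment_for_usd_target(usd_target, link_usd_price):
    """LINK amount worth `usd_target` dollars at the given LINK/USD price."""
    if link_usd_price <= 0:
        raise ValueError("LINK/USD price must be positive")
    return usd_target / link_usd_price


# Hypothetical price, used only to show the calculation.
EXAMPLE_LINK_USD = 4.50
payment = link_payment_for_usd_target(0.30, EXAMPLE_LINK_USD)
print(f"{payment:.4f} LINK per node per update")
```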
These LINK payment expenses to nodes are required to prevent bribery attacks (ETH and BTC price feeds are high value targets) as nodes who try to cheat would immediately get removed from that price feed (and likely all price feeds they currently feed data to, not just the one attacked). This would amount to losing out on all future revenue, which as demonstrated above, is quite a substantial amount. Such an attempted attack would also destroy a node’s reputation making any chance of them earning more money in the Chainlink ecosystem essentially impossible. Proof of the node’s malicious activity is immutable and forever viewable on-chain or on the Chainlink explorer by anyone. If they wanted to rejoin the network and make any revenue, they would have to create a brand new node and build their reputation all over again from scratch.
Potential attackers would have to convince not just one node to take on this monumental risk but at least 50% of the nodes in an oracle network (since the current Aggregator takes the median). This becomes more and more improbable (and expensive) as more nodes are added to a particular network (and as each node is paid more). These nodes would not want to lose out on such a lucrative revenue stream (i.e. payments in LINK and price appreciation of that LINK) for an insignificant bribe, especially in the early days of the network. Many of these node operators are also public-facing businesses that offer other services to their customers (PoS validator pools, NaaS infrastructure, etc). Thus, their reputation determines not only their revenue in the Chainlink network, but also their revenue in all other networks they currently serve. Additionally, node operators could be held legally responsible for their malicious actions, as oracle manipulation is already illegal in traditional financial markets.
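Why the median makes minority collusion ineffective can be seen in a small simulation. This is a simplified sketch, not the Aggregator contract's logic: it assumes every honest node reports the same value, every colluding node reports the same manipulated value, and the network has an odd number of nodes. The prices are hypothetical.

```python
from statistics import median


def aggregate(honest_value, nodes, colluders, malicious_value):
    """Median of all responses when `colluders` nodes report a manipulated value."""
    responses = [malicious_value] * colluders + [honest_value] * (nodes - colluders)
    return median(responses)


HONEST_PRICE = 200.0     # hypothetical ETH/USD price
MALICIOUS_PRICE = 1.0    # hypothetical manipulated value

for colluders in (5, 10, 11):
    answer = aggregate(HONEST_PRICE, 21, colluders, MALICIOUS_PRICE)
    print(f"{colluders} of 21 nodes colluding -> aggregated answer {answer}")
```

With 21 nodes, ten colluders leave the answer untouched, while eleven are enough to control it.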
The LINK spent on these networks is a big reason why these Chainlink Price Reference Contracts are currently the most secure oracle networks on the market. The cryptoeconomic incentives heavily favor honest nodes as dishonest nodes will lose most, if not all, future revenue in the Chainlink economy. In order to raise the security of an oracle network, the creator/requester could take a combination of multiple paths: they could require each node to stake some amount of LINK as collateral which gets slashed if the node responds with outlier/faulty data or not at all (currently still in development), they could pay the nodes more LINK per request, and/or they could add more nodes to their oracle network to increase both the liveness and security guarantees. All of these methods increase the cost of bribery/collusion attacks, making the oracle network more dependable for the consuming smart contracts.
To sum it up, the LINK paid to node operators in each oracle network directly affects the level of security for such a network. While some of the payment is used by nodes to subsidize the ETH gas costs associated with creating on-chain transactions in response to data requests, most of it is pocketed by the nodes as profit. This profit incentive mechanism ensures nodes act honestly in delivering data and do not succumb to bribes. If you are looking to reduce costs of your oracle network, but keep security at the same level (or higher), reducing the amount of LINK paid to nodes is generally not the way to go.
Sharing the Costs
One way to mitigate these LINK payment costs is to have many projects contribute and support a common oracle network (like the ETH/USD reference feed). In this regard, the total payment to nodes remains the same, but the cost borne by each individual project goes down (while maintaining the same level of oracle security). Each project now only has to contribute a fraction of the overall funds required to run the network securely. These shared networks effectively become a public good available to any crypto project or smart contract developer, as no one project could support such a highly decentralized oracle network that’s sufficiently funded on their own.
If a project wanted to create a proprietary and isolated price feed, specifically to prevent freeloading of the oracle network they support, they could set up and fund their own oracle network using a whitelist-supported Aggregator contract (in development). In such a model, only pre-approved on-chain contracts can query the Aggregator for its latest/historical data. This is possible as Chainlink is a framework for building/connecting to any kind of oracle network. What works for one project may not work for another; this flexibility is key and is one of the killer features of Chainlink.
By designing oracle networks as public goods that are shared by many projects, the per-node LINK payment costs are reduced, as the costs are spread amongst each individual project without lowering the oracle network’s overall security. But this isn’t the only area where efficiencies can be made in regards to reducing the upkeep costs of an oracle network. We also need to examine the ETH used to pay for Ethereum transactions.
ETH Gas Payments
Every time you want to make a transaction on the Ethereum network, you need to pay for the costs associated with how much computational effort miners need to endure in order to process and verify that transaction (this is to protect against DDoS and spam attacks). The more complex a transaction is the more gas that’s required. A simple ETH transfer costs only 21,000 units of gas, while a more complex smart contract operation may cost up to 10,000,000 gas (current gas limit of a single Ethereum block). Every individual transaction has a gas limit, which the sender determines by setting a maximum limit on how much gas a transaction is able to consume.
The user initiating the transaction also chooses the gas price they are willing to pay, denominated in Gwei, for each unit of gas consumed (1 Gwei = 0.000000001 ETH). So while the gas used in a transaction is always fairly static, the gas price you need to pay for each unit of gas can change as the Ethereum network as a whole gets more or less congested (Remember the CryptoKitties craze in late 2017?). If you want your transaction to be processed and confirmed within a timely manner, you will have to pay a higher gas price as miners prioritize transactions with the highest gas price first. Miners can only fit so many transactions in a single block and ultimately want to maximize their profits. Essentially, Ethereum is a decentralized computer with limited bandwidth where users can pay more for better access to quicker bandwidth.
Currently, the Chainlink protocol is the third highest consumer of gas in the Ethereum network (can also be thought of as the third highest consumer of Ethereum blockspace). This is just behind dYdX (a margin trading DEX) and the stablecoin Tether USDT (used for value transfer/arbitrage across cryptocurrency exchanges). While Chainlink having this high level of gas consumption is considered bullish by many (as it shows real network usage), it is also one of the major factors that contribute to the current costs of running and maintaining Chainlink oracle networks (these costs are borne by both requesters and responding node operators). The ultimate goal is to increase usage, while decreasing the amount of gas consumed.
Understanding the basics of Ethereum gas is needed to understand where most of the current costs of maintaining a Chainlink oracle network originate. The gas required for requesting an update of an on-chain Aggregator contract (for, say, a price feed) from an oracle network increases linearly with the number of nodes that need to be pinged for their data. For example, back when the ETH/USD price feed featured only three nodes, an update request transaction cost 378,440 gas (~$2.04 at 20 Gwei gas price), while today with 21 nodes in the network it costs 2,374,048 gas (~$12.82 at 20 Gwei) to initiate an update. This is already 23% of the available bandwidth in a single Ethereum block!
Besides the gas costs associated with requesting an update from an oracle network, there are also the costs each responding node incurs when responding back on-chain with their data. Some nodes may also have to perform and pay for on-chain data aggregation (and update the trusted answer) if the minimum number of responses (a predetermined threshold) has been reached for that particular update round. For the ETH and BTC price feeds, the minimum threshold to start aggregating is after 14 responses (later lowered to 9). This currently costs anywhere from 42,204 gas per response (~$0.23 at 20 Gwei) if aggregation doesn’t need to be performed, all the way up to 144,776 gas per response (~$0.78 at 20 Gwei) if on-chain aggregation needs to be performed. These gas costs scale linearly with how many nodes were pinged for their data (more nodes = more response transactions).
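The dollar figures and the block share quoted above follow from a simple conversion: gas used times gas price gives the fee in ETH, which is then priced in dollars. The sketch below is illustrative only; the ETH price of $270 is an assumption (roughly what the article's dollar amounts imply), and the function names are my own.

```python
GWEI_IN_ETH = 1e-9             # 1 Gwei = 0.000000001 ETH
BLOCK_GAS_LIMIT = 10_000_000   # block gas limit quoted in the article
ASSUMED_ETH_USD = 270.0        # assumption: roughly the price implied by the dollar figures


def tx_cost_usd(gas_used, gas_price_gwei=20, eth_usd=ASSUMED_ETH_USD):
    """Dollar cost of a transaction that consumes `gas_used` units of gas."""
    return gas_used * gas_price_gwei * GWEI_IN_ETH * eth_usd


def block_share(gas_used, block_gas_limit=BLOCK_GAS_LIMIT):
    """Fraction of one block's gas limit taken up by a transaction."""
    return gas_used / block_gas_limit


examples = {
    "request, 3 nodes": 378_440,
    "request, 21 nodes": 2_374_048,
    "response, no aggregation": 42_204,
    "response, with aggregation": 144_776,
}
for label, gas in examples.items():
    print(f"{label:<28} {gas:>9,} gas  ~${tx_cost_usd(gas):.2f}  "
          f"{block_share(gas):.1%} of a block")
```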
500,000 is the default gas limit set per response transaction by Chainlink nodes. This prevents malicious data requesters from having nodes respond to a malicious contract that tries to drain as much of a node’s ETH funds as possible by performing needlessly complex operations, yet still leave enough gas to allow for the aggregation of many data points. This can also save on gas costs as miners are more picky on transactions that have a high gas limit since those take up a larger percentage of any given block. 500,000 is the optimal amount sitting right in that comfortable goldilocks zone.
Due to these gas costs (both request and response transactions), the current Aggregator contracts that lie at the heart of every currently live Chainlink network can only support a maximum of 45 nodes before encountering gas consumption issues. While 45 nodes are still vastly more secure than a centralized oracle network with only a single node, if Chainlink oracle networks are to eventually secure contracts that are valued in the billions, if not more, then these costs will need to come down by multiple orders of magnitude to enable support for more nodes in a single network. Ethereum’s blockspace is not infinite, and if we expect a plethora of heterogeneous oracle networks to be live at any given point in time, then ultimately efficiencies need to be made. This is exactly what the Chainlink team has been working on for quite some time now.
Temporary Cost Saving Measures
A simple way of mitigating some of these LINK payment and ETH gas costs is to change the rate at which oracle networks update. This is a direct trade-off of latency vs cost, but could be customized for each oracle network depending on demand. The ETH/USD reference feed launched in May updated every five minutes (288 times a day), but was later reduced in November to every ten minutes (144 times a day). This cut costs in half (half as many LINK payments, half as much gas consumed), but also doubled the latency. Around this time, the BTC/USD reference feed launched and was also set up to update every ten minutes. In February of this year, the BTC/USD and ETH/USD reference feeds were extended again to update every 20 minutes and at every 0.5% deviation from the last update (minimum 72 times a day but could be more depending on volatility).
In April of this year, these same reference feeds were adjusted to update every two hours and at every 1% deviation from the last update. Payments were also bumped up from 0.333333333333333333 to 2 LINK per node per update. As stated before, these oracle networks will update more often during times of high volatility and less often during low volatility.
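The effect of each interval change on the minimum number of updates can be tabulated with a few lines. This sketch only counts time-based updates; deviation-triggered updates come on top and depend on market volatility. The function name is illustrative.

```python
def updates_per_day(interval_minutes):
    """Minimum number of time-based updates in 24 hours."""
    return 24 * 60 // interval_minutes


baseline = updates_per_day(5)  # original five-minute schedule
for minutes in (5, 10, 20, 120):
    count = updates_per_day(minutes)
    print(f"every {minutes:>3} min -> at least {count:>3} updates/day "
          f"({count / baseline:.0%} of the original request count)")
```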
While not the most optimal long-term way to scale, this was a quick and easy way to temporarily cut down on-chain costs and create more economically sustainable oracle networks that are being supported by the Chainlink team. This is just the first of many optimizations rolling out to reduce the overall expenses of oracle networks.
Chainlink Nodes and Gas Prices
Another obvious way to save money on Ethereum transactions is to lower the gas price used. By default Chainlink nodes set the gas price of their response transactions to 20 Gwei and if it doesn’t confirm within 12 blocks (~2.6 minutes as blocks are every ~13 seconds), then the transaction is bumped with a gas price 5 Gwei higher. It repeats this process 10 times or until the transaction confirms. If after 10 gas bumps the transaction has yet to be confirmed, then the node considers the job run a failure (ideally this would never happen). All of the above variables are 100% customizable by the node operator and can be set to any arbitrary value. While most of these variables have to be set before starting the node, the default gas price used for transactions can be updated in real-time as the node is running.
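The default bumping behaviour described above can be laid out as a schedule. This is a sketch of the described defaults only, not the node's actual implementation, and the constant names are illustrative rather than the node's real configuration keys.

```python
DEFAULT_GAS_PRICE_GWEI = 20   # initial gas price
BUMP_GWEI = 5                 # increase applied on every bump
BUMP_THRESHOLD_BLOCKS = 12    # blocks to wait before bumping
MAX_BUMPS = 10                # bumps before the job run is considered failed
BLOCK_TIME_SECONDS = 13       # approximate Ethereum block time


def bump_schedule(start=DEFAULT_GAS_PRICE_GWEI, bump=BUMP_GWEI, max_bumps=MAX_BUMPS):
    """Gas prices tried in order: the initial price followed by each bump."""
    return [start + i * bump for i in range(max_bumps + 1)]


schedule = bump_schedule()
wait_minutes = BUMP_THRESHOLD_BLOCKS * BLOCK_TIME_SECONDS / 60
print("Gas prices tried (Gwei):", schedule)
print(f"Wait between attempts: ~{wait_minutes:.1f} minutes")
print(f"Highest price tried: {schedule[-1]} Gwei")
```

Under these defaults a transaction that never confirms ends up being resubmitted at 70 Gwei before the node gives up.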
While transactions usually confirm quickly with no issue, this method often causes nodes to overpay significantly, as the Ethereum network is *usually* not congested enough to justify paying such a high default gas price. This wastes the node operator’s ETH funds and ultimately digs into their profits. For example, the gas price needed to confirm a transaction within one block may only be four Gwei, but the Chainlink node (by default) still sets the gas price to the value held in its database (initially set at 20 Gwei). This causes the node to pay five times more than it actually needed to.
In order to mitigate this issue, Chainlink integration engineer Thomas Hodges created the Chainlink Gas Price Update Service that node operators can run locally on the same machine as their node. This module automatically fetches the latest gas price from multiple gas reporting APIs (default is EthGasStation, Anyblocks Analytics, and POA network but node operators can choose which APIs they want), and updates the Chainlink node with the max “fast” value of the endpoints every minute.
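The selection rule described above (take the highest "fast" quote across sources) can be sketched as a pure function. This is not the actual Gas Price Update Service code: the source names are placeholders, and skipping sources that failed to answer (represented here as `None`) is an assumption of mine.

```python
def pick_gas_price_gwei(fast_quotes):
    """Return the highest 'fast' quote among the sources that answered.

    `fast_quotes` maps a source name to its quote in Gwei, or to None
    when the source could not be reached (an assumption of this sketch).
    """
    available = [quote for quote in fast_quotes.values() if quote is not None]
    if not available:
        raise RuntimeError("no gas price source responded")
    return max(available)


# Placeholder sources and values, for illustration only.
quotes = {"source_a": 6.0, "source_b": 4.5, "source_c": None}
print(f"Gas price to use: {pick_gas_price_gwei(quotes)} Gwei")
```

Taking the maximum errs on the side of timely confirmation: the node pays the most cautious estimate rather than risk underbidding.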
By leveraging this gas updater module, Chainlink nodes can always know what gas price they need to set to ensure their transaction goes through in a timely manner while not overspending their funds needlessly. The Chainlink team is working on building this feature natively into the Chainlink node itself, designed in such a way that there aren’t any dependencies on external gas estimator APIs. This makes the overall gas price estimation process of each node much more accurate and censorship-resistant. While it’s unlikely that all three default independent gas estimator APIs are DDoSed, hacked, or compromised at the same time, it’s not an impossibility, so having the Chainlink node calculate gas prices internally, using data from its connected Ethereum node, is much safer (multiple Ethereum nodes are connected behind a load-balancer for redundancy).
While this solution lowers some of the costs associated with paying gas for Ethereum transactions, it isn’t enough to scale Chainlink to the security level required by high value smart contracts (ones that would power entire industries: Insurance, Trade Finance, Derivatives, etc). But don’t worry, the Chainlink team isn’t asleep at the wheel, far from it. In fact, right now there are a few major developments in the works that will largely solve these gas consumption problems.
Data Request Optimizations
A significant portion of the gas used in Chainlink networks can be eliminated entirely by never making any request transactions in the first place! This is possible with the new Prepaid Aggregator smart contract the Chainlink team has been developing silently since mainnet launch (note, still in development and no public audit yet). This is a much more flexible and upgraded version of the Aggregator v1 smart contract that is currently in use by all Chainlink networks today (Aggregator v1 is audited and time-tested).
How it works is largely in the name: instead of a requester having to make a specific on-chain transaction each and every time they want a data update, they can prepay ahead of time by loading the contract with enough LINK tokens (how much is up to the requester and depends on a multitude of factors). More LINK tokens can be added to the contract at any time by anyone. The Chainlink nodes in the network then autonomously trigger new update rounds, push their data to the contract, and aggregate the data, all on a predetermined basis with no further input needed from the requester.
These predetermined update parameters can be based on a time schedule (e.g. every 10 minutes), on a deviation threshold (e.g. every 0.5% price deviation since previous update), or a mixture of the two (e.g. every 1% price deviation or 2 hours since last update). This feature is enabled by what the team calls the “Flux Monitor.” Each node would run its own Flux Monitor module (integrated into the Chainlink node natively), so that a new round of updates can be initiated by any node in the network.
All nodes would run their Flux Monitor in the background which would, on a predetermined interval (e.g. every minute), check if a specific threshold/trigger has been met, and if it has, initiate a new round of updates and deliver their data on-chain. Nodes would also continuously monitor the on-chain contract to see if any other node in the network has already initiated a new round of updates, in which case it would fetch the desired off-chain data, feed it into the contract as usual, and perform aggregation if needed.
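The trigger check each node performs on its polling interval can be sketched as follows. This is a minimal illustration of the deviation-or-heartbeat rule described above, not the Flux Monitor's actual code; the function name, parameter names, and the handling of a zero previous answer are assumptions.

```python
def should_start_round(last_answer, current_value, seconds_since_update,
                       deviation_threshold=0.01, heartbeat_seconds=2 * 60 * 60):
    """True when the heartbeat has elapsed or the price has deviated enough."""
    if seconds_since_update >= heartbeat_seconds:
        return True  # time-based trigger
    if last_answer == 0:
        return current_value != 0  # assumption: any move away from zero triggers
    deviation = abs(current_value - last_answer) / abs(last_answer)
    return deviation >= deviation_threshold  # deviation-based trigger


# Hypothetical readings: (on-chain answer, off-chain value, seconds since last update)
for reading in [(200.0, 201.0, 60), (200.0, 203.0, 60), (200.0, 200.0, 7200)]:
    print(reading, "->", should_start_round(*reading))
```

The defaults mirror the 1% deviation and two-hour parameters mentioned for the reference feeds; both would be configured per network.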
The Prepaid Aggregator contract and the Flux Monitor go very much hand in hand, each not being possible without the other. Additionally, this combination greatly increases the censorship resistance of Chainlink oracle networks, as any node in the network can initiate a new round of updates, instead of just the single requester having that capability. DDoSing/censoring the requester would achieve nothing after they've already set up and loaded the contract with LINK. This makes Chainlink oracle networks much more autonomous and fault-tolerant overall.
Trying to perform the same DDoS attack on a distributed/decentralized network of Chainlink nodes becomes increasingly more difficult/impractical the higher the number of nodes in the network (if not being completely impossible due to how nodes are security hardened: e.g. inbound ports closed or IP-locked, denying unauthorized connections, only making external connections as needed). This creates an oracle network that is incredibly resilient to failure and capable of withstanding a much wider array of attacks.
The Prepaid Aggregator also contains many additional quality-of-life upgrades and new security features which greatly benefit both smart contract developers and node operators alike. This includes gas-optimised variable sizes and contract logic, improved readability (and thus auditability), isolated aggregation logic, increased transparency with additional getter methods, optional contract whitelisting for data access control, increased security for LINK withdrawals through key hierarchy, and more.
The Prepaid Aggregator contract will likely end up becoming the de facto standard aggregator contract that will sit at the core of many Chainlink oracle networks, autonomously coordinating node activity and consensus. The contract will eventually become time-tested (and audited, of course) just like the current v1 Aggregator contract is today, and due to the flexible, composable, and upgradable nature of the Chainlink framework, many more aggregation contracts/strategies can be created (by anyone) as a template or as a specialized contract that can meet any project/dApp’s exact aggregation, security, and off-chain data needs.
The Prepaid Aggregator contract will substantially lower gas costs for the requester (from consuming millions of gas per request to literally zero), but the added complexity in the aggregator contract logic will slightly increase the gas costs experienced by Chainlink nodes when they respond with their data on-chain. Due to this, the Prepaid Aggregator can only handle a maximum of 42 nodes (just slightly lower than the 45 nodes maximum of the Aggregator v1 contract in use today).
While 42 independent oracle nodes is still much more decentralized and secure than what a centralized oracle with only a single node can offer, gas costs still prevent Chainlink networks from scaling to the level required by high-value smart contracts, which will want hundreds if not thousands of independent nodes in their oracle network feeding data to and triggering their contract logic. The Chainlink team has long recognized this issue (it is discussed in the whitepaper), and has been working on a key development that will largely alleviate it.
Data Response Optimizations
In Chainlink oracle networks today, each time there is a data update request, every node in the network has to create their own on-chain transaction to contribute their data to the aggregator contract with aggregation being performed on-chain. As demonstrated in the sections above, this can be incredibly inefficient since the gas costs increase linearly with the number of nodes in a network. This method of on-chain aggregation will always have an upper bound on the maximum number of nodes that can be supported in a single oracle network.
To overcome this, Chainlink nodes can instead connect and coordinate off-chain (current plan is to use the libp2p networking stack) and achieve consensus among a majority of the nodes without needing to perform any expensive on-chain transactions to create the final aggregated data point. In order to prove that consensus was achieved off-chain, nodes sign the data and combine their signatures into a single Threshold Signature.
The result is that oracle networks now only have to create a single response transaction to update an on-chain contract with off-chain data. This transaction contains the final aggregated data point and a single threshold signature that proves the data was aggregated and signed by a majority threshold of nodes in the network. The threshold of nodes needed to create a complete threshold signature is determined beforehand and can any fraction of the network (e.g. 2/3, 4/5 of nodes). By having only a majority (not all) of the nodes in the network come to consensus, this avoids any issues related to stubborn/unresponsive nodes holding up progress.
Only a single node in the network needs to create the response transaction (usually the node that completed the threshold signature) leading to huge reduction in the amount of gas consumed in an oracle network overall. Nodes don’t need to stock up on nearly as much ETH as the probability of needing to make an on-chain transaction goes down the higher the number of nodes in a network. Additionally, the cost for verifying a threshold signature on-chain is extremely cheap at only 15,000 gas! This puts the total transaction cost at 36,000 gas (21,000 base gas cost + 15,000 gas for threshold signature logic). This low cost signature verification is achieved by taking advantage of the ECRECOVER trick discovered by Vitalik Buterin.
The most fundamental advantage gained through a threshold signature model is turning a linear cost into a static cost. This breaks the upper bound that was previously limiting how many nodes could operate in a single oracle network. With threshold signatures, any arbitrary amount of nodes can achieve off-chain consensus while only one node needs to push a single on-chain transaction (containing the data point) on the behalf of all nodes. The data point pushed on-chain will only be accepted if the threshold signature is verified to be valid (proves a certain threshold of nodes signed off on the data point) preventing any single point of failure. This means Chainlink networks will scale completely independently of Ethereum’s throughput (or the throughput of any blockchain as Chainlink is blockchain agnostic).
Decentralized oracle networks comprised of hundreds to thousands to virtually any arbitrary number are now not only possible from a technical perspective, but also from an economic perspective as well. As a requester, you can now pay nodes less LINK while those nodes still make the same exact profits (or more) as they no longer have to constantly make expensive on-chain transactions. This drastically decreases the marginal cost of adding new nodes to an oracle network, enabling a diverse and rich ecosystem of heterogeneous oracle networks, all at varying levels of decentralization.
An easy way of thinking about it is that the oracle network is essentially batching all the nodes’ response transactions together into just a single transaction, which is verified by a single threshold signature. This means the on-chain cost of an update to a Chainlink oracle network goes from O(n), with n being the number of nodes, to O(1). An update of an oracle network with 21 nodes (such as the ETH/USD price feed) would go from 21 on-chain response transactions (one from each node) to a single response transaction.
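To make the O(n) versus O(1) difference concrete, here is a minimal Python sketch of how many response transactions, and roughly how much gas, a single update needs under each model. The function and constant names are my own for illustration and are not Chainlink code. The 36,000 gas figure is the one given above, the per-response figure is the average quoted later in this article, and real gas usage will vary from transaction to transaction.

```python
# Illustrative model only: names and structure are hypothetical, not Chainlink code.

AVG_RESPONSE_GAS = 64_684        # average gas per individual node response (from this article)
THRESHOLD_RESPONSE_GAS = 36_000  # 21,000 base cost + 15,000 threshold signature verification


def response_transactions(num_nodes: int, threshold_signatures: bool) -> int:
    """Number of on-chain response transactions needed for one update."""
    if num_nodes < 1:
        raise ValueError("an oracle network needs at least one node")
    # With threshold signatures, one node submits on behalf of all the others.
    return 1 if threshold_signatures else num_nodes


def response_gas(num_nodes: int, threshold_signatures: bool) -> int:
    """Approximate total response gas for one update."""
    per_tx = THRESHOLD_RESPONSE_GAS if threshold_signatures else AVG_RESPONSE_GAS
    return response_transactions(num_nodes, threshold_signatures) * per_tx


if __name__ == "__main__":
    for n in (3, 21, 45):
        print(n, response_gas(n, False), response_gas(n, True))
```

The first gas column grows in step with the number of nodes, while the second stays fixed at 36,000 no matter how large the network gets.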
If it seems like I am repeating myself, I’m just trying to drive the point home: the advantage of threshold signatures cannot be overstated. While 42 nodes are the maximum supported in the prepaid aggregator contract, off-chain aggregation using threshold signatures enables an unbounded number of nodes in a network! Not only can Chainlink oracle networks become massively more decentralized, but they become much cheaper at the same time. Leveraging Chainlink oracle networks will become economically feasible for projects at any stage of development/funding. If you want to understand the more nuanced points of threshold signatures or want to verify the math for yourself, Chainlink researcher Alex Coventry wrote a very informative article that crypto geeks will enjoy.
In late March, Coventry posted an update on the development of Threshold Signatures: the team is moving from Schnorr signatures to BLS signatures, which will greatly reduce complexity and enable much larger signing groups. Off-chain communication complexity drops from cubic in the number of participants, O(n³), to linear, O(n) (assuming the threshold number is roughly a constant proportion of the total participation group). On-chain gas costs for BLS signature verification will be slightly higher than for Schnorr, but the transaction efficiency remains O(1), and costs for participating nodes fall even further because much larger signing groups become possible.
Total Costs Saved
With the above developments, supporting a Chainlink oracle network will become vastly cheaper and more accessible to blockchain developers everywhere. Let’s see how much gas is potentially saved by comparing the cost of running the ETH/USD reference feed today to what it would cost when leveraging some of the features mentioned above. To keep things simple, I will assume each transaction is submitted with a 10 Gwei gas price (which seems to be the current average).
Currently, it costs about 2,374,048 gas ($6.41) to request a price update from a network of 21 oracle nodes. With the Prepaid Aggregator contract, this would cost 0 gas ($0.00)! As stated before, this is because the new aggregator contract requires no request transactions to ever be made, eliminating this cost entirely.
Currently, the cost of all 21 nodes responding with data on-chain is 1,358,377 gas ($3.67), with an average cost of 64,684 gas per node ($0.17). With Threshold Signatures, the responses from all nodes are batched together off-chain, leading to a static cost of only 36,000 gas ($0.09). With 21 nodes, this works out to an average per node cost of just 1,714 gas ($0.004)! This is a cost reduction of 37x.
Putting it all together, the total gas spent on a single update of the ETH/USD price reference feed from 21 nodes would go from 3,732,425 gas ($10.07) to just 36,000 gas ($0.09)! This is a cost reduction of 103x.
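The arithmetic behind these figures can be checked with a few lines of Python. This is just a sketch using the gas numbers quoted above. The variable names are mine, and dollar amounts are left out because they depend on the ETH price at the time.

```python
# Gas figures quoted in this article for one ETH/USD update with 21 nodes.
REQUEST_GAS_TODAY = 2_374_048    # request transactions sent to all nodes
RESPONSE_GAS_TODAY = 1_358_377   # response transactions from all nodes
THRESHOLD_GAS = 21_000 + 15_000  # base transaction cost + threshold signature verification

total_today = REQUEST_GAS_TODAY + RESPONSE_GAS_TODAY
total_future = 0 + THRESHOLD_GAS  # Prepaid Aggregator: no request transactions at all

# Reduction factors, rounded down to whole multiples as in the text.
response_reduction = RESPONSE_GAS_TODAY // THRESHOLD_GAS
total_reduction = total_today // total_future

print(f"total today:         {total_today:,} gas")
print(f"total with upgrades: {total_future:,} gas")
print(f"response reduction:  ~{response_reduction}x")
print(f"total reduction:     ~{total_reduction}x")
```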
If we scale things up to the maximum number of nodes allowed in the Aggregator v1 contract (45 due to gas consumption limitations), the total gas costs of a single update would go from ~10,000,000 gas ($27.00) to again just 36,000 gas ($0.09), leading to an average per node cost of just 800 gas ($0.002)! This is a cost reduction of ~277x.
Thanks to the Prepaid Aggregator and Threshold Signatures, an oracle network made up of 10,000+ nodes would still only cost 36,000 gas ($0.09) per update, which is an average per node cost of at most 3.6 gas (about $0.000009)! The gas cost savings per node increase as the number of nodes in the oracle network increases.
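To see how the per node cost shrinks as a network grows, the following sketch spreads the static 36,000 gas across networks of different sizes and converts the total to ETH at the 10 Gwei gas price assumed earlier. The helper names are hypothetical, and the USD figures in the text additionally depend on the ETH price, which this sketch does not assume.

```python
# Hypothetical helpers for illustration; not part of any Chainlink library.

UPDATE_GAS = 36_000           # static cost of one threshold-signed update
GAS_PRICE_GWEI = 10           # gas price assumed in this article
GWEI_PER_ETH = 1_000_000_000  # 1 ETH = 10^9 Gwei


def gas_per_node(num_nodes: int) -> float:
    """Average share of the single update transaction attributed to each node."""
    if num_nodes < 1:
        raise ValueError("an oracle network needs at least one node")
    return UPDATE_GAS / num_nodes


def update_cost_eth() -> float:
    """Cost of one update in ETH, independent of network size."""
    return UPDATE_GAS * GAS_PRICE_GWEI / GWEI_PER_ETH


if __name__ == "__main__":
    for n in (21, 45, 10_000):
        print(f"{n:>6} nodes: {gas_per_node(n):,.1f} gas per node")
    print(f"per update: {update_cost_eth()} ETH")
```

The cost per update never changes, so every node added simply makes each node’s share smaller.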
It’s important to note, however, that now that Threshold Signatures are being implemented with BLS signatures instead of Schnorr signatures, the gas savings will be slightly lower. They will still be substantial, since signing groups can be made much larger with lower off-chain communication latency and bandwidth requirements. BLS based Threshold Signatures similarly offer a static transaction efficiency of O(1) no matter how many nodes are in the signing group.
Remember that these numbers are just what is spent on gas. Since the gas costs experienced by each node are now much lower, nodes can be more competitive and accept a lower LINK payment per data request. At the same time, the overall volume of the Chainlink ecosystem will be increasing, so nodes will make up the difference on volume, if not profit more than before. These cost savings trickle down to the end-users of the consuming contracts/dApps.
So while today Chainlink oracle networks can be quite expensive to maintain, this will not be the case forever. Using a public good model, adjusting the update frequency, using a gas price estimator, the Prepaid Aggregator contract, the Flux Monitor module, and Threshold Signatures are all strategies that the Chainlink team is deploying/developing that will massively alleviate these growing pains.
It should be noted that in this article, I limited the scope to only talking about Chainlink’s own network scalability and not about how Chainlink can scale other Smart Contracts through attested off-chain computation using Arbitrum, Trusted Execution Environments, Zero Knowledge Proofs, Mixicles, and more. I also didn’t mention many of the little improvements that have been made here and there to the Chainlink codebase over time, but instead focused this article on the major network-wide upgrades that will dramatically increase scalability, lower costs, and enable a much more sustainable cryptoeconomic model for both nodes and the data-requesting contracts. This would be a never ending article if I tried to cover everything!
Hope your brain didn’t melt too much from trying to soak it all in, but now you see why I believe 2020 will be the year Chainlink scales.
Follow me on Twitter @ChainLinkGod where I fight the information asymmetry surrounding DeFi, Oracles, and Chainlink.