We have focused on building a non-custodial relayer, Infura Transaction Service (ITX), that takes a pre-signed message (e.g. meta-transaction), packs it into an Ethereum transaction and then gradually bumps the fee until it is mined in the blockchain.
Why do we care about relayers? We expect businesses to emerge that offer reliable transaction delivery as a service. Relayers help alleviate the fundamental problem of competing in the global block-space fee market and they specialise in picking relay strategies depending on the transaction’s priority. For example, the relayer may optimise for quick delivery if the user is swapping tokens on an AMM, whereas it may optimise for a low transaction fee if the user is voting in a DAO.
However, as we have seen with ITX and several other relayers, sending transactions on behalf of others is mostly a solved problem in Ethereum except for a few edge cases such as the msg.sender issue and gas tank management. We’ll cover those problems in a future blog post :)
This post focuses on:
Can we implement relayers in Bitcoin? And what type of problems pop up?
What properties should a relayer achieve?
We focused on the following four properties when building ITX:
- Non-custodial. ITX never has access or control over the user’s primary funds.
- Non-interactive. ITX participates in the global block-space fee market without continuous interaction from the user.
- No initial setup. The user can use the relay service without an initial setup phase.
- Relayer pays. It is the relayer, not the user, who pays for the network fee.
Three of the properties are straightforward. The relayer should never have access to the user’s primary funds, the user should be able to go offline while the relayer is sending their transaction, and ideally there is no initial setup to use the relayer.
The final property, the relayer pays, mostly emerged as a symptom of building on Ethereum as the relayer must front the funds in order to send the user’s transaction.
This brings up the question:
Should the relayer or the customer’s on-chain funds be used to pay for the transaction?
Easy for Ethereum, but not as clear for Bitcoin. The popular narrative for relayers paying the network fee is to work around protocol (or regulatory) issues for the user. The key example is supporting “gasless” ERC20 transfers as the user may lack ETH to pay the network fee, but they can refund the relayer via an ERC20 token. It is arguable that the same use case exists for Tether on Bitcoin, although no ecosystem has evolved for it. Ultimately, we suspect relayers will pay the network fee for the user as long as revenue can be extracted from the service.
One example is a Bitcoin vault. The user has hired a watchtower to monitor their vault and to issue a cancel transaction if it detects malicious behaviour. The user pays the watchtower a subscription and in return the watchtower promises to pay whatever fee is necessary to protect the vault (and the business model assumes only a fraction of all vaults require any intervention). As well, using the relayer’s funds to pay the network fee may be desirable just for implementation simplicity (e.g. attach multiple fee outputs in an ad-hoc manner to radically bump the fee).
Who pays is potentially a trivial change. Regardless, we can support either the customer or the relayer paying for the fee thanks to the UTXO model. So if you truly believe the user should pay for the fee from their own funds, then we can simply omit the relayer’s UTXO as an input to the transaction :)
But before we jump into the solutions, let’s take some time to review some background information about Bitcoin transactions.
Background on Bitcoin Transactions (UTXO)
A Bitcoin transaction has a list of inputs and a list of outputs:
- An output (UTXO) is associated with a number of coins and the claim script that establishes a set of conditions that must be satisfied to spend the coins.
- An input references an unspent output and provides the redeem script which is evidence that satisfies the conditions to spend the coins.
As we can see in Figure 1, the first input of Transaction 2 references the first output of Transaction 1. The other two inputs for Transaction 2 are not shown in the picture as they originate from different transactions. When Transaction 2 is processed by the network, it effectively executes each of the input-output pairs to validate whether 0.5 BTC, 0.8 BTC and 0.3 BTC can be spent by this transaction.
There are three aspects of a Bitcoin transaction that are important for a fee replacement protocol:
- Script flexibility. The input and output of a transaction together create a script. Most scripts require a single signature to authorise the transaction, but a script can include other conditions such as multisig, valid after time T, and revealing the preimage of a hash.
- Inputs redemption. Each input of a transaction can be redeemed by a different party and the inputs are only valid if all parties agree to the transaction template (e.g. if you have 3 inputs for a transaction, then the script for each input must be satisfied before the transaction can be mined).
- SIGHASH rules. A user’s signature to authorise the transaction can cover one or more inputs, and one or more outputs.
What about the transaction fee? Where is that defined? There is no explicit UTXO or field for a transaction fee. It is defined implicitly as:
tx_fee = total_inputs - total_outputs, where tx_fee ≥ 0.
For example, if the inputs collectively represent 1 BTC and the outputs send 0.9 BTC, then the transaction has a fee of 0.1 BTC. To adjust the network fee, the signer must re-define the transaction outputs to change the total number of coins sent (sending less pays a higher fee) and then re-sign the transaction.
Finally, there is no account-system in Bitcoin. Instead, there is a global UTXO set that defines the list of spendable transaction outputs. Each UTXO is associated with a claim script and some bitcoins. Each block removes spent transaction outputs from the set and then appends new spendable outputs to the set.
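To make the arithmetic concrete, here is a minimal sketch of the implicit fee rule and of bumping the fee by shrinking a change output. The helper names `tx_fee` and `bump_fee` and the split of the 0.9 BTC into a payment and a change output are assumptions for illustration; values are in satoshis to avoid floating-point rounding.

```python
SATS_PER_BTC = 100_000_000

def tx_fee(input_values, output_values):
    """Implicit fee: total inputs minus total outputs (all values in satoshis)."""
    fee = sum(input_values) - sum(output_values)
    if fee < 0:
        raise ValueError("outputs exceed inputs; the transaction is invalid")
    return fee

def bump_fee(output_values, change_index, extra_fee):
    """Return new outputs with `extra_fee` deducted from the change output.

    The signer still has to re-sign the transaction afterwards.
    """
    if output_values[change_index] < extra_fee:
        raise ValueError("change output is too small to absorb the fee bump")
    bumped = list(output_values)
    bumped[change_index] -= extra_fee
    return bumped

inputs = [1 * SATS_PER_BTC]
outputs = [70_000_000, 20_000_000]  # hypothetical payment and change, 0.9 BTC in total

original_fee = tx_fee(inputs, outputs)                         # 0.1 BTC
bumped_fee = tx_fee(inputs, bump_fee(outputs, 1, 5_000_000))   # 0.15 BTC
```

The only way to raise the fee is to lower an output (or add an input), which is why every fee change needs a fresh signature from whoever committed to the outputs.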
If you want to learn more about Bitcoin transactions, please check online resources such as the Developer Guide.
Another quirk of the UTXO model is coin management. If a user sends the counter-party 5 bitcoins to the same address, but across 5 separate transactions, then the counter-party will need to manage 5 unspent transaction outputs. This became problematic for Coinbase as they had 265 bitcoin spread across 1.5m UTXOs, and there is an interesting blog post by lopp about the challenges here.
Relayer Fee Outputs
To satisfy the Relayer Pays property, the relayer must include a UTXO in the customer’s transaction. As seen in Figure 2, the relayer’s input has 0.3 BTC and the relayer receives 0.1 BTC, thus they have allocated 0.2 BTC for the transaction network fee. For simplicity, we will call it the relayer fee output.
However there are some subtle gotchas to consider:
UTXO management. The relayer must ensure each fee output has sufficient funds to cover the maximum cost of a transaction. They may need to periodically batch fee outputs that have become insufficient, or split fee outputs that contain too many coins. All management requires on-chain transactions and thus incurs a financial cost to the relayer.
Concurrent relay transactions. If we assume only one fee output is used per pending transaction for the user, then the number of concurrent transactions the relayer can manage at any one time depends on the number of fee outputs available. For example, if the relayer has 10 UTXOs with 1 BTC each, then they can support sending 10 relay transactions at any one time. Unlike an account-based system, it is not necessarily dependent on the total funds held by the relayer. Thus how the relayer’s UTXOs are managed has a direct impact on their quality of service.
Not safe to chain pending transactions. Ideally, the same fee output can be used for multiple pending transactions that are chained together to maximise its utility. However, it is not always safe to chain dependent transactions together for a relay service. This is because the input of a transaction depends on the hash of the previous transaction. Thus changing the fee for a single transaction will invalidate all dependent (and pending) transactions.
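The chaining problem can be illustrated with a toy model. This sketch assumes a made-up JSON serialization and a helper called `toy_txid`; it is not Bitcoin’s wire format, but it mirrors the fact that a transaction identifier is a double SHA-256 hash over the transaction contents, outputs included.

```python
import hashlib
import json

def toy_txid(tx):
    """Double SHA-256 over a toy JSON serialization (NOT Bitcoin's wire format)."""
    raw = json.dumps(tx, sort_keys=True).encode()
    return hashlib.sha256(hashlib.sha256(raw).digest()).hexdigest()

# Pending parent: the second output is the relayer fee output.
parent = {
    "inputs": [{"txid": "aa" * 32, "vout": 0}],
    "outputs": [90_000_000, 9_000_000],
}

# Pending child that spends the parent's fee output by referencing its txid.
child = {
    "inputs": [{"txid": toy_txid(parent), "vout": 1}],
    "outputs": [8_000_000],
}

# Bumping the parent's fee means shrinking one of its outputs...
bumped_parent = dict(parent, outputs=[90_000_000, 8_000_000])

# ...which changes its txid, so the child now points at a transaction
# that will never be mined.
child_still_valid = child["inputs"][0]["txid"] == toy_txid(bumped_parent)
```

In this model `child_still_valid` ends up false: the child, and anything chained after it, would have to be rebuilt and re-signed after every fee bump of the parent.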
Of course, if the customer is paying the network fee, then the relayer does not include an input for the transaction. Depending on the protocol, the customer may create an additional relayer fee output using their own funds. Thus, the customer can allocate funds for the relayer to spend and bump the transaction fee. Any funds left over when the transaction is mined can be given to the relayer as a tip.
Native replace-by-fee flag in a Bitcoin transaction?
BIP125 implemented an opt-in fee replacement flag for transactions such that the fee can be increased at any time while the transaction remains unconfirmed in the memory pool.
The protocol is pretty straightforward:
- Optional: Relayer provides the user with an input that can be used to pay for the network fee.
- User signs several transactions, where each transaction has a larger fee than the previous transaction.
- User sends the relayer all transactions.
- Relayer broadcasts the first (lowest-fee) transaction, and then steadily releases the subsequent transactions to gradually bump the fee.
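The steps above can be sketched from the relayer’s side as follows. This is only an illustration under stated assumptions: the helpers `fee_ladder` and `pick_version` are hypothetical, the ladder is evenly spaced (a real user might prefer geometric steps), and signing and broadcasting are left out entirely.

```python
def fee_ladder(start_fee, max_fee, steps):
    """Evenly spaced fees (satoshis), one per pre-signed version of the transaction."""
    if steps < 2 or max_fee <= start_fee:
        raise ValueError("need at least two steps and max_fee > start_fee")
    step = (max_fee - start_fee) / (steps - 1)
    return [round(start_fee + i * step) for i in range(steps)]

def pick_version(ladder, market_fee):
    """Index of the cheapest pre-signed version whose fee meets `market_fee`.

    Falls back to the highest version if the market has moved past the ladder.
    """
    for index, fee in enumerate(ladder):
        if fee >= market_fee:
            return index
    return len(ladder) - 1

# The user pre-signs five versions and hands all of them to the relayer.
ladder = fee_ladder(start_fee=1_000, max_fee=5_000, steps=5)

# The relayer can only choose among the versions it was given.
chosen = pick_version(ladder, market_fee=2_500)
overpayment = ladder[chosen] - 2_500
```

With these numbers the relayer has to jump from 2,000 to 3,000 satoshis even though 2,500 would do, and it cannot go above 5,000 at all. The granularity is fixed by how many versions the user was willing to sign.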
I’d suspect if a relayer as a service was to emerge, then this is the first and easiest approach to implement for most cases.
It is non-interactive as all transactions are signed up-front, and the relayer is only trusted to forward transactions in a timely manner that helps the user pay the best price at any given time. The downside is that the user must sign several transactions and send them to the relayer. Thus the relayer has no control over the granularity of the fee, which is important to avoid over- or under-bumping the fee.
Interesting quirk. It is recommended that you do not trust a 0-confirmation transaction if it has the RBF flag on. The BIP identifies that if the first transaction has RBF=on, and the replaced transaction has RBF=off, then the transaction is still replaceable. Thus, if you only validate the transaction you receive to check if RBF=off, then you can still be fooled!
Child pays for parent
Figure 2 highlights the child pays for parent primitive. It allows a second transaction to offer a network fee that covers the cost of both pending transactions (e.g. itself and its parent transaction). Assuming miners evaluate chains of pending transactions, then it should entice miners to include both transactions in the same block.
We can leverage child-pays-for-parent by including a relayer fee output in the first pending transaction. As mentioned previously, the coins used to fund this output may originate from the relayer or the customer, but the relayer will have full custody over the fee output. If the first pending transaction gets stuck on the network, then the relayer can create a second transaction that spends the relayer fee output (which acts as an anchor output) to cover the network fee of both transactions.
The protocol is straightforward:
- Optional: Relayer provides the user with an input that can be used to pay for the network fee.
- Customer creates, signs and sends the relayer the transaction. It includes the additional relayer fee output.
- Relayer broadcasts the transaction to the network, monitors the transaction, and only sends a second transaction if the fee needs to be bumped.
- If there is no fee bump and the user’s transaction is mined, then the relayer fee output can be used for the next customer.
The protocol is non-interactive after the initial pending transaction is sent, the relayer never has access to the user’s primary funds, the relayer can pay the network fee for the user, and of course there is no initial setup.
The downside is a larger transaction (dedicated relayer output) and the fee may need to cover both transactions instead of a single transaction.
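To see why the fee may need to cover both transactions, here is a rough sketch of how a relayer might size the child’s fee. It assumes miners rank the parent and child together by their combined fee rate; the function name `cpfp_child_fee` and all sizes and fees below are made-up illustrative values.

```python
import math

def cpfp_child_fee(parent_fee, parent_vsize, child_vsize, target_feerate):
    """Fee (satoshis) the child must pay so the parent+child package reaches
    `target_feerate` (satoshis per virtual byte).

    Simplified: ignores minimum relay fees and any other unconfirmed ancestors.
    """
    package_fee = math.ceil(target_feerate * (parent_vsize + child_vsize))
    return max(0, package_fee - parent_fee)

# A stuck parent paying 1 sat/vB, and a child that should lift the pair to 10 sat/vB.
child_fee = cpfp_child_fee(parent_fee=200, parent_vsize=200,
                           child_vsize=150, target_feerate=10)

# For comparison: what a 200 vB transaction would pay alone at the same rate.
standalone_fee = 10 * 200
```

Here the child pays 3,300 satoshis, whereas a single 200 vB transaction at the same rate would have paid 2,000. The difference is the cost of carrying the extra child transaction.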
Temporary ownership over fee output? If the relayer is spending the user’s funds in the fee output, then it would be nice to return the funds to the user after an expiry time. The go-to approach is to have an if statement in the script such that the relayer can spend the coins before time T, otherwise they belong to the user. Unfortunately due to limitations with Bitcoin script, there is no strict before time T and instead it is implemented such that the user can spend the coins after time T. It is subtle, but it implies both conditions will remain true after time T and thus the relayer is always entitled to spend the coins until the UTXO is spent by the user.
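The asymmetry can be modelled in a few lines. This sketch assumes a script of the common shape shown in the docstring and uses block height as the timelock; the function `can_spend` is a hypothetical model of the spending conditions, not a script interpreter, and it simplifies the exact locktime boundary.

```python
def can_spend(party, height, expiry_height):
    """Simplified model of a fee output locked with a script of the shape:

        OP_IF   <relayer_pubkey> OP_CHECKSIG
        OP_ELSE <T> OP_CHECKLOCKTIMEVERIFY OP_DROP <user_pubkey> OP_CHECKSIG
        OP_ENDIF
    """
    if party == "relayer":
        # Script cannot express "only before T", so this branch never expires.
        return True
    if party == "user":
        # A timelock can only enforce "not before T".
        return height >= expiry_height
    return False

T = 700_000  # hypothetical expiry height

before = {p: can_spend(p, T - 1, T) for p in ("relayer", "user")}
after = {p: can_spend(p, T + 1, T) for p in ("relayer", "user")}
```

Before the expiry only the relayer can spend. After it both parties can, so the user has to actually move the coins to end the relayer’s claim.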
The Magic of SIGHASH
As I have mentioned throughout the post, we must sign the transaction to authorise it. But what does the signer actually sign? This brings us to the subtle SIGHASH rules.
There are situations when a user only wants to sign a combination of inputs and outputs for a transaction, but they do not care about signing the entire transaction. The user can declare a SIGHASH rule that dictates what information in the transaction will be signed by the user.
A full explanation of the SIGHASH rules can be found here, but for our purposes we only care about:
- SIGHASH_SINGLE: The user signs ALL inputs, but only one output.
Figure 3 shows the user signing a transaction template that includes their desired output (e.g. transfer of coins to a merchant) and both inputs. The relayer’s output is NOT signed by the user and for all purposes it is ignored when the user’s input is validated.
The incomplete transaction template is sent to the relayer who is responsible for including the second output and authorising the transaction. The transaction is not valid without the relayer’s signature, so there is no risk that a miner can mine the transaction and steal the relayer’s input funds. As well, the SIGHASH flag can be different for each input. So while the user has used SIGHASH_SINGLE, the relayer can simply use SIGHASH_ALL (default option).
There is a subtle gotcha with the SIGHASH approach.
The SIGHASH_SINGLE implementation has a bug (bitcoin is truly an unfinished frankenstein system) such that the transaction is still valid even if there is no corresponding output for the signed input (i.e., the number of inputs exceeds the number of outputs). Check out the bug report to find out why.
Thus, if we want to support a change output for the user, then the user must include a second dummy input that is also using SIGHASH_SINGLE.
Other than that, the protocol is straightforward:
- Relayer must provide an input for the user to include in the transaction to let the relayer re-sign the transaction for RBF.
- User creates a transaction template with 3 inputs and 2 outputs. The first output sends coins to the destination and the second output refunds the remaining change. The third input is the relayer’s anchor UTXO.
- User provides a signature for each input with the declared flag SIGHASH_SINGLE (Figure 4).
- User sends the relayer the partially signed transaction.
- Relayer creates a third output that refunds the relayer, with the remainder of their input paying the transaction network fee. They sign their input using SIGHASH_ALL and broadcast the transaction to the network.
- To bump the fee, the relayer can simply adjust the third output, re-sign the transaction, and broadcast it.
Of course, the relayer can only bump the fee up to the maximum value of the input they have supplied. But there we have it :)
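The key point is that the user’s signatures do not cover the relayer’s output. The toy model below illustrates this; `toy_sighash_single` is a hypothetical stand-in for the real signature digest (it hashes a JSON blob rather than Bitcoin’s serialization), and the UTXO labels and values are made up.

```python
import hashlib
import json

def toy_sighash_single(tx, input_index):
    """Toy digest for SIGHASH_SINGLE: commits to every input, but only to the
    output at the same index as the input being signed."""
    committed = {
        "inputs": tx["inputs"],
        "index": input_index,
        "output": tx["outputs"][input_index],
    }
    raw = json.dumps(committed, sort_keys=True).encode()
    return hashlib.sha256(raw).hexdigest()

tx = {
    "inputs": ["user_utxo", "user_dummy_utxo", "relayer_utxo"],
    "outputs": [
        {"to": "merchant", "value": 50_000_000},
        {"to": "user_change", "value": 49_000_000},
        {"to": "relayer_refund", "value": 29_000_000},
    ],
}

# What the user's two signatures commit to.
user_digests = [toy_sighash_single(tx, 0), toy_sighash_single(tx, 1)]

# Fee bump: the relayer lowers its own refund (output 3) and re-signs its input.
bumped = dict(tx, outputs=tx["outputs"][:2] + [{"to": "relayer_refund", "value": 25_000_000}])
user_sigs_still_valid = user_digests == [
    toy_sighash_single(bumped, 0),
    toy_sighash_single(bumped, 1),
]

# Tampering with the merchant payment would break the user's first signature.
tampered = dict(tx, outputs=[{"to": "attacker", "value": 50_000_000}] + tx["outputs"][1:])
payment_protected = toy_sighash_single(tampered, 0) != user_digests[0]
```

In this model the relayer can adjust its own output as often as it likes without going back to the user, while the payment and change outputs stay locked in.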
There is a sighash rule called ANYONECANPAY. At first glance after reading this blog post you may think it would allow anyone to pay for the transaction fee. But don’t be fooled! It allows anyone to contribute an input to the transaction as long as it’s sending coins to a desired output. Mike Hearn used the feature to build Lighthouse, a crowdfunding application for Bitcoin.
Can we mix SIGHASH_SINGLE with ANYONECANPAY? Yes! The user will only sign one input and one output of the transaction. But we caution against its use because if input.coins > output.coins, then the miner can ignore the rest of the transaction and simply mine the input-output pair where the fee = input.coins - output.coins. This vulnerability can result in a hefty theft if the user is expecting to return coins via a change output.
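A quick worked example shows how large the loss can be. The helper `strippable_fee` and the amounts are illustrative assumptions only.

```python
def strippable_fee(signed_input_value, signed_output_value):
    """Fee (satoshis) a miner could claim by mining only the signed
    input-output pair and dropping everything else in the transaction."""
    return max(0, signed_input_value - signed_output_value)

user_input = 100_000_000      # 1 BTC
payment = 20_000_000          # the only output the user's signature covers
intended_change = 79_000_000  # NOT covered by the signature

intended_fee = user_input - payment - intended_change
worst_case_fee = strippable_fee(user_input, payment)
```

The user meant to pay a fee of 0.01 BTC, but the signed pair on its own hands the miner 0.8 BTC because the change output is simply left out.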
We have covered three approaches to support third party relayers in Bitcoin and we can briefly compare them based on the initial properties:
- Non-custodial. The relayer never has access to the user’s funds in all solutions. Although, if the user funds the relayer fee output, then the relayer is trusted to pick an appropriate fee.
- Non-interactive. All solutions are non-interactive from the user’s perspective as they can simply provide sufficient information up-front before going offline and letting the relayer take over. Although, the replace-by-fee approach does require a significant setup for the user as they need to pre-sign a list of transactions and send them to the relayer.
- Relayer pays. The relayer can include an additional input in all three approaches and pay for the network fee. However, for the SIGHASH approach, it is not yet clear if the customer can pay for the network fee. It appears the relayer must include an additional input and output pair for the transaction.
- No initial setup. None of the approaches requires an initial setup from the user, which we define as an on-chain transaction before interacting with the relayer.
What issues pop up? The most significant issue is the additional size of the transaction. If the relayer pays for the network fee, then the customer’s transaction must include an additional input. If the child-pays-for-parent approach is used, then the customer may pay for a second transaction (2x the cost). And due to the SIGHASH_SINGLE bug, the user may need to include a dummy input and the relayer’s input.
Replace-by-fee vs SIGHASH. I would argue that the SIGHASH_SINGLE approach is an optimisation of the replace-by-fee approach. Instead of requiring the user to pre-sign a list of transactions, it simply lets the relayer take a single transaction signed by the user and then append a transaction fee onto it. It may be wise to investigate a new sighash rule called SIGHASH_MULTIPLE that would allow a user to sign a set of inputs/outputs and offer greater flexibility for a relayer to attach a network fee to a pre-existing transaction (e.g. think watchtowers!).
Can we mix the approaches? All approaches can include a relayer fee output that is deducted to pay for the network fee. It is possible that the dedicated fee output cannot cover the full cost of the transaction. If that occurs, then the relayer can simply perform child-pays-for-parent and include two inputs in the child transaction. The first input spends the customer’s transaction via the relayer fee output, and the second input is another UTXO controlled by the relayer to include more funds in the transaction. So it is possible for the relayer to include additional UTXOs on-the-fly if the network fees suddenly sky-rocket to unexpected heights.
Acknowledgements. Thanks to Sergi Delgado for his feedback and his new obsession with the idea of SIGHASH_MULTIPLE.