By Mike Reinhart, Sr Program Manager
Polkadot is an intricate network which incorporates lessons learned from prior protocols. It is undeniably well thought out, teetering between elegance and complexity depending on your perspective.
Slashing is an excellent example of this. Slashing is the means by which misbehaving validators and their nominators are penalized. There are really only two things validators can get slashed for in Polkadot: being unavailable or acting against consensus. What lies beneath are complex mechanisms for determining availability, increased penalties for incidents that seem like coordinated attacks, and levers for mitigating penalties in cases where we’ve moved too fast and accidents happened.
This post covers the mechanisms of how slashing in Polkadot works, the various ways a validator can be slashed, and the steps an honest validator can take to avoid being slashed. We put this together because, as Polkadot validators ourselves, we felt the need to take a close look at this one facet of Polkadot and felt others would benefit from having it summarized. We encourage community members, especially other validators, to review this information, provide feedback, and incorporate this knowledge into their processes for the betterment of the Polkadot and Kusama ecosystems.
Before getting into particulars, it’s important to understand some overarching philosophy and mechanisms that apply to all forms of slashing.
Philosophically, protocols can shape slashing conditions around one of two questions:
- What is the impact on the network?
- What did the validator do?
Polkadot slashing is oriented around the first question: it looks at validator actions through the lens of the threat they represent to the protocol. For that reason, slashing penalties escalate when more validators commit the same offense.
In contrast, it isn’t uncommon for a protocol to instead view each offense in isolation. The tradeoff is that a validator is more likely to be penalized for accidental actions, like temporarily being offline, or double-signing because of an overly aggressive redundancy setup. Both approaches encourage validators to maintain a high level of operational excellence, as well they should!
Slashing penalties are applied to the offending validator and their nominators equally. For instance, if a slashing penalty is 1%, then the validator and all of their nominators will each be slashed 1%. Nominators should be diligent when selecting a validator for this reason! We recommend reading Polkadot’s blog post on Nominating and Validator Selection.
Slashing penalties are not withdrawn immediately. After committing an offense, the validator enters the ‘unapplied slash’ state, which lasts 28 days on Polkadot and 7 days on Kusama. Note that this lines up well with unbonding periods, so validators and nominators cannot unbond their stake in hopes of withdrawing it before a penalty is applied.
The ‘unapplied slash’ state is an opportunity for governance to intervene. If the community feels the validator should not be penalized, perhaps because the offense was due to a bug outside their control, they can vote to negate the penalty.
Validators are also ‘chilled’ by the protocol when they commit a slashable offense. This removes them from the active set and disqualifies them from re-election in the next round. If the slashed amount is non-zero, chilling also removes all of the validator’s nominations. Even if governance reverses a slashing penalty, the validator’s nominations are still lost, so it is important to not be overly reliant on governance to amend accidental incidents.
Additional points worth noting on slashing mechanics:
- Slashed DOT goes to the Treasury. This makes governance reversions easy and helps fund worthy causes. When liveliness slashing occurs, 100% of the slashed DOT goes to the Treasury. When equivocation/double-signing occurs, 90% of the slashed DOT goes to the Treasury and 10% goes to the validator that reported the offense.
- Because slashing penalties are percentage based, validators with more stake are slashed more DOT/KSM.
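As a rough illustration of the mechanics above, the sketch below applies a percentage slash equally to a validator and their nominators, then splits the slashed total between the Treasury and the reporter. The function names and the example stakes are hypothetical; only the 100% and 90%/10% splits come from this post.

```python
from fractions import Fraction

def apply_slash(stakes, slash_fraction):
    """Amount slashed from each staker; validator and nominators share one rate."""
    return {who: stake * slash_fraction for who, stake in stakes.items()}

def split_slashed(total, offence):
    """Split slashed funds between the Treasury and the reporting validator.

    Ratios follow the post: liveliness -> 100% Treasury,
    equivocation -> 90% Treasury and 10% reporter.
    """
    if offence == "liveliness":
        reporter_share = Fraction(0)
    elif offence == "equivocation":
        reporter_share = Fraction(1, 10)
    else:
        raise ValueError(f"unknown offence: {offence}")
    reporter = total * reporter_share
    return {"treasury": total - reporter, "reporter": reporter}

# Hypothetical stakes in DOT (illustrative values only).
stakes = {
    "validator": Fraction(10_000),
    "nominator_a": Fraction(50_000),
    "nominator_b": Fraction(500),
}
slashed = apply_slash(stakes, Fraction(1, 100))  # a 1% slash
total = sum(slashed.values())
print({who: float(amount) for who, amount in slashed.items()})
print({who: float(amount) for who, amount in split_slashed(total, "equivocation").items()})
```

Because the rate is shared, the nominator with 50,000 DOT loses far more in absolute terms than the one with 500 DOT, which is the point of the second bullet.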
“Your best ability is availability.”
Validators can only produce and finalize blocks if they are online and available. If one or two are unresponsive the impact is negligible, but at scale this can significantly slow or halt the network.
A validator is considered unavailable if they do not author a block or sign a heartbeat ‘I’m Online’ message within a 4 hour session (1 hour on Kusama; see here for more on Polkadot’s parameters). Signing a block is the primary signal that a validator is online, whereas the heartbeat acts as a failsafe mechanism, as not every validator will be assigned to produce a block within a session. If the validator hasn’t produced a block within the session yet, it will automatically submit a heartbeat at a randomized point between 25% and 80% of session completion [1], beginning with the heartbeatAfter block [2]. If the validator has not submitted a heartbeat after 80% session completion, something has probably gone wrong with the node and the node operator should be alerted. It is possible to submit a heartbeat manually, but the automated process is robust and reliable.
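The 80% rule of thumb above could be turned into a simple monitoring check. The sketch below is a minimal, hypothetical helper rather than anything provided by the node software: the function names are invented for illustration, and the 6 second block time is an assumption used to convert a 4 hour session into a block count.

```python
SESSION_HOURS = 4        # Polkadot session length, from the post
BLOCK_TIME_SECONDS = 6   # assumption: target block time
SESSION_BLOCKS = SESSION_HOURS * 3600 // BLOCK_TIME_SECONDS

def heartbeat_window(session_blocks=SESSION_BLOCKS):
    """Block offsets from session start between which the automatic heartbeat is expected."""
    return session_blocks * 25 // 100, session_blocks * 80 // 100

def should_alert(blocks_into_session, authored_block, heartbeat_sent,
                 session_blocks=SESSION_BLOCKS):
    """True if the node is past the 80% mark with no block authored and no heartbeat sent."""
    _, latest = heartbeat_window(session_blocks)
    return blocks_into_session > latest and not (authored_block or heartbeat_sent)

print(heartbeat_window())               # window in blocks for a 4 hour session
print(should_alert(2000, False, False)) # past 80% with no sign of life
```

In practice the three inputs would come from your own node metrics; how you collect them is outside the scope of this sketch.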
If a single validator is offline there’s no obvious reason to suspect malicious intent. The validator is not slashed, which means they also keep all their nominations. They are chilled, however, so an online validator can join the active set. Validator liveliness becomes a concern when a larger number of validators are all offline at the same time because the network may struggle to produce blocks. In practice, the protocol considers it problematic when 10% of the validator set is offline—that’s when slashing conditions kick in.
The penalty starts relatively small: 0.021% is slashed if 10% are offline. Penalties then increase linearly until the maximum penalty of 7% is reached when 44% of the network is offline.
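To make the curve concrete, here is a minimal sketch of the unresponsiveness penalty. It assumes the formula min(3 * (x - (n / 10 + 1)) / n, 1) * 0.07, where x is the number of offline validators and n is the size of the validator set. That formula is not stated in this post; it is an assumption that reproduces the 0.021%, 7%, and 44% figures for a set of 297 validators, and the function name is hypothetical.

```python
from fractions import Fraction

MAX_LIVELINESS_SLASH = Fraction(7, 100)  # the 7% ceiling from the post

def liveliness_slash(offline, validators):
    """Slash fraction for `offline` unresponsive validators out of `validators`.

    Assumed formula: min(3 * (x - (n / 10 + 1)) / n, 1) * 0.07, floored at zero.
    """
    threshold = Fraction(validators, 10) + 1
    scaled = 3 * (offline - threshold) / validators
    return max(Fraction(0), min(scaled, Fraction(1))) * MAX_LIVELINESS_SLASH

print(f"{float(liveliness_slash(30, 297)):.3%}")   # 0.000%
print(f"{float(liveliness_slash(31, 297)):.3%}")   # 0.021%
print(f"{float(liveliness_slash(130, 297)):.3%}")  # 7.000%
```

With 297 validators, 31 offline is just over 10% of the set and 130 offline is roughly 44%, which lines up with the numbers quoted above.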
A Note on Network Stalls
On occasion, we’ve seen Polkadot or Kusama stall for a period of time. If the entire network is stalled, no one is at risk of liveliness slashing at that time. Slashing events are only emitted at the end of a session, so if the network never reaches the end of a session, no one gets slashed.
Once the network begins moving again, there is a slashing risk if some validators come back online while others remain stalled. For instance, if an emergency patch is released and most validators upgrade but some don’t, those that don’t are at the greatest risk of being slashed in the next session if they remain offline (or even in the current session if they never signed a block or submitted a heartbeat).
Manually Chilling Your Validator to Avoid Unresponsiveness
We’ve investigated manually chilling our validator to avoid liveliness slashing and found it is only useful if your validator becomes unresponsive within the final session of an era because manually chilling doesn’t take effect until the next era begins.
Manually chilling your validator is a lot like being chilled by the protocol with some key differences. All nominations are kept and there is no risk of being slashed. It is a great tool for extended downtime, like maintenance windows or vacations where you know you’ll be unavailable should there be an incident.
While slashing for unresponsiveness occurs at the session level, manual chilling only works across eras. For example, say your validator encounters an issue partway through the 2nd session of an era. You’ve signed a block within this session, but you are not sure you’ll be able to fix your issue within the next 4 hour session and you risk being considered offline. Manually chilling at this time will remove you from the active set 4 sessions from now, at the start of the next era. Unfortunately, this timeline doesn’t help you avoid being considered unresponsive by the protocol.
There is one session where this does help, however—the 6th, or last session in an era. Manually chilling in the 6th session will remove you from the active set in the very next session at the start of the next era.
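The timing argument above can be written down as a small helper. This is a sketch under the assumptions in this section (6 sessions per era on Polkadot, and a manual chill that only takes effect when the next era starts); the function names are hypothetical.

```python
SESSIONS_PER_ERA = 6  # Polkadot, per the post

def sessions_until_chill_applies(current_session, sessions_per_era=SESSIONS_PER_ERA):
    """Full sessions you remain in the active set after the current one ends.

    `current_session` is 1-based within the era.
    """
    if not 1 <= current_session <= sessions_per_era:
        raise ValueError("current_session must be within the era")
    return sessions_per_era - current_session

def manual_chill_helps(current_session, sessions_per_era=SESSIONS_PER_ERA):
    """True only when the very next session already belongs to the next era."""
    return sessions_until_chill_applies(current_session, sessions_per_era) == 0

for session in range(1, SESSIONS_PER_ERA + 1):
    print(session, sessions_until_chill_applies(session), manual_chill_helps(session))
```

Chilling in the 2nd session leaves 4 more sessions in which you can still be marked unresponsive, while chilling in the 6th leaves none.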
Equivocation / Double-Signing
The other slashing condition in Polkadot is taking an action that drives the network away from consensus—equivocation. There’s an equivocation penalty for both BABE and GRANDPA and the formula for determining slashing penalties for each is exactly the same, but in practice penalties differ.
In BABE (block production) a validator can be slashed if they produce an invalid conflicting block. A block can be conflicting if it tries to rewrite history (the parent hash in the block header), or has the same height as another proposed block. In non-malicious cases, this happens when a validator copies their entire setup, including the key, into a second instance and runs both at the same time.
If a single validator were to commit a BABE equivocation with the validator set at its current size (297), they could expect a slashing penalty of 0.01%. With an average stake of 2.3M DOT per validator, a 0.01% penalty applied to them and all their nominators is roughly equivalent to 232 DOT.
In theory, if 33% of the validator set committed a BABE equivocation at the same block, they would all lose 100% of their stake. In practice, it isn’t possible for more than a handful of validators to commit simultaneous BABE equivocations because of how block production is assigned.
BABE assigns block production slots to validators, which can result in 0-2 selections per slot. A secondary round-robin style validator selection runs in the background to select a validator for each slot. Under these circumstances, the absolute maximum number of simultaneous BABE equivocations is 3, which, with a validator set of 297, would result in a slashing penalty of 0.092% applied to all three validators and their nominators.
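The percentages quoted for BABE can be reproduced with a short calculation. The sketch below assumes the equivocation formula min((3 * k / n)^2, 1), where k is the number of validators equivocating at the same time and n is the size of the validator set. The formula itself is not spelled out in this post, so treat it as an assumption that happens to match the 0.01%, 0.092%, and 100% figures; the function name is hypothetical.

```python
from fractions import Fraction

def equivocation_slash(offenders, validators):
    """Slash fraction for `offenders` simultaneous equivocations.

    Assumed formula: min((3 * k / n) ** 2, 1).
    """
    return min(Fraction(3 * offenders, validators) ** 2, Fraction(1))

print(f"{float(equivocation_slash(1, 297)):.3%}")   # 0.010%
print(f"{float(equivocation_slash(3, 297)):.3%}")   # 0.092%
print(f"{float(equivocation_slash(99, 297)):.3%}")  # 100.000%
```

The quadratic term is what makes a lone mistake cheap and a coordinated offense by a third of the set (99 of 297) cost the full stake.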
While BABE is responsible for block production, GRANDPA is responsible for block finalization. Equivocation happens when a validator sends pre-vote or pre-commit messages in the same round for two chains that conflict with each other. [3]
Slashing penalties for GRANDPA equivocation follow the same algorithm as BABE. Unlike BABE where only a small number of validators are assigned block production slots, every validator participates equally in GRANDPA. This means far more validators could equivocate simultaneously and, as a result, it is possible to see much greater slashing penalties.
GRANDPA equivocation has the harshest slashing penalty of all offenses, and it also exemplifies slashing based on the threat to the network—if 33% or more of the validator set commits a GRANDPA equivocation, the offenders lose 100% of their stake.
Given that it is highly unlikely a single validator would accidentally commit a GRANDPA equivocation, having 33% of the network simultaneously equivocate would almost certainly be a sign of malicious intent, justifying the complete loss of stake.
Parachains have just launched on Kusama, meaning relay chain validators have the new task of verifying blocks produced by parachains’ collators. Collators are not responsible for ensuring the blocks they produce are valid - that responsibility falls upon the relay chain validators.
If validators disagree about the legitimacy of a parachain block, the Disputes protocol resolves this disagreement and punishes offending validators. Once a dispute is initiated, all validators are expected to participate in it, and it concludes when a ⅔ supermajority of validators reaches the same conclusion regarding the block’s validity. The validators on the losing side of this dispute have objectively behaved maliciously and are slashed 100% of their stake.
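A dispute’s conclusion can be pictured as a simple vote count. The sketch below is illustrative only: it assumes the supermajority means strictly more than two thirds of the validator set, which is a common BFT-style reading but is not confirmed by this post, and the function names are hypothetical rather than part of the Disputes protocol.

```python
def supermajority_threshold(validators):
    """Smallest vote count that is strictly more than two thirds of the set (assumed rule)."""
    return validators * 2 // 3 + 1

def dispute_outcome(valid_votes, invalid_votes, validators):
    """Return 'valid', 'invalid', or None while the dispute is still open."""
    threshold = supermajority_threshold(validators)
    if valid_votes >= threshold:
        return "valid"
    if invalid_votes >= threshold:
        return "invalid"
    return None

print(supermajority_threshold(297))    # votes needed with 297 validators
print(dispute_outcome(150, 100, 297))  # still open
print(dispute_outcome(50, 199, 297))   # block judged invalid
```

Once an outcome is reached, the validators who voted the other way are the losing side described above.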
What can an honest validator do to avoid being slashed?
Equivocations can generally be avoided by sticking to best practices and running up-to-date software. If the Polkadot node client performs as expected and you don’t use the same validator key in two places, there’s almost no chance an honest validator equivocates or backs a bad parablock.
Liveliness slashing is more likely and there’s a fairly large design space for mitigation solutions. We recommend planning ahead for given scenarios so, for instance, you know whether manually chilling is a viable solution before your validator goes offline. You can also take measures to make it easier to completely recover a validator if necessary, such as running a backup node to make node restorations faster.
Above all else, stay informed and pay attention! Make sure you are signed up for email announcements and have joined the right Matrix channels (such as #polkadot-announcements:matrix.parity.io) so you’re notified of network issues as soon as they happen. Together, we can provide a sound foundation upon which the Polkadot ecosystem will grow.
I’d like to thank Will Pankiewicz, Ryan Hendricks, Drew Rothstein, and Elliot Cameron for their input, feedback, and review.
[1] Heartbeat message submission used to be hardcoded to occur at a specific percentage of network completion, but this caused a huge spike in heartbeat messages from validators on Kusama.
[2] Manually submitted heartbeats sent before the heartbeatAfter block will not be accepted by the relay chain (we’ve tried).