How Long Does It Take to Recover a Large Database From Persistence (RDB/AOF)?
Last updated 18 Apr 2024
Question
How long does it take to recover a large (multi-TB) cache from persistence (RDB/AOF) after a major crash (the whole region or a DC is gone)?
Answer
If an entire DC is unavailable, recovering the Redis Enterprise deployment itself (installing and configuring the cluster) is likely only a small part of the overall RTO. Setting that aside, Redis Enterprise is a distributed database, so the aggregate size of the data is not the only relevant factor. Each shard recovers from its own RDB and/or AOF file in parallel, so the largest or slowest shards matter more than the total dataset size. How long that takes depends on several factors:
- Primarily, the data structures in use and the AOF rewrite settings that were configured.
- Whether the cluster uses Auto Tiering. If it does, the shards might be twice the size, which heavily influences the recovery time.
- Whether the shards recover from AOF or RDB. In some cases, restoring from AOF can be significantly slower than restoring from RDB (on the order of hours instead of minutes).
- The performance of the storage devices.
- Whether the volume is locally attached or remote (if it is remote, the network is also a factor).
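As a rough illustration of why the aggregate size alone does not determine the recovery time, the following Python sketch estimates recovery as the time taken by the slowest shard. It is a back-of-the-envelope model, not a Redis Enterprise tool: the `Shard` class, the shard names, the file sizes, and the load rates are all hypothetical values chosen for the example, and the load rate is an assumed single number that folds together storage, network, and data structure effects. It also assumes that shards load fully in parallel and do not compete for the same disk or network link.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Shard:
    """Hypothetical description of one shard's persistence file."""
    name: str
    size_gb: float        # size of the RDB/AOF file to load (illustrative)
    load_mb_per_s: float  # assumed effective load rate (illustrative)


def shard_recovery_seconds(shard: Shard) -> float:
    """Time for one shard to load its file at the assumed rate."""
    return shard.size_gb * 1024 / shard.load_mb_per_s


def cluster_recovery_seconds(shards: List[Shard]) -> float:
    """Shards load in parallel, so the slowest shard bounds the total."""
    return max(shard_recovery_seconds(s) for s in shards)


# Illustrative numbers only, not measurements.
shards = [
    Shard("shard-1", size_gb=25, load_mb_per_s=200),
    Shard("shard-2", size_gb=25, load_mb_per_s=200),
    Shard("shard-3", size_gb=50, load_mb_per_s=100),  # larger file, slower volume
]

estimate = cluster_recovery_seconds(shards)
print(f"Estimated recovery time: {estimate / 60:.1f} minutes")
# prints "Estimated recovery time: 8.5 minutes"
```

In this example the total data is 100 GB, but the estimate is set entirely by shard-3, which is both the largest shard and the one on the slowest volume. Actual recovery times should be measured in a recovery test on representative data and hardware.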
References
- Recover a failed cluster. This article provides instructions for recovering a cluster.