Backup and Restore Procedures (Nodes)

Abstract

This section gives a description of the backup and restore procedures for nodes.

Motivation

Due to the decentralised architecture of a blockchain network, the data stored on a node is basically secured and not lost in case of a problem. Nonetheless, it can be useful to backup the node data and protect it against misuse.

Elaboration

Since the blockchain network guarantees by design the replication of data, there is no need to backup the data that is stored on a node. In case of data loss, the node will synchronize all the blocks that are missing from the peer nodes to which it has established a connection.

However, it is critical to save pieces of data related to the identity of the operating node. This ensures that, in case of data loss, the node will be again running under the same identity, which is important to the node continuity, its recognition in the network, but also essential to the interaction with smart contracts, which otherwise may not grant it access.

This backup procedure includes the following data:

node id/enode
node keys/keystores
related passwords (if at all saved)

Furthermore, it is recommended to save all configurations that contribute to the running of the operating node. Such data includes:

peer connectivity
opened ports
mining configuration
internal paths

For some blockchain technologies changes to the genesis block may need to be backed up.

In cases where the ledger data needs to be secured by the network leader (e.g. Iron Throne in the Bloxberg case), then the following aspects should be observed when backing up and restoring data:

Saving the node configuration
Backup of the node data
Time interval for the backup
Automation of the backup routine
Specify recovery routine
Node reintegration procedure
Routine for corrupt data

Drawback: Due to the decentralised architecture, a backup should actually no longer be necessary and thus causes additional effort.

Unresolved issued: A standard implementation of backup and distributed data that automatically reconciles and restores the node configuration. Where no manual intervention is required.

References to best practices, examples

A possible implementation can be found at Hyperledger Besu:

Backups In a decentralized blockchain, data replicates between nodes so it’s not lost. But backing up configuration and data ensures a smoother recovery from corrupted data or other failures.

Genesis file The genesis file for a network must be accessible on every node. We recommend storing the genesis file under source control.

Data backups If installed locally, the default data location is the Besu installation directory.

We recommend mounting a separate volume to store data. Use the `--data-path` command line option to pass the path to Besu.

The default data location is the Besu installation directory, or `/opt/besu/database` if using the Besu Docker image.

Having some data reduces the time to synchronize a new node. You can perform periodic backups of the data directory and send the data to your preferred backup mechanism. For example, cron job and rsync, archives to the cloud such as s3, or `tar.gz` archives.

Data restores

To restore data: 1. If the node is running, stop the node. 2. If required, move the data directory to another location for analysis. 3. Restore the data from your last known good backup to the same directory. 4. Ensure user permissions are valid so you can read from and write to the data directory. 5. Restart the node.

Corrupted data

If log messages signify a corrupt database, the cleanest way to recover is:

Stop the node.

Restore the data from a previous backup.

Restart the node.

Finding peers after restarting The process for finding peers after restarting is the same as for finding peers after upgrading and restarting.

Bibliography of selected references

Hyperledger Besu Backup and Restore

RFC-0851
Authors: David Maas, Andrei Ionita
Status: work in progress
Last modified: 2021-03-18