Merge pull request #490 from o1-labs/node-op-edits
fix and improve for HF: Node Operator docs editorial grooming (block producers, archive node: overview and redundancy, connecting to devnet)
barriebyron authored Jul 13, 2023
2 parents a585283 + 9b4ca5b commit 871847d
Showing 4 changed files with 347 additions and 171 deletions.
86 changes: 60 additions & 26 deletions docs/node-operators/archive-redundancy.mdx
@@ -1,6 +1,14 @@
---
title: Archive Redundancy
hide_title: true
description: Redundancy techniques to back up and restore Mina archive node block data.
keywords:
- mina historical archive data
- what is an archive node
- archive node redundancy
- mina historical lookup
- archive process
- how do I back up block data
---

:::note
@@ -11,35 +19,41 @@ A new version of Mina Docs is coming soon! This page will be rewritten.

# Archive Redundancy

The [archive node](/node-operators/archive-node) stores its data in a PostgreSQL database that node operators host on a provider of their choice, including self-hosting. For redundancy, archive node data can also be stored in object storage, for example, [Google Cloud Storage](#upload-block-data-to-google-cloud-storage) (soon S3 and others), or in a [`mina.log`](#save-block-data-from-logs) file that resides on your computer or is streamed to any typical logging service, for example, LogDNA.

Archive data is critical for applications that require historical lookup.

On the protocol side, archive data is important for disaster recovery to reconstruct a certain state. A single [archive node setup](/node-operators/archive-node) might not be sufficient.

If the daemon that sends blocks to the archive process fails, or if the archive process itself fails for some reason, blocks can be missing from the database. To minimize the risk of archive data loss, employ redundancy techniques.

A single archive node setup has a daemon sending blocks to an archive process that writes them to the database.

To reduce the dependency on a single daemon to provide blocks to the archive process, connect multiple daemons to the archive process by specifying the archive process address in each daemon.

For example, if the server port of an archive process is 3086, the daemons can connect to that port by using the `--archive-address` flag:

```
mina daemon \
  ..... \
  --archive-address <ip-address>:3086
```

Similarly, it is possible to have multiple archive processes write to the same database. In this case, the PostgreSQL URI passed to the archive process is the same across all archive processes.
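For example, a minimal sketch of two archive processes sharing one database, assuming the `mina-archive run` invocation from the [archive node](/node-operators/archive-node) page (the hosts, ports, and credentials here are placeholders):

```
# First archive process, listening for daemons on port 3086
mina-archive run \
  --postgres-uri postgres://mina:password@localhost:5432/archive \
  --server-port 3086

# Second archive process (on another host or port), same postgres URI
mina-archive run \
  --postgres-uri postgres://mina:password@localhost:5432/archive \
  --server-port 3087
```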

However, multiple archive processes concurrently writing to a database could cause data inconsistencies (explained in [MinaProtocol/mina#7567](https://github.com/MinaProtocol/mina/issues/7567)). To avoid this, set the transaction isolation level of the archive database to `Serializable` with the following query:

```
ALTER DATABASE <DATABASE NAME> SET DEFAULT_TRANSACTION_ISOLATION TO SERIALIZABLE;
```

Set the transaction isolation level after you create the [database](/node-operators/archive-node) and before you connect an archive process to it.
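For example, with `psql`, assuming a database named `archive` and the `postgres` role (adjust both to your setup):

```
psql -h localhost -U postgres -c \
  "ALTER DATABASE archive SET DEFAULT_TRANSACTION_ISOLATION TO SERIALIZABLE;"
```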

## Back up block data

To ensure that archive data can be restored, use the following features to back up and restore block data.

Mina has a mechanism for logging a high-fidelity, machine-readable JSON representation of blocks, including some opaque information deep within.

These logs are used internally to quickly replay blocks to get to certain chain states for debugging. This information suffices to recreate exact states of the network.

Some of the internal data looks like this:

@@ -51,33 +65,51 @@ This JSON will evolve as the format of the block and transaction payloads evolve

### Upload block data to Google Cloud Storage

The daemon generates a file for each block with the name `<network-name>-<protocol-state-hash>.json`. These files are called precomputed blocks and have all the fields of a block.

To configure a daemon to upload block data to Google Cloud Storage, pass the flag `--upload-blocks-to-gcloud`.

Set the following environment variables (a combined example follows this list):

- `GCLOUD_KEYFILE`: Key file for authentication

- `NETWORK_NAME`: Network name to use in the filename to easily distinguish between blocks in different networks (Mainnet and Testnets)

- `GCLOUD_BLOCK_UPLOAD_BUCKET`: Google Cloud Storage bucket where the files are uploaded
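
A minimal sketch that ties the flag and variables together; the key file path, network name, and bucket name are placeholders, not defaults:

```
export GCLOUD_KEYFILE=/path/to/gcloud-keyfile.json   # placeholder path
export NETWORK_NAME=devnet                           # placeholder network name
export GCLOUD_BLOCK_UPLOAD_BUCKET=my-mina-blocks     # hypothetical bucket

mina daemon \
  ..... \
  --upload-blocks-to-gcloud
```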

### Save block data from logs

The daemon logs the block data if the flag `--log-precomputed-blocks` is passed.

Look for the log `Saw block with state hash $state_hash`, which contains `precomputed_block` in the metadata with the block information. These precomputed blocks contain the same information that gets uploaded to Google Cloud Storage.
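
As an illustrative sketch only, assuming the daemon writes JSON log lines to `~/.mina-config/mina.log` and that `jq` is installed (the exact log layout can differ between versions):

```
grep 'Saw block with state hash' ~/.mina-config/mina.log \
  | jq -c '.metadata.precomputed_block' > precomputed-blocks.jsonl
```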

### Generate block data from another archive database

From a fully synced archive database, you can generate block data for each block using the `mina-extract-blocks` tool.

The `mina-extract-blocks` tool generates a file for each block with the name `<protocol-state-hash>.json`.

The tool takes an `--archive-uri`, an `--end-state-hash`, and an optional `--start-state-hash` and writes all the blocks in the chain starting from start-state-hash and ending at end-state-hash (including start and end).

If only the end hash is provided, the tool generates blocks starting with the unparented block closest to the end block. This is the genesis block if there are no missing blocks in between. The blocks in these files are called extensional blocks. Because they are generated from the database, they contain only the data stored in the archive database and not other block information (for example, the blockchain SNARK) that precomputed blocks have, so they can be used only to restore blocks in the archive database.

Alternatively, instead of specifying state hashes, provide the flag `--all-blocks` to write out all blocks contained in the database.
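
For example (the postgres URI and state hashes are placeholders):

```
# Extract all blocks between two state hashes, inclusive
mina-extract-blocks \
  --archive-uri postgres://mina:password@localhost:5432/archive \
  --start-state-hash <start-state-hash> \
  --end-state-hash <end-state-hash>

# Or extract every block in the database
mina-extract-blocks \
  --archive-uri postgres://mina:password@localhost:5432/archive \
  --all-blocks
```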

## Identify missing blocks

To determine any missing blocks in an archive database, use the `mina-missing-block-auditor` tool.

The tool outputs a list of state hashes of all the blocks in the database that are missing a parent. This list can be used to monitor the archive database for any missing blocks. Specify the URI of the PostgreSQL database by using the flag `--archive-uri`.
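
For example, to audit a local archive database (placeholder URI):

```
# Prints the state hashes of blocks whose parent is missing;
# empty output means no gaps were detected
mina-missing-block-auditor \
  --archive-uri postgres://mina:password@localhost:5432/archive
```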

## Restore blocks

When you have block data (precomputed or extensional) available from [Back up block data](/node-operators/archive-redundancy#back-up-block-data), you can restore missing blocks in an archive database using the tool `mina-archive-blocks`.

1. Restore precomputed blocks from:

   - [Upload block data to Google Cloud Storage](/node-operators/archive-redundancy#upload-block-data-to-google-cloud-storage)

- [Save block data from logs](/node-operators/archive-redundancy#save-block-data-from-logs)

```
mina-archive-blocks --precomputed --archive-uri <postgres uri> FILES
```

@@ -91,10 +123,12 @@

## Staking ledgers

Staking ledgers are used to determine slot winners for each epoch. The Mina daemon stores staking ledgers for the current epoch and the next epoch (after it is finalized). When transitioning to a new epoch, the "next" staking ledger from the previous epoch is used to determine slot winners of the new epoch, and a new "next" staking ledger is chosen. Because staking ledgers for older epochs are no longer accessible, you might want to keep them around for reporting or other purposes.

Export these ledgers by using the Mina CLI command:

```
mina ledger export [current-staged-ledger|staking-epoch-ledger|next-epoch-ledger]
```
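
For example, assuming the exported ledger is written to stdout, you can snapshot the next epoch's staking ledger to a dated file (the file name is only a suggestion):

```
mina ledger export next-epoch-ledger > next-epoch-ledger-$(date +%Y-%m-%d).json
```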

Epoch ledger transition happens once every 14 days (given slot-time = 3 minutes and slots-per-epoch = 7140).

The window to back up a staking ledger is ~27 days, considering the "next" staking ledger is finalized after k blocks (currently 290) in the current epoch and is therefore available for the rest of the current epoch and the entire next epoch.
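
As a rough sanity check using only the figures quoted above (the exact window depends on how quickly the first k blocks of the epoch are produced):

```
epoch length  ≈ 7140 slots/epoch × 3 min/slot = 21,420 min ≈ 14.9 days
backup window ≈ (rest of current epoch after the "next" ledger finalizes)
              + (one full epoch)
              ≈ 2 × 14.9 days − (time to produce the first k = 290 blocks)
              ≈ 27 days
```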

1 comment on commit 871847d

@vercel bot commented on 871847d Jul 13, 2023


Successfully deployed to the following URLs:

docs2 – ./

docs2-minadocs.vercel.app
docs.minaprotocol.com
docs2-git-main-minadocs.vercel.app
