Merge pull request #490 from o1-labs/node-op-edits
fix and improve for HF: Node Operator docs editorial grooming (block producers, archive node: overview and redundancy, connecting to devnet)
barriebyron authored Jul 13, 2023
2 parents a585283 + 9b4ca5b commit 871847d
Showing 4 changed files with 347 additions and 171 deletions.
86 changes: 60 additions & 26 deletions docs/node-operators/archive-redundancy.mdx
@@ -1,6 +1,14 @@
---
title: Archive Redundancy
hide_title: true
description: Redundancy techniques to back up and restore Mina archive node block data.
keywords:
- mina historical archive data
- what is an archive node
- archive node redundancy
- mina historical lookup
- archive process
- how do I back up block data
---

:::note
@@ -11,35 +19,41 @@ A new version of Mina Docs is coming soon! This page will be rewritten.

# Archive Redundancy

The [archive node](/node-operators/archive-node) stores its data in a PostgreSQL database that node operators host on a provider of their choice, including self-hosting. For redundancy, archive node data can also be stored in object storage, for example, [Google Cloud Storage](#upload-block-data-to-google-cloud-storage) (soon S3 and others), or in a [`mina.log`](#save-block-data-from-logs) file that resides on your computer or is streamed to any typical logging service, for example, LogDNA.

Archive data is critical for applications that require historical lookup.

On the protocol side, archive data is important for disaster recovery to reconstruct a certain state. A single [archive node setup](/node-operators/archive-node) might not be sufficient.

If the daemon that sends blocks to the archive process fails, or if the archive process itself fails for some reason, blocks can be missing from the database. To minimize the risk of archive data loss, employ redundancy techniques.

A single archive node setup has a daemon sending blocks to an archive process that writes them to the database.

To reduce the dependency on a single daemon to provide blocks to the archive process, connect multiple daemons to the archive process by specifying the archive process address in each daemon.

For example, if the server port of an archive process is 3086, the daemons can connect to that port by using the `--archive-address` flag:

```
mina daemon \
  ..... \
  --archive-address <ip-address>:3086
```

Similarly, it is possible to have multiple archive processes write to the same database. In this case, the PostgreSQL URI passed to the archive process is the same across all archive processes.
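For example, a minimal sketch of two archive processes sharing one database, assuming the `mina-archive run` invocation from the [archive node](/node-operators/archive-node) page (the hosts, ports, and credentials here are placeholders):

```
# First archive process, listening for daemons on port 3086
mina-archive run \
  --postgres-uri postgres://mina:password@localhost:5432/archive \
  --server-port 3086

# Second archive process (on another host or port), same postgres URI
mina-archive run \
  --postgres-uri postgres://mina:password@localhost:5432/archive \
  --server-port 3087
```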

However, multiple archive processes concurrently writing to a database could cause data inconsistencies (explained in [MinaProtocol/mina#7567](https://github.com/MinaProtocol/mina/issues/7567)). To avoid this, set the transaction isolation level of the archive database to `Serializable` with the following query:

```
ALTER DATABASE <DATABASE NAME> SET DEFAULT_TRANSACTION_ISOLATION TO SERIALIZABLE;
```

Set the transaction isolation level after you create the [database](/node-operators/archive-node) and before you connect an archive process to it.
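For example, with `psql`, assuming a database named `archive` and the `postgres` role (adjust both to your setup):

```
psql -h localhost -U postgres -c \
  "ALTER DATABASE archive SET DEFAULT_TRANSACTION_ISOLATION TO SERIALIZABLE;"
```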

## Back up block data

To ensure that archive data can be restored, use the following features to back up and restore block data.

Mina has a mechanism for logging a high-fidelity, machine-readable JSON representation of blocks, including some opaque information deep within.

These logs are used internally to quickly replay blocks to get to certain chain states for debugging. This information suffices to recreate exact states of the network.

Some of the internal data looks like this:

@@ -51,33 +65,51 @@ This JSON will evolve as the format of the block and transaction payloads evolve

### Upload block data to Google Cloud Storage

The daemon generates a file for each block with the name `<network-name>-<protocol-state-hash>.json`. These files are called precomputed blocks and have all the fields of a block.

To configure a daemon to upload block data to Google Cloud Storage, pass the flag `--upload-blocks-to-gcloud`.

Set the following environment variables (a combined example follows this list):

- `GCLOUD_KEYFILE`: Key file for authentication

- `NETWORK_NAME`: Network name to use in the filename to easily distinguish between blocks in different networks (Mainnet and Testnets)

- `GCLOUD_BLOCK_UPLOAD_BUCKET`: Google Cloud Storage bucket where the files are uploaded
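
A minimal sketch that ties the flag and variables together; the key file path, network name, and bucket name are placeholders, not defaults:

```
export GCLOUD_KEYFILE=/path/to/gcloud-keyfile.json   # placeholder path
export NETWORK_NAME=devnet                           # placeholder network name
export GCLOUD_BLOCK_UPLOAD_BUCKET=my-mina-blocks     # hypothetical bucket

mina daemon \
  ..... \
  --upload-blocks-to-gcloud
```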

### Save block data from logs

The daemon logs the block data if the flag `--log-precomputed-blocks` is passed.

Look for the log `Saw block with state hash $state_hash`, which contains `precomputed_block` in the metadata with the block information. These precomputed blocks contain the same information that gets uploaded to Google Cloud Storage.
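
As an illustrative sketch only, assuming the daemon writes JSON log lines to `~/.mina-config/mina.log` and that `jq` is installed (the exact log layout can differ between versions):

```
grep 'Saw block with state hash' ~/.mina-config/mina.log \
  | jq -c '.metadata.precomputed_block' > precomputed-blocks.jsonl
```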

### Generate block data from another archive database

From a fully synced archive database, you can generate block data for each block using the `mina-extract-blocks` tool.

The `mina-extract-blocks` tool generates a file for each block with the name `<protocol-state-hash>.json`.

The tool takes an `--archive-uri`, an `--end-state-hash`, and an optional `--start-state-hash` and writes all the blocks in the chain starting from start-state-hash and ending at end-state-hash (including start and end).

If only the end hash is provided, the tool generates blocks starting with the unparented block closest to the end block. This is the genesis block if there are no missing blocks in between. The blocks in these files are called extensional blocks. Because they are generated from the database, they contain only the data stored in the archive database and not other block information (for example, the blockchain SNARK) that precomputed blocks have, so they can be used only to restore blocks in the archive database.

Alternatively, instead of specifying state hashes, provide the flag `--all-blocks` to write out all blocks contained in the database.
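
For example (the postgres URI and state hashes are placeholders):

```
# Extract all blocks between two state hashes, inclusive
mina-extract-blocks \
  --archive-uri postgres://mina:password@localhost:5432/archive \
  --start-state-hash <start-state-hash> \
  --end-state-hash <end-state-hash>

# Or extract every block in the database
mina-extract-blocks \
  --archive-uri postgres://mina:password@localhost:5432/archive \
  --all-blocks
```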

## Identify missing blocks

To determine any missing blocks in an archive database, use the `mina-missing-block-auditor` tool.

The tool outputs a list of state hashes of all the blocks in the database that are missing a parent. This list can be used to monitor the archive database for any missing blocks. Specify the URI of the PostgreSQL database by using the flag `--archive-uri`.
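
For example, to audit a local archive database (placeholder URI):

```
# Prints the state hashes of blocks whose parent is missing;
# empty output means no gaps were detected
mina-missing-block-auditor \
  --archive-uri postgres://mina:password@localhost:5432/archive
```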

## Restore blocks

When you have block data (precomputed or extensional) available from [Back up block data](/node-operators/archive-redundancy#back-up-block-data), you can restore missing blocks in an archive database using the tool `mina-archive-blocks`.

1. Restore precomputed blocks from:

   - [Upload block data to Google Cloud Storage](/node-operators/archive-redundancy#upload-block-data-to-google-cloud-storage)

- [Save block data from logs](/node-operators/archive-redundancy#save-block-data-from-logs)

```
mina-archive-blocks --precomputed --archive-uri <postgres uri> FILES
```

@@ -91,10 +123,12 @@

## Staking ledgers

Staking ledgers are used to determine slot winners for each epoch. The Mina daemon stores staking ledgers for the current epoch and the next epoch (after it is finalized). When transitioning to a new epoch, the "next" staking ledger from the previous epoch is used to determine slot winners of the new epoch, and a new "next" staking ledger is chosen. Because staking ledgers for older epochs are no longer accessible, you might want to keep them around for reporting or other purposes.

Export these ledgers by using the Mina CLI command:

```
mina ledger export [current-staged-ledger|staking-epoch-ledger|next-epoch-ledger]
```
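
For example, assuming the exported ledger is written to stdout, you can snapshot the next epoch's staking ledger to a dated file (the file name is only a suggestion):

```
mina ledger export next-epoch-ledger > next-epoch-ledger-$(date +%Y-%m-%d).json
```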

Epoch ledger transition happens once every 14 days (given slot-time = 3 minutes and slots-per-epoch = 7140).

The window to back up a staking ledger is ~27 days, considering the "next" staking ledger is finalized after k blocks (currently 290) in the current epoch and is therefore available for the rest of the current epoch and the entire next epoch.
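
As a rough sanity check using only the figures quoted above (the exact window depends on how quickly the first k blocks of the epoch are produced):

```
epoch length  ≈ 7140 slots/epoch × 3 min/slot = 21,420 min ≈ 14.9 days
backup window ≈ (rest of current epoch after the "next" ledger finalizes)
              + (one full epoch)
              ≈ 2 × 14.9 days − (time to produce the first k = 290 blocks)
              ≈ 27 days
```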

1 comment on commit 871847d

@vercel bot commented on 871847d Jul 13, 2023


Successfully deployed to the following URLs:

docs2 – ./

docs2-minadocs.vercel.app
docs.minaprotocol.com
docs2-git-main-minadocs.vercel.app
