Merge pull request #190 from NetApp/add_warm_performance_tier
Add warm performance tier
kcantrel authored Sep 9, 2024
2 parents b3ec0e1 + 05dab67 commit 66433d0
Showing 3 changed files with 64 additions and 26 deletions.
2 changes: 1 addition & 1 deletion Management-Utilities/README.md
@@ -6,7 +6,7 @@ This subfolder contains tools that can help you manage your FSx ONTAP file syste
| [auto_create_sm_relationships](/Management-Utilities/auto_create_sm_relationships) | This tool will automatically create SnapMirror relationships between two FSx ONTAP file systems. |
| [auto_set_fsxn_auto_grow](/Management-Utilities/auto_set_fsxn_auto_grow) | This tool will automatically set the auto size mode of an FSx for ONTAP volume to 'grow'. |
| [fsx-ontap-aws-cli-scripts](/Management-Utilities/fsx-ontap-aws-cli-scripts) | This repository contains a collection of AWS CLI scripts that can help you manage your FSx ONTAP file system. |
-| [fsxn-rotate-secret](/Management-Utilities/fsxn-rotate-secret) | This is a Lambda function to be used with an AWS Secrets Manager secret to rotate the FSx for ONTAP admin password. |
+| [fsxn-rotate-secret](/Management-Utilities/fsxn-rotate-secret) | This is a Lambda function that can be used with an AWS Secrets Manager secret to rotate the FSx for ONTAP admin password. |
| [iscsi-vol-create-and-mount](/Management-Utilities/iscsi-vol-create-and-mount) | This tool will create an iSCSI volume on an FSx ONTAP file system and mount it to an EC2 instance running Windows. |
| [warm_performance_tier](/Management-Utilities/warm_performance_tier) | This tool warms up the performance tier of an FSx ONTAP file system volume. |

72 changes: 53 additions & 19 deletions Management-Utilities/warm_performance_tier/README.md
@@ -2,29 +2,33 @@

## Introduction
This sample provides a script that can be used to warm an FSx for ONTAP
-volume. In other words, it ensures that all the blocks for a volume are in
+volume. In other words, it tries to ensure that all the blocks for a volume are in
the "performance tier" as opposed to the "capacity tier." It does that by
simply reading every byte of every file in the volume. Doing that
causes all blocks that are currently in the capacity tier to be pulled
into the performance tier before being returned to the reader. At that point,
-assuming the tiering policy is not set to 'all', all the data should remain
+assuming the tiering policy is not set to 'all' or 'snapshot-only', all the data should remain
in the performance tier until ONTAP tiers it back based on the volume's
tiering policy.

-Note that Data ONTAP will not store data in the performance
-tier from the capacity tier if it detects that the data is being read
-sequentially. This is to keep things like backups and virus scans from
-filling up the performance tier. Because of that, this script will
-read files in "reverse" order. Meaning it will read the last block of
-the file first, then the second to last block, and so on.
+Note that, by default, Data ONTAP will not store data in the performance
+tier from the capacity tier if it detects that the data is being read sequentially.
+This is to keep things like backups and virus scans from filling up the performance tier.
+You can, and should, override this behavior by setting
+the cloud-retrieval-policy to "on-read" for the volume. Examples of
+how to do that are shown below.

In a further effort to get ONTAP to keep data in the performance tier
after reading it in, this script reads files in "reverse" order, meaning
it reads the last block of the file first, then the second to last block, and so on.
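
For illustration, here is a minimal sketch of that reverse-read idea, simplified
from the `readFile` function in the script shown further down (the real script adds
multi-threading, block-count rounding, and error handling):
```
#!/bin/bash
#
# Read a single file one block at a time, starting with the last block and
# working backwards, so ONTAP does not treat it as a sequential read.
file="$1"
blockSize=$((2*1024*1024))          # 2 MiB, matching the script's block size
fileSize=$(stat -c "%s" "$file")
block=$((fileSize / blockSize))     # index of the last (possibly partial) block
while [ $block -ge 0 ]; do
    dd if="$file" of=/dev/null bs=$blockSize count=1 skip=$block > /dev/null 2>&1
    block=$((block - 1))
done
```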

To speed up the process, the script will spawn multiple threads to process
the volume. It will spawn a separate thread for each directory
in the volume, and then a separate thread for each file in that directory.
The number of directory threads is controlled by the -t option. The number
of reader threads is controlled by the -x option. Note that the script
-will spawn -x reader threads **per** directory thread. So for example, if you have 4
-directory threads and 10 reader threads, you could have up to 40 reader
+will spawn -x reader threads **per** directory thread. So, for example, if you have 2
+directory threads and 5 reader threads, you could have up to 10 reader
threads running at one time.
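
For example, assuming the volume is already NFS mounted at the hypothetical
path /mnt/vol1, the thread counts from the example above would be requested like this:
```
./warm_performance_tier -d /mnt/vol1 -t 2 -x 5
```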

Since the goal of this script is to force all the data that is currently
@@ -34,20 +34,41 @@ You can use the `volume show-footprint` ONTAP command to see how much space
is currently in the capacity tier. You can then use `storage aggregate show`
to see how much space is available in the performance tier.
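
For example, from the ONTAP CLI (with `<vserver>` and `<volume>` as placeholders
for your SVM and volume names):
```
volume show-footprint -vserver <vserver> -volume <volume>
storage aggregate show
```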

Note that it is not uncommon for there to still be data in the
capacity tier after running this script. There can be several reasons
for that. For example:

* Some space is from snapshots that aren't part of the live volume anymore.
* Some space is from blocks that are part of an object in the object store, but aren't
part of the volume. This space will get consolidated eventually.
* Some space is from metadata that is always kept in the capacity tier.

Even with the reasons mentioned above, we have found that running the
script a second time typically gets more data into the performance tier.
So, if you are trying to get as much data as possible into the performance
tier, it is recommended to run the script twice.

## Set Up
-The script is meant to be run on a Linux based host that is able to NFS
The first step is to ensure the volume's tiering policy is set
to something other than "all" or "snapshot-only". You should also ensure
that the cloud-retrieval-policy is set to "on-read". You can make
both of these changes with the following commands:
```
set advanced -c off
volume modify -vserver <vserver> -volume <volume> -tiering-policy auto -cloud-retrieval-policy on-read
```
Where `<vserver>` is the name of the SVM and `<volume>` is the name of the volume.
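
If you want to confirm that the settings took effect, something like the following
should work from the same advanced-privilege session (field names assumed to match
the parameters used above):
```
volume show -vserver <vserver> -volume <volume> -fields tiering-policy,cloud-retrieval-policy
```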

The next step is to copy the script to a Linux based host that is able to NFS
mount the volume to be warmed. If the volume is already mounted, then
-any user that has read access to the files in the volume can run it.
+any user that has read access to all the files in the volume can run it.
Otherwise, the script needs to be run as 'root' so it can mount the
volume before reading the files.

-If the 'root' user can't read the files in the volume, then you should use 'root' user just
+If the 'root' user can't read all the files in the volume, then you should use the 'root' user just
to mount the volume and then run the script from a user ID that can read the contents
of all the files in the volume.
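
For reference, a typical manual mount might look like the following (hypothetical
SVM NFS DNS name, junction path, and mount point; the script can also perform the
mount itself when run as 'root'):
```
sudo mount -t nfs4 <svm_nfs_dns_name>:/<junction_path> /mnt/vol1
```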

-Make sure you have set the tiering policy on the volume set to something
-other than "all" or "snapshot-only", otherwise the script will be ineffective.

## Running The Script
There are two main ways to run the script. The first is to just provide
the script with a directory to start from using the -d option. The script will then read
@@ -66,7 +91,7 @@ To run this script you just need to change the UNIX permissions on
the file to be executable, then run it as a command:
```
chmod +x warm_performance_tier
./warm_performance_tier -d /path/to/mount/point
```
The above example will force the script to read every file in the /path/to/mount/point
directory and any directory under it.
@@ -88,8 +113,8 @@ Where:
-v volume_name - Is the name of the volume.
-n nfs_type - Is the NFS version to use. Default is nfs4.
-d directory - Is the root directory to start the process from.
- -t max_directory_threads - Is the maximum number of threads to use to process directories. The default is 10.
- -x max_read_threads - Is the maximum number of threads to use to read files. The default is 4.
+ -t max_directory_threads - Is the maximum number of threads to use to process directories. The default is 2.
+ -x max_read_threads - Is the maximum number of threads to use to read files. The default is 5.
-V - Enable verbose output. Displays the thread ID, date (in epoch seconds), then the directory or file being processed.
-h - Prints this help information.
@@ -101,6 +126,15 @@ Notes:
reading files.
```

## Finishing Step
After running the script, you should set the cloud-retrieval-policy back to "default" by running
the following commands:
```
set advanced -c off
volume modify -vserver <vserver> -volume <volume> -cloud-retrieval-policy default
```
Where `<vserver>` is the name of the SVM and `<volume>` is the name of the volume.

## Author Information

This repository is maintained by the contributors listed on [GitHub](https://github.com/NetApp/FSx-ONTAP-samples-scripts/graphs/contributors).
16 changes: 10 additions & 6 deletions Management-Utilities/warm_performance_tier/warm_performance_tier
@@ -34,8 +34,8 @@ Where:
-v volume_name - Is the ID of the volume.
-n nfs_type - Is the NFS version to use. Default is nfs4.
-d directory - Is the root directory to start the process from.
- -t max_directory_threads - Is the maximum number of threads to use to process directories. The default is 10.
- -x max_read_threads - Is the maximum number of threads to use to read files. The default is 4.
+ -t max_directory_threads - Is the maximum number of threads to use to process directories. The default is 2.
+ -x max_read_threads - Is the maximum number of threads to use to read files. The default is 5.
-V - Enable verbose output. Displays the thread ID, date (in epoch seconds), then the directory or file being processed.
-h - Prints this help information.
@@ -98,10 +98,14 @@ isMounted () {
################################################################################
readFile () {
local file=$1
-local blockSize=$((4*1024*1024))
+local blockSize=$((2*1024*1024))

fileSize=$(stat -c "%s" "$file")
-fileBlocks=$(($fileSize/$blockSize))
+fileBlocks=$((fileSize/blockSize))
+if [ $((fileSize % blockSize)) -ne 0 -o $fileSize -eq 0 ]; then
+    let fileBlocks+=1
+fi

while [ $fileBlocks -ge 0 ]; do
if dd if="$file" of=/dev/null bs=$blockSize count=1 skip=$fileBlocks > /dev/null 2>&1; then
:
@@ -164,8 +168,8 @@ processDirectory () {
################################################################################
#
# Set some defaults.
-maxDirThreads=4
-maxFileThreads=10
+maxDirThreads=2
+maxFileThreads=5
nfsType=nfs4
verbose=false
#
