Commit

update notebook 6
EC2 Default User committed Sep 30, 2024
1 parent 9c45912 commit ea20517
Showing 1 changed file with 8 additions and 5 deletions.
13 changes: 8 additions & 5 deletions docs/source/api/notebooks/Notebook_6_API2CLI.ipynb
@@ -28,9 +28,9 @@
"\n",
"We can reuse most of the code for the customized `RGAT` module in Notebook 4, i.e., `Ara_GatLayer`, `Ara_GatEncoder`, and `RgatNCModel`, in the training and inference files.\n",
"\n",
"For the training file, we can copy and paste the code of the training pipeline, and enclose it in a `fit()` function. Similarly, for the inference file, we can copy and paste the code of the inference pipeline, and enclose it in an `infer()` function.\n",
"For the training file, we can copy and paste the code of the `4.1 Training pipeline` section in Notebook 4, and enclose it in a `fit()` function. Similarly, for the inference file, we can copy and paste the code of the `4.3 Inference pipeline` section in Notebook 4, and enclose it in an `infer()` function.\n",
"\n",
"We have provided the two files, named `demo_run_train.py` and `demo_run_infer.py` under the [GraphStorm API documentation folder](https://github.com/awslabs/graphstorm/tree/main/docs/source/api). With the two files, we can call GraphStorm's task-agnostic CLI to run our custom model as shown below."
"We have provided the two files, named `demo_run_train.py` and `demo_run_infer.py` under the [GraphStorm API documentation folder](https://github.com/awslabs/graphstorm/tree/main/docs/source/api/notebooks). With the two files, we can call GraphStorm's task-agnostic CLI to run our custom model as shown below."
]
},
{
@@ -130,7 +130,9 @@
"metadata": {},
"source": [
"## GraphStorm `GSConfig` object explanation\n",
"Once we have obtained these arguments, we can use them to create a `GSConfig` object and then pass the object to different modules to get related configurations. The `GSConfig` object checks every argument's format and value to ensure compliance with GraphStorm specifications. The cells below show the code for creating the `GSConfig` object and examples of how to use it. Please refer to the [GSConfig API doc](https://graphstorm.readthedocs.io/en/latest/api/generated/graphstorm.config.GSConfig.html#graphstorm.config.GSConfig) for more details of this class."
"Once we have obtained these arguments, we can use them to create a `GSConfig` object and then pass the object to different modules to get related configurations. The `GSConfig` object checks every argument's format and value to ensure compliance with GraphStorm specifications. The cells below show the code for creating the `GSConfig` object and examples of how to use it to pass configurations. For example, we can pass the IP list file, GraphStorm backend, and the local rank configurations to GraphStorm's distributed context initialization function, `gs.initialize()`, to start the GraphStorm distributed context.\n",
"\n",
"For more details of `GSConfig`, please refer to the [GSConfig API documentation page](https://graphstorm.readthedocs.io/en/latest/api/generated/graphstorm.config.GSConfig.html#graphstorm.config.GSConfig)."
]
},
{
@@ -151,6 +153,7 @@
" # Utilize GraphStorm's GSConfig class to accept arguments\n",
" config = GSConfig(gs_args)\n",
"\n",
" # Initialize distributed training and inference context\n",
" gs.initialize(ip_config=config.ip_config, backend=config.backend, local_rank=config.local_rank)\n",
" acm_data = gs.dataloading.GSgnnData(part_config=config.part_config)\n",
"\n",
@@ -206,7 +209,7 @@
"\n",
"It is easy to modify the commands in the above cell to run them on a [distributed cluster](https://graphstorm.readthedocs.io/en/latest/cli/model-training-inference/distributed/cluster.html). We need to perform three additional operations:\n",
"\n",
"1. Partition the ACM data into multiple partitions, e.g., 2 partitions, and record its JSON file path, e.g., `./acm_gs_2p/acm.json`.\n",
"1. As demonstrated in the [Use Your Own Data tutorial](https://graphstorm.readthedocs.io/en/latest/tutorials/own-data.html#run-graph-construction), partition the ACM data into multiple partitions, e.g., 2 partitions by setting the argument `--num-parts 2`, and record its JSON file path, e.g., `./acm_gs_2p/acm.json`.\n",
"2. Follow the [tutorial of creating a GraphStorm cluster](https://graphstorm.readthedocs.io/en/latest/cli/model-training-inference/distributed/cluster.html#create-a-graphstorm-cluster) to prepare a cluster with 2 machines.\n",
"3. Prepare an IP list file, e.g., `ip_list.txt` on the cluster, and record its file path, e.g., `./ip_list.txt`.\n",
"\n",
@@ -260,7 +263,7 @@
"1. Partition the ACM data into multiple partitions, e.g., 2 partitions, and upload them to an Amazon S3 location, e.g., `s3://<PATH_TO_DATA>/acm_gs_2p`.\n",
"2. Upload the configuration yaml file to an Amazon S3 location, e.g., `s3://<PATH_TO_TRAINING_CONFIG>/acm_nc.yaml`.\n",
"3. Git clone [GraphStorm source code](https://github.com/awslabs/graphstorm), and move the `demo_run_train.py` and `demo_run_infer.py` files from the `graphstorm/docs/source/api/notebooks/` folder to the `graphstorm/python/graphstorm/` folder.\n",
"4. Follow the [Setup GraphStorm SageMaker Docker Image](https://graphstorm.readthedocs.io/en/latest/cli/model-training-inference/distributed/sagemaker.html#step-1-build-a-sagemaker-compatible-docker-image) tutorial to create a docker image. When building the docker image, please make sure the local graphstorm source code has the two Python files moved.\n",
"4. Follow the [Setup GraphStorm SageMaker Docker Image](https://graphstorm.readthedocs.io/en/latest/cli/model-training-inference/distributed/sagemaker.html#step-1-build-a-sagemaker-compatible-docker-image) tutorial to create a docker image.\n",
"\n",
"Then use the following SageMaker CLIs to run the custom model on an Amazon SageMaker cluster. Please refer to the [GraphStorm Model Training and Inference on SageMaker](https://graphstorm.readthedocs.io/en/latest/cli/model-training-inference/distributed/sagemaker.html#) for more details."
]
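
Note on the pattern described in this diff: the notebook text explains that the training pipeline is wrapped in a `fit()` function and that parsed arguments are validated once by a config object before being handed to other modules. The following is a minimal, GraphStorm-free sketch of that shape only. `DemoConfig` is a hypothetical stand-in for `GSConfig`, the flag names and the accepted backend values are illustrative assumptions, and the body of `fit()` is a placeholder, not the code in `demo_run_train.py`.

```python
import argparse


class DemoConfig:
    """Hypothetical stand-in for GSConfig (not the GraphStorm class).

    It validates the parsed arguments once and then exposes them as
    attributes, so downstream code can read config.backend and so on.
    """

    def __init__(self, args):
        # Assumed set of valid backends, for illustration only.
        if args.backend not in ("gloo", "nccl"):
            raise ValueError(f"unsupported backend: {args.backend}")
        if args.local_rank < 0:
            raise ValueError("local_rank must be non-negative")
        self.ip_config = args.ip_config
        self.part_config = args.part_config
        self.backend = args.backend
        self.local_rank = args.local_rank


def build_parser():
    """Build an argument parser; the flag names here are illustrative."""
    parser = argparse.ArgumentParser(description="demo training entry point")
    parser.add_argument("--ip-config", type=str, default=None)
    parser.add_argument("--part-config", type=str, required=True)
    parser.add_argument("--backend", type=str, default="gloo")
    parser.add_argument("--local-rank", type=int, default=0)
    return parser


def fit(config):
    """Placeholder for the training pipeline copied from Notebook 4.

    In the real training file, this is where the distributed context
    would be initialized and the partitioned graph data loaded, using
    values read from the config object.
    """
    return {
        "ip_config": config.ip_config,
        "part_config": config.part_config,
        "backend": config.backend,
        "local_rank": config.local_rank,
    }


def main(argv=None):
    """Parse arguments, validate them via the config, then run fit()."""
    args = build_parser().parse_args(argv)
    return fit(DemoConfig(args))
```

Calling `main(["--part-config", "./acm_gs_2p/acm.json", "--ip-config", "./ip_list.txt"])` exercises the same flow: parse, validate, then pass the single config object into `fit()`. Keeping validation in one place means `fit()` and a corresponding `infer()` can trust the values they read.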
