Data-Mining-HSE/hadoop-spark

Hadoop-Spark

1 DataNode

docker-compose -f docker-compose.yml up -d
docker cp data/diabetes_prediction_dataset.csv namenode:/
docker cp -L src/. spark-master:/opt/bitnami/spark/
docker exec -it namenode bash

hdfs dfs -put diabetes_prediction_dataset.csv /
exit
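
The interactive upload above can also be done without opening a shell inside the container. The following is a minimal sketch, assuming the `namenode` container name and dataset path used above and that `hdfs` is on the container's default PATH. The `run` helper, the `DRY_RUN` variable, and the `upload_dataset` function are hypothetical names introduced here for illustration; they are not part of this repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Put the dataset (already copied to / inside the namenode container)
# into the HDFS root, then list the root to confirm the file is there.
upload_dataset() {
  local file="diabetes_prediction_dataset.csv"
  # -f overwrites an existing copy, so the step can be repeated safely.
  run docker exec namenode hdfs dfs -put -f "/${file}" / &&
    run docker exec namenode hdfs dfs -ls /
}
```

Call `upload_dataset` after the two `docker cp` commands. Setting `DRY_RUN=1` first prints the commands without running them.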

For the general (non-optimized) run, use the following command, then copy the resulting plot into the local images directory:

docker exec -it spark-master spark-submit --master spark://spark-master:7077 main.py -d hdfs://namenode:9000/diabetes_prediction_dataset.csv -n 1 -i 5

docker cp spark-master:/opt/bitnami/spark/not_optimized_num_nodes_1.png images

For the optimized run (the -o flag), use the following command, then copy the resulting plot and shut the cluster down:

docker exec -it spark-master spark-submit --master spark://spark-master:7077 main.py -d hdfs://namenode:9000/diabetes_prediction_dataset.csv -n 1 -i 5 -o

docker cp spark-master:/opt/bitnami/spark/optimized_num_nodes_1.png images

docker-compose -f docker-compose.yml down

3 DataNodes

docker-compose -f docker-compose-3d.yml up -d
docker cp data/diabetes_prediction_dataset.csv namenode:/
docker cp -L src/. spark-master:/opt/bitnami/spark/
docker exec -it namenode bash

hdfs dfs -put diabetes_prediction_dataset.csv /
exit

For the general (non-optimized) run, use the following command, then copy the resulting plot into the local images directory:

docker exec -it spark-master spark-submit --master spark://spark-master:7077 main.py -d hdfs://namenode:9000/diabetes_prediction_dataset.csv -n 3 -i 5

docker cp spark-master:/opt/bitnami/spark/not_optimized_num_nodes_3.png images

For the optimized run (the -o flag), use the following command, then copy the resulting plot and shut the cluster down:

docker exec -it spark-master spark-submit --master spark://spark-master:7077 main.py -d hdfs://namenode:9000/diabetes_prediction_dataset.csv -n 3 -i 5 -o

docker cp spark-master:/opt/bitnami/spark/optimized_num_nodes_3.png images

docker-compose -f docker-compose-3d.yml down
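
The 1 DataNode and 3 DataNodes sections differ only in the compose file, the value passed to `-n`, the presence of `-o`, and the name of the plot that is copied out. The sketch below captures that pattern in three small functions. The function names are hypothetical and the functions only print strings, so nothing is executed; all file names and flags are taken from the commands above.

```shell
#!/usr/bin/env bash
# Compose file for a given DataNode count (1 or 3), as named in this README.
compose_file() {
  case "$1" in
    1) echo "docker-compose.yml" ;;
    3) echo "docker-compose-3d.yml" ;;
    *) echo "unsupported DataNode count: $1" >&2; return 1 ;;
  esac
}

# Plot file name for a run: plot_name <nodes> <optimized|not_optimized>
plot_name() {
  echo "${2}_num_nodes_${1}.png"
}

# Print the spark-submit command for a run: submit_command <nodes> <mode>
submit_command() {
  local nodes="$1"
  local mode="$2"
  local flags="-n ${nodes} -i 5"
  # The optimized run is the same command with -o appended.
  if [ "$mode" = "optimized" ]; then
    flags="${flags} -o"
  fi
  echo "docker exec -it spark-master spark-submit --master spark://spark-master:7077 main.py -d hdfs://namenode:9000/diabetes_prediction_dataset.csv ${flags}"
}
```

For example, `submit_command 3 optimized` prints the optimized-run command from the 3 DataNodes section, and `plot_name 3 optimized` prints the matching plot name to pass to `docker cp`.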
