eXtreme Gradient Boosting (GBDT, GBRT or GBM) Library for large-scale and distributed machine learning, on single node, hadoop yarn and more.
Hi, I am stuck submitting a Spark job to the Hadoop YARN master in cluster mode.
Please find my environment setup below.
I have a Linux machine with 128 GB of RAM, a 2 TB hard disk, and 2x16 cores.
I have set up Cloudera Hadoop containers on a Docker mount point of 50 GB (this mount point is almost full). One datanode container, a namenode container, and a YARN master container are running.
I am submitting the Spark job from my host machine to run an R script in cluster mode. The R server and libraries are set up on the datanode.
When I submit the Spark job, it remains in the ACCEPTED state for a long time. Please find the spark-submit command I am using below:
spark-submit --master yarn --name RechargeModel --deploy-mode cluster --executor-memory 3G --num-executors 4 rechargemodel.R
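
For reference, here is a rough estimate of how much memory this command asks YARN for, since a job that stays in ACCEPTED is often waiting for containers that the cluster cannot grant. This is only a sketch under assumptions that are not confirmed for this cluster: Spark's default memory overhead (10% of the heap, with a 384 MiB minimum), the default driver memory of 1g (no --driver-memory is passed), and a YARN minimum allocation of 1024 MB with requests rounded up to a multiple of it. The function names are made up for illustration.

```python
import math


def container_mb(heap_mb, overhead_factor=0.10, min_overhead_mb=384, min_alloc_mb=1024):
    """Estimate the size of one YARN container for a given JVM heap.

    Assumed defaults (not read from the cluster): 10% overhead with a
    384 MiB floor, and rounding up to a 1024 MB allocation step.
    """
    overhead_mb = max(min_overhead_mb, int(heap_mb * overhead_factor))
    requested_mb = heap_mb + overhead_mb
    # Round the request up to the next multiple of the allocation step.
    return math.ceil(requested_mb / min_alloc_mb) * min_alloc_mb


def total_request_mb(executor_mb, num_executors, driver_mb=1024):
    """Total estimated memory: all executors plus the driver container.

    In cluster mode the driver runs inside the YARN application master,
    so it needs a container of its own before any executor starts.
    """
    return num_executors * container_mb(executor_mb) + container_mb(driver_mb)


if __name__ == "__main__":
    # Values from the command above: --executor-memory 3G --num-executors 4
    print("per executor (MB):", container_mb(3 * 1024))
    print("total (MB):", total_request_mb(3 * 1024, 4))
```

Under these assumptions the total can be compared with the memory the NodeManager advertises (yarn.nodemanager.resource.memory-mb) and with the free memory shown in the ResourceManager UI. If the NodeManager offers less than the driver container needs, or the node is not active, the application would stay in ACCEPTED.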