R API and SparkR
=========================

https://spark.apache.org/docs/3.1.1/sparkr.html

Interactive Jobs
----------------

.. code-block:: bash

    module load R/4.0/4.0.2

    # launch an interactive sparkR session
    sparkR

.. code-block:: console

     Welcome to
        ____              __
       / __/__  ___ _____/ /__
      _\ \/ _ \/ _ `/ __/  '_/
     /___/ .__/\_,_/_/ /_/\_\   version  3.1.1
        /_/

     SparkSession Web UI available at http://hdpen01.chicagobooth.edu:4040
     SparkSession available as 'spark'(master = yarn, app id = application_1629131983124_2674).
    >

Batch Jobs
----------

.. code-block:: bash

    # ssh into the cluster
    ssh @vulcan.chicagobooth.edu

    # load the R module
    module load R/4.0/4.0.2

    # client mode prints stdout to the terminal
    # --deploy-mode client is used by default unless specified otherwise
    spark-submit script.R

    # cluster mode disables stdout but allows long-running jobs to continue after logging off
    spark-submit --deploy-mode cluster script.R

Examples
--------

Apache Spark ships with a few example scripts that serve as useful demos. You can find them under ``${SPARK_HOME}/examples/src/main/r/``.

.. code-block:: bash

    # load the R module
    module load R/4.0/4.0.2

    # run a recommendation model with Alternating Least Squares (ALS)
    spark-submit ${SPARK_HOME}/examples/src/main/r/ml/als.R
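
To see which example scripts are available before submitting one, a small shell helper along the following lines may be handy. This is only a sketch: the function name ``list_spark_r_examples`` is hypothetical (it is not part of Spark or of the cluster's modules), and it assumes ``SPARK_HOME`` is set in your environment and that the examples live under the path given above.

.. code-block:: bash

    # Hypothetical helper (not provided by Spark): print every R example
    # script found under ${SPARK_HOME}/examples/src/main/r, sorted by path.
    list_spark_r_examples() {
        # Fail early with a message if SPARK_HOME is not set
        if [ -z "${SPARK_HOME:-}" ]; then
            echo "SPARK_HOME is not set" >&2
            return 1
        fi

        examples_dir="${SPARK_HOME}/examples/src/main/r"

        # Fail if this Spark installation does not include the R examples
        if [ ! -d "${examples_dir}" ]; then
            echo "no R examples directory at ${examples_dir}" >&2
            return 1
        fi

        # List the .R files, including those in subdirectories such as ml/
        find "${examples_dir}" -type f -name '*.R' | sort
    }

After defining the function in your shell, run ``list_spark_r_examples`` and pass any of the printed paths to ``spark-submit``, exactly as with ``als.R`` above.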