Spark Installation

To install Spark, download a stable release from the official Spark downloads page and unpack the tarball in a suitable location:

% tar xzf spark-x.y.z-bin-abc.tgz
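The archive unpacks into a single top-level directory named after the release. As a minimal sketch of the mechanics (using a throwaway stand-in archive rather than a real Spark release, so all paths and names here are placeholders):

```shell
# Build a tiny stand-in tarball so the extraction step can be shown end to end.
# spark-x.y.z-bin-abc/ mimics the layout of a real release (a bin/ directory inside).
mkdir -p /tmp/demo/spark-x.y.z-bin-abc/bin
echo '#!/bin/sh' > /tmp/demo/spark-x.y.z-bin-abc/bin/spark-shell
tar -C /tmp/demo -czf /tmp/demo/spark-x.y.z-bin-abc.tgz spark-x.y.z-bin-abc

# The actual step from the text: unpack the release into a suitable location.
mkdir -p /tmp/demo/install
tar -C /tmp/demo/install -xzf /tmp/demo/spark-x.y.z-bin-abc.tgz

# A single spark-x.y.z-bin-abc directory now sits in the install location.
ls /tmp/demo/install
```

The `-C` flag simply tells tar which directory to extract into; running plain `tar xzf` as in the text unpacks into the current directory instead.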

It’s convenient to put the Spark binaries on your path as follows:

% export SPARK_HOME=~/sw/spark-x.y.z-bin-abc

% export PATH=$PATH:$SPARK_HOME/bin
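With `$SPARK_HOME/bin` on the PATH, the launcher scripts resolve by name from any directory. A quick sketch of that mechanism, again using a stand-in directory instead of a real installation (the `/tmp/fake-spark` path and the stub script are illustrative only):

```shell
# Stand-in for a Spark installation: a bin/ directory holding an executable script.
export SPARK_HOME=/tmp/fake-spark
mkdir -p "$SPARK_HOME/bin"
printf '#!/bin/sh\necho spark-shell stub\n' > "$SPARK_HOME/bin/spark-shell"
chmod +x "$SPARK_HOME/bin/spark-shell"

# The step from the text: append $SPARK_HOME/bin to the PATH.
export PATH=$PATH:$SPARK_HOME/bin

# The script is now found by name, regardless of the current directory.
command -v spark-shell   # prints the stub's full path
spark-shell              # runs the stub
```

Note that `export` only affects the current shell session; to make the setting permanent, the same two lines typically go into a shell startup file such as `~/.bashrc`.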

Use the spark-shell and pyspark commands to run the Spark shell with Scala and Python, respectively. (On Windows, the equivalent scripts are spark-shell.cmd and pyspark.cmd.)

We can start spark-shell as follows:

hdadmin@ubuntu:~$ cd spark-2.0.1-bin-hadoop2.4/
hdadmin@ubuntu:~/spark-2.0.1-bin-hadoop2.4$ bin/spark-shell
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel).
17/12/07 22:15:44 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/12/07 22:15:44 WARN Utils: Your hostname, ubuntu resolves to a loopback address:; using instead (on interface eth0)
17/12/07 22:15:44 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
17/12/07 22:15:47 WARN SparkContext: Use an existing SparkContext, some configuration may not take effect.
Spark context Web UI available at
Spark context available as 'sc' (master = local[*], app id = local-1512713747204).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.0.1
      /_/

Using Scala version 2.11.8 (Java HotSpot(TM) Client VM, Java 1.8.0_151)
Type in expressions to have them evaluated.
Type :help for more information.