diff --git a/assign9.md b/assign9.md
index bd79cf51865198a49d1a06c3320cab58be6f4159..14bb1bf6d7c27df74ad81f314074e6a20bd567cb 100644
--- a/assign9.md
+++ b/assign9.md
@@ -26,30 +26,10 @@ the YARN resource manager.
 Note that bash is the default shell everywhere, but the `.cshrc` is set up
 correctly if you feel like dropping into `tcsh`.
 
-### Mac and Homebrew
-```
- brew install apache-spark
- brew install psutils
-```
-
-### Docker
-From the distribution directory:
-```
- docker build spark424:alpha .
- docker run -it spark424:alpha
-```
-The second command logs you in to a `tcsh` shell in the container. The
-distribution directory is mounted read/write at `/assign9`.
-### Other
-- Download spark from [here](https://archive.apache.org/dist/spark/spark-3.0.1/spark-3.0.1-bin-hadoop2.7.tgz).
-- Move the downloaded file to the `assignment9/` directory (so it is available
-in '/vagrant' on the virtual machine), and uncompress it using: `tar zxvf
-spark-3.0.1-bin-hadoop2.7.tgz` in your VM.
-- This will create a new directory: `spark-3.0.1-bin-hadoop2.7`.
-
-
 ### Vagrant
+This is the **recommended** way to do this project.
+
 As before, we have provided a VagrantFile in the `assignment9`
 directory. Since the Spark distribution is large, we ask you to download that
 directly from the Spark website.
 
@@ -61,6 +41,17 @@ This step is included in the VagrantFile, but if you get any error
 
 We are ready to use Spark.
 
+### Mac and Homebrew
+I'd rather not pollute my mac w/ all the detritus from starting
+`spark` up, so I have not pioneered a mac approach, though you are
+free to.
+
+If you have apple silicon, I recommend you wait until the docker
+approach has been checked out (soon).
+
+### Docker
+Probably before Thanksgiving.
+
 ### Spark and Python
 Spark primarily supports three languages: Scala (Spark is written in Scala),