diff --git a/assign9.md b/assign9.md
index 14bb1bf6d7c27df74ad81f314074e6a20bd567cb..9a12f45d5771ab7a5a06408ea884a524da819c3f 100644
--- a/assign9.md
+++ b/assign9.md
@@ -1,11 +1,16 @@
 ## Assignment 9: Spark
 
 ### Due Dec 8, 2023, 11:59PM
 
-Assignment 8 focuses on using Apache Spark for doing large-scale data analysis
+Assignment 9 focuses on using Apache Spark for doing large-scale data analysis
 tasks. For this assignment, we will use relatively small datasets and we won't run anything in distributed mode;
 however Spark can be easily used to run the same programs on much larger datasets.
 
+### Setup
+
+Download files for Assignment 9 <a href="https://ceres.cs.umd.edu/424/assign/assignment9Dist.tgz?1">here</a>.
+*We do not provide a new Vagrant file as you can use your existing VM's with this project, or just natively on Macs.*
+
 ## Getting Started with Spark
 
 This guide is basically a summary of the excellent tutorials that can be found