diff --git a/assign9.md b/assign9.md
index 6980a355fde6fe80797db5d5f6d10c3316153319..c026ef89c769b045b58f60afac5f6fdda4b50b74 100644
--- a/assign9.md
+++ b/assign9.md
@@ -103,7 +103,7 @@ the number of times each word appears in the file `Dockerfile`. Use
 `>>> counts = textFile.flatMap(lambda line: line.split(" ")).map(lambda word: (word, 1)).reduceByKey(lambda a, b: a + b)`
-In more detail:
+In more detail, from the docker container created as above:
 ```
 root@d36910b1feb0:/assign9# $SPARKHOME/bin/pyspark
 Python 3.10.12 (main, Jun 11 2023, 05:26:28) [GCC 11.4.0] on linux