Summary 🧾
I will install Spark and use a Python library to write a job that answers the question: how many rows exist for each rating?
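To make the question concrete before any Spark setup, here is a minimal plain-Python sketch of the aggregation the job is meant to compute. It is not Spark code: it uses `collections.Counter` only to show the expected shape of the result. The sample rows, their column order, and the helper name `count_rows_by_rating` are assumptions made up for this illustration.

```python
from collections import Counter

# Hypothetical sample rows as (user_id, movie_id, rating).
# The real dataset is not shown in this post; these values are invented.
sample_rows = [
    (1, 10, 5),
    (2, 10, 3),
    (3, 11, 5),
    (4, 12, 4),
    (5, 12, 3),
    (6, 13, 5),
]


def count_rows_by_rating(rows):
    """Return a dict mapping each rating to the number of rows that have it."""
    return dict(Counter(rating for _, _, rating in rows))


rating_counts = count_rows_by_rating(sample_rows)

# Print one line per rating, in ascending rating order.
for rating, total in sorted(rating_counts.items()):
    print(rating, total)
```

On a Spark DataFrame the same aggregation is usually written as `df.groupBy("rating").count()`, assuming the data has a column named `rating`.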
Before starting, we set up the environment to run a Spark Standalone Cluster.
1st – Mount Google Drive 🚠
We will mount Google Drive so that we can use its files.
I use the following script:
from google.colab import drive
drive.mount('/content/gdrive')
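After mounting, it can help to confirm that the files are actually visible. The following is a small sketch, assuming the mount point `/content/gdrive` used above; the helper name `list_mounted_files` is hypothetical and only relies on the standard `os` module.

```python
import os


def list_mounted_files(root="/content/gdrive"):
    """Return the sorted entries under root, or an empty list if root is not a directory."""
    if not os.path.isdir(root):
        # The drive is not mounted (or the path is wrong), so there is nothing to list.
        return []
    return sorted(os.listdir(root))


# In Colab, this prints the top-level entries of the mounted drive.
print(list_mounted_files())
```

An empty list here usually means the mount step has not run yet in the current session.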
2nd – Install Spark 🎇
Once you have a Colab notebook up, you have to run the following script to get Spark running (I apologize for how ugly it is).
I use the following script:
%%bash
apt-get install...