Musings on the Data Engineering Zoomcamp: Week 5
Post motivated by #learninginpublic and Alexey!
Continuing with Week 5 of the Data Engineering Zoomcamp organized by DataTalks.Club. This week’s topic was batch processing, primarily with PySpark. We also kept working with Google Cloud products (e.g., Dataproc).
Topics covered in Week 5:
Fundamentals of Batch Processing Engineering
Spark (Installation, Spark internals and functions)
Running Spark in the cloud
Running a Dataproc cluster
PySpark & BigQuery
Relevant links:
Week-1 of DE Zoomcamp 2023: link.
Week-2 of DE Zoomcamp 2023: link.
Week-3 of DE Zoomcamp 2023: link.
Week-4 of DE Zoomcamp 2023: link.
#dezoomcamp #dataengineering #learninginpublic