Apache Spark Core Concepts: Introduction to Distributed Data Processing
Monday, January 16, 2023 - 7:00 PM UTC, for 1 hour.
Regular, 60-minute presentation
Room: Campsite 4
Big data is only getting bigger, and being able to make quick, data-driven decisions at scale is more important than ever. That’s why thousands of organizations in both industry and academia use Apache Spark for scalable computing. This talk introduces Spark concepts in an approachable, visual manner that will leave you with a strong foundation for using this powerful data processing and analytics engine.
Prerequisites: None. This talk is designed to be approachable by everyone.
- Learn strategies for optimizing Spark jobs
- Visualize data partitions, data shuffling, drivers & executors, and the layers of Spark computation
- Gain a firm understanding of parallel processing frameworks like Apache Spark