Data Science with Spark
Get started with Spark for data science using this unique video tutorial
The real power and value proposition of Apache Spark is its speed and platform to execute Data Science tasks. Spark's unique use case is that it combines ETL, batch analytic, real-time stream analysis, machine learning, graph processing, and visualizations to allow Data Scientists to tackle the complexities that come with raw unstructured data sets. Spark embraces this approach and has the vision to make the transition from working on a single machine to working on a cluster, something that makes data science tasks a lot more agile.
In this course, you’ll get a hands-on technical resource that will enable you to become comfortable and confident working with Spark for Data Science. We won't just explore Spark’s Data Science libraries, we’ll dive deeper and expand on the topics.
This course starts by taking you through Spark and the needed steps to build machine learning applications. You will learn to collect, clean, and visualize data coming from Twitter with Spark streaming. Then, you will get acquainted with Spark Machine learning algorithms and different machine learning techniques. You will also learn to apply statistical analysis and mining operations on our Tweet dataset. Finally, the course will end by giving you some ideas on how to perform awesome analysis including graph processing. By the end of the course, you will be able to do your Data scientist job in a very visual way, comprehensive and appealing for business and other stakeholders.
About The Author
Eric Charles has 10 years’ experience in the field of Data Science and is the founder of Datalayer, a social network for Data Scientists. He is passionate about using software and mathematics to help companies get insights from data.
His typical day includes building efficient processing with advanced machine learning algorithms, easy SQL, streaming and graph analytics. He also focuses a lot on visualization and result sharing.
He is passionate about open source and is an active Apache Member. He regularly gives talks to corporate clients and at open source events.