This intensive four-day, hands-on course teaches attendees how to use AWS features to perform data analysis in the AWS environment. Each module is paired with a hands-on lab that gives attendees practical experience with the topics covered. Day one focuses on data analytics basics, including data acquisition, scrubbing, manipulation, and storage. Practical use cases are examined during class and lab sessions, where students gain exposure to S3, Glue, and other tools. Day two focuses on data warehousing tools, introducing attendees to Redshift, the Hive Metastore, and the Presto high-performance query engine, as well as the Athena automated query service and the Kinesis streaming query service. Day three focuses on the Amazon EMR platform and Apache Spark. Attendees will learn how to work with RDDs, DataFrames, Spark SQL, and Spark Streaming. Day four introduces machine learning concepts and the key AWS tools for creating machine learning-driven solutions to common business problems. Upon course completion, attendees will have a clear understanding of data analysis, data processing, and machine learning operations and their applications on the AWS platform.
Available as Instructor-Led Training (ILT), delivered in person or onsite, or as Virtual Instructor-Led Training (VILT); Open Enrollment options may be available.
Who Should Attend
Application Developers, Analysts and Data Scientists
What Attendees Will Learn
This course is designed to provide attendees with a comprehensive introduction to data science on the AWS platform. Learning modules include:
- Data science on AWS
- AWS data warehousing
- EMR analytics
- Machine learning on AWS
Prerequisites
The ability to run a 64-bit virtual machine (provided), good internet access, a basic computing background, and experience with SQL and Python.