This intensive two-day, hands-on course is designed to help working technology professionals master the essential aspects and operation of Apache Gobblin. The course covers the key concepts and tasks necessary to deploy and use a production Gobblin service. Attendees will become familiar with core Gobblin concepts such as jobs, work units, extractors, converters, quality checkers, fork operators, and data writers. Hands-on labs provide practical experience in which students install, configure, and run their own Gobblin jobs. The course also covers common Gobblin use cases and best practices. Attendees will leave with a clear understanding of Apache Gobblin and the practical skills necessary to begin using Gobblin immediately for a range of distributed data integration tasks.
Delivery
Gobblin training is available as Instructor-Led Training (ILT), delivered in person or onsite, or as Virtual Instructor-Led Training (VILT); Open Enrollment options may be available.
Who Should Attend
This course is intended for Developers, Data Engineers, Technical Managers, DataOps, DevOps, and SRE personnel.
What Attendees Will Learn
This course is designed to give attendees a comprehensive introduction to the Apache Gobblin distributed data integration framework. It provides an overview of Gobblin’s components, and upon completion attendees will be prepared to begin using Gobblin to ingest, monitor, manage, and publish massively scalable data flows.
Prerequisites
Each attendee will need the ability to SSH into a cloud-hosted virtual machine (provided with the course). Basic Linux command-line skills are helpful but not required.