Online Course – Certified Professional Specialization in Serverless Data Processing with Dataflow by Google Cloud Institute

Building big data applications that can expand and grow. Advanced solutions for all big data needs.

Offered by: Coursera

Professional Certificate

Intermediate level

No prior knowledge required

Time to complete the course

7-day free trial

No unnecessary risks

Skills you will acquire in the course

  • Problem analysis ability
  • Communication skills
  • Teamwork
  • Technological skills
  • Time management
  • Creative thinking
  • Conflict resolution
  • Professionalism
  • Project organization and management
  • Independent learning

What you will learn in the course

Roles for which the course is suitable

  • Data key
  • Data Engineer
  • Data Analyst
  • Data Project Manager
  • Big Data Expert
  • Pipeline developer
  • Information Systems Manager
  • Google Cloud Expert

Specialization – a three-course series

Technology that stays at the cutting edge

Keeping a technology stack at the cutting edge is becoming increasingly difficult, especially as the demands of data-driven businesses grow. Anyone who works with big data is familiar with the three Vs of big data: volume, velocity, and variety. What if there were a scale-proof technology designed to meet these demands?

Meet Google Cloud Dataflow

Google Cloud Dataflow simplifies data processing by unifying batch and stream processing and by providing a serverless experience that lets users focus on analytics, not infrastructure. This specialization is for customers and partners who want to deepen their understanding of Dataflow to advance their data processing applications.

There are three courses in this specialization:

  • Fundamentals:
    Explains how Apache Beam and Dataflow work together to meet your data processing needs without the risk of vendor lock-in.
  • Developing Pipelines:
    Covers how to convert your business logic into data processing applications that can run on Dataflow.
  • Operations:
    Reviews the most important lessons for operating a data application on Dataflow, including monitoring, troubleshooting, testing, and reliability.
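
To make the idea of unified batch and stream processing more concrete, here is a minimal plain-Python sketch. It is an analogy only, not Apache Beam or Dataflow code: the names `count_words`, `bounded_source`, and `unbounded_source` and the sample data are hypothetical. It shows one piece of business logic applied unchanged to an input that is fully available up front and to an input that arrives one element at a time.

```python
from collections import Counter
from typing import Iterable, Iterator, List


def count_words(lines: Iterable[str]) -> Iterator[Counter]:
    """Yield a running word count after each input line.

    The same logic works whether `lines` is a finite list (batch)
    or a generator that produces elements over time (stream).
    """
    totals: Counter = Counter()
    for line in lines:
        totals.update(line.lower().split())
        yield Counter(totals)  # copy, so each result is a snapshot


def bounded_source() -> List[str]:
    # Hypothetical batch input: all data is available up front.
    return ["beam runs on dataflow", "dataflow is serverless"]


def unbounded_source() -> Iterator[str]:
    # Hypothetical streaming input: elements arrive one at a time.
    # A real stream would not end; this one stops so the sketch terminates.
    for line in ["beam runs on dataflow", "dataflow is serverless"]:
        yield line


# Batch: consume everything and keep only the final result.
batch_result = list(count_words(bounded_source()))[-1]

# Stream: a result is available after every arriving element.
stream_results = list(count_words(unbounded_source()))

print(batch_result["dataflow"])
print(stream_results[0]["dataflow"], stream_results[-1]["dataflow"])
```

In Beam itself, the equivalent logic would be expressed as transforms over a PCollection; the courses cover that API.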

Hands-on Learning Project

This specialization features hands-on labs using the Qwiklabs platform. The labs are built on the concepts learned in the course modules. Where necessary, we have provided Java and Python versions of the labs. For labs that require adding/updating code, we have provided a recommended solution for your reference.

Details of the courses that make up the specialization

Serverless Data Processing with Dataflow: Fundamentals

Course 1 • 3 hours • 4.2 (79 ratings)

Course Details

What you’ll learn
  • This course is part 1 of a 3-course series on serverless data processing using Dataflow.
  • In the first course, we will start with a brief explanation of what Apache Beam is and how it relates to Dataflow.
  • Then we will talk about the vision of Apache Beam and the benefits of Beam’s portability principle.
  • Beam’s portability principle allows developers to use their preferred programming language along with their desired execution platform.
  • Then we’ll show how Dataflow allows you to separate compute from storage while saving money.
  • Next, we will cover how identity, access, and management tools interact with your Dataflow pipelines.
  • Finally, we will look at how to implement the appropriate security model for your Dataflow use case.

Prerequisites:

  • The course series on serverless data processing with Dataflow builds on the concepts covered in the Data Engineering specialization.
  • We recommend the following courses as preparation:
    • (i) Building Batch Data Pipelines in Google Cloud: Covers basic principles of Dataflow
    • (ii) Building Resilient Streaming Analytics Systems in Google Cloud: Covers basic streaming concepts like windows, triggers, and watermarks
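
The streaming concepts named above can be illustrated with a minimal plain-Python sketch. It is not Beam code: the helper names `fixed_window` and `is_late`, the 60-second window size, the sample events, and the simplistic watermark rule are all assumptions made for illustration. A fixed window groups elements by event time, and a watermark is an estimate of how far event time has progressed, so an element whose window ended at or before the watermark counts as late.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

WINDOW_SECONDS = 60  # assumed window size for this sketch


def fixed_window(event_time: int, size: int = WINDOW_SECONDS) -> Tuple[int, int]:
    """Return the [start, end) fixed window that contains event_time."""
    start = event_time - (event_time % size)
    return (start, start + size)


def is_late(event_time: int, watermark: int, size: int = WINDOW_SECONDS) -> bool:
    """An element is late if its window ended at or before the watermark."""
    _, end = fixed_window(event_time, size)
    return end <= watermark


# Hypothetical (event_time_seconds, value) pairs, listed in arrival order.
events = [(5, "a"), (42, "b"), (61, "c"), (130, "d"), (59, "e")]

windows: Dict[Tuple[int, int], List[str]] = defaultdict(list)
late: List[str] = []
watermark = 0
for event_time, value in events:
    if is_late(event_time, watermark):
        late.append(value)
    else:
        windows[fixed_window(event_time)].append(value)
    # Simplistic watermark: the largest event time seen so far.
    watermark = max(watermark, event_time)

print(dict(windows))
print(late)
```

Triggers are not modeled here; in a real pipeline they decide when each window's result is emitted and how late elements are handled.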

>> By registering for this course, you agree to the Qwiklabs Terms of Service detailed in the FAQ and on the page: https://qwiklabs.com/terms_of_service <<

Serverless Data Processing with Dataflow: Developing Pipelines

Course 2 • 31 hours • 4.0 (40 ratings)

Course Details

What you’ll learn
  • Review of the main Apache Beam concepts covered in the Data Engineering on Google Cloud course
  • Review of key streaming concepts covered in the Data Engineering course (unbounded PCollections, windows, watermarks, and triggers)
  • Selecting and adapting the input/output (I/O) of your choice for your Dataflow pipeline
  • Using schemas to simplify your Beam code and improve your pipeline performance

Serverless Data Processing with Dataflow: Operations

Course 3 • 9 hours • 3.6 (17 ratings)

Course Details

What you’ll learn
  • Perform monitoring, troubleshooting, testing, and CI/CD on Dataflow pipelines.
  • Implement Dataflow pipelines with a focus on reliability to maximize the stability of your data processing platform.