Online Course – Professional Specialization in Serverless Data Processing from Google Cloud

Developing scalable Big Data applications with advanced technologies and customized solutions.

Offered by: Coursera

Professional Certificate

Beginners

No prior knowledge required

Time to complete the course

7-day free trial

No unnecessary risks

Skills you will acquire in the course

  • Approximation error
  • Graph
  • Determining Causality
  • Data model
  • Extract, Transform, and Load (ETL)
  • Analytics
  • State (Computer Science)

What you will learn in the course

Roles for which the course is suitable

  • Big Data Developer
  • Data Engineer
  • Data Analyst
  • Data Project Manager
  • Google Cloud Expert
  • Data Processing Application Developer
  • Information Systems Manager
  • Data Solutions Specialist

Specialization – Series of 3 courses

As the demands of a data-driven business grow, keeping a technology stack up to date becomes harder. All big data professionals are familiar with the three Vs of big data: volume, velocity, and variety. What if there were a technology that was not constrained by scale and was designed to meet these demands?

Meet Google Cloud Dataflow. Google Cloud Dataflow simplifies data processing by unifying batch and streaming processing, providing a serverless experience that lets users focus on analysis, not infrastructure. This specialization is for customers and partners who want to improve their understanding of Dataflow to advance their data processing applications.

The specialization includes three courses:

  • Basics
    – which explains how Apache Beam and Dataflow work together to fulfill data processing needs without being tied to a single vendor.
  • Developing Pipelines
    – which deals with how to convert your business logic into data processing applications that can run in Dataflow.
  • Operations
    – which goes over the most important lessons for managing a data application in Dataflow, including monitoring, troubleshooting, testing, and reliability.

Hands-on Learning Project

This specialization includes hands-on labs using the Qwiklabs platform. The labs build on the concepts covered in the course modules. Where applicable, Java and Python versions of the labs are provided. For labs that require adding or updating code, a recommended solution is provided for reference.

Details of the courses that make up the specialization

Serverless Data Processing with Dataflow: The Basics

Course 1 • 3 hours

Course Details

What you’ll learn:
  • Demonstrate how Apache Beam and Cloud Dataflow work together to meet your organization’s data processing needs.
  • Summarize the benefits of the Beam Portability Framework and enable it for your Dataflow pipelines.
  • Enable Dataflow Shuffle for batch pipelines and Streaming Engine for streaming pipelines to achieve maximum performance.
  • Enable Flexible Resource Scheduling (FlexRS) for more cost-effective performance.

Serverless Data Processing with Dataflow: Developing Pipelines

Course 2 • 18 hours

Course Details

What you’ll learn:
  • In the second part of the Dataflow course series, we’ll dive deeper into pipeline development using the Beam SDK. We’ll start with an overview of Apache Beam concepts.
  • Next, we’ll look at processing streaming data with windows, watermarks, and triggers.
  • Next, we’ll cover the options for sources and sinks in your pipelines, schemas for expressing your structured data, and how to perform stateful transformations using the State and Timer APIs.
  • Next, we’ll review best practices that help maximize pipeline performance.
  • At the end of the course, we’ll introduce SQL and DataFrames for representing your business logic in Beam, and show how to iteratively develop pipelines using Beam notebooks.
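
To give a feel for the windowing topic listed above, here is a minimal sketch in plain Python of how fixed (tumbling) windows group timestamped events. It does not use the Beam SDK; the function names `assign_fixed_window` and `sum_per_window` and the sample events are hypothetical, chosen only for illustration. It also ignores watermarks and triggers, which determine when a window's result is emitted.

```python
from collections import defaultdict


def assign_fixed_window(timestamp, size):
    """Return the [start, end) fixed window that contains the timestamp.

    Hypothetical helper for illustration; not part of the Beam API.
    """
    start = timestamp - (timestamp % size)
    return (start, start + size)


def sum_per_window(events, size):
    """Sum event values per fixed window of the given size (in seconds)."""
    totals = defaultdict(int)
    for timestamp, value in events:
        totals[assign_fixed_window(timestamp, size)] += value
    # Sort by window start so the output is easy to read.
    return dict(sorted(totals.items()))


# Sample (timestamp_in_seconds, value) pairs; made up for this sketch.
events = [(1, 10), (12, 5), (59, 1), (60, 7), (125, 3)]
result = sum_per_window(events, size=60)
print(result)  # {(0, 60): 16, (60, 120): 7, (120, 180): 3}
```

Each event lands in exactly one 60-second window based on its own timestamp, not on when it is processed. In a real streaming pipeline, watermarks and triggers would then decide when each window's total is complete enough to emit.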
Skills you will acquire:
  • Category: Data Model
  • Category: Extract, Transform, and Load (ETL)
  • Category: Analytics
  • Category: State (Computer Science)

Serverless Data Processing with Dataflow: Operations

Course 3 • 9 hours

Course Details

What you’ll learn:
  • Perform monitoring, troubleshooting, testing, and CI/CD operations on Dataflow pipelines.
  • Implement Dataflow pipelines with reliability in mind to maximize the stability of your data processing platform.
Skills you will acquire:
  • Category: Approximation error
  • Category: Graph
  • Category: Regression
  • Category: Causality