Target Audience

Perfect for developers who:

  • Manage ETL processes to ensure clean, validated data.
  • Design robust data pipelines and architectures.
  • Integrate analytics and ML into workflows seamlessly.
  • Query datasets, visualize insights, and deliver impactful reports.

COURSE AGENDA

Introduction to Data Engineering

  • Explore data engineering challenges and solutions.
  • Learn about data lakes, data warehouses, and transactional databases.
  • Understand data governance and access management.
  • Build production-ready data pipelines.
  • Lab: Perform data analysis using BigQuery.

Building a Data Lake

  • Understand the role and structure of data lakes.
  • Learn storage and ETL options on Google Cloud.
  • Build and secure a data lake with Cloud Storage.
  • Explore relational data lakes with Cloud SQL.

Building a Data Warehouse

  • Dive into modern data warehouse concepts.
  • Get started with BigQuery: loading data and exploring schemas.
  • Optimize schemas with partitioning and clustering.
  • Labs: Load data into BigQuery and work with JSON and array data.

Introduction to Building Batch Data Pipelines

  • Learn the differences between EL, ELT, and ETL processes.
  • Address data quality considerations in pipelines.
  • Use ETL to resolve data quality issues.
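
As a taste of the data quality topic, here is a minimal plain-Python sketch of an ETL step that drops unparseable and duplicate records before loading. The record fields (`id`, `amount`), the function names, and the in-memory `warehouse` list are illustrative assumptions, not course material or a Google Cloud API.

```python
def extract() -> list[dict]:
    # Hypothetical raw records, as they might arrive from a source system.
    return [
        {"id": "1", "amount": " 10.50 "},
        {"id": "2", "amount": "n/a"},     # unparseable amount
        {"id": "1", "amount": "10.50"},   # duplicate id
        {"id": "3", "amount": "7"},
    ]


def transform(rows: list[dict]) -> list[dict]:
    """Clean rows: parse types, drop invalid amounts and duplicate ids."""
    seen_ids = set()
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"].strip())
        except ValueError:
            continue  # data quality rule: skip rows with a bad amount
        if row["id"] in seen_ids:
            continue  # data quality rule: keep the first row per id
        seen_ids.add(row["id"])
        clean.append({"id": int(row["id"]), "amount": amount})
    return clean


def load(rows: list[dict], target: list[dict]) -> None:
    # Stand-in for writing to a warehouse table.
    target.extend(rows)


warehouse: list[dict] = []
load(transform(extract()), warehouse)
```

In an ELT design the same rules would run inside the warehouse after loading; here they run before the load, which is what makes this sketch ETL.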

Executing Spark on Dataproc

  • Explore the Hadoop ecosystem and how it integrates with Google Cloud.
  • Run optimized Apache Spark jobs on Dataproc.
  • Lab: Execute Spark jobs using Dataproc.

Serverless Data Processing with Dataflow

  • Understand why customers value Dataflow for real-time and batch processing.
  • Learn Dataflow pipelines, including aggregation, side inputs, and windowing.
  • Labs: Build Dataflow pipelines with Python/Java, including MapReduce and side inputs.
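
To make the windowing idea concrete, the sketch below groups timestamped events into fixed 60-second windows and sums each window. It is plain Python written to illustrate the concept, not the Dataflow or Apache Beam API; the function name and the sample events are assumptions.

```python
from collections import defaultdict


def fixed_window_sums(events, window_seconds):
    """Sum event values per fixed window, keyed by window start time.

    events: iterable of (timestamp_in_seconds, value) pairs.
    """
    sums = defaultdict(int)
    for timestamp, value in events:
        # Each event belongs to the window whose start is the timestamp
        # rounded down to a multiple of the window size.
        window_start = (timestamp // window_seconds) * window_seconds
        sums[window_start] += value
    return dict(sorted(sums.items()))


# Hypothetical events: (seconds since start, value)
events = [(0, 1), (30, 2), (61, 5), (119, 1), (120, 4)]
result = fixed_window_sums(events, 60)
```

A streaming engine applies the same grouping continuously and also has to decide when a window is complete, which this batch-style sketch does not cover.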

Manage Data Pipelines with Cloud Data Fusion & Cloud Composer

  • Build batch data pipelines visually with Cloud Data Fusion.
  • Use Cloud Composer (Apache Airflow) to orchestrate workflows.
  • Labs: Build pipelines with Data Fusion and explore workflow orchestration with Composer.
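
Workflow orchestration comes down to running tasks in dependency order. The sketch below shows that idea with Python's standard-library `graphlib`; it is not Airflow or Cloud Composer code, and the task names are hypothetical.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks that must finish before it starts.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "report": {"load"},
}

# static_order() yields tasks so that every dependency comes first.
run_order = list(TopologicalSorter(dependencies).static_order())

executed = []
for task in run_order:
    # A real orchestrator would trigger the job here and handle retries.
    executed.append(task)
```

An orchestrator such as Airflow adds scheduling, retries, and monitoring on top of this ordering.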

Production ML Pipelines

  • Explore ML workflows on Google Cloud using Vertex AI Pipelines and AI Hub.
  • Lab: Run production-ready ML pipelines on Vertex AI.

Custom Model Building with AutoML

  • Explore the power of AutoML for building vision, NLP, and tabular models.
  • Understand how AutoML simplifies the machine learning process.

CONTACT US TO START YOUR GOOGLE CLOUD JOURNEY