Vladimir Chavkov

Professional Data Engineer Training

Master data engineering on Google Cloud with this comprehensive 5-day training. Learn to design, build, and operationalize data processing systems, ensuring reliability, security, and scalability of data solutions.

Duration: 5 days (40 hours)
Level: Advanced
Delivery: In-person, Live online, Hybrid
Certification: Google Cloud Certified: Professional Data Engineer

This training is designed for:

  • Data engineers building data pipelines
  • Analytics engineers working with big data
  • ML engineers preparing data
  • Anyone preparing for Professional Data Engineer certification

After completing this training, you’ll be able to:

  • Design data processing systems
  • Build and operationalize data pipelines
  • Operationalize machine learning models
  • Ensure solution quality and reliability
  • Implement security and compliance for data
  • Optimize costs of data solutions

Module 1: Data Engineering on GCP

  • Data engineering lifecycle
  • GCP data services overview
  • Data governance and compliance
  • Hands-on: Plan data architecture

Module 2: BigQuery Fundamentals

  • BigQuery architecture and storage
  • SQL and query optimization
  • Partitioning and clustering
  • Hands-on: Build BigQuery datasets

Module 3: Data Loading and Export

  • Batch loading strategies
  • Streaming with BigQuery API
  • Data Transfer Service
  • Hands-on: Load data into BigQuery

Module 4: Dataflow for Batch Processing

  • Apache Beam programming model
  • Dataflow pipelines
  • Transforms and windowing
  • Hands-on: Build batch pipeline

Module 5: Streaming Data Processing

  • Pub/Sub architecture
  • Dataflow streaming
  • Real-time analytics
  • Hands-on: Build streaming pipeline

Module 6: Data Storage Options

  • Cloud Storage for data lakes
  • Cloud SQL and Cloud Spanner
  • Bigtable for time-series data
  • Hands-on: Choose storage solution

Day 3: Machine Learning and Advanced Analytics

Module 7: BigQuery ML

  • Creating ML models in BigQuery
  • Model evaluation and prediction
  • Feature engineering
  • Hands-on: Build ML model with BQML

Module 8: Vertex AI

  • AutoML and custom training
  • Model deployment and monitoring
  • ML pipelines
  • Hands-on: Deploy ML model

Module 9: Data Analysis and Visualization

  • Looker and Looker Studio (formerly Data Studio)
  • Jupyter notebooks on Vertex AI
  • Interactive analysis
  • Hands-on: Create dashboards

Module 10: Data Quality and Validation

  • Data quality patterns
  • Data validation with Great Expectations
  • Monitoring data pipelines
  • Hands-on: Implement data quality checks

Module 11: Security and Privacy

  • Data encryption strategies
  • Column-level security in BigQuery
  • Data Loss Prevention API
  • Hands-on: Implement data security

Module 12: Compliance and Governance

  • Data Catalog for metadata
  • Policy tags and access controls
  • Audit logging
  • Hands-on: Implement governance

Module 13: Performance Optimization

  • BigQuery query optimization
  • Dataflow pipeline tuning
  • Cost optimization strategies
  • Hands-on: Optimize performance

Module 14: Operations and Monitoring

  • Cloud Monitoring for data pipelines
  • Logging and error handling
  • Alerting strategies
  • Hands-on: Monitor pipelines

Module 15: Exam Preparation

  • Exam format and case studies
  • Data engineering scenarios
  • Practice questions

Prerequisites:

  • 2+ years of data engineering experience
  • SQL and programming knowledge (Python or Java)
  • Understanding of data processing concepts
  • GCP fundamentals

Format        Description
In-Person     On-site at your company’s location, hands-on with direct interaction
Live Online   Interactive virtual sessions with screen sharing and real-time labs
Hybrid        Combination of on-site and remote sessions, flexible scheduling

All formats include hands-on labs, course materials, practice exams, and post-training support.