Practical Data Science on the AWS Cloud (PDS) Specialization

What you will learn

  • Prepare data, detect statistical data biases, and perform feature engineering at scale to train models
  • Automatically train, evaluate, and tune models with automated machine learning (AutoML)
  • Store and manage machine learning features using a feature store
  • Debug, profile, tune, and evaluate models while tracking data lineage and model artifacts
  • Build, deploy, monitor, and operationalize end-to-end machine learning pipelines
  • Build data labeling and human-in-the-loop pipelines to improve model performance with human intelligence

Skills you will gain

  • Automated Machine Learning (AutoML)
  • Natural Language Processing with BERT
  • ML Pipelines and ML Operations (MLOps)
  • A/B Testing, Model Deployment, and Monitoring
  • Data Labeling at Scale
  • Data Ingestion
  • Exploratory Data Analysis
  • Statistical Data Bias Detection
  • Multi-class Classification with FastText and BlazingText
  • Feature Engineering and Feature Store
  • Model Training, Tuning, and Deployment with BERT
  • Model Debugging, Profiling, and Evaluation
  • Artifact and Lineage Tracking
  • Distributed Model Training and Hyperparameter Tuning
  • Cost Savings and Performance Improvements
  • Human-in-the-Loop Pipelines

Development environments might not have the same requirements as production environments. Moving data science and machine learning projects from idea to production requires state-of-the-art skills: you need to architect and implement your projects for scale and operational efficiency. Data science is an interdisciplinary field that combines domain knowledge with mathematics, statistics, data visualization, and programming skills.

The Practical Data Science on the AWS Cloud Specialization brings together these disciplines using purpose-built ML tools in the AWS cloud. It helps you develop the practical skills to effectively deploy your data science projects and overcome challenges at each step of the ML workflow using Amazon SageMaker. 

This Specialization is designed for data-focused developers, scientists, and analysts familiar with the Python and SQL programming languages who want to learn how to build, train, and deploy scalable, end-to-end ML pipelines – both automated and human-in-the-loop – in the AWS cloud.

Each of the 10 weeks features a comprehensive lab developed specifically for this Specialization that provides hands-on experience with state-of-the-art algorithms for natural language processing (NLP) and natural language understanding (NLU), including BERT and FastText using Amazon SageMaker.

By the end of this program, you will be ready to: 

  1. Ingest, register, and explore datasets
  2. Detect statistical bias in a dataset
  3. Automatically train and select models with AutoML
  4. Create machine learning features from raw data
  5. Save and manage features in a feature store
  6. Train and evaluate models using built-in algorithms and custom BERT models
  7. Debug, profile, and compare models to improve performance
  8. Build and run a complete ML pipeline end-to-end
  9. Optimize model performance using hyperparameter tuning
  10. Deploy and monitor models
  11. Perform data labeling at scale
  12. Build a human-in-the-loop pipeline to improve model performance
  13. Reduce cost and improve performance of data products

  • 3 Courses
  • >3 months (5 hours/week)
  • Advanced



Antje Barth

Principal Developer Advocate, Generative AI, Amazon Web Services (AWS)
Shelbee Eigenbrode

Principal Solutions Architect, Generative AI, Amazon Web Services (AWS)
Sireesha Muppala

Principal Solutions Architect, AI and Machine Learning, Amazon Web Services (AWS)
Chris Fregly

Principal Solutions Architect, Generative AI, Amazon Web Services (AWS)

Frequently Asked Questions