
Course 1: Analyze Datasets and Train ML Models using AutoML
In the first course, you will learn foundational concepts for exploratory data analysis (EDA), automated machine learning (AutoML), and text classification algorithms. With Amazon SageMaker Clarify and Amazon SageMaker Data Wrangler, you will analyze a dataset for statistical bias, transform the dataset into machine-readable features, and select the most important features to train a multi-class text classifier. You will then use Amazon SageMaker Autopilot to automatically train, tune, and deploy the best text-classification algorithm for the given dataset. Finally, you will work with Amazon SageMaker BlazingText, a highly optimized and scalable implementation of the popular fastText algorithm, to train a text classifier with very little code.
Week 1: Explore the Use Case and Analyze the Dataset
- Ingest, explore, and visualize a product review dataset for multi-class text classification.
Week 2: Data Bias and Feature Importance
- Determine the most important features in a dataset and detect statistical biases.
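
As a rough illustration of what detecting a statistical bias can look like, the sketch below computes a class imbalance score, (n_a - n_d) / (n_a + n_d), which is one of the pre-training bias metrics reported by SageMaker Clarify. It is plain Python rather than the Clarify API, and the `class_imbalance` helper, the category names, and the counts are hypothetical values chosen for illustration only.

```python
from collections import Counter


def class_imbalance(values, facet):
    """Return (n_a - n_d) / (n_a + n_d) for one facet value.

    n_a counts the rows that belong to `facet`; n_d counts all other rows.
    The result lies in [-1, 1]; values near 0 indicate a balanced split.
    """
    counts = Counter(values)
    n_a = counts[facet]
    n_d = sum(counts.values()) - n_a
    total = n_a + n_d
    if total == 0:
        raise ValueError("cannot compute class imbalance on an empty column")
    return (n_a - n_d) / total


# Hypothetical product-category column from a review dataset.
categories = ["Dresses"] * 6 + ["Blouses"] * 2 + ["Pants"] * 2
ci = class_imbalance(categories, "Dresses")
print(f"class imbalance for 'Dresses': {ci:.2f}")
```

A score close to +1 or -1 suggests that one group dominates the data, which is the kind of signal that prompts rebalancing before training.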
Week 3: Automated Machine Learning
- Inspect and compare models generated with automated machine learning (AutoML).
Week 4: Built-in Algorithms
- Train a text classifier with BlazingText and deploy the classifier as a real-time inference endpoint to serve predictions.
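
To give a sense of how little preparation BlazingText needs in its supervised (text classification) mode, the sketch below converts labeled reviews into the training file format the algorithm expects: one example per line, with the label prefixed by `__label__` and followed by space-separated tokens. The sample reviews, the sentiment labels, the simple regex tokenizer, and the `to_blazingtext_line` helper are assumptions made for illustration, not the course's actual preprocessing code.

```python
import re


def to_blazingtext_line(label, text):
    """Format one labeled review as a BlazingText supervised-mode line."""
    # Lowercase and keep only alphanumeric tokens (a deliberately simple tokenizer).
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return f"__label__{label} " + " ".join(tokens)


# Hypothetical (sentiment, review text) pairs.
reviews = [
    (1, "Love this dress!"),
    (-1, "Poor fit, returned it."),
    (0, "It is okay."),
]

lines = [to_blazingtext_line(label, text) for label, text in reviews]
for line in lines:
    print(line)
```

A file made of lines like these would then be uploaded to Amazon S3 and passed to the training job as the training channel.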