AI 539: Machine Learning Challenges in the Real World

How does machine learning perform in the wild?

In this class, we will explore the challenges that machine learning systems face when they move from the laboratory into the real world.

We will be inspired by machine learning applied to problems in astronomy, planetary science, autonomous driving, criminal justice, marketing, and other fields. Topics will include problem formulation, data collection/labeling, and evaluation techniques, and we will address thorny (but common) obstacles such as missing values, data that is not independent and identically distributed (i.i.d.), concept/domain shift, explainability, and more.

You will have the opportunity to apply these concepts and strategies to a data set of your choice. Student work will include reading, implementation, experimentation, analysis of results, and communication of findings. If you're curious about how to solve real problems with machine learning, this is the class for you. Prior experience with supervised machine learning methods (CS 434, CS/AI 534, or instructor permission) is required.

Instructor: Kiri Wagstaff

Teaching Assistant: Grace Diehl

Class meetings (Winter 2022):
Tuesdays and Thursdays, 2-3:20 p.m. (BEXL 320)

Credits: 3

Evaluation:

20% warm-up reading
30% try-it-out assignments
50% hands-on project

Syllabus: Syllabus (PDF)

Schedule:

Date	Topics
Jan. 4	What you will get out of this class; examples of ML gone wrong
Getting to know your data
Jan. 6	What's in your data? Data set profiling; what to do when your data has holes (missing values)
Jan. 11	The tyranny of the majority: what to do about class imbalance
Jan. 13	Is your data set representative of its intended use? Detrimental (and beneficial) sampling bias
Jan. 18	Algorithm and data bias
Getting to know your model
Jan. 20	Would you use your own classifier? Methods for informative performance evaluation
Jan. 25	What kind of errors matter most? Problem-specific evaluation
Data complexities
Jan. 27	What if your data has dependencies?
Feb. 1	The space-time continuum: structured data
Feb. 3	"Change is inevitable; growth is optional." (John Maxwell) Dealing with domain shift
Feb. 8	What have we learned so far?
Sending your model out into the world
Feb. 10	What can you trust? Noisy data, noisy labels
Feb. 15	How can you keep things running? Deployment, maintenance, and trust
Feb. 17	Why did it do that? (explainability)
Feb. 22	When should you believe a prediction? Confidence, uncertainty, and calibration
Going beyond the standard setting
Feb. 24	When to have a human in the loop; the merits of active learning
March 1	The dark side: combating adversaries
March 3	Exploration and discovery (unsupervised learning)
March 8	Student project presentations; bonus topic: continual learning
March 10	Student project presentations; bonus topic: machine learning values