AI 539: Machine Learning Challenges in the Real World

How does machine learning perform in the wild?

In this class, we will explore the challenges that machine learning systems face when they move from the laboratory into the real world.

We will draw inspiration from machine learning applied to problems in astronomy, planetary science, autonomous driving, criminal justice, marketing, and other fields. Topics will include problem formulation, data collection/labeling, and evaluation techniques, and we will address thorny (but common) obstacles such as missing values, data that is not independently and identically distributed, concept/domain shift, explainability, and more.

You will have the opportunity to apply these concepts and strategies to a data set of your choice. Student work will include reading, implementation, experimentation, analysis of results, and communication of findings. If you're curious about how to solve real problems with machine learning, this is the class for you. Prior experience with supervised machine learning methods (CS 434, CS/AI 534, or instructor permission) is required.


Instructor: Kiri Wagstaff

Teaching Assistant: Grace Diehl

Class meetings (Winter 2022):
Tuesdays and Thursdays, 2-3:20 p.m. (BEXL 320)

Credits: 3

Evaluation:

Syllabus: Syllabus (PDF)

Schedule:

Date / Topics

Jan. 4
  • What you will get out of this class
  • Examples of ML gone wrong

Getting to know your data

Jan. 6
  • What's in your data? Data set profiling
  • What to do when your data has holes (missing values)
Jan. 11
  • The tyranny of the majority: what to do about class imbalance
Jan. 13
  • Is your data set representative of its intended use?
  • Detrimental (and beneficial) sampling bias
Jan. 18
  • Algorithm and data bias

Getting to know your model

Jan. 20
  • Would you use your own classifier?
  • Methods for informative performance evaluation
Jan. 25
  • What kind of errors matter most?
  • Problem-specific evaluation

Data complexities

Jan. 27
  • What if your data has dependencies?
Feb. 1
  • The space-time continuum: structured data
Feb. 3
  • "Change is inevitable; growth is optional." - John Maxwell
  • Dealing with domain shift
Feb. 8
  • What have we learned so far?

Sending your model out into the world

Feb. 10
  • What can you trust?
  • Noisy data, noisy labels
Feb. 15
  • How can you keep things running?
  • Deployment, maintenance, and trust
Feb. 17
  • Why did it do that? (explainability)
Feb. 22
  • When should you believe a prediction?
  • Confidence, uncertainty, and calibration

Going beyond the standard setting

Feb. 24
  • When to have a human in the loop
  • The merits of active learning
March 1
  • The dark side: combating adversaries
March 3
  • Exploration and discovery (unsupervised learning)
March 8
  • Student project presentations
  • Bonus topic: Continual learning
March 10
  • Student project presentations
  • Bonus topic: Machine learning values