Overview

How can we get most useful information at minimum cost?

This question is fundamental to many domains such as sponsored search, autonomous robotic exploration, design of experiments, sensor placement, understanding human perception, and many others. In this course, we will study some of the fundamental theoretical approaches, including

Online decision making: Bandit problems, reinforcement learning, exploration / exploitation tradeoffs, learning under partial observability.
Statistical approaches to active learning: When does active data acquisition help learn more quickly than traditional, passive machine learning algorithms?
Combinatorial approaches: Bayesian experimental design, submodular function optimization

There will be introductory lectures providing background in the areas, as well as student presentations of relevant research papers. A major focus of this seminar course will be to trace the research frontier and identify important open problems. Students are encouraged to apply the course material to their research in a class project.

Details

Instructor: Andreas Krause (krausea [at] caltech.edu)
Teaching Assistant: Ryan Gomes (gomes [at] caltech.edu)
Time and Location: Winter ’08/’09, Tue & Thu 10:30-12 in 74 Jorgensen
Prerequisites: Basic algorithms and probability
9 Units (3-0-6): Grading mainly based on a student presentation and a class project
Office hours:

Andreas Krause: Monday 3pm-4:30pm in 260 Jorgensen
Ryan Gomes: Wednesday 4-6pm in 109 Moore

Collaboration policy:

Homeworks: Discussion of the problems is allowed, however, everybody must turn in their own solutions.
Project: Groups of 2-3 students (exceptions possible with instructor's permission)

Late day policy: Everybody has 3 late days that can be used for any homework assignment (e.g., you can be later for HW2 for 2 days and HW1 for 1 day). No credits when turning in after the late days have been used. Late days can not be used for the project proposal and reports. Start early!

Homeworks

Homework 1 [pdf] due Thursday January 29
Homework 2 [pdf] due Thursday February 19
Homework 3 [pdf] due Thursday March 5

Project

Possible project ideas
Proposals (1-2 pages) due Tuesday January 27
Project milestone due Tuesday February 24

Final project due Tuesday March 17

Writeup: 8 pages in NIPS format
Describe formal problem statement, your approach towards solving the problem and your results (experiments, proofs, etc.)
Poster session: March 17, 1pm-2:30pm 2nd floor atrium in Powell-Booth. Easels, boards (42" by 30") and cookies will be provided. Bring a demo on laptop if possible / suitable. There will be a best project award based on popular vote.

Lecture notes

Jan 6: Introduction [pdf]
Jan 8: Bandit Problems [pdf]
Jan 13: MDPs and Reinforcement learning [pdf, Demo]
Jan 15: Online linear optimization

Dave Buchfuhrer: [A.3] A. Kalai, S. Vempala. "Efficient Algorithms for Online Decision Problems" Journal of Computer System Sciences 2005 [pdf]
Daniel Obenshain: [A.8] B. McMahan, A. Blum. "Online Geometric Optimization in the Bandit Setting Against an Adaptive Adversary", COLT 2004 [pdf]

Jan 20: Bandit algorithms

Cliff Chang [A.2] S. Bubeck, R. Munos, G. Stoltz, C. Szepesvari. "Online Optimization in X-Armed Bandits" NIPS 2008 [pdf]
Benjamin Flora [A.4] J. Langford and T. Zhang. "The Epoch-Greedy Algorithm for Contextual Multi-armed Bandits" NIPS 2007 [pdf]

Jan 22: Gaussian Process optimization [pdf]
Jan 27: Learning and sample complexity [pdf]
Jan 29: Introduction to active learning [pdf]
Feb 3: Practical algorithms

Krzysztof Chalupka [B.7] S. Tong, D. Koller. "Support Vector Machine Active Learning with Applications to Text Classification." JMLR 2001 [pdf]

Krzysztof created a pretty cool demo for active learning using SVMs, have a look! Here's the Matlab code. Also requires this toolbox.

Cheng Hong [B.4] R. Castro, C. Kalish, R. Nowak, R. Qian, T. Rogers, X. Zhu. "Human Active Learning", NIPS 2008 [pdf]

Feb 5: Sample complexity for active learning

Peter Sadowski [B.6] S. Dasgupta. "Coarse Sample complexity bounds for active learning", NIPS 2005 [pdf]
David Lee [B.8] S. Dasgupta, D. Hsu, C. Monteleoni. "A General Agnostic Active Learning Algorithm", NIPS 2007 [pdf]

Feb 10: Bayesian experimental design [pdf]
Feb 12: Submodular functions [pdf]
Feb 17: Combinatorial optimization for experimental design [pdf]
Feb 19: Submodular function optimization for active learning

Esther Wang [C.3] S. Hoi, R. Jin, J. Zhu, M. Lyu "Batch mode active learning and its application to medical image classification", ICML 2006 [pdf]
Shankar Kalyanaraman [C.2] F. Radlinski, R. Kleinberg, T. Joachims. "Learning diverse rankings with multi-armed bandits" ICML 2008 [pdf]

Feb 24: Applications

Jon Napolitano [C.5] M. Munie, Y. Shoham. "Optimal Testing of Structured Knowledge", AAAI 2008 [pdf]

Feb 26: Guest lecture on Sparse PCA and LDA: Baback Moghaddam from the JPL Machine Learning group
Mar 3: Informative path planning

Andy Matuschak [C.8] R. Sim, N. Roy. "Global A-Optimal Robot Exploration in SLAM", ICRA 2005 [pdf]
Matt Wu [C.4] G. Hollinger and S. Singh. "Proofs and Experiments in Scalable, Near-Optimal Search by Multiple Robots", RSS 2008 [pdf]

Mar 5: Applications of Bayesian experimental design

Matthew Maurer [C.6] L. Renninger, J. Coughlan, P. Verghese, J. Malik. "An Information Maximization Model of Eye Movements" NIPS 2004 [pdf]

Alex Roper [C.10] J. Lewi, R. Butera, L. Paninski. "Real-time adaptive information-theoretic optimization of neurophysiology experiments", NIPS 2006 [pdf]

Mar 10: Compressive sensing

Pete Trautman [C.7] M. Seeger, H. Nickisch "Compressed Sensing and Bayesian Experimental Design" ICML 2008 [pdf]
Summary lecture [pdf]

Mar 17: 1pm-2:30pm Poster session, 2nd floor atrium in Powell-Booth (CACR Atrium)

Papers

Online decision making and bandit problems

[A.1] S. Pandey, D. Chakrabarti, D. Agrawal. "Multi-armed Bandit Problems with Dependent Arms" ICML 2006
[A.2] S. Bubeck, R. Munos, G. Stoltz, C. Szepesvari. "Online Optimization in X-Armed Bandits" NIPS 2008. Presenter: Cliff Chang
[A.3] A. Kalai, S. Vempala. "Efficient Algorithms for Online Decision Problems" Journal of Computer System Sciences 2005. Presenter: Dave Buchfuhrer
[A.4] J. Langford and T. Zhang. "The Epoch-Greedy Algorithm for Contextual Multi-armed Bandits" NIPS 2007. Presenter: Benjamin Flora
[A.5] R. Kleinberg, A. Niculescu-Mizil, and Y. Sharma. "Regret bounds for sleeping experts and bandits" COLT 2008. Presenter: Patrick Liang
[A.6] D. Lizotte, T. Wang, M. Bowling, D. Schuurmans. "Automatic Gait Optimization with Gaussian Process Regression" IJCAI 2007
[A.7] R. Martinez-Cantin, N. de Freitas, A. Doucet and J. Castellanos. "Active Policy Learning for Robot Planning and Exploration under Uncertainty" RSS 2007. Presenter: Anthony Roy
[A.8] B. McMahan, A. Blum. "Online Geometric Optimization in the Bandit Setting Against an Adaptive Adversary", COLT 2004. Presenter: Daniel Obenshain
[A.9] P. Auer, N. Cesa-Bianchi, P. Fischer. "Finite-time Analysis of the Multiarmed Bandit Problem", Machine Learning, 2002
[A.10] P. Auer, R. Ortner. "Logarithmic Online Regret Bounds for Undiscounted
Reinforcement Learning", NIPS 2006

Statistical approaches to active learning

[B.1] S. Dasgupta. "Analysis of a greedy active learning strategy" NIPS 2004
[B.2] M. Balcan, S. Hanneke, J. Wortmann. "The true sample complexity of Active Learning", COLT 2008. Presenter: Mason Smith
[B.3] S. Dasgupta, D.J. Hsu. "Hierarchical sampling for active learning", ICML 2008. Presenter: Alex Fogel
[B.4] R. Castro, C. Kalish, R. Nowak, R. Qian, T. Rogers, X. Zhu. "Human Active Learning", NIPS 2008. Presenter: Cheng Hong
[B.5] R. Castro, R. Nowak. "Minimax Bounds for Active Learning", COLT 2007
[B.6] S. Dasgupta. "Coarse Sample complexity bounds for active learning", NIPS 2005. Presenter: Peter Sadowski
[B.7] S. Tong, D. Koller. "Support Vector Machine Active Learning with Applications to Text Classification." JMLR 2001. Presenter: Krzysztof Chalupka
[B.8] S. Dasgupta, D. Hsu, C. Monteleoni. "A General Agnostic Active Learning Algorithm", NIPS 2007. Presenter: David Lee

Combinatorial approaches

[C.1] M. Streeter, D. Golovin "An Online Algorithm for Maximizing Submodular Functions" NIPS 2008
[C.2] F. Radlinski, R. Kleinberg, T. Joachims. "Learning diverse rankings with multi-armed bandits" ICML 2008. Presenter: Shankar Kalyanaraman
[C.3] S. Hoi, R. Jin, J. Zhu, M. Lyu "Batch mode active learning and its application to medical image classification", ICML 2006. Presenter: Esther Wang
[C.4] G. Hollinger and S. Singh. "Proofs and Experiments in Scalable, Near-Optimal Search by Multiple Robots", RSS 2008. Presenter: Matt Wu
[C.5] M. Munie, Y. Shoham. "Optimal Testing of Structured Knowledge", AAAI 2008. Presenter: Jon Napolitano
[C.6] L. Renninger, J. Coughlan, P. Verghese, J. Malik. "An Information Maximization Model of Eye Movements" NIPS 2004. Presenter: Matthew Maurer
[C.7] M. Seeger, H. Nickisch "Compressed Sensing and Bayesian Experimental Design" ICML 2008. Presenter: Dominic Rizzo
[C.8] R. Sim, N. Roy. "Global A-Optimal Robot Exploration in SLAM", ICRA 2005. Presenter: Andy Matuschak
[C.9] A. Krause, C. Guestrin, A. Gupta, J. Kleinberg. "Near-optimal sensor placements in Gaussian Processes: Maximizing Information while Minimizing Communication Cost", IPSN 2006
[C.10] J. Lewi, R. Butera, L. Paninski. "Real-time adaptive information-theoretic optimization of neurophysiology experiments", NIPS 2006. Presenter: Alex Roper
[C.11] J. Williams, J. Fisher III, A. Willsky. "Performance guarantees for information-theoretic active inference", AISTATS '07. Presenter: Peter Trautman
[C.12] A. Doucet, B. Vo, C. Andrieu, M. Davy. "Particle filtering for multi-target tracking and sensor management", Information Fusion 2002