# ML Classes for Fall 2017

Pittsburgh Campus

To choose between the Introduction to Machine Learning courses (10-401, 10-601, 10-701, and 10-715), please read the Intro to ML Course Comparison.

You may also wish to take our self-assessment exam to evaluate your readiness for various Machine Learning courses.

For information about prerequisites and timing, please see the Schedule of Classes or Student Information Online.

## 10-601 Introduction to Machine Learning (Master's)

Machine Learning (ML) develops computer programs that automatically improve their performance through experience. This includes learning many types of tasks based on many types of experience, e.g. spotting high-risk medical patients, recognizing speech, classifying text documents, detecting credit card fraud, or driving autonomous vehicles. 10-601 covers all or most of: concept learning, decision trees, neural networks, linear learning, active learning, estimation and the bias-variance tradeoff, hypothesis testing, Bayesian learning, the MDL principle, the Gibbs classifier, Naive Bayes, Bayes Nets and Graphical Models, the EM algorithm, Hidden Markov Models, K-Nearest-Neighbors and nonparametric learning, reinforcement learning, bagging, boosting, and discriminative training. Intro to ML Course Comparison.

## 10-605 Machine Learning with Large Datasets

Large datasets are difficult to work with for several reasons. They are difficult to visualize, and it is difficult to understand what sort of errors and biases are present in them. They are computationally expensive to process, and often the cost of learning is hard to predict; for instance, an algorithm that runs quickly on a dataset that fits in memory may be exorbitantly expensive when the dataset is too large for memory. Large datasets may also display qualitatively different behavior in terms of which learning methods produce the most accurate predictions. This course is intended to provide students with practical knowledge of, and experience with, the issues involving large datasets. Among the issues considered are: scalable learning techniques, such as streaming machine learning techniques; parallel infrastructures such as map-reduce; practical techniques for reducing the memory requirements for learning methods, such as feature hashing and Bloom filters; and techniques for analysis of programs in terms of memory, disk usage, and (for parallel methods) communication complexity.

## 10-606 Mathematical Background for Machine Learning I

This course provides a place for students to practice the necessary mathematical background for further study in machine learning -- particularly for taking 10-601 and 10-701. Topics covered include probability, linear algebra (inner product spaces, linear operators), multivariate differential calculus, optimization, and likelihood functions. The course assumes some background in each of the above, but will review and give practice in each. (It does not provide from-scratch coverage of all of the above, which would be impossible in a course of this length.) Some coding will be required: the course will provide practice with translating the above mathematical concepts into concrete programs. The Mathematical Background for Machine Learning sequence is split into two minis, with 10-606 being a prerequisite for 10-607.

## 10-607 Mathematical Background for Machine Learning II

This course provides a place for students to practice the necessary mathematical background for further study in machine learning -- particularly for taking 10-601 and 10-701. Topics covered include probability, linear algebra (inner product spaces, linear operators), multivariate differential calculus, optimization, and likelihood functions. The course assumes some background in each of the above, but will review and give practice in each. (It does not provide from-scratch coverage of all of the above, which would be impossible in a course of this length.) Some coding will be required: the course will provide practice with translating the above mathematical concepts into concrete programs. The Mathematical Background for Machine Learning sequence is split into two minis, with 10-606 being a prerequisite for 10-607.

## 10-701 Introduction to Machine Learning (PhD)

Machine learning studies the question "How can we build computer programs that automatically improve their performance through experience?" This includes learning to perform many types of tasks based on many types of experience. For example, it includes robots learning to better navigate based on experience gained by roaming their environments, medical decision aids that learn to predict which therapies work best for which diseases based on data mining of historical health records, and speech recognition systems that learn to better understand your speech based on experience listening to you. This course is designed to give PhD students a thorough grounding in the methods, mathematics, and algorithms needed to do research and applications in machine learning. Students entering the class with a pre-existing working knowledge of probability, statistics, and algorithms will be at an advantage, but the class has been designed so that anyone with a strong numerate background can catch up and fully participate. Intro to ML Course Comparison.

## 10-707 Topics in Deep Learning

Building intelligent machines that are capable of extracting meaningful representations from high-dimensional data lies at the core of solving many AI-related tasks. In the past few years, researchers across many different communities, from applied statistics to engineering, computer science, and neuroscience, have developed deep (hierarchical) models: models that are composed of several layers of nonlinear processing. An important property of these models is that they can learn useful representations by re-using and combining intermediate concepts, allowing these models to be successfully applied in a wide variety of domains, including visual object recognition, information retrieval, natural language processing, and speech perception. This is an advanced graduate course, designed for Master's and Ph.D. level students, and will assume a reasonable degree of mathematical maturity. The goal of this course is to introduce students to the recent and exciting developments of various deep learning methods. Some topics to be covered include: restricted Boltzmann machines (RBMs) and their multi-layer extensions, Deep Belief Networks and Deep Boltzmann machines; sparse coding; autoencoders; variational autoencoders; convolutional neural networks; recurrent neural networks; generative adversarial networks; and attention-based models with applications in vision, NLP, and multimodal learning. We will also address mathematical issues, focusing on efficient large-scale optimization methods for inference and learning, as well as training density models with intractable partition functions.

## 10-709 Fundamentals of Learning from the Crowd

Crowdsourcing is a burgeoning area that is popular in academic research, industrial applications, and societal causes. In this course, we will cover the foundational theoretical principles behind crowdsourcing and learning from the crowd. We will study this field through the lens of game theory (how to incentivize people to provide better data) and that of learning theory (how to make sense of this data). We will also touch upon literature in psychology and economics that studies the behavior of people. Along the way, we will discuss several fascinating paradoxes and conduct some live experiments in class. Almost all lectures will be taught on the board. Required background material such as scoring rules, Nash equilibrium, concentration inequalities, and random matrix theory will be taught in class. Evaluation will be based on homework, a final project, and class participation. The prerequisites are basic probability (e.g., the student should be comfortable with conditional expectations, the Gaussian distribution, and the union bound), basic linear algebra (e.g., singular value decomposition), and basic programming.

## 10-715 Advanced Introduction to Machine Learning

The rapid improvement of sensory techniques and processor speed, and the availability of inexpensive massive digital storage, have led to a growing demand for systems that can automatically comprehend and mine massive and complex data from diverse sources. Machine Learning is becoming the primary mechanism by which information is extracted from Big Data, and a primary pillar that Artificial Intelligence is built upon. This course is designed for Ph.D. students whose primary field of study is machine learning, or who intend to make machine learning methodological research a main focus of their thesis. It will give students a thorough grounding in the algorithms, mathematics, theories, and insights needed to do in-depth research and applications in machine learning. The topics of this course will in part parallel those covered in the general graduate machine learning course (10-701), but with a greater emphasis on depth in theory and algorithms. The course will also include additional advanced topics such as privacy in machine learning, interactive learning, reinforcement learning, online learning, Bayesian nonparametrics, and additional material on graphical models. Students entering the class are expected to have a pre-existing strong working knowledge of algorithms, linear algebra, probability, and statistics. Intro to ML Course Comparison.

## 10-725 Convex Optimization

Nearly every problem in machine learning can be formulated as the optimization of some function, possibly under some set of constraints. This universal reduction may seem to suggest that such optimization tasks are intractable. Fortunately, many real world problems have special structure, such as convexity, smoothness, separability, etc., which allow us to formulate optimization problems that can often be solved efficiently. This course is designed to give a graduate-level student a thorough grounding in the formulation of optimization problems that exploit such structure, and in efficient solution methods for these problems. The main focus is on the formulation and solution of convex optimization problems, though we will discuss some recent advances in guarantees for non-convex optimization. These general concepts will also be illustrated through applications in machine learning and statistics. Students entering the class should have a pre-existing working knowledge of algorithms, though the class has been designed to allow students with a strong numerate background to catch up and fully participate. Though not required, having taken 10-701 or an equivalent machine learning or statistics class is strongly encouraged, since we will use applications in machine learning and statistics to demonstrate the concepts we cover in class. Students will work on an extensive optimization-based project throughout the semester.

## 10-805 Machine Learning with Large Datasets

10-805 shares lectures with 10-605, but 10-805 students must give class presentations and complete a research project, and will do fewer programming assignments. 10-805 students are therefore expected to be capable of surveying recent literature and conducting research. Four lecture sessions for 10-605 will also be reserved for 10-805 student presentations.

## 10-808 Language Grounding to Vision & Control

The course will study the question of how we can understand what language really means: not how to form realistic-looking sentences, but rather how to deliver the right meaning needed for a task. Language will thus be studied as a means of 1) communication and 2) weak supervision for learning execution graphs of vision or control routines, as well as the inverse: how other modalities can provide supervision for language grounding. We will cover topics in visual question answering, language pragmatics, program induction, automatic theorem proving, language models for parsing, translation, and grounded conversation models. Each lecture will consist of a formal presentation of background and preliminaries, followed by paper discussion. There will be no homework; grading will be based on a) participation and b) the completion of a relevant final project. Prerequisites: Familiarity with computer vision, deep learning, and basic reinforcement learning is assumed. Familiarity with computational linguistics is not necessary.

## Research

The following course numbers are available for students conducting research in Machine Learning:

- 10-500 Senior Research Project *(ML Minor)*
- 10-611 MS DAP Research *(ML MS)*
- 10-620 Independent Study *(general undergraduate and MS)*
- 10-697 Reading and Research *(ML MS)*
- 10-821 DAP Preparation *(ML MS and ML PhD)*
- 10-910 PhD DAP Research *(ML PhD)*
- 10-920 Graduate Reading and Research *(ML PhD)*
- 10-930 Dissertation Research *(ML PhD)*
- 10-935 Practicum *(ML PhD)*
- 10-940 Independent Study *(general PhD)*

Students should contact Dorothy Holland-Minkley for help registering for 500- and 600-level courses, or Diane Stidle for 700-level and above courses.