Machine Learning Department Research
Lab or Research Group
AI Institute for Societal Decision Making
As part of the AI Institute for Societal Decision Making (AI-SDM), we develop AI to augment human decision-making in societal domains such as public health and disaster management, where complex, often life-saving decisions must be made under uncertain, dynamic, and resource-constrained circumstances while accounting for people's perceptions of risk, trust, and equity. Using data-driven recommendations via bandit and reinforcement learning algorithms, adaptive control trials, counterfactual reasoning, and personalized interventions, we enable public health and emergency management organizations to identify effective, ethical, and socially accepted policy decisions. We are also training and upskilling the workforce at the intersection of AI and the social sciences through targeted engagement with high schools, community colleges, universities, corporations, and government partners, all while raising public awareness of AI's potential to positively impact society.
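Bandit algorithms of the kind mentioned above trade off exploring candidate interventions against exploiting the best one found so far. The following epsilon-greedy sketch is purely illustrative: the function names, reward probabilities, and Bernoulli reward model are assumptions for demonstration, not AI-SDM's actual methods.

```python
import random

def epsilon_greedy(rewards_by_arm, n_rounds=1000, epsilon=0.1, seed=0):
    """Simulate an epsilon-greedy bandit over arms with Bernoulli rewards.

    rewards_by_arm: hypothetical success probabilities, one per candidate
    intervention. Returns per-arm empirical mean estimates and pull counts.
    """
    rng = random.Random(seed)
    n_arms = len(rewards_by_arm)
    counts = [0] * n_arms
    means = [0.0] * n_arms
    for _ in range(n_rounds):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                       # explore
        else:
            arm = max(range(n_arms), key=lambda a: means[a])  # exploit
        reward = 1.0 if rng.random() < rewards_by_arm[arm] else 0.0
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]     # running mean
    return means, counts

# Toy run over three hypothetical interventions.
means, counts = epsilon_greedy([0.2, 0.5, 0.8])
```

More sample-efficient variants (e.g., Thompson sampling or UCB) follow the same select-observe-update loop, differing only in how the next arm is chosen.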
The databases group at Carnegie Mellon University focuses on high-performance database architectures, multimedia, and data mining. We participate in a number of cross-disciplinary efforts and collaborate closely with several other groups at CMU.
Delphi Research Group (Epidemiological Forecasting)
(Roni Rosenfeld, Ryan Tibshirani, Larry Wasserman, Valerie Ventura, Alex Reinhart, Bryan Wilder)
Epidemiological forecasting is critically needed for decision making by public health officials, commercial and non-commercial institutions, and the general public. We have developed multiple award-winning forecasting technologies based on statistical machine learning and other techniques. Our long-term vision is to make epidemiological forecasting as universally accepted and useful as weather forecasting is today. We have participated, and done very well, in all epidemiological forecasting challenges organized by the U.S. government to date: Influenza 2013-2014 (CDC); Chikungunya 2015 (DARPA); Dengue 2009-2014 (White House OSTP); Influenza 2014-2015 (CDC, winner); Influenza 2015-2016 (CDC, winner); Influenza 2016-2017 (CDC, winner); Influenza 2017-2018 (CDC, triple winner).
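As a toy illustration of statistical forecasting (not Delphi's actual system), a seasonal-naive baseline predicts next week's count from same-week values in past seasons, blended with the most recent observation. All data and names here are hypothetical.

```python
from statistics import mean

def one_week_ahead_forecast(history, season_length=52):
    """Seasonal-naive point forecast blended with persistence.

    history: weekly case counts (hypothetical surveillance data), oldest
    first. Averages the same-week-of-season values from past seasons and
    blends that average with the latest observation.
    """
    target_week = len(history) % season_length
    analogs = [history[i] for i in range(len(history))
               if i % season_length == target_week]
    if not analogs:
        return history[-1]  # less than one full season: fall back to persistence
    return 0.5 * mean(analogs) + 0.5 * history[-1]

# Two flat seasons: analogs 10 and 20 average to 15, last value is 20,
# so the blended forecast is 17.5.
history = [10] * 52 + [20] * 52
forecast = one_week_ahead_forecast(history)
```

Real forecasters would replace the fixed 0.5/0.5 blend with weights learned from held-out seasons and would produce full predictive distributions, not point estimates.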
Our principal research interests lie in the development of machine learning and statistical methodology, and of large-scale computational systems and architectures, for solving problems that involve automated learning, reasoning, and decision-making in high-dimensional, multimodal, and dynamic possible worlds arising in artificial, biological, and social systems.
Our long-term research goal is to develop smarter methods for learning and making decisions. Toward this goal, we look at ways to design, analyze, understand, and control complex real-world systems. Our research spans the entire spectrum from theoretical foundations to real-world applications.
|The AUTON Lab
(Artur Dubrawski, Barnabas Poczos, Jeff Schneider)
Our main research focuses on exploring useful data structures and algorithms, and on making interesting statistical and learning methods applicable to both small and large volumes of data. We are deeply interested in the underlying computer science, mathematics, and statistics, as well as in practical applications of our work. We collaborate closely with food safety analysts, public health agencies, nuclear safety experts, managers of equipment fleets, social networkers, astrophysicists, biologists, chemists, drug companies, exploration companies, and roboticists.
|Researchers throughout the world who investigate neural networks in the brain are trying to answer detailed questions using data sets that are large but noisy, creating new challenges for statistics and machine learning. The NeuroStats group contributes by focusing especially on methods for reliably identifying coordinated neural activity across multiple brain areas.
Foundations of Machine Learning, Theory of Computing, and Algorithmic Game Theory
Next Gen Statistical Machine Learning
We focus on two fundamental aspects of "next gen statistical machine learning": Graceful AI, where we wish to learn models that are "graceful" beyond just high average-case performance, and Scrappy AI, where we wish to learn models under resource constraints. Under Graceful AI, the group is engaged in research on Explainable AI (XAI), Robust ML, Adversarial ML, Reliable/Resilient ML for "out of distribution" (OOD) test environments, and Statistical Game Theory. Under Scrappy AI, the group is engaged in research on Structural Causal Models and Directed Graphical Models, Incorporating Domain Knowledge (what DARPA calls "Third Wave AI"), and Self-supervised Learning.
Learning from People
Many applications involve the evaluation of a number of items by a set of people, where each item is evaluated by only a subset of the people and each person evaluates only a subset of the items. This distributed nature of evaluation gives rise to many issues of bias and unfairness. We address these problems through foundational theoretical analysis, algorithm design, and real-world experiments, with an emphasis on making an impact. Our focus application is peer review, the backbone of scientific research. Our work also applies to other settings such as hiring, admissions, crowdsourcing, healthcare, online ratings and recommendations, and peer grading.
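One simple way to see the bias issue above is reviewer leniency: when each reviewer sees only a few items, a harsh reviewer drags down the items they happen to review. A deliberately simplified sketch (not the group's actual methodology; the data are hypothetical) mean-centers each reviewer's scores before averaging, putting harsh and lenient reviewers on a common scale.

```python
from statistics import mean

def calibrate_scores(reviews):
    """Mean-center each reviewer's scores, then average per item.

    reviews: dict mapping reviewer -> {item: raw score} (hypothetical
    data). A harsh reviewer's scores are shifted up and a lenient
    reviewer's down, using only the items each reviewer actually saw.
    """
    adjusted = {}  # item -> list of centered scores
    for reviewer, scores in reviews.items():
        offset = mean(scores.values())          # this reviewer's own average
        for item, s in scores.items():
            adjusted.setdefault(item, []).append(s - offset)
    return {item: mean(vals) for item, vals in adjusted.items()}

# Two reviewers with overlapping assignments: r1 is lenient, r2 harsher.
calibrated = calibrate_scores({"r1": {"a": 5, "b": 3},
                               "r2": {"b": 4, "c": 2}})
```

Mean-centering only corrects additive bias; miscalibration in scale or nonlinearity, and strategic behavior, require the richer models studied in the literature.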
How does the human brain organize information while achieving complex tasks such as understanding a sentence or a visual scene? How does brain activity from humans performing everyday tasks relate to activations of AI algorithms processing the same information? We align brain recordings with AI representations to help us understand how information is processed in different brain areas and how these areas communicate with each other. This alignment can also help us improve our understanding of AI algorithms, and even propose ways to improve them. By studying how individual brains differ in terms of the information they represent, we can also predict differences in behavior, and propose tools to help understand neurological and psychiatric illnesses.
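Aligning AI representations with brain recordings is commonly framed as an encoding model: regress the measured responses on the AI model's activations for the same stimuli, then score the fit per recording channel. A small ridge-regression sketch follows; the array shapes, synthetic data, and function names are assumptions for illustration, not the group's pipeline.

```python
import numpy as np

def fit_encoding_model(features, responses, alpha=1.0):
    """Ridge regression from AI-model features to recorded responses.

    features: (n_samples, n_features) activations of an AI model on the
    stimuli; responses: (n_samples, n_channels) brain measurements (both
    hypothetical). Returns W such that features @ W approximates responses.
    """
    n_features = features.shape[1]
    gram = features.T @ features + alpha * np.eye(n_features)
    return np.linalg.solve(gram, features.T @ responses)

def alignment_score(features, responses, W):
    """Mean per-channel correlation between predicted and true responses."""
    pred = features @ W
    scores = [np.corrcoef(pred[:, v], responses[:, v])[0, 1]
              for v in range(responses.shape[1])]
    return float(np.mean(scores))

# Synthetic check: responses generated from a known linear map plus noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
true_W = rng.normal(size=(5, 3))
Y = X @ true_W + 0.01 * rng.normal(size=(200, 3))
W = fit_encoding_model(X, Y)
score = alignment_score(X, Y, W)
```

In practice the regularization strength and the train/test split are chosen by cross-validation, and correlations are evaluated on held-out stimuli rather than the training set.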
|Foundations of Machine Learning, Deep Learning Theory
The rapid advance in ML models and ML-specific hardware makes it increasingly challenging to build efficient and scalable learning systems that take full advantage of the performance of modern hardware and runtime environments. Today's ML systems rely heavily on human effort to optimize the training and deployment of ML models on specific target platforms. Unlike conventional application domains, learning systems must address continuously growing complexity and diversity in machine learning models, hardware backends, and runtime environments. Our response to this unique challenge is Catalyst (CMU's automated learning systems group), a joint research group spanning machine learning, systems, programming languages, and computer architecture. Our mission is to build ML algorithms and learning systems that automate cross-stack optimizations by leveraging mathematical and statistical properties of ML computations and by co-designing systems, hardware, and ML algorithms.
|Approximately Correct Machine Intelligence (ACMI) Lab
Building intelligent systems applicable in the real world requires more than prediction. Driving decisions requires causal insights. Reliability requires models that are provably robust under clear assumptions. Deploying data-driven technology in society requires accounting for the complex dynamics and feedback loops mediating this interaction. Aligning with social desiderata such as fairness requires a philosophically coherent treatment. ACMI lab studies core machine learning methods, their applications in healthcare, and their social impacts. We seek to address these outer-loop questions, while leveraging breakthroughs in representation learning to address the diverse raw data sources that deep learning has made accessible.
A team led by Bob Murphy, faculty emeritus in the Computational Biology Department, is combining image-derived modeling methods with active learning to build a continuously updating, comprehensive model of protein localization. Obtaining a complete picture of the localization of all proteins in cells, and how it changes under various conditions, is an important but daunting task, given that there are on the order of a hundred cell types in the human body, tens of thousands of proteins expressed in each cell type, and over a million conditions (including the presence of potential drugs or disease-causing mutations). Automated microscopy can help by allowing large numbers of images to be acquired rapidly, but even with automation it will not be possible to directly determine the localization of all proteins in all cell types under all conditions.
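Active learning of the sort described above chooses which experiment to run next rather than measuring everything. A minimal pool-based uncertainty-sampling sketch on one-dimensional data is shown below; the nearest-mean classifier and the oracle are hypothetical stand-ins for the imaging models and microscopy experiments, not the team's actual system.

```python
from statistics import mean

def decision_boundary(labeled):
    """Midpoint between the class means of the labeled points."""
    m0 = mean(x for x, y in labeled.items() if y == 0)
    m1 = mean(x for x, y in labeled.items() if y == 1)
    return (m0 + m1) / 2

def uncertainty_sampling(pool, labeled, oracle, n_queries=5):
    """Repeatedly query the unlabeled point nearest the decision boundary.

    pool: unlabeled points (floats); labeled: {point: 0-or-1 label}
    seeding the model; oracle: callable returning the true label,
    standing in for running an expensive experiment.
    """
    pool = list(pool)
    labeled = dict(labeled)
    for _ in range(n_queries):
        boundary = decision_boundary(labeled)
        x = min(pool, key=lambda p: abs(p - boundary))  # most uncertain point
        labeled[x] = oracle(x)                          # run the "experiment"
        pool.remove(x)
    return labeled, decision_boundary(labeled)
```

Each query lands near the current boundary, so the labeled set concentrates where the model is least certain, which is the same economy of experiments that makes exhaustive imaging unnecessary.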