Title: Knowledge-Guided Machine Learning: A New Framework for Accelerating Scientific Discovery
Abstract: Process-based models of dynamical systems are often used to study engineering and environmental systems. Despite their extensive use, these models have several well-known limitations due to incomplete or inaccurate representations of the physical processes being modeled. There is a tremendous opportunity to systematically advance modeling in these domains by using state-of-the-art machine learning (ML) methods that have already revolutionized computer vision and language translation. However, capturing this opportunity is contingent on a paradigm shift in data-intensive scientific discovery since the “black box” use of ML often leads to serious false discoveries in scientific applications. Because the hypothesis space of scientific applications is often complex and exponentially large, an uninformed data-driven search can easily select a highly complex model that is neither generalizable nor physically interpretable, resulting in the discovery of spurious relationships, predictors, and patterns. This problem becomes worse when there is a scarcity of labeled samples, which is quite common in science and engineering domains.
This talk makes the case that in real-world systems governed by physical processes, there is an opportunity to take advantage of fundamental physical principles to inform the search for a physically meaningful and accurate ML model. While this talk will illustrate the potential of the knowledge-guided machine learning (KGML) paradigm in the context of environmental problems (e.g., freshwater science, hydrology, agroecology), the paradigm has the potential to greatly advance the pace of discovery in a diverse set of disciplines where mechanistic models are used, e.g., power engineering, climate science, weather forecasting, and pandemic management.
Vipin Kumar is a Regents Professor and holds the William Norris Chair in the Department of Computer Science and Engineering at the University of Minnesota. His research spans data mining, high-performance computing, and their applications in climate/ecosystems and health care. He also served as the Director of the Army High-Performance Computing Research Center (AHPCRC) from 1998 to 2005. He has authored over 400 research articles and co-edited or co-authored 10 books, including the widely used textbooks “Introduction to Parallel Computing” and “Introduction to Data Mining.” Kumar’s current major research focus is knowledge-guided machine learning and its applications to understanding the impact of human-induced changes on the Earth and its environment. Kumar’s research on this topic is funded by NSF’s BIGDATA, INFEWS, STC, and HDR programs, as well as ARPA-E, DARPA, and USGS. He recently finished serving as the Lead PI of a 5-year, $10 million project, “Understanding Climate Change – A Data-Driven Approach”, funded by the NSF’s Expeditions in Computing program. Kumar is a Fellow of the ACM, IEEE, AAAS, and SIAM. Kumar’s foundational research in data mining and high-performance computing has been honored by the ACM SIGKDD 2012 Innovation Award, the highest award for technical excellence in the field of Knowledge Discovery and Data Mining (KDD); the 2016 IEEE Computer Society Sidney Fernbach Award, one of the IEEE Computer Society’s highest awards in high-performance computing; and the Test of Time Award at the 2021 Supercomputing Conference (SC21).
Title: Scalable Physics-based Maximum Likelihood Estimation using Hierarchical Matrices
Abstract: Physics-based covariance models provide a systematic way to construct covariance models that are consistent with the underlying physical laws in Gaussian process analysis. The unknown parameters in the covariance models can be estimated using maximum likelihood estimation, but direct construction of the covariance matrix and classical strategies for computing with it require n physical model runs, O(n²) storage, and O(n³) computation. To address these challenges, we propose to approximate the discretized covariance function using hierarchical matrices. By utilizing randomized range sketching for individual off-diagonal blocks, the construction process of the hierarchical covariance approximation requires O(log n) physical model applications, and the maximum likelihood computations require O(n log² n) effort per iteration. We propose a new approach to compute the trace of products of hierarchical matrices exactly, which makes the expected Fisher information matrix computable in O(n log² n) as well. The construction is entirely matrix-free, and the derivatives of the covariance matrix can be approximated in the same hierarchical structure by differentiating the whole process. Numerical results are provided to demonstrate the effectiveness, accuracy, and efficiency of the proposed method for parameter estimation and uncertainty quantification.
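To make the dense baseline concrete, here is a minimal pure-Python sketch of the Gaussian log-likelihood computation that the abstract's hierarchical approach accelerates. The function names are illustrative, not from the talk; the dense Cholesky factorization below is exactly the O(n³)-work, O(n²)-storage step that hierarchical matrices replace.

```python
import math

def cholesky(K):
    """Dense Cholesky factorization K = L L^T (O(n^3) work, O(n^2) storage)."""
    n = len(K)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(K[i][i] - s)
            else:
                L[i][j] = (K[i][j] - s) / L[j][j]
    return L

def forward_solve(L, y):
    """Solve L x = y for lower-triangular L."""
    x = [0.0] * len(y)
    for i in range(len(y)):
        x[i] = (y[i] - sum(L[i][k] * x[k] for k in range(i))) / L[i][i]
    return x

def gp_neg_log_likelihood(K, y):
    """Gaussian negative log-likelihood:
    0.5 * (y^T K^{-1} y + log det K + n log 2*pi)."""
    n = len(y)
    L = cholesky(K)
    z = forward_solve(L, y)                      # applies K^{-1/2} via Cholesky
    quad = sum(v * v for v in z)                 # y^T K^{-1} y
    logdet = 2.0 * sum(math.log(L[i][i]) for i in range(n))
    return 0.5 * (quad + logdet + n * math.log(2.0 * math.pi))
```

Replacing the dense factorization, solves, and log-determinant here with hierarchical-matrix counterparts is what reduces the per-iteration cost from O(n³) to the O(n log² n) quoted in the abstract.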
Bio: Mihai Anitescu is a senior computational mathematician in the Mathematics and Computer Science Division at Argonne National Laboratory and a professor in the Department of Statistics at the University of Chicago. He obtained his engineering diploma (electrical engineering) from the Polytechnic University of Bucharest in 1992 and his Ph.D. in applied mathematical and computational sciences from the University of Iowa in 1997. He specializes in numerical optimization, computational science, numerical analysis, and uncertainty quantification, areas in which he has published more than 100 papers in scholarly journals and book chapters. He serves on the editorial board of the SIAM Journal on Optimization and as a senior editor for Optimization Methods and Software; he is a past member of the editorial boards of Mathematical Programming A and B, the SIAM Journal on Uncertainty Quantification, and the SIAM Journal on Scientific Computing. He was recognized for his work in applied mathematics by his selection as a SIAM Fellow in 2019.
Title: Hybrid Operational Digital Twins for Complex Systems: Fusing physics-based and deep learning algorithms for fault diagnostics and prognostics
Abstract: Deep learning algorithms need large amounts of representative data to learn relevant patterns. Although increasing amounts of condition monitoring data have recently been collected for complex systems, these data lack labels (in the form of faults) and often also lack representativeness due to the high variability in operating conditions. Integrating physics and structural inductive bias helps to overcome some of the limitations of deep learning algorithms: it reduces the amount of required training data, adds interpretability to the algorithms, and makes some previously unsolvable problems solvable. Furthermore, it helps to build trust in the algorithms by making the outputs interpretable.
The talk will give some insights into operational digital twins developed by fusing physics-based and deep learning algorithms for fault diagnostics and prognostics.
Bio: Olga Fink has been an assistant professor of intelligent maintenance and operations systems at EPFL since March 2022.
Olga is also a research affiliate at the Massachusetts Institute of Technology and an expert for Innosuisse in the field of ICT. Olga’s research focuses on hybrid algorithms fusing physics-based models and deep learning, hybrid operational digital twins, transfer learning, self-supervised learning, deep reinforcement learning, and multi-agent systems for the intelligent maintenance and operation of infrastructure and complex assets. Before joining the EPFL faculty, Olga was an assistant professor of intelligent maintenance systems at ETH Zurich from 2018 to 2022, having been awarded the prestigious professorship grant of the Swiss National Science Foundation (SNSF). Between 2014 and 2018, she headed the research group “Smart Maintenance” at the Zurich University of Applied Sciences (ZHAW). Olga received her Ph.D. degree from ETH Zurich with a thesis on “Failure and Degradation Prediction by Artificial Neural Networks: Applications to Railway Systems”, and a Diploma degree in industrial engineering from Hamburg University of Technology. She gained valuable industrial experience as a reliability engineer with Stadler Bussnang AG and as a reliability and maintenance expert with Pöyry Switzerland Ltd. In 2018, Olga was selected as one of the “Top 100 Women in Business, Switzerland”, and in 2019 she was selected as a Young Scientist of the World Economic Forum.
Title: Deep Latent Variable Models for Sequential Data
Abstract: I will introduce a powerful framework for designing probabilistic models for high-dimensional time series. Drawing on deep latent variable models and variational inference, I will present several non-linear extensions of hidden Markov models and Kalman filters that can cope with the complexities of high-dimensional data. I will thereby address three challenges: how to scale forecasting methods up to high-dimensional data at the scale of video, how to deal with temporal discontinuities in the data such as abrupt distribution shifts, and how to adapt time series models to individual sequence instances, e.g., for personalization. I will present applications of these methods in modeling language evolution, continuous-time event forecasting, simulating fluid dynamics, and video compression.
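As background for the non-linear extensions mentioned above, a minimal sketch of the classical scalar Kalman filter may help: one predict/update cycle of a linear-Gaussian state-space model, the building block that the talk's deep latent variable models generalize. Names and parameterization here are illustrative, not the speaker's code.

```python
def kalman_step(mean, var, a, q, y, c, r):
    """One predict/update cycle of a scalar linear-Gaussian state-space model.
    Latent dynamics: x_t = a * x_{t-1} + process noise (variance q)
    Observation:     y_t = c * x_t     + measurement noise (variance r)."""
    # Predict: propagate the latent Gaussian through the linear dynamics.
    pred_mean = a * mean
    pred_var = a * a * var + q
    # Update: condition the predicted Gaussian on the observation y.
    gain = pred_var * c / (c * c * pred_var + r)   # Kalman gain
    new_mean = pred_mean + gain * (y - c * pred_mean)
    new_var = (1.0 - gain * c) * pred_var
    return new_mean, new_var
```

When the dynamics or observation maps become non-linear neural networks, this closed-form update no longer exists, which is where the variational inference machinery of the talk comes in.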
Bio: Stephan Mandt is an Assistant Professor of Computer Science and Statistics at the University of California, Irvine. From 2016 until 2018, he was a Senior Researcher and Head of the statistical machine learning group at Disney Research, first in Pittsburgh and later in Los Angeles. He held previous postdoctoral positions at Columbia University and Princeton University. Stephan holds a Ph.D. in Theoretical Physics from the University of Cologne, where he received the German National Merit Scholarship. He is furthermore a Kavli Fellow of the U.S. National Academy of Sciences, an NSF CAREER Awardee, a member of the ELLIS Society, and a former visiting researcher at Google Brain. Stephan regularly serves as an Area Chair, Action Editor, or Editorial Board member for NeurIPS, ICML, AAAI, ICLR, TMLR, and JMLR. His research is currently supported by NSF, DARPA, DOE, Disney, Intel, and Qualcomm.
Title: How to Train Your Digital Twin: Practical Deep Learning Approaches to Modeling As-built Components
Abstract: Modern engineering approaches rely on the ability to accurately predict the performance of components in the field. Traditionally, idealized models have been used to estimate system behavior based on the assumption that parts will be built to match their design. Recently, a paradigm shift toward data-driven modeling has enabled the simulation of as-built parts, capturing a better estimate of the variance in performance due to imperfections that are inevitable in the real world. This talk will cover a set of techniques developed under the Sandia National Laboratories LDRD program that leverage deep learning to automate the steps necessary to build digital twins. Unsupervised anomaly detection, volumetric segmentation for image-based simulation, and predictive physics-informed multimodal autoencoders together enable the construction of digital twins. We apply these techniques to high-throughput additive manufacturing systems, providing a means of AI-enhanced diagnostics and optimal process control.
Sandia National Laboratories is a multimission laboratory managed and operated by National Technology and Engineering Solutions of Sandia, LLC, a wholly-owned subsidiary of Honeywell International, Inc., for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-NA0003525. SAND2022-4407 A.
Bio: Cari Martinez is a principal computer scientist and leads a machine learning team in the Applied Machine Intelligence organization at Sandia National Laboratories. Her research focuses on developing deep learning methods that leverage uncertainty quantification and explainability techniques to improve modeling capabilities and solve critical national security problems. She holds a BS in Honors Mathematics from the University of Notre Dame, an MS in Computer Science from the University of New Mexico, and she is currently completing her Computer Science Ph.D. at Arizona State University under the advisement of Dr. Stephanie Forrest.
Title: Scalable Operator Learning with Quantified Uncertainty
Abstract: Supervised operator learning is an emerging machine learning paradigm with applications to modeling the evolution of spatio-temporal dynamical systems and approximating general black-box relationships between functional data. We propose a novel operator learning method, LOCA (Learning Operators with Coupled Attention), motivated by the recent success of the attention mechanism. In our architecture, the input functions are mapped to a finite set of features which are then averaged with attention weights that depend on the output query locations. By coupling these attention weights together with an integral transform, LOCA is able to explicitly learn correlations in the target output functions, enabling us to approximate nonlinear operators even when the number of output function measurements in the training set is very small. Our formulation is accompanied by rigorous approximation theoretic guarantees on the universal expressiveness of the proposed model. Empirically, we evaluate the performance of LOCA on several operator learning scenarios involving systems governed by ordinary and partial differential equations, as well as a black-box climate prediction problem. Through these scenarios, we demonstrate state-of-the-art accuracy, robustness with respect to noisy input data, and a consistently small spread of errors over testing data sets, even for out-of-distribution prediction tasks. We also present a simple and effective approach for posterior uncertainty quantification which can further enhance robustness and predictive accuracy, as well as effectively detect out-of-distribution and adversarial examples.
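The coupling described above, a finite set of input features averaged with attention weights that depend on the output query location, can be sketched in a toy one-dimensional form. This is an illustrative analogue only, not the LOCA implementation; the distance-based scoring function and all names are assumptions.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_average(query_loc, feature_locs, features, scale=1.0):
    """Average a finite set of input-function features with attention
    weights that depend on the output query location (toy analogue of
    the query-dependent averaging in the abstract)."""
    # Score each feature by negative squared distance to the query location.
    scores = [-((query_loc - x) ** 2) / scale for x in feature_locs]
    weights = softmax(scores)        # weights sum to 1 for each query
    return sum(w * f for w, f in zip(weights, features))
```

A query halfway between two feature locations receives their plain average, while a query near one location is dominated by that location's feature; LOCA additionally couples these weights across queries through an integral transform, which this sketch omits.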
Bio: Paris Perdikaris is an Assistant Professor in the Department of Mechanical Engineering and Applied Mechanics at the University of Pennsylvania. He received his Ph.D. in Applied Mathematics at Brown University in 2015, and, prior to joining Penn in 2018, he was a postdoctoral researcher at the Department of Mechanical Engineering at the Massachusetts Institute of Technology. His current research interests include physics-informed machine learning, uncertainty quantification, and engineering design optimization. His work and service have received several distinctions including the DOE Early Career Award (2018), the AFOSR Young Investigator Award (2019), the Ford Motor Company Award for Faculty Advising (2020), the SIAG/CSE Early Career Prize (2021), and the Scialog Fellowship (2021).
Title: The Continuing Advances of Differentiable Simulation
Abstract: Differentiable simulation techniques are at the core of scientific machine learning methods, which are used to automatically discover mechanistic models by infusing neural network training into the simulation process. In this talk, we will start by showcasing some of the ways differentiable simulation is being used, from the discovery of extrapolatory epidemic models to nonlinear mixed-effects models in pharmacology. From there, we will discuss the computational techniques behind the training process, focusing on the numerical issues involved in differentiating highly stiff and chaotic systems. Viewers will leave with an understanding of how compiler techniques are being infused into the simulation stack to provide the future of differentiable simulators.
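The core idea of differentiating through a simulation can be sketched with forward-mode automatic differentiation via dual numbers: carry a derivative alongside each value through every arithmetic operation of the time-stepping loop. This is a toy illustration under assumed names, not the SciML implementation (which uses far more sophisticated adjoint and compiler techniques).

```python
class Dual:
    """Dual number (value, derivative) for forward-mode automatic
    differentiation through a simulation loop."""
    def __init__(self, val, der=0.0):
        self.val, self.der = val, der
    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.der + other.der)
    __radd__ = __add__
    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule carries derivatives through each multiplication.
        return Dual(self.val * other.val,
                    self.val * other.der + self.der * other.val)
    __rmul__ = __mul__

def simulate(rate, x0=1.0, dt=0.01, steps=100):
    """Explicit Euler integration of dx/dt = rate * x, differentiable in
    `rate` when called with a Dual number."""
    x = Dual(x0)
    for _ in range(steps):
        x = x + dt * (rate * x)
    return x

# Seed the derivative d(rate)/d(rate) = 1 to differentiate the final state
# with respect to the growth rate:
out = simulate(Dual(0.5, 1.0))
# out.val is the simulated final state; out.der is its sensitivity to `rate`.
```

The same principle, applied at compiler level with reverse-mode (adjoint) differentiation and stiff solvers, is what makes gradient-based training of mechanistic models feasible at scale.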
Bio: Chris is an Applied Mathematics Instructor at MIT and the lead developer of the SciML Open Source Software Organization, which includes the DifferentialEquations.jl solver suite along with hundreds of state-of-the-art packages for mixing machine learning with mechanistic modeling. Chris’ work on high-performance differential equation solving is the engine accelerating many applications, from the MIT-Caltech CLiMA climate modeling initiative to the SIAM Dynamical Systems award-winning DynamicalSystems.jl toolbox. As the Director of Scientific Research at Pumas-AI, Chris is the lead developer of Pumas, where he has received a top presentation award at every ACoP in the last three years for improving methods for uncertainty quantification, automated GPU acceleration of nonlinear mixed-effects modeling (NLME), and machine-learning-assisted construction of NLME models with DeepNLME. For these achievements, Chris received the Emerging Scientist award from ISoP, the highest early career award in pharmacometrics. As the Director of Modeling and Simulation at Julia Computing, Chris is the lead developer of JuliaSim, where his work is credited with a 15,000x acceleration of NASA Launch Services simulations and recently demonstrated a 60x–570x acceleration over Modelica tools in HVAC simulation, earning Chris the US Air Force Artificial Intelligence Accelerator Scientific Excellence Award.
Title: The Future of A.I. for Social Good
Abstract: The A.I. industry has powered a futuristic reality of self-driving cars and voice assistants to help us with almost any need. However, the A.I. industry has also created systemic challenges. For instance, while it has led to platforms where workers label data to improve machine learning algorithms, my research has uncovered that these workers earn less than minimum wage. We are also seeing a surge of A.I. algorithms that privilege certain populations and racially exclude others. If we can fix these challenges, we can create greater societal justice and enable A.I. that better addresses people’s needs, especially those of groups we have traditionally excluded.
In this talk, I will discuss some of the urgent global problems that my research has uncovered in the A.I. industry. I will present how we can start to address these problems through my proposed “A.I. For Good” framework, which uses value-sensitive design to understand people’s values and rectify harm. I will present case studies in which I use this framework to design A.I. systems that improve the labor conditions of the workers operating behind the scenes in the A.I. industry. I will conclude by presenting a research agenda for studying the impact of A.I. on society and for researching effective socio-technical solutions in favor of workers.
Bio: Saiph Savage is an Assistant Professor at Northeastern University in the Khoury College of Computer Sciences, where she conducts research at the intersection of Human Computer Interaction, A.I., and Civic Technology. She is one of the MIT Technology Review’s 35 Innovators Under 35, a Google Anita Borg Scholarship recipient, and a fellow at the Center for Democracy & Technology. Her work has been covered by the BBC, Deutsche Welle, the Economist, and the New York Times, and published in top venues such as ACM CHI, CSCW, and the Web Conference, where she has won honorable mentions and impact awards. Dr. Savage has been awarded grants from the National Science Foundation, the United Nations, and diverse industry actors, and has formalized new collaborations with federal and local governments, which she is helping to adopt Human Centered Design and A.I. to deliver better experiences and government services for citizens. Dr. Savage’s students have obtained fellowships and internships in industry (Facebook Research, Twitch Research, Twitter, Snap, and Microsoft Research) as well as academia (Oxford Internet Institute). Saiph holds a bachelor’s degree in Computer Engineering from the National Autonomous University of Mexico (UNAM), and a master’s and Ph.D. in Computer Science from the University of California, Santa Barbara (UCSB). Dr. Savage has also worked at the University of Washington and Carnegie Mellon University (CMU). Additionally, Dr. Savage has been a tech worker at Microsoft Bing and Intel Labs, and a crowd research worker at Stanford.