Mathematical and Scientific Machine Learning

To launch the conference series of Mathematical and Scientific Machine Learning, the invited talks will be given by some of our board members. The list of speakers and the abstracts for their scheduled talks are below:

Roberto Car (Chemistry, Princeton University)

Boosting ab-initio molecular dynamics with machine learning

Accessible time and size scales severely limit the range of ab initio molecular dynamics simulations. Machine learning techniques are rapidly changing this state of affairs, as deep neural networks can learn the interatomic potential energy surface from ab initio data, making possible simulations with quantum mechanical accuracy at the cost of empirical force fields. In addition, approaches like the deep potential method can also represent properties such as the polarization and polarizability surfaces that require explicit electronic structure information. Using incremental learning techniques, deep potential models covering a vast range of thermodynamic conditions can be generated from a minimal amount of training data, making possible simulation studies that until recently were outside the range of ab initio molecular dynamics. Basic concepts and key applications of the deep potential methodology will be reviewed, along with current limitations and future directions.

Weinan E (Mathematics, Princeton University)

Towards a Mathematical Understanding of Supervised Learning: what we know and what we don't know

Two of the biggest puzzles in machine learning are: Why is it so successful and why is it quite fragile? This talk will present a framework for unraveling these puzzles from the perspective of approximating functions in high dimensions. We will discuss what's known and what's not known about the approximation and generalization properties of neural-network-type hypothesis spaces, as well as the dynamics and generalization properties of the training process.

This is joint work with Chao Ma, Stephan Wojtowytsch and Lei Wu.

Anna Gilbert (Mathematics, Yale University)

Metric representations: Algorithms and Geometry

Given a set of distances amongst points, determining which metric representation is most “consistent” with the input distances, or which metric best captures the relevant geometric features of the data, is a key step in many machine learning algorithms. In this talk, we focus on three specific metric constrained problems, a class of optimization problems with metric constraints: metric nearness (Brickell et al. (2008)), weighted correlation clustering on general graphs (Bansal et al. (2004)), and metric learning (Bellet et al. (2013); Davis et al. (2007)).

Because of the large number of constraints in these problems, however, these and other researchers have been forced to restrict either the kinds of metrics learned or the size of the problem that can be solved. We provide an algorithm, PROJECT AND FORGET, that uses Bregman projections with cutting planes to solve metric constrained problems with many (possibly exponentially many) inequality constraints. We also prove that our algorithm converges to the global optimal solution. Additionally, we show that the optimality error decays asymptotically at an exponential rate. We show that using our method we can solve large problem instances of three types of metric constrained problems, outperforming all state-of-the-art methods with respect to CPU time and problem size.

Finally, we discuss the adaptation of PROJECT AND FORGET to specific types of metric constraints, namely tree and hyperbolic metrics.
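
To make the projection step concrete, here is a minimal Python sketch under stated assumptions. It is not the PROJECT AND FORGET algorithm: it enumerates every triangle inequality instead of using cutting planes, it never forgets constraints, and it omits the dual corrections needed to reach the nearest metric, so it only returns some matrix that satisfies the triangle inequalities. The function name and the toy input are hypothetical.

```python
from itertools import permutations


def project_onto_triangle_inequalities(dist, tol=1e-9, max_sweeps=1000):
    """Cyclic Euclidean projection onto violated triangle inequalities.

    `dist` is a symmetric matrix (list of lists) with a zero diagonal.
    Each violated constraint d[i][j] <= d[i][k] + d[k][j] is a half-space,
    and projecting onto it moves the three entries by one third of the
    violation each.
    """
    n = len(dist)
    d = [row[:] for row in dist]  # work on a copy
    for _ in range(max_sweeps):
        worst = 0.0
        for i, j, k in permutations(range(n), 3):
            if i > j:
                continue  # treat each unordered pair (i, j) once as the long side
            violation = d[i][j] - d[i][k] - d[k][j]
            if violation > tol:
                worst = max(worst, violation)
                shift = violation / 3.0
                # Shrink the long side, stretch the two short sides,
                # and keep the matrix symmetric.
                for a, b, delta in ((i, j, -shift), (i, k, shift), (k, j, shift)):
                    d[a][b] += delta
                    d[b][a] += delta
        if worst <= tol:
            break  # a full sweep found no violated constraint
    return d


# Hypothetical toy input: the distance 10 violates 10 <= 1 + 1.
noisy = [
    [0.0, 10.0, 1.0],
    [10.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
]
repaired = project_onto_triangle_inequalities(noisy)
print(repaired)
```

The full enumeration costs on the order of n^3 checks per sweep, which is the scaling problem the abstract attributes to the large number of constraints.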

George Karniadakis (Applied Mathematics, Brown University)

Physics-Informed Neural Networks (PINNs): Algorithms, Theory and Applications

Physics-informed neural networks (PINNs) are deep-learning-based techniques for solving partial differential equations (PDEs). Guided by data and physical laws, PINNs find a neural network that approximates the solution to a system of PDEs. Such a neural network is obtained by minimizing a loss function in which any prior knowledge of the PDEs and the data is encoded using automatic differentiation. PINNs have extensions for stochastic and fractional PDEs, conservation laws, variational forms, etc. They are the most popular method for solving ill-posed inverse problems and have been adopted by industry (NVIDIA, ANSYS, etc.). In addition to several examples of the effectiveness of PINNs across disciplines and in realistic applications, we will also present the first mathematical foundation of the PINNs methodology. As the amount of data grows, PINNs generate a sequence of minimizers, which corresponds to a sequence of neural networks. We will answer the question: Does the sequence of minimizers converge to the solution of the PDE? This question is also related to the generalization of PINNs. We consider two classes of PDEs: elliptic and parabolic. By adapting the Schauder approach, we show that the sequence of minimizers converges strongly to the PDE solution in L2. Furthermore, we show that if each minimizer satisfies the initial/boundary conditions, the convergence can be improved to H1.

This is joint work with Yeonjong Shin.
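
For orientation, a loss of the kind described above is often written in the following generic form. The notation is an assumption made here for illustration and is not taken from the talk: the PDE is N[u] = f in the domain with boundary (or initial) condition B[u] = g, u_theta is the network, x^r, x^b and x^d are residual, boundary and data points, and the relative weights between the terms are omitted.

```latex
\mathcal{L}(\theta)
  = \frac{1}{N_r} \sum_{i=1}^{N_r}
      \bigl| \mathcal{N}[u_\theta](x_i^r) - f(x_i^r) \bigr|^2
  + \frac{1}{N_b} \sum_{j=1}^{N_b}
      \bigl| \mathcal{B}[u_\theta](x_j^b) - g(x_j^b) \bigr|^2
  + \frac{1}{N_d} \sum_{k=1}^{N_d}
      \bigl| u_\theta(x_k^d) - u_k \bigr|^2
```

The derivatives inside N[u_theta] are the part evaluated by automatic differentiation, and the last term is present only when measurements u_k are available.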

Stéphane Mallat (Mathematics, Collège de France, ENS Paris, Flatiron Institute)

Descartes versus Bayes: Harmonic Analysis for High Dimensional Learning and Deep Nets

Is high-dimensional learning about function approximation or Bayes probability estimation? Algorithmic solutions go through finding discriminative variables which concentrate, according to Bayes and statistical physics. Harmonic analysis gives a mathematical framework to define and analyze such variables from prior information on symmetries. The results of deep neural network architectures are opening new horizons beyond Fourier, wavelets and sparsity. What is being learned through optimization? Phase was long forgotten and is making its way back. This lecture outlines harmonic analysis challenges raised by classification and data generation with deep convolutional neural networks. We consider applications to image generation and classification with ImageNet.

Nathan Kutz (Applied Mathematics, University of Washington)

Deep Learning for the Discovery of Coordinates and Dynamics

The discovery of governing equations from scientific data has the potential to transform data-rich fields that lack well-characterized quantitative descriptions. Advances in sparse regression are currently enabling the tractable identification of both the structure and parameters of a nonlinear dynamical system from data. The resulting models have the fewest terms necessary to describe the dynamics, balancing model complexity with descriptive ability, and thus promoting interpretability and generalizability. This provides an algorithmic approach to Occam's razor for model discovery. However, this approach fundamentally relies on an effective coordinate system in which the dynamics have a simple representation. We design a custom deep autoencoder network to discover a coordinate transformation into a reduced space where the dynamics may be sparsely represented.

Thus, we simultaneously learn the governing equations and the associated coordinate system. We demonstrate this approach on several example high-dimensional systems with low-dimensional behavior. The resulting modeling framework combines the strengths of deep neural networks for flexible representation and sparse identification of nonlinear dynamics (SINDy) for parsimonious models. It is the first method of its kind to place the discovery of coordinates and models on an equal footing. It can also be modified for multiscale physics applications and for use with Koopman operators.
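
The sparse regression step can be illustrated with a minimal Python sketch of sequentially thresholded least squares, the procedure commonly used in SINDy. It leaves out the autoencoder and the coordinate discovery entirely. The toy system dx/dt = -2x, the candidate library [1, x, x^2], the perturbation values, the threshold, and all function names are assumptions chosen for illustration.

```python
def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def least_squares(theta, y, active):
    """Least-squares fit of y on the active library columns (normal equations)."""
    gram = [[sum(row[p] * row[q] for row in theta) for q in active] for p in active]
    rhs = [sum(row[p] * target for row, target in zip(theta, y)) for p in active]
    return solve_linear(gram, rhs)


def stlsq(theta, y, threshold=0.1, max_iterations=10):
    """Sequentially thresholded least squares: fit, drop small terms, refit."""
    n_terms = len(theta[0])
    active = list(range(n_terms))
    coeffs = [0.0] * n_terms
    for _ in range(max_iterations):
        if not active:
            break
        fit = least_squares(theta, y, active)
        coeffs = [0.0] * n_terms
        for idx, value in zip(active, fit):
            coeffs[idx] = value
        kept = [idx for idx in active if abs(coeffs[idx]) >= threshold]
        for idx in active:
            if idx not in kept:
                coeffs[idx] = 0.0  # discard terms below the threshold
        if kept == active:
            break  # sparsity pattern is stable
        active = kept
    return coeffs


# Toy data (assumed): samples of x and slightly perturbed derivatives of dx/dt = -2x.
xs = [0.5, 1.0, 1.5, 2.0, 2.5]
perturbation = [0.01, -0.02, 0.015, -0.01, 0.02]
dxdt = [-2.0 * x + e for x, e in zip(xs, perturbation)]
library = [[1.0, x, x * x] for x in xs]  # candidate terms: 1, x, x^2

coeffs = stlsq(library, dxdt)
print(coeffs)
```

On this toy data only the coefficient of x survives the thresholding, which is the sense in which the resulting model has the fewest terms needed to describe the dynamics.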

Stanley Osher (Mathematics, UCLA)

A Machine Learning Framework for Solving High-Dimensional Mean Field Game Problems

Mean field games (MFGs) are a critical class of multi-agent models for the efficient analysis of massive populations of interacting agents. We provide a flexible machine learning framework for the numerical solution of potential MFG models. Grid-based methods run into the curse of dimensionality. We approximately solve high-dimensional problems by using a machine learning framework that combines Lagrangian PDE solvers and neural networks, extending the reach of existing models.

Joint work with L. Ruthotto, S. Wu Fung, W. Li and L. Nurbekyan.

Lexing Ying (Mathematics, Stanford University)

Solving Inverse Problems with Deep Learning

This talk is about some recent progress on solving inverse problems using deep learning. Compared to traditional machine learning problems, inverse problems are often limited by the size of the training data set. We show how to overcome this issue by incorporating mathematical analysis and physics into the design of neural network architectures. We first describe neural network representations of pseudodifferential operators and Fourier integral operators. We then discuss applications including electrical impedance tomography, optical tomography, inverse acoustic/EM scattering, seismic imaging, and travel-time tomography.

Lenka Zdeborová (Physics, CNRS France)

The role of data structure in learning shallow neural networks

Methods from statistical physics are able to provide sharp generalization and sample complexity results for shallow neural networks in the teacher-student setting with iid input data. To advance towards a more relevant theory of learning, the data structure needs to be more general. In this talk, I will describe our recent results in this direction, establishing sharp analysis of learning with shallow neural networks for data coming from a broad range of generative models.

The talk is based on arXiv:1909.11500, arXiv:2002.09339 and arXiv:2006.14709.