Efficient bayesian inference via monte carlo and machine learning algorithms
- Llorente Fernández, Fernando
- David Delgado Gómez Director/a
- Luca Martino Codirector/a
Universidad de defensa: Universidad Carlos III de Madrid
Fecha de defensa: 16 de diciembre de 2022
- Valero Laparra Pérez-Muelas Presidente
- Michael Peter Wiper Secretario/a
- Omer Deniz Akyildiz Vocal
Tipo: Tesis
Resumen
In many fields of science and engineering, we are faced with an inverse problem where we aim to recover an unobserved parameter or variable of interest from a set of observed variables. Bayesian inference is a probabilistic approach for inferring this unknown parameter that has become extremely popular, finding application in myriad problems in fields such as machine learning, signal processing, remote sensing and astronomy. In Bayesian inference, all the information about the parameter is summarized by the posterior distribution. Unfortunately, the study of the posterior distribution requires the computation of complicated integrals, that are analytically intractable and need to be approximated. Monte Carlo is a huge family of sampling algorithms for performing optimization and numerical integration that has become the main horsepower for carrying out Bayesian inference. The main idea of Monte Carlo is that we can approximate the posterior distribution by a set of samples, obtained by an iterative process that involves sampling from a known distribution. Markov chain Monte Carlo (MCMC) and importance sampling (IS) are two important groups of Monte Carlo algorithms. This thesis focuses on developing and analyzing Monte Carlo algorithms (either MCMC, IS or combination of both) under different challenging scenarios presented below. In summary, in this thesis we address several important points, enumerated (a)–(f), that currently represent a challenge in Bayesian inference via Monte Carlo. A first challenge that we address is the problematic exploration of the parameter space by off-the-shelf MCMC algorithms when there is (a) multimodality, or with (b) highly concentrated posteriors. Another challenge that we address is the (c) proposal construction in IS. Furtheremore, in recent applications we need to deal with (d) expensive posteriors, and/or we need to handle (e) noisy posteriors. Finally, the Bayesian framework also offers a way of comparing competing hypothesis (models) in a principled way by means of marginal likelihoods. Hence, a task that arises as of fundamental importance is (f) marginal likelihood computation. Chapters 2 and 3 deal with (a), (b), and (c). In Chapter 2, we propose a novel population MCMC algorithm called Parallel Metropolis-Hastings Coupler (PMHC). PMHC is very suitable for multimodal scenarios since it works with a population of states, instead of a single one, hence allowing for sharing information. PMHC combines independent exploration by the use of parallel Metropolis-Hastings algorithms, with cooperative exploration by the use of a population MCMC technique called Normal Kernel Coupler. In Chapter 3, population MCMC are combined with IS within the layered adaptive IS (LAIS) framework. The combination of MCMC and IS serves two purposes. First, an automatic proposal construction. Second, it aims at increasing the robustness, since the MCMC samples are not used directly to form the sample approximation of the posterior. The use of minibatches of data is proposed to deal with highly concentrated posteriors. Other extensions for reducing the costs with respect to the vanilla LAIS framework, based on recycling and clustering, are discussed and analyzed. Chapters 4, 5 and 6 deal with (c), (d) and (e). The use of nonparametric approximations of the posterior plays an important role in the design of efficient Monte Carlo algorithms. Nonparametric approximations of the posterior can be obtained using machine learning algorithms for nonparametric regression, such as Gaussian Processes and Nearest Neighbors. Then, they can serve as cheap surrogate models, or for building efficient proposal distributions. In Chapter 4, in the context of expensive posteriors, we propose adaptive quadratures of posterior expectations and the marginal likelihood using a sequential algorithm that builds and refines a nonparametric approximation of the posterior. In Chapter 5, we propose Regression-based Adaptive Deep Importance Sampling (RADIS), an adaptive IS algorithm that uses a nonparametric approximation of the posterior as the proposal distribution. We illustrate the proposed algorithms in applications of astronomy and remote sensing. Chapter 4 and 5 consider noiseless posterior evaluations for building the nonparametric approximations. More generally, in Chapter 6 we give an overview and classification of MCMC and IS schemes using surrogates built with noisy evaluations. The motivation here is the study of posteriors that are both costly and noisy. The classification reveals a connection between algorithms that use the posterior approximation as a cheap surrogate, and algorithms that use it for building an efficient proposal. We illustrate specific instances of the classified schemes in an application of reinforcement learning. Finally, in Chapter 7 we study noisy IS, namely, IS when the posterior evaluations are noisy, and derive optimal proposal distributions for the different estimators in this setting. Chapter 8 deals with (f). In Chapter 8, we provide with an exhaustive review of methods for marginal likelihood computation, with special focus on the ones based on Monte Carlo. We derive many connections among the methods and compare them in several simulations setups. Finally, in Chapter 9 we summarize the contributions of this thesis and discuss some potential avenues of future research.