{"id":8,"date":"2012-08-01T00:59:21","date_gmt":"2012-08-01T00:59:21","guid":{"rendered":"http:\/\/www.matthewpratola.com\/?page_id=8"},"modified":"2022-05-01T19:02:58","modified_gmt":"2022-05-01T23:02:58","slug":"research","status":"publish","type":"page","link":"https:\/\/matthewpratola.com\/research\/","title":{"rendered":"Research"},"content":{"rendered":"
My statistical research is in the areas of computer experiments, computer model calibration, uncertainty quantification, and statistical methods for large data structures, aka “big data”. Most of the applied problems that have motivated my statistical research have originated from climate/environmental science and physics. I am also interested in leveraging mathematical models and observational data to understand environmental impacts on health. More recently, I have become obsessively interested in blockchain technology and decentralization. This was the result of finally having some time to learn new things during the pandemic (silver lining!), and I went down the proverbial rabbit hole.

You can view my CV here.

Current Research

My current research is supported by a generous KAUST CRG grant with Dr. Ying Sun and Dr. Brian Reich, a generous NSF grant with Dr. Rob McCulloch and Dr. Ed George, and a generous NSF grant as part of the BAND project. Thank you for your support!
My current research continues in the directions I have embarked on over the past few years: we are investigating further improvements to MCMC sampling for Bayesian regression trees, new ideas in discrepancy modeling for computer model calibration motivated by applications in nuclear physics and biochemical reaction networks, and the importance of measuring influence in BART.

Ongoing Projects
John Yannotty: Bayesian Treed Model Mixing, joint work with Dr. Tom Santner
Vincent Geels: Tree-based count data models, joint work with Dr. Radu Herbei
Influence and outliers in BART, joint work with Dr. Rob McCulloch and Dr. Ed George
Vincent Geels: The Taxicab Sampler, joint work with Dr. Radu Herbei
A Bayesian Framework for Quantifying Model Uncertainties in Nuclear Dynamics, joint work with D.R. Phillips and the BAND project team
Designing Optimal Experiments in Proton Compton Scattering, J.A. Melendez’s thesis work in Physics, supervisor: R.A. Furnstahl
Akira Horiguchi: Pareto Front Optimization with BART, joint work with Dr. Tom Santner and Dr. Ying Sun
Quantifying Correlated Truncation Errors in EFT Models, J.A. Melendez’s thesis work in Physics, supervisor: R.A. Furnstahl
Akira Horiguchi: Variable Importance for BART, joint work with Dr. Tom Santner
Gavin Collins: Somewhat Smarter Learners for BART, joint work with Dr. Rob McCulloch and Dr. Ed George
Adaptive Resolution BART for Image Data, aka “the beach project”, joint work with Dr. Dave Higdon
Multivariate BART, joint work with Dr. Ying Sun and Dr. Wu Wang
The infamous Dream Project
2022

Influence in BART: Modern statistical and machine learning models typically do not consider the effect of influential or outlier observations on the fitted model. Does it matter? Our work shows that it indeed does: we explore diagnostics to screen for such problematic observations and then propose an importance sampling algorithm to re-weight the posterior draws given the identified observations (a sketch of the re-weighting idea follows this list).
Multi-tree models for count data: The second project from Vincent’s thesis, and really cool creative work. Instead of introducing n latent continuous variables to facilitate modeling non-Gaussian data in the Bayesian tree setting, Vincent introduces a one-to-many transformation to facilitate the modeling of count data, which only introduces q << n latents.
Bayesian Treed Model Mixing: In the first project from John’s thesis work, we use Bayesian trees to construct a model that performs model-mixing of multiple physics simulators, where the simulators have varying utility depending on the region of input space.
Optimally Batched Bayesian Trees: Another joint project with Luo, this idea is incredibly cool. More soon…
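The re-weighting idea in the influence project can be illustrated with standard case-deletion importance sampling: posterior draws from the full-data fit are re-weighted by the inverse likelihood of a flagged observation, approximating the posterior with that observation removed. A minimal sketch, assuming a Gaussian error model; the function name and its inputs are hypothetical illustrations, not the paper's actual algorithm or the OpenBT interface.

```python
import numpy as np

def case_deletion_weights(y_i, fitted_i, sigma):
    """Self-normalized importance weights approximating the posterior with
    the flagged observation i deleted from the training data.

    y_i      : scalar response of the flagged observation
    fitted_i : (S,) posterior draws of the model's fit at x_i
    sigma    : (S,) posterior draws of the error standard deviation
    """
    # Gaussian log-likelihood of the flagged observation under each draw;
    # deleting the observation divides its likelihood out of the posterior,
    # so draw s gets weight w_s proportional to 1 / p(y_i | draw s).
    loglik = (-0.5 * np.log(2.0 * np.pi * sigma**2)
              - 0.5 * ((y_i - fitted_i) / sigma) ** 2)
    logw = -loglik
    logw -= logw.max()            # stabilize before exponentiating
    w = np.exp(logw)
    return w / w.sum()

# Hypothetical usage: re-weight any posterior summary, e.g. posterior draws
# fstar of a prediction at a new point x*:
#   w = case_deletion_weights(y_i, fitted_i, sigma)
#   deleted_mean = np.sum(w * fstar)
```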
2021

The Taxicab Sampler: In this work, Vincent considers modeling count data directly rather than through the standard transformation/link-function approach. This motivates the need for an efficient MCMC sampler for a discrete, countable random variable. The solution, the Taxicab sampler, is motivated by the so-called Hamming Ball sampler, which also samples a discrete data object but in that case targets only 0/1-type random variables.
Pareto Front Optimization with BART: We explore a BART-based model for performing Pareto Front optimization. Our approach models multiple response functions as independent BART ensembles, then constructs an estimator for the Pareto Front and Pareto Set, including uncertainty information for both. The code for this has been implemented in the OpenBT source; a sketch of the front-extraction step follows below.
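Per posterior draw, the Pareto Front estimation reduces to extracting the non-dominated candidate points from the draws of the objective surfaces. A minimal sketch of that extraction step, assuming every objective is minimized; the inputs are hypothetical and this is not the OpenBT implementation itself.

```python
import numpy as np

def pareto_front(F):
    """Return indices of the non-dominated rows of F, with every objective
    to be minimized.

    F : (n, m) array of n candidate points by m objective values.
    A point is dominated if some other point is <= in all objectives
    and strictly < in at least one.
    """
    n = F.shape[0]
    keep = np.ones(n, dtype=bool)
    for i in range(n):
        # boolean mask of points that dominate point i
        dominators = np.all(F <= F[i], axis=1) & np.any(F < F[i], axis=1)
        if dominators.any():
            keep[i] = False
    return np.where(keep)[0]

# Hypothetical usage with posterior draws from two independent ensembles,
# draws1 and draws2 of shape (S, n) at n candidate inputs: the per-draw
# Pareto Sets carry the uncertainty about the Set and Front.
#   sets = [pareto_front(np.column_stack([draws1[s], draws2[s]]))
#           for s in range(S)]
```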
2020

A Bayesian Framework for Quantifying Model Uncertainties in Nuclear Dynamics: The so-called “manifesto” paper, outlining the broad vision and goals of the BAND project. The particular contribution of my group is to consider so-called Bayesian Model Mixing as a generalization of model averaging and to draw connections from this idea to the popular Kennedy-O’Hagan framework of model calibration.
Variable Importance for BART: In the first part of Akira’s Ph.D. thesis, we arrive at a closed-form construction for calculating Sobol’ sensitivity indices in tree (e.g. BART) models. The code for this has been implemented in the OpenBT source; a generic Monte Carlo version of the same indices is sketched below for comparison.
Designing Optimal Experiments in Proton Compton Scattering: More work with Melendez and Furnstahl, essentially adding an optimal experimental design construction to the EFT expansion emulator explored in the first paper.
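For context on what the closed-form tree result computes: the first-order Sobol’ index of input j is S_j = Var(E[f(X) | X_j]) / Var(f(X)). The generic way to estimate it for an arbitrary predictor is the pick-freeze Monte Carlo scheme sketched below, assuming independent Uniform(0, 1) inputs; the point of the paper is that for tree ensembles these indices come out in closed form, with no Monte Carlo needed.

```python
import numpy as np

def sobol_first_order(f, d, j, n=100_000, seed=None):
    """Pick-freeze Monte Carlo estimate of the first-order Sobol' index
    S_j = Var(E[f(X) | X_j]) / Var(f(X)).

    f : vectorized predictor mapping an (n, d) array to (n,) outputs
    d : input dimension; inputs taken i.i.d. Uniform(0, 1)
    j : index of the input whose main-effect index is estimated
    """
    rng = np.random.default_rng(seed)
    A = rng.uniform(size=(n, d))
    B = rng.uniform(size=(n, d))
    AB = A.copy()
    AB[:, j] = B[:, j]            # AB shares only coordinate j with B
    fA, fB, fAB = f(A), f(B), f(AB)
    total_var = np.var(np.concatenate([fA, fB]))
    return float(np.mean(fB * (fAB - fA)) / total_var)   # Saltelli-style

# Hypothetical usage with any fitted model exposing a vectorized predictor:
#   S_0 = sobol_first_order(model.predict, d=5, j=0)
```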
2019

Near-Optimal Design: In this joint work with Dr. Craigmile, we explore new ideas in near-optimal design with our awesome Ph.D. student Sophie Nguyen (now at JP Morgan Chase). In challenging optimization problems, how good is our numerical approximation of the true underlying optimum? A small worked example of this question appears below.
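One textbook way to make the near-optimality question concrete, as an illustration of the question rather than of the paper's method: if candidate designs are drawn at random, the best of n draws falls in the top q-fraction of the design criterion's distribution with probability 1 - (1 - q)^n, which can be inverted into a sample-size rule.

```python
import math

def draws_for_near_optimality(q=0.01, p=0.95):
    """Number of i.i.d. random candidate designs needed so that, with
    probability at least p, the best of them lies in the top q-fraction
    of the design criterion's distribution: solve 1 - (1 - q)^n >= p.
    """
    return math.ceil(math.log(1.0 - p) / math.log(1.0 - q))

# e.g. to be 95% sure of landing in the top 1% of designs:
#   draws_for_near_optimality(0.01, 0.95)  # -> 299
```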