Statistics and Probability Seminar Series

See also our calendar for a complete list of the Department of Mathematics and Statistics seminars and events.

Please email Guangqu ZHENG (gzheng90@bu.edu) if you would like to be added to the email list.

Spring 2026 Dates

(Time/location: Thursdays 4-5pm/CDS 365)

Jan. 22, Speaker: Mickey Salins (BU)

  • Title: Superlinear multiplicative noise can cause or prevent explosion in SPDEs
  • Abstract: I will outline a series of results about finite time explosion for SPDEs. In some settings, the addition of strong stochastic forces can cause solutions to explode in finite time. In other settings, strong stochastic perturbations have a regularizing effect and can prevent solutions from exploding.

Jan. 29, Speaker: Ryan Martin (North Carolina State)

  • Title: No-prior Bayes reIMagined: probabilistic approximations of inferential models
  • Abstract: When prior information is lacking, the go-to strategy for probabilistic inference is to combine a “default prior” and the likelihood via Bayes’s theorem. Objective Bayes, (generalized) fiducial inference, etc. fall under this umbrella. This construction is natural, but the corresponding posterior distributions generally only offer limited, approximately valid uncertainty quantification. In this talk I’ll describe a reimagined approach, yielding posterior distributions with stronger reliability properties. The proposed construction starts with an inferential model (IM), one that takes the mathematical form of a data-driven possibility measure offering exactly valid uncertainty quantification, and then returns a so-called inner probabilistic approximation thereof. This inner probabilistic approximation inherits many of the original IM’s desirable properties, including credible sets with exact coverage and asymptotic efficiency. The approximation also agrees with the familiar Bayes/fiducial solution obtained in group invariant models. A Monte Carlo method for evaluating the probabilistic approximation is presented, along with numerical illustrations. This talk is based largely on the paper here: https://arxiv.org/abs/2503.19748

Feb. 5, Speaker: Debdeep Pati (University of Wisconsin-Madison)

  • Title: Scientifically Guided Robust Inference
  • Abstract: Motivated by the need for interpretable yet flexible models in modern scientific applications, we develop a likelihood-based inference framework constrained to lie close to a parametric family, balancing scientific structure with robustness. We connect this approach to nonparametric Bayesian inference centered on a base distribution and establish favorable asymptotic properties. We extend the framework to simulation-based inference, where likelihoods are analytically intractable and simulator calls are costly; existing methods such as neural density surrogates often require a multitude of simulations and scale poorly with dimension, rendering full Bayesian inference impractical. Our approach mitigates this by approximating the likelihood locally at the observed data rather than learning the full surface, yielding a substantial reduction in forward evaluations while preserving rigorous uncertainty quantification. We explore the efficacy of our approach through numerical studies and conclude with an application.

Feb. 12, Speaker: Roee Gutman (Brown University)

  • Title: Bayesian Entity Resolution (Record Linkage): A Data Science Problem with Applications in Health Services Research
  • Abstract: As researchers and policy analysts work to integrate administrative datasets and registries while complying with privacy regulations that restrict access to unique identifiers, the analysis of partially linked datasets is becoming increasingly crucial. Record linkage across data sources is at the core of data science research, combining computational tools, applied mathematics, and statistical methods to extract insights. Current tools for identifying records that represent the same entity across datasets mostly emphasize computational efficiency. Less attention has been given to features of the data that can improve linkage accuracy and to statistical inferences with linked records. To address these limitations, we frame record linkage as a missing data problem and develop Bayesian procedures that leverage data features commonly encountered in public health applications. These procedures improve linkage quality and yield more accurate and precise estimates of scientifically important associations. The first procedure incorporates associations between variables observed exclusively in one dataset into the linkage process. The second procedure ensures that individuals receiving care from the same provider in one dataset are linked to individuals receiving care from a similar provider in the other dataset, even when providers cannot be uniquely matched across datasets. Both procedures generate M datasets in which the links between the two datasets are imputed. These datasets can be analyzed separately and combined using standard multiple imputation rules. This approach reduces the time and expertise required of analysts while preserving flexibility for downstream analyses of the linked data. We describe two applications. The first links Medicare claims with Vital Statistics mortality records to examine the association between end-of-life medical expenditures and causes of death. The second links the National Trauma Data Bank with Medicare claims to study the relationship between injury characteristics and successful discharge to the community among patients with traumatic brain injury.

Feb. 19, Speaker: Jinsu Kim (POSTECH)

  • Title: Fast and slow mixing of continuous-time Markov chains with polynomial rates
  • Abstract: Continuous-time Markov chains on infinite positive integer grids with polynomial rates are often used in modeling queuing systems, molecular counts of small-size biological systems, etc. In this talk, we will discuss continuous-time Markov chains that admit either fast or slow mixing behaviors. For a positive recurrent continuous-time Markov chain, the convergence rate to its stationary distribution is typically investigated with the Lyapunov function method and canonical path method. Recently, we discovered examples that do not lend themselves easily to analysis via those two methods but are shown to have either fast mixing or slow mixing with our new technique. The main ideas of the new methodologies are presented along with their applications to stochastic biochemical reaction network theory and DNA computing.

Feb. 26, Speaker: Xiaojing Wang (University of Connecticut)

  • Title: A Principled Bayesian Framework for Dynamic Latent Variable Modeling with Multi-Source Data
  • Abstract: Advances in measurement and sensing have enabled multi-source longitudinal data—item responses, response times, and mobile surveys—often collected at irregular, individual-specific time points. Although these streams promise richer insight, they also raise a key question: which sources improve inference on latent targets of interest, and when is joint modeling worth the complexity? This talk presents a principled Bayesian framework for dynamic latent variable modeling tailored to irregularly spaced longitudinal data. Inspired by state-space modeling ideas, the framework jointly integrates heterogeneous sources through shared time-varying latent structures. To capture nonlinear evolution, Gaussian processes (GPs) are used to model trajectories, with posterior propriety established under objective priors for GP hyperparameters. While joint modeling is often pursued to improve inference on latent variables, the inclusion of additional data does not inherently guarantee such benefits. To help researchers determine when meaningful gains are realized, this talk introduces new diagnostic tools based on decompositions of commonly used Bayesian model assessment criteria. These tools quantify each source’s incremental value for learning latent variables and help determine when joint modeling is warranted. The proposed framework is demonstrated through applications in multi-source educational assessment.

Mar. 2, Speaker: Junwei Lu (Harvard University)

  • Title: Preference Inference for Language Models Debiased by Fisher Random Walk
  • Abstract: Human preference alignment has been shown to be effective in training large language models (LLMs), allowing them to understand human feedback and preferences. Despite the extensive literature on algorithms for aligning models with ranked human preferences, uncertainty quantification for the ranking estimation still needs to be explored and is of great practical significance. For example, it is important to overcome the problem of hallucination for LLMs in the medical domain, and an inferential method for ranking LLM answers becomes necessary. In this talk, we will present a novel framework called “Fisher random walk” to conduct semiparametric efficient preference inference for language models and illustrate its application in language models for medical knowledge.

Previous Speakers

Fall 2025

Lulu Kang (UMass Amherst)

Cheng Ouyang (University of Illinois at Chicago)

Nathan Ross (University of Melbourne)

Konstantin Riedl (University of Oxford)

Davar Khoshnevisan (University of Utah)

Ian Stevenson (University of Connecticut)

Qiyang Han (Rutgers University)

Yimin Xiao (Michigan State University)

Le Chen (Auburn University)

Igor Cialenco (Illinois Institute of Technology)

Oanh Nguyen (Brown University)

Spring 2025

Yuchen Wu (University of Pennsylvania)

Ye Tian (Columbia University)

Charles Margossian (Flatiron Institute)

Yuetian Luo (University of Chicago)

Kai Tan (Rutgers University)

Anirban Chatterjee (University of Pennsylvania)

Georgia Papadogeorgou (University of Florida)

Murali Haran (Pennsylvania State University)

Yuguo Chen (University of Illinois at Urbana-Champaign)

Youssef Marzouk (MIT)

Yao Xie (Georgia Institute of Technology)

Andrea Rotnitzky (University of Washington)

Nabarun Deb (University of Chicago)

Jonathan Huggins (Boston University)

Fall 2024

Zhongyang Li (University of Connecticut)

Devavrat Shah (MIT)

Natesh Pillai (Harvard University)

Pamela Reinagel (UC San Diego)

Bodhisattva Sen (Columbia University)

Susan Murphy (Harvard University)

Luc Rey-Bellet (University of Massachusetts Amherst)

James Murphy (Tufts University)

Pragya Sur (Harvard University)

Spring 2024

Tracy Ke (Harvard University)

Feng Liu (Stevens Institute of Technology)

Rajarshi Mukherjee (Harvard University)

Guido Consonni (Università Cattolica del Sacro Cuore)

Fan Li (Duke University)

Kavita Ramanan (Brown University)

Fall 2023

Cynthia Rush (Columbia University)

James Maclaurin (New Jersey Institute of Technology)

Ruoyu Wu (Iowa State University)

Jonathan Pillow (Princeton University)

Subhabrata Sen (Harvard University)

Le Chen (Auburn University)

Raluca Balan (University of Ottawa)

Eric Tchetgen Tchetgen (University of Pennsylvania)

Tyler VanderWeele (Harvard University)

Jose Zubizarreta (Harvard University)