2026

Abstract: Classical power analysis assumes point-identified parameters, but many credible research designs yield identified sets rather than point estimates. This thesis extends the coverage–power duality to partially identified parameters, deriving power formulas, sample-size equations, and minimum detectable bound frontiers for two confidence procedures: Stoye’s (2020) misspecification-adaptive interval CI_MA and the Imbens–Manski/Stoye interval CI^1_α. Simulations support the validity of the method, showing exact alignment under the required assumptions with a linear data-generating process and general but imperfect alignment in nonlinear settings estimated via double machine learning. An application to Oster’s (2019) omitted-variable bias sensitivity analysis demonstrates feasibility using published regression outputs, though simulations reveal that the simplified Oster formula introduces non-trivial approximation error. We discuss implications for other sensitivity frameworks and directions for future work.

Abstract: Understanding directed interactions between brain regions is a fundamental problem in neuroscience, particularly in the context of complex cognitive tasks. Intracranial stereo-electroencephalography (sEEG) provides recordings with high temporal and spatial resolution, enabling the study of neural dynamics at fine scales. However, inferring causal structure from such time series remains challenging due to nonlinearity, noise, and strong temporal dependencies. In this work, we investigate causal relationships in sEEG recordings collected from human subjects performing a multi-attribute decision-making task. We first implement a Granger causality framework tailored to event-aligned neural data, incorporating preprocessing steps to enforce stationarity and evaluating directional influence across multiple time lags. To address the limitations of linear methods, we then explore a nonlinear causal discovery framework based on delay differential analysis (DDA), which models local temporal dynamics by fitting a polynomial model, including nonlinear terms, that relates delay and differential embeddings of the data. We evaluate these methods using both synthetic and observed neural data. Coupled FitzHugh–Nagumo network simulations provide controlled systems with known directed structure, enabling methodological evaluation across direct connections, indirect pathways, common-input relationships, and negative-control pairs. The observed sEEG recordings provide a neural data application, with analyses focused on directed interactions among motor and premotor channels and a non-motor comparison group. This work combines linear predictive modeling, nonlinear delay-based dynamical analysis, and ground-truth simulation to study directed interactions in event-aligned neural time series. By applying both methods to controlled synthetic networks and observed intracranial recordings, the analysis provides a framework for comparing causal inference approaches in settings where ground truth is either known or biologically motivated but unobserved.
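
As a rough illustration of the linear part of this pipeline, the sketch below computes a textbook bivariate Granger F statistic by comparing an autoregression of y on its own lags with one that also includes lags of x. It is not the thesis implementation: the function names, the lag order p, and the use of ordinary least squares without any stationarity preprocessing or event alignment are assumptions made for the example.

```python
import numpy as np

def lagged_design(target, sources, p):
    """Regressors made of an intercept plus lags 1..p of each source series."""
    n = len(target)
    cols = [np.ones(n - p)]
    for s in sources:
        for lag in range(1, p + 1):
            # Row t of this column holds s[t - lag] for t = p, ..., n - 1.
            cols.append(s[p - lag:n - lag])
    return np.column_stack(cols), target[p:]

def residual_sum_of_squares(X, t):
    """Least-squares fit of t on X, returning the residual sum of squares."""
    coef = np.linalg.lstsq(X, t, rcond=None)[0]
    return float(np.sum((t - X @ coef) ** 2))

def granger_f_stat(x, y, p):
    """F statistic for the hypothesis that lags of x do not help predict y."""
    X_restricted, t = lagged_design(y, [y], p)
    X_full, _ = lagged_design(y, [y, x], p)
    rss_r = residual_sum_of_squares(X_restricted, t)
    rss_u = residual_sum_of_squares(X_full, t)
    dof = len(t) - X_full.shape[1]
    return ((rss_r - rss_u) / p) / (rss_u / dof)
```

A large value of `granger_f_stat(x, y, p)` relative to `granger_f_stat(y, x, p)` suggests that x carries predictive information about y beyond y's own past, under the linear model assumed here.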

2025

Abstract: The purpose of this study is to introduce the use of Recurrent Switching Linear Dynamical Systems (rSLDS) within the framework of Distributionally Robust Optimization (DRO) for equity portfolios. In financial markets that are often subject to regime shifts, DRO can be overly conservative when the Wasserstein ambiguity set is required to span multiple regimes. Incorporating a regime-switching model can produce tighter, less conservative ambiguity sets and thus better portfolio performance. We show that rSLDS is an ideal framework for modeling regime shifts within a DRO setting: it flexibly captures diverse dynamics, can be configured to represent fundamental or factor models, and, crucially, operates as an unsupervised regime-detection tool without requiring a baseline distribution—an essential property for large datasets where manual regime identification is infeasible.

Abstract: Neural Architecture Search (NAS) is a set of algorithms that automate the design of neural networks. A combination of search space design and search strategy determines the possible set of architectures that a given NAS algorithm can choose from, which significantly impacts algorithm performance and generalization. We implement a reversible Markov chain Monte Carlo sampler with simulated annealing and delayed acceptance to improve the performance of zero-shot NAS methods, such as Zen-NAS. The sampler targets a tempered distribution in a delayed-acceptance two-stage approach. Stage one uses a proxy energy based on a zero-shot method combined with penalties for edge density and computational cost. Stage two uses validation loss to evaluate the error between the true and proxy energies, correcting for proxy error. The proposal kernel is a mixture of local multiple-try Metropolis, edge mutation, swaps, and birth/death. We demonstrate that this chain is reversible at a fixed temperature and present numerical results that show improved performance over the standard Zen-NAS implementation, showing our algorithm avoids getting stuck in local optima while ensuring the selected network meets predefined budget criteria.
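
The two-stage idea can be sketched on a toy state space. The code below is a generic delayed-acceptance Metropolis step with a symmetric proposal at a fixed temperature, not the sampler from the thesis: `proxy_energy` stands in for the zero-shot score with penalties, `true_energy` stands in for the validation-loss energy, and all names are hypothetical.

```python
import math
import random

def delayed_acceptance_step(x, propose, proxy_energy, true_energy, temperature, rng):
    """One delayed-acceptance step; assumes propose() is a symmetric proposal.

    Returns (next_state, used_true_energy).
    """
    y = propose(x, rng)
    # Stage one: screen the proposal with the cheap proxy energy only.
    d_proxy = proxy_energy(y) - proxy_energy(x)
    if rng.random() >= math.exp(min(0.0, -d_proxy / temperature)):
        return x, False  # rejected without touching the expensive energy
    # Stage two: correct for proxy error using the expensive energy.
    # (A practical implementation would cache true_energy(x).)
    d_true = true_energy(y) - true_energy(x)
    if rng.random() < math.exp(min(0.0, -(d_true - d_proxy) / temperature)):
        return y, True
    return x, True
```

With a symmetric proposal, the product of the two acceptance probabilities leaves the tempered target proportional to exp(-true_energy / temperature) invariant, while proposals that the proxy rejects never incur the cost of the expensive evaluation.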

Abstract: Discrete stochastic optimization focuses on optimizing a loss function given only noisy measurements, where the loss function accepts discrete-valued inputs. Of interest to the stochastic optimization literature are efficient and theoretically sound approaches to constrained stochastic optimization problems. Such algorithms find use in operations research, finance, and artificial intelligence, among other fields. For the discrete case, our approach builds on the middle-point method of Simultaneous Perturbation Stochastic Approximation (SPSA), which is a discrete analogue of the continuous case. For constrained discrete SPSA (DSPSA) in this framework, a projection method has been shown to handle certain constraints. We propose a penalty function method in which a positive penalty is applied to iterates outside a given feasible region. For the theoretical analysis, we show generalized convergence and convergence-rate results for middle-point DSPSA using a generalization of differential equations that handles discontinuities. The motivation is to allow a broader range of constraints and loss functions, since convergence can then be shown for a wider range of loss-function and penalty-function compositions. Also included are theoretical results for a form of time-varying DSPSA that is beneficial to penalized DSPSA. Furthermore, we present two examples that are well suited to this framework. Numerical experiments cover several classes of functions and constraints, including examples where theoretical convergence guarantees do not yet exist. Finally, we include numerical experiments comparing the performance of the projection method and the penalty method.

Abstract: Circadian oscillators form a crucial component of biological systems. In particular, a circadian molecular clock is a set of genes that helps regulate the behavior of an organism over a 24-hour cycle (hence the name circadian). These genes, which we refer to as clock core genes, are transcribed (DNA to mRNA) and translated (mRNA to protein) in a feedback loop that is regulated by the constituent genes themselves and by external factors (called zeitgebers). Transcription and translation cause gene expression levels to oscillate, producing rhythmic, periodic patterns. We formulate this as a dynamical system and explore methods to reproduce the dynamics from actual gene regulatory network data and to infer the coupling. The methods are probabilistic machine learning, physics-informed deep learning, signature methods, and data-driven dynamic mode decomposition, applied as novel approaches to gene regulatory network data to estimate the coupling dynamics without the need for a specified governing differential equation. We compare the efficacy of each method and discuss our computational simulations and results as a basis for better understanding how genes interact.

Abstract: Hamiltonian Monte Carlo (HMC) is an efficient Markov chain Monte Carlo (MCMC) algorithm that converges quickly and scales well with the dimension of the target density. Nevertheless, its gradient requirement makes implementation challenging in large-scale problems. To compute a gradient for one leapfrog integration update, 2d measurements of the density are required, where d is the dimension. This study introduces an HMC variant in which gradients are obtained by means of the Simultaneous Perturbation Stochastic Approximation (SPSA) algorithm. This approximation scheme requires only 2 density measurements per gradient evaluation and thus can facilitate simulations in high-dimensional settings. We prove convergence of the SPSA-HMC algorithm by extending the general framework in Zou and Gu [1] for unbiased gradient estimates. Furthermore, we analyze how two variance reduction methods further improve the computational efficiency of the SPSA-HMC algorithm.
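
The evaluation counts can be made concrete with a small sketch. It contrasts a standard SPSA gradient estimate (two density evaluations regardless of dimension) with coordinate-wise central differences (2d evaluations), on the assumption that the 2d figure refers to a scheme of the latter kind. The function names, the Rademacher perturbation, and the step size `c` are choices made for this example, not details taken from the thesis.

```python
import numpy as np

def spsa_gradient(log_density, x, c=1e-3, rng=None):
    """SPSA gradient estimate at x using exactly two evaluations of log_density."""
    rng = np.random.default_rng() if rng is None else rng
    # Perturb every coordinate at once with independent +/-1 signs.
    delta = rng.choice([-1.0, 1.0], size=x.shape)
    diff = log_density(x + c * delta) - log_density(x - c * delta)
    # Because each delta_i is +/-1, dividing by delta_i equals multiplying by it.
    return diff / (2.0 * c) * delta

def central_difference_gradient(log_density, x, c=1e-3):
    """Coordinate-wise central differences: two evaluations per dimension."""
    grad = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = c
        grad[i] = (log_density(x + step) - log_density(x - step)) / (2.0 * c)
    return grad
```

A single SPSA estimate is noisy; it is the average over many random perturbations that approaches the true gradient, which is why variance reduction matters for this kind of estimator.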

2024

Abstract: This thesis investigates methods for finding closed orbits and limit cycles in continuous-time dynamical systems in the plane. The work uses a geometric approach to identify the regions of the plane where a closed-orbit trajectory can lie. It is applied to quadratic systems that have been proven to have three or four limit cycles. This geometric approach leads to an algorithm that applies Poincaré maps sequentially to find closed orbits of several known systems. The results provide tools for finding closed orbits to within a given tolerance and for locating small regions of the plane where closed orbits and limit cycles could be found.

n/a

Abstract: This thesis investigates an application of the k-fellow-traveler property for groups to finite graphs (which may or may not be Cayley graphs), and what analyzing a graph's k-fellow-traveler constant may reveal about its structure. We prove that, if G is a finite graph with κ(G) ≥ 2, then diam(G) − 1 ≤ k_G ≤ diam(G).

Abstract: This thesis explores the application of Time Series Classification (TSC) methods, specifically ROCKET and the signature method, to predicting Type 2 diabetes from synthetic patient data. The project attempts to improve the prediction of diabetes diagnoses by analyzing medical observation data through advanced mathematical and programming techniques.

Abstract: Graphical networks serve as powerful models for interpreting complex systems by abstracting complex scenarios into a simple network. This work leverages such networks to model the patterns in surface temperature observed within the Gulf of Mexico. To quantify seasonal dynamics, Voronoi diagrams are used to capture sub-regions that share common characteristics. These diagrams can then be analyzed as graphs, and the Gromov-Wasserstein distance captures in a single number how different a daily graph is from a standard reference graph.

2023

Abstract: This study applies survival analysis to pollinator-plant relationships, modeling floral-resource utilization by pollinators. Three types of survival analysis models are compared, including the Cox proportional hazards model, a binary classification model using stacking, and a Logistic Hazards neural network model.

n/a

n/a

n/a

n/a

n/a

n/a

2022

Abstract: This thesis develops a methodology for applying modern manifold embedding algorithms to the problem of empirical asset pricing. Our technique combines traditional linear compression with geometric dimensionality reduction in order to characterize the time-evolving distribution of a time series of nonconstant dimension using a small number of latent factors.

Abstract: This thesis focuses on modeling sleep duration with the most prominent medical model for sleep, the two-process model, in conjunction with a novel mathematical architecture that aspires to capture both the circular nature and the complexity of daily schedules and habits. Specifically, the two aims of this thesis are (1) to develop a theoretical mathematical framework to describe cyclical sleep and wake patterns and (2) to test this framework computationally with empirical patient data.

n/a

n/a

n/a

2021

Abstract: We couple a multivariate description with a time-dependent Hawkes/INAR(p) process. This model can be updated by sensors and is essentially a Kalman filter for INAR coupled data streams. This is a way to automatically interrogate an incoming stream of data for change-points and to adjust the stationary distribution to a new stationary distribution when/if the underlying stream of data is seen to have changed.

Abstract: There are many techniques available to recover causal relationships from data, such as Granger causality, convergent cross mapping, and causal graph structure learning approaches such as PCMCI. Path signatures and their associated signed areas provide a new way to approach the analysis of causally linked dynamical systems, particularly in informing a model-free, data-driven approach to algorithmic causal discovery. In this paper, we explore the use of path signatures in causal discovery and propose the application of confidence sequences to analyze the significance of the magnitude of the signed area between two variables.
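
For readers unfamiliar with the quantity, the signed area between two sampled variables can be computed as below. This is a minimal sketch of the standard Lévy-area formula for a piecewise-linear path through the samples; the function name is an assumption, and the confidence-sequence step proposed in the abstract is not shown.

```python
import numpy as np

def signed_area(x, y):
    """Signed (Levy) area of the piecewise-linear path (x_k, y_k).

    Discretizes 0.5 * integral of (x - x_0) dy - (y - y_0) dx.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = np.diff(x)
    dy = np.diff(y)
    # Displacement from the starting point at the left end of each segment.
    xc = x[:-1] - x[0]
    yc = y[:-1] - y[0]
    return 0.5 * float(np.sum(xc * dy - yc * dx))
```

The quantity is antisymmetric in its arguments, so `signed_area(x, y)` and `signed_area(y, x)` differ only in sign, and it vanishes when one variable is a fixed multiple of the other.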

Abstract: In this work, we consider the challenge of node immunization where information regarding network topology is inferred only through agent exploration along an unbiased random walk. In the first part, we formulate this as a Markov decision process problem and derive heuristic-based policies for scale-free and uncorrelated networks. We demonstrate empirical evidence that these policies achieve their objectives near-optimally and provide a policy-of-policies for situations where information about network family does not exist. In the second part, we introduce our open-source contagion package and use it to illustrate immunization policy performance with contagion simulations.

Abstract: This thesis introduces the UMAP-SPSA algorithm to perform the UMAP dimension reduction without the need for the smooth approximator. Further, we analyze the algorithm’s computational performance and embedding accuracy.

Abstract: Predicting clinical outcomes from time-series medical data is a complex but essential endeavor. In this study, we propose a novel approach that combines traditional survival models like Cox proportional hazards, logistic, and multi-task logistic regression (MTLR) with the robust mathematical framework of signature methods. These methods are particularly effective in capturing the underlying dynamics of time-series data with stochastic error. We introduce the concept of rough paths to provide a foundational understanding of how these techniques can capture not only the data’s deterministic aspects but also its stochastic nature, thereby enriching the feature set used for making more accurate predictions.

2020

Abstract: Our results show that since mutual information remains invariant under homeomorphism, only feature engineering methods that alter the entropy of the dataset will change the outcome of the neural network. This means that for some datasets and tasks, neural networks require meaningful, human-driven feature engineering or changes in architecture to provide enough information for the neural network to generate a sufficient statistic.

2019

Abstract: We define Sobolev and total variation priors on image smoothness, which control the derivatives of the images, to regularize (i.e. reduce complexity by removing unreasonable parameter choices from) the high-dimensional parameter space prescribed by the rigid motion dimensions and the diffeomorphism dimensions. We show that the quality of rigid slice alignment brought by introducing a Sobolev prior on the image intensity of a phantom and the bat brain data is superior to that of the total variation priors.