math.PR
67 postsarXiv:2503.22418v1 Announce Type: new Abstract: Based on existing ideas in the field of imprecise probabilities, we present a new approach for assessing the reliability of the individual predictions of a generative probabilistic classifier. We call this approach robustness quantification, compare it to uncertainty quantification, and demonstrate that it continues to work well even for classifiers that are learned from small training sets that are sampled from a shifted distribution.
arXiv:2503.22030v1 Announce Type: new Abstract: Robots rely on motion planning to navigate safely and efficiently while performing various tasks. In this paper, we investigate motion planning through Bayesian inference, where motion plans are inferred based on planning objectives and constraints. However, existing Bayesian motion planning methods often struggle to explore low-probability regions of the planning space, where high-quality plans may reside. To address this limitation, we propose the use of heavy-tailed distributions -- specifically, Student's-$t$ distributions -- to enhance probabilistic inferential search for motion plans. We develop a novel sequential single-pass smoothing approach that integrates Student's-$t$ distribution with Monte Carlo sampling. A special case of this approach is ensemble Kalman smoothing, which depends on short-tailed Gaussian distributions. We validate the proposed approach through simulations in autonomous vehicle motion planning, demonstrating its superior performance in planning, sampling efficiency, and constraint satisfaction compared to ensemble Kalman smoothing. While focused on motion planning, this work points to the broader potential of heavy-tailed distributions in enhancing probabilistic decision-making in robotics.
arXiv:2404.05185v2 Announce Type: replace-cross Abstract: This paper deals with a class of neural SDEs and studies the limiting behavior of the associated sampled optimal control problems as the sample size grows to infinity. The neural SDEs with N samples can be linked to the N-particle systems with centralized control. We analyze the Hamilton--Jacobi--Bellman equation corresponding to the N-particle system and establish regularity results which are uniform in N. The uniform regularity estimates are obtained by the stochastic maximum principle and the analysis of a backward stochastic Riccati equation. Using these uniform regularity results, we show the convergence of the minima of objective functionals and optimal parameters of the neural SDEs as the sample size N tends to infinity. The limiting objects can be identified with suitable functions defined on the Wasserstein space of Borel probability measures. Furthermore, quantitative algebraic convergence rates are also obtained.
arXiv:2503.21855v1 Announce Type: new Abstract: We present a new class of numerical methods for solving stochastic differential equations with additive noise on general Riemannian manifolds with high weak order of accuracy. In opposition to the popular approach with projection methods, the proposed methods are intrinsic: they only rely on geometric operations and avoid coordinates and embeddings. We provide a robust and general convergence analysis and an algebraic formalism of exotic planar Butcher series for the computation of order conditions at any high order. To illustrate the methodology, an explicit method of second weak order is introduced, and several numerical experiments confirm the theoretical findings and extend the approach for the sampling of the invariant measure of Riemannian Langevin dynamics.
arXiv:2411.07064v2 Announce Type: replace-cross Abstract: We develop a powerful and general method to provide rigorous and accurate upper and lower bounds for Lyapunov exponents of stochastic flows. Our approach is based on computer-assisted tools, the adjoint method and established results on the ergodicity of diffusion processes. We do not require any structural assumptions on the stochastic system, work under mild hypoellipticity conditions and outside of perturbative regimes. Therefore, our method allows for the treatment of systems that were so far out of reach from existing mathematical tools. We demonstrate our method to exhibit the chaotic nature of three different systems. Finally, we show the robustness of our approach by combining it with continuation methods to produce bounds on Lyapunov exponents over large parameter regions.
arXiv:2407.15455v3 Announce Type: replace-cross Abstract: We propose a new algorithm for learning bridged diffusion processes using score-matching methods. Our method relies on reversing the dynamics of the forward process and using this to learn a score function, which, via Doob's $h$-transform, yields a bridged diffusion process; that is, a process conditioned on an endpoint. In contrast to prior methods, we learn the score term $\nabla_x \log p(t, x; T, y)$ directly, for given $t, y$, completely avoiding first learning a time-reversal. We compare the performance of our algorithm with existing methods and see that it outperforms using the (learned) time-reversals to learn the score term. The code can be found at https://github.com/libbylbaker/forward_bridge.
arXiv:2503.10428v1 Announce Type: new Abstract: In this work, we will establish that the Langevin Monte-Carlo algorithm can learn depth-2 neural nets of any size and for any data and we give non-asymptotic convergence rates for it. We achieve this via showing that under Total Variation distance and q-Renyi divergence, the iterates of Langevin Monte Carlo converge to the Gibbs distribution of Frobenius norm regularized losses for any of these nets, when using smooth activations and in both classification and regression settings. Most critically, the amount of regularization needed for our results is independent of the size of the net. The key observation of ours is that two layer neural loss functions can always be regularized by a constant amount such that they satisfy the Villani conditions, and thus their Gibbs measures satisfy a Poincare inequality.
arXiv:2503.10002v1 Announce Type: cross Abstract: Given $d>0$ and a positive integer $n$, let $G$ be a triangle-free graph on $n$ vertices with average degree $d$. With an elegant induction, Shearer (1983) tightened a seminal result of Ajtai, Koml\'os and Szemer\'edi (1980/1981) by proving that $G$ contains an independent set of size at least $(1+o(1))\frac{\log d}{d}n$ as $d\to\infty$. By a generalisation of Shearer's method, we prove that the number of independent sets in $G$ must be at least $\exp\left((1+o(1))\frac{(\log d)^2}{2d}n\right)$ as $d\to\infty$. This improves upon results of Cooper and Mubayi (2014) and Davies, Jenssen, Perkins, and Roberts (2018). Our method also provides good lower bounds on the independence polynomial of $G$, one of which implies Shearer's result itself. As certified by a classic probabilistic construction, our bound on the number of independent sets is sharp to several leading terms as $d\to\infty$.
arXiv:2503.10580v1 Announce Type: cross Abstract: We prove an upper bound on the expected $\ell_p$ injective norm of sums of subgaussian random tensors. Our proof is simple and does not rely on any explicit geometric or chaining arguments. Instead, it follows from a simple application of the PAC-Bayesian lemma, a tool that has proven effective at controlling the suprema of certain ``smooth'' empirical processes in recent years. Our bound strictly improves a very recent result of Bandeira, Gopi, Jiang, Lucca, and Rothvoss. In the Euclidean case ($p=2$), our bound sharpens a result of Lata{\l}a that was central to proving his estimates on the moments of Gaussian chaoses. As a consequence, we obtain an elementary proof of this fundamental result.
arXiv:2503.09904v1 Announce Type: new Abstract: In studies on complex network systems using graph theory, eigen-analysis is typically performed on an undirected graph model of the network. However, when analyzing cascading failures in a power system, the interactions among failures suggest the need for a directed graph beyond the topology of the power system to model directions of failure propagation. To accurately quantify failure interactions for effective mitigation strategies, this paper proposes a stochastic interaction graph model and associated eigen-analysis. Different types of modes on failure propagations are defined and characterized by the eigenvalues of a stochastic interaction matrix, whose absolute values are unity, zero, or in between. Finding and interpreting these modes helps identify the probable patterns of failure propagation, either local or widespread, and the participating components based on eigenvectors. Then, by lowering the failure probabilities of critical components highly participating in a mode of widespread failures, cascading can be mitigated. The validity of the proposed stochastic interaction graph model, eigen-analysis and the resulting mitigation strategies is demonstrated using simulated cascading failure data on an NPCC 140-bus system.
arXiv:2503.09762v1 Announce Type: new Abstract: We study a centralized discrete-time dynamic two-way matching model with finitely many agent types. Agents arrive stochastically over time and join their type-dedicated queues waiting to be matched. We focus on state-independent greedy policies that achieve constant regret at all times by making matching decisions based solely on agent availability across types, rather than requiring complete queue-length information. Such policies are particularly appealing for life-saving applications such as kidney exchange, as they require less information and provide more transparency compared to state-dependent policies. First, for acyclic matching networks, we analyze a deterministic priority policy proposed by Kerimov et al. [2023] that follows a static priority order over matches. We derive the first explicit regret bound in terms of the general position gap (GPG) parameter $\epsilon$, which measures the distance of the fluid relaxation from degeneracy. Second, for general two-way matching networks, we design a randomized state-independent greedy policy that achieves constant regret with optimal scaling $O(\epsilon^{-1})$, matching the existing lower bound established by Kerimov et al. [2024].
arXiv:2407.09301v3 Announce Type: replace-cross Abstract: We prove non asymptotic total variation estimates for the kinetic Langevin algorithm in high dimension when the target measure satisfies a Poincar\'e inequality and has gradient Lipschitz potential. The main point is that the estimate improves significantly upon the corresponding bound for the non kinetic version of the algorithm, due to Dalalyan. In particular the dimension dependence drops from $O(n)$ to $O(\sqrt n)$.
arXiv:2503.05323v1 Announce Type: cross Abstract: We consider the graph alignment problem, wherein the objective is to find a vertex correspondence between two graphs that maximizes the edge overlap. The graph alignment problem is an instance of the quadratic assignment problem (QAP), known to be NP-hard in the worst case even to approximately solve. In this paper, we analyze Birkhoff relaxation, a tight convex relaxation of QAP, and present theoretical guarantees on its performance when the inputs follow the Gaussian Wigner Model. More specifically, the weighted adjacency matrices are correlated Gaussian Orthogonal Ensemble with correlation $1/\sqrt{1+\sigma^2}$. Denote the optimal solutions of the QAP and Birkhoff relaxation by $\Pi^\star$ and $X^\star$ respectively. We show that $\|X^\star-\Pi^\star\|_F^2 = o(n)$ when $\sigma = o(n^{-1.25})$ and $\|X^\star-\Pi^\star\|_F^2 = \Omega(n)$ when $\sigma = \Omega(n^{-0.5})$. Thus, the optimal solution $X^\star$ transitions from a small perturbation of $\Pi^\star$ for small $\sigma$ to being well separated from $\Pi^\star$ as $\sigma$ becomes larger than $n^{-0.5}$. This result allows us to guarantee that simple rounding procedures on $X^\star$ align $1-o(1)$ fraction of vertices correctly whenever $\sigma = o(n^{-1.25})$. This condition on $\sigma$ to ensure the success of the Birkhoff relaxation is state-of-the-art.
arXiv:2402.03467v2 Announce Type: replace Abstract: We give quantitative estimates for the rate of convergence of Riemannian stochastic gradient descent (RSGD) to Riemannian gradient flow and to a diffusion process, the so-called Riemannian stochastic modified flow (RSMF). Using tools from stochastic differential geometry we show that, in the small learning rate regime, RSGD can be approximated by the solution to the RSMF driven by an infinite-dimensional Wiener process. The RSMF accounts for the random fluctuations of RSGD and, thereby, increases the order of approximation compared to the deterministic Riemannian gradient flow. The RSGD is build using the concept of a retraction map, that is, a cost efficient approximation of the exponential map, and we prove quantitative bounds for the weak error of the diffusion approximation under assumptions on the retraction map, the geometry of the manifold, and the random estimators of the gradient.
arXiv:2503.04855v1 Announce Type: cross Abstract: We characterize a joint CLT of the number of pulls and the sample mean reward of the arms in a stochastic two-armed bandit environment under UCB algorithms. Several implications of this result are in place: (1) a nonstandard CLT of the number of pulls hence pseudo-regret that smoothly interpolates between a standard form in the large arm gap regime and a slow-concentration form in the small arm gap regime, and (2) a heuristic derivation of the sample bias up to its leading order from the correlation between the number of pulls and sample means. Our analysis framework is based on a novel perturbation analysis, which is of broader interest on its own.
arXiv:2408.15858v2 Announce Type: replace-cross Abstract: In this article, we study a discrete and continuous version of the spectral Dirichlet problem in an open bounded connected set $\Omega\subset \mathbb{R}^d$, in dimension $d\geq 2$. More precisely, we consider the simple random walk on $\mathbb{Z}^d$ killed upon exiting the (large) bounded domain $\Omega_N = (N\Omega)\cap \mathbb{Z}^d$. We denote by $P_N$ the corresponding transition matrix and we study the properties of its ($L^2$-normalized) principal eigenvector $\phi_N$. With probabilistic arguments (namely gambler's ruin estimates and coupling techniques) and under mild assumptions on the domain, we are able to give regularity estimates on $\phi_N$ (namely on its $k$-th order differences), with a uniform control inside $\Omega_N$. The same holds true for the first eigenfunction $\varphi_1$ of the corresponding continuous spectral Dirichlet problem, this being related to a Brownian motion killed upon exiting $\Omega$. As corollaries, we derive the $L^2$ and uniform convergence of $\phi_N$ to $\varphi_1$ in Lipschitz bounded domains.
arXiv:2503.05068v1 Announce Type: new Abstract: In this work, we propose and analyze a new local time-decoupled squared Wasserstein-2 method for reconstructing the distribution of unknown parameters in dynamical systems. Specifically, we show that a stochastic neural network model, which can be effectively trained by minimizing our proposed local time-decoupled squared Wasserstein-2 loss function, is an effective model for approximating the distribution of uncertain model parameters in dynamical systems. Through several numerical examples, we showcase the effectiveness of our proposed method in reconstructing the distribution of parameters in different dynamical systems.
arXiv:2502.07347v3 Announce Type: replace Abstract: In artificial intelligence (AI) and decision-making systems, structured approximations play a crucial role in balancing model interpretability and predictive accuracy. Coarse Set Theory (CST) introduces a mathematical framework to formalize Coarse Ethics (CE), which models coarse-grained decision-making processes commonly used in human evaluations and AI classification systems. CST defines hierarchical relationships among sets using totally ordered structures and coarse mappings, enabling us to adjust decision granularity dynamically. Furthermore, coarse evaluations inherently involve a trade-off between efficiency and information retention, as they simplify complex data representations at the cost of precision. To quantitatively assess this trade-off, we introduce Kullback-Leibler (KL) Divergence as a measure of information loss in coarse evaluations, demonstrating the impact of coarse partitioning on decision accuracy. This study employs CST in grading systems, automated recommendations, and risk assessments, demonstrating its potential to enhance fairness, reduce bias, and improve transparency in AI-driven decision-making.
arXiv:2503.01855v1 Announce Type: cross Abstract: The desirable gambles framework provides a foundational approach to imprecise probability theory but relies heavily on linear utility assumptions. This paper introduces {\em function-coherent gambles}, a generalization that accommodates non-linear utility while preserving essential rationality properties. We establish core axioms for function-coherence and prove a representation theorem that characterizes acceptable gambles through continuous linear functionals. The framework is then applied to analyze various forms of discounting in intertemporal choice, including hyperbolic, quasi-hyperbolic, scale-dependent, and state-dependent discounting. We demonstrate how these alternatives to constant-rate exponential discounting can be integrated within the function-coherent framework. This unified treatment provides theoretical foundations for modeling sophisticated patterns of time preference within the desirability paradigm, bridging a gap between normative theory and observed behavior in intertemporal decision-making under genuine uncertainty.
arXiv:2503.05137v1 Announce Type: cross Abstract: We recall the classical formulation of PageRank as the stationary distribution of a singularly perturbed irreducible Markov chain that is not irreducible when the perturbation parameter goes to zero. Specifically, we use the Markov chain tree theorem to derive explicit expressions for the PageRank. This analysis leads to some surprising results. These results are then extended to a much more general class of perturbations that subsume personalized PageRank. We also give examples where even simpler formulas for PageRank are possible.