math.PR

27 posts

arXiv:2501.00762v1 Announce Type: new Abstract: Graph neural networks (GNNs) have achieved remarkable empirical success in processing and representing graph-structured data across various domains. However, a significant challenge known as "oversmoothing" persists, where vertex features become nearly indistinguishable in deep GNNs, severely restricting their expressive power and practical utility. In this work, we analyze the asymptotic oversmoothing rates of deep GNNs with and without residual connections by deriving explicit convergence rates for a normalized vertex similarity measure. Our analytical framework is grounded in the multiplicative ergodic theorem. Furthermore, we demonstrate that adding residual connections effectively mitigates or prevents oversmoothing across several broad families of parameter distributions. The theoretical findings are strongly supported by numerical experiments.

Ziang Chen, Zhengjiang Lin, Shi Chen, Yury Polyanskiy, Philippe Rigollet (1/3/2025)
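The oversmoothing phenomenon described in this abstract is easy to reproduce numerically. The sketch below is an illustrative toy (a 4-cycle graph with random linear layers), not the paper's exact similarity measure or parameter distributions: it propagates random features through 50 GCN-style layers and compares a normalized dispersion of vertex features with and without residual connections.

```python
import numpy as np

rng = np.random.default_rng(0)

# 4-cycle with self-loops: all degrees equal, so repeated neighborhood
# averaging drives every row of the feature matrix toward the common mean.
A = np.array([[1, 1, 0, 1],
              [1, 1, 1, 0],
              [0, 1, 1, 1],
              [1, 0, 1, 1]], dtype=float)
A_hat = A / A.sum(axis=1, keepdims=True)  # row-normalized propagation matrix

def vertex_similarity(X):
    # Normalized dispersion of vertex features: 0 means fully oversmoothed.
    return np.linalg.norm(X - X.mean(axis=0)) / np.linalg.norm(X)

X_plain = X_res = rng.standard_normal((4, 8))
for _ in range(50):
    W = rng.standard_normal((8, 8)) / np.sqrt(8)  # random layer weights
    X_plain = A_hat @ X_plain @ W                 # plain deep GNN layer
    X_res = X_res + A_hat @ X_res @ W             # layer with residual connection

sim_plain, sim_res = vertex_similarity(X_plain), vertex_similarity(X_res)
print(sim_plain, sim_res)
```

The plain network's dispersion collapses at a geometric rate set by the second eigenvalue of the propagation matrix, while the residual network retains a non-trivial identity component at every layer.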

arXiv:2501.00780v1 Announce Type: new Abstract: We introduce a novel meshless simulation method for the McKean-Vlasov Stochastic Differential Equation (MV-SDE) utilizing deep learning, applicable to both self-interaction and interaction scenarios. Traditionally, numerical methods for this equation rely on the interacting particle method combined with techniques based on the It\^o-Taylor expansion. The convergence rate of this approach is determined by two parameters: the number of particles $N$ and the time step size $h$ for each Euler iteration. However, for extended time horizons or equations with larger Lipschitz coefficients, this method is often limited, as it requires a significant increase in Euler iterations to achieve the desired precision $\epsilon$. To overcome the challenges posed by the difficulty of parallelizing the simulation of continuous interacting particle systems, which involve solving high-dimensional coupled SDEs, we propose a meshless MV-SDE solver grounded in Physics-Informed Neural Networks (PINNs) that does not rely on the propagation of chaos result. Our method constructs a pseudo MV-SDE using It\^o calculus, then quantifies the discrepancy between this equation and the original MV-SDE, with the error minimized through a loss function. This loss is controlled via an optimization algorithm, independent of the time step size, and we provide an error estimate for the loss function. The advantages of our approach are demonstrated through corresponding simulations.

Jingyuan Li, Wei Liu (1/3/2025)

arXiv:2110.07722v2 Announce Type: replace Abstract: This paper derives probability theory (the sigma system) and possibility theory (the max system) from clearly defined notions of randomness and fuzziness, respectively, focusing on the question of why the key axiom of "maxitivity" is adopted for the possibility measure. This objective is achieved in three steps: a) establishing mathematical definitions of randomness and fuzziness; b) developing an intuitive definition of possibility as a measure of fuzziness based on the compatibility interpretation; c) abstracting the axiomatic definitions of probability and possibility from their intuitive definitions, taking advantage of the properties of the well-defined randomness and fuzziness. We conclude that "max" is the only, albeit non-strict, disjunctive operator applicable across the fuzzy event space, and that it is an exact operator for extracting the value from the fuzzy sample space that leads to the largest possibility of one. A demonstration example of stock price prediction is then presented, confirming that max inference indeed exhibits distinctive performance, with an improvement of up to 18.99% over sigma inference for the investigated application. Our work provides a physical foundation for the axiomatic definition of possibility as a measure of fuzziness, which we hope will facilitate wider adoption of possibility theory in practice.

Wei Mei, Ming Li, Yuanzeng Cheng, Limin Liu (1/3/2025)
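The maxitivity axiom discussed in this abstract can be illustrated on a three-point space. The snippet below is a toy contrast (with made-up numbers), not the paper's derivation: a possibility measure takes the max of a possibility distribution over an event, while a probability measure sums a mass function, so disjunction behaves very differently in the two systems.

```python
# Possibility distribution (sup = 1, normalized) vs. probability mass (sum = 1).
# Dyadic values are used so the arithmetic below is exact in floating point.
pi = {"low": 0.25, "mid": 1.0, "high": 0.5}
p  = {"low": 0.25, "mid": 0.5, "high": 0.25}

def Pos(A): return max(pi[x] for x in A)  # maxitive measure: Pos(A) = max over A
def Pr(A):  return sum(p[x] for x in A)   # additive measure: Pr(A) = sum over A

A, B = {"low"}, {"mid", "high"}
print(Pos(A | B), max(Pos(A), Pos(B)))  # maxitivity holds for any A, B
print(Pr(A | B), Pr(A) + Pr(B))         # additivity requires A, B disjoint
```

Note that maxitivity holds even for overlapping events, whereas additivity of the probability measure requires disjointness; this is the structural distinction between the "max system" and the "sigma system".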

arXiv:2112.03543v3 Announce Type: replace Abstract: Communication noise is a common feature in several real-world scenarios where systems of agents need to communicate in order to pursue some collective task. In particular, many biologically inspired systems that try to achieve agreements on some opinion must implement resilient dynamics that are not strongly affected by noisy communications. In this work, we study the popular 3-Majority dynamics, an opinion dynamics which has been proved to be an efficient protocol for the majority consensus problem, in which we introduce a simple feature of uniform communication noise, following (d'Amore et al. 2020). We prove that in the fully connected communication network of $n$ agents and in the binary opinion case, the process induced by the 3-Majority dynamics exhibits a phase transition. For a noise probability $p < 1/3$, the dynamics reaches with high probability an almost-consensus metastable phase that lasts for a polynomial number of rounds, while, for $p > 1/3$, no form of consensus is possible, and any information regarding the initial majority opinion is lost in logarithmic time with high probability. Although more communications per round are allowed, the 3-Majority dynamics surprisingly turns out to be less resilient to noise than the Undecided-State dynamics (d'Amore et al. 2020), whose noise threshold value is $p = 1/2$.

Francesco d'Amore, Isabella Ziccardi (1/3/2025)
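The phase transition around $p = 1/3$ can be observed in a short simulation. The sketch below uses a minimal noise model (each received opinion is independently replaced by a uniform random bit with probability $p$), which may differ in detail from the model of d'Amore et al.; the parameter values are illustrative.

```python
import numpy as np

def three_majority(n, p, q0, rounds, seed=0):
    """Synchronous 3-Majority with uniform communication noise: each of the
    3 sampled opinions is replaced by a uniform random bit with prob p."""
    rng = np.random.default_rng(seed)
    ops = (rng.random(n) < q0).astype(int)  # initial fraction q0 of opinion 1
    for _ in range(rounds):
        msgs = ops[rng.integers(0, n, size=(n, 3))]     # 3 uniform samples each
        noisy = rng.random((n, 3)) < p
        msgs = np.where(noisy, rng.integers(0, 2, size=(n, 3)), msgs)
        ops = (msgs.sum(axis=1) >= 2).astype(int)       # adopt the majority
    return ops.mean()

low = three_majority(2000, p=0.10, q0=0.7, rounds=60)   # below the 1/3 threshold
high = three_majority(2000, p=0.45, q0=0.7, rounds=60)  # above the threshold
print(low, high)
```

Below the threshold the initial majority is amplified toward almost-consensus; above it, the opinion fractions hover around 1/2 and the initial majority information is lost.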

arXiv:2210.07996v5 Announce Type: replace Abstract: We study the classical Network Revenue Management (NRM) problem with accept/reject decisions and $T$ IID arrivals. We consider a distributional form where each arrival must fall under a finite number of possible categories, each with a deterministic resource consumption vector, but a random value distributed continuously over an interval. We develop an online algorithm that achieves $O(\log^2 T)$ regret under this model, with the only (necessary) assumption being that the probability densities are bounded away from 0. We derive a second result that achieves $O(\log T)$ regret under an additional assumption of second-order growth. To our knowledge, these are the first results achieving logarithmic-level regret in an NRM model with continuous values that do not require any kind of "non-degeneracy" assumptions. Our results are achieved via new techniques including a new method of bounding myopic regret, a "semi-fluid" relaxation of the offline allocation, and an improved bound on the "dual convergence".

Jiashuo Jiang, Will Ma, Jiawei Zhang (1/3/2025)

arXiv:2411.06857v2 Announce Type: replace Abstract: We study algebraic properties of partition functions, particularly the location of zeros, through the lens of rapidly mixing Markov chains. The classical Lee-Yang program initiated the study of phase transitions via locating complex zeros of partition functions. Markov chains, besides serving as algorithms, have also been used to model physical processes tending to equilibrium. In many scenarios, rapid mixing of Markov chains coincides with the absence of phase transitions (complex zeros). Prior works have shown that the absence of phase transitions implies rapid mixing of Markov chains. We reveal a converse connection by lifting probabilistic tools for the analysis of Markov chains to study complex zeros of partition functions. Our motivating example is the independence polynomial on $k$-uniform hypergraphs, where the best-known zero-free regime has been significantly lagging behind the regime where we have rapidly mixing Markov chains for the underlying hypergraph independent sets. Specifically, the Glauber dynamics is known to mix rapidly on independent sets in a $k$-uniform hypergraph of maximum degree $\Delta$ provided that $\Delta \lesssim 2^{k/2}$. On the other hand, the best-known zero-freeness around the point $1$ of the independence polynomial on $k$-uniform hypergraphs requires $\Delta \le 5$, the same bound as on a graph. By introducing a complex extension of Markov chains, we lift an existing percolation argument to the complex plane, and show that if $\Delta \lesssim 2^{k/2}$, the Markov chain converges in a complex neighborhood, and the independence polynomial itself does not vanish in the same neighborhood. In the same regime, our result also implies central limit theorems for the size of a uniformly random independent set, and deterministic approximation algorithms for the number of hypergraph independent sets of size $k \le \alpha n$ for some constant $\alpha$.

Jingcheng Liu, Chunyang Wang, Yitong Yin, Yixiao Yu (1/3/2025)
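The Glauber dynamics on hypergraph independent sets referenced in this abstract has a short description: pick a uniform vertex and resample its occupancy, rejecting additions that would fully cover a hyperedge. A minimal sketch at fugacity $\lambda = 1$, on a small hypothetical 3-uniform hypergraph (the instance is for illustration only):

```python
import random

def is_independent(S, edges):
    # S is independent iff it contains no hyperedge entirely.
    return not any(e <= S for e in edges)

def glauber(n, edges, steps, seed=0):
    """Single-site Glauber dynamics on independent sets (fugacity 1)."""
    rng = random.Random(seed)
    S = set()
    for _ in range(steps):
        v = rng.randrange(n)
        if v in S:
            if rng.random() < 0.5:                  # propose removing v
                S.discard(v)
        elif rng.random() < 0.5 and is_independent(S | {v}, edges):
            S.add(v)                                # propose adding v if legal
    return S

# Toy 3-uniform hypergraph on 6 vertices.
edges = [frozenset(e) for e in [(0, 1, 2), (1, 2, 3), (3, 4, 5), (0, 3, 5)]]
S = glauber(6, edges, steps=5000)
print(sorted(S))
```

The chain's state space is exactly the set of hypergraph independent sets; the paper's contribution is a complex extension of such chains used to certify zero-freeness of the independence polynomial in the same $\Delta \lesssim 2^{k/2}$ regime.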

arXiv:2412.04489v2 Announce Type: replace Abstract: The potential demand in the market and customers' perception of service value are crucial factors in pricing strategies, resource allocation, and other operational decisions. However, this information is typically private and not readily accessible. In this paper, we analyze a service system operating across two stations, each with its own customer flow. Customers arriving at the system are informed of the waiting times at both stations and can choose to either join the local station, switch to the other station, or balk. Our objective is to estimate the arrival rates at each station and customers' perceived service value based on the observed workloads at both stations. A significant challenge arises from the inability to observe balking customers and the lack of distinction between local arrivals and customers switching from the other station, as the switching cost is unknown. To address this issue, we employ maximum likelihood estimation and validate the effectiveness of the estimator through a series of simulations.

Nishant Mangre, Jiesen Wang (12/25/2024)
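As a stripped-down illustration of the estimation difficulty highlighted in this abstract (balking customers are never observed), the toy below reduces the setting to one station with a known join probability; this is an assumption for illustration only, not the paper's two-station workload model or its likelihood.

```python
import numpy as np

# Arrivals are Poisson(lam_true) per unit time, but a customer balks (and is
# unobserved) with known probability 1 - q. Observed joins over a horizon T
# are then Poisson(lam_true * q * T), and the MLE of the arrival rate is
# lam_hat = m / (q * T): the thinning must be inverted to recover the rate.
rng = np.random.default_rng(1)
lam_true, q, T = 5.0, 0.8, 2000.0
m = rng.poisson(lam_true * q * T)   # observed number of joining customers
lam_hat = m / (q * T)
print(lam_hat)
```

In the paper, the join/switch/balk probabilities themselves depend on the unknown perceived service value and switching cost, which is what makes the full maximum-likelihood problem non-trivial.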

arXiv:2412.07592v2 Announce Type: replace-cross Abstract: We study the complexity of deterministic and probabilistic inversions of partial computable functions on the reals.

George Barmpalias, Mingyang Wang, Xiaoyan Zhang (12/25/2024)

arXiv:2412.18014v1 Announce Type: new Abstract: Universality, namely distributional invariance, is a well-known property of many random structures. For example, it is known to hold for a broad range of variational problems with random input. Much less is known about the universality of the performance of specific algorithms for solving such variational problems: do algorithms tuned to specific variational tasks produce the same asymptotic answer regardless of the underlying distribution? In this paper we show that the answer is yes for a class of models, which includes spin glass models and constraint satisfaction problems on sparse graphs, provided that the algorithm can be coded as a low-degree polynomial (LDP). We illustrate this specifically for the case of the Max-Cut problem in the sparse Erd\"os-R\'enyi graph $\mathbb{G}(n, d/n)$. We use the fact that the Approximate Message Passing (AMP) algorithm, which is an effective algorithm for finding near-ground states of the Sherrington-Kirkpatrick (SK) model, is well approximated by an LDP. We then establish our main universality result: the performance of LDP-based algorithms exhibiting a certain connectivity property is the same in the mean-field (SK) and random graph $\mathbb{G}(n, d/n)$ settings, up to an appropriate rescaling. The main technical challenge, which we address in this paper, is showing that the output of the LDP algorithm on $\mathbb{G}(n, d/n)$ is truly discrete, namely, that it is close to the set of points in the binary cube.

Houssam El Cheairi, David Gamarnik (12/25/2024)

arXiv:2412.18184v1 Announce Type: new Abstract: Quantization and pruning are two essential techniques for compressing neural networks, yet they are often treated independently, with limited theoretical analysis connecting them. This paper introduces a unified framework for post-training quantization and pruning using stochastic path-following algorithms. Our approach builds on the Stochastic Path Following Quantization (SPFQ) method, extending its applicability to pruning and low-bit quantization, including challenging 1-bit regimes. By incorporating a scaling parameter and generalizing the stochastic operator, the proposed method achieves robust error correction and yields rigorous theoretical error bounds for both quantization and pruning as well as their combination.

Haoyu Zhang, Rayan Saab (12/25/2024)
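The error-correction idea behind path-following quantization can be sketched with a deterministic greedy variant: quantize coordinates one at a time while feeding the accumulated error forward so later coordinates can compensate. This is an illustrative simplification, not the authors' stochastic operator or their scaling parameter.

```python
import numpy as np

def quantize_with_feedback(w, levels):
    """Greedy error-feedback quantization to a fixed alphabet."""
    q = np.zeros_like(w)
    err = 0.0
    for i, wi in enumerate(w):
        target = wi + err                              # fold in past error
        q[i] = levels[np.argmin(np.abs(levels - target))]
        err = target - q[i]                            # carry residual forward
    return q

rng = np.random.default_rng(0)
w = rng.uniform(-1, 1, size=200)
levels = np.array([-1.0, 0.0, 1.0])                    # ternary alphabet
q = quantize_with_feedback(w, levels)
# Error feedback keeps the running sums of w and q within half a level gap.
print(np.max(np.abs(np.cumsum(w) - np.cumsum(q))))
```

Because the carried error never exceeds half the alphabet spacing, the cumulative quantized weights track the cumulative true weights uniformly; the paper's stochastic version rounds randomly instead of greedily, which is what enables its rigorous error bounds in the 1-bit regime.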

arXiv:2412.18599v1 Announce Type: new Abstract: Theoretical guarantees for double-spending probabilities for the Nakamoto consensus under the $k$-deep confirmation rule have been extensively studied for zero/bounded network delays and fixed mining rates. In this paper, we introduce a ruin-theoretical model of double spending for Nakamoto consensus under the $k$-deep confirmation rule when the honest mining rate is allowed to be an arbitrary function of time, including the block delivery periods, i.e., time periods during which mined blocks are being delivered to all other participants of the network. Time-varying mining rates are considered to capture the intrinsic characteristics of peer-to-peer network delays as well as dynamic participation of miners, such as the gap game and switching between different cryptocurrencies. Ruin theory is leveraged to obtain the double-spending probabilities, and numerical examples are presented to validate the effectiveness of the proposed analytical method.

Mustafa Doger, Sennur Ulukus, Nail Akar (12/25/2024)
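For contrast with the time-varying-rate model introduced here, the classical fixed-rate baseline (Nakamoto's race analysis under the $k$-deep rule, with $q$ the attacker's fraction of mining power) has a closed form, sketched below:

```python
from math import exp, factorial

def double_spend_prob(q, k):
    """Nakamoto's fixed-rate double-spend probability, k-deep confirmations."""
    p = 1.0 - q
    lam = k * q / p  # expected attacker blocks while honest miners mine k
    prob = 1.0
    for m in range(k + 1):
        pois = exp(-lam) * lam**m / factorial(m)
        # If the attacker is m blocks in, she must still close a gap of k - m;
        # the gambler's-ruin catch-up probability is (q/p)^(k-m).
        prob -= pois * (1.0 - (q / p) ** (k - m))
    return prob

print(double_spend_prob(0.1, 6))  # the classic 0.0002428 from the whitepaper
```

The paper generalizes this picture by letting the honest mining rate vary in time and by leveraging ruin theory instead of the fixed-rate gambler's-ruin calculation.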

arXiv:2412.18338v1 Announce Type: cross Abstract: We establish weak convergence rates for spectral Galerkin approximations of the stochastic viscous Burgers equation driven by additive trace-class noise. Our results complement the known results regarding strong convergence; we obtain essential weak convergence rate 2. As expected, this is twice the known strong rate. The main ingredients of the proof are novel regularity results on the solutions of the associated Kolmogorov equations.

Charles-Edouard Br\'ehier, Sonja Cox, Annie Millet (12/25/2024)

arXiv:2412.18432v1 Announce Type: cross Abstract: Entropic optimal transport problems are regularized versions of optimal transport problems. These models play an increasingly important role in machine learning and generative modelling. For finite spaces, these problems are commonly solved using the Sinkhorn algorithm (a.k.a. the iterative proportional fitting procedure). However, in more general settings the Sinkhorn iterations are based on nonlinear conditional/conjugate transformations and exact finite-dimensional solutions cannot be computed. This article presents a finite-dimensional recursive formulation of the iterative proportional fitting procedure for general Gaussian multivariate models. As expected, this recursive formulation is closely related to the celebrated Kalman filter and related Riccati matrix difference equations, and it yields algorithms that can be implemented in practical settings without further approximations. We extend this filtering methodology to develop a refined and self-contained convergence analysis of Gaussian Sinkhorn algorithms, including closed form expressions of entropic transport maps and Schr\"odinger bridges.

O. Deniz Akyildiz, Pierre Del Moral, Joaqu\'in Miguez (12/25/2024)
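On finite spaces, the Sinkhorn iterations mentioned in this abstract are a few lines: alternately rescale the rows and columns of the Gibbs kernel until both marginals match. A minimal sketch with a generic random cost matrix (not the Gaussian filtering recursions developed in the paper):

```python
import numpy as np

def sinkhorn(C, a, b, eps=0.5, iters=200):
    """Entropic OT on finite spaces via iterative proportional fitting."""
    K = np.exp(-C / eps)                  # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(iters):
        v = b / (K.T @ u)                 # fit column marginals
        u = a / (K @ v)                   # fit row marginals
    return u[:, None] * K * v[None, :]    # entropic transport plan

rng = np.random.default_rng(0)
C = rng.random((4, 5))                    # random cost matrix
a = np.full(4, 1 / 4)                     # uniform source marginal
b = np.full(5, 1 / 5)                     # uniform target marginal
P = sinkhorn(C, a, b)
print(P.sum(axis=1), P.sum(axis=0))       # both marginals are matched
```

In the Gaussian setting studied by the paper, these pointwise rescalings become conditional transformations of Gaussian laws, which is what produces the Kalman-filter-like matrix recursions.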

arXiv:2412.18468v1 Announce Type: cross Abstract: Matrix concentration inequalities and their recently discovered sharp counterparts provide powerful tools to bound the spectrum of random matrices whose entries are linear functions of independent random variables. However, in many applications in theoretical computer science and in other areas one encounters more general random matrix models, called matrix chaoses, whose entries are polynomials of independent random variables. Such models have often been studied on a case-by-case basis using ad-hoc methods that can yield suboptimal dimensional factors. In this paper we provide general matrix concentration inequalities for matrix chaoses, which enable the treatment of such models in a systematic manner. These inequalities are expressed in terms of flattenings of the coefficients of the matrix chaos. We further identify a special family of matrix chaoses of combinatorial type for which the flattening parameters can be computed mechanically by a simple rule. This allows us to provide a unified treatment of and improved bounds for matrix chaoses that arise in a variety of applications, including graph matrices, Khatri-Rao matrices, and matrices that arise in average case analysis of the sum-of-squares hierarchy.

Afonso S. Bandeira, Kevin Lucca, Petar Nizi\'c-Nikolac, Ramon van Handel (12/25/2024)

arXiv:2407.16642v4 Announce Type: replace-cross Abstract: We revisit the classic problem of aggregating binary advice from conditionally independent experts, also known as the Naive Bayes setting. Our quantity of interest is the error probability of the optimal decision rule. In the case of symmetric errors (sensitivity = specificity), reasonably tight bounds on the optimal error probability are known. In the general asymmetric case, we are not aware of any nontrivial estimates on this quantity. Our contribution consists of sharp upper and lower bounds on the optimal error probability in the general case, which recover and sharpen the best known results in the symmetric special case. Since this turns out to be equivalent to estimating the total variation distance between two product distributions, our results also have bearing on this important and challenging problem.

Aryeh Kontorovich, Ariel Avital (12/24/2024)
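In the asymmetric Naive Bayes setting described here, the optimal decision rule is a weighted majority whose weights are the per-expert log-likelihood ratios. The sketch below estimates its error probability by Monte Carlo with illustrative sensitivity/specificity values (the parameters are not from the paper):

```python
import numpy as np

def optimal_decision(votes, sens, spec, prior=0.5):
    """Naive Bayes aggregation: decide 1 iff the summed log-likelihood
    ratios (plus log prior odds) are positive."""
    llr = np.where(votes == 1,
                   np.log(sens / (1 - spec)),    # expert voted 1
                   np.log((1 - sens) / spec))    # expert voted 0
    return int(llr.sum() + np.log(prior / (1 - prior)) > 0)

rng = np.random.default_rng(0)
sens = np.array([0.9, 0.7, 0.6])   # P(vote 1 | truth 1) per expert
spec = np.array([0.8, 0.9, 0.95])  # P(vote 0 | truth 0) per expert

errors, trials = 0, 20000
for _ in range(trials):
    y = rng.integers(0, 2)                       # truth ~ Bernoulli(1/2)
    probs = sens if y == 1 else 1 - spec         # per-expert P(vote 1 | y)
    votes = (rng.random(3) < probs).astype(int)
    errors += optimal_decision(votes, sens, spec) != y
err_rate = errors / trials
print(err_rate)
```

The paper's contribution is sharp analytic upper and lower bounds on exactly this error probability; the Monte Carlo estimate is only a numerical sanity check of the rule itself.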

arXiv:2412.16307v1 Announce Type: new Abstract: We investigate the qualitative behaviour of the solutions of a stochastic boundary value problem on the half-line for a nonlinear system of parabolic reaction-diffusion equations, from a numerical point of view. The model describes the chemical aggression of calcium carbonate stones under the attack of sulphur dioxide. The dynamical boundary condition is given by a Pearson diffusion, which is original in the context of the degradation of cultural heritage. We first discuss a scheme based on the Lamperti transformation for the stochastic differential equation to preserve the boundary and a splitting strategy for the partial differential equation based on recent theoretical results. Positivity, boundedness, and stability are established. The impact of boundary noise on the solution and its qualitative behaviour in both the slow and fast regimes is discussed in several numerical experiments.

Francesca Arceci, Daniela Morale, Stefania Ugolini (12/24/2024)

arXiv:2412.17418v1 Announce Type: new Abstract: This paper studies the numerical simulation of the solution to the McKean-Vlasov equation with common noise. We begin by discretizing the solution in time using the Euler scheme, followed by spatial discretization through the particle method, inspired by the propagation of chaos property. Assuming H{\"o}lder continuity in time, as well as Lipschitz continuity in the state and measure arguments of the coefficient functions, we establish the convergence rate of the Euler scheme and the particle method. These results extend those for the standard McKean-Vlasov equation without common noise. Finally, we present two simulation examples: a modified conditional Ornstein-Uhlenbeck process with common noise and an interbank market model.

Th\'eophile Le Gall (CEREMADE) (12/24/2024)
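The Euler-plus-particle scheme described in this abstract can be sketched for a conditional Ornstein-Uhlenbeck example with common noise. The coefficients below are illustrative, not the paper's: all particles share the common Brownian increment, and the conditional mean $\mathbb{E}[X_t \mid W^0]$ is replaced by the empirical mean of the particle system.

```python
import numpy as np

# Toy conditional OU McKean-Vlasov SDE with common noise:
#   dX_t = (E[X_t | W^0] - X_t) dt + sigma dB_t + sigma0 dW^0_t,  X_0 = 1.
rng = np.random.default_rng(0)
N, steps, T = 5000, 200, 1.0
h = T / steps
sigma, sigma0 = 0.5, 0.3

X = np.ones(N)
for _ in range(steps):
    dW0 = np.sqrt(h) * rng.standard_normal()     # common increment (shared)
    dB = np.sqrt(h) * rng.standard_normal(N)     # idiosyncratic increments
    # Euler step: conditional mean approximated by the empirical mean.
    X = X + (X.mean() - X) * h + sigma * dB + sigma0 * dW0

# The empirical mean m_t evolves like 1 + sigma0 * W^0_t (up to O(1/sqrt(N))),
# while deviations from it form an OU process with stationary-like spread.
print(X.mean(), X.std())
```

Without the common noise term the empirical mean would concentrate at the deterministic value 1; with it, the mean itself stays random, which is the feature the paper's convergence analysis has to accommodate.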

arXiv:2412.16457v1 Announce Type: cross Abstract: In this paper, we focus on the matching recovery problem between a pair of correlated Gaussian Wigner matrices with a latent vertex correspondence. We are particularly interested in a robust version of this problem, in which our observation is a perturbed input $(A+E, B+F)$, where $(A,B)$ is a pair of correlated Gaussian Wigner matrices and $E,F$ are adversarially chosen matrices supported on an unknown $\epsilon n \times \epsilon n$ principal minor of $A,B$, respectively. We propose a vector-approximate message passing (vector-AMP) algorithm that succeeds in polynomial time as long as the correlation $\rho$ between $(A,B)$ is a non-vanishing constant and $\epsilon = o\big( \tfrac{1}{(\log n)^{20}} \big)$. The main methodological inputs for our result are the iterative random graph matching algorithm proposed in \cite{DL22+, DL23+} and the spectral cleaning procedure proposed in \cite{IS24+}. To the best of our knowledge, our algorithm is the first efficient random graph matching type algorithm that is robust under any adversarial perturbations of $n^{1-o(1)}$ size.

Zhangsong Li (12/24/2024)

arXiv:2405.17191v2 Announce Type: replace Abstract: Generative adversarial networks (GANs) have emerged as a powerful tool for generating high-fidelity data. However, the main bottleneck of existing approaches is the lack of supervision on the generator training, which often results in undamped oscillation and unsatisfactory performance. To address this issue, we propose an algorithm called Monte Carlo GAN (MCGAN). This approach, utilizing an innovative generative loss function, termed the regression loss, reformulates the generator training as a regression task and enables generator training by minimizing the mean squared error between the discriminator's output on real data and the expected discriminator output on fake data. We demonstrate the desirable analytic properties of the regression loss, including discriminability and optimality, and show that our method requires a weaker condition on the discriminator for effective generator training. These properties justify the strength of this approach to improve the training stability while retaining the optimality of GAN by leveraging strong supervision of the regression loss. Extensive experiments on diverse datasets, including image data (CIFAR-10/100, FFHQ256, ImageNet, and LSUN Bedroom), time series data (VAR and stock data) and video data, are conducted to demonstrate the flexibility and effectiveness of our proposed MCGAN. Numerical results show that the proposed MCGAN is versatile in enhancing a variety of backbone GAN models and achieves consistent and significant improvement in terms of quality, accuracy, training stability, and learned latent space.

Baoren Xiao, Hao Ni, Weixin Yang (12/24/2024)

arXiv:2412.11526v2 Announce Type: replace Abstract: Machine learning (ML) has emerged as a powerful tool for tackling complex regression and classification tasks, yet its success often hinges on the quality of training data. This study introduces an ML paradigm inspired by domain knowledge of the structure of the output function, akin to physics-informed ML, but rooted in probabilistic principles rather than physical laws. The proposed approach integrates the probabilistic structure of the target variable, such as its cumulative distribution function, into the training process. This probabilistic information is obtained from historical data or estimated using structural reliability methods during experimental design. By embedding domain-specific probabilistic insights into the learning process, the technique enhances model accuracy and mitigates risks of overfitting and underfitting. Applications in regression, image denoising, and classification demonstrate the approach's effectiveness in addressing real-world problems.

Mohsen Rashki (12/24/2024)