physics.data-an

10 posts

arXiv:2503.10154v1 Announce Type: cross Abstract: Identifying model parameters from observed configurations poses a fundamental challenge in data science, especially with limited data. Recently, diffusion models have emerged as a novel paradigm in generative machine learning, capable of producing new samples that closely mimic observed data. These models learn the gradient of model probabilities, bypassing the need for cumbersome calculations of partition functions across all possible configurations. We explore whether diffusion models can enhance parameter inference by augmenting small datasets. Our findings demonstrate this potential through a synthetic task involving inverse Ising inference and a real-world application of reconstructing missing values in neural activity data. This study serves as a proof-of-concept for using diffusion models for data augmentation in physics-related problems, thereby opening new avenues in data science.
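The remark about bypassing the partition function can be illustrated with a short derivation. This is a sketch under assumptions not stated in the abstract: a Boltzmann-type model with energy $E(x)$ over a continuous variable $x$ (Ising spins themselves are discrete, so a diffusion model works with a continuous relaxation or noised version of the data). Because the normalizing constant $Z$ does not depend on $x$, it drops out of the gradient of the log-probability:

```latex
p(x) = \frac{e^{-E(x)}}{Z}, \qquad
Z = \int e^{-E(x')}\,\mathrm{d}x', \qquad
\nabla_x \log p(x) = -\nabla_x E(x) - \nabla_x \log Z = -\nabla_x E(x).
```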

Yechan Lim, Sangwon Lee, Junghyo Jo · 3/14/2025

arXiv

cond-mat.stat-mech cs.LG physics.data-an q-bio.NC

Thermodynamic Bound on Energy and Negentropy Costs of Inference in Deep Neural Networks

arXiv:2503.09980v1 Announce Type: cross Abstract: The fundamental thermodynamic bound is derived for the energy cost of inference in Deep Neural Networks (DNNs). By applying Landauer's principle, we demonstrate that the linear operations in DNNs can, in principle, be performed reversibly, whereas the non-linear activation functions impose an unavoidable energy cost. The resulting theoretical lower bound on the inference energy is determined by the average number of neurons undergoing a state transition for each inference. We also restate the thermodynamic bound in terms of negentropy, a metric which is more universal than energy for assessing the thermodynamic cost of information processing. The concept of negentropy is further elaborated in the context of information processing in biological and engineered systems as well as human intelligence. Our analysis provides insight into the physical limits of DNN efficiency and suggests potential directions for developing energy-efficient AI architectures that leverage reversible analog computing.
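To give a sense of scale, Landauer's principle puts the minimum energy to erase one bit at temperature T at k_B T ln 2. The sketch below applies that to a count of neuron state transitions. The mapping of one transition to one erased bit, the function names, and the default temperature are illustrative assumptions, not the paper's derivation or its exact bound.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def landauer_bound_joules(bits_erased: float, temperature_k: float = 300.0) -> float:
    """Minimum energy (J) to erase `bits_erased` bits at `temperature_k` kelvin."""
    return bits_erased * K_B * temperature_k * math.log(2)


def inference_energy_bound(avg_transitions: float,
                           bits_per_transition: float = 1.0,
                           temperature_k: float = 300.0) -> float:
    """Hypothetical per-inference bound.

    ASSUMPTION: each neuron state transition erases `bits_per_transition`
    bits; the abstract only says the bound is set by the average number of
    transitioning neurons, so this proportionality is illustrative.
    """
    return landauer_bound_joules(avg_transitions * bits_per_transition, temperature_k)


if __name__ == "__main__":
    # One bit at room temperature, then a network with 1e6 transitions per inference.
    print(landauer_bound_joules(1.0))
    print(inference_energy_bound(1e6))
```

Under these assumptions the bound scales linearly with the number of transitions and with temperature.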

Alexei V. Tkachenko · 3/14/2025

arXiv

cs.AI cs.IT math.IT physics.data-an

Decomposing Interventional Causality into Synergistic, Redundant, and Unique Components

arXiv:2501.11447v1 Announce Type: new Abstract: We introduce a novel framework for decomposing interventional causal effects into synergistic, redundant, and unique components, building on the intuition of Partial Information Decomposition (PID) and the principle of Möbius inversion. While recent work has explored a similar decomposition of an observational measure, we argue that a proper causal decomposition must be interventional in nature. We develop a mathematical approach that systematically quantifies how causal power is distributed among variables in a system, using a recently derived closed-form expression for the Möbius function of the redundancy lattice. The formalism is then illustrated by decomposing the causal power in logic gates, cellular automata, and chemical reaction networks. Our results reveal how the distribution of causal power can be context- and parameter-dependent. This decomposition provides new insights into complex systems by revealing how causal influences are shared and combined among multiple variables, with potential applications ranging from attribution of responsibility in legal or AI systems, to the analysis of biological networks or climate models.

Abel Jansma · 1/22/2025

arXiv

physics.optics cs.CV cs.IT eess.IV math.IT physics.data-an

Information-driven design of imaging systems

arXiv:2405.20559v3 Announce Type: replace-cross Abstract: Most modern imaging systems process the data they capture computationally, either to make the measurement more interpretable for human viewing or to analyze it without a human in the loop. As a result, what matters is not how measurements appear visually, but how much information they contain. Information theory provides mathematical tools to quantify this; however, it has found limited use in imaging system design due to the challenge of developing methods that can handle the complexity of real-world measurements yet remain practical enough for widespread use. We introduce a data-driven approach for estimating the information content of imaging system measurements in order to evaluate system performance and optimize designs. Our framework requires only a dataset of experimental measurements and a means for noise characterization, enabling its use in real systems without ground truth data. We validate that these information estimates reliably predict system performance across diverse imaging modalities, including color photography, radio astronomy, lensless imaging, and label-free microscopy. We further introduce an optimization technique called Information-Driven Encoder Analysis Learning (IDEAL) for designing imaging systems that maximize information capture. This work unlocks information theory as a powerful, practical tool for analyzing and designing imaging systems across a broad range of applications. A video summarizing this work can be found at https://waller-lab.github.io/EncodingInformationWebsite/

Henry Pinkard, Leyla Kabuli, Eric Markley, Tiffany Chien, Jiantao Jiao, Laura Waller · 1/22/2025

arXiv

cs.LG cs.AI physics.data-an stat.ML

Physics of Skill Learning

arXiv:2501.12391v1 Announce Type: new Abstract: We aim to understand the physics of skill learning, i.e., how skills are learned in neural networks during training. We start by observing the Domino effect, i.e., skills are learned sequentially, and notably, some skills kick off learning right after others complete learning, similar to the sequential fall of domino cards. To understand the Domino effect and relevant behaviors of skill learning, we take a physicist's approach of abstraction and simplification. We propose three models with varying complexities -- the Geometry model, the Resource model, and the Domino model, trading between reality and simplicity. The Domino effect can be reproduced in the Geometry model, whose resource interpretation inspires the Resource model, which can be further simplified to the Domino model. These models present different levels of abstraction and simplification; each is useful to study some aspects of skill learning. The Geometry model provides interesting insights into neural scaling laws and optimizers; the Resource model sheds light on the learning dynamics of compositional tasks; the Domino model reveals the benefits of modularity. These models are not only conceptually interesting -- e.g., we show how Chinchilla scaling laws can emerge from the Geometry model -- but are also useful in practice by inspiring algorithmic development -- e.g., we show how simple algorithmic changes, motivated by these toy models, can speed up the training of deep learning models.

Ziming Liu, Yizhou Liu, Eric J. Michaud, Jeff Gore, Max Tegmark · 1/22/2025

arXiv

cs.LG physics.data-an stat.ME stat.ML

Learning dynamical systems with hit-and-run random feature maps

arXiv:2501.06661v1 Announce Type: new Abstract: We show how random feature maps can be used to forecast dynamical systems with excellent forecasting skill. We consider the tanh activation function and judiciously choose the internal weights in a data-driven manner such that the resulting features explore the nonlinear, non-saturated regions of the activation function. We introduce skip connections and construct a deep variant of random feature maps by combining several units. To mitigate the curse of dimensionality, we introduce localization where we learn local maps, employing conditional independence. Our modified random feature maps provide excellent forecasting skill for both single trajectory forecasts as well as long-time estimates of statistical properties, for a range of chaotic dynamical systems with dimensions up to 512. In contrast to other methods such as reservoir computers which require extensive hyperparameter tuning, we effectively need to tune only a single hyperparameter, and are able to achieve state-of-the-art forecast skill with much smaller networks.

Pinak Mandal, Georg A. Gottwald · 1/14/2025

arXiv

cs.LG hep-ex physics.data-an

Introduction to the Usage of Open Data from the Large Hadron Collider for Computer Scientists in the Context of Machine Learning

arXiv:2501.06896v1 Announce Type: new Abstract: Deep learning techniques have evolved rapidly in recent years, significantly impacting various scientific fields, including experimental particle physics. To effectively leverage the latest developments in computer science for particle physics, a strengthened collaboration between computer scientists and physicists is essential. As all machine learning techniques depend on the availability and comprehensibility of extensive data, clear data descriptions and commonly used data formats are prerequisites for successful collaboration. In this study, we converted open data from the Large Hadron Collider, recorded in the ROOT data format commonly used in high-energy physics, to pandas DataFrames, a well-known format in computer science. Additionally, we provide a brief introduction to the data's content and interpretation. This paper aims to serve as a starting point for future interdisciplinary collaborations between computer scientists and physicists, fostering closer ties and facilitating efficient knowledge exchange.

Timo Saala, Matthias Schott · 1/14/2025

arXiv

stat.AP cs.IT math.IT physics.data-an

On the reconstruction limits of complex networks

arXiv:2501.01437v1 Announce Type: cross Abstract: Network reconstruction consists in retrieving the -- hidden -- interaction structure of a system from empirical observations such as time series. Many reconstruction algorithms have been proposed, although less research has been devoted to describing their theoretical limitations. To this end, we adopt an information-theoretical point of view and define the reconstructability -- the fraction of structural information recoverable from data. The reconstructability depends on the true data-generating model, which is shown to set the reconstruction limit, i.e., the performance upper bound for all algorithms. We show that the reconstructability is related to various performance measures, such as the probability of error and the Jaccard similarity. In an empirical context where the true data-generating model is unknown, we introduce the reconstruction index as an approximation of the reconstructability. We find that performing model selection is crucial for the validity of the reconstruction index as a proxy of the reconstructability, and illustrate how it assesses the reconstruction limit of empirical time series and networks.
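One of the performance measures named above, the Jaccard similarity, is easy to make concrete for network reconstruction: it compares the set of true edges with the set of inferred edges. The sketch below assumes undirected, unweighted graphs given as lists of node pairs; the function name and the convention for two empty graphs are illustrative choices, not taken from the paper.

```python
from typing import Hashable, Iterable, Tuple

Edge = Tuple[Hashable, Hashable]


def jaccard_similarity(true_edges: Iterable[Edge], inferred_edges: Iterable[Edge]) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B| between two undirected edge sets."""
    # frozenset makes (u, v) and (v, u) count as the same undirected edge.
    a = {frozenset(e) for e in true_edges}
    b = {frozenset(e) for e in inferred_edges}
    union = a | b
    if not union:
        # ASSUMPTION: two empty graphs are treated as identical.
        return 1.0
    return len(a & b) / len(union)


if __name__ == "__main__":
    true_graph = [(0, 1), (1, 2), (2, 3)]
    reconstructed = [(1, 0), (2, 3), (0, 3)]
    # Two shared edges out of four distinct edges overall.
    print(jaccard_similarity(true_graph, reconstructed))  # 0.5
```

A value of 1 means the reconstructed edge set matches the true one exactly, and 0 means the two share no edges.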

Charles Murphy, Simon Lizotte, François Thibault, Vincent Thibeault, Patrick Desrosiers, Antoine Allard · 1/6/2025

arXiv

stat.ML cs.LG hep-ex physics.data-an

An information theoretic limit to data amplification

arXiv:2412.18041v1 Announce Type: cross Abstract: In recent years, generative artificial intelligence has been used to create data to support science analysis. For example, Generative Adversarial Networks (GANs) have been trained using Monte Carlo simulated input and then used to generate data for the same problem. This has the advantage that a GAN creates data in a significantly reduced computing time. N training events for a GAN can result in GN generated events, with the gain factor, G, being more than one. This appears to violate the principle that one cannot get information for free. This is not the only way to amplify data, so this process will be referred to as data amplification, which is studied using information-theoretic concepts. It is shown that a gain of greater than one is possible whilst keeping the information content of the data unchanged. This leads to a mathematical bound which only depends on the number of generated and training events. This study determines conditions on both the underlying and reconstructed probability distributions to ensure this bound. In particular, the resolution of variables in amplified data is not improved by the process, but the increase in sample size can still improve statistical significance. The bound is confirmed using computer simulation and analysis of GAN-generated data from the literature.

S. J. Watts, L. Crow · 12/25/2024

arXiv

cs.SI physics.data-an

Dynamics of Collective Information Processing for Risk Encoding in Social Networks during Crises

arXiv:2412.17342v1 Announce Type: new Abstract: Online social networks are increasingly being utilized for collective sense making and information processing in disasters. However, the underlying mechanisms that shape the dynamics of collective intelligence in online social networks during disasters are not fully understood. To bridge this gap, we examine the mechanisms of collective information processing in human networks during five threat cases including an airport power outage, hurricanes, wildfire, and blizzard, considering the temporal and spatial dimensions. Using the 13MM Twitter data generated by 5MM online users during these threats, we examined human activities, communication structures and frequency, social influence, information flow, and medium response time in social networks. The results show that the activities and structures are stable in growing networks, which lead to a stable power-law distribution of the social influence in networks. These temporally invariant patterns are not affected by people's memory and ties' strength. In addition, we observe spatially localized communication spikes and global transmission gaps in the networks. The findings could inform network intervention strategies to enable a healthy and efficient online environment, with potential long-term impact on risk communication and emergency response.

Chao Fan, Fangsheng Wu, Ali Mostafavi · 12/24/2024