physics.comp-ph

63 posts

arXiv:2503.22652v1 Announce Type: cross Abstract: Chebyshev Filtered Subspace Iteration (ChFSI) has been widely adopted for computing a small subset of extreme eigenvalues in large sparse matrices. This work introduces a residual-based reformulation of ChFSI, referred to as R-ChFSI, designed to accommodate inexact matrix-vector products while maintaining robust convergence properties. By reformulating the traditional Chebyshev recurrence to operate on residuals rather than eigenvector estimates, the R-ChFSI approach effectively suppresses the errors made in matrix-vector products, improving the convergence behaviour for both standard and generalized eigenproblems. This ability of R-ChFSI to be tolerant to inexact matrix-vector products allows one to incorporate approximate inverses for large-scale generalized eigenproblems, making the method particularly attractive where exact matrix factorizations or iterative methods become computationally expensive for evaluating inverses. It also allows us to compute the matrix-vector products in lower-precision arithmetic, allowing us to leverage modern hardware accelerators. Through extensive benchmarking, we demonstrate that R-ChFSI achieves desired residual tolerances while leveraging low-precision arithmetic. For problems with millions of degrees of freedom and thousands of eigenvalues, R-ChFSI attains final residual norms in the range of $10^{-12}$ to $10^{-14}$, even with FP32 and TF32 arithmetic, significantly outperforming standard ChFSI in similar settings. In generalized eigenproblems, where approximate inverses are used, R-ChFSI achieves residual tolerances up to ten orders of magnitude lower, demonstrating its robustness to approximation errors. Finally, R-ChFSI provides a scalable and computationally efficient alternative for solving large-scale eigenproblems in high-performance computing environments.

Nikhil Kodali, Kartick Ramakrishnan, Phani Motamarri (3/31/2025)
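
For readers unfamiliar with the baseline that the abstract above builds on, the following is a minimal sketch of a standard (unscaled) Chebyshev filter and one filtered subspace iteration step for a real symmetric matrix. It is not the residual-based R-ChFSI reformulation from the paper. The function names `chebyshev_filter` and `chfsi_step`, and the assumption that the bounds `a` and `b` of the unwanted part of the spectrum are already known, are illustrative choices made here.

```python
import numpy as np


def chebyshev_filter(A, X, degree, a, b):
    """Apply T_degree((A - c*I) / e) to the columns of X (degree >= 1).

    [a, b] is the interval of unwanted eigenvalues. It is mapped to [-1, 1],
    where Chebyshev polynomials are bounded by 1 in magnitude, so components
    along eigenvalues outside [a, b] are amplified relative to those inside.
    """
    e = 0.5 * (b - a)  # half-width of the damped interval
    c = 0.5 * (b + a)  # centre of the damped interval
    Y_prev = X
    Y = (A @ X - c * X) / e  # degree-1 term
    for _ in range(2, degree + 1):
        # Three-term recurrence T_{k+1}(t) = 2 t T_k(t) - T_{k-1}(t)
        Y_next = 2.0 * (A @ Y - c * Y) / e - Y_prev
        Y_prev, Y = Y, Y_next
    return Y


def chfsi_step(A, X, degree, a, b):
    """One iteration: filter, orthonormalize, then Rayleigh-Ritz projection."""
    Q, _ = np.linalg.qr(chebyshev_filter(A, X, degree, a, b))
    H = Q.T @ A @ Q  # projected (small) symmetric matrix
    ritz_values, V = np.linalg.eigh(H)
    return ritz_values, Q @ V  # Ritz values and Ritz vectors
```

Every product `A @ Y` in the recurrence is a matrix-vector (block) product, which is the operation the abstract proposes to evaluate inexactly or in reduced precision.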

arXiv:2503.22528v1 Announce Type: new Abstract: We introduce MixFunn, a novel neural network architecture designed to solve differential equations with enhanced precision, interpretability, and generalization capability. The architecture comprises two key components: the mixed-function neuron, which integrates multiple parameterized nonlinear functions to improve representational flexibility, and the second-order neuron, which combines a linear transformation of its inputs with a quadratic term to capture cross-combinations of input variables. These features significantly enhance the expressive power of the network, enabling it to achieve comparable or superior results with drastically fewer parameters and a reduction of up to four orders of magnitude compared to conventional approaches. We applied MixFunn in a physics-informed setting to solve differential equations in classical mechanics, quantum mechanics, and fluid dynamics, demonstrating its effectiveness in achieving higher accuracy and improved generalization to regions outside the training domain relative to standard machine learning models. Furthermore, the architecture facilitates the extraction of interpretable analytical expressions, offering valuable insights into the underlying solutions.

Tiago de Souza Farias, Gubio Gomes de Lima, Jonas Maziero, Celso Jorge Villas-Boas (3/31/2025)

arXiv:2503.09625v1 Announce Type: cross Abstract: This paper presents a data-driven framework for learning optimal second-order total variation diminishing (TVD) flux limiters via differentiable simulations. In our fully differentiable finite volume solvers, the limiter functions are replaced by neural networks. By representing the limiter as a pointwise convex linear combination of the Minmod and Superbee limiters, we enforce both second-order accuracy and TVD constraints at all stages of training. Our approach leverages gradient-based optimization through automatic differentiation, allowing a direct backpropagation of errors from numerical solutions to the limiter parameters. We demonstrate the effectiveness of this method on various hyperbolic conservation laws, including the linear advection equation, the Burgers' equation, and the one-dimensional Euler equations. Remarkably, a limiter trained solely on linear advection exhibits strong generalizability, surpassing the accuracy of most classical flux limiters across a range of problems with shocks and discontinuities. The learned flux limiters can be readily integrated into existing computational fluid dynamics codes, and the proposed methodology also offers a flexible pathway to systematically develop and optimize flux limiters for complex flow problems.

Chenyang Huang, Amal S. Sebastian, Venkatasubramanian Viswanathan (3/14/2025)
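
The convex-combination idea in the abstract above can be sketched as follows. Minmod and Superbee are the lower and upper boundaries of Sweby's second-order TVD region, so any pointwise blend with a weight in [0, 1] stays inside it. The function `theta` below is a hypothetical stand-in (a sigmoid of a linear function with made-up parameters `w` and `b`) for the neural network used in the paper, whose actual architecture is not described here.

```python
import numpy as np


def minmod(r):
    """Minmod limiter: lower boundary of the second-order TVD region."""
    return np.maximum(0.0, np.minimum(1.0, r))


def superbee(r):
    """Superbee limiter: upper boundary of the second-order TVD region."""
    return np.maximum(0.0, np.maximum(np.minimum(2.0 * r, 1.0),
                                      np.minimum(r, 2.0)))


def theta(r, w=1.0, b=0.0):
    """Hypothetical blending weight in (0, 1); stands in for a trained network."""
    return 1.0 / (1.0 + np.exp(-(w * r + b)))


def blended_limiter(r, w=1.0, b=0.0):
    """Pointwise convex combination of Minmod and Superbee."""
    t = theta(r, w, b)
    return (1.0 - t) * minmod(r) + t * superbee(r)
```

Because both limiters equal 1 at r = 1 and 0 for r <= 0, the blend inherits those properties for any choice of the weight parameters, which is what lets the constraints hold throughout training.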

arXiv:2503.10021v1 Announce Type: new Abstract: We propose a general framework for the Discontinuous Galerkin-induced Neural Network (DGNet) inspired by the Interior Penalty Discontinuous Galerkin Method (IPDGM). In this approach, the trial space consists of piecewise neural network space defined over the computational domain, while the test function space is composed of piecewise polynomials. We demonstrate the advantages of DGNet in terms of accuracy and training efficiency across several numerical examples, including stationary and time-dependent problems. Specifically, DGNet easily handles high perturbations, discontinuous solutions, and complex geometric domains.

Guanyu Chen, Shengze Xu, Dong Ni, Tieyong Zeng (3/14/2025)

arXiv:2503.09998v1 Announce Type: new Abstract: We numerically investigate the sensitivity of the scattered wave field to perturbations in the shape of a scattering body illuminated by an incident plane wave. This study is motivated by recent work on the inverse problem of reconstructing a scatterer shape from measurements of the scattered wave at large distances from the scatterer. For this purpose we consider star-shaped scatterers represented using cubic splines, and our approach is based on a Nyström method-based discretisation of the shape derivative. Using the singular value decomposition, we identify fundamental geometric modes that most strongly influence the scattered wave, providing insight into the most visible boundary features in scattering data.

Erik García Neefjes, Stuart C. Hawkins (3/14/2025)

arXiv:2503.10492v1 Announce Type: cross Abstract: While machine learning holds great promise for quantum technologies, most current methods focus on predicting or controlling a specific quantum system. Meta-learning approaches, however, can adapt to new systems for which little data is available, by leveraging knowledge obtained from previous data associated with similar systems. In this paper, we meta-learn dynamics and characteristics of closed and open two-level systems, as well as the Heisenberg model. Based on experimental data of a Loss-DiVincenzo spin-qubit hosted in a Ge/Si core/shell nanowire for different gate voltage configurations, we predict qubit characteristics, i.e., the $g$-factor and Rabi frequency, using meta-learning. The algorithm we introduce improves upon previous state-of-the-art meta-learning methods for physics-based systems by introducing novel techniques such as adaptive learning rates and a global optimizer for improved robustness and increased computational efficiency. We benchmark our method against other meta-learning methods, a vanilla transformer, and a multilayer perceptron, and demonstrate improved performance.

Lucas Schorling, Pranav Vaidhyanathan, Jonas Schuff, Miguel J. Carballido, Dominik Zumbühl, Gerard Milburn, Florian Marquardt, Jakob Foerster, Michael A. Osborne, Natalia Ares (3/14/2025)

arXiv:2501.08998v2 Announce Type: replace-cross Abstract: Determining whether a candidate crystalline material is thermodynamically stable depends on identifying its true ground-state structure, a central challenge in computational materials science. We introduce CrystalGRW, a diffusion-based generative model on Riemannian manifolds that proposes novel crystal configurations and can predict stable phases validated by density functional theory. The crystal properties, such as fractional coordinates, atomic types, and lattice matrices, are represented on suitable Riemannian manifolds, ensuring that new predictions generated through the diffusion process preserve the periodicity of crystal structures. We incorporate an equivariant graph neural network to also account for rotational and translational symmetries during the generation process. CrystalGRW demonstrates the ability to generate realistic crystal structures that are close to their ground states with accuracy comparable to existing models, while also enabling conditional control, such as specifying a desired crystallographic point group. These features help accelerate materials discovery and inverse design by offering stable, symmetry-consistent crystal candidates for experimental validation.

Krit Tangsongcharoen, Teerachote Pakornchote, Chayanon Atthapak, Natthaphon Choomphon-anomakhun, Annop Ektarawong, Björn Alling, Christopher Sutton, Thiti Bovornratanaraks, Thiparat Chotibut (3/10/2025)

arXiv:2503.04870v1 Announce Type: cross Abstract: Machine learning in materials science faces challenges due to limited experimental data, as generating synthesis data is costly and time-consuming, especially with in-house experiments. Mining data from existing literature introduces issues like mixed data quality, inconsistent formats, and variations in reporting experimental parameters, complicating the creation of consistent features for the learning algorithm. Additionally, combining continuous and discrete features can hinder the learning process with limited data. Here, we propose strategies that utilize large language models (LLMs) to enhance machine learning performance on a limited, heterogeneous dataset of graphene chemical vapor deposition synthesis compiled from existing literature. These strategies include prompting modalities for imputing missing data points and leveraging large language model embeddings to encode the complex nomenclature of substrates reported in chemical vapor deposition experiments. The proposed strategies enhance graphene layer classification using a support vector machine (SVM) model, increasing binary classification accuracy from 39% to 65% and ternary accuracy from 52% to 72%. We compare the performance of the SVM and a GPT-4 model, both trained and fine-tuned on the same data. Our results demonstrate that the numerical classifier, when combined with LLM-driven data enhancements, outperforms the standalone LLM predictor, highlighting that in data-scarce scenarios, improving predictive learning with LLM strategies requires more than simple fine-tuning on datasets. Instead, it necessitates sophisticated approaches for data imputation and feature space homogenization to achieve optimal performance. The proposed strategies emphasize data enhancement techniques, offering a broadly applicable framework for improving machine learning performance on scarce, inhomogeneous datasets.

Devi Dutta Biswajeet, Sara Kadkhodaei (3/10/2025)

arXiv:2503.05557v1 Announce Type: cross Abstract: We present a high-order, sharp-interface method for simulation of two-phase flow of real gases using implicit shock tracking. The method is based on a phase-field formulation of two-phase, compressible, inviscid flow with a trivial mixture model. Implicit shock tracking is a high-order, optimization-based discontinuous Galerkin method that automatically aligns mesh faces with non-smooth flow features to represent them perfectly with inter-element jumps. It is used to accurately approximate shocks and rarefactions without stabilization and converge the phase-field solution to a sharp interface one by aligning mesh faces with the material interface. Time-dependent problems are formulated as steady problems in a space-time domain where complex wave interactions (e.g., intersections and reflections) manifest as space-time triplet points. The space-time formulation avoids complex re-meshing and solution transfer that would be required to track moving waves with mesh faces using the method of lines. The approach is applied to several two-phase flow Riemann problems involving gases with ideal, stiffened gas, and Becker-Kistiakowsky-Wilson (BKW) equations of state, including a spherically symmetric underwater explosion problem. In all cases, the method aligns element faces with all shocks (including secondary shocks that form at time t > 0), rarefactions, and material interfaces, and accurately resolves the flow field on coarse space-time grids.

Charles Naudet, Brian Taylor, Matthew J. Zahr (3/10/2025)

arXiv:2411.02126v2 Announce Type: replace Abstract: In real-world data, information is stored in extremely large feature vectors. These variables are typically correlated due to complex interactions involving many features simultaneously. Such correlations qualitatively correspond to semantic roles and are naturally recognized by both the human brain and artificial neural networks. This recognition enables, for instance, the prediction of missing parts of an image or text based on their context. We present a method to detect these correlations in high-dimensional data represented as binary numbers. We estimate the binary intrinsic dimension of a dataset, which quantifies the minimum number of independent coordinates needed to describe the data, and is therefore a proxy of semantic complexity. The proposed algorithm is largely insensitive to the so-called curse of dimensionality, and can therefore be used in big data analysis. We test this approach by identifying phase transitions in model magnetic systems, and we then apply it to the detection of semantic correlations of images and text inside deep neural networks.

Santiago Acevedo, Alex Rodriguez, Alessandro Laio (3/10/2025)

arXiv:2503.02407v2 Announce Type: replace-cross Abstract: Symmetry rules that atoms obey when they bond together to form an ordered crystal play a fundamental role in determining their physical, chemical, and electronic properties such as electrical and thermal conductivity, optical and polarization behavior, and mechanical strength. Almost all known crystalline materials have internal symmetry. Consistently generating stable crystal structures is still an open challenge, specifically because such symmetry rules are not accounted for. To address this issue, we propose WyFormer, a generative model for materials conditioned on space group symmetry. We use Wyckoff positions as the basis for an elegant, compressed, and discrete structure representation. To model the distribution, we develop a permutation-invariant autoregressive model based on the Transformer and an absence of positional encoding. WyFormer has a unique and powerful synergy of attributes, proven by extensive experimentation: best-in-class symmetry-conditioned generation, physics-motivated inductive bias, competitive stability of the generated structures, competitive material property prediction quality, and unparalleled inference speed.

Nikita Kazeev, Wei Nong, Ignat Romanov, Ruiming Zhu, Andrey Ustyuzhanin, Shuya Yamazaki, Kedar Hippalgaonkar (3/10/2025)

arXiv:2503.04901v1 Announce Type: cross Abstract: Multiscale homogenization of woven composites requires detailed micromechanical evaluations, leading to high computational costs. Data-driven surrogate models based on neural networks address this challenge but often suffer from big data requirements, limited interpretability, and poor extrapolation capabilities. This study introduces a Hierarchical Physically Recurrent Neural Network (HPRNN) employing two levels of surrogate modeling. First, Physically Recurrent Neural Networks (PRNNs) are trained to capture the nonlinear elasto-plastic behavior of warp and weft yarns using micromechanical data. In a second scale transition, a physics-encoded meso-to-macroscale model integrates these yarn surrogates with the matrix constitutive model, embedding physical properties directly into the latent space. Adopting HPRNNs for both scale transitions can avoid nonphysical behavior often observed in predictions from pure data-driven recurrent neural networks and transformer networks. This results in better generalization under complex cyclic loading conditions. The framework offers a computationally efficient and explainable solution for multiscale modeling of woven composites.

Ehsan Ghane, Marina A. Maia, Iuri B. C. M. Rocha, Martin Fagerström, Mohsen Mirakhalaf (3/10/2025)

arXiv:2501.10594v1 Announce Type: cross Abstract: Accurate determination of the equation of state of dense hydrogen is essential for understanding gas giants. Currently, there is still no consensus on methods for calculating its entropy, which play a fundamental role and can result in qualitatively different predictions for Jupiter's interior. Here, we investigate various aspects of entropy calculation for dense hydrogen based on ab initio molecular dynamics simulations. Specifically, we employ the recently developed flow matching method to validate the accuracy of the traditional thermodynamic integration approach. We then clearly identify pitfalls in previous attempts and propose a reliable framework for constructing the hydrogen equation of state, which is accurate and thermodynamically consistent across a wide range of temperature and pressure conditions. This allows us to conclusively address the long-standing discrepancies in Jupiter's adiabat among earlier studies, demonstrating the potential of our approach for providing reliable equations of state of diverse materials.

Hao Xie, Saburo Howard, Guglielmo Mazzola (1/22/2025)

arXiv:2501.12222v1 Announce Type: cross Abstract: We used our AI search engine (InvDesFlow) to search extensively for superconducting hydrides that are stable under ambient conditions. A cubic structure Li$_2$AuH$_6$ with Au-H octahedral motifs is identified as a candidate. After performing thermodynamic analysis, we provide a feasible route to experimentally synthesize this material from the known LiAu and LiH compounds under ambient pressure. Further first-principles calculations suggest that Li$_2$AuH$_6$ has a high superconducting transition temperature ($T_c$) of $\sim$ 140 K under ambient pressure. The H-1$s$ electrons couple strongly with phonon modes of vibrations of the Au-H octahedra as well as vibrations of the Li atoms; the latter have not been taken seriously in previous similar cases. Hence, in contrast to previous claims that high-$T_c$ superconductors should be sought through metallic covalent bonds, we emphasize here the importance of phonon modes with strong electron-phonon coupling (EPC). We suggest that one can intercalate atoms into binary or ternary hydrides to introduce more potential phonon modes with strong EPC, which is an effective approach to finding high-$T_c$ superconductors within multicomponent compounds.

Zhenfeng Ouyang, Bo-Wen Yao, Xiao-Qi Han, Peng-Jie Guo, Ze-Feng Gao, Zhong-Yi Lu (1/22/2025)

arXiv:2404.10863v2 Announce Type: replace-cross Abstract: Hypo-elastoplasticity is a framework suitable for modeling the mechanics of many hard materials that have small elastic deformation and large plastic deformation. In most laboratory tests for these materials the Cauchy stress is in quasi-static equilibrium. Rycroft et al. discovered a mathematical correspondence between this physical system and the incompressible Navier-Stokes equations, and developed a projection method similar to Chorin's projection method (1968) for incompressible Newtonian fluids. Here, we improve the original projection method to simulate quasi-static hypo-elastoplasticity, by making three improvements. First, drawing inspiration from the second-order projection method for incompressible Newtonian fluids, we formulate a second-order in time numerical scheme for quasi-static hypo-elastoplasticity. Second, we implement a finite element method for solving the elliptic equations in the projection step, which provides both numerical benefits and flexibility. Third, we develop an adaptive global time-stepping scheme, which can compute accurate solutions in fewer timesteps. Our numerical tests use an example physical model of a bulk metallic glass based on the shear transformation zone theory, but the numerical methods can be applied to any elastoplastic material.

Jiayin Lu, Chris H. Rycroft (1/22/2025)

arXiv:2408.02161v2 Announce Type: replace-cross Abstract: The added value of machine learning for weather and climate applications is measurable through performance metrics, but explaining it remains challenging, particularly for large deep learning models. Inspired by climate model hierarchies, we propose that a full hierarchy of Pareto-optimal models, defined within an appropriately determined error-complexity plane, can guide model development and help understand the models' added value. We demonstrate the use of Pareto fronts in atmospheric physics through three sample applications, with hierarchies ranging from semi-empirical models with minimal parameters to deep learning algorithms. First, in cloud cover parameterization, we find that neural networks identify nonlinear relationships between cloud cover and its thermodynamic environment, and assimilate previously neglected features such as vertical gradients in relative humidity that improve the representation of low cloud cover. This added value is condensed into a ten-parameter equation that rivals deep learning models. Second, we establish a machine learning model hierarchy for emulating shortwave radiative transfer, distilling the importance of bidirectional vertical connectivity for accurately representing absorption and scattering, especially for multiple cloud layers. Third, we emphasize the importance of convective organization information when modeling the relationship between tropical precipitation and its surrounding environment. We discuss the added value of temporal memory when high-resolution spatial information is unavailable, with implications for precipitation parameterization. Therefore, by comparing data-driven models directly with existing schemes using Pareto optimality, we promote process understanding by hierarchically unveiling system complexity, with the hope of improving the trustworthiness of machine learning models in atmospheric applications.

Tom Beucler, Arthur Grundner, Sara Shamekh, Peter Ukkonen, Matthew Chantry, Ryan Lagerquist (1/22/2025)

arXiv:2501.12149v1 Announce Type: cross Abstract: Density functional theory (DFT) is probably the most promising approach for quantum chemistry calculations, considering its good balance between calculation precision and speed. In recent years, several neural network-based functionals have been developed for exchange-correlation energy approximation in DFT, DM21 developed by Google DeepMind being the most notable among them. This study focuses on evaluating the efficiency of the DM21 functional in predicting molecular geometries, with a focus on the influence of oscillatory behavior in neural network exchange-correlation functionals. We implemented geometry optimization in PySCF for the DM21 functional, compared its performance with traditional functionals, and tested it on various benchmarks. Our findings reveal both the potential and the current challenges of using neural network functionals for geometry optimization in DFT. We propose a solution that extends the practical applicability of such functionals and allows new substances to be modeled with their help.

Kirill Kulaev, Alexander Ryabov, Michael Medvedev, Evgeny Burnaev, Vladimir Vanovskiy (1/22/2025)

arXiv:2406.00047v3 Announce Type: replace-cross Abstract: A central problem in quantum mechanics involves solving the Electronic Schrödinger Equation for a molecule or material. The Variational Monte Carlo approach to this problem approximates a particular variational objective via sampling, and then optimizes this approximated objective over a chosen parameterized family of wavefunctions, known as the ansatz. Recently neural networks have been used as the ansatz, with accompanying success. However, sampling from such wavefunctions has required the use of a Markov Chain Monte Carlo approach, which is inherently inefficient. In this work, we propose a solution to this problem via an ansatz which is cheap to sample from, yet satisfies the requisite quantum mechanical properties. We prove that a normalizing flow using the following two essential ingredients satisfies our requirements: (a) a base distribution which is constructed from Determinantal Point Processes; (b) flow layers which are equivariant to a particular subgroup of the permutation group. We then show how to construct both continuous and discrete normalizing flows which satisfy the requisite equivariance. We further demonstrate the manner in which the non-smooth nature ("cusps") of the wavefunction may be captured, and how the framework may be generalized to provide induction across multiple molecules. The resulting theoretical framework entails an efficient approach to solving the Electronic Schrödinger Equation.

Daniel Freedman, Eyal Rozenberg, Alex Bronstein (1/22/2025)

arXiv:2411.07422v3 Announce Type: replace Abstract: The numerical flux determines the performance of numerical methods for solving hyperbolic partial differential equations (PDEs). In this work, we compare a selection of 8 numerical fluxes in the framework of nonlinear semidiscrete finite volume (FV) schemes, based on Weighted Essentially Non-Oscillatory (WENO) spatial reconstruction and Deferred Correction (DeC) time discretization. The methodology is implemented and systematically assessed for order of accuracy in space and time up to seven. The numerical fluxes selected in the present study represent the two existing classes of fluxes, namely centred and upwind. Centred fluxes do not explicitly use wave propagation information, while upwind fluxes do so from the solution of the Riemann problem via a wave model containing $A$ waves. Upwind fluxes include two subclasses: complete and incomplete fluxes. For complete upwind fluxes, $A=E$, where $E$ is the number of characteristic fields in the exact problem. For incomplete upwind ones, $A<E$. Our study is conducted for the one- and two-dimensional Euler equations, for which we consider the following numerical fluxes: Lax-Friedrichs (LxF), First-Order Centred (FORCE), Rusanov (Rus), Harten-Lax-van Leer (HLL), Central-Upwind (CU), Low-Dissipation Central-Upwind (LDCU), HLLC, and the flux computed through the exact Riemann solver (Ex.RS). We find that the numerical flux has an effect on the performance of the methods. The magnitude of the effect depends on the type of numerical flux and on the order of accuracy of the scheme. It also depends on the type of problem; that is, whether the solution is smooth or discontinuous, whether discontinuities are linear or nonlinear, whether linear discontinuities are fast- or slowly-moving, and whether the solution is evolved for a short or long time.

Lorenzo Micalizzi, Eleuterio F. Toro (1/22/2025)
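
To make the notion of a numerical flux in the abstract above concrete, here is a minimal sketch of two of the listed fluxes, Lax-Friedrichs and Rusanov, in their textbook first-order form. It uses the scalar inviscid Burgers' equation as a stand-in, not the Euler equations or the WENO-DeC framework studied in the paper; the function names and the default `wave_speed=abs` (valid because the characteristic speed of Burgers' equation is u itself) are assumptions made for illustration.

```python
def burgers_flux(u):
    """Physical flux f(u) = u^2 / 2 of the inviscid Burgers' equation."""
    return 0.5 * u * u


def lax_friedrichs(u_left, u_right, dx, dt, f=burgers_flux):
    """Lax-Friedrichs flux: central average plus dissipation scaled by dx/dt."""
    return 0.5 * (f(u_left) + f(u_right)) - 0.5 * (dx / dt) * (u_right - u_left)


def rusanov(u_left, u_right, f=burgers_flux, wave_speed=abs):
    """Rusanov flux: dissipation scaled by the largest local wave speed."""
    s_max = max(wave_speed(u_left), wave_speed(u_right))
    return 0.5 * (f(u_left) + f(u_right)) - 0.5 * s_max * (u_right - u_left)
```

Both functions reduce to the physical flux when the left and right states coincide (consistency). Under a CFL-stable time step, dx/dt is at least as large as the local wave speed, so the Lax-Friedrichs dissipation term is at least as large as the Rusanov one.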

arXiv:2312.09215v3 Announce Type: replace-cross Abstract: Chebyshev polynomials have shown significant promise as an efficient tool for both classical and quantum neural networks to solve linear and nonlinear differential equations. In this work, we adapt and generalize this framework in a quantum machine learning setting for a variety of problems, including the 2D Poisson's equation, a second-order linear differential equation, a system of differential equations, and the nonlinear Duffing and Riccati equations. In particular, we propose in the quantum setting a modified Self-Adaptive Physics-Informed Neural Network (SAPINN) approach, where self-adaptive weights are applied to problems with multi-objective loss functions. We further explore capturing correlations in our loss function using a quantum-correlated measurement, resulting in improved accuracy for initial value problems. We also analyse the use of entangling layers and their impact on the solution accuracy for second-order differential equations. The results indicate a promising approach to the near-term evaluation of differential equations on quantum devices.

Abhishek Setty, Rasul Abdusalamov, Felix Motzoi (1/22/2025)