Transferlab
Pulkit Tandon, research engineer at Granica, will present his work on data selection, showing how using surrogate models to select subsamples of a data set for labeling can improve training efficiency and performance.
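As a rough illustration of the general idea (not Granica’s actual pipeline), the sketch below trains a cheap surrogate model on a small labeled seed set, scores the unlabeled pool by predictive uncertainty, and sends only the most informative points for labeling; dataset, model, and budget are synthetic placeholders.

```python
# Minimal sketch of surrogate-based data selection: a cheap surrogate model
# scores unlabeled points, and only the most uncertain ones are sent for labeling.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_pool = rng.normal(size=(10_000, 20))        # large unlabeled pool (synthetic)
y_seed = (X_pool[:500, 0] > 0).astype(int)    # small seed set with labels

# 1. Fit a cheap surrogate on the seed labels.
surrogate = LogisticRegression(max_iter=1_000).fit(X_pool[:500], y_seed)

# 2. Score the remaining pool by predictive uncertainty (entropy).
proba = surrogate.predict_proba(X_pool[500:])
entropy = -(proba * np.log(proba + 1e-12)).sum(axis=1)

# 3. Select the top-k most informative points for expensive labeling.
budget = 1_000
selected = 500 + np.argsort(entropy)[-budget:]
print("indices chosen for labeling:", selected[:10])
```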
Sridhar Chellappa will introduce the concept of reduced order modeling (ROM), a technique used in simulation and AI to reduce the complexity of mathematical models. The seminar will cover the basics of ROM and its applications, and lead up to more ML-flavoured approaches.
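To make the idea concrete, here is a minimal, hedged sketch of one classical ROM recipe, POD-Galerkin projection for a linear system; the system matrix, snapshot schedule, and reduced dimension are illustrative choices, not taken from the seminar.

```python
# Minimal sketch of projection-based reduced order modeling (POD-Galerkin)
# for a linear system dx/dt = A x, using synthetic snapshot data.
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
n, r = 200, 5                                # full and reduced dimensions
A = -np.diag(np.linspace(0.1, 2.0, n))       # stable full-order operator

# Collect snapshots of one trajectory at several times.
x0 = rng.normal(size=n)
snapshots = np.stack([expm(A * t) @ x0 for t in np.linspace(0, 5, 50)], axis=1)

# POD: the leading left singular vectors span the reduced subspace.
U, _, _ = np.linalg.svd(snapshots, full_matrices=False)
V = U[:, :r]

# Galerkin projection yields a small r x r system that is cheap to simulate.
A_r = V.T @ A @ V
x0_r = V.T @ x0
x_approx = V @ (expm(A_r * 5.0) @ x0_r)      # lift back to the full space

x_true = expm(A * 5.0) @ x0
print("relative error at t=5:", np.linalg.norm(x_approx - x_true) / np.linalg.norm(x_true))
```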
Antonio Vergari will give an overview of recent advances in tractable probabilistic inference.
Pawan Goyal, Senior AI Engineer at appliedAI, will present recent work in physics-enhanced machine learning on generalized quadratic embeddings for nonlinear dynamics.
optimagic is a Python package for numerical optimization. It is a unified interface to optimizers from SciPy, NLopt, and other packages. optimagic’s minimize function works just like SciPy’s, so you don’t have to adjust your code. You simply get more optimizers for free. On top, you get diagnostic tools, parallel numerical derivatives, and more.
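A minimal sketch of what drop-in usage could look like, assuming the SciPy-like interface described above; the exact argument names (fun, params, algorithm) and the chosen algorithm string should be checked against the optimagic documentation.

```python
# Minimal sketch comparing SciPy and optimagic on the same toy problem.
import numpy as np
from scipy.optimize import minimize as scipy_minimize
import optimagic as om

def sphere(x):
    # simple convex test function with minimum at the origin
    return np.sum(x ** 2)

x0 = np.arange(5, dtype=float)

scipy_res = scipy_minimize(sphere, x0, method="L-BFGS-B")

# Same problem through optimagic, which also exposes optimizers beyond SciPy.
om_res = om.minimize(fun=sphere, params=x0, algorithm="scipy_lbfgsb")

print(scipy_res.x)
print(om_res.params)
```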
Carles Domingo-Enrich will present his work on Stochastic Optimal Control Matching (SOCM), a novel Iterative Diffusion Optimization (IDO) technique for stochastic optimal control that stems from the same philosophy as the conditional score matching loss for diffusion models.
Jakob Wagner, Junior AI Researcher at the appliedAI Institute for Europe, will present his Master’s thesis work, conducted in collaboration with TUM, on applying neural operators to real-world acoustic problems.
The CLP-Transfer method introduces a novel approach for cross-lingual language transfer by leveraging token overlap and a small pre-trained model with the desired tokenizer, simplifying the transfer process without the need for fastText embeddings or bilingual dictionaries. Despite its practical advantages, the method’s performance on downstream tasks is limited, highlighting areas for future research and evaluation.
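A hedged sketch of the embedding initialization described above: tokens shared by both tokenizers copy their embeddings from the large source model, while new target tokens are built as similarity-weighted combinations over the overlap, with similarities taken from the small model that already uses the target tokenizer. The toy vocabularies and the softmax weighting are simplifications for illustration (the paper normalizes similarities rather than applying a softmax).

```python
# Hedged sketch of CLP-Transfer-style embedding initialization with toy data.
import numpy as np

rng = np.random.default_rng(0)
src_vocab = {"the": 0, "cat": 1, "##s": 2}                  # source tokenizer
tgt_vocab = {"the": 0, "cat": 1, "katze": 2, "##s": 3}      # desired target tokenizer

E_src_large = rng.normal(size=(len(src_vocab), 8))   # large source-model embeddings
E_tgt_small = rng.normal(size=(len(tgt_vocab), 4))   # small model with target tokenizer

E_tgt_large = np.zeros((len(tgt_vocab), E_src_large.shape[1]))
overlap = [t for t in tgt_vocab if t in src_vocab]

for tok, i in tgt_vocab.items():
    if tok in src_vocab:
        # Overlapping tokens: copy the large model's embedding directly.
        E_tgt_large[i] = E_src_large[src_vocab[tok]]
    else:
        # New tokens: combine overlap embeddings, weighted by similarity
        # measured in the small target-tokenizer model's embedding space.
        sims = np.array([E_tgt_small[i] @ E_tgt_small[tgt_vocab[o]] for o in overlap])
        w = np.exp(sims) / np.exp(sims).sum()        # softmax weights (a simplification)
        E_tgt_large[i] = sum(w_k * E_src_large[src_vocab[o]] for w_k, o in zip(w, overlap))

print(E_tgt_large.shape)   # initialization for the target-language model's embedding matrix
```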
The mesh-independent neural operator (MINO) is a fully attentional architecture for operator learning that makes it possible to represent the discretized system as set-valued data without a prior structure.
A novel approach, symmetry teleportation, speeds up convergence in gradient-based optimization by exploiting symmetries in the loss landscape to let parameters traverse large distances along a loss level set.
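The sketch below illustrates the underlying symmetry on a toy two-layer linear model: transforming (W1, W2) into (G W1, W2 G⁻¹) leaves the loss unchanged but changes the gradient norm, which is exactly the freedom that teleportation exploits. The particular G is an arbitrary choice for demonstration, not an optimized teleportation step.

```python
# Numerical illustration of the loss-level-set symmetry behind symmetry teleportation.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 32))        # inputs (dim 4, 32 samples)
Y = rng.normal(size=(2, 32))        # targets
W1, W2 = rng.normal(size=(3, 4)), rng.normal(size=(2, 3))

def loss(W1, W2):
    return 0.5 * np.mean((W2 @ W1 @ X - Y) ** 2)

def grad_norm(W1, W2, eps=1e-6):
    # Crude finite-difference gradient norm; enough for the illustration.
    g = []
    for W in (W1, W2):
        for idx in np.ndindex(W.shape):
            W[idx] += eps
            up = loss(W1, W2)
            W[idx] -= 2 * eps
            down = loss(W1, W2)
            W[idx] += eps
            g.append((up - down) / (2 * eps))
    return np.linalg.norm(g)

G = np.diag([3.0, 0.5, 1.0])        # one invertible group element (arbitrary choice)
W1_t, W2_t = G @ W1, W2 @ np.linalg.inv(G)

print("loss unchanged:", np.isclose(loss(W1, W2), loss(W1_t, W2_t)))
print("gradient norms:", grad_norm(W1, W2), grad_norm(W1_t, W2_t))
```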
This paper presents a new SBI algorithm that utilizes transformer architectures and score-based diffusion models. Unlike traditional approaches, it can estimate the posterior, the likelihood, and other arbitrary conditionals once trained. It also handles missing data, leverages known dependencies in the simulator, and performs well on common benchmarks.
Akshey Kumar, postdoctoral member of the Neuroinformatics research group at TU Vienna, will talk about BunDLe-Net, a manifold-learning algorithm that effectively preserves relevant information while abstracting away details that are irrelevant to the dynamics of a specific target variable.
This paper explores the use of Large Language Models (LLMs) to address challenges in Black Box Optimization (BBO), particularly multi-modality and task generalization. The authors propose framing BBO around sequence-based foundation models, leveraging LLMs’ capabilities to retrieve information from various modalities, resulting in superior optimization strategies.
Seungjun Lee will talk about an attention-based neural operator architecture, the Inducing Point Operator Transformer (IPOT), which addresses two challenges in solving partial differential equations (PDEs): flexibility in handling irregular, arbitrary input and output formats, and scalability to large discretizations.
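A minimal, hedged sketch of the inducing-point idea (query/key/value projections and training are omitted): a small set of M latent slots cross-attends to the N input points, and the output queries then cross-attend to those slots, reducing the quadratic attention cost to roughly O(N·M). All dimensions below are illustrative.

```python
# Sketch of inducing-point cross-attention over an irregular discretization.
import numpy as np

rng = np.random.default_rng(0)
N, M, Q, d = 5_000, 64, 300, 32      # input points, inducing points, query points, width

def cross_attention(queries, keys_values):
    # Simplified attention: keys and values share one array, no learned projections.
    scores = queries @ keys_values.T / np.sqrt(d)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ keys_values

inputs = rng.normal(size=(N, d))      # encoded irregular input discretization
inducing = rng.normal(size=(M, d))    # learned latent array (random placeholder here)
queries = rng.normal(size=(Q, d))     # encoded arbitrary output locations

latent = cross_attention(inducing, inputs)    # compress N points into M slots
outputs = cross_attention(queries, latent)    # decode at arbitrary query points
print(outputs.shape)                          # (300, 32)
```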
Language transfer enables the use of language models trained in one or more languages to initialize a new language model in another language. WECHSEL is a cross-lingual language transfer method that efficiently initializes the embedding parameters of a language model in a target language using the embedding parameters from an existing model in a source language, facilitating more efficient training in the new language.
Deep neural operators, such as DeepONet, have changed the paradigm in high-dimensional nonlinear regression, promising significant generalization and speed-up in computational engineering applications. In a recent paper, the authors investigate the use of DeepONet to infer flow fields around unseen airfoils with the aim of shape-constrained optimization, an important design problem in aerodynamics that typically taxes computational resources heavily.
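For orientation, here is a hedged sketch of the generic DeepONet forward pass: a branch net encodes the input function (e.g. an airfoil geometry sampled at fixed sensor locations), a trunk net encodes a query coordinate in the flow field, and the prediction is their dot product. Network sizes, inputs, and weights are placeholders, not the configuration used in the paper.

```python
# Minimal sketch of the DeepONet architecture with untrained placeholder weights.
import numpy as np

rng = np.random.default_rng(0)

def mlp(sizes):
    # Random weights and zero biases for each layer.
    return [(rng.normal(size=(m, n)) / np.sqrt(m), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    for i, (W, b) in enumerate(params):
        x = x @ W + b
        if i < len(params) - 1:
            x = np.tanh(x)
    return x

branch = mlp([100, 64, 32])   # encodes 100 sensor values of the input function
trunk = mlp([2, 64, 32])      # encodes a 2D query coordinate

u_sensors = rng.normal(size=(1, 100))     # one input function (e.g. airfoil geometry)
xy = np.array([[0.3, 0.1]])               # one query location in the flow field

# G(u)(x) is approximated by the dot product of branch and trunk features.
prediction = np.sum(forward(branch, u_sensors) * forward(trunk, xy), axis=-1)
print(prediction)
```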
Christian will present a proposal for dense rewards in task-oriented dialogue systems to enhance sample efficiency and discuss continual reinforcement learning of dialogue policies. Key topics include an architecture for continual learning, an extended learning environment, lifetime return optimization, and meta-reinforcement learning for hyperparameter adaptation.
Interpreting the output of neural networks is often challenging because it entails putting into words patterns that may not be easily expressible in human language. This often results in forced explanations that do not reflect the true decision-making process of the model. However, for CLIP-ViT models there is a natural way to map image features of each component of the Transformer network to text-based concepts.
Timo will introduce a framework for designing interpretable machine learning methods for science, termed “property descriptors”.
Bayesian inference is a popular tool for parameter estimation. However, the posterior distribution alone might not be sufficient for decision-making. Bayesian Amortized Decision-Making is a method that learns the expected cost of data-action pairs in order to make Bayes-optimal decisions.
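A hedged sketch of the amortization idea with a toy simulator and a quadratic cost (both stand-ins, not from the paper): simulate parameter/data pairs from the prior and likelihood, regress the cost of each data-action pair, and at decision time pick the action with the lowest predicted expected cost.

```python
# Sketch of amortized decision-making with a toy Gaussian simulator.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
actions = np.linspace(-2, 2, 21)

# 1. Simulate from the prior and the likelihood.
theta = rng.normal(size=5_000)
data = theta + 0.5 * rng.normal(size=5_000)

# 2. Training set of (data, action) -> cost, here cost(theta, a) = (theta - a)^2.
a = rng.choice(actions, size=5_000)
cost = (theta - a) ** 2
net = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=500).fit(
    np.column_stack([data, a]), cost
)

# 3. At decision time, pick the action with the lowest predicted expected cost.
x_obs = 1.3
pred = net.predict(np.column_stack([np.full_like(actions, x_obs), actions]))
print("chosen action:", actions[np.argmin(pred)])   # should lie near the posterior mean
```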