cs.FL
24 posts
arXiv:2501.05830v1 Announce Type: cross Abstract: We investigate the lengths and starting positions of the longest monochromatic arithmetic progressions for a fixed difference in the Fibonacci word. We provide a complete classification for their lengths in terms of a simple formula. Our strongest results are proved using methods from dynamical systems, especially the dynamics of circle rotations. We also employ computer-based methods in the form of the automatic theorem-proving software Walnut. This allows us to extend recent results concerning similar questions for the Thue-Morse sequence and the Rudin-Shapiro sequence. This also allows us to obtain some results for the Fibonacci word that do not seem to be amenable to dynamical methods.
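To make the object studied in the abstract above concrete, here is a minimal brute-force sketch in Python. It assumes the usual definition of the Fibonacci word as the fixed point of the substitution 0 -> 01, 1 -> 0, and it only inspects a finite prefix, so the value it returns is a lower bound for the infinite word. The function names `fibonacci_word` and `longest_mono_ap` are illustrative and do not come from the paper, which relies on circle rotations and Walnut instead.

```python
def fibonacci_word(n):
    """First n letters of the Fibonacci word (fixed point of 0 -> 01, 1 -> 0)."""
    w = "0"
    while len(w) < n:
        w = "".join("01" if c == "0" else "0" for c in w)
    return w[:n]


def longest_mono_ap(word, d):
    """Longest monochromatic arithmetic progression with difference d in `word`.

    Returns (length, start_index) of the first longest progression found.
    Only the given finite prefix is examined.
    """
    best_len, best_start = 0, -1
    for r in range(min(d, len(word))):
        prev, run_len, run_start = None, 0, r
        # Walk the positions r, r + d, r + 2d, ... and track runs of equal letters.
        for i in range(r, len(word), d):
            if word[i] == prev:
                run_len += 1
            else:
                prev, run_len, run_start = word[i], 1, i
            if run_len > best_len:
                best_len, best_start = run_len, run_start
    return best_len, best_start


prefix = fibonacci_word(1000)
for d in range(1, 6):
    print(d, longest_mono_ap(prefix, d))
```

For difference 1 the search simply finds the first occurrence of the factor 00, since the Fibonacci word contains neither 11 nor 000.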
arXiv:2501.06579v1 Announce Type: new Abstract: We consider the problem of refuting equivalence of probabilistic programs, i.e., the problem of proving that two probabilistic programs induce different output distributions. We study this problem in the context of programs with conditioning (i.e., with observe and score statements), where the output distribution is conditioned by the event that all the observe statements along a run evaluate to true, and where the probability densities of different runs may be updated via the score statements. Building on a recent work on programs without conditioning, we present a new equivalence refutation method for programs with conditioning. Our method is based on weighted restarting, a novel transformation of probabilistic programs with conditioning to the output equivalent probabilistic programs without conditioning that we introduce in this work. Our method is the first to be both a) fully automated, and b) providing provably correct answers. We demonstrate the applicability of our method on a set of programs from the probabilistic inference literature.
arXiv:2501.07428v1 Announce Type: new Abstract: The set of finite words over a well-quasi-ordered set is itself well-quasi-ordered. This seminal result by Higman is a cornerstone of the theory of well-quasi-orderings and has found numerous applications in computer science. However, this result is based on a specific choice of ordering on words, the (scattered) subword ordering. In this paper, we describe to what extent other natural orderings (prefix, suffix, and infix) on words can be used to derive Higman-like theorems. More specifically, we are interested in characterizing languages of words that are well-quasi-ordered under these orderings. We show that a simple characterization is possible for the prefix and suffix orderings, and that under extra regularity assumptions, this also extends to the infix ordering. We furthermore provide decision procedures for a large class of languages, that contains regular and context-free languages.
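The four orderings named in the abstract above can be sketched in a few lines of Python. The example language a*b is our own illustration, not taken from the paper: its words are pairwise incomparable under the prefix ordering, so it is an infinite antichain there, while they form a chain under the subword ordering.

```python
def is_prefix(u, v):
    """u is a prefix of v."""
    return v.startswith(u)


def is_suffix(u, v):
    """u is a suffix of v."""
    return v.endswith(u)


def is_infix(u, v):
    """u occurs in v as a contiguous block (factor)."""
    return u in v


def is_subword(u, v):
    """u is a scattered subword (subsequence) of v."""
    it = iter(v)
    # `c in it` consumes the iterator up to the first match, preserving order.
    return all(c in it for c in u)


# Sample of the language a*b: a^n b for n = 0..5.
words = ["a" * n + "b" for n in range(6)]
pairs = [(u, v) for i, u in enumerate(words) for v in words[i + 1:]]
print(any(is_prefix(u, v) for u, v in pairs))   # no two distinct words are prefix-comparable
print(all(is_subword(u, v) for u, v in pairs))  # every shorter word embeds into every longer one
```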
arXiv:2412.04970v2 Announce Type: replace Abstract: We show that given a graph G we can CMSO-transduce its modular decomposition, its split decomposition and its bi-join decomposition. This improves results by Courcelle [Logical Methods in Computer Science, 2006] who gave such transductions using order-invariant MSO, a strictly more expressive logic than CMSO. Our methods more generally yield C2MSO-transductions of the canonical decomposition of weakly-partitive set systems and weakly-bipartitive systems of bipartitions.
arXiv:2405.08171v5 Announce Type: replace Abstract: A transducer is finite-valued if for some bound k, it maps any given input to at most k outputs. For classical, one-way transducers, it is known since the 80s that finite valuedness entails decidability of the equivalence problem. This decidability result is in contrast to the general case, which makes finite-valued transducers very attractive. For classical transducers, it is also known that finite valuedness is decidable and that any k-valued finite transducer can be decomposed as a union of k single-valued finite transducers. In this paper, we extend the above results to copyless streaming string transducers (SSTs), answering questions raised by Alur and Deshmukh in 2011. SSTs strictly extend the expressiveness of one-way transducers via additional variables that store partial outputs. We prove that any k-valued SST can be effectively decomposed as a union of k (single-valued) deterministic SSTs. As a corollary, we obtain equivalence of SSTs and two-way transducers in the finite-valued case (those two models are incomparable in general). Another corollary is an elementary upper bound for checking equivalence of finite-valued SSTs. The latter problem was already known to be decidable, but the proof complexity was unknown (it relied on Ehrenfeucht's conjecture). Finally, our main result is that finite valuedness of SSTs is decidable. The complexity is PSpace, and even PTime when the number of variables is fixed.
arXiv:2501.03573v1 Announce Type: new Abstract: This essay discusses the connections and differences between two emerging paradigms in deep learning, namely Neural Cellular Automata (NCA) and Deep Equilibrium Models (DEQ), and trains a simple Deep Equilibrium Convolutional model to demonstrate the inherent similarity of NCA- and DEQ-based methods. Finally, this essay speculates about ways to combine theoretical and practical aspects of both approaches for future research.
arXiv:1909.12582v4 Announce Type: replace Abstract: This paper focuses on formally specifying and verifying the chain of formal semantics of the Esterel synchronous programming language using the Coq proof assistant. In particular, in addition to the standard logical (LBS) semantics, constructive semantics (CBS) and constructive state semantics (CSS), we introduce a novel microstep semantics that gets rid of the Must/Can potential function pair of the constructive semantics and can be viewed as an abstract version of Esterel's circuit semantics used by compilers to generate software code and hardware designs. Excluding the loop construct from Esterel, the paper also provides formal proofs in Coq of the equivalence between the CBS and CSS semantics and of the refinement of the CSS by the microstep semantics.
arXiv:2501.03914v1 Announce Type: new Abstract: Pomsets are a promising formalism for concurrent programs based on partially ordered sets. Among this class, series-parallel pomsets admit a convenient linear representation and can be recognized by simple algebraic structures known as pomset recognizers. Active learning consists in inferring a formal model of a recognizable language by asking membership and equivalence queries to a minimally adequate teacher (MAT). We improve existing learning algorithms for pomset recognizers by 1. introducing a new counter-example analysis procedure that is in the best case scenario exponentially more efficient than existing methods 2. adapting the state-of-the-art $L^{\lambda}$ algorithm to minimize the impact of exceedingly verbose counter-examples and remove redundant queries 3. designing a suitable finite test suite that ensures general equivalence between two pomset recognizers by extending the well-known W-method.
arXiv:2409.13629v2 Announce Type: replace Abstract: Previous work has shown that the languages recognized by average-hard attention transformers (AHATs) and softmax-attention transformers (SMATs) are within the circuit complexity class TC$^0$. However, these results assume limited-precision arithmetic: using floating-point numbers with O(log n) bits (where n is the length of the input string), Strobl showed that AHATs can be approximated in L-uniform TC$^0$, and Merrill and Sabharwal showed that SMATs can be approximated in DLOGTIME-uniform TC$^0$. Here, we improve these results, showing that AHATs with no approximation, SMATs with O(poly(n)) bits of floating-point precision, and SMATs with at most $2^{-O(poly(n))}$ absolute error are all in DLOGTIME-uniform TC$^0$.
arXiv:2501.01882v1 Announce Type: cross Abstract: We study monads in the (pseudo-)double category $\mathbf{KSW}(\mathcal{K})$ where loose arrows are Mealy automata valued in an ambient monoidal category $\mathcal{K}$, and the category of tight arrows is $\mathcal{K}$. Such monads turn out to be elegantly described through instances of semifree bicrossed products (bicrossed products of monoids, in the sense of Zappa-Szép-Takeuchi, where one factor is a free monoid). This result gives an explicit description of the 'free monad' double left adjoint to the forgetful functor. (Loose) monad maps are interesting as well, and relate to already known structures in automata theory. In parallel, we outline what double co/limits exist in $\mathbf{KSW}(\mathcal{K})$ and express in a synthetic language, based on double category theory, the bicategorical features of the Katis-Sabadini-Walters 'bicategory of circuits'.
arXiv:2411.19906v2 Announce Type: replace-cross Abstract: L-systems can be made to model and create simulations of many biological processes, such as plant development. Finding an L-system for a given process is typically solved by hand, by experts, in a massively time-consuming process. It would be significant if this could be done automatically from data, such as from sequences of images. In this paper, we are interested in inferring a particular type of L-system, deterministic context-free L-system (D0L-system) from a sequence of strings. We introduce the characteristic graph of a sequence of strings, which we then utilize to translate our problem (inferring D0L-system) in polynomial time into the maximum independent set problem (MIS) and the SAT problem. After that, we offer a classical exact algorithm and an approximate quantum algorithm for the problem.
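As a small illustration of the inference problem described in the abstract above, the following Python sketch derives a string sequence from a D0L-system and checks whether a candidate rule set reproduces a given sequence. It does not implement the paper's characteristic graph or its reductions to MIS and SAT. The helper names and the "algae" rules a -> ab, b -> a are assumptions chosen for the example.

```python
def derive(rules, axiom, steps):
    """Return [axiom, w1, ..., w_steps], rewriting every letter in parallel."""
    words = [axiom]
    for _ in range(steps):
        words.append("".join(rules[c] for c in words[-1]))
    return words


def generates(rules, sequence):
    """True if applying `rules` to each string yields the next string in `sequence`."""
    return all(
        all(c in rules for c in u) and "".join(rules[c] for c in u) == v
        for u, v in zip(sequence, sequence[1:])
    )


algae = {"a": "ab", "b": "a"}
observed = derive(algae, "a", 4)
print(observed)
print(generates(algae, observed))
print(generates({"a": "ab", "b": "b"}, observed))
```

Checking a candidate is cheap. The hard part the paper addresses is finding the rules when only the strings are known.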
arXiv:2501.00364v1 Announce Type: new Abstract: Reward machines (RMs) are an effective approach for addressing non-Markovian rewards in reinforcement learning (RL) through finite-state machines. Traditional RMs, which label edges with propositional logic formulae, inherit the limited expressivity of propositional logic. This limitation hinders the learnability and transferability of RMs since complex tasks will require numerous states and edges. To overcome these challenges, we propose First-Order Reward Machines ($\texttt{FORM}$s), which use first-order logic to label edges, resulting in more compact and transferable RMs. We introduce a novel method for $\textbf{learning}$ $\texttt{FORM}$s and a multi-agent formulation for $\textbf{exploiting}$ them and facilitating their transferability, where multiple agents collaboratively learn policies for a shared $\texttt{FORM}$. Our experimental results demonstrate the scalability of $\texttt{FORM}$s with respect to traditional RMs. Specifically, we show that $\texttt{FORM}$s can be effectively learnt for tasks where traditional RM learning approaches fail. We also show significant improvements in learning speed and task transferability thanks to the multi-agent learning framework and the abstraction provided by the first-order language.
arXiv:2501.00784v1 Announce Type: cross Abstract: In 2009 Benoit Cloitre introduced a certain self-generating sequence $$(a_n)_{n\geq 1} = 1, 1, 2, 1, 1, 1, 1, 2, 1, 1, 2, 1, 1, 2, 2, \ldots,$$ with the property that the sum of the terms appearing in the $n$'th run equals twice the $n$'th term of the sequence. We give a connection between this sequence and the paperfolding sequence, and then prove Cloitre's conjecture about the density of $1$'s appearing in $(a_n)_{n \geq 1}$.
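The defining property quoted in the abstract above can be checked directly on the listed terms. The Python sketch below verifies only the property (the sum of the $n$'th run equals twice the $n$'th term). It does not generate the sequence, and the helper names are illustrative.

```python
from itertools import groupby

# The fifteen terms quoted in the abstract.
PREFIX = [1, 1, 2, 1, 1, 1, 1, 2, 1, 1, 2, 1, 1, 2, 2]


def run_sums(seq):
    """Sum of each maximal run of equal consecutive terms."""
    return [sum(group) for _, group in groupby(seq)]


def satisfies_run_property(seq):
    """Check that the n-th run sums to twice the n-th term.

    The final run of a finite prefix may be cut off, so it is skipped.
    """
    sums = run_sums(seq)[:-1]
    return all(s == 2 * seq[n] for n, s in enumerate(sums))


print(run_sums(PREFIX))
print(satisfies_run_property(PREFIX))
```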
arXiv:2307.08780v2 Announce Type: replace Abstract: Discounting the influence of future events is a key paradigm in economics and it is widely used in computer-science models, such as games, Markov decision processes (MDPs), reinforcement learning, and automata. While a single game or MDP may allow for several different discount factors, discounted-sum automata (NDAs) were only studied with respect to a single discount factor. It is known that every class of NDAs with an integer as the discount factor has good computational properties: It is closed under determinization and under the algebraic operations min, max, addition, and subtraction, and there are algorithms for its basic decision problems, such as automata equivalence and containment. Extending the integer discount factor to an arbitrary rational number loses most of these good properties. We define and analyze nondeterministic discounted-sum automata in which each transition can have a different integral discount factor (integral NMDAs). We show that integral NMDAs with an arbitrary choice of discount factors are not closed under determinization and under algebraic operations and that their containment problem is undecidable. We then define and analyze a restricted class of integral NMDAs, which we call tidy NMDAs, in which the choice of discount factors depends on the prefix of the word read so far. Among their special cases are NMDAs that correlate discount factors to actions (alphabet letters) or to the elapsed time. We show that for every function $\theta$ that defines the choice of discount factors, the class of $\theta$-NMDAs enjoys all of the above good properties of NDAs with a single integral discount factor, as well as the same complexity of the required decision problems. Tidy NMDAs are also as expressive as deterministic integral NMDAs with an arbitrary choice of discount factors.
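To illustrate what a per-transition discount factor means, here is a hedged Python sketch that evaluates a single finite run. It assumes the common convention that a transition with weight $w_i$ and discount factor $\lambda_i > 1$ contributes $w_i / (\lambda_0 \cdots \lambda_{i-1})$ to the value of the run. The abstract does not spell out this formula, so treat it as an assumption. The function name is illustrative.

```python
from fractions import Fraction


def discounted_sum(run):
    """Value of a finite run given as a list of (weight, discount_factor) pairs.

    Assumed convention: each weight is divided by the product of the
    discount factors of all earlier transitions.
    """
    total = Fraction(0)
    scale = Fraction(1)  # 1 / (product of the discount factors seen so far)
    for weight, factor in run:
        total += weight * scale
        scale /= factor
    return total


# Three transitions of weight 1 with discount factors 2, 3, 2.
print(discounted_sum([(1, 2), (1, 3), (1, 2)]))  # 1 + 1/2 + 1/6 = 5/3
```

Exact rationals are used so that values of different runs can be compared without floating-point error.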
arXiv:2302.06420v3 Announce Type: replace Abstract: We formalized general (i.e., type-0) grammars using the Lean 3 proof assistant. We defined basic notions of rewrite rules and of words derived by a grammar, and used grammars to show closure of the class of type-0 languages under four operations: union, reversal, concatenation, and the Kleene star. The literature mostly focuses on Turing machine arguments, which are possibly more difficult to formalize. For the Kleene star, we could not follow the literature and came up with our own grammar-based construction.
arXiv:2411.07741v2 Announce Type: replace Abstract: Complex Cyber-Physical Systems (CPS) such as Unmanned Aerial Systems (UAS) have developed rapidly in recent years, but they have also become vulnerable to GPS spoofing, packet injection, buffer overflow, and other malicious attacks. Ensuring that the behavior of a UAS remains secure no matter how the environment changes is a promising direction for UAS security. This paper introduces a pattern-based framework for describing the security properties of UAS and presents a reactive-synthesis-based approach for automatically generating secure UAS controllers. First, we study the operating mechanism of UAS and construct a high-level model consisting of an actuator and a monitor. We then analyze the security threats to UAS from the hardware, software, and cyber-physical perspectives, and summarize the corresponding specification patterns of security properties as LTL formulas. With the UAS model and the security specification patterns, automata for the controller can be constructed by the General Reactivity of Rank 1 (GR(1)) synthesis algorithm, which is a two-player game between the Unmanned Aerial Vehicle (UAV) and its environment. Finally, we run experiments on the Ardupilot simulation platform to test the effectiveness of our method.
arXiv:2412.17930v1 Announce Type: cross Abstract: The paperfolding sequences form an uncountable class of infinite sequences over the alphabet $\{ -1, 1 \}$ that describe the sequence of folds arising from iterated folding of a piece of paper, followed by unfolding. In this note we observe that the sequence of run lengths in such a sequence, as well as the starting and ending positions of the $n$'th run, is $2$-synchronized and hence computable by a finite automaton. As a specific consequence, we obtain the recent results of Bunder, Bates, and Arnold, in much more generality, via a different approach. We also prove results about the critical exponent and subword complexity of these run-length sequences.
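For a concrete view of the run lengths discussed in the abstract above, the following Python sketch computes them for one member of the class, the regular paperfolding sequence. It assumes the standard closed form: write $n = 2^k m$ with $m$ odd, and the $n$'th term is $1$ if $m \equiv 1 \pmod 4$ and $-1$ otherwise. The sign convention and function names are assumptions, and the paper's automaton-based approach is not reproduced here.

```python
from itertools import groupby


def paperfolding(n):
    """n-th term (n >= 1) of the regular paperfolding sequence over {-1, 1}."""
    while n % 2 == 0:
        n //= 2  # strip the power of two, leaving the odd part m
    return 1 if n % 4 == 1 else -1


def run_lengths(seq):
    """Lengths of the maximal runs of equal consecutive terms."""
    return [len(list(group)) for _, group in groupby(seq)]


terms = [paperfolding(n) for n in range(1, 16)]
print(terms)
print(run_lengths(terms))
```

Terms at odd positions alternate in sign, so no run can be longer than 3.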
arXiv:2412.18425v1 Announce Type: cross Abstract: Two finite words are k-binomially equivalent if each subword (i.e., subsequence) of length at most k occurs the same number of times in both words. The k-binomial complexity of an infinite word is a function that maps the integer $n\geq 0$ to the number of k-binomial equivalence classes represented by its factors of length n. The Thue--Morse (TM) word and its generalization to larger alphabets are ubiquitous in mathematics due to their rich combinatorial properties. This work addresses the k-binomial complexities of generalized TM words. Prior research by Lejeune, Leroy, and Rigo determined the k-binomial complexities of the 2-letter TM word. For larger alphabets, work by Lü, Chen, Wen, and Wu determined the 2-binomial complexity for m-letter TM words, for arbitrary m, but the exact behavior for $k\geq 3$ remained unresolved. They conjectured that the k-binomial complexity function of the m-letter TM word is eventually periodic with period $m^k$. We resolve the conjecture positively by deriving explicit formulae for the k-binomial complexity functions for any generalized TM word. We do this by characterizing k-binomial equivalence among factors of generalized TM words. This comprehensive analysis not only solves the open conjecture, but also develops tools such as abelian Rauzy graphs.
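The definition of k-binomial equivalence given in the abstract above translates directly into a brute-force check. The Python sketch below counts subsequence occurrences with a standard dynamic program and compares the counts for every word of length at most k. The function names and the example words are ours, and the approach is only practical for small k and small alphabets.

```python
from itertools import product


def count_subword(word, u):
    """Number of occurrences of u in word as a (scattered) subsequence."""
    dp = [1] + [0] * len(u)  # dp[j] = number of ways to match u[:j] so far
    for c in word:
        # Go right to left so each letter of `word` is used at most once per match.
        for j in range(len(u), 0, -1):
            if u[j - 1] == c:
                dp[j] += dp[j - 1]
    return dp[len(u)]


def k_binomially_equivalent(x, y, k):
    """True if every word of length <= k occurs equally often in x and y."""
    alphabet = sorted(set(x) | set(y))
    for length in range(1, k + 1):
        for u in product(alphabet, repeat=length):
            if count_subword(x, u) != count_subword(y, u):
                return False
    return True


print(k_binomially_equivalent("abba", "baab", 2))  # True
print(k_binomially_equivalent("abba", "baab", 3))  # False: "aba" occurs 2 times vs 0 times
```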
arXiv:2402.17000v2 Announce Type: replace Abstract: Opacity is a general framework modeling security properties of systems interacting with a passive attacker. Initial-and-final-state opacity (IFO) generalizes the classical notions of opacity, such as current-state opacity and initial-state opacity. In IFO, the secret is whether the system evolved from a given initial state to a given final state or not. There are two algorithms for IFO verification. One arises from a trellis-based state estimator, which builds a semigroup of binary relations generated by the events of the automaton, and the other is based on the reduction to language inclusion. The time complexity of both algorithms is bounded by a super-exponential function, and it is a challenging open problem to find a faster algorithm or to show that no faster algorithm exists. We discuss the lower-bound time complexity for both general and special cases, and use extensive benchmarks to compare the existing algorithms.
arXiv:2412.16612v1 Announce Type: new Abstract: Vector Addition Systems with States (VASS), equivalent to Petri nets, are a well-established model of concurrency. The central algorithmic challenge in VASS is the reachability problem: is there a run from a given starting state and counter values to a given target state and counter values? When the input is encoded in binary, reachability is computationally intractable: even in dimension one, it is NP-hard. In this paper, we comprehensively characterise the tractability border of the problem when the input is encoded in unary. For our main result, we prove that reachability is NP-hard in unary encoded 3-VASS, even when the structure is heavily restricted to be a simple linear path scheme. This improves upon a recent result of Czerwiński and Orlikowski (2022), in both the number of counters and expressiveness of the considered model, and answers open questions of Englert, Lazić, and Totzke (2016) and Leroux (2021). The underlying graph structure of a simple linear path scheme (SLPS) is just a path with self-loops at each node. We also study the exceedingly weak model of computation that is SLPS with counter updates in $\{-1,0,+1\}$. Here, we show that reachability is NP-hard when the dimension is bounded by $O(\alpha(k))$, where $\alpha$ is the inverse Ackermann function and $k$ bounds the size of the SLPS. We complement our result by presenting a polynomial-time algorithm that decides reachability in 2-SLPS when the initial and target configurations are specified in binary. To achieve this, we show that reachability in such instances is well-structured: all loops, except perhaps for a constant number, are taken either polynomially many times or almost maximally. This extends the main result of Englert, Lazić, and Totzke (2016) who showed the problem is in NL when the initial and target configurations are specified in unary.
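The reachability question stated in the abstract above can be explored on toy instances with a bounded breadth-first search. The Python sketch below is not the paper's algorithm. It assumes counters must stay within `[0, bound]`, so a `False` answer only rules out runs that respect that bound. The function name and the two-state example, a path p -> q with one self-loop per node as in an SLPS, are illustrative.

```python
from collections import deque


def reachable(transitions, source, target, bound):
    """Bounded reachability in a VASS.

    transitions: list of (state, update_vector, next_state).
    source, target: configurations of the form (state, counters_tuple).
    Only configurations whose counters stay within [0, bound] are explored.
    """
    seen = {source}
    queue = deque([source])
    while queue:
        state, counters = queue.popleft()
        if (state, counters) == target:
            return True
        for p, update, q in transitions:
            if p != state:
                continue
            new = tuple(c + u for c, u in zip(counters, update))
            if all(0 <= c <= bound for c in new) and (q, new) not in seen:
                seen.add((q, new))
                queue.append((q, new))
    return False


# Two-dimensional example: a loop on p, an edge from p to q, and a loop on q.
slps = [
    ("p", (1, 1), "p"),   # loop on p increments both counters
    ("p", (0, 0), "q"),   # path edge
    ("q", (-1, 0), "q"),  # loop on q decrements the first counter
]
print(reachable(slps, ("p", (0, 0)), ("q", (0, 3)), bound=3))
print(reachable(slps, ("p", (0, 0)), ("q", (1, 0)), bound=10))
```

In this example the first counter never exceeds the second, so the configuration (q, (1, 0)) is unreachable for any bound.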