cs.DS
arXiv:2410.02234v2 Announce Type: replace Abstract: Ego-centric queries, focusing on a target vertex and its direct neighbors, are essential for various applications. Enabling such queries on graphs owned by mutually distrustful data providers, without breaching privacy, holds promise for more comprehensive results. In this paper, we propose GORAM, a graph-oriented data structure that enables efficient ego-centric queries on federated graphs with strong privacy guarantees. GORAM is built upon secure multi-party computation (MPC) and ensures that no single party can learn any sensitive information about the graph data or the querying keys during the process. However, achieving practical performance with privacy guaranteed presents a challenge. To overcome this, GORAM is designed to partition the federated graph and construct an Oblivious RAM (ORAM)-inspired index atop these partitions. This design enables each ego-centric query to process only a single partition, which can be accessed fast and securely. To evaluate the performance of GORAM, we developed a prototype querying engine on a real-world MPC framework. We conduct a comprehensive evaluation with five commonly used queries on both synthetic and real-world graphs. Our evaluation shows that all benchmark queries can be completed in just 58.1 milliseconds to 35.7 seconds, even on graphs with up to 41.6 million vertices and 1.4 billion edges. To the best of our knowledge, this represents the first instance of processing billion-scale graphs with practical performance on MPC.
arXiv:2412.19057v2 Announce Type: replace Abstract: We design a deterministic algorithm for the $(1+\epsilon)$-approximate maximum matching problem. Our primary result demonstrates that this problem can be solved in $O(\epsilon^{-6})$ semi-streaming passes, improving upon the $O(\epsilon^{-19})$ pass-complexity algorithm by [Fischer, Mitrović, and Uitto, STOC'22]. This contributes substantially toward resolving Open question 2 from [Assadi, SOSA'24]. Leveraging the framework introduced in [FMU'22], our algorithm achieves an analogous round complexity speed-up for computing a $(1+\epsilon)$-approximate maximum matching in both the Massively Parallel Computation (MPC) and CONGEST models. The data structures maintained by our algorithm are formulated using blossom notation and represented through alternating trees. This approach enables a simplified correctness analysis by treating specific components as if operating on bipartite graphs, effectively circumventing certain technical intricacies present in prior work.
arXiv:2109.08745v4 Announce Type: replace Abstract: We initiate the study of sublinear-time algorithms that access their input via an online adversarial erasure oracle. After answering each input query, such an oracle can erase $t$ input values. Our goal is to understand the complexity of basic computational tasks in extremely adversarial situations, where the algorithm's access to data is blocked during the execution of the algorithm in response to its actions. Specifically, we focus on property testing in the model with online erasures. We show that two fundamental properties of functions, linearity and quadraticity, can be tested for constant $t$ with asymptotically the same complexity as in the standard property testing model. For linearity testing, we prove tight bounds in terms of $t$, showing that the query complexity is $\Theta(\log t).$ In contrast to linearity and quadraticity, some other properties, including sortedness and the Lipschitz property of sequences, cannot be tested at all, even for $t=1$. Our investigation leads to a deeper understanding of the structure of violations of linearity and other widely studied properties. We also consider implications of our results for algorithms that are resilient to online adversarial corruptions instead of erasures.
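For orientation on the linearity-testing entry above, here is a minimal sketch of the classical BLR linearity check in the standard property testing model (no erasures). It is not the online-erasure-resilient tester from the abstract. Functions over $\{0,1\}^n$ are encoded as Python callables on `n`-bit integers, and the names `blr_linearity_test`, `parity_of_mask`, and the trial count are illustrative assumptions.

```python
import random


def blr_linearity_test(f, n, trials=50, rng=None):
    """Run `trials` rounds of the check f(x) XOR f(y) == f(x XOR y).

    Returns False as soon as one round exposes a violation, True otherwise.
    Inputs x, y are uniformly random n-bit integers (bit vectors over GF(2)).
    """
    rng = rng or random.Random(0)  # fixed seed only to make the sketch reproducible
    for _ in range(trials):
        x = rng.getrandbits(n)
        y = rng.getrandbits(n)
        if f(x) ^ f(y) != f(x ^ y):
            return False
    return True


def parity_of_mask(mask):
    """Linear function over GF(2): parity of the bits of x selected by mask."""
    return lambda x: bin(x & mask).count("1") % 2


def and_of_two_bits(x):
    """A non-linear function: AND of the two lowest bits of x."""
    return (x & 1) & ((x >> 1) & 1)


print(blr_linearity_test(parity_of_mask(0b1011), n=4))
print(blr_linearity_test(and_of_two_bits, n=4, trials=200))
```

A linear function passes every round, while a function far from linear is rejected with probability that grows with the number of rounds. In the online-erasure model the oracle may erase values after each query, which is why this plain three-query check cannot be reused as is.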
arXiv:2407.17619v2 Announce Type: replace Abstract: The graph continual release model of differential privacy seeks to produce differentially private solutions to graph problems under a stream of edge updates where new private solutions are released after each update. Thus far, previously known edge-differentially private algorithms for most graph problems including densest subgraph and matchings in the continual release setting only output real-value estimates (not vertex subset solutions) and do not use sublinear space. Instead, they rely on computing exact graph statistics on the input [FHO21,SLMVC18]. In this paper, we leverage sparsification to address the above shortcomings for edge-insertion streams. Our edge-differentially private algorithms use sublinear space with respect to the number of edges in the graph while some also achieve sublinear space in the number of vertices in the graph. In addition, for the densest subgraph problem, we also output edge-differentially private vertex subset solutions; no previous graph algorithms in the continual release model output such subsets. We make novel use of assorted sparsification techniques from the non-private streaming and static graph algorithms literature to achieve new results in the sublinear space, continual release setting. This includes algorithms for densest subgraph, maximum matching, as well as the first continual release $k$-core decomposition algorithm. We conclude with polynomial additive error lower bounds for edge-privacy in the fully dynamic setting.
arXiv:2411.06857v2 Announce Type: replace Abstract: We study algebraic properties of partition functions, particularly the location of zeros, through the lens of rapidly mixing Markov chains. The classical Lee-Yang program initiated the study of phase transitions via locating complex zeros of partition functions. Markov chains, besides serving as algorithms, have also been used to model physical processes tending to equilibrium. In many scenarios, rapid mixing of Markov chains coincides with the absence of phase transitions (complex zeros). Prior works have shown that the absence of phase transitions implies rapid mixing of Markov chains. We reveal a converse connection by lifting probabilistic tools for the analysis of Markov chains to study complex zeros of partition functions. Our motivating example is the independence polynomial on $k$-uniform hypergraphs, where the best-known zero-free regime has been significantly lagging behind the regime where we have rapidly mixing Markov chains for the underlying hypergraph independent sets. Specifically, the Glauber dynamics is known to mix rapidly on independent sets in a $k$-uniform hypergraph of maximum degree $\Delta$ provided that $\Delta \lesssim 2^{k/2}$. On the other hand, the best-known zero-freeness around the point $1$ of the independence polynomial on $k$-uniform hypergraphs requires $\Delta \le 5$, the same bound as on a graph. By introducing a complex extension of Markov chains, we lift an existing percolation argument to the complex plane, and show that if $\Delta \lesssim 2^{k/2}$, the Markov chain converges in a complex neighborhood, and the independence polynomial itself does not vanish in the same neighborhood. In the same regime, our result also implies central limit theorems for the size of a uniformly random independent set, and deterministic approximation algorithms for the number of hypergraph independent sets of size $k \le \alpha n$ for some constant $\alpha$.
arXiv:2411.11544v2 Announce Type: replace Abstract: Bonne and Censor-Hillel (ICALP 2019) initiated the study of distributed subgraph finding in dynamic networks of limited bandwidth. For the case where the target subgraph is a clique, they determined the tight bandwidth complexity bounds in nearly all settings. However, several open questions remain, and very little is known about finding subgraphs beyond cliques. In this work, we consider these questions and explore subgraphs beyond cliques. For finding cliques, we establish an $\Omega(\log \log n)$ bandwidth lower bound for one-round membership-detection under edge insertions only and an $\Omega(\log \log \log n)$ bandwidth lower bound for one-round detection under both edge insertions and node insertions. Moreover, we demonstrate new algorithms to show that our lower bounds are tight in bounded-degree networks when the target subgraph is a triangle. Prior to our work, no lower bounds were known for these problems. For finding subgraphs beyond cliques, we present a complete characterization of the bandwidth complexity of the membership-listing problem for every target subgraph, every number of rounds, and every type of topological change: node insertions, node deletions, edge insertions, and edge deletions. We also show partial characterizations for one-round membership-detection and listing.
arXiv:2501.00614v1 Announce Type: cross Abstract: Seymour's Second Neighborhood Conjecture asserts that in the square of any oriented graph, there exists a node whose out-degree at least doubles. This paper presents a definitive proof of the conjecture by introducing the GLOVER (Graph Level Order) data structure, which facilitates a systematic partitioning of neighborhoods and an analysis of degree-doubling conditions. By leveraging this structure, we construct a decreasing sequence of subsets that establish a well-ordering of nodes, ensuring that no counterexample can exist. This approach not only confirms the conjecture for all oriented graphs but also provides a novel framework for analyzing degrees and arcs in complex networks. The findings have implications for theoretical graph studies and practical applications in network optimization and algorithm design.
arXiv:1902.00488v3 Announce Type: replace Abstract: The reachability problem asks to decide if there exists a path from one vertex to another in a digraph. In a grid digraph, the vertices are the points of a two-dimensional square grid, and an edge can occur between a vertex and its immediate horizontal and vertical neighbors only. Asano and Doerr (CCCG'11) presented the first simultaneous time-space bound for reachability in grid digraphs by solving the problem in polynomial time and $O(n^{1/2 + \epsilon})$ space. In 2018, the space complexity was improved to $\tilde{O}(n^{1/3})$ by Ashida and Nakagawa (SoCG'18). In this paper, we show that there exists a polynomial-time algorithm that uses $O(n^{1/4 + \epsilon})$ space to solve the reachability problem in a grid digraph containing $n$ vertices. We define and construct a new separator-like device called pseudoseparator to develop a divide-and-conquer algorithm. This algorithm works in a space-efficient manner to solve reachability.
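To make the grid-digraph reachability problem in the entry above concrete, here is a baseline sketch using breadth-first search. It uses space linear in the number of vertices, so it is only a reference point and not the $O(n^{1/4 + \epsilon})$-space algorithm of the abstract. The edge-list representation and the name `grid_reachable` are assumptions made for illustration.

```python
from collections import deque


def grid_reachable(edges, source, target):
    """Decide whether target is reachable from source in a grid digraph.

    edges: iterable of directed edges ((r1, c1), (r2, c2)) between grid points.
    Every edge must join horizontally or vertically adjacent points.
    """
    adjacency = {}
    for u, v in edges:
        if abs(u[0] - v[0]) + abs(u[1] - v[1]) != 1:
            raise ValueError(f"edge {u} -> {v} does not join grid neighbours")
        adjacency.setdefault(u, []).append(v)

    seen = {source}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if u == target:
            return True
        for v in adjacency.get(u, []):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return False


example_edges = [((0, 0), (0, 1)), ((0, 1), (1, 1)), ((1, 0), (0, 0))]
print(grid_reachable(example_edges, (0, 0), (1, 1)))
print(grid_reachable(example_edges, (1, 1), (0, 0)))
```

The `seen` set is what makes this baseline space-hungry; the time-space tradeoff results cited in the abstract are about avoiding that cost while staying in polynomial time.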
arXiv:2305.00425v2 Announce Type: replace Abstract: An orthogonal drawing is an embedding of a plane graph into a grid. In a seminal work of Tamassia (SIAM Journal on Computing 1987), a simple combinatorial characterization of angle assignments that can be realized as bend-free orthogonal drawings was established, thereby allowing an orthogonal drawing to be described combinatorially by listing the angles of all corners. The characterization reduces the need to consider certain geometric aspects, such as edge lengths and vertex coordinates, and simplifies the task of graph drawing algorithm design. Barth, Niedermann, Rutter, and Wolf (SoCG 2017) established an analogous combinatorial characterization for ortho-radial drawings, which are a generalization of orthogonal drawings to cylindrical grids. The proof of the characterization is existential and does not result in an efficient algorithm. Niedermann, Rutter, and Wolf (SoCG 2019) later addressed this issue by developing quadratic-time algorithms for both testing the realizability of a given angle assignment as an ortho-radial drawing without bends and constructing such a drawing. In this paper, we further improve the time complexity of these tasks to near-linear time. We establish a new characterization for ortho-radial drawings based on the concept of a good sequence. Using the new characterization, we design a simple greedy algorithm for constructing ortho-radial drawings.
arXiv:2404.19081v3 Announce Type: replace Abstract: We study the communication complexity of $(\Delta + 1)$ vertex coloring, where the edges of an $n$-vertex graph of maximum degree $\Delta$ are partitioned between two players. We provide a randomized protocol which uses $O(n)$ bits of communication and ends with both players knowing the coloring. Combining this with a folklore $\Omega(n)$ lower bound, this settles the randomized communication complexity of $(\Delta + 1)$-coloring up to constant factors.
arXiv:2501.00111v1 Announce Type: new Abstract: Given a binary string $\omega$ over the alphabet $\{0, 1\}$, a vector $(a, b)$ is a Parikh vector if and only if a factor of $\omega$ contains exactly $a$ occurrences of $0$ and $b$ occurrences of $1$. Answering whether a vector is a Parikh vector of $\omega$ is known as the Binary Jumbled Indexing Problem (BJPMP) or the Histogram Indexing Problem. Most solutions to this problem rely on an $O(n)$ word-space index to answer queries in constant time, encoding the Parikh set of $\omega$, i.e., all its Parikh vectors. Cunha et al. (Combinatorial Pattern Matching, 2017) introduced an algorithm (JBM2017), which computes the index table in $O(n+\rho^2)$ time, where $\rho$ is the number of runs of identical digits in $\omega$, leading to $O(n^2)$ in the worst case. We prove that the average number of runs $\rho$ is $n/4$, confirming the quadratic behavior also in the average-case. We propose a new algorithm, SFTree, which uses a suffix tree to remove duplicate substrings. Although SFTree also has an average-case complexity of $\Theta(n^2)$ due to the fundamental reliance on run boundaries, it achieves practical improvements by minimizing memory access overhead through vectorization. The suffix tree further allows distinct substrings to be processed efficiently, reducing the effective cost of memory access. As a result, while both algorithms exhibit similar theoretical growth, SFTree significantly outperforms others in practice. Our analysis highlights both the theoretical and practical benefits of the SFTree approach, with potential extensions to other applications of suffix trees.
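As a small illustration of the definitions in the entry above, the following brute-force sketch enumerates the Parikh set of a binary string by checking every factor. It runs in quadratic time and is neither JBM2017 nor SFTree. Restricting to non-empty factors and the name `parikh_set` are assumptions made here.

```python
def parikh_set(w):
    """Return all (zeros, ones) pairs realised by some non-empty factor of w."""
    n = len(w)
    # prefix_ones[i] = number of '1' characters in w[:i]
    prefix_ones = [0]
    for ch in w:
        prefix_ones.append(prefix_ones[-1] + (ch == "1"))

    vectors = set()
    for i in range(n):
        for j in range(i + 1, n + 1):
            ones = prefix_ones[j] - prefix_ones[i]
            zeros = (j - i) - ones
            vectors.add((zeros, ones))
    return vectors


def is_parikh_vector(w, a, b):
    """True if some factor of w has exactly a zeros and b ones."""
    return (a, b) in parikh_set(w)


print(sorted(parikh_set("0110")))
print(is_parikh_vector("0110", 2, 0))
```

For the string `0110`, the vector `(2, 0)` is not a Parikh vector because the two zeros are never adjacent. An index in the sense of the abstract would precompute this set once so that each query takes constant time.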
arXiv:2501.00120v1 Announce Type: new Abstract: For a set $P$ of $n$ points in the plane and a value $r > 0$, the unit-disk range reporting problem is to construct a data structure so that given any query disk of radius $r$, all points of $P$ in the disk can be reported efficiently. We consider the dynamic version of the problem where point insertions and deletions of $P$ are allowed. The previous best method provides a data structure of $O(n\log n)$ space that supports $O(\log^{3+\epsilon}n)$ amortized insertion time, $O(\log^{5+\epsilon}n)$ amortized deletion time, and $O(\log^2 n/\log\log n+k)$ query time, where $\epsilon$ is an arbitrarily small positive constant and $k$ is the output size. In this paper, we improve the query time to $O(\log n+k)$ while keeping other complexities the same as before. A key ingredient of our approach is a shallow cutting algorithm for circular arcs, which may be interesting in its own right. A related problem that can also be solved by our techniques is the dynamic unit-disk range emptiness queries: Given a query unit disk, we wish to determine whether the disk contains a point of $P$. The best previous work can maintain $P$ in a data structure of $O(n)$ space that supports $O(\log^2 n)$ amortized insertion time, $O(\log^4n)$ amortized deletion time, and $O(\log^2 n)$ query time. Our new data structure also uses $O(n)$ space but can support each update in $O(\log^{1+\epsilon} n)$ amortized time and support each query in $O(\log n)$ time.
arXiv:2501.00161v1 Announce Type: new Abstract: The $H$-Induced Minor Containment problem ($H$-IMC) consists in deciding if a fixed graph $H$ is an induced minor of a graph $G$ given as input, that is, whether $H$ can be obtained from $G$ by deleting vertices and contracting edges. Several graphs $H$ are known for which $H$-IMC is NP-complete, even when $H$ is a tree. In this paper, we investigate which conditions on $H$ and $G$ are sufficient so that the problem becomes polynomial-time solvable. Our results identify three infinite classes of graphs such that, if $H$ belongs to one of these classes, then $H$-IMC can be solved in polynomial time. Moreover, we show that if the input graph $G$ excludes long induced paths, then $H$-IMC is polynomial-time solvable for any fixed graph $H$. As a byproduct of our results, this implies that $H$-IMC is polynomial-time solvable for all graphs $H$ with at most $5$ vertices, except for three open cases.
arXiv:2501.00337v1 Announce Type: new Abstract: In the almost-everywhere reliable message transmission problem, introduced by [Dwork, Pippenger, Peleg, Upfal'86], the goal is to design a sparse communication network $G$ that supports efficient, fault-tolerant protocols for interactions between all node pairs. By fault-tolerant, we mean that even if an adversary corrupts a small fraction of vertices in $G$, all but a small fraction of vertices can still communicate perfectly via the constructed protocols. Succeeding in this allows one to simulate, on a sparse graph, any fault-tolerant distributed computing task and secure multi-party computation protocols built for a complete network, with only minimal overhead in efficiency. Previous works on this problem achieved either constant-degree networks tolerating $o(1)$ faults, constant-degree networks tolerating a constant fraction of faults via inefficient protocols (exponential work complexity), or poly-logarithmic degree networks tolerating a constant fraction of faults. We show a construction of constant-degree networks with efficient protocols (i.e., with polylogarithmic work complexity) that can tolerate a constant fraction of adversarial faults, thus solving the main open problem of Dwork et al. Our main contribution is a composition technique for communication networks, based on graph products. Our technique combines two networks tolerant to adversarial edge-faults to construct a network with a smaller degree while maintaining efficiency and fault-tolerance. We apply this composition result multiple times, using the polylogarithmic-degree edge-fault tolerant networks constructed in a recent work of [Bafna, Minzer, Vyas'24] (that are based on high-dimensional expanders) with itself, and then with the constant-degree networks (albeit with inefficient protocols) of [Upfal'92].
arXiv:2501.00860v1 Announce Type: new Abstract: The successive and the amendment procedures have been widely employed in parliamentary and legislative decision making and have been studied extensively in the literature from various perspectives. However, they have not been investigated through the lens of computational complexity theory as thoroughly as many other prevalent voting procedures. To the best of our knowledge, only one paper prior to our work explores the complexity of several strategic voting problems under these two procedures. To better understand the extent to which the two procedures resist strategic behavior, we study the parameterized complexity of constructive/destructive control by adding/deleting voters/candidates for both procedures. To make our results more general, we also examine a generalized form of the amendment procedure. Our exploration yields a comprehensive (parameterized) complexity landscape of these problems with respect to numerous parameters.
arXiv:2501.00926v1 Announce Type: new Abstract: Computing matchings in general graphs plays a central role in graph algorithms. However, despite the recent interest in differentially private graph algorithms, there has been limited work on private matchings. Moreover, almost all existing work focuses on estimating the size of the maximum matching, whereas in many applications, the matching itself is the object of interest. There is currently only a single work on private algorithms for computing matching solutions by [HHRRW STOC'14]. Moreover, their work focuses on allocation problems and hence is limited to bipartite graphs. Motivated by the importance of computing matchings in sensitive graph data, we initiate the study of differentially private algorithms for computing maximal and maximum matchings in general graphs. We provide a number of algorithms and lower bounds for this problem in different models and settings. We first prove a lower bound showing that computing explicit solutions necessarily incurs large error, even if we try to obtain privacy by allowing ourselves to output non-edges. We then consider implicit solutions, where at the end of the computation there is an ($\varepsilon$-differentially private) billboard and each node can determine its matched edge(s) based on what is written on this publicly visible billboard. For this solution concept, we provide tight upper and lower (bicriteria) bounds, where the degree bound is violated by a logarithmic factor (which we show is necessary). We further show that our algorithm can be made distributed in the local edge DP (LEDP) model, and can even be done in a logarithmic number of rounds if we further relax the degree bounds by logarithmic factors. Our edge-DP matching algorithms give rise to new matching algorithms in the node-DP setting by combining our edge-DP algorithms with a novel use of arboricity sparsifiers. [...]
arXiv:2501.00991v1 Announce Type: new Abstract: We investigate the structure of graphs of twin-width at most $1$, and obtain the following results:
- Graphs of twin-width at most $1$ are permutation graphs. In particular they have an intersection model and a linear structure.
- There is always a $1$-contraction sequence closely following a given permutation diagram.
- Based on a recursive decomposition theorem, we obtain a simple algorithm running in linear time that produces a $1$-contraction sequence of a graph, or guarantees that it has twin-width more than $1$.
- We characterise distance-hereditary graphs based on their twin-width and deduce a linear time algorithm to compute optimal sequences on this class of graphs.
arXiv:2501.01071v1 Announce Type: new Abstract: This article provides a comprehensive exploration of submodular maximization problems, focusing on those subject to uniform and partition matroids. Crucial for a wide array of applications in fields ranging from computer science to systems engineering, submodular maximization entails selecting elements from a discrete set to optimize a submodular utility function under certain constraints. We explore the foundational aspects of submodular functions and matroids, outlining their core properties and illustrating their application through various optimization scenarios. Central to our exposition is the discussion on algorithmic strategies, particularly the sequential greedy algorithm and its efficacy under matroid constraints. Additionally, we extend our analysis to distributed submodular maximization, highlighting the challenges and solutions for large-scale, distributed optimization problems. This work aims to succinctly bridge the gap between theoretical insights and practical applications in submodular maximization, providing a solid foundation for researchers navigating this intricate domain.
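The sequential greedy algorithm mentioned in the entry above can be sketched for the simplest case: maximum coverage (a monotone submodular objective) under a uniform matroid, i.e. a cardinality bound `k`. This is a generic textbook sketch, not code from the article; the names `greedy_max_coverage`, `sets`, and `k` are illustrative assumptions.

```python
def greedy_max_coverage(sets, k):
    """Pick at most k named sets, each time adding the largest marginal gain.

    sets: dict mapping a name to a set of covered elements.
    Returns (chosen_names_in_order, covered_elements).
    """
    chosen = []
    covered = set()
    for _ in range(min(k, len(sets))):
        remaining = [name for name in sets if name not in chosen]
        # Marginal gain of a candidate = number of newly covered elements.
        best = max(remaining, key=lambda name: len(sets[name] - covered))
        if not sets[best] - covered:
            break  # no candidate adds anything new
        chosen.append(best)
        covered |= sets[best]
    return chosen, covered


example = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}, "d": {1, 7}}
picked, covered = greedy_max_coverage(example, k=2)
print(picked, sorted(covered))
```

For monotone submodular objectives under a cardinality constraint, this greedy rule is known to achieve a $(1 - 1/e)$ approximation. Under a partition matroid, the same loop would additionally skip candidates whose block has already reached its quota.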
arXiv:2501.01099v1 Announce Type: new Abstract: Given a set of three positive integers {a1, a2, a3}, denoted A, the Frobenius problem in three variables is to find the greatest integer that cannot be expressed in the form x1*a1 + x2*a2 + x3*a3, where x1, x2 and x3 are non-negative integers. The fastest known algorithm for the three-variable case of the Frobenius problem was invented by H. Greenberg in 1988; its worst-case time complexity is a logarithmic function of A. In 2017, A. Tripathi presented another algorithm for the same problem. This article presents an algorithm whose foundation is the same as Tripathi's. However, the algorithm presented here is significantly different from Tripathi's, and we show that its worst-case time complexity is also a logarithmic function of A.
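To make the problem statement in the entry above concrete, here is a brute-force sketch that computes the Frobenius number of three integers by marking representable values one at a time. Its running time grows with the size of the answer, so it is only a reference implementation and not Greenberg's, Tripathi's, or the article's logarithmic-time algorithm. The name `frobenius_number` and the requirement that the three integers have gcd 1 are assumptions of this sketch.

```python
from math import gcd


def frobenius_number(a1, a2, a3):
    """Largest integer not expressible as x1*a1 + x2*a2 + x3*a3 with xi >= 0.

    Requires gcd(a1, a2, a3) == 1; otherwise no largest such integer exists.
    Returns -1 when every non-negative integer is expressible.
    """
    if min(a1, a2, a3) < 1:
        raise ValueError("all three integers must be positive")
    if gcd(gcd(a1, a2), a3) != 1:
        raise ValueError("gcd must be 1 for the Frobenius number to exist")

    smallest = min(a1, a2, a3)
    representable = [True]  # index n holds whether n is expressible; 0 is
    streak = 0              # consecutive expressible integers seen so far
    largest_gap = -1
    n = 0
    # Once `smallest` consecutive integers are expressible, every larger one
    # is too (keep adding `smallest`), so the search can stop.
    while streak < smallest:
        n += 1
        ok = any(n >= a and representable[n - a] for a in (a1, a2, a3))
        representable.append(ok)
        if ok:
            streak += 1
        else:
            streak = 0
            largest_gap = n
    return largest_gap


print(frobenius_number(3, 5, 7))
print(frobenius_number(6, 9, 20))
```

For {3, 5, 7} the integers 1, 2 and 4 are not expressible and everything from 5 upward is, so the result is 4; for {6, 9, 20} the result is 43.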
arXiv:2411.19906v2 Announce Type: replace-cross Abstract: L-systems can be used to model and simulate many biological processes, such as plant development. Finding an L-system for a given process is typically done by hand, by experts, in a massively time-consuming process. It would be significant if this could be done automatically from data, such as from sequences of images. In this paper, we are interested in inferring a particular type of L-system, the deterministic context-free L-system (D0L-system), from a sequence of strings. We introduce the characteristic graph of a sequence of strings, which we then utilize to translate our problem (inferring a D0L-system) in polynomial time into the maximum independent set problem (MIS) and the SAT problem. After that, we offer a classical exact algorithm and an approximate quantum algorithm for the problem.
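The inference problem in the entry above takes a sequence of strings as input. The sketch below shows the forward direction only, i.e. how a known D0L-system generates such a sequence by rewriting every symbol in parallel. It does not implement the inference method of the abstract. The name `d0l_derive` is illustrative, and leaving symbols without a rule unchanged is an assumption of this sketch.

```python
def d0l_derive(axiom, rules, steps):
    """Return [axiom, derivation 1, ..., derivation `steps`] of a D0L-system.

    rules maps each symbol to its replacement string. The system is
    deterministic (one rule per symbol) and context-free (a symbol is
    rewritten regardless of its neighbours).
    """
    sequence = [axiom]
    for _ in range(steps):
        current = sequence[-1]
        # All symbols are rewritten simultaneously in each step.
        sequence.append("".join(rules.get(symbol, symbol) for symbol in current))
    return sequence


# Lindenmayer's classic algae system: A -> AB, B -> A.
algae_rules = {"A": "AB", "B": "A"}
print(d0l_derive("A", algae_rules, 4))
```

Inference runs this process backwards: given only the printed sequence, recover a rule set such as `algae_rules` that reproduces it.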