key: cord-0437757-hs0vm0tr
authors: Das, Suddhasattwa; Mustavee, Shakib; Agarwal, Shaurya
title: Uncovering Quasi-periodic Nature of Physical Systems: A Case Study of Signalized Intersections
date: 2021-09-15
journal: nan
DOI: nan
sha: 7094120df436ce206c49ba6f2457edb82f6769a1
doc_id: 437757
cord_uid: hs0vm0tr

This paper presents a novel approach to analyze quasiperiodically driven dynamical systems. It aims to develop a complete data-driven framework for modeling such unknown dynamics. To achieve this, we characterize Koopman eigenfrequencies as generating frequencies of the quasiperiodic driver of the system. We compute true eigenfrequencies of Koopman operators by applying the theory of Reproducing Kernel Hilbert Space (RKHS) and results from ergodic theory. We also demonstrate the decomposition of quasiperiodically driven dynamics into two components: i) the quasiperiodic driving source with generating frequencies and ii) the driven nonlinear dynamics. A unique aspect of the proposed framework is that it applies to the analysis of systems where the periodic component is either non-dominant or even absent. As a case study, we analyze a system of nine signalized traffic intersections. The proposed framework accurately reconstructs the measured queue lengths of the signalized intersections and makes stable long-term predictions.

Many physical phenomena show periodicities, such as oscillations and limit cycles, characterized by a single frequency or period. There are also many phenomena, such as the arrival of fall foliage, the arctic ice cycle, and planetary dynamics, in which there are multiple driving frequencies, as a result of which the states of the system do not show exact and regular periodicity. Such systems are called quasiperiodic systems. More complicated systems exhibit chaos and a high degree of nonlinearity but have a quasiperiodic component as a driving source.
Examples are astronomical systems [1], climate systems [2], [3], geophysical flows on periodic domains [4], epidemics [5], and computational neuroscience [6].

The goal of this paper is to develop a robust method for reconstructing quasiperiodically driven dynamical systems using the Koopman operator framework. The Koopman operator is a linear operator that describes the spatiotemporal evolution of a nonlinear dynamical system. The projections of observables onto the eigenfunctions of the Koopman operator are called Koopman modes. Koopman modes can efficiently capture quasiperiodic components of a flow and outperform proper orthogonal decomposition in characterizing the evolution on limit cycles and tori [7]. The Koopman operator has also proven successful in describing highly non-periodic dynamics by decomposing the underlying dynamics into periodic and quasiperiodic patterns [6]. In Section II-B, we describe quasiperiodicity in terms of eigenfunctions of the Koopman operator. However, an accurate estimation of the eigenfunctions and eigenfrequencies has remained an elusive task. Techniques such as DMD [7], [8], deep neural networks [9], [10], or Fourier averaging [1] are inadequate when the dynamics has a substantial chaotic component. Our novel approach uses a technique from RKHS interpolation theory developed in [11] to extract the true eigenfrequencies. After that, we proceed to identify different components of the dynamics in a manner similar in structure to [6]. This approach leads to a robust theoretical and numerical method to reconstruct quasiperiodically driven systems having strong chaotic components by identifying their true eigenfrequencies.

As a case study, we study the queue length dynamics on a corridor of signalized traffic intersections. Such a system has quasiperiodic driving sources, but the measurement data also reflects unpredictable phenomena such as accidents, road constructions, and human factors.
These factors make the study more challenging.

Contributions: The contributions of this paper are as follows:
• We provide an alternative mathematical formulation of the quasiperiodic coordinates in terms of the Koopman operator that correctly recovers both the periodic and chaotic parts of the dynamics.
• The proposed framework is applicable even to systems where the periodic component is either non-dominant or absent.
• We describe the physical significance of the quasiperiodic sources obtained from the proposed technique.

Outline: The rest of the paper is arranged as follows. The dynamical systems theory related to quasiperiodically driven systems is described in Section II-A. We discuss the relevant concepts and techniques in Section II-B and Section II-C. The proposed data-driven implementation is described in Section III. Section IV performs a case study using real measurements from traffic intersections and discusses the results.

A quasiperiodically driven system has the form

$$\theta_{n+1} = \theta_n + \rho \bmod 2\pi, \qquad x_{n+1} = f(\theta_n, x_n), \tag{1}$$

where $f$ is some nonlinear function. The angular coordinate $\theta$ is a point on a $d$-dimensional torus $\mathbb{T}^d$ and represents the phase of a driving quasiperiodic system. The vector $\rho$ is called the rotation vector [12], [13]; it represents the angular increments at each step for each of the coordinates of $\theta$. If the underlying system arises from a continuous-time system by taking samples at intervals $\Delta t$, then $\rho = \Delta t\, \omega$ for some angular frequency vector $\omega$. The variable $x$ lies in some abstract or unknown manifold $X$. Let $\Omega := \mathbb{T}^d \times X$. Thus (1) is a one-way coupled or skew-product dynamical system on the space $\Omega$:

$$F : \Omega \to \Omega, \qquad F(\theta, x) := \left( \theta + \rho \bmod 2\pi,\; f(\theta, x) \right). \tag{2}$$

The space $\Omega$ is unknown, and its points are pairs of points from $\mathbb{T}^d$ and $X$ respectively. This abstract formulation applies to all quasiperiodically driven systems. We further assume the dynamics in (1) to be of the form $\theta_{n+1} = \theta_n + \Delta t\, \omega \bmod 2\pi$. Thus the task is to find (i) the quasiperiodicity dimension $d$ and the frequency vector $\omega$; and the functions (ii) $g_{per}$ and (iii) $g_{chaos}$.
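As a minimal sketch of the skew-product structure described above, the following Python snippet iterates a quasiperiodically driven system with a two-dimensional phase. The driven map `example_f`, the frequency vector, and the sampling interval are hypothetical choices made only for illustration; they are not taken from the case study.

```python
import numpy as np

def simulate_driven_system(f, theta0, x0, omega, dt, n_steps):
    """Iterate theta_{n+1} = theta_n + dt*omega (mod 2*pi), x_{n+1} = f(theta_n, x_n)."""
    theta = np.asarray(theta0, dtype=float) % (2 * np.pi)
    x = np.asarray(x0, dtype=float)
    omega = np.asarray(omega, dtype=float)
    thetas = np.empty((n_steps + 1, theta.size))
    xs = np.empty((n_steps + 1, x.size))
    thetas[0], xs[0] = theta, x
    for n in range(n_steps):
        x = f(theta, x)                              # driven part sees the current phase
        theta = (theta + dt * omega) % (2 * np.pi)   # driver ignores x (one-way coupling)
        thetas[n + 1], xs[n + 1] = theta, x
    return thetas, xs

def example_f(theta, x):
    """Hypothetical driven map: a contracting linear map forced by the two phases."""
    return 0.5 * x + np.array([np.cos(theta[0]) + np.cos(theta[1])])

# Two rationally independent driving frequencies (an illustrative assumption).
thetas, xs = simulate_driven_system(
    example_f, theta0=[0.0, 0.0], x0=[0.0],
    omega=[1.0, np.sqrt(2.0)], dt=0.1, n_steps=1000)
```

In this toy example the phase evolves independently of `x`, while `x` is forced by the phase, which is the one-way coupling that the skew-product form expresses.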
The functions and spaces above will be assumed to be unknown. The only information about the system will be through a collection of $k$ observations / measurements, represented collectively as a map $Y : \Omega \to \mathbb{R}^k$. $Y$ is possibly unknown, and possibly a low-dimensional / partial observation of $\Omega$. It generates a sequence of $k$-dimensional data points $\{ y_n := Y(\theta_n, x_n) : n = 0, 1, 2, \ldots \}$, where $(\theta_n, x_n)$ is a trajectory of the dynamics in (2) under $F$.

The Koopman operator $U$ is essentially a time-shift operator. It operates on functions instead of points on the phase space. Given a function $\phi : \Omega \to \mathbb{R}$, $U\phi$ is the function defined as

$$(U\phi)(z) := \phi(F(z)), \qquad z \in \Omega,$$

where $F$ is the underlying dynamical system (2). $\phi$ can be interpreted as a measurement or observation on the phase space $\Omega$, and $U\phi$ is the evolution/transformation of this measurement with the dynamics.

a) Koopman eigenfrequencies: Since $U$ is unitary, its spectrum must lie on the unit circle of the complex plane. The eigenvalues of $U$ correspond to the point spectrum, and any eigenfunction $\zeta$ has a corresponding eigenvalue of the form $e^{\iota\omega}$ for some $\omega \in \mathbb{R}$. This $\omega$ is called the Koopman eigenfrequency corresponding to $\zeta$. Thus the time-evolution of a Koopman eigenfunction is equivalent to multiplication by $e^{\iota\omega n}$ as a function of time $n$. $U$ always has the constant functions as eigenfunctions with eigenfrequency 0. $U$ may or may not have other eigenfrequencies. The collection of eigenfunctions and eigenfrequencies has an algebraic structure. For any two frequencies $\omega_1$, $\omega_2$ and integers $a$, $b$, $a\omega_1 + b\omega_2$ is also a frequency [11]. If the system has at least one nonzero frequency, then it has all harmonics of that frequency and thus infinitely many frequencies. A collection of eigenfrequencies is said to be independent if no nontrivial integer linear combination of them is an integer. If the system has two independent frequencies, then all its frequencies are together dense on the real line.
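The eigenfrequency relation and its algebraic structure can be checked numerically on the simplest example, a pure torus rotation. The sketch below assumes a hypothetical rotation vector `rho` and uses the Fourier functions $\zeta_a(\theta) = e^{\iota a \cdot \theta}$ with integer vectors $a$; it verifies that $U\zeta_a = e^{\iota a \cdot \rho}\zeta_a$, so that integer combinations of the generating frequencies are again eigenfrequencies.

```python
import numpy as np

rho = np.array([0.1, 0.1 * np.sqrt(2.0)])  # hypothetical rotation vector

def F(theta):
    """Torus rotation theta -> theta + rho (mod 2*pi)."""
    return (theta + rho) % (2 * np.pi)

def koopman(phi):
    """Koopman operator: (U phi)(theta) = phi(F(theta))."""
    return lambda theta: phi(F(theta))

def zeta(a):
    """Fourier function zeta_a(theta) = exp(i a . theta) for an integer vector a."""
    a = np.asarray(a)
    return lambda theta: np.exp(1j * np.dot(a, theta))

rng = np.random.default_rng(0)
theta = rng.uniform(0.0, 2 * np.pi, size=2)  # an arbitrary test point on the torus

for a in [(1, 0), (0, 1), (2, -3)]:
    lhs = koopman(zeta(a))(theta)
    eigenfrequency = np.dot(a, rho)          # a_1*rho_1 + a_2*rho_2
    rhs = np.exp(1j * eigenfrequency) * zeta(a)(theta)
    print(a, eigenfrequency, np.isclose(lhs, rhs))
```

The two sides are expected to agree up to floating-point error for every integer vector `a`, since the reduction modulo $2\pi$ does not change $e^{\iota a \cdot \theta}$ when `a` has integer entries.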
A collection of frequencies will be called a basis or generating set of eigenfrequencies if they are independent and all frequencies of the system can be generated by taking integer linear combinations of frequencies from this set. There is no unique choice of a basis, but all bases have the same cardinality $d$, called the quasiperiodicity dimension. $d$ is a fixed finite number if $\Omega$ is a finite-dimensional manifold.

b) Koopman eigenfunctions and torus dynamics: Koopman eigenfunctions reveal quasiperiodic dynamics embedded in the system. Any $k$ Koopman eigenfunctions lead to a rotation on a $k$-dimensional torus: Here, $R_\omega$ is a rotation by the vector $\omega$ of frequencies. If these eigenfrequencies are independent, then this map will be surjective. Taking $d = k$ implies that the dynamics has an embedded / factor torus rotation of the same dimension as the quasiperiodicity dimension.

This completes our examination of the quasiperiodic structure of the dynamics (2). We next discuss some techniques from functional analysis for reconstructing the quasiperiodic component and its complement.

A kernel is a function $k : M \times M \to \mathbb{R}$ on some space $M$. The quantity $k(x, y)$ is a measure of similarity, closeness or distance between two points $x, y \in M$. Kernel-based methods have been used very effectively to obtain information such as statistical manifolds [14], geometric information [15], and dynamical information such as tracer flows [4], stable/unstable foliations [16], and the Koopman spectrum [11], [17]. The techniques in this paper are based on [11]. We shall use the Gaussian kernel with bandwidth parameter $\epsilon$, where $d(\cdot, \cdot)$ is some notion of metric or distance on the space.

a) Delay-coordinates: The data sequence $y_n$ is obtained through an observation $Y$. However, $Y$ may not be a one-to-one map and its values may not correspond to unique states in $\Omega$.
We convert $Y$ into an embedding using the method of delay coordinates [18], by incorporating $Q$ delays to get the map $Y^{(Q)} : \Omega \to \mathbb{R}^{k(Q+1)}$:

$$Y^{(Q)}(z) := \left( Y(z), Y(F(z)), \ldots, Y(F^Q(z)) \right).$$

Thus the delay-coordinated version of each point $y_n$ is $y_n \leftrightarrow y^{(Q)}_n := (y_n, y_{n+1}, \ldots, y_{n+Q})$. We next use the Gaussian shape function to implicitly obtain a kernel $k : \Omega \times \Omega \to \mathbb{R}$ as follows: Even if the two states $z, z'$ are unknown, the left-hand side in (6) can be computed since the right-hand side only uses the observation map $Y$. We next modify $k$ by a process called bistochastic normalization [19] to get a kernel $p$ which is symmetric, Markovian, and more adapted to the non-uniform distribution of the data.

b) Kernel integral operator: Associated to the kernel $p : \Omega \times \Omega \to \mathbb{R}$ is the integral operator $P$, which operates on functions $f \in L^2(\mu)$ as

$$(Pf)(z) := \int_\Omega p(z, z') f(z') \, d\mu(z').$$

$P$ is a compact, symmetric operator on $L^2(\mu)$ [see, e.g., 19]. Moreover, $P$ has a complete basis of eigenfunctions

$$P\phi_j = \lambda_j \phi_j, \qquad j = 1, 2, \ldots,$$

where the indexing is done so that the $\lambda_j$ are in decreasing order. Due to the normalizations carried out, we have $\phi_1 \equiv 1_\Omega$, the constant function equal to 1 everywhere. Moreover, the eigenvalues satisfy $1 = \lambda_1 \geq \lambda_2 \geq \lambda_3 \geq \ldots > 0$. Also importantly, the $\phi_j$ are an orthonormal basis, i.e., $\langle \phi_i, \phi_j \rangle_{L^2(\mu)} = \delta_{ij}$. All these properties of the $\lambda_j$ and $\phi_j$ are useful for kernel-based learning, in which we recreate or extrapolate unknown functions from some samples using these $\phi_j$ as a basis.

c) Kernel-based learning: One of the main advantages of kernel-based approaches is that the $\phi_j$ can be approximated to any degree of accuracy by solving an eigenvalue equation of a data-driven matrix, and can then be easily extended from vectors to continuous functions over the entire data space $\mathbb{R}^{k(Q+1)}$ and thus on $\Omega$. The $\phi_j$ also happen to be left singular vectors of the asymmetric operator $\hat{K}$, with $\lambda_j$ and $\gamma_j$ being the associated singular values and right singular vectors. Thus the $\phi_j$ can be evaluated at an arbitrary point $z \in \mathbb{R}^{k(Q+1)}$. These $\phi_j$ form the basis for learning any function $f : \mathbb{R}^{k(Q+1)} \to \mathbb{R}^d$.
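The delay-coordinate map, the Gaussian kernel, and the eigenpairs of a bistochastic kernel matrix can be sketched as follows. This is a minimal illustration under stated assumptions: the kernel convention $\exp(-\|z - z'\|^2 / \epsilon)$, the particular normalization in `bistochastic_normalize` (one common construction that yields a symmetric matrix with unit row sums), and the synthetic two-frequency signal are all choices made here, and the exact normalization of [19] should be consulted for the method used in the paper.

```python
import numpy as np

def delay_embed(Y, Q):
    """Stack Q delays: row n becomes (y_n, y_{n+1}, ..., y_{n+Q})."""
    Y = np.asarray(Y, dtype=float)
    N = Y.shape[0] - Q
    return np.hstack([Y[q:q + N] for q in range(Q + 1)])

def gaussian_kernel_matrix(Z, eps):
    """K[i, j] = exp(-|z_i - z_j|^2 / eps); the bandwidth convention is an assumption."""
    sq = np.sum(Z**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T, 0.0)
    return np.exp(-d2 / eps)

def bistochastic_normalize(K):
    """Return P = K_hat K_hat^T, which is symmetric with unit row sums."""
    d = K.sum(axis=1)                               # d_i = sum_j K_ij
    q = (K / d[:, None]).sum(axis=0)                # q_j = sum_i K_ij / d_i
    K_hat = K / (d[:, None] * np.sqrt(q)[None, :])  # asymmetric normalized kernel
    return K_hat @ K_hat.T

def kernel_eigenpairs(P, L):
    """Leading L eigenpairs of the symmetric matrix P, eigenvalues in decreasing order."""
    lam, phi = np.linalg.eigh(P)
    order = np.argsort(lam)[::-1][:L]
    return lam[order], phi[:, order]                # columns have unit Euclidean norm

# Synthetic scalar observable with two incommensurate frequencies (illustrative only).
n = np.arange(300)
Y = (np.cos(0.3 * n) + np.cos(0.3 * np.sqrt(2.0) * n))[:, None]

Z = delay_embed(Y, Q=4)
P = bistochastic_normalize(gaussian_kernel_matrix(Z, eps=2.0))
lam, phi = kernel_eigenpairs(P, L=10)
```

In this sketch the leading eigenvalue equals 1 with a constant eigenvector, mirroring the properties $\lambda_1 = 1$ and $\phi_1 \equiv 1_\Omega$ stated above. The eigenvectors returned by `numpy.linalg.eigh` have unit Euclidean norm, so they would need rescaling by $\sqrt{N}$ to be orthonormal with respect to an empirical measure.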
This learning is done by first computing the components of $f$ along the first $L$ eigenfunctions $\phi_j$ and then taking the sum/integral. The parameter $L$ is called the spectral truncation parameter; it is the size of the hypothesis space. We next describe the use of the fast Fourier transform to derive the Koopman eigenfrequencies, applied to the $L$ eigenfunctions instead of the raw data.

In the data-driven approach, all of the entities described in Section II-C have a data-driven analog. We begin with the invariant measure $\mu$ itself. It will be replaced by $\mu_N := \frac{1}{N} \sum_{n} \delta_{y_n}$, the average of the point masses at the $N$ data points. These are called sampling/empirical measures. Integrals with respect to these sampling measures are given by

$$\int \phi \, d\mu_N = \frac{1}{N} \sum_{n} \phi(y_n)$$

for every continuous test function $\phi : \mathbb{R}^{k(Q+1)} \to \mathbb{R}$. The kernel integral operators $K$, $\hat{K}$ and $P$ will be approximated as $N \times N$ matrices $[K]$, $[\hat{K}]$ and $[P]$. The extension of $\phi_l$ is continuous, as the vector $k(y)$ is a continuous function of $y$. If $y$ in the above equation is substituted by one of the data points, the extension returns the corresponding entry of the vector $\phi_l$. Thus it is indeed a continuous extension of the vector $\phi_l$. This feature of extendability and easy evaluation at arbitrary points is one of the most powerful tools of kernel-based methods.

a) RKHS-based spectral filtering: The following procedure was described in [11, Algorithm 1], and accepts as parameters $\epsilon_1, \epsilon_2 > 0$ and an integer $L_0 > 1$. Let $\mathcal{F}_N$ denote the discrete Fourier transform on vectors of length $N$. Let $[\Phi]$ be the $N \times L$ matrix whose $l$-th column is $\phi_l$. Set $\Lambda := \mathrm{diag}(\lambda_1, \ldots, \lambda_L)$. The procedure then computes RKHS norms and assigns a frequency, scaled by $N \Delta t$, to each of the remaining indices $j \in J$. These frequencies $0 = \omega_1 < \omega_2 < \ldots < \omega_m$ can be interpreted to be true Koopman eigenfrequencies with substantial presence in the original data.

b) Periodic and chaotic components: The identified frequencies $0 = \omega_1 < \omega_2 < \ldots < \omega_m$ are by no means exhaustive; they are only a finite subset of a usually infinite set of Koopman eigenfrequencies. However, they represent those (true) frequencies that have a significant presence in the data.
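To make the role of the selected frequencies concrete, the sketch below uses a deliberately simplified stand-in for the RKHS filtering of [11]: it thresholds plain DFT amplitudes of a raw signal, converts the retained indices $j$ to angular frequencies $2\pi j / (N \Delta t)$, and then fits a periodic component on those frequencies by least squares. The amplitude threshold, the real-valued cosine/sine design matrix, and the synthetic signal are assumptions for illustration and do not reproduce the RKHS-norm criterion or the complex matrix $[F]$ of the text.

```python
import numpy as np

def dominant_frequencies(y, dt, threshold):
    """Indices J with DFT amplitude above threshold, and omega_j = 2*pi*j / (N*dt)."""
    N = len(y)
    amplitude = np.abs(np.fft.rfft(y)) / N
    J = np.flatnonzero(amplitude > threshold)
    return J, 2.0 * np.pi * J / (N * dt)

def fit_periodic_component(y, omegas, dt):
    """Least-squares fit of y on cosines and sines at the given angular frequencies."""
    t = dt * np.arange(len(y))
    columns = []
    for w in omegas:
        columns.append(np.cos(w * t))       # for w = 0 this is the constant column
        if w > 0:
            columns.append(np.sin(w * t))
    F = np.column_stack(columns)
    coeffs, *_ = np.linalg.lstsq(F, y, rcond=None)
    return F @ coeffs, coeffs

# Synthetic signal: a mean, two periodic terms, and small noise (illustrative only).
N, dt = 1000, 2.0
n = np.arange(N)
rng = np.random.default_rng(1)
y = (1.0 + 2.0 * np.cos(2 * np.pi * 10 * n / N)
     + 0.5 * np.sin(2 * np.pi * 37 * n / N)
     + 0.05 * rng.standard_normal(N))

J, omegas = dominant_frequencies(y, dt, threshold=0.1)
y_per, coeffs = fit_periodic_component(y, omegas, dt)
residual = y - y_per   # what is left after removing the fitted periodic part
```

In this toy example the residual is only noise; for measured data it would carry whatever part of the signal is not explained by the selected frequencies.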
The threshold $\epsilon_1$ is meant to be a numerical implementation of frequencies being significant. We next construct the periodic component $g_{per} : \mathbb{R} \to \mathbb{R}^k$ from an $m \times k$ coefficient matrix $[A]$. The matrix $[A]$ is the least-squares solution to $[F][A] = [Y]$, where $[Y]$ is the data matrix and $[F]$ is the $N \times m$ matrix

$$[F]_{n,j} := (2 - \delta_{j,1}) \, e^{\iota n \omega_j}, \qquad 1 \leq n \leq N, \; 1 \leq j \leq m.$$

Next, we construct the chaotic component $g_{chaos}$.

c) The reconstruction: We avoid the task of identifying a set of generating frequencies by directly using the selected frequencies in the approximation

$$g_{per}(\theta + n\omega) \approx \sum_{j=1}^{m} a_j \exp(\iota n \omega_j), \qquad n = 0, 1, 2, \ldots \tag{14}$$

Using this simplification in (14), and the formulas in (13) and (12), we create a data-driven model of the dynamics (15). Here each $y^q_n \in \mathbb{R}^k$, thus making the state vector $y_n = (y^0_n, \ldots, y^Q_n)$ a vector in $\mathbb{R}^{k(Q+1)}$. We have thus created a standalone dynamical system on $\mathbb{R}^{k(Q+1)}$ which is conjugate to the latent dynamics.

The case study analyzes queue length measurements from nine adaptive traffic signals located on the Alafaya Trail (SR-434) in East Orlando, FL. The obtained data includes the details of each movement with the time, duration, queue length, and waiting time. It provides information on eight movements: north left (NL), north through (NT), south left (SL), south through (ST), east left (EL), east through (ET), west left (WL), and west through (WT). In this study, we focus on the queue length formation of northbound through movements. The raw data was processed and calibrated by Rahman et al. [20] and was resampled at regular intervals of $\Delta t = 2$ minutes. This work uses the processed data from [20]. We formulate the signalized intersection corridor as a quasiperiodically driven dynamical system. The underlying dynamics of the system are high dimensional, and its governing equations are unknown.
We hypothesize that the signalized intersection corridor system obeys a dynamics of the form (2), and that the observed queue lengths are generated through some measurement function $Y$, as described in Section II-A. Note that the measurement $Y$ is not necessarily one-to-one. In particular, it may not be possible to connect the $y_i$ with a dynamical rule of the form $y_{i+1} = \tilde{F}(y_i)$. Rather, $y_i$ should be interpreted as a partial observation of the true state in $\Omega$ in the $i$-th time frame. In Figure 1 we show the visual representation of $y_i$. We use only this data to obtain a parameter-free reconstruction of the dynamical system.

We arrange the data in the form of a matrix $[Y]$ with $k = 9$ columns. Each column corresponds to the traffic queue length as a function of time at one of the 9 intersections along Alafaya Trail. We used a bandwidth parameter of $\epsilon = 0.1$ for the Gaussian kernel. We compute a total of $L = 1001$ eigenfunctions for the bistochastic kernel. We set $\epsilon_1 = 0.1$ and then choose the parameters $L_0 = 5$ and $\epsilon_2 = 2.5$ using the heuristic approach shown in Figure 2. The lower the value of $\epsilon_2$, the more frequencies get filtered out and the higher the probability of the identification being correct. The two thresholds $\epsilon_1$ and $\epsilon_2$ are based on the asymptotic behavior in two different directions [11, Theorems 1, 4]. Combined, they provide a surer guarantee of identification of true eigenfrequencies and the discarding of spurious eigenvalues or pseudo-spectrum.

This work analyzed 20,000 snapshots of queue lengths for the nine intersections to extract Koopman frequencies, i.e., generating frequencies of the quasiperiodic driving source of the dynamics. Figure 3 exhibits the spectrum of Koopman eigenfrequencies. The x-axis denotes periods corresponding to the frequencies. We separate longer periods from the shorter ones and present them in two different panels for convenience of presentation.
For both panels, the y-axis shows $[W]_{j,L_0}$, which we have denoted as amplitude, for the selected period indices $j$. Figure 3 shows that dominant periods are clustered around 1 hour, 2 hours, 3 hours, 6 hours, 12 hours, 14 hours, 3.5 days, 7 days, and 14 days. The identified periods correspond to the natural periods of the system. These periods are consistent with the results in [21], whose authors decomposed speed measurements from an intersection corridor via multiscale multifractal analysis (MMA). They reported that the dominant periodicities on weekdays are 7 days and 24, 12, 8, 6, and 3 hours, while on weekends they are 12 and 24 hours. In the present work, we did not differentiate weekend data from weekday data. Although we took a different set of observables (i.e., northbound through queue length data) and a completely different intersection corridor system, our identified quasiperiodic frequencies matched those of [21]. This finding corroborates that the proposed technique can successfully identify generating frequencies of the quasiperiodic driving force of the intersection dynamical systems.

We reconstruct the original data from the decomposed parts, as shown in Figure 4. The red curve is the output obtained from the reconstruction at each intersection. The curves closely follow each other, including at the moments when fluctuations occur. Although reconstructed dynamical models differ from the true system, this difference is inevitable in a learning problem. However, the reconstruction of $g_{per}$ is bounded, which guarantees that the dynamics under (3) remain bounded and that the deviation of the trajectories also remains bounded. In Figure 5 we illustrate the error in reconstruction and prediction by computing the normalized relative error. Here $\hat{y}^{(i)}_n$ denotes the output of the reconstructed system (15).

This work developed a data-driven framework for modeling quasi-periodic dynamical systems using a Koopman-theoretic approach.
The proposed approach can handle dynamics with strong nonlinear and chaotic components. Thus, it is applicable across domains. We performed a case study using queue length data on a corridor of nine signalized intersections. The proposed approach accurately identified the generating frequencies, and the results for reconstruction and prediction are encouraging. The long-term prediction error remained bounded without exogenous inputs, unlike recurrent neural network-based methods such as long short-term memory (LSTM). Moreover, in comparison to deep NNs, the proposed technique is not a black-box approach. All these advantages make it a promising candidate for future research.

Super convergence of ergodic averages for quasiperiodic orbits
Singular spectrum analysis in nonlinear dynamics, with applications to paleoclimatic time series
Spatiotemporal pattern extraction with data-driven Koopman operators for convectively coupled equatorial waves
Extraction and prediction of coherent patterns in incompressible flows through space-time Koopman analysis
A linear dynamical perspective on epidemiology: Interplay between early COVID-19 outbreak and human mobility
Data-driven Koopman operator approach for computational neuroscience
Study of dynamics in post-transient flows using Koopman mode decomposition
Exploring DMD-type algorithms for modeling signalised intersections
Learning deep neural network representations for Koopman operators of nonlinear dynamical systems
Reservoir computing universality with stochastic inputs
Koopman spectra in reproducing kernel Hilbert spaces
Mesure de Lebesgue et nombre de rotation
Small denominators. I. Mapping of the circumference onto itself
An information-geometric approach for feature extraction in ergodic dynamical systems
Density estimation on manifolds with boundary
Time-scale separation from diffusion-mapped delay coordinates
Delay-coordinate maps and the spectra of Koopman operators
Reproducing kernel Hilbert space compactification of unitary evolution groups
Real-time signal queue length prediction using long short-term memory neural network
Multiscale multifractal analysis of traffic signals to uncover richer structures

The authors would like to thank Dr. Samiul Hasan and his research group for providing access to the intersection dataset.