Effective synchronizing algorithms

R. Kudłacik (IBM Poland, SWG Krakow Laboratory, Armii Krajowej 18, 30-150 Krakow, Poland), A. Roman (Institute of Computer Science, Jagiellonian University, Łojasiewicza 6, 30-348 Krakow, Poland), H. Wagner (Institute of Computer Science, Jagiellonian University, Łojasiewicza 6, 30-348 Krakow, Poland). E-mail: rafal.kudlacik@gmail.com (R. Kudłacik), roman@ii.uj.edu.pl (A. Roman, corresponding author), hub.wag@gmail.com (H. Wagner).

Keywords: Circuit testing; Conformance testing; Synchronizing sequences; Synchronizing automata; Reset word; Synchronizing algorithm

Abstract: The notion of a synchronizing sequence plays an important role in the model-based testing of reactive systems, such as sequential circuits or communication protocols. The main problem in this approach is to find the shortest possible sequence which synchronizes the automaton being a model of the system under test. This can be done with a synchronizing algorithm. In this paper we analyze the synchronizing algorithms described in the literature, both exact (with exponential runtime) and greedy (polynomial). We investigate the implementation of the exact algorithm and show how this implementation can be optimized by use of some efficient data structures. We also propose a new greedy algorithm, which relies on some new heuristics. We compare our algorithms with the existing ones, with respect to both runtime and quality.

1. Introduction

Synchronizing words (also called synchronizing sequences, reset sequences, reset words or recurrent words) play an important role in the model-based testing of reactive systems (Broy, Jonsson, Katoen, Leucker, & Pretschner, 2005). Nowadays, with advanced computer technology, systems are getting larger and more complicated, but also less reliable. Therefore, testing is an indispensable part of system design and implementation. Finite automata are the most frequently used models to describe the structure and behavior of reactive systems, such as sequential circuits, certain types of programs and, more recently, communication protocols (Fukada, Nakata, Kitamichi, Higashino, & Cavalli, 2001; Ponce, Csopaki, & Tarnay, 1994; Zhao, Liu, Guo, & Zhang, 2010). Because of its practical importance and theoretical interest, the problem of testing finite state machines has been studied in different areas and at various times. Originally, in the 1950s and 1960s, the work in this area was motivated by automata theory and sequential circuit testing. The area seemed to have mostly died down, but in the 1990s the problem was resurrected by its applications to conformance testing of communication protocols.

The problem of conformance testing can be described as follows (Lee & Yannakakis, 1996). Let there be given a finite state machine MS which acts as the system specification and whose internal structure we know completely.
Let MI be another machine, which is the alleged implementation of the system and whose behavior we can only observe. We want to test whether MI correctly implements, or conforms to, MS. Synchronizing words allow us to bring the machine into one known state, no matter which state we are currently in. This helps much in designing effective test cases, e.g. for sequential circuits. In Pomeranz and Reddy (1998) the authors show a class of faults for which a synchronizing word for the faulty circuit can be easily determined from the synchronizing word of the fault-free circuit. They also consider circuits that have a reset mechanism, and show how reset can ensure that no single fault would cause the circuit to become unsynchronizable. In Hyunwoo, Somenzi, and Pixley (1993) a framework and algorithms for test generation based on the multiple observation time strategy are developed by taking advantage of synchronizing words. When a circuit is synchronizable, test generation can employ the multiple observation time strategy and provide better fault coverage, while using the conventional tester operation model. The authors investigate how a synchronizing word simplifies test generation.

The central problem in the approach based on synchronizing words is to find the shortest one for a given automaton. As the problem is NP-hard (see Section 2), polynomial algorithms cannot be optimal, that is, they cannot always find the shortest possible synchronizing words (unless P = NP, which is strongly believed to be false). In recent years some effort has been put into algorithmic approaches for finding short synchronizing words (Deshmukh & Hawat, 1994). Pixley, Jeong, and Hachtel (1994) presented an efficient method based upon the universal alignment theorem and binary decision diagrams to compute a synchronizing word. There are also the algorithms of Natarajan (1986) and Eppstein (1990).

The problem of synchronizing finite state automata has a long history. While its statement is simple (find a word that sends all states to one state), there are still some important questions to be answered. One of the most intriguing issues is the famous Černý Conjecture (Černý, Pirická, & Rosenauerová, 1971), which states that for any n-state synchronizing automaton there exists a synchronizing word of length at most (n − 1)². Should the conjecture be true, this would be a strict upper bound, as there exist automata with minimal synchronizing words of length exactly (n − 1)². The Černý Conjecture has profound theoretical significance, remaining one of the last 'basic' unanswered questions in the field of automata theory, especially after the Road Coloring Problem was recently solved by Trahtman (2009).
On the other hand, there are several practical applications of finding short reset sequences: part orienters (Natarajan, 1986), finding one's location on a map or graph (Kari, 2002), resetting biocomputers (Ananichev & Volkov, 2003), networking (determining a leader in a network) (Kari, 2002) and testing electronic circuits, mentioned above. Clearly, finding short reset words is important for both theoretical and practical reasons.

The paper is organized as follows. In Section 2 we give the basic definitions on automata and synchronizing words. In Section 3 we introduce two auxiliary constructions which are commonly used in synchronizing algorithms. In Section 4 we present the well-known synchronizing algorithms, both exact and greedy. In Sections 5 and 6 we present our two main results: the application of efficient data structures to the exact synchronizing algorithm, and a new, efficient heuristic algorithm. Both sections end with experimental results and an efficiency comparison with other algorithms.

2. Synchronizing words

An alphabet is a nonempty, finite set. A word over some alphabet A is a sequence of letters from A. The length of a word w is the number of its letters and is denoted by $|w|$. By $\varepsilon$ we denote the empty word of length 0. If A is an alphabet, by $A^*$ we denote the set of all words over A. For example, if A = {a, b}, then $A^* = \{\varepsilon, a, b, aa, ab, ba, bb, aaa, \ldots\}$. The catenation of words is denoted by a dot: if $u, v \in A^*$, then $u \cdot v = uv$.

A finite state automaton is a triple $\mathcal{A} = (Q, A, \delta)$, where Q is a finite set of states, A is an alphabet and $\delta$ is a transition function, $\delta: Q \times A \to Q$. Note that initial and terminal states are not marked – we are not interested in the languages accepted by automata, but rather in the automaton action itself. In the following, for the sake of simplicity, we will use the word 'automaton' instead of 'a finite state automaton'. The transition function can be extended to $\mathcal{P}(Q) \times A^*$, that is, to sets of states and words over A. The same symbol $\delta$ will be used to refer to the extended function $\delta: \mathcal{P}(Q) \times A^* \to \mathcal{P}(Q)$. This causes no confusion: for all $P \subseteq Q$, $a \in A$, $w \in A^*$,

$$\delta(P, \varepsilon) = P, \qquad \delta(P, aw) = \bigcup_{p \in P} \{\delta(\delta(p, a), w)\}.$$

A word w is called a synchronizing word for $\mathcal{A} = (Q, A, \delta)$ iff $|\delta(Q, w)| = 1$. We say that such a word synchronizes $\mathcal{A}$. We also say that $\mathcal{A} = (Q, A, \delta)$ is synchronizing if there exists $w \in A^*$ that synchronizes it. If, for a given $\mathcal{A}$, there is no shorter synchronizing word than w, the word w is called the shortest synchronizing word (SSW) for $\mathcal{A}$. There are two main algorithmic problems in synchronization theory: in the first one, given a synchronizing automaton $\mathcal{A} = (Q, A, \delta)$, we ask for an SSW for $\mathcal{A}$. In the second one we ask for any synchronizing word, not necessarily the shortest (but, of course, the shorter the word found, the better), in a reasonable time. These problems can be restated in the form of the following decision problems:

Problem FIND-SSW. Input: a synchronizing automaton $\mathcal{A}$ and $k \in \mathbb{N}$. Output: YES iff the shortest word synchronizing $\mathcal{A}$ has length k.

Problem FIND-SW-OF-LENGTH-K. Input: a synchronizing automaton $\mathcal{A}$ and $k \in \mathbb{N}$. Output: YES iff there exists a synchronizing word of length k for $\mathcal{A}$.

The decision problem FIND-SSW has recently been shown to be DP-complete (Olschewski & Ummels, 2010). The decision problem FIND-SW-OF-LENGTH-K is NP-complete (Eppstein, 1990). It is well known that the length of an SSW for an n-state synchronizing automaton is at most $(n^3 - n)/6$ (Klyachko, Rystsov, & Spivak, 1987; Pin, 1983).
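As a concrete illustration of these definitions, here is a minimal C++ sketch (ours, not the paper's code; the representation with 0-based states and numbered letters is an assumption) of the extended transition function and of the synchronization test $|\delta(Q, w)| = 1$:

#include <set>
#include <vector>

// An automaton (Q, A, delta) with Q = {0, ..., n-1} and letters numbered
// 0 .. |A|-1; delta[a][q] is the state reached from q by letter a.
struct Automaton {
    int n;                                  // |Q|
    std::vector<std::vector<int>> delta;    // delta[a][q]
};

// The extended transition function: the image of P under the word w.
std::set<int> apply(const Automaton& aut, std::set<int> P,
                    const std::vector<int>& w) {
    for (int a : w) {
        std::set<int> next;
        for (int q : P) next.insert(aut.delta[a][q]);
        P = std::move(next);
    }
    return P;
}

// w synchronizes the automaton iff |delta(Q, w)| = 1.
bool synchronizes(const Automaton& aut, const std::vector<int>& w) {
    std::set<int> Q;
    for (int q = 0; q < aut.n; ++q) Q.insert(q);
    return apply(aut, Q, w).size() == 1;
}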
The Černý conjecture states that this length can be bounded by (n − 1)². Černý showed (Černý, 1964) that for each $n \geq 1$ there exists an automaton with an SSW of length (n − 1)², so the conjectured bound is tight. These automata are called the Černý automata. The n-state Černý automaton will be denoted by $C_n$. It is defined over a two-element alphabet A = {a, b} and its transition function is as follows, for all $q \in \{0, \ldots, n-1\}$:

$$\delta(q, x) = \begin{cases} (q + 1) \bmod n & \text{if } x = a, \\ q & \text{if } x = b \wedge q \neq n - 1, \\ 0 & \text{if } x = b \wedge q = n - 1. \end{cases} \quad (1)$$

Černý automata are very important, as automata with |SSW| = (n − 1)² are very rare. Only eight such automata are known that are not isomorphic with the Černý automata (Trahtman, 2006).

3. Auxiliary constructions

In this section we describe two auxiliary constructions used throughout the paper. Let $\mathcal{A} = (Q, A, \delta)$ be a synchronizing automaton. A pair automaton for $\mathcal{A}$ is the automaton $\mathcal{A}^2 = (Q', A, \delta')$, where:

$$Q' = \{\{p, q\} : p, q \in Q,\ p \neq q\} \cup \{0\}, \qquad \delta': Q' \times A \to Q',$$
$$\delta'(\{p, q\}, l) = \begin{cases} \{\delta(p, l), \delta(q, l)\} & \text{if } \delta(p, l) \neq \delta(q, l), \\ 0 & \text{otherwise}, \end{cases} \qquad \delta'(0, l) = 0 \ \ \forall l \in A.$$

Let $\mathcal{A} = (Q, A, \delta)$ be an automaton. A sequence $(q_1, q_2), (q_2, q_3), \ldots, (q_l, q_{l+1})$, with $q_1, \ldots, q_{l+1} \in Q$, is called a path in $\mathcal{A}$ if for each $i = 1, \ldots, l$ there exists $a_i \in A$ such that $\delta(q_i, a_i) = q_{i+1}$. We will identify such a path with a word $a_1 a_2 \ldots a_l$ (notice that if there is more than one letter transforming some $q_i$ into $q_{i+1}$, then the path including $(q_i, q_{i+1})$ can be identified with more than one word). The pair automaton shows how pairs of states behave when words are applied to the original automaton. If $p, q \in S \subseteq Q$ and w is a path leading from {p, q} to 0, it means that $|\delta(S, w)| < |S|$. In such a situation we say that the pair {p, q} of states was synchronized by w. The pair automaton is utilized in all the heuristic algorithms, see Sections 4.3, 4.4 and 4.5.

The next proposition is a straightforward, but very important fact, utilized in all the heuristic algorithms.

Proposition 1. A word $w \in A^*$ synchronizes $\mathcal{A}^2$ iff w synchronizes $\mathcal{A}$.

Proposition 1 implies the following necessary and sufficient condition for $\mathcal{A}$ to be synchronizing:

Proposition 2. $\mathcal{A}$ is synchronizing iff each pair of its states is synchronizing.

The problem of finding an SSW can be restated as a path-searching problem in a so-called power-set automaton (or power automaton for short) of $\mathcal{A}$. A power-set automaton for $\mathcal{A} = (Q, A, \delta)$ is an automaton $\mathcal{P}(\mathcal{A}) = (2^Q, A, \Delta)$, where:

$$2^Q = \{P \subseteq Q\} \setminus \{\emptyset\}, \qquad \Delta: 2^Q \times A \to 2^Q, \qquad \Delta(q, l) = \bigcup_{s \in q} \{\delta(s, l)\} \ \ \forall q \in 2^Q,\ l \in A.$$

As for $\delta$, we can extend $\Delta$ to $2^Q \times A^*$. Let $\mathcal{P}(\mathcal{A}) = (2^Q, A, \Delta)$ be the power automaton of $\mathcal{A} = (Q, A, \delta)$. The state $Q \in 2^Q$ will be called the start state of $\mathcal{P}(\mathcal{A})$. The size of the power automaton is exponential in the size of the original automaton: there are $2^{|Q|} - 1$ states and $|A|(2^{|Q|} - 1)$ edges. States of the power automaton represent subsets of states of the input automaton and are labeled by the corresponding subsets. We will only consider the subautomaton of the power automaton which is reachable from the start state. Specifically, when we say that the power automaton is small, we mean that the reachable subautomaton is small. The Černý automata $C_n$ are interesting examples here, as all states of their power automata are reachable from the start state. We will sometimes refer to the "size of state s". This means that s is a state of the power automaton and it represents an |s|-element subset s of Q.
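In practice the pair automaton is used through the distances from every pair to the singleton state 0, which one backward breadth-first search computes for all pairs at once. A C++ sketch of this standard computation (ours; 0-based states, ordered pairs for simplicity):

#include <queue>
#include <vector>

// A multi-source BFS over reversed pair transitions computes, for every
// pair {p, q}, the length of the shortest word sending it to the
// singleton state 0. dist == -1 afterwards means the pair cannot be
// synchronized (and then, by Proposition 2, neither can the automaton).
std::vector<int> pair_distances(int n,
                                const std::vector<std::vector<int>>& delta) {
    auto idx = [n](int p, int q) { return p * n + q; };
    std::vector<std::vector<int>> rev(n * n);  // reversed pair transitions
    std::vector<int> dist(n * n, -1);
    std::queue<int> bfs;
    for (int p = 0; p < n; ++p)
        for (int q = 0; q < n; ++q) {
            if (p == q) continue;
            for (std::size_t a = 0; a < delta.size(); ++a) {
                int pp = delta[a][p], qq = delta[a][q];
                if (pp == qq) {                   // one letter synchronizes
                    if (dist[idx(p, q)] == -1) {
                        dist[idx(p, q)] = 1;
                        bfs.push(idx(p, q));
                    }
                } else {
                    rev[idx(pp, qq)].push_back(idx(p, q));
                }
            }
        }
    while (!bfs.empty()) {
        int v = bfs.front();
        bfs.pop();
        for (int u : rev[v])
            if (dist[u] == -1) {
                dist[u] = dist[v] + 1;
                bfs.push(u);
            }
    }
    return dist;   // dist[p*n+q] = d({p,q}); unsynchronizable pairs stay -1
}

Recording, for each pair, a letter realizing the first step of such a shortest word yields exactly the preprocessing used by EPPSTEIN (Section 4.4).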
Since edges in the power automaton represent transitions between subsets of states, the power automaton can be thought of as a way of expressing the global behavior of the input automaton when a certain letter (or word) is applied. In contrast, the pair automaton describes local behavior only (it shows how pairs of states are transformed).

Proposition 3. The sets of synchronizing words of $\mathcal{A}$ and $\mathcal{P}(\mathcal{A})$ coincide. In particular, $\mathcal{A}$ is synchronizing iff $\mathcal{P}(\mathcal{A})$ is synchronizing.

It is clear that in a power automaton a path leading from $Q \in 2^Q$ to any state $F \in 2^Q$ such that |F| = 1 represents a synchronizing word for $\mathcal{A}$. Also, the shortest such path determines the shortest synchronizing word for $\mathcal{A}$. So the entire problem can be rephrased as a basic graph problem. This is convenient, as single-source path-searching algorithms (exact or otherwise) have been extensively studied. Also, augmenting the generic path-searching methods with knowledge specific to this problem may give some interesting results.

4. Synchronizing algorithms

In this section we describe five synchronizing algorithms, that is, algorithms that find a synchronizing word for a given automaton. Two of them (EXACT and SEMIGROUP) are exponential ones that always find the shortest synchronizing words. The three others (NATARAJAN, EPPSTEIN and SYNCHROP) are heuristic algorithms working in polynomial time, so they are faster, but they do not necessarily find the shortest synchronizing words. In the following sections we assume |Q| = n.

4.1. Exact exponential algorithm

There are two well-known algorithms finding the shortest synchronizing words. Due to the fact that this problem is NP-hard, their runtime complexity is exponential in the size of the input automaton, which limits their use. The standard exact algorithm is a simple breadth-first search in the power automaton. The runtime is $\Omega(2^n)$ in the worst case, and the standard implementation requires $\Omega(2^n n)$ space. Due to these discouraging facts this algorithm is often disregarded in the literature.

Algorithm 1 EXACT ALGORITHM($\mathcal{A}$)
1: Input: an automaton $\mathcal{A} = (Q, A, \delta)$
2: Output: SSW of $\mathcal{A}$ (if it exists)
3: queue $\mathcal{Q}$ ← empty
4: push Q into queue $\mathcal{Q}$
5: mark Q as visited
6: while $\mathcal{Q}$ is not empty
7:   S ← pop($\mathcal{Q}$)
8:   foreach a ∈ A
9:     T ← δ(S, a)
10:    if |T| = 1
11:      return reversed path from T to Q
12:    if T is not visited
13:      push T into $\mathcal{Q}$
14:      mark T as visited
15: return "$\mathcal{A}$ is not synchronizing"

4.2. Semigroup algorithm

Another algorithm (which is typically more memory-efficient) was described in Trahtman (2006) and uses the notion of a syntactic semigroup. Let $\mathcal{A} = (Q, A, \delta)$ be an automaton. Alphabet letters (and also words over A) represent functions Q → Q, so if f is a function from Q to Q and $w \in A^*$, by $f \cdot w$ we denote the composition of two functions: f and the function corresponding to w. The syntactic semigroup for $\mathcal{A}$ is constructed as follows: process all words over A in lexicographic order. If a processed word defines a new function f: Q → Q, add f to a list L. The procedure stops in two cases: (1) when $\forall f \in L\ \forall a \in A:\ f \cdot a \in L$, that is, when no new function can be defined; (2) when a constant function (mapping all elements to one element) is found. The word corresponding to the constant function is an SSW, as words are processed in lexicographic order. The semigroup algorithm does not require a costly power automaton construction phase, but its standard implementation is terribly inefficient in the worst case. A sketch of the construction follows.
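A minimal C++ sketch of this idea (ours; it explores transformations in breadth-first rather than strictly lexicographic order, which still returns a word of minimal length, though not necessarily the lexicographically least one):

#include <map>
#include <queue>
#include <string>
#include <vector>

// A sketch of the semigroup algorithm: enumerate the distinct
// transformations Q -> Q induced by words, shortest words first. The
// first constant transformation found corresponds to an SSW.
std::string semigroup_ssw(int n,
                          const std::vector<std::vector<int>>& delta,
                          const std::string& letters) {
    std::vector<int> id(n);
    for (int q = 0; q < n; ++q) id[q] = q;
    std::map<std::vector<int>, std::string> seen;  // the list L, with words
    std::queue<std::vector<int>> bfs;
    seen.emplace(id, "");
    bfs.push(id);
    while (!bfs.empty()) {
        std::vector<int> f = bfs.front();
        bfs.pop();
        const std::string w = seen[f];
        for (std::size_t a = 0; a < delta.size(); ++a) {
            std::vector<int> g(n);
            bool constant = true;
            for (int q = 0; q < n; ++q) {
                g[q] = delta[a][f[q]];             // g = f followed by a
                if (g[q] != g[0]) constant = false;
            }
            if (constant) return w + letters[a];   // constant map: an SSW
            if (seen.emplace(g, w + letters[a]).second) bfs.push(g);
        }
    }
    return "";   // no constant map exists: not synchronizing
}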
The semigroup algorithm is thus only slightly better than the power automaton algorithm, and this inefficiency limits its use to small automata. The algorithm's runtime complexity is $O(|A| n \cdot s^2)$ with $O(n \cdot s)$ space required (Trahtman, 2006), where s is the size of the syntactic semigroup S. The syntactic semigroup size can be as big as $n^n$, but since only a subset of S (containing only words no longer than the SSW) is considered, the average runtime is usually much lower. The semigroup algorithm is used in the well-known synchronization package TESTAS. Its worst-case complexity can be drastically reduced, as we show in Section 5.

4.3. Natarajan algorithm

One of the first heuristic algorithms for finding short synchronizing words was provided by Natarajan (1986). The algorithm is shown in Listing 2.

Algorithm 2 NATARAJAN($\mathcal{A}$)
1: Input: a synchronizing automaton $\mathcal{A} = (Q, A, \delta)$
2: Output: a synchronizing word for $\mathcal{A}$
3: Q ← {1, 2, ..., n}; s ← ε
4: while |Q| > 1:
5:   choose two states p, q ∈ Q
6:   w ← the shortest path from {p, q} to 0
7:   Q ← δ(Q, w)
8:   s ← s·w
9: return s

The loop in line 4 is performed O(n) times. The shortest path (line 6) can be found in $O(|A| n^2)$. The transformation in line 7 is done in $O(n^3)$, because |Q| = O(n) and $|w| = O(n^2)$. Hence, the total complexity is $O(|A| n^3 + n^4)$.

4.4. Eppstein and cycle algorithms

Eppstein proposed a modification of Natarajan's algorithm. The modification is based on a preprocessing step in which, for each pair of states, we compute the first letter of the shortest word synchronizing these states. Eppstein has shown that this preprocessing allows us to reduce the complexity to $O(n^3 + |A| n^2)$. CYCLE is a slight modification of EPPSTEIN. In CYCLE, when a pair of states is synchronized into some state q, it is required that in the next step q must be one of the elements of the chosen pair. CYCLE works optimally for Černý automata, that is, it always returns an SSW.

4.5. SYNCHROP algorithm

The SYNCHROP algorithm (and its modified version, SYNCHROPL) (Roman, 2009) is, in comparison to NATARAJAN, a 'one-step-ahead' procedure – we do not choose an arbitrary pair of states as in line 5 of NATARAJAN. Let w(p, q) be the shortest word synchronizing {p, q}. For each {p, q} we check how the set of states in the pair automaton will be transformed if we apply w(p, q) to all states we are currently in. Each transformation is rated in terms of a heuristically defined cost function, and we choose the pair with the lowest cost. The remaining part of the algorithm is exactly the same as in NATARAJAN. In its original version, the SYNCHROP algorithm does not use the preprocessing introduced in EPPSTEIN. Therefore, its complexity is $O(n^5 + |A| n^2)$. A detailed description and discussion of SYNCHROP's properties and complexity is given in Section 6.

5. Optimizing exponential algorithms

In this section we deal with the exact synchronizing algorithms. We show how the selection of efficient data structures affects the time complexity. Let us consider the basic version of the algorithm shown in Listing 1. While the algorithm looks very simple, its performance greatly depends on the data structures used. The following aspects of the algorithm must be considered (a sketch making these choices concrete follows the list):

1. transition function computation,
2. state representation,
3. queue implementation,
4. visited states' set implementation,
5. predecessor tree implementation (required if the actual SSW must be returned rather than its length).
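The following C++ sketch (ours, assuming n ≤ 64 and 0-based states; names are not from the paper) makes one choice for each item: on-line transition computation over a bitmask state representation, a std::queue, and a single unordered_map serving both as the visited set and as the predecessor tree.

#include <cstdint>
#include <queue>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// BFS in the power automaton (Algorithm 1). A power automaton state is
// the bitmask of the subset it represents; delta[a][q] is the input
// automaton's transition function; letters[a] is the character emitted
// for letter a. Assumes n >= 2.
std::string exact_ssw(int n,
                      const std::vector<std::vector<int>>& delta,
                      const std::string& letters) {
    const uint64_t start =
        (n == 64) ? ~uint64_t(0) : ((uint64_t(1) << n) - 1);
    std::unordered_map<uint64_t, std::pair<uint64_t, char>> pred;
    std::queue<uint64_t> bfs;
    pred.emplace(start, std::make_pair(start, '\0'));
    bfs.push(start);
    while (!bfs.empty()) {
        const uint64_t S = bfs.front();
        bfs.pop();
        for (std::size_t a = 0; a < delta.size(); ++a) {
            uint64_t T = 0;                          // T = Delta(S, a)
            for (int q = 0; q < n; ++q)
                if (S >> q & 1) T |= uint64_t(1) << delta[a][q];
            if (!pred.emplace(T, std::make_pair(S, letters[a])).second)
                continue;                            // T already visited
            if ((T & (T - 1)) == 0) {                // singleton: SSW found
                std::string w;
                for (uint64_t U = T; U != start; U = pred[U].first)
                    w += pred[U].second;             // walk predecessor tree
                return std::string(w.rbegin(), w.rend());
            }
            bfs.push(T);
        }
    }
    return "";   // no singleton reachable: the automaton is not synchronizing
}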
Judging by the complexity (given in Trahtman (2006)) of the SEMIGROUP algorithm implementation in TESTAS, checking whether T was previously visited (Listing 1, line 12) is assumed to be performed in $\Theta(nm)$, where m is the number of elements visited. This step can easily be done in time $n \log m$ using standard tree-based dictionaries. So the worst-case runtime complexity can easily be reduced from $\Theta(|A| n s^2)$ to $\Theta(|A| n s \log s)$.

Another simple optimization is possible. Note that the original algorithm generates sequences of states of size n (and not sets). We can treat these elements as sets without losing any valuable information. This should also speed the process up: the semigroup size s can reach the value of $n^n$ (and reaches $2^{2n}$ for $C_n$), while there are at most $2^n$ subsets of the considered set. Finally, for small n, a trick can be used: sets can be mapped to integers. This technique will be described in more detail later. An ordinary array can then be used to check whether a set was previously added to the visited set (actually, a similar trick can be applied when sequences of size n are considered: radix n rather than 2 must be used). This allows us to skip the logarithmic part in $\Theta(|A| n s \log s)$, yielding $\Theta(|A| n s)$ (assuming that n is small). Of course, one could argue that there is no point in analyzing asymptotic complexity with bounded values of the input size; in such cases this notation should be understood as a way to express the order of complexity.

This makes a really big difference. For example, TESTAS (which apparently uses this algorithm) cannot handle calculating the SSW of $C_n$ for $n \geq 16$. After these simple optimizations, all automata of size up to 26 (or more) should be handled easily (space complexity becomes a bigger problem in the case of slowly synchronizing automata). A more detailed comparison will be shown later. Interestingly, after performing these optimizations, the algorithm is very similar to the power-automaton-based approach. In fact, we will try to merge the best features of these two approaches.

5.1. Power automaton traversal

We would now like to focus on the algorithm that utilizes the concept of the power automaton: a breadth-first search is performed, beginning with the start state (the set of all states of the input automaton). When a singleton state is found, the SSW has been found and the computation can be terminated. In effect, typically only a subset of the states of the power automaton is visited. This is an important fact that will enable us to examine larger automata.

5.1.1. On-line transition function computation

The power automaton transition function can be calculated on-line (i.e. whenever it is required). In order to compute a single power automaton transition, one to n transition functions of the original automaton must be calculated. This is reasonable and is, in fact, the standard approach used. Typically the resulting runtime complexity is $O(|A| n 2^n M(n))$, where M(n) is the cost of performing one mapping of a state into an associated value. For small n it can be assumed to be $O(|A| n 2^n)$ (this is a slight abuse of the O notation, as this algorithm is limited to small n; in fact we are not interested in the asymptotic behavior of this function).

5.1.2. Off-line transition function computation

It is possible to generate all transitions in the power automaton in amortized constant time. This approach leads to $\Theta(|A| 2^n)$ complexity of calculating the SSW. Currently, memory requirements make it usable only for n < 30, so we will only investigate this case.
The power automaton's states $S_1, \ldots, S_{2^n - 1}$ are generated in Gray code order, which can be done in amortized constant time. This way, consecutive states differ in exactly one position (i.e. $|S_i \oplus S_{i+1}| = 1$, where $\oplus$ denotes the symmetric difference operation). The power automaton transition for a given $l \in A$ can then be expressed as

$$\Delta(S_i, l) = \delta(S_i \cap S_{i-1}, l) \cup \Delta(S_i \oplus S_{i-1}, l), \quad (2)$$

where $|S_i \cap S_{i-1}| = |S_i| - 1$ and $|S_i \oplus S_{i-1}| = 1$. If $S_i$ is the ith generated state, $\delta(S_i \cap S_{i-1}, l)$ can be determined in constant time. This can be done by storing the count of each element of the current transition function's image; this information can be updated in constant time.

Some additional pre-computation is necessary so that the logarithm of a number can be computed in constant time. This is required to map a power automaton state (representing the symmetric difference of consecutive states) into an input automaton state. Since $|S_i \oplus S_{i-1}| = 1$, $S_i \oplus S_{i-1}$ is encoded as an integer I being a power of two, so log(I) is the state's number. Listing 3 shows how to calculate this value quickly. Finally, the entire transition shown in Listing 4 can be computed in constant time.

Algorithm 3 FAST_LOG2(n)
1: Input: integer n = 2^k
2: Output: log2 n = k
3: r ← right(n) // right half, r < 2^16
4: if r > 0:
5:   return log2[r] // precomputed value
6: l ← left(n) shr 16
7: return 16 + log2[l] // assuming 32-bit integers are used

Algorithm 4 FAST POWER AUTOMATON TRANSITION(Si, Si+1, prev_trans, image_count, l)
1: Input: states Si, Si+1; transition function value for Si; count of each element in the transition function's image; letter l ∈ A for which the transition is calculated
2: Output: power automaton transition for the given state and letter
3: ex ← Si XOR Si+1 // bitwise symmetric difference
4: in ← Si AND Si+1 // bitwise and
5: ret ← prev_trans
6: change ← FAST_LOG2(ex) // changed state's number
7: change_to ← δ(change, l)
8: if in == Si: // change was added
9:   image_count[change_to] += 1
10:  ret ← ret OR change_to
11: else // change was removed
12:  image_count[change_to] −= 1
13:  if image_count[change_to] == 0
14:    ret ← ret XOR change_to
15: return ret

A full graph representing the power automaton is constructed in memory. This approach is good for calculating the SSW of slowly synchronizing automata. It takes exactly $\Theta(|A| 2^n)$ integer operations (that is, it never takes less, unlike other implementations). It is not suitable for larger (n > 30) automata due to memory requirements. It is also not recommended for random and fast-synchronizing automata, as they tend to have small reachable power automata. It should be a good choice when an inexact algorithm (run prior to the exact one) shows that the SSW may be long.

5.1.3. Mapping states to arbitrary information

In the algorithm a set of visited states must be maintained. We will consider a more general solution: mapping states into arbitrary values (when a Boolean value is used, this approach is equivalent to defining a subset by its characteristic function). This method will also be useful for storing the predecessor tree.

5.1.4. Dense mapping

When we can afford to keep the entire mapping in memory, the situation is rather simple. To each set we can assign an integer that uniquely identifies this set. There exists a convenient correspondence $\sigma$:

$$\sigma: 2^{\{1,\ldots,n\}} \to \mathbb{N}, \qquad \sigma^{-1}: \{0, \ldots, 2^n - 1\} \to 2^{\{1,\ldots,n\}},$$
$$\sigma(S) = \sum_{i \in S} 2^{i-1}, \qquad \sigma^{-1}(I) = \{i \leq n : \text{bit } (i-1) \text{ of } I \text{ is set to } 1\}.$$

In other words, a subset of a fixed set can be represented by its characteristic vector, and this binary vector can be encoded as an integer. It is a common technique.
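In C++ the mapping σ and its inverse are a few lines; on GCC or Clang the table-based FAST_LOG2 of Listing 3 can also be replaced by a count-trailing-zeros intrinsic. A sketch (ours):

#include <cstdint>
#include <vector>

// Dense mapping sigma: a subset S of {1,...,n} becomes the integer whose
// bit i-1 is set iff i is in S (cf. Section 5.1.4; names are ours).
uint64_t sigma(const std::vector<int>& S) {
    uint64_t I = 0;
    for (int i : S) I |= uint64_t(1) << (i - 1);
    return I;
}

std::vector<int> sigma_inv(uint64_t I, int n) {
    std::vector<int> S;
    for (int i = 1; i <= n; ++i)
        if (I >> (i - 1) & 1) S.push_back(i);
    return S;
}

// For I a power of two, log2(I) recovers the state's number. Instead of
// the precomputed-table FAST_LOG2, GCC and Clang provide an intrinsic
// that counts trailing zeros in one instruction.
int fast_log2(uint64_t I) { return __builtin_ctzll(I); }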
All relevant operations can be performed quickly. Hence, an ordinary array (of size $2^n$) can be used to map power automaton states into arbitrary data (for example, the information whether a given set was visited earlier during the search). Since the array must fit into memory, n must be small.

5.1.5. Sparse mapping

In the case where only a subset (of unknown size) must be mapped, sparse data structures should be used. Such structures include tree-based dictionaries (like various BSTs) and hash tables. Trees require a strict weak ordering on elements (that is, a proper '<' predicate must be supplied); note that $x = y \iff \neg(x < y) \wedge \neg(y < x)$. Hash tables require an equality predicate (=), and a hash value h must be computable for each element. Tree-based dictionaries typically guarantee that insertions and retrievals perform $\Theta(\log_2 m)$ key comparisons, where m is the number of stored keys. Hence, the complexity of performing key comparisons must be included in the estimation of the total complexity (this fact seems to be often omitted). Hash tables promise insertions and retrievals in O(1) time on average; once more, the complexity of comparing keys and calculating hashes must be taken into account. Using a trie data structure is also an option, but no efficient implementations seem to be available; an optimized implementation based on a compact two-array approach would be suitable for our needs. There exist standard implementations of tree-based and hash-based dictionaries and we will not delve into further details here. We used the implementations from the C++ standard library and Boost (Björn, 2005).

5.1.6. Set representations

For larger automata, power automaton states can no longer be encoded into a single integer value; an alternative representation must be devised. Essentially, we need to represent subsets of {1, 2, ..., n}. We will now investigate various data structures that can be used for this purpose. Let a data structure S represent a set $S \subseteq \{1, 2, \ldots, n\}$. By iteration over S we mean enumerating the elements of S, preferably in sorted order. So if S represents the set {1, 2, 3}, iterating over S yields the elements 1, 2, 3 and then signals that no more elements are present.

5.1.7. Tree-based sets

The standard set structures based on AVL or red–black trees (like std::set) support all the required operations, at a logarithmic cost per elementary step. A lighter alternative is an indexing tree: a complete binary tree whose leaves correspond to the n possible elements, in which every node is marked iff its subtree contains at least one present element. Its key operation is UPPERBOUND, which finds the smallest present element not smaller than a given one; its final steps are:

Algorithm 6 UPPERBOUND(tree, n, node)
...
6: return DOWN(tree, n, node, left(node))
7: return DOWN(tree, n, node, right(node))

UPPERBOUND goes up towards the root searching for non-empty nodes. When such a node is found, we go down, following the leftmost path (leading to leaves with indices greater than the start node) of non-empty nodes. Clearly, at most 2 log n nodes are visited; this happens, for example, when only one value is present. The entire structure can be iterated using the function ITERATE (see Listing 7).

Algorithm 7 ITERATE(tree, n)
1: Input: structure S
2: Output: successive elements of S
3: elem ← UPPERBOUND(tree, n, 0)
4: while elem ≠ −1
5:   yield elem // the next element was found
6:   elem ← UPPERBOUND(tree, n, elem + 1)

Let us investigate the amortized complexity of iterating over a full set. This is trivial, since UPPERBOUND returns immediately when the start leaf itself is marked. Hence, the amortized complexity is constant when all elements are present. We have already mentioned that when only one element is present, the amortized (as well as total) complexity is 2 log n.
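To make the structure concrete, here is a minimal C++ sketch of such an indexing tree (our illustration under the assumptions stated above; the paper's listings differ in details and the names are ours). It assumes n is a power of two and supports only insertion, as suffices for a grow-only visited set.

#include <vector>

// Complete binary tree over n leaf slots, stored in heap layout: node v
// has children 2v and 2v+1; leaves are nodes n .. 2n-1. A node is marked
// iff its subtree holds a present element. upper_bound mirrors
// UPPERBOUND; iterate mirrors ITERATE.
struct IndexingTree {
    int n;
    std::vector<char> mark;   // mark[v] == 1 iff subtree of v is non-empty

    explicit IndexingTree(int n) : n(n), mark(2 * n, 0) {}

    void insert(int i) {                        // add element i (0-based)
        for (int v = n + i; v >= 1; v /= 2) mark[v] = 1;
    }

    // Smallest present element >= start, or -1 if there is none.
    int upper_bound(int start) const {
        if (start >= n) return -1;
        int v = n + start;
        if (mark[v]) return start;              // the start leaf itself
        while (v > 1) {
            // If we are a left child and the right sibling's subtree is
            // marked, the answer lies there: descend along marked nodes.
            if (v % 2 == 0 && mark[v + 1]) {
                for (v = v + 1; v < n; )
                    v = mark[2 * v] ? 2 * v : 2 * v + 1;
                return v - n;
            }
            v /= 2;                             // otherwise keep climbing
        }
        return -1;                              // nothing to the right
    }

    // Enumerate all present elements in increasing order.
    template <class F>
    void iterate(F yield) const {
        for (int e = upper_bound(0); e != -1; e = upper_bound(e + 1))
            yield(e);
    }
};

The improvement discussed next replaces the plain climb (v /= 2) with right-up jumps.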
Instead of going up to a parent we can visit our grand-sibling's parent, without skipping any valuable nodes (sometimes we could visit the grand-sibling itself if it is non-empty, but this happens only once per UPPERBOUND call and lowers performance only in the worst case). The remaining part of the algorithm is unchanged: if the current node is non-empty (marked), we start moving down; otherwise, we continue moving right-up. This small change is important, as it enables a more detailed complexity analysis. Note that the cost (counted in the number of visited nodes) between two leaves p, q is bounded by $3 \log(|p - q|)$. The search process can be divided into two phases. The search starts from leaf p.

1. Right-up jumps are performed until a non-empty node is found. Successive right-up jumps cull subtrees of exponentially growing sizes: 1, 2, 4, etc. Since there are |p − q| empty leaves between the considered leaves, it is enough to perform log|p − q| jumps.

2. Now the leftmost non-empty path is followed. For each up or right-up jump performed, a down-move must be performed (there were at most log|p − q| such jumps). Each down-move requires checking two nodes.

It can be seen that at most $3 \log(|p - q|)$ nodes will be accessed. Let us now investigate the worst case. Assume there are at least two elements (one of them equal to 0, for simplicity) and that they are iterated left-to-right. Let $y_i$ be the element found in step i (that is, during the ith call to UPPERBOUND), and put $x_i = y_{i+1} - y_i$ for all i. The cost of step i is $O(\log x_i)$, so the cost of iterating through all elements is of order $\sum_i \log x_i$. Note that $\sum_i x_i \leq n$; we will assume $\sum_i x_i = n$, which is the worst case. The total cost is $\max_x \sum_i \log x_i = \max_x \log \prod_i x_i$. By the logarithm's monotonicity we only have to maximize the product, which happens when all the $x_i$ are equal, say to a; this leads to the function $a^{n/a}$. Differentiating shows that it is maximized for a = e. Taking into account that we operate on positive integer values, we finally obtain that the cost is bounded by $O(\log 3^{n/3}) = O(\frac{n}{3} \log 3) = O(n)$. The worst-case amortized complexity has been improved from $\Theta(n)$ to $\Theta(\log n)$, while the worst-case total complexity is still O(n). Memory consumption is doubled (two bits per value are required).

5.1.11. Other data structures

The van Emde Boas tree is an extension of the indexing tree concept described above. This heavy, recursive data structure (each node contains a smaller vEB tree that is used as an index) enables iteration over s in $\Theta(|s| \log\log n)$, so it guarantees $\Theta(\log\log n)$ amortized complexity. This structure is complex, so it would be an improvement only for much bigger n. It is of no use in our applications.

5.1.12. Partial power-set automaton

The following technique can be used to reduce the amount of computation involved in calculating the power automaton's transitions. This, as far as we know, novel technique is most useful for small n. Let us define a partial power automaton of $\mathcal{A}$ for $X \subseteq Q$ as $\mathcal{P}_X(\mathcal{A}) = (2^Q, A, \Delta|_X)$. In other words, the transition function's domain is restricted to X; note that the co-domain does not change. It is obvious that

$$S \subseteq 2^Q \wedge \bigcup S = Q \implies \mathcal{P}(\mathcal{A}) = \bigcup_{X \in S} \mathcal{P}_X(\mathcal{A}).$$

From a mathematical standpoint this is a trivial tautology, but it turns out to be useful for our purposes. Let us consider a simple case. Put Q = {1, ..., 32}, X = {1, ..., 16}, Y = {17, ..., 32}. Then $|\mathcal{P}_X(\mathcal{A})| = |\mathcal{P}_Y(\mathcal{A})| = 2^{16} - 1$. Clearly, the values of $\Delta_X$ and $\Delta_Y$ can be precomputed in $32 \cdot 2^{16}$ operations ($n \sqrt{2^n}$ in general). Later, they can be used to construct the transition function of the entire power automaton by taking a union of transitions of the partial power automata. Assuming that the union operation is faster than performing the transitions in the original automaton, a speedup will occur. This is a low-level optimization, but experiments show that it can boost overall performance by a factor of 10. It is indeed possible to perform a fast set-union operation: when the represented universe is fixed (in this case {1, ..., 32}), a set can be represented by its characteristic vector, encoded as an integer. Set union is then a bitwise OR, so the union of two subsets of a 32-element universe is performed in one CPU instruction.
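A C++ sketch of this technique (ours, fixing n = 32 and 0-based states) showing both the incremental precomputation and the resulting two-lookups-plus-OR transition:

#include <cstdint>
#include <vector>

// Partial power automaton tables for n = 32: the transition of any
// subset is the OR of two precomputed transitions, one for its low
// 16-bit half and one for its high half.
struct PartialPowerTables {
    // low[a][x]  = Delta({states 0..15 encoded by x}, a)
    // high[a][y] = Delta({states 16..31 encoded by y}, a)
    std::vector<std::vector<uint32_t>> low, high;

    explicit PartialPowerTables(const std::vector<std::vector<int>>& delta) {
        const std::size_t k = delta.size();
        low.assign(k, std::vector<uint32_t>(1u << 16, 0));
        high.assign(k, std::vector<uint32_t>(1u << 16, 0));
        for (std::size_t a = 0; a < k; ++a)
            for (uint32_t x = 1; x < (1u << 16); ++x) {
                int q = __builtin_ctz(x);          // lowest present state
                // Reuse the already-computed value for x without its
                // lowest bit: one OR per table entry, n * 2^(n/2) total.
                low[a][x]  = low[a][x & (x - 1)]  | (1u << delta[a][q]);
                high[a][x] = high[a][x & (x - 1)] | (1u << delta[a][q + 16]);
            }
    }

    // One full power automaton transition: two lookups and one OR,
    // instead of up to 32 applications of the original delta.
    uint32_t step(uint32_t S, int a) const {
        return low[a][S & 0xFFFFu] | high[a][S >> 16];
    }
};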
This approach can be generalized. Let m (dividing n, for simplicity) be the maximum value such that $m \cdot 2^{n/m}$ transitions can be precomputed. Let us assume that our machine's word size is 32 bits. There are $\lceil n/32 \rceil$ words necessary to store one transition value. We need to bit-or m values, which takes $m \cdot n/32$ operations. Normally we would need to perform up to n transitions in the original automaton, so for m < 32 this technique should be faster. When m = 2 and n = 32, exactly two results must be united, which can be done with one bit-or operation. As a result, three integer operations must be performed (instead of n/2 on average). The preprocessing phase takes $2 \cdot 2^{16}$ operations.

5.2. Possible further improvements

As noted in Volkov (2008), the average length of an SSW should be quite low (compared to the conjectured (n − 1)²). According to the considerations of Volkov (referring to the paper of Higgins (1988)), the expected SSW length is O(n). More precisely, it can be proved that a randomly chosen n-state automaton with a sufficiently large alphabet is synchronizing with probability tending to 1 as n goes to infinity, and the length of its SSW does not exceed 2n. Therefore, statistically, only a small part of the power automaton must be visited in the search process. Recently, Skvortsov and Tipikin (2011) experimentally estimated the average SSW length for a random n-state automaton over a binary alphabet – it is O(n^0.55). Given the above facts, the described algorithm is not only exact, but also turns out to be quite efficient in the average case. Also, for small n, such an algorithm can be more effective than some polynomial-time solutions.

One more improvement can be introduced. It aims at reducing the runtime for certain types of automata (such as the Černý automata) whose SSWs are of the form $w^k v$, with $w, v \in A^*$. A similar heuristic was described in Trahtman (2006) to enhance the greedy algorithm. For each word wa such that $|\delta(Q, w)| > |\delta(Q, wa)|$, $w \in A^*$, $a \in A$, the powers of this word are checked; it may turn out that $(wa)^k$ is a synchronizing word (just as in the case of the Černý automaton). Many slowly synchronizing automata (i.e. with long SSWs) fall into this category. Note that by using this heuristic only a small fraction of the power automaton is visited, even though the length of the SSW is quadratic and (normally) a vast number of states would have to be traversed. Experimental data show that employing this heuristic for the Černý automata reduces the number of visited states from $\Theta(1.5^n)$ to $\Theta(\sqrt{2^n})$. A sketch of the core test of this heuristic follows.
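Given the transformation f: Q → Q induced by a candidate word, we must decide whether some power of the word synchronizes. The image sizes of f, f², f³, ... are non-increasing, and once they stop shrinking they never shrink again, so the iteration below always terminates. A C++ sketch (ours; 0-based states):

#include <vector>

// Does w^k synchronize for some k, where f is the transformation induced
// by the word w, i.e. f[q] = delta(q, w)?
bool power_synchronizes(const std::vector<int>& f) {
    const int n = static_cast<int>(f.size());
    std::vector<int> g = f;                    // g = f^k, starting at k = 1
    int prev_size = n + 1;
    for (;;) {
        std::vector<char> in_image(n, 0);
        for (int q = 0; q < n; ++q) in_image[g[q]] = 1;
        int size = 0;
        for (char b : in_image) size += b;
        if (size == 1) return true;            // g is constant: w^k resets
        if (size == prev_size) return false;   // image stopped shrinking
        prev_size = size;
        std::vector<int> h(n);                 // h = f^(k+1)
        for (int q = 0; q < n; ++q) h[q] = g[f[q]];
        g.swap(h);
    }
}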
5.3. Performance comparison

Table 1 shows how the described optimizations affected performance. The results are compared with the popular synchronization tool TESTAS (v. 1.0); Fig. 2 shows the results graphically. The measurements clearly show that the optimizations are very effective. Version 4 (using raw arrays of integers and the partial power automaton concept), while significantly faster, cannot be used for larger automata. The other solutions, while slower, scale better and can be used for much larger automata, as long as the reachable power automaton fits in memory. Version 3 (using hash tables and the described fast bit-vector) should be used for medium-size automata; it can be more effective than version 4 for smaller automata if the power automaton is relatively small. Version 4 is superior for small (n < 30), slowly synchronizing automata. Note that all of these approaches are faster than the algorithm used in TESTAS, due to the described complexity improvements.

Table 1. Runtime (in seconds) for different implementations and different sizes of input Černý automata.

No | Algorithm version            | C12    | C14    | C18
0  | TESTAS: SSW                  | <1.0 s | 18 s   | timeout
1  | set of int                   | 0.34 s | 3.3 s  | 60 s
2  | hash_set of vector of int    | 0.07 s | 0.34 s | 9.7 s
3  | hash_set of fast_bitset      | 0.01 s | 0.07 s | 2.0 s
4  | int array                    | 0.0 s  | 0.01 s | 0.2 s

[Fig. 2. Graphical representation of performance for Cn.]

Fig. 2 shows performance for growing n. It must be noted that memory conservation plays a major role here. In the case of slowly synchronizing automata the compact structures enable cache-friendly (therefore fast) computations for small n, while for larger n they are necessary for the algorithm to work at all, due to memory requirements. The performance of version 3 supports this conclusion: it should be as fast as standard implementations using the power automaton, yet, unlike them, it can handle larger automata (n > 30) as long as the reachable power automaton is small. Computations were performed on a laptop manufactured in 2004 (Athlon XP-M 2800+ at 2.13 GHz, 768 MB of DDR RAM at 133 MHz).

6. New greedy algorithm

In this section we introduce a new greedy algorithm. It is based on SYNCHROP, which itself is based on EPPSTEIN. Both algorithms focus on choosing a pair of states which should be synchronized in the next step and applying the proper word to the whole automaton. We can reformulate this task equivalently in terms of pair automaton states (both EPPSTEIN and SYNCHROP use this structure) as choosing the state which should be transformed to the singleton state 0. From now on, all procedures described below refer to the pair automaton $\mathcal{A}^2 = (Q', A, \delta')$ of the original input automaton $\mathcal{A} = (Q, A, \delta)$. The choice of state depends on the set P of states that we are actually in, that is, the set $P = \delta'(Q', w)$, where w is the catenation of the words found in all previous steps. This set will be called the active states set. The choice is based on an evaluation of the arrangement of the active states in the pair automaton. In SYNCHROP the evaluation is based on the following heuristic:

Heuristic 1. Let us define the distance d(p) of a state p = {s1, s2} to the singleton state 0 as

$$d(p) = \min_{w \in A^*} \{|w| : \delta(s_1, w) = \delta(s_2, w)\}. \quad (3)$$

Let w be a word synchronizing some state q. The bigger the difference between d(p) and d(δ'(p, w)), the more profitable is the selection of w in the next algorithm step, because the distance of δ'(p, w) to the singleton state is smaller.

The idea contained in the above heuristic is utilized in SYNCHROP in the following way: we compute the differences between states p and δ'(p, w), where w is the shortest synchronizing word for the pair q, as

$$\Delta_q(p, w) = \begin{cases} d(\delta'(p, w)) - d(p) & \text{if } p \neq q, \\ 0 & \text{if } p = q. \end{cases} \quad (4)$$
We compute $\Delta_q(p, w)$ for all active states except the singleton state. Let X be the set of all active states in the pair automaton. We define

$$\Phi_1(w) = \sum_{p \in X} \Delta_q(p, w). \quad (5)$$

Having $\Phi_1(w)$ for all the shortest words that synchronize pairs of states, we choose the one with the smallest $\Phi_1$ and apply it to the automaton. The modified version of SYNCHROP, SYNCHROPL, uses the function $\Phi_2$ – a modification of $\Phi_1$ – which takes into account the length of the word:

$$\Phi_2(w) = \sum_{p \in X} \Delta_q(p, w) + |w| = \Phi_1(w) + |w|. \quad (6)$$

Thanks to this penalty component, shorter words are preferred. This is a good solution in the case where there are two or more candidate words with the same $\Phi_1$ value.

Let us consider the complexity of SYNCHROP. The preprocessing part is the same as in EPPSTEIN and can be done in $O(n^3 + |A| n^2)$. The new part is the choice of $s_1$ and $s_2$, which are to be synchronized. This is done O(n) times. To compute $\Phi_1(w)$ we have to process all active states in order to compute the $\Delta_{\{s_1, s_2\}}(p, w)$ values, and there are $O(n^2)$ active states. The $\Phi_1$ value has to be computed for all pairs of states of the input automaton ($O(n^2)$ of them). Hence, the total complexity of SYNCHROP is $O(n^3 + |A| n^2 + n(n^2 \cdot n^2)) = O(n^5 + |A| n^2)$.

6.1. FASTSYNCHRO – a better SYNCHROP

Compared to NATARAJAN and EPPSTEIN, SYNCHROP and SYNCHROPL give good results, but have high complexity. In this section we present a modification of SYNCHROP with improved complexity which still preserves the quality. This algorithm will be called FASTSYNCHRO.

The first modification with respect to SYNCHROP is the way the synchronizing word is created. Instead of computing $\Phi$ for the words synchronizing pairs of states of $\mathcal{A}$, we compute it for all letters from A. This modified function will be denoted by $\Phi_3$. Then we choose the letter that minimizes $\Phi_3$; this letter will be denoted by s. Let X be the set of active states in the pair automaton, A the alphabet and d(p) the distance of p to the singleton state. We define

$$\Phi_3(l) = \sum_{p \in X} \left( d(\delta'(p, l)) - d(p) \right), \quad l \in A, \qquad s = \arg\min_{l \in A} \Phi_3(l).$$

The letter s is applied to all active states and added at the end of the currently found word, increasing its length by 1. The drawback of this solution is that it does not guarantee that we finally find a synchronizing word – we do not know whether the number of active states will decrease after some number of steps. Therefore, we use $\Phi_3$ only when it improves the arrangement of all active states in the pair automaton (that is, when $\Phi_3 < 0$). If $\Phi_3 \geq 0$, we use the $\Phi_2$ function to find a word that guarantees synchronization, and hence necessarily decreases the number of active states. However, we introduce one restriction. In SYNCHROP the greatest impact on the complexity comes from the computation of $\Phi$ for all pairs of states and all the shortest words that synchronize these pairs. This can be done in $O(n^4)$. We reduce it to $O(n^3)$ by reducing the number of processed words from quadratic to linear order of magnitude: we choose only the n shortest words synchronizing the pairs (if there are fewer than n such words, we choose all of them). Such a choice, inspired by EPPSTEIN, is simple to implement and gives better results on average than the other choices of words we checked.
The above modifications are given in Listing 8.

Algorithm 8 FASTSYNCHRO($\mathcal{A}$)
1: Input: an automaton $\mathcal{A} = (Q, A, \delta)$
2: Output: a synchronizing word w (if it exists)
3: w ← ε
4: $\mathcal{A}^2$ ← pair automaton of $\mathcal{A}$
5: if $\mathcal{A}$ is not synchronizing return null
6: perform Eppstein preprocessing // see Section 4.4
7: X ← Q // X is the set of active states
8: count ← 0 // a counter
9: while |X| > 1
10:   a ← arg min_{l ∈ A} {Φ3(l)}
11:   if Φ3(a) < 0 AND count++ < |Q|²
12:     w ← w·a; X ← δ(X, a)
13:   else
14:     compute Φ2 for the min{|Q|, (|X|² − |X|)/2} shortest words synchronizing the active states (denote this set of words by Y)
15:     v ← arg min_{y ∈ Y} {Φ2(y)}
16:     w ← w·v; X ← δ(X, v)
17: return w

Let n = |Q|. The complexity of line 4 is $O(|A| n^2)$. In line 5 we check whether the automaton is synchronizing; this requires a BFS on the pair automaton with reversed transitions, which takes $O(|A| n^2)$. The Eppstein preprocessing in line 6 takes $O(|A| n^2 + n^3)$. Now consider the instructions in the while loop. The cost of line 10 is the cost of computing $\Phi_3$ for all letters – $O(|A| n^2)$. Applying the letter or word to all active states (lines 12 and 16) takes O(n), thanks to the Eppstein preprocessing. The cost of line 14 is $O(n^3)$, thanks to the restriction on the number of processed words. It remains to bound the number of iterations of the while loop. Each application of v to the set of active states reduces their number by at least 1, so lines 14–16 will be executed at most n − 1 times. Alternatively, line 12 may be executed, but it does not necessarily reduce the number of active states. Therefore, to keep a tight rein on the total complexity, we set a restriction on the number of executions of line 12; in tests we noticed that setting the limit to $n^2$ had no influence on the algorithm's results. Summarizing, the total cost of lines 4–6 is $O(|A| n^2 + n^3)$ and the cost of the instructions inside the while loop is $O(n \cdot n^3) + O(n^2 \cdot |A| n^2)$. This gives us the following theorem.

Theorem 1. FASTSYNCHRO works in $O(|A| n^4)$ time.

6.2. Experiments and comparison

In this section we present the results of experiments on the heuristic algorithms. We focus on efficiency (the running time) and quality (the length of the synchronizing word found). We also make some remarks on the FASTSYNCHRO algorithm. We tested five heuristic algorithms: EPPSTEIN, CYCLE, SYNCHROP, SYNCHROPL and FASTSYNCHRO.

6.2.1. Efficiency

Efficiency has a great impact on an algorithm's usability. In this subsection we present an efficiency comparison for EPPSTEIN, NATARAJAN, SYNCHROP and FASTSYNCHRO. The algorithms were tested for n ∈ {10, 20, ..., 300}. For each n, one hundred random automata were generated such that $\forall a \in A\ \forall p, q \in Q:\ \Pr(\delta(p, a) = q) = 1/n$. If a generated automaton was not synchronizing, the procedure was repeated. Tests were performed for |A| ∈ {2, 10} (Figs. 3 and 4). We also performed tests on Černý automata (Fig. 5). It is clearly seen from Fig. 3 that SYNCHROP, due to its high complexity, is worse than the other algorithms. The runtimes of the other algorithms are comparable. The results for |A| = 10 do not differ much from those for |A| = 2: running times are of course higher, but the differences between NATARAJAN, EPPSTEIN and FASTSYNCHRO remain small.
Tests on Černý automata were performed to check how the algorithms work for automata with long synchronizing words. We can see a noticeable decrease of efficiency in the case of FASTSYNCHRO; EPPSTEIN and NATARAJAN work much faster in this case.

6.2.2. Quality

In order to compare the algorithms in the widest possible context, the tests were performed for three different quality measures. If the number of all automata for a given number of states and alphabet size was reasonable, the algorithms were tested on all such automata. If the number of such automata was too big, we reduced the tests to a subset of all possible random automata. All algorithms were run on the same set of automata.

The first quality measure is the mean difference between the length of the word found by the algorithm and the SSW length. Denote by $ALG(\mathcal{A})$ the word returned by algorithm ALG for automaton $\mathcal{A}$. Let X be the set of all automata that were given as input to ALG. Formally, we can define the first measure as

$$M_1(X) = \frac{\sum_{\mathcal{A} \in X} (|ALG(\mathcal{A})| - |SSW(\mathcal{A})|)}{|X|}. \quad (7)$$

Table 2 shows that CYCLE and EPPSTEIN, despite their speed, do not give good results in terms of M1. SYNCHROP and SYNCHROPL, based on Heuristic 1, are much better. FASTSYNCHRO is comparable to them and sometimes outperforms them (n = 5, 6, 10). The second quality measure, M2, is the ratio of the number of cases in which ALG found an SSW to the number of all cases:

$$M_2(X) = \frac{\sum_{\mathcal{A} \in X} [\,|ALG(\mathcal{A})| = |SSW(\mathcal{A})|\,]}{|X|}, \quad (8)$$

where [expr] = 1 if expr is true and 0 otherwise. Notice how the quality decreases when the alphabet size is increased from 2 to 10. The ordering of the algorithms is the same as in the case of the M1 measure. To test the algorithms on automata with a larger number of states, we need a measure which does not involve computing the SSW length. Therefore, as M3 we take the mean length of the synchronizing words found by a given ALG. The use of this measure is meaningful only in relative comparisons of two or more algorithms.

[Fig. 3. Efficiency for automata with |A| = 2.]
[Fig. 4. Efficiency for automata with |A| = 10.]
[Fig. 5. Efficiency for Černý automata.]

Table 2. Quality of algorithms in terms of M1, M2 and M3. C, EP, SP, SPL, FS correspond to the CYCLE, EPPSTEIN, SYNCHROP, SYNCHROPL and FASTSYNCHRO algorithms.

n, |A| | C | EP | SP | SPL | FS | X

Measure M1
3, 2   | 0.20 | 0.16 | 0    | 0    | 0    | all automata
4, 2   | 0.36 | 0.33 | 0.06 | 0.03 | 0.04 | all automata
4, 3   | 0.45 | 0.42 | 0.08 | 0.05 | 0.05 | all automata
5, 2   | 0.64 | 0.55 | 0.21 | 0.17 | 0.13 | all automata
6, 2   | 0.91 | 0.77 | 0.38 | 0.34 | 0.24 | all automata
10, 2  | 1.97 | 1.63 | 1.00 | 0.96 | 0.78 | 10^5 random automata
10, 10 | 1.78 | 1.77 | 0.72 | 0.69 | 0.54 | 10^5 random automata
20, 2  | 4.18 | 3.49 | 2.18 | 2.07 | 2.01 | 10^5 random automata
20, 10 | 3.45 | 3.31 | 1.61 | 1.54 | 1.54 | 10^5 random automata

Measure M2
3, 2   | 0.80 | 0.84 | 1    | 1    | 1    | all automata
4, 2   | 0.71 | 0.72 | 0.94 | 0.97 | 0.97 | all automata
4, 3   | 0.64 | 0.65 | 0.93 | 0.96 | 0.95 | all automata
5, 2   | 0.57 | 0.60 | 0.85 | 0.87 | 0.89 | all automata
6, 2   | 0.47 | 0.51 | 0.76 | 0.77 | 0.82 | all automata
10, 2  | 0.25 | 0.28 | 0.49 | 0.50 | 0.57 | 10^5 random automata
10, 10 | 0.12 | 0.12 | 0.41 | 0.43 | 0.54 | 10^5 random automata
20, 2  | 0.08 | 0.10 | 0.23 | 0.24 | 0.28 | 10^5 random automata
20, 10 | 0.02 | 0.02 | 0.13 | 0.14 | 0.16 | 10^5 random automata

Measure M3
50, 2   | 26.21 | 24.44 | 21.75 | 21.53 | 21.93 | 10^4 random automata
50, 10  | 16.32 | 15.49 | 12.84 | 12.71 | 13.00 | 10^4 random automata
100, 2  | 40.75 | 37.53 | 33.16 | 32.84 | 33.95 | 10^3 random automata
100, 10 | 25.30 | 23.41 | 19.84 | 19.61 | 20.78 | 10^3 random automata
$$M_3(X) = \frac{\sum_{\mathcal{A} \in X} |ALG(\mathcal{A})|}{|X|}. \quad (9)$$

In terms of M3, FASTSYNCHRO gives slightly worse results than SYNCHROP and SYNCHROPL; however, these results are still much better than those of EPPSTEIN and CYCLE.

6.3. Analysis of FASTSYNCHRO behavior

In this section we investigate the behavior of FASTSYNCHRO. The analysis will allow us to explain the decrease of FASTSYNCHRO's efficiency shown in Fig. 5. We check what impact the different parts of the algorithm have on the process of building the synchronizing word. Recall that in FASTSYNCHRO the synchronizing word is extended in two ways: the first is to choose a letter a ∈ A and apply it to the set of active states; the other is to use $\Phi_2$ to find a pair of states and transform the set of active states by the word synchronizing this pair. We will refer to these two ways as the first and the second part of the algorithm.

We performed an experiment for random automata. By g1 (resp. g2) we denote the number of executions of the first (resp. second) part of the algorithm. By k we denote the mean length of the synchronizing word found by FASTSYNCHRO and by k* the estimated value of the SSW length for random automata over a binary alphabet (Skvortsov & Tipikin, 2011). Notice that each execution of the first part of the algorithm corresponds to the generation of exactly one letter added to the synchronizing word being constructed. The value k − g1 expresses the number of letters added as a result of second-part executions.

Table 3. Behavior of FASTSYNCHRO for random automata.

n, |A| | g1 | g2 | k | k* | k − g1 | (k − g1)/g2 | X
10, 2   | 6.52  | 0.44 | 7.52  | 6.92  | 0.99  | 2.28 | 10 000
20, 2   | 10.11 | 0.77 | 12.26 | 10.13 | 2.15  | 2.80 | 10 000
50, 2   | 16.12 | 1.63 | 22.07 | 16.77 | 5.95  | 3.65 | 10 000
100, 2  | 21.69 | 2.83 | 34.07 | 24.55 | 12.38 | 4.38 | 1 000
200, 2  | 27.38 | 4.56 | 51.81 | 35.94 | 24.42 | 5.35 | 1 000
10, 10  | 4.01  | 0.02 | 4.04  | n/a   | 0.02  | 1.00 | 10 000
20, 10  | 6.68  | 0.21 | 6.91  | n/a   | 0.22  | 1.07 | 10 000
50, 10  | 12.00 | 0.76 | 13.02 | n/a   | 1.01  | 1.34 | 10 000
100, 10 | 17.60 | 1.92 | 20.88 | n/a   | 3.28  | 1.71 | 1 000
200, 10 | 24.32 | 4.16 | 32.72 | n/a   | 8.40  | 2.02 | 1 000

From Table 3 we can see that with an increasing number of states the fraction of second-part executions also grows. The ratio (k − g1)/g2 is the mean length of the word added during one second-part execution. This value remains relatively small and grows slightly with the number of states. When |A| is increased to 10, the influence of the second part decreases: although its frequency is almost the same as in the case |A| = 2, the mean length of the word added by each execution is 2–3 times smaller.

The same experiment was performed for Černý automata (Table 4).

Table 4. Behavior of FASTSYNCHRO for Černý automata.

n | g1 | g2 | k | k − g1 | (k − g1)/g2
10  | 24   | 3 | 81    | 57    | 19
20  | 67   | 4 | 361   | 294   | 73
50  | 211  | 5 | 2401  | 2190  | 438
100 | 520  | 6 | 9801  | 9281  | 1546
200 | 1238 | 7 | 39601 | 38363 | 5480

These experiments explain why the runtime depends so much on the length of the synchronizing word found by the algorithm: the number of executions of the algorithm's first part increases significantly. Also, the words found by executions of the algorithm's second part are no longer short. Fortunately, automata with long SSWs are very rare, so the case of the Černý automata is exceptional.

7. Conclusions

We presented some efficient data structures for the exact (exponential) synchronizing algorithm. Their application to the well-known algorithm that uses a power-set automaton makes the algorithm more effective than existing implementations. We also presented a new greedy synchronizing algorithm and compared it with some previously known greedy algorithms. Experiments show that our FASTSYNCHRO algorithm in general works better (that is, finds shorter synchronizing words) and usually works in a comparable time or faster than other methods. For larger automata FASTSYNCHRO works twice as long as EPPSTEIN, but it finds much shorter synchronizing words. When one wants to find a synchronizing word, two factors have to be considered: the quality (the length of the synchronizing word) and the time. If time is a key issue, the optimal choice would be the EPPSTEIN algorithm.
But if quality is much more important (and this is usually the case in industrial testing of electronic circuits, where one has to apply the same synchronizing word to thousands or millions of copies of a circuit), the best choice is to use our new FASTSYNCHRO algorithm.

References

Ananichev, D. S., & Volkov, M. V. (2003). Synchronizing monotonic automata. Lecture Notes in Computer Science, 2710, 111–121.

Björn, K. (2005). Beyond the C++ standard library: An introduction to Boost. Addison-Wesley.

Broy, M., Jonsson, B., Katoen, J.-P., Leucker, M., & Pretschner, A. (2005). Model-based testing of reactive systems. Advanced lectures. Lecture Notes in Computer Science, 3072.

Černý, J. (1964). Poznámka k homogénnym experimentom s konečnými automatmi. Matematicko-fyzikálny Časopis Slovenskej Akadémie Vied, 14, 208–215.

Černý, J., Pirická, A., & Rosenauerová, B. (1971). On directable automata. Kybernetika, 7(4), 289–298.

Deshmukh, R. G., & Hawat, G. N. (1994). An algorithm to determine shortest length distinguishing, homing, and synchronizing sequences for sequential machines. In Proc. Southcon 94 conference (pp. 496–501).

Eppstein, D. (1990). Reset sequences for monotonic automata. SIAM Journal on Computing, 19(3), 500–510.

Fukada, A., Nakata, A., Kitamichi, J., Higashino, T., & Cavalli, A. R. (2001). A conformance testing method for communication protocols modeled as concurrent DFSMs. In ICOIN (pp. 155–162).

Higgins, P. M. (1988). The range order of a product of i transformations from a finite full transformation semigroup. Semigroup Forum, 37, 31–36.

Hyunwoo, C., Somenzi, F., & Pixley, C. (1993). Multiple observation time single reference test generation using synchronizing sequences. In Proc. IEEE European conf. on design automation (pp. 494–498).

Kari, J. (2002). Synchronization and stability of finite automata. Journal of Universal Computer Science, 8(2), 270–277.

Klyachko, A. A., Rystsov, I. K., & Spivak, M. A. (1987). An extremal combinatorial problem associated with the bound of the length of a synchronizing word in an automaton. Cybernetics and Systems Analysis, 23(2). Translated from Kibernetika, No. 2, 1987, pp. 16–20, 25.

Lee, D., & Yannakakis, M. (1996). Principles and methods of testing finite state machines – a survey. Proceedings of the IEEE, 84, 1090–1123.

Natarajan, B. K. (1986). An algorithmic approach to the automated design of part orienters. In Proc. IEEE symposium on foundations of computer science (pp. 132–142).

Olschewski, J., & Ummels, M. (2010). The complexity of finding reset words in finite automata. Lecture Notes in Computer Science, 6281, 568–579.

Pin, J.-E. (1983). On two combinatorial problems arising from automata theory. Annals of Discrete Mathematics, 17, 535–548.
Pixley, C., Jeong, S.-W., & Hachtel, G. D. (1994). Exact calculation of synchronizing sequences based on binary decision diagrams. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 13(8), 1024–1034.

Pomeranz, I., & Reddy, S. M. (1998). On synchronizing sequences and test sequence partitioning. In Proc. 16th IEEE VLSI test symposium (pp. 158–167).

Ponce, A. M., Csopaki, G., & Tarnay, K. (1994). Formal specification of conformance testing documents for communication protocols. In 5th IEEE international symposium on personal, indoor and mobile radio communications (Vol. 4, pp. 1167–1172).

Roman, A. (2009). Synchronizing finite automata with short reset words. Applied Mathematics and Computation, 209(1), 125–136.

Skvortsov, E., & Tipikin, E. (2011). Experimental study of the shortest reset word of random automata. Lecture Notes in Computer Science, 6807, 290–298.

Trahtman, A. N. (2006). An efficient algorithm finds noticeable trends and examples concerning the Černý conjecture. Lecture Notes in Computer Science, 4162, 789–800.

Trahtman, A. N. (2009). The road coloring problem. Israel Journal of Mathematics, 1(172), 51–60.

Volkov, M. V. (2008). Synchronizing automata and the Černý conjecture. Lecture Notes in Computer Science, 5196, 11–27.

Zhao, Y., Liu, Y., Guo, X., & Zhang, C. (2010). Conformance testing for IS-IS protocol based on E-LOTOS. In IEEE int. conf. on information theory and information security (pp. 54–57).