UIUCDCS-R-72-496
COO-2118-0028

Methodological Aspects of Scene Segmentation

By Peter Raulefs

August 1972

Department of Computer Science
University of Illinois at Urbana-Champaign
Urbana, Illinois 61801

This work was supported in part by Contract AT(11-1)-2118 with the U.S. Atomic Energy Commission.

Digitized by the Internet Archive in 2013
http://archive.org/details/methodologicalas496raul

ACKNOWLEDGMENT

The author would like to express his gratitude to Professor Bruce H. McCormick, who introduced him to the area of image processing and contributed to this thesis through numerous discussions and suggestions. The author is also grateful to the members of the Illiac III group for pleasant and helpful cooperation. In particular, he acknowledges the help of Mrs. Judy Arter in typing and improving this thesis through editing, and he appreciates the excellent drawings done by Mr. Stanley Zundo.

TABLE OF CONTENTS

1 INTRODUCTION
2 SCENE SEGMENTATION
  2.1 Property Space Representation
    2.1.1 Local Representation of Pictures
    2.1.2 Definition and Representation of Regions
    2.1.3 Criterion for Partitioning Pictures into Regions
    2.1.4 Density and Entropy of Sets in the Property Space
    2.1.5 Formal Description of Scene Segmentation into Regions
    2.1.6 Metrics in the Property Space
  2.2 The Clustering Approach to Scene Segmentation
    2.2.1 A Formal Concept of Clustering
    2.2.2 Application of Clustering to Scene Segmentation
    2.2.3 Review of Some Clustering Techniques
    2.2.4 A Combined Clustering Technique for Scene Segmentation
3 STRUCTURAL ANALYSIS OF SCENES
  3.1 Data Representation of Composite Scenes
  3.2 Synthesis of Composite Objects in Scenes
    3.2.1 Synthesis of Composites Using Graph Transformation Rules
    3.2.2 Synthesis of Composites Using Cartesian Covers
    3.2.3 Matching Models to Scenes
4 CONCLUSIONS
REFERENCES

LIST OF FIGURES

1a (6,3) - Regular Tessellation of the Plane
1b (4,4) - Regular Tessellation of the Plane
1c (3,6) - Tessellation of the Plane
2a (4,4) - Tessellation Which is 4-Imbedded in Another (4,4) - Tessellation
2b Cohesions of Simple Regions
3 Cohesions of Different Regions in a Scene

CHAPTER 1
INTRODUCTION

We consider scene analysis to consist of two major parts: picture preprocessing and structural analysis.

Scenes are viewed as pictorial representations of an assembly of more or less well known objects. In image processing research, a scene is usually given as a digitized picture constituting the output of a scanning device. The raster points of a digitized picture are referred to as picture points. The preprocessing phase consists of partitioning the picture into sets of picture points. Subsequently, these sets have to be interpreted as representations of certain objects (structural analysis).

The basic idea of this paper is that the representation of objects in a scene is given by sets of picture points such that the local properties of points within such a set are more similar to each other than those belonging to different sets. Such a set of picture points will be called a region. The idea of partitioning scenes into regions has proven to yield a convenient data structure for subsequent processing techniques (see [BF70], [GUZ70], [BP70], [PR70] and Sect. 3 of this paper). Although this approach to region definition is closely related to the concept of clustering, a precise definition has not yet been given. It is shown in Sect. 2.1 that the information-theoretical quantity entropy is useful in characterizing a region. In Sect. 2.2, entropy is used to define the cohesion of a region, and this term gives a tool to analyze clustering methods and to specify an algorithm giving an optimal scene segmentation.

After a picture has been partitioned into regions, each region must be interpreted as a part of a known object. This requires a model of how objects may be represented and which relations between these representations must hold. Recognition can then be viewed as finding an optimal match of regions to a collection of model objects. Although we do not pursue a solution to this problem in this paper, regions are used to define a data structure which is applied to the approaches to scene analysis briefly reviewed in Sect. 3.

CHAPTER 2
SCENE SEGMENTATION

2.1 Property Space Representation

2.1.1 Local Representation of Pictures

A picture P is considered to be given as a finite set of points in the plane that are associated with local attributes.

Definition 2.1 Let $P$ be a finite set of discrete points in the plane. A local attribute, $A_t$, is a function

$$A_t : P \to V_t = V'_t \cup \{*\}, \qquad p_m \mapsto A_t(p_m),$$

mapping each point $p_m \in P$ into the value set, $V_t$, where $V_t$ consists of a finite set $V'_t$ of real numbers and the don't care value, $*$.

Remarks:
1. An attribute $A_t$ assumes the don't care value, $A_t(p_m) = *$, at a point $p_m \in P$ whenever no real number is specified as the value of $A_t$ at $p_m$.
2. We assume that the attributes $A_t$ are always specified so that the values in $V'_t$ are naturally ordered real numbers, but we do not assume any ordering between the elements in $V'_t$ and $*$.
3. We do not assume that a distance function is known in any of the sets $V_t$.

With reference to some coordinate system in the plane, we adopt the convention that for any point, $p \in P$, with coordinates $x_1$ and $x_2$, the attributes $A_1$ and $A_2$ are defined by $A_i(p) = x_i$ $(i = 1, 2)$.
With these notions we consider a picture, $P$, to be represented as a finite set of points in the plane such that each picture point, $p$, is associated with an N-dimensional vector $(A_1(p), \ldots, A_N(p))$ of attribute values.

Definition 2.2 Let $A = \{A_1, \ldots, A_N\}$ be a set of local attributes. The property space $X(A)$ is defined as the set

$$X(A) = \prod_{t=1}^{N} V_t .$$

Definition 2.3 Let $P$ be a picture consisting of $M$ points $p_1, \ldots, p_M$. The property space representation of $P$ is defined to be the set $P = \{x_1, \ldots, x_M\}$ of $M$ points in $X(A)$:

$$P = \{\, x_m \mid \text{for } 1 \le m \le M:\ x_m = (A_1(p_m), \ldots, A_N(p_m)) \,\}.$$

Remark: To obtain a convenient representation of pictures, the don't care value, $*$, may be associated with a real number, $x_*$, which is not contained in any of the sets $V'_1, \ldots, V'_N$. We denote by $X'(A) = \prod_{t=1}^{N} V'_t$ the subset of $X(A)$ that contains the N-tuples of all specified attribute values.

Example
To illustrate the property space representation of a picture, we consider the gray value of picture points as an attribute $A_3$. The gray value is defined by quantizing the possible intensities of light associated with picture points and assigning a number to each intensity interval thus obtained. The corresponding property space, $X(\{A_1, A_2, A_3\})$, is three-dimensional, where two dimensions are used to describe the coordinates and the third to give the gray values of picture points. As the gray values of single picture points are often influenced by noise effects, attributes obtained, for example, by averaging the gray values over neighborhoods may be considered.

2.1.2 Definition and Representation of Regions

The property space representation can be considered as a local description of pictures. A step towards a more global description is the formation of regions. This is motivated by the following considerations:

1. From evaluating the importance of various parts of pictures for human perception of form, F. Attneave concluded that the "information content" of a black-white picture is concentrated at points where the gradient of the density distribution of black points is large ([ATT54], [AA66]). It is suggested in [ATT54] that reducing a picture to a contour-line drawing reveals some redundancy among points circumscribed by closed contour-lines. This interpretation of experimental results in terms of information theory has been seriously questioned in [GC66]. But we can safely infer that the representation of pictures as an arrangement of regions (1) leads to a higher stage of conceptualization, and (2) reduces the number of code words necessary to represent a picture for automated picture processing.

2. Scene segmentation into regions is a convenient step of generalization allowing the application of graph processing techniques. A similar data structure has been used in [BF70], [GUZ70], [BP70], and [PR70].

A. Regions in Discrete Point Sets

Definition 2.4 Let $P$ be a set of $M$ points in the plane such that the coordinates $x = (x_1, x_2)$ of each point in $P$ are defined with respect to some coordinate system. For any $\varepsilon > 0$, a subset $U \subseteq P$ is called $\varepsilon$-chain-connected if for any pair $x, x' \in U$, there is a chain $t^1, \ldots, t^k \in U$ such that:
(i) $x = t^1$ and $x' = t^k$;
(ii) for $j = 2, \ldots, k$: $\lVert t^{j-1} - t^{j} \rVert < \varepsilon$,
where $\lVert x \rVert = \sqrt{x_1^2 + x_2^2}$ denotes the Euclidean norm of a vector, $x = (x_1, x_2)$.

Definition 2.5 An $\varepsilon$-chain-connected set of picture points in a picture, $P$, is called an $\varepsilon$-region in $P$ for any $\varepsilon > 0$.

B. Regions in Cellular Images

In many applications, the acquisition of pictorial data is done by scanning pictures at all points of a pre-selected regular grid. A systematic method of defining a grid is to tessellate the plane into (q) regular p-gons as discussed in [GR71]. It can be shown (cf. [GR71]) that there are only three regular tessellations of the plane, for (p,q) = (6,3), (4,4) and (3,6).
Then, a regular grid, $G$, is obtained by placing grid points at the barycenters of all p-gons. Each p-gon is called a cell.

Definition 2.6 Let $P$ be a set of $M$ points of a regular grid, $G$, obtained by a regular tessellation of the plane into (q) regular p-gons (cells). A set, $U \subseteq P$, is G-chain-connected if for any two points $p, p' \in U$, there is a chain $q_1, \ldots, q_k \in U$ such that:
(i) $p = q_1$ and $p' = q_k$;
(ii) for $j = 2, \ldots, k$: $c(q_{j-1})$ and $c(q_j)$ are edge-adjacent or vertex-adjacent,
where $c(q)$ is the cell containing the point $q$.

Definition 2.7 A G-chain-connected set of picture points in a picture, $P$, is called a G-region in $P$.

Regular tessellations and G-regions are illustrated in Figures 1a-c.

[Figure 1a: (6,3) - Regular Tessellation of the Plane, showing picture points and a G-region]
[Figure 1b: (4,4) - Regular Tessellation of the Plane, showing picture points and a G-region]
[Figure 1c: (3,6) - Tessellation of the Plane, showing picture points and a G-region]

2.1.3 Criterion for Partitioning Pictures Into Regions

Early approaches to region finding, such as [ROB65], employ two steps: (1) edge finding; and (2) fitting the line drawing obtained after (1) to a model. In this section, we are interested in finding regions without regard to a model, thus assuming an early stage of image preprocessing. A different method, proposed in [MP67], bypasses the edge finding phase by directly forming regions as the union of squares with approximately the same gray value. In [BF70], regions are constructed by joining "atomic regions" according to two heuristic rules. In our notation, these "atomic regions" are chain-connected sets of picture points whose properties satisfy certain equivalence relations, e.g. equality of gray values. The method of [BF70] is based on the notion of the "strength of boundary", which is defined as the sum of the distances (in the plane) of all adjacent points separated by the boundary. The heuristics are then aimed at merging regions if the boundaries separating them are weak.
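Definitions 2.6 and 2.7 can be made concrete for the (4,4) tessellation. The following is a minimal sketch, assuming that picture points are given as integer (row, column) cell indices of a square grid, so that edge- or vertex-adjacency amounts to the eight surrounding cells. The function name `is_g_chain_connected` and the convention that the empty set counts as connected are illustrative assumptions, not part of the report.

```python
from collections import deque


def is_g_chain_connected(cells):
    """Return True if the set of (row, col) cells is G-chain-connected.

    Assumption: a (4,4) tessellation, where two cells are edge- or
    vertex-adjacent exactly when their indices differ by at most 1
    in each coordinate.
    """
    cells = set(cells)
    if not cells:
        return True  # convention chosen for this sketch
    start = next(iter(cells))
    seen = {start}
    queue = deque([start])
    # Breadth-first search along chains of adjacent cells.
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                neighbor = (r + dr, c + dc)
                if neighbor in cells and neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
    # Connected iff every cell was reached from the start cell.
    return seen == cells
```

A diagonal run of cells such as (0,0), (1,1), (2,2) is a G-region under this reading because consecutive cells are vertex-adjacent, whereas (0,0) and (0,2) alone are not.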
The region finding mechanism of these approaches can be generalized to the following scheme:

1. Define a similarity measure for ($\varepsilon$- resp. G-) chain-connected regions represented in the property space.
   Remark: A set consisting of one picture point only is always chain-connected.
2. Assuming some threshold value $\theta$, merge any two chain-connected regions to a single region if their similarity is larger than $\theta$.
   Remark: Two regions $S_1$ and $S_2$ are ($\varepsilon$- resp. G-) chain-connected if any pair $(p_1, p_2) \in S_1 \times S_2$ is chain-connected.

The definition of a similarity measure depends on defining a metric in the property space. Then, the threshold value, $\theta$, can be specified as that distance in the property space up to which regions are considered to be similar.

By definition of the property space representation of pictures, the planar coordinates are also coordinates in the property space X. Consequently, the above partitioning scheme yields a segmentation into regions such that the density distribution of points in X is more uniform in each region than in the entire picture. This can be further generalized to the following criterion:

[Region Criterion] An optimal partition of a picture into regions is given by:
1. minimizing the average intra-set dispersion within regions, and
2. maximizing the average inter-set dispersion between regions.

To obtain a formal and precise statement of this criterion, the concept of a density distribution of points in the property space will be introduced in the next section. With this notation, we shall see that the [Region Criterion] can very well be interpreted in terms of information theory.

2.1.4 Density and Entropy of Sets in the Property Space

As the property space, $X(A)$, is discrete and finite, the definition of a density distribution of points in subsets of $X(A)$ depends critically on the neighborhoods in which densities are evaluated. These densities will be applied by interpreting them as probability densities.
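The two-step merging scheme of Section 2.1.3 can be sketched as follows. This is only an illustration under stated assumptions: the picture lies on a (4,4) grid, the only non-positional attribute is a gray value, the dissimilarity of two regions is taken to be the absolute difference of their mean gray values, and two regions are merged when they touch and this difference does not exceed the threshold `theta`. The names `merge_regions`, `gray` and `theta` are hypothetical, and the choice of mean gray value is not taken from the report.

```python
def merge_regions(gray, theta):
    """Greedy threshold merging of chain-connected regions.

    gray  : dict mapping (row, col) -> gray value (assumed attribute)
    theta : largest mean-gray difference at which regions are merged
    Returns a list of regions, each a set of (row, col) points.
    """

    def mean_gray(region):
        return sum(gray[p] for p in region) / len(region)

    def adjacent(a, b):
        # The union of two chain-connected regions is chain-connected
        # iff some pair of their cells is edge- or vertex-adjacent.
        return any(abs(r1 - r2) <= 1 and abs(c1 - c2) <= 1
                   for (r1, c1) in a for (r2, c2) in b)

    def find_mergeable_pair(regions):
        for i in range(len(regions)):
            for j in range(i + 1, len(regions)):
                if (adjacent(regions[i], regions[j])
                        and abs(mean_gray(regions[i])
                                - mean_gray(regions[j])) <= theta):
                    return i, j
        return None

    # Step 1: every single picture point is a chain-connected region.
    regions = [{p} for p in sorted(gray)]
    # Step 2: merge similar, touching regions until none remain.
    pair = find_mergeable_pair(regions)
    while pair is not None:
        i, j = pair
        regions[i] |= regions.pop(j)
        pair = find_mergeable_pair(regions)
    return regions
```

The result of such a greedy procedure generally depends on the order in which pairs are examined, which is one reason the text goes on to seek a criterion that does not depend on a particular merging order.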
Therefore, we have to establish non-overlapping neighborhoods to define normalized densities. The lattices considered in crystallography are a useful notion for this purpose, as lattices are generated by applying elements of the translation group to unit cells (see, for example, [KIT63]).

The property space $X(A)$ is imbedded in the N-dimensional vector space, $R^N$, of N-tuples of real numbers. As points in $X(A)$ may be distributed quite irregularly, we establish neighborhoods by partitioning $R^N$ into N-dimensional cubes with sides of length $\ell$.

Definition 2.8 Let $\{\boldsymbol{\ell}_1, \ldots, \boldsymbol{\ell}_N\}$ be an ordered set of orthogonal vectors in $R^N$ spanning an N-dimensional cube with sides of length $\ell$. The set

$$C(x^0) = \Big\{\, x \in R^N \;\Big|\; \exists\, \lambda_1, \ldots, \lambda_N \in [0,1[\,:\ x = x^0 + \sum_{n=1}^{N} \lambda_n \boldsymbol{\ell}_n \Big\}$$

is called a unit cell with origin $x^0$. We consider translations of the unit cell described by translation operators $T^n_\lambda$:

$$T^n_\lambda[C(x^0)] = \{\, x' \in R^N \mid \exists\, x \in C(x^0):\ x' = x + \lambda \boldsymbol{\ell}_n \,\}$$

for $1 \le n \le N$ and $\lambda = 0, \pm 1, \pm 2, \ldots$

Definition 2.9 A standard tessellation, $T(x^0, \ell)$, of $R^N$ into unit cells is defined to be the set of all unit cells such that:
(i) $C(x^0)$ is a unit cell with origin $x^0$ and sides of length $\ell$;
(ii) if $C(x') \in T(x^0, \ell)$, then there are translations $T^{n_1}_{\lambda_1}, \ldots, T^{n_r}_{\lambda_r}$ such that

$$T^{n_1}_{\lambda_1}\big(T^{n_2}_{\lambda_2}(\cdots T^{n_r}_{\lambda_r}(C(x^0)) \cdots)\big) = C(x') .$$

Remarks:
1. By Definition 2.8, all unit cells contained in a standard tessellation of $R^N$ do not overlap.
2. Whenever the length, $\ell$, is not relevant, a standard tessellation, $T(x^0, \ell)$, will be referred to as $T(x^0)$.

Definition 2.10 Let $S$ be a set of $M$ points in $X'(A)$ and $T(x^0)$ a standard tessellation. The normalized density, $\rho_S(x)$, of $S$ at a point $x \in S$ is defined by

$$\rho_S(x) = \frac{1}{M}\, |S \cap C| ,$$

where $C \in T(x^0)$ is a unit cell with $x \in C$.

The normalized density has the properties $\sum_{x \in S} \rho_S(x) = 1$ and $0 \le \rho_S(x) \le 1$ for all $x \in S$. Therefore, we can utilize $\rho_S$ to define the entropy of the set $S$.
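The normalized density of Definition 2.10 can be computed by counting points per unit cell. The sketch below assumes an axis-aligned standard tessellation with origin `origin` and side length `side`, and it reports one density value per occupied cell, on the assumption that the normalization to one is meant per cell rather than per point. The function names are illustrative.

```python
from collections import Counter
from math import floor


def cell_index(x, origin, side):
    """Integer index of the half-open unit cell [0, 1[ containing x."""
    return tuple(floor((xi - oi) / side) for xi, oi in zip(x, origin))


def normalized_density(points, origin, side):
    """Map each occupied unit cell C to |S ∩ C| / M.

    Assumption: the tessellation is axis-aligned, so a cell is
    identified by the integer translation that carries the unit cell
    at `origin` onto it.
    """
    counts = Counter(cell_index(p, origin, side) for p in points)
    m = len(points)
    return {cell: n / m for cell, n in counts.items()}
```

For example, four points of which two fall into the same cell give the densities 0.5, 0.25 and 0.25 for the three occupied cells; taken once per occupied cell they sum to one, which is the property needed to treat them as probabilities.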
Definition 2.11 The entropy $H(S)$ of a set $S$ of $M$ points in $X'(A)$ for a given standard tessellation, $T(x^0)$, is defined by

$$H(S) = - \sum_{x \in S} \rho_S(x) \log \rho_S(x) .$$

The entropy, $H(S)$, has the usual properties discussed in information theory and thermodynamics: $0 \le H(S)$ [...]

[...] $M > 1$ points in $X(A)$ is given as

$$\bar{H}(S^X) = \frac{1}{\log M}\, H(S^X) .$$

For $M = 1$: $\bar{H}(S^X) = 0$.

Remark: The normalized entropy of a set, $S^X$, in $X(A)$ satisfies $0 \le \bar{H}(S^X) \le 1$.

For a given segmentation, $S = \{S_1, \ldots, S_L\}$, of a picture into $L$ regions, the average intra-set dispersion is measured by means of

$$\bar{H}(S) = \frac{1}{L} \sum_{\ell=1}^{L} \bar{H}(S^X_\ell) .$$

Minimizing the average intra-set dispersion is equivalent to maximizing $\bar{H}(S)$ over all possible partitions of $S$ into regions.

2.1.6 Metrics in the Property Space

A. Objectives

The decomposition of scenes into regions depends decisively on how the distance of any two points in the property space is evaluated. We will restrict ourselves to considering weighted Euclidean metrics, i.e., for any two points $x, y \in X$, the distance is given as

$$d(x, y) = \Big\{ \sum_{n=1}^{N} [\omega_n (x_n - y_n)]^2 \Big\}^{1/2}$$

where $\omega_n > 0$ is the weight for the n-th coordinate. This definition of a distance is equivalent to multiplying each coordinate of vectors in X with a particular weight (see [SEB62]). Hence, $(\omega_i \Delta_i) > (\omega_j \Delta_j)$ can be interpreted as regarding the i-th coordinate to be more important than the j-th coordinate, where $\Delta_\ell = \max\{x_\ell\} - \min\{x_\ell\}$.

The necessity of specifying the relative importance of each coordinate with respect to all others can be seen by applying the Theorem of the Ugly Duckling ([WAT69a], [WAT69b]) to vectors in the property space, X: Let each coordinate of vectors in X represent a predicate with values 0 or 1, and let all possible predicates be represented as coordinate axes of X. Then, with each predicate, its negation is also a coordinate axis of X.
The Theorem of the Ugly Duckling states that any pair of two vectors in X are as similar to each other as any other pair of two vectors, where the similarity is given as the number of corresponding coordinates which are equal. This "similarity" is identical to the Hamming distance of binary sequences. The Theorem of the Ugly Duckling still holds whenever any finite number of values can be attained by each coordinate and the similarity is defined in the same way as the number of equal, corresponding coordinates. As predicates with a finite number of discrete values are the properties defined in Section 2.1.1, we conclude that to introduce dissimilarities between vectors in X, the properties have to be selected and weighted.

The procedure of attaching weights to properties is usually referred to as feature selection. As the relative importance of features is not known a priori, we have to give a training algorithm to determine the weights. Although the problem of feature selection has been extensively investigated in the literature, these studies do not lead to efficient algorithms decomposing scenes into regions. An approach to feature selection using entropy maximization is given in [TOU69]. This method has characteristics typical of many other feature selection techniques: it is assumed that (1) the number of classes to be recognized is known and that (2) the components of feature (i.e. property) vectors are normally distributed random variables. Both assumptions are usually not applicable to picture segmentation.

To obtain an algorithm that computes a metric from a given training set of examples, we consider the following system:

1. A training set, $\{P_1, \ldots, P_t, \ldots, P_T\}$, of examples is given, consisting of several pictures already decomposed into regions.

2. The distance between two points $x, y \in X$ is given by the weighted Euclidean distance

$$d(x, y) = \Big\{ \sum_{n=1}^{N} \omega_n^2 (x_n - y_n)^2 \Big\}^{1/2}$$

with the constraint $\prod_{n=1}^{N} \omega_n = 1$.
This constraint guarantees that metrics derived from pictures differing only by a linear transformation (e.g. shrinkage) will be equal (cf. [SEB62]).

Remark: Without the constraint $\prod_{n=1}^{N} \omega_n = 1$, $d$ is not necessarily a metric, but only a pseudo-metric (see [BIR67]), as $d(x, y) = 0$ is possible even when $x \ne y$ for some weight vectors $\omega = (\omega_1, \ldots, \omega_N)$.

3. The weights $\{\omega_1, \ldots, \omega_N\}$ are to be adjusted so that $P_1, \ldots, P_T$ satisfy the [Region Criterion] of Section 2.1.5.

According to the [Region Criterion], $H(S)$ is to be maximized for the training set of sample pictures. To obtain a maximized $H(S)$, we have to adjust the weight vector, $\omega$, so that the density is distributed as uniformly as possible within each region of $P_1, \ldots, P_T$.

B. Algorithm to Determine a Pseudo-Metric in the Property Space

Maximization Problem: We assume that a standard tessellation of X and a training set $S = \{S_1, \ldots, S_\ell, \ldots, S_L\}$ of $L$ regions is given, where the $\ell$-th region, $S_\ell$, consists of $M_\ell$ cell groups. At this point, the coordinate axes of X have some arbitrary scaling and we assume a weight vector $\omega$ to be $\omega = (1, \ldots, 1)$. Any change of $\omega = (\omega_1, \ldots, \omega_N)$ changes a coordinate value, $x_n$, to $\omega_n x_n$. If the unit cell, $C$, of the standard tessellation was initially given by some origin, $x^0 = (x_{01}, \ldots, x_{0N})$, and $\Delta x = (\Delta x_1, \ldots, \Delta x_N)$, this will change to $x^{0\prime} = (\omega_1 x_{01}, \ldots, \omega_N x_{0N})$ and $\Delta x' = (\omega_1 \Delta x_1, \ldots, \omega_N \Delta x_N)$. The number of elements of $S_\ell$ contained in the unit cell that contains a vector, $x \in X$, is denoted by $\Delta n(x, \Delta x)$. Thus, the normalized density, $\rho(S_\ell, x)$, is given by

$$\rho(S_\ell, x) = \frac{\Delta n(x, \Delta x)}{M_\ell\, \Delta x_1 \cdots \Delta x_N} ,$$

and

$$H(S_\ell) = - \int_{x \in S_\ell} \Big\{ \frac{\Delta n(x, \Delta x)}{M_\ell\, \Delta x_1 \cdots \Delta x_N} \log\Big( \frac{\Delta n(x, \Delta x)}{M_\ell\, \Delta x_1 \cdots \Delta x_N} \Big) \Big\}\, dx_1 \cdots dx_N .$$

With the weight vector, $\omega$, and the constraint, $\prod_{n=1}^{N} \omega_n = 1$, this entropy becomes:

$$H_\omega(S_\ell) = - \frac{1}{M_\ell\, \Delta x_1 \cdots \Delta x_N} \int_{x' \in S_\ell} \Delta n(x', \Delta x') \log\Big( \frac{\Delta n(x', \Delta x')}{M_\ell\, \Delta x_1 \cdots \Delta x_N} \Big)\, dx'_1 \cdots dx'_N .$$

Hence, $H_\omega(S_\ell)$ attains a maximum when the distribution of $\Delta n(x', \Delta x')$ is as uniform as possible for varying $\omega$.

The Algorithm OPTW-H:

1. Select some values $\Delta z_1, \Delta z_2, \ldots, \Delta z_N$ with $\Delta z_n > 0$;

2. For each point $x \in S_\ell$ simultaneously: take $x$ as the center of a polycylinder, and expand $\pm \Delta z_n$ in each dimension $n$, $1 \le n \le N$;

3. Repeat step 2 for all $x, x' \in S_\ell$ that have not yet reached the following stopping criterion: For some $n$, the expansion performed in step 2 leads to a nonempty intersection with the interior of an adjacent polycylinder with a point $x' \in S_\ell$ as center. If this criterion is reached for some $n$, $x$, and $x'$, then the last expansion $\pm \Delta z_n$ in step 2 is deleted and further expansions along the n-th coordinate are terminated.

4. If all expansions in step 2 are terminated by the criterion of step 3: For $1 \le m \le M_\ell$, the polycylinder around $x_m \in S_\ell$ is given by its centerpoint, $x_m$, and its radii $r_n^{(m)} = \eta_n^{(m)} \Delta z_n$, where $\eta_n^{(m)}$ is the number of expansions around $x_m$ in the n-th dimension. Hence, the weight $\omega_n$ for the n-th coordinate is given by

$$\omega_n = \frac{1}{M_\ell} \sum_{m=1}^{M_\ell} \frac{1}{r_n^{(m)}} .$$

The result of this algorithm depends critically on the selection of $\Delta z_1, \ldots, \Delta z_N$ in step 1, i.e. on the coarseness of the chosen raster.

2.2 The Clustering Approach to Scene Segmentation

The basic idea of clustering is to partition a set of objects into subsets such that each subset contains objects which are as similar to each other as possible.
Each of these subsets is called a cluster (a review of clustering techniques developed before 1965 is given in [BAL65]). In order to investigate the applicability of clustering techniques to scene segmentation, the concept of clustering will be precisely stated and related to that of region finding in the next section. Subsequently, existing clustering techniques will be reviewed as far as they are used to obtain a clustering approach to scene segmentation.

2.2.1 A Formal Concept of Clustering

The underlying concept for the definition of a cluster, being developed in this section, is that a cluster, $R$, is a collection of objects such that the cohesion between all objects in $R$ is somewhat larger than the sum of cohesions within the sets of any partition of $R$. A frequently used interpretation of cohesion is "average similarity". However, to apply clustering to region finding techniques, cohesion will be construed analogously to the mutual information defined in information theory.

Definition 2.15 For any standard tessellation, $T(x^0)$, of the property space, X, a standard tessellation, $T'(x^{0\prime})$, is called K-embedded in $T(x^0)$ iff $K$ is the smallest positive integer such that there exist $K$ translations $T^{n_1}_{\lambda_1}, \ldots, T^{n_K}_{\lambda_K}$ in $T'$ with

$$C = \bigcup_{k=1}^{K} T^{n_k}_{\lambda_k}(C') ,$$

where $C$ is a unit cell of $T(x^0)$ and $C'$ a unit cell of $T'(x^{0\prime})$.

The following is an example of a (4,4)-tessellation which is 4-embedded in another (4,4)-tessellation:

[Figure: a (4,4)-tessellation with an embedded tessellation]

Definition 2.16 Let $T(x^0)$ and $T'(x^{0\prime})$ be standard tessellations of X and $T'(x^{0\prime})$ be K-embedded in $T(x^0)$. The cohesion $\zeta(R)$ of a set, $R$, consisting of $P$ $T(x^0)$-cells, $C_1, \ldots, C_P$, is defined by:

$$\zeta(R) = \sum_{p=1}^{P} H(C_p) - H(R) .$$

Examples
1. [Figure: tessellation $T(x^0)$ and tessellation $T'(x^{0\prime})$; each cell contains 4 points of X; $T'(x^{0\prime})$ is 4-embedded in $T(x^0)$.] $P = 4$, cohesion: 1.375 bits.
2. [Figure] Cohesion: 4.00 bits.
3. [Figure] Cohesion: -2.00 bits.

Cohesions for different scene segmentations are illustrated in Figures 2 and 3.

[Figure 2a: (4,4) - Tessellation Which is 4-Imbedded in Another (4,4) - Tessellation]
[Figure 2b: Cohesions of Simple Regions (cohesion: 1.375 bits; cohesion: 4.00 bits)]
[Figure 3: Cohesions of Different Regions in a Scene. Cohesions: $\zeta(S_1)$ = 3.75 bits, $\zeta(S_2)$ = 3.78 bits, $\zeta(S_3)$ = 2.24 bits, $\zeta(S_4)$ = 2.62 bits, $\zeta(S_5)$ = 3.75 bits, $\zeta(S'_1)$ = 1.94 bits, $\zeta(S'_2)$ = 1.52 bits]

The cohesion, $\zeta(R)$, can be interpreted in terms of information theory as the average mutual information of the $T(x^0)$-cells constituting $R$ on the ensemble of $T'(x^{0\prime})$-cells belonging to $R$. In the context of measure theory (cf. [HAL50]), it is shown in [WAT69a] that $\zeta$ is a supra-additive measure, i.e. for two sets, $R_1$ and $R_2$, $\zeta(R_1 \cup R_2) \ge \zeta(R_1) + \zeta(R_2)$ if $\zeta(R_1) \cdot \zeta(R_2) > 0$. This is due to the fact that the entropy is a sub-additive measure, i.e. $H(R_1 \cup R_2) \le H(R_1) + H(R_2)$. If the cohesion $\zeta(R)$ is negative, then for some $i$ $(1 \le i \le P)$, $H(C_i) = 0$. The extrema are given by

$$-\log P \le \zeta(R) \le P \log K - \log(P \cdot K) .$$

Definition 2.16 now permits us to define a cluster.

Definition 2.17 Let $T(x^0)$ be a standard tessellation such that there exists another standard tessellation, $T'(x^{0\prime})$, that is K-embedded in $T(x^0)$ with $K > 1$. A chain-connected set, $R$, of $P$ $T(x^0)$-cells is called an $\varepsilon$-cluster with respect to $T'(x^{0\prime})$, if $\varepsilon > 0$ is the largest number such that there exists a partition $\{R_1, \ldots, R_P\}$ of $R$ with

$$\zeta(R) - \sum_{p=1}^{P} \zeta(R_p) > \varepsilon .$$
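The cohesion of Definition 2.16 can be checked numerically against the two extreme values quoted above. The sketch below assumes axis-aligned tessellations with origin at zero, a coarse cell side `coarse_side` for $T(x^0)$ and a fine cell side `fine_side` for the embedded tessellation, base-2 logarithms (bits), and entropies summed once per occupied fine cell. The function names and the point layouts used in the usage note are illustrative and are not the report's figures.

```python
from collections import Counter
from math import floor, log2


def entropy(points, side):
    """Entropy in bits of the point counts per cell of side `side`."""
    counts = Counter(tuple(floor(c / side) for c in p) for p in points)
    m = len(points)
    return -sum((n / m) * log2(n / m) for n in counts.values())


def cohesion(points, coarse_side, fine_side):
    """Sum of the entropies of the coarse cells minus the entropy of R.

    Assumption: R is the set of occupied coarse cells; all entropies
    are evaluated on the embedded (fine) tessellation.
    """
    groups = {}
    for p in points:
        key = tuple(floor(c / coarse_side) for c in p)
        groups.setdefault(key, []).append(p)
    within = sum(entropy(g, fine_side) for g in groups.values())
    return within - entropy(points, fine_side)
```

With four coarse cells, each split into four fine cells (P = 4, K = 4), one point in every fine cell gives 4 x 2 - 4 = 4 bits, the upper bound P log K - log(P·K); a single point per coarse cell gives 0 - 2 = -2 bits, the lower bound -log P. These agree with the cohesion values of 4.00 and -2.00 bits stated in Examples 2 and 3, although the exact point layouts of those examples are not recoverable from the figures.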
2.2.2 Application of Clustering to Scene Segmentation

The task of finding an optimal partitioning of a picture into regions satisfying the [Region Criterion] can usually not be solved in a reasonable time by comparing all possible partitions. Instead, we employ the following strategy:

(1) Using a parameterized heuristic rule, find disjoint chain-connected sets of picture points such that each set is part of exactly one of the regions to be found.
(2) Each of the sets determined in (1) is used as a core for a region by applying a "grow algorithm" to each set so that eventually the entire picture is partitioned.
(3) Changing the parameters of step (1) allows further repartitioning.

Psychologically, the decomposition of pictures into closed domains is usually guided by two aspects [ZUS70]: The attributes of points in a domain (1) do not vary very much in their distribution, and (2) are quite similar to each other.

The first point of view led to the [Region Criterion] of Section 2.1.5. The second suggests the application of a clustering technique. To develop a method which employs both ideas, we make the following heuristic assumption [HA]:

[HA] Each region of a satisfactorily decomposed picture contains exactly one distinguished cluster.

A "distinguished cluster" in this context is an $\varepsilon$-cluster such that for any other $\varepsilon'$-cluster contained in the same region, $\varepsilon \gg \varepsilon'$. Applying [HA] to the above strategy, we obtain the following algorithm for segmentation of a scene S:

Algorithm S:
1. Select some value $\varepsilon$;
2. Determine all $\varepsilon$-clusters in $S$, obtaining the clusters $S'_1, S'_2, \ldots, S'_L$;
3. Apply the algorithm GROW until all points in $S$ are contained in one of the sets $S_\ell$, $1 \le \ell \le L$, obtained by joining new points to $S'_\ell$.

The algorithm GROW joins points to a previously established cluster center in accordance with the [Region Criterion]:

Algorithm GROW: Let $x \in S$ be a point not yet contained in any of the sets $S_1, \ldots, S_L$ of step 3 in the algorithm S. Perform the operation $S_j \leftarrow S_j \cup \{x\}$ if
(1) $S_j \cup \{x\}$ is a chain-connected set, and
(2) for all $\lambda \ne j$: $H(S_j \cup \{x\}) \ge H(S_\lambda \cup \{x\})$.

2.2.3 Review of Some Clustering Techniques

The necessity of using a clustering technique arises in step 2 of algorithm S. Although many such techniques have been suggested (cf. [BAL65], [SS63]), their applicability in our case is severely restricted by the following:

1. The number of points to be clustered is very large (> 10^[...]), but the number of attributes (gray value, color, etc.) is smaller than, for example, those considered in the numerical taxonomy of biological objects [SS63].
2. It is not possible to make a priori assumptions about the probability density distribution of points in the property space.
3. Similarity measures as well as the metric in the property space are subject to pre-clustering considerations and may be adapted during the clustering procedure.

Under these restrictions, we consider (1) a probabilistic and (2) a graph-theoretical technique. Their applicability to our scene segmentation approach will be evaluated in Section 2.2.4.

1. A Probabilistic Clustering Technique

We consider the probabilistic clustering technique of the following scheme [TSY71]:

For a given set, X, of patterns, $x$, find $L$ probability density distributions, $P_1, \ldots, P_L$, such that the mixture density, $P(x) = \sum_{\ell=1}^{L} P_\ell\, P_\ell(x \mid \ell)$, attains each of its maxima ("modes") in exactly one of the disjoint, chain-connected subsets $X_1, \ldots, X_L$ of X with $\bigcup_{\ell=1}^{L} X_\ell = X$.

This scheme satisfies restrictions 2 and 3 and is often referred to as an example of "self-learning" [TSY71]. In order to reduce the computational complexity and in accordance with restriction 1, we reduce this scheme to estimating modes.
After having found all modes, the remaining points in X can be classified according to the nearest neighbor classification rule, which assigns a point, x, to the set containing its nearest neighbor. It is shown in [CH67] that the probability of error for this rule is less than twice the error probability of Bayesian decision rules.

The following method for multivariate mode-seeking in a set, X, is proposed in [HF70] and [MD65]:

1. Compute the maximum eigen-vector (associated with the largest eigen-value) of the covariance matrix of X;
2. Project X onto its maximum eigen-vector;
3. Determine the extrema of the one-dimensional probability density obtained in step 2;
4. Partition X with hyperplanes perpendicular to the eigen-vector found in step 1 and intersecting the eigen-vector at the locations of the relative minima found in step 3;
5. If only one extremum was found in step 3, the above procedure is repeated, starting at step 1 with the eigen-vector associated with the next largest eigen-value;
6. For each new domain found in step 4, the above procedure is repeated from step 1.

2. Graph-Theoretical Clustering Techniques

a. Matula's Clustering Concept

In [MAT70] and [MAT71], D.W. Matula introduced a concept of clustering which can be summarized in terms of our notation as follows:

A weighted graph, $G = (\Gamma, E)$, consists of a set, $\Gamma$, of nodes and a set, E, of edges between nodes in $\Gamma$ such that each edge $e \in E$ is associated with a weight, $\omega(e)$. Let S be a set of points representing a picture in the property space, X(A). S can be considered as a complete graph, $G(S) = (S, E(S))$ with $E(S) = \{(x,x') \mid x, x' \in S\}$. We take the weight, $\omega(e)$, of an edge, $e = (x,x') \in E(S)$, to be the reciprocal of the distance between its end points: $\omega(e) = 1/d(x,x')$. Consequently, the weight of an edge in the graph, G(S), can be interpreted as the affinity between its endpoints. The affinity between two disjoint sets, $S', S'' \subset S$, of nodes in G(S) is given by

$\omega(S',S'') = \sum_{x \in S'} \sum_{x' \in S''} \omega(x,x')$.

We consider G(S) to be pruned by specifying the threshold affinity graph, $G^t = \{(x,x') \in E(S) \mid d(x,x') < t\}$, for some $t > 0$. If $\{G_1, G_2\}$ is a partition of any graph G, the cut set, $C(G_1,G_2)$, is given by $C(G_1,G_2) = \{(x,x') \mid x \text{ is a node in } G_1,\ x' \text{ is a node in } G_2\}$. A graph, G, of order $\geq 2$ in which every cut set has at least k edges is called k-edge-connected, and a maximal k-edge-connected subgraph of G is called a k-component of G. For any graph of order $\geq 2$, the edge-connectivity is defined as $\lambda(G) = \min\{|C| \mid C \text{ is a cut set of } G\}$. For any node or edge, x, of G(S), its cohesiveness, h(x), is defined as $h(x) = \max\{\lambda(G') \mid G' \text{ is a subgraph of } G \text{ and } G' \text{ contains } x\}$, and the strength of the graph, G, is $\sigma(G) = \max\{\lambda(G') \mid G' \text{ is a subgraph of } G\}$.

Every k-component ($k \geq 1$) which does not contain a (k+1)-component of G, as well as every trivial component (consisting of one vertex only) of G, is a cluster of G. G itself is a cluster iff $\sigma(G) = \lambda(G)$. Any subgraph k' of a cluster k in G with $\lambda(k') = \lambda(k)$ is called a subcluster of G. Under the relation "is a proper subgraph of", the subclusters of a graph form a partial order with clusters as maximal elements. An algorithm determining the k-components and clusters of a graph is given in [MAT71].

b. Detection of Clusters Using Minimal Spanning Trees

In [ZAH71], an algorithm for detecting clusters using minimal spanning trees is given. The following summary is a reinterpretation of this concept, adapted for application to our scene segmentation approach.

A spanning tree, ST(S), of the graph, G(S), defined under a. is a connected graph containing all nodes of G(S) but no circuits. A minimal spanning tree, MST(S), of G(S) is a spanning tree with a minimal sum of all weights. If $G_1$ and $G_2$ are two interconnected subgraphs of G, their distance, $\rho(G_1,G_2)$, is defined as the minimal weight of all edges connecting $G_1$ and $G_2$. The link set of $G_1$ and $G_2$ is the set of all edges connecting $G_1$ and $G_2$ with weights equal to $\rho(G_1,G_2)$.
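As an illustration of the minimal spanning tree just defined, the following is a minimal sketch that builds MST(S) for a small point set with Prim's algorithm, using Euclidean distances as edge weights. Prim's algorithm is chosen here only for brevity and is not taken from the cited reports; the function name `minimal_spanning_tree` and the sample points are assumptions made for the example.

```python
import math

def minimal_spanning_tree(points):
    """Return the MST edges (i, j, weight) of the complete graph on `points`.

    Hypothetical helper: Prim's algorithm with Euclidean edge weights.
    """
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known edge weight linking each node to the tree
    parent = [-1] * n       # tree node at the other end of that cheapest edge
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the node outside the tree that is cheapest to attach.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u, best[u]))
        # Relax the attachment cost of all nodes still outside the tree.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return edges

# Illustrative data: two well separated groups of points.
points = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5)]
mst = minimal_spanning_tree(points)
```

For these points the tree has four edges, three of weight 1 and one long edge joining the two groups; that long edge is the kind of edge a cluster detection step would consider cutting.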
Any subset, C, of S is called a $\delta$-clump iff for any partition $\{C_1, C_2\}$ of C, $\rho(C, S-C) - \rho(C_1,C_2) \geq \delta$ with $\delta > 0$. It is shown in [ZAH71] that the restriction of an MST(S) to a $\delta$-clump in S is a connected subtree of the MST(S). A $\delta$-clump is a set whose members are bound together more strongly than the set is bound to nodes outside it, whereas an $\varepsilon$-cluster of Definition 2.17 is completely defined by its own internal properties without respect to its environment.

An efficient algorithm for determining a minimum spanning tree of a graph is given in [SEP70], and others are referenced in [ZAH71]. The strategy to detect an $\varepsilon$-cluster, given in [ZAH71], is to determine inconsistent edges, whose weights differ significantly from the average weight of edges in the MST.

2.2.4 A Combined Clustering Technique for Scene Segmentation

When applying a graph clustering technique to a set, S, of points in the property space, S is regarded as a complete graph, G(S). Clusters, as defined in [MAT71], have to be obtained by taking appropriate values for t and searching for k-components in $G^t$ for some k. [MAT70] and [MAT71] do not give a way to choose t and k. A connection between the concepts of [ZAH71] and [MAT70], [MAT71] is established by the following theorem:

(*) Let G be a complete graph and MST(G) be a minimal spanning tree of G. Then, the threshold affinity graph, $G^t$, partitions G into complete subgraphs, $G_0^t, G_1^t, \ldots, G_{n(t)}^t$, iff there are exactly n(t) edges, $(x_1,x_1'), \ldots, (x_{n(t)},x_{n(t)}')$, in MST(G) with $d(x_i,x_i') \geq t$ for $i = 1, \ldots, n(t)$. The nodes of the subgraphs, $G_0^t, \ldots, G_{n(t)}^t$, are those of the subtrees of MST(G) after cutting all edges with weights $\geq t$.

Proof: This property follows immediately from Theorems 1-3 in [ZAH71]. Any edge in a minimal spanning tree connects two subtrees, and by Theorem 2 of [ZAH71], it is the edge with a smallest weight among all edges connecting the nodes of both subtrees in the complete graph containing all nodes of the tree.
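Property (*) can be checked numerically on a small example. The sketch below, written under the assumption that the partition is read as the node sets of the connected components, compares the components of the threshold graph (all edges of the complete graph with d < t) with the subtrees left after removing every MST edge of weight ≥ t. The helper names `components` and `kruskal_mst`, the sample points, and the value t = 3 are illustrative assumptions; Kruskal's algorithm is used only because it is short.

```python
import itertools
import math

def _find(parent, a):
    """Union-find root lookup with path halving."""
    while parent[a] != a:
        parent[a] = parent[parent[a]]
        a = parent[a]
    return a

def components(n, edges):
    """Node sets of the connected components of a graph on nodes 0..n-1."""
    parent = list(range(n))
    for i, j in edges:
        parent[_find(parent, i)] = _find(parent, j)
    groups = {}
    for v in range(n):
        groups.setdefault(_find(parent, v), set()).add(v)
    return {frozenset(g) for g in groups.values()}

def kruskal_mst(n, weighted_edges):
    """MST edges (weight, i, j), selected in order of increasing weight."""
    parent = list(range(n))
    tree = []
    for w, i, j in sorted(weighted_edges):
        ri, rj = _find(parent, i), _find(parent, j)
        if ri != rj:            # edge joins two different subtrees
            parent[ri] = rj
            tree.append((w, i, j))
    return tree

points = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5), (10, 0)]
n = len(points)
all_edges = [(math.dist(points[i], points[j]), i, j)
             for i, j in itertools.combinations(range(n), 2)]
mst = kruskal_mst(n, all_edges)

t = 3.0  # cut level (assumed value for the example)
threshold_parts = components(n, [(i, j) for w, i, j in all_edges if w < t])
mst_parts = components(n, [(i, j) for w, i, j in mst if w < t])
cut_edges = [e for e in mst if e[0] >= t]
```

Both constructions give the same node sets here, and the number of parts exceeds the number of cut MST edges by one, so only the edges of the tree need to be inspected when choosing t.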
Property (*) allows us to reduce the problem of choosing the cut level, t, from considering all edges in the complete graph, G, to looking at the edges of MST(G) only.

The strength of a complete graph, G, with M nodes is [...]. The weight, $\omega(x)$, of a node x is given by [...] $\sum_{e \in E(x)}$ [...], where E(x) is the set of all edges having x as one endpoint and $\omega(e) = 1/d(x,x')$ if x' is the other endpoint of e. For $|E(x)| = 1$, we define $\omega(x) = 0$. If $\omega(G) = \sum_{x} \omega(x)$, summed over all nodes x in G, the normalized weight of a node is denoted by $\omega^{*}(x) = \omega(x)/\omega(G)$.

Interpreting the normalized weight of a node as a probability density, we can now apply the mode-seeking algorithm indicated in Section 2.2.3 to minimal spanning trees. The resulting modes are nodes in the MST to be used as cluster centers. The distance between two nodes in a tree is given by the sum of all weights $\omega(e)$ of the edges forming the connecting path between the nodes. After having determined the modes of an MST, a partition into subtrees can be obtained by applying the nearest neighbor classification technique.

Combining the preceding concepts, we obtain the following algorithm to partition a chain-connected set, S, of points in the property space into regions:

Algorithm C

1. In the complete graph, G(S), determine the minimal spanning tree, MST(S), using the distances in the property space as weights of edges in G(S);
2. Compute normalized weights of all nodes in MST(S);
3. Apply the mode-seeking algorithm of Section 2.2.3 to MST(S), yielding modes $x_1^C, x_2^C, \ldots, x_n^C$, defined as the cluster modes in MST(S);
4. Using $\{x_1^C\}, \ldots, \{x_n^C\}$ as initial classes, apply the nearest neighbor classification method to obtain a partition, $\{S_1, \ldots, S_n\}$, of chain-connected sets in S;
5. Using parameters $\theta_1$ and $\theta_2$, repartition $\{S_1, \ldots, S_n\}$ by applying the algorithms MERGE and SLICE.
   a. Algorithm MERGE: Merge two adjacent subtrees, $S_i$ and $S_j$, of MST(S) if $s(S_i \cup S_j, S_i, S_j) > \theta_1$, for some parameter $\theta_1$.
   b. Algorithm SLICE: Partition a subtree, $S_i$, of MST(S) into subtrees, $S_{i1}, \ldots, S_{in(i)}$, if applying steps 1 to 4 of Algorithm C yields a partition $\{S_{i1}, \ldots, S_{in(i)}\}$ of subtrees in $S_i$ with $c(S_i; S_{i1}, \ldots, S_{in(i)}) > \theta_2$.

CHAPTER 3

STRUCTURAL ANALYSIS OF SCENES

3.1 Data Representation of Composite Scenes

We assume that the following results of the scene segmentation procedure described in Chapter 2 are given:

I. A scene, S, has been partitioned into M regions, $S_1, \ldots, S_M$;

II. Considering a set $A^r = \{A_1^r, \ldots, A_T^r\}$ of T attributes applicable to regions, the vector $(A_1^r(S_m), \ldots, A_T^r(S_m))$ has been evaluated for each region, $S_m$, in S.

III. Let $R^r = \{R_1^r, \ldots, R_S^r\}$ be a set of functions $R_s^r: S \times S \to W_s^{r*} = W_s^r \cup \{*\}$ with $1 \leq s \leq S$, $(S_i,S_j) \mapsto R_s^r(S_i,S_j)$, $1 \leq i,j \leq M$, mapping pairs of different regions into a value set, $W_s^{r*}$, which consists of a discrete finite set, $W_s^r$, of real numbers and the don't care value, *. For each pair, $(S_i,S_j)$, of different regions, the vector $(R_1^r(S_i,S_j), \ldots, R_S^r(S_i,S_j))$ has been evaluated.

Remarks:

1. The attributes $A_1^r, \ldots, A_T^r$ map regions into discrete, finite value sets, $V_t$ ($1 \leq t \leq T$), as do the local attributes introduced in Definition 2.1. Two different types of such attributes can be considered:
   a. attributes obtained by averaging local attributes over all picture points contained in a region. (Example: The arithmetic mean of gray values of points in a region.)
   b. [...] (Example: An attribute indicating the shape of a region, e.g. with a value set {1, 2, 3, *}, where 1 = circular, 2 = triangular, 3 = rectangular, and * = (not 1, 2, or 3 or not specified).)

2. The functions, $R_1^r, \ldots, R_S^r$, can be interpreted as binary relations with many values. (Example: The quantized distance, $d^r(S_i,S_j)$, of two regions, $S_i$ and $S_j$, is

$d^r(S_i,S_j) = \left[ \frac{1}{|S_i|\,|S_j|} \sum_{x_i \in S_i} \sum_{x_j \in S_j} \|x_i - x_j\| \right]$,

where $\|x\|$ denotes the Euclidean norm of a point, $x = (x_1, x_2)$, in the plane and [r] denotes the largest integer less than or equal to r.)

3. Although n-ary relations with n > 2 may be used to describe properties of regions in a scene, the restriction to binary relations, like the functions $R_s^r$, does not restrict the generality of scene description. It is shown in [MON71] that n-ary relations with n > 2 can be reasonably well represented or approximated by binary relations.

We consider two representations of scenes:

A. The Relationship Matrix - The relationship matrix, $R = [r_{ij}]$, is introduced in [CR71]. For a scene partitioned into M regions, $S_1, \ldots, S_M$, R is an M × M matrix. The entries, $r_{ij}$, of R are:

   for $i \neq j$: S-dimensional vectors, $(R_1^r(S_i,S_j), \ldots, R_S^r(S_i,S_j))$;
   for $i = j$: T-dimensional vectors, $(A_1^r(S_i), \ldots, A_T^r(S_i))$.

A complete description of all properties of a region, $S_i$, is provided by the i-th row and the i-th column of R. R is symmetric if all functions, $R_s^r \in R^r$, are symmetric with respect to their arguments, $(S_i,S_j) \in S \times S$.

B. The Digraph Representation - A scene, $S = \{S_1, \ldots, S_M\}$, can be represented as a digraph, $G^r = (\Gamma^r, E^r)$, whose node set is $\Gamma^r = \{S_1, \ldots, S_M\}$. The set, $E^r$, of edges in $G^r$ consists of labelled and weighted edges. There is an edge, $e_{ij}^u$, such that:

   for $i \neq j$: u, with $1 \leq u \leq S$, indicates the subscript of a function, $R_u^r$, in $R^r$, and $e_{ij}^u$ is associated with the weight $\omega(e_{ij}^u) = R_u^r(S_i,S_j)$;
   for $i = j$: u, with $1 \leq u \leq T$, indicates the subscript of an attribute, $A_u^r$, in $A^r$, and $e_{ii}^u$ is associated with the weight $\omega(e_{ii}^u) = A_u^r(S_i)$.

Remark: To simplify this representation, an edge, $(S_i,S_j)$, with a label, u, is deleted whenever $R_u^r(S_i,S_j) = *$.

3.2 Synthesis of Composite Objects in Scenes

The scene segmentation procedure described in Section 2 results in partitioning scenes into primitive regions. Usually, scenes consist of composites of primitive regions, called figures.
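The digraph representation of Section 3.1 can be sketched with plain dictionaries. The example below is only illustrative: the two regions, the attribute `mean_gray`, the relation `touches`, the `DONT_CARE` marker standing for the don't care value *, and the reading of the quantized distance as the rounded-down average pairwise Euclidean distance are all assumptions introduced here, not definitions from the report.

```python
import math

DONT_CARE = "*"  # assumed marker for the don't care value

def mean_gray(region):
    """Attribute: rounded mean gray value of the points (x, y, gray) in a region."""
    return round(sum(g for _, _, g in region) / len(region))

def quantized_distance(a, b):
    """Relation: average pairwise distance between two regions, rounded down."""
    total = sum(math.dist(p[:2], q[:2]) for p in a for q in b)
    return math.floor(total / (len(a) * len(b)))

def touches(a, b):
    """Relation: 1 if some pair of points is at unit distance, else don't care."""
    near = any(math.dist(p[:2], q[:2]) == 1 for p in a for q in b)
    return 1 if near else DONT_CARE

def build_digraph(regions, attributes, relations):
    """Return {(i, j, u): weight}; loop edges (i, i, u) carry attribute values."""
    edges = {}
    for i, s_i in enumerate(regions):
        for u, attribute in enumerate(attributes):
            edges[(i, i, u)] = attribute(s_i)
        for j, s_j in enumerate(regions):
            if i == j:
                continue
            for u, relation in enumerate(relations):
                value = relation(s_i, s_j)
                if value != DONT_CARE:   # edges labelled with * are deleted
                    edges[(i, j, u)] = value
    return edges

regions = [
    [(0, 0, 10), (1, 0, 12)],      # dark region
    [(4, 0, 200), (5, 0, 202)],    # bright region
]
graph = build_digraph(regions, [mean_gray], [quantized_distance, touches])
```

The relationship matrix can be read off the same dictionary: entry (i, j) collects all weights stored under keys (i, j, u).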
Hence, scene analysis has to cope with synthesizing figures from primitive regions. The synthesis of figures is done by merging regions according to some criteria to be established. However, these criteria depend upon the specific environment and the intentions considered. In this section, the formal structure of two such approaches is briefly reviewed, using the scene representations introduced in Section 3.1.

3.2.1 Synthesis of Composites Using Graph Transformation Rules

Let $G^r = (\Gamma^r, E^r)$ be a digraph representing the scene $S = \{S_1, \ldots, S_M\}$. The synthesis of composites is obtained by merging regions, yielding a new graph $G^c = (\Gamma^c, E^c)$. Nodes in $\Gamma^c$ may either represent primitive regions in S or regions obtained by merging regions in S. The edges between any two nodes in $\Gamma^c$ which are also both nodes in $\Gamma^r$ remain unchanged. All other edges in $E^c$ have to be determined. The function mapping $G^r$ to $G^c$ is a special graph transformation.

It is shown in [SMC70] that many graph transformation rules can be conveniently inferred from invariance principles such as conservation of union and intersection of point sets.

3.2.2 Synthesis of Composites Using Cartesian Covers

The concept of cartesian covers is introduced in [MMC72]. A discrete, n-dimensional set $E = \{(x_1, \ldots, x_n) \mid x_i \in H_i \text{ for } 1 \leq i \leq n\}$ with $H_i = \{0, 1, \ldots, h_i - 1 \mid h_i > 0\}$ ($1 \leq i \leq n$) is considered. An element $(x_1, \ldots, x_n)$ in E is usually denoted by $e_j$ with

$j = x_n + \sum_{i=1}^{n-1} \left( x_{n-i} \prod_{k=0}^{i-1} h_{n-k} \right)$,

and $e_j \in E$ is called an event. A cartesian literal, $X_i^{A_i}$, is defined as the set

$X_i^{A_i} = \{(x_1, \ldots, x_n) \mid x_i \in A_i\}$ for $A_i \subseteq H_i$ ($1 \leq i \leq n$).

A set $L \subseteq E$ is called a cartesian complex if it is represented as the intersection of cartesian literals: $L = \bigcap_{i \in I} X_i^{A_i}$, where $I \subseteq \{1, \ldots, n\}$. L is also called an interval if there are n numbers $a_i, b_i \in H_i$ ($1 \leq i \leq n$) such that $A_i = \{x_i \mid a_i \leq x_i \leq b_i\}$ [...] and $A_2' = \{0, 2\}$; then, $L = X_1^{A_1} \cap X_2^{A_2}$ is an interval, while $L' = X_1^{A_1} \cap X_2^{A_2'}$ is a factor.

Let $f: E \to [0,1] \cup \{*\}$ be a mapping defining the sets $F^{a} = \{e \in E \mid f(e) = a\}$ for $a \in \{0, 1, *\}$, $F^{*} = \{e \in E \mid 0 < f(e) < 1\}$, and for some $\lambda \in [0,1]$: $F^{1\lambda} = \{e \in E \mid f(e) \geq \lambda\}$, $F^{0\lambda} = \{e \in E \mid f(e) < \lambda\}$. A set $D(f|\lambda) = \{L_i \mid i = 1, 2, \ldots\}$ of cartesian complexes is called a cartesian cover of f under $\lambda$ if $F^{1\lambda} \subseteq \bigcup_i L_i \subseteq F^{1\lambda} \cup F^{*}$. If all cartesian complexes in $D(f|\lambda)$ are intervals (factors), $D(f|\lambda)$ is also called an interval (factor) cover. For simplicity, we assume $\lambda = 1$ and write $D(f|\lambda) = D(f)$, $F^{1\lambda} = F^{1}$, and $F^{0\lambda} = F^{0}$.

The application of cartesian covers to the formation of composites consists of two steps:

1. We interpret the property space as an n-dimensional discrete vector space E. Then, the attribute values of a region are represented by an event in E. Thus, a scene $S = \{S_1, \ldots, S_n\}$ is represented in E by n events $e_1, \ldots, e_n$.

2. We establish criteria to define the optimality of a cartesian cover. Then, we try to find an optimal set of cartesian complexes covering the events $e_1, \ldots, e_n$. A composite is formed by joining all events covered by exactly one cartesian complex. An efficient algorithm to form optimal cartesian covers is described in [MR 72].

Examples of criteria defining the optimality of a cartesian cover L are:

a) $E(L \cap F^{1})$ = Max. - maximize the number of events in $F^{1}$ covered by complexes in L;
b) $\ell(L)$ = Min. - minimize the number of literals needed to represent L;
c) $\mathrm{den}(L,F^{1})$ = Max. - maximize $E(L \cap F^{1})$ per literal in L, where $\mathrm{den}(L,F^{1}) := E(L \cap F^{1}) / \ell(L)$.

For synthesizing switching circuits, the importance of these criteria is well established. However, in the case of scene analysis, little is known about whether these criteria actually contribute to the description of regions, and further experimentation is necessary. For this purpose, weights for specific attributes or events may be introduced to compute weighted covers.
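The definitions of this section can be made concrete with a small sketch. A cartesian complex is stored below as a dictionary mapping a coordinate position (0-based) to its allowed value set, which corresponds to one cartesian literal per entry. All function names, the 3 × 3 event space, and the sample events are assumptions for illustration; this is not the cover synthesis algorithm of [MR 72], only a check of the cover condition and of criteria a) to c) for a given set of complexes.

```python
from itertools import product

def event_index(x, h):
    """Index j of the event x = (x_1, ..., x_n); the last coordinate varies fastest."""
    j = 0
    for x_i, h_i in zip(x, h):
        j = j * h_i + x_i
    return j

def in_complex(x, literals):
    """True if x lies in the intersection of the literals {position: allowed values}."""
    return all(x[i] in allowed for i, allowed in literals.items())

def is_interval(literals):
    """True if every allowed set is a run of consecutive values a_i, ..., b_i."""
    return all(sorted(a) == list(range(min(a), max(a) + 1))
               for a in literals.values())

def is_cover(complexes, h, f1, dont_care):
    """Check F1 <= union of complexes <= F1 together with the don't care events."""
    union = {e for e in product(*(range(h_i) for h_i in h))
             if any(in_complex(e, c) for c in complexes)}
    return set(f1) <= union <= set(f1) | set(dont_care)

def density(complexes, f1):
    """Covered events of F1 per literal (criterion c)."""
    covered = [e for e in f1 if any(in_complex(e, c) for c in complexes)]
    literal_count = sum(len(c) for c in complexes)
    return len(covered) / literal_count

h = (3, 3)                                   # h_1 = h_2 = 3
f1 = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 2)]
dont_care = [(2, 0)]
l1 = {0: {0, 1}, 1: {0, 1}}                  # both value sets are consecutive runs
l2 = {0: {2}, 1: {0, 2}}                     # second value set has a gap
cover = [l1, l2]
```

Here `l1` and `l2` together cover all five events of `f1` with four literals, and the only additional event they cover is the don't care event (2, 0), so the cover condition holds.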
3.2.3 Matching Models to Scenes

It seems appropriate to assume that most scenes we are dealing with are composed of composite figures such that each can be regarded as a realization of some known 'model figure'. Instead of imposing criteria of hardly known importance and meaning, we may try to fit models to a scene. For this purpose, the relationship matrix appears to be a useful data structure.

A model, $\underline{M}$, is a collection of m objects, $\underline{M} = \{M_1, \ldots, M_m\}$, represented by an (m,n)-relationship matrix $R(\underline{M})$. We assume that we are given a set $\{\underline{M}_1, \ldots, \underline{M}_p\}$ of p models. Then, the task of interpreting a scene $S = \{S_1, \ldots, S_n\}$ in terms of models, $\underline{M}_1, \ldots, \underline{M}_p$, can be defined as finding an optimal match of S to some $\underline{M}_j$ such that (a) each region in S is associated with exactly one model, and (b) the number of regions not matching, or only poorly matching, a model is kept at a minimum. Although approaches toward matching algorithms have been made (e.g. [BP 70]), their efficiency is still rather poor.

CHAPTER 4

CONCLUSIONS

The property space representation allows us to define the density of the spatial distribution of local properties of picture points. Properly normalized, this density can be interpreted as a probability density which serves to define entropy and cohesion of a set of picture points. It is shown for an arbitrary but simple example that the cohesion gives a good theoretical basis for applying clustering methods. Clearly, much more experimentation is necessary to establish this concept.

Similar to parsing in syntax analysis, scene analysis has (1) to infer a global structure from local properties, and (2) to recognize given structures by matching their components to regions in the picture. Successful parsers usually employ a combination of the 'top-down' and 'bottom-up' methods, and this strongly suggests an approach combining these two schemes in scene analysis, too.
Therefore, it seems highly desirable to extend global analysis techniques such as graph transformations and covering so that they will also employ local properties.

REFERENCES

[ATT 54] Attneave, F., "Some Informational Aspects of Visual Perception", Psychological Review, vol. 61 (1954), pp. 183-193.
[AA 56] Attneave, F. and Arnoult, M.D., "The Quantitative Study of Shape and Pattern Perception", Psychological Bulletin, vol. 53 (1956), pp. 452-471.
[BAL 65] Ball, G.H., "Data Analysis in the Social Sciences - What About the Details?", Proceedings of the 1965 Fall Joint Computer Conference, vol. 27, part 1 (1965), pp. 533-559.
[BP 70] Barrow, H.G. and Popplestone, R.J., "Relational Descriptions in Picture Processing", in Machine Intelligence, Vol. 6, Meltzer, B. and Michie, D. (eds.), Edinburgh University Press, 1970, pp. 377-396.
[BIR 67] Birkhoff, G., Lattice Theory, American Mathematical Society, Providence, Rhode Island, 1967.
[BON 64] Bonner, R.E., "On Some Clustering Techniques", IBM Journal of Research and Development, January 1964, pp. 22-32.
[BF 70] Brice, C.R. and Fennema, C.L., "Scene Analysis Using Regions", Artificial Intelligence, vol. 1 (1970), pp. 205-226.
[CH 67] Cover, T.M. and Hart, P.E., "Nearest Neighbor Pattern Classification", IEEE Transactions, vol. IT-13 (1967), pp. 21-27.
[CR 71] Chien, Y.T. and Ribak, R., "Relationship Matrix as a Multi-Dimensional Data Base for Syntactic Pattern Generation and Recognition", Proceedings of the Two Dimensional Digital Signal Processing Conference, October 1971, Columbia, Missouri.
[CUL 68] Cullen, H.F., Introduction to General Topology, Boston, 1968.
[FER 67] Ferguson, T.S., Mathematical Statistics, Academic Press, New York, 1967.
[FOY 64] Foy, W.H., "Entropy of Simple Line Drawings", IEEE Transactions, vol. IT-10, April 1964, pp. 165-167.
[FRA 67] Fralick, S.C., "Learning to Recognize Patterns Without a Teacher", IEEE Transactions, vol. IT-13, January 1967, pp. 57-64.
[FU 69] Fu, K.S., "On Sequential Pattern Recognition Systems", in Methodologies of Pattern Recognition, Watanabe, S. (ed.), Academic Press, New York, 1969.
[GAL 68] Gallager, R.G., Information Theory and Reliable Communication, Wiley and Sons, New York, 1968.
[GC 66] Green, R.T. and Courtis, M.C., "Information Theory and Figure Perception: The Metaphor that Failed", Acta Psychologica, vol. 25 (1966), pp. 12-35.
[GR 69] Gower, J.C. and Ross, G.J.S., "Minimum Spanning Trees and Linkage Cluster Analysis", Applied Statistics, vol. 18 (1969), pp. 54-64.
[GRA 71] Gray, S.B., "Local Properties of Binary Images in Two Dimensions", IEEE Transactions, vol. C-20 (1971), pp. 551-561.
[GRE 65] Grenander, U., "Some Direct Estimates of the Mode", Ann. Math. Stat., vol. 36 (1965), pp. 131-138.
[GUZ 68] Guzman, A., "Decomposition of Scenes into Bodies", AFIPS Conference Proceedings, vol. 33, part 1 (1968), pp. 291-304.
[GUZ 70] Guzman, A., "Analysis of Curved Line Drawings Using Context and Global Information", in Machine Intelligence, Vol. 6, Meltzer, B. and Michie, D. (eds.), Edinburgh University Press, 1970, pp. 325-375.
[HAL 50] Halmos, P.R., Measure Theory, Princeton Press, 1950.
[HA 67] Hart, P.E., "A Brief Survey of Preprocessing for Pattern Recognition", Stanford Research Inst. Report No. RADC-TR-66-819 (1967).
[HA 69] Harary, F., Graph Theory, Addison-Wesley, New York, 1969.
[HARL 70] Haralick, R.M., "Multi-Image Clustering", Proceedings of the 1970 Army Numerical Analysis Conference, pp. 75-90.
[HF 70] Henrichon, E.G. and Fu, K.S., On Mode Estimation in Pattern Recognition, Purdue University Press, Lafayette, Indiana, 1970.
[HK 67] Haralick, R.M. and Kelly, G.L., "Pattern Recognition with Measurement Space and Spatial Clustering for Multiple Images", Proceedings of the IEEE, vol. 57, April 1967.
[HY 61] Hocking, J.G. and Young, G.S., Topology, Addison-Wesley, New York, 1961.
[KIT 63] Kittel, C., Introduction to Solid State Physics, 2nd Edition, John Wiley, New York, 1963.
[MAT 70] Matula, D.W., "Cluster Analysis via Graph Theoretic Techniques", Proceedings of the Louisiana Conference on Combinatorics, Graph Theory, and Computing, Mullin, R.C. (ed.), University of Manitoba, 1971, pp. 199-212.
[MAT 71] Matula, D.W., "k-Components, Clusters, and Slicings in Graphs", preprint in 1971. To appear in SIAM Journal of Applied Mathematics.
[MD 65] Mattson, R.L. and Dammann, J.E., "A Technique for Determining and Coding Subclasses in Pattern Recognition Problems", IBM Journal of Research and Development, July 1965.
[MI 72] Michalski, R.S., "Varivalued Logic and Its Applications to Pattern Recognition", Department of Computer Science, University of Illinois, Urbana, Illinois, Report, 1972 (in preparation).
[MMC 72] McCormick, B.H. and Michalski, R.S., "CARTESIAN COVERS - A Theoretical Introduction", Department of Computer Science, University of Illinois, Urbana, Illinois, 1972 (in preparation).
[MON 71] Montanari, U., "Networks of Constraints: Fundamental Properties and Applications to Picture Processing", Department of Computer Science Report, Carnegie-Mellon University, Pittsburgh, Pennsylvania, January 1971.
[MP 67] Minsky, M.A. and Papert, S., Project MAC Progress Report IV, MIT Press, Cambridge, Massachusetts, 1967.
[MR 72] Michalski, R.S. and Raulefs, P., "Computer Synthesis of Cartesian Covers and Varivalued Logic Expressions", Department of Computer Science Report, University of Illinois, Urbana, Illinois, 1972 (in preparation).
[ORE 62] Ore, O., Theory of Graphs, American Mathematical Society Publication, vol. 38, Providence, Rhode Island, 1962.
[PR 70] Preparata, F.P. and Ray, S.R., "An Approach to Artificial Nonsymbolic Cognition", Coordinated Science Laboratory Report No. R-478, University of Illinois, Urbana, Illinois, 1970.
[ROB 65] Roberts, L.G., "Machine Perception of Three Dimensional Solids", in Optical and Electro-Optical Information Processing, Tippett, et al. (eds.), MIT Press, Cambridge, Massachusetts, 1965.
[ROS 69] Rosenfeld, A., Picture Processing by Computer, Academic Press, New York, 1969.
[SEB 62] Sebestyen, G.S., Decision-Making Processes in Pattern Recognition, New York, 1962.
[SEP 70] Seppanen, J.J., "Algorithm 399", Communications of the ACM (Oct. 1970), pp. 621-622.
[SMC 70] Schwebel, J.C. and McCormick, B.H., "Consistent Properties of Composite Formation Under a Binary Relation", Information Sciences, vol. 2 (1970), pp. 179-209.
[SS 63] Sokal, R.R. and Sneath, P.H.A., Principles of Numerical Taxonomy, San Francisco, 1963.
[TOU 69] Tou, J.T., "Feature Selection for Pattern Recognition Systems", in Methodologies of Pattern Recognition, Watanabe, S. (ed.), Academic Press, New York, 1969, pp. 493-508.
[TSY 71] Tsypkin, Ya. Z., Adaptation and Learning in Automatic Systems, Academic Press, New York, 1971.
[WAT 60] Watanabe, S., "Information-Theoretical Aspects of Inductive and Deductive Inference", IBM Journal of Research and Development, vol. 4 (1960), pp. 208-231.
[WAT 69a] Watanabe, S., Knowing and Guessing, New York, 1969.
[WAT 69b] Watanabe, S., "Pattern Recognition as an Inductive Process", in Methodologies of Pattern Recognition, Watanabe, S. (ed.), Academic Press, New York, 1969, pp. 521-534.
[ZAH 71] Zahn, C.T., "Graph-Theoretical Methods for Detecting and Describing Gestalt Clusters", IEEE Transactions, vol. C-20 (1971), pp. 66-86.
[ZUS 70] Zusne, L., Visual Perception of Form, Academic Press, New York, 1970.

U.S. ATOMIC ENERGY COMMISSION, Form AEC-427: University-Type Contractor's Recommendation for Disposition of Scientific and Technical Document
AEC Report No.: COO-2118-0028, UIUCDCS-R-72-496
Title: Methodological Aspects of Scene Segmentation
Submitted by: Peter Raulefs, Fellow, Department of Computer Science, University of Illinois, Urbana, Illinois 61801
Date: January 24, 1972

BIBLIOGRAPHIC DATA SHEET

Report No.: UIUCDCS-R-72-496
Report Date: June, 1972
Title and Subtitle: Methodological Aspects of Scene Segmentation
Author(s): Peter Raulefs
Performing Organization Name and Address: Dept. of Computer Science, Univ. of Illinois at Urbana-Champaign, Urbana, Illinois 61801
Project/Task/Work Unit No.: ILLIAC III
Contract/Grant No.: AT(11-1)-2118
Sponsoring Organization Name and Address: U.S. Atomic Energy Commission
Type of Report and Period Covered: Feb. - June 1972

Abstract: Scene segmentation into regions and structural analysis of scenes are discussed. The spatial distribution of properties of points in a digitized picture is used to define the entropy of point sets (regions) in a picture. An optimal partitioning of a scene into regions is given by maximizing the average entropy of all regions. Using entropy, the cohesion of regions is defined and applied to analyzing the clustering approach to scene segmentation. Structural analysis of scenes in terms of synthesizing composites is briefly reviewed by considering graph transformations and cartesian covers.

Availability Statement: Release unlimited
Security Class (This Report): Unclassified
Security Class (This Page): Unclassified