Perception granular computing in visual haze-free task


Expert Systems with Applications 41 (2014) 2729–2741
0957-4174/$ - see front matter Crown Copyright © 2013 Published by Elsevier Ltd. All rights reserved.
http://dx.doi.org/10.1016/j.eswa.2013.11.006

Hong Hu, Liang Pang ⇑, Dongping Tian, Zhongzhi Shi
Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Science, Beijing 100080, China

⇑ Corresponding author. Tel.: +86 01062600505.
E-mail addresses: huhong@ict.ac.cn (H. Hu), pangl@ics.ict.ac.cn (L. Pang), tiandp@ics.ict.ac.cn (D. Tian), shizz@ics.ict.ac.cn (Z. Shi).

Keywords: Granular computing; Leveled granular system; Fuzzy logic; Machine learning; Haze free; Brain-like computer

Abstract
In the past decade, granular computing (GrC) has been an active topic of research in machine learning and computer vision. However, the division of granularity is itself an open and complex problem. At the same time, deep learning, proposed by Geoffrey Hinton, simulates the hierarchical structure of the human brain, processes data from lower levels to higher levels and gradually composes more and more semantic concepts. Information similarity, proximity and functionality constitute the key points of the original insight of granular computing proposed by Zadeh. Many GrC studies are based on the equivalence relation or the more general tolerance relation, either of which can be described by some distance function. Information similarity and proximity, which depend on the distribution of the samples, can be easily described by fuzzy logic. From this point of view, GrC can be considered as a set of fuzzy logical formulas, which is geometrically defined as a layered framework in a multi-scale granular system. The necessity of such a multi-scale layered granular system is supported by the columnar organization of the neocortex. The granular system proposed in this paper can therefore be viewed as a new explanation of deep learning that simulates the hierarchical structure of the human brain. In view of this, a novel learning approach, which combines fuzzy logical design with machine learning, is proposed in this paper to construct a GrC system and to explore a novel direction for deep learning. Unlike previous work on the theoretical framework of GrC, our granular system is abstracted from brain science and information science, so it can be used to guide research in image processing and pattern recognition. Finally, we take the haze-free task as an example to demonstrate that our multi-scale GrC is highly capable of increasing the texture information entropy and improving the effect of haze removal.

1. Introduction

Lin (2012) pointed out that granulation seems to be a natural methodology deeply rooted in human thinking: many everyday things are routinely granulated into sub-things (Lin, 2012). At the IEEE-GrC2006 conference on granular computing (GrC), GrC was outlined as a general computation theory for effectively using granules such as classes, clusters, subsets, groups and intervals to build an efficient computational model for complex applications with huge amounts of data, information and knowledge (Zadeh, 1997). As the scholars at that conference summarized, although the label is relatively recent, the basic notions and principles of GrC have appeared under different names in many related fields, such as information hiding in programming, granularity in artificial intelligence, divide and conquer in theoretical computer science, interval computing, cluster analysis, fuzzy and rough set theories, neutrosophic computing, quotient space theory, belief functions, machine learning, databases and many others (Bargiela & Pedrycz, 2006). This definition of GrC is too broad, and classes, clusters, subsets, groups and intervals have already been studied in artificial intelligence and mathematics for a long time. What, then, is really new in GrC? We think that the main point of GrC lies in the original insight proposed by Zadeh, in which three basic concepts underlie human cognition: granulation, organization and causation. Informally, granulation involves decomposition of the whole into parts; organization involves integration of parts into a whole; and causation involves association of causes with effects. Granulation of an object A leads to a collection of granules of A, with a granule being a clump of points (objects) drawn together by indistinguishability, similarity, proximity or functionality (Zadeh, 1997). In this original insight, Zadeh pointed out three important aspects of GrC: (1) GrC is a main characteristic of human cognition; (2) GrC is based on indistinguishability, similarity, proximity or functionality; (3) there is a close relationship among granulation, organization and causation. Based on these points, we think it is necessary to identify the key features of GrC in human cognition.
There are two kinds of GrC research: perception-level and
knowledge-level. A perception-level GrC performs a series of feature
transformations and tries to find the meta-knowledge implied in samples;
a knowledge-level GrC tries to process knowledge or structure

information based on meta-knowledge. In this paper we focus on
the perception level.

Indistinguishability, similarity and proximity can be described by an equivalence relation or a tolerance relation, and these relations can in turn be described by some kind of distance function. The relevant literature shows that many GrC studies focus on classification and clustering (Yao, 2000, 2001; Zhang & Zhang, 2003, 2004a). Zhang and Zhang (2003, 2004a, 2004b, 2005) use quotient space theory to study indistinguishability and similarity. Yao (2001) extends the equivalence class to the rough approximation set. The quotient space structure described by an equivalence relation is used to probe the structure of granules such as classes, clusters, subsets and groups. More generally, Lin (1998, 2007) and Yao (1998, 1999) use binary relations and neighborhood systems, respectively, to study indistinguishability and similarity; the geometric concepts of partition, covering, topology and neighborhood can be described algebraically by binary relations. Pedrycz, Hirota, Pedrycz, and Dong (2012) define granules on fuzzy sets and discuss several operations and their granular consistency. Lin (2012) summarizes the history of granular computing, discussing all formal descriptions of granular computing and some further directions, e.g. GrC for databases and data mining, and GrC and cloud computing. In fact, GrC should be discussed in the framework of human cognition, from perception to pattern recognition and knowledge processing. Although the concept of granular computing was proposed more than ten years ago, only a few researchers have paid attention to this aspect. Granular computing in fact places great emphasis on the leveled computation of intelligence. As Yao (2006) pointed out: “Granules in the family are called focal elements of discussion at the level. Each level is represented by a plane. While granules at the same level are of similar nature, granules at different levels may be very different. Consequently, we may use different vocabularies and languages for descriptions at different levels.” The leveled computation revealed by granular computing is very important for machine learning, e.g. for the well-known deep learning approach. The term deep learning gained much traction in the mid-2000s (Bargiela & Pedrycz, 2006; Castro, 1995). It has since become a major technology trend in big data and artificial intelligence. Deep learning simulates the hierarchical structure of the human brain, processes data from lower levels to higher levels, and gradually composes more and more semantic concepts. These facts mean that deep learning has a close relationship with granular computing.

In general, the granular computing studies mentioned above neglect information transformation and feature abstraction, which are very important for deep learning. In this paper, we propose a novel framework of granular system, which has the ability to perform information transformation and object-background separation. We take the haze-free task as an example to validate the ability of our granular system.

Deep learning therefore also has a close relationship with the granular computing proposed in this paper.

The main contributions of this paper include:

(1) The basic notions and principles of GrC have appeared under different names in many related fields, such as information hiding in programming, granularity in artificial intelligence, divide and conquer in theoretical computer science, interval computing, cluster analysis, fuzzy and rough set theories, neutrosophic computing, quotient space theory, belief functions, machine learning, databases, and many others, so the earlier version of granular computing is just an abstraction of existing methods. In this paper we give a novel concrete model for granular computing, which has a multi-scale layered structure from feature abstraction to classification.

(2) Our novel granular system has a close relation with deep learning, so it opens a new direction for deep learning. This is the first time that fuzzy logic has been introduced for leveled feature abstraction in deep learning.

(3) Fuzzy logic is often mentioned in granular computing; for example, fuzzy logic and rough set techniques are used by Lin (1999) for word computing, and Liu, Xiong, and Wu (2012) use fuzzy lattices in classification based on hyperspherical granular computing. In this paper, fuzzy logic is used not only for describing granules similar to the hyperspherical granules in Liu et al. (2012), but also for feature abstraction and classification. For this purpose, we propose a novel and effective approach that combines fuzzy logical design, PSVM and back propagation.

(4) The granular computing proposed here gives a novel approach to the haze-free task; the experimental results show that this approach is sound.

The rest of this paper is organized as follows. In Section 2, we give a formal definition of granular system and granular computing based on a tolerance relation, which is described by fuzzy logical formulas; in Section 3, we discuss the algorithm for designing a granular system; in Section 4, we give a concrete example of designing a granular system for the haze-free task; finally, Section 5 presents the discussion and future work.

2. Granular system based on tolerance relation

The difference between a granular system based on an equivalence relation and a granular system based on a tolerance relation is that an equivalence relation divides a space into a non-overlapping cover, whereas a tolerance relation creates an overlapping cover of the space.

Yiyu Yao proposed a granular computing paradigm for concept learning in which two learning strategies are investigated: a global attribute-oriented strategy searches for a good partition of a universe of objects, and a local attribute-value-oriented strategy searches for a good covering (Yao & Deng, 2013). In this paper, granular computing starts from feature vectors, e.g. images, not from attributes. In order to simulate the perceptual processing of human cognition, we define a set of multi-scale nested convex regions with a corresponding computation based on this set. There are two main purposes for building such a granular system based on tolerance relation:

(1) Granular systems are designed to describe the similarity and proximity of information, which can be described by a tolerance relation. Granular systems based on tolerance relation can be viewed as a topological structure built from topological bases on the topological space $(X, \tau)$ induced from a metric space $(X, dis)$ by the metric $dis$. Such granular systems can be used to describe domain structure, which represents the indistinguishability, similarity and proximity of examples. Classification is determined by the indistinguishability, similarity and proximity of information. There are two kinds of similarity among examples: static similarity and dynamical similarity.

(a) If the elements of classes are distributed in standard convex regions, we can use some kind of distance function to describe the classes' distribution domains. In this case, the similarity between two objects can be intuitively described by distance functions. If $dis(x, y)$ is a distance function in the n-dimensional space $R^n$ and $c$ is a point in $R^n$, the formula $dis(c, y) < r$ describes a convex region $D$ in $R^n$ with $c$ as its center. Every point $y$ in this region can be written as $y = c + e$, where $e$ is some kind of noise; if $D$ is a ball, $e$ can be viewed as white noise with an amplitude less than $r$. We call this kind of similarity static similarity.

(b) Dynamical similarity differs from static similarity: if one object $O_1$ changes continuously into another object $O_2$, e.g. a tadpole growing continuously into a frog, then $O_1$ and $O_2$ are dynamically similar. If all elements in a class $A$ are dynamically similar, the distribution domain of $A$ is a connected domain. Dynamical similarity can make the distribution domain very complicated, with a nonlinear borderline. Although dynamical similarity may make the within-class difference larger than the between-class difference, it can also be described by an equivalence relation or a tolerance relation. The difference between a granular system based on an equivalence relation and one based on a tolerance relation is that an equivalence relation divides a space into a non-overlapping cover, whereas a tolerance relation creates an overlapping cover of the space. The relation described by the formula $dis(x, y) < r$ is a special case of a tolerance relation.
(2) The main purpose of information transformation in pattern recognition is to recognize or classify different objects from their mixture, so the information transformation used in pattern recognition should take place in a granular system that describes both static similarity and dynamical similarity. This is the main point discussed in this paper.

Now we use fuzzy logical formulas based on distance functions to define granular systems. There are three distance axioms: (1) $dis(a, a) = 0$; (2) $dis(a, b) = dis(b, a)$; (3) $dis(a, b) + dis(b, c) \ge dis(a, c)$. The formula $dis(x, y) < r$ defines a tolerance relation and $dis(x, y) = 0$ defines an equivalence relation, and every tolerance relation can be viewed as an abstract distance that does not obey axiom (3), so any granular system based on an equivalence relation or a tolerance relation can be described geometrically by a distance function.
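The contrast between the two kinds of relation can be checked on a small example. The sketch below is illustrative only: it assumes the absolute difference on the real line as the distance and three hypothetical sample points, and shows that $dis(x, y) < r$ is reflexive and symmetric but not transitive, which is why the granules it induces overlap instead of partitioning the space.

```python
def dis(x, y):
    """Absolute-difference distance on the real line (an assumed choice)."""
    return abs(x - y)


def tolerant(x, y, r):
    """Tolerance relation induced by the distance: dis(x, y) < r."""
    return dis(x, y) < r


r = 1.0
a, b, c = 0.0, 0.8, 1.6  # hypothetical sample points

reflexive = tolerant(a, a, r)
symmetric = tolerant(a, b, r) == tolerant(b, a, r)
# a ~ b and b ~ c hold, but a ~ c fails, so the relation is not transitive.
chain_holds = tolerant(a, b, r) and tolerant(b, c, r)
transitive_here = (not chain_holds) or tolerant(a, c, r)

print(reflexive, symmetric, transitive_here)
```

Point b belongs to the granule around a and to the granule around c at the same time, which is the overlap that an equivalence relation would forbid.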

Definition 1 (A simple fuzzy logical formula based on distance function). A simple fuzzy logical formula based on a distance function, $sp(a, c \mid dis, d, \omega)$, is defined as

$$sp(a, c \mid dis, d, \omega) = m(dis(a, c \mid \omega), d) \tag{1}$$

where $dis(a, c \mid \omega) = dis(a \circ \omega, c \circ \omega)$, $x \circ \omega = (x_1\omega_1, \ldots, x_n\omega_n)$, $dis()$ is a point-to-point distance function, and $\omega$ is a weight vector for the dimensions of the feature vector space $S$; $m()$ can be viewed as the membership function of a fuzzy set and is usually a continuous function. For example,

$$m(dis(a, c \mid \omega), d) = \begin{cases} (d - dis(a, c \mid \omega))/d, & dis(a, c \mid \omega) < d \\ 0, & dis(a, c \mid \omega) \ge d \end{cases} \tag{2}$$

Here the distance function can also be a set-to-set distance function, e.g. the Hausdorff function.

For the point-to-point distance function case, we define the r-cut set of $sp(a, c \mid dis, d, \omega)$ as

$sp_r(a, c \mid dis, d, \omega) = \{a : sp(a, c \mid dis, d, \omega) > r\}$ (strong r-cut set) or
$sp_r(a, c \mid dis, d, \omega) = \{a : sp(a, c \mid dis, d, \omega) \ge r\}$ (r-cut set).

The strong r-cut set $sp_r(x, c \mid dis, d, \omega)$ defines an open convex region, which is called a granule, and the r-cut set $sp_r(x, c \mid dis, d, \omega)$ defines a closed convex region.
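Definition 1 can be made concrete with a few lines of code. The following is a minimal sketch, assuming a weighted Euclidean distance for $dis$ (the definition allows any point-to-point distance) and the example membership function of Eq. (2); the function names are hypothetical. With this membership function, $sp > r$ is equivalent to $dis < d(1 - r)$, so the strong r-cut is an open ball of radius $d(1 - r)$ around $c$.

```python
import math


def weighted_dis(a, c, w):
    """Euclidean distance after scaling each coordinate by its weight (assumed dis)."""
    return math.sqrt(sum((wi * ai - wi * ci) ** 2 for ai, ci, wi in zip(a, c, w)))


def sp(a, c, d, w):
    """Simple fuzzy formula of Eq. (1) with the membership function of Eq. (2)."""
    dist = weighted_dis(a, c, w)
    return (d - dist) / d if dist < d else 0.0


def in_strong_cut(a, c, d, w, r):
    """Membership test for the strong r-cut set (open region)."""
    return sp(a, c, d, w) > r


def in_cut(a, c, d, w, r):
    """Membership test for the r-cut set (closed region)."""
    return sp(a, c, d, w) >= r


center, d, w = (0.0, 0.0), 2.0, (1.0, 1.0)
print(sp((0.0, 0.0), center, d, w))  # membership at the center
print(sp((1.0, 0.0), center, d, w))  # halfway to the boundary
print(sp((3.0, 0.0), center, d, w))  # outside the support
```

A point at distance exactly $d(1 - r)$ lies in the r-cut set but not in the strong r-cut set, which is the open versus closed distinction made in the definition.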
Definition 2 (Leveled perception granular system based on tolerance relation, Gsys). A granular system based on tolerance relations of a distance function (granular system for short) is a set of granules. Every granule is a 2-tuple $\{g, SF\}$, where $g$ is a convex region described by tolerance relations of a distance function and $SF$ is a set of fuzzy logical functions (called "adjoint functions") computed from the convex region $g$; the outputs of all fuzzy logical functions in $SF$ are called the "adjoint vector" of this granule. The granules of Gsys have the following attributes.

(1) Multi-scale leveled structure: The metric space $X$ (e.g. a finite connected volume region in the n-dimensional real space $R^n$) is the only level 0 granule, denoted $G(coeG^0) = \{X, SF\}$, where $coeG^0$ is a coefficient set, which is usually empty, and $SF$ is a set of fuzzy logical functions (the adjoint functions). The convex region of a level 1 granule $G(coeG^1) = \{g^1, SF^1\}$ is defined by the conjunction of a finite number of r-cut sets

$$sp_r(a, c_1^1 \mid dis^1, d_1^1, \omega),\ sp_r(a, c_2^1 \mid dis^1, d_2^1, \omega),\ \ldots,\ sp_r(a, c_{k_1}^1 \mid dis^1, d_{k_1}^1, \omega),$$

where $coeG^1$ is the coefficient set that defines the convex region $g^1$, so $g^1$ can also be written as $g^1(coeG^1)$ and

$$coeG^1 = \{c_1^1, c_2^1, \ldots, c_{k_1}^1 \in X;\ d_1^1, d_2^1, \ldots, d_{k_1}^1;\ dis^1\}.$$

The first level of our granular system is denoted $C^1(X) = \{G(coeG^1)\}$. If the level $l$ granules $G(coeG^l) = \{g^l, SF^l\}$ have been defined, a level $l+1$ granule $G(coeG^{l+1})$ is defined as the convex region created by the intersection of $g^l$ and a finite number of strong r-cut sets

$$sp_r(a, c_1^{l+1} \mid dis^{l+1}, d_1^{l+1}, \omega),\ sp_r(a, c_2^{l+1} \mid dis^{l+1}, d_2^{l+1}, \omega),\ \ldots,\ sp_r(a, c_{k_{l+1}}^{l+1} \mid dis^{l+1}, d_{k_{l+1}}^{l+1}, \omega).$$

The coefficient set of a level $l+1$ granule is

$$coeG^{l+1} = \{c_1^{l+1}, c_2^{l+1}, \ldots, c_{k_{l+1}}^{l+1} \in X;\ d_1^{l+1}, d_2^{l+1}, \ldots, d_{k_{l+1}}^{l+1};\ dis^{l+1}\},$$

where $k_{l+1}$ is the number of simple logical formulas in the same level. The center of $G(coeG^{l+1})$ is $c_i^{l+1} \in G(coeG^l)$, and its radius satisfies $d_i^{l+1} \le d_i^l$, with $\lim_{l \to \infty} d_i^l = 0$. The granule $G(coeG^l)$ is called the "father granule" of $G(coeG^{l+1})$.

(2) Granular computing (GrC): A granular computing is described by a set of fuzzy logical formulas over the above multi-scale leveled structure of convex regions.
The purpose of a granular computing (GrC) is to transform feature information and classify points in an input space $X$, so at least one level of a GrC outputs a fuzzy label for points in the input space $X$. If there are $m$ classes in total, a fuzzy label $L$ is an m-dimensional fuzzy vector $L = \{l_1, l_2, \ldots, l_m\}$ with $\sum_{i=1,\ldots,m} l_i = 1$.
Castro (1995) proved that fuzzy logic controllers using fuzzy rules are universal approximators; later, Li and Philip Chen (2000) gave a proof of the equivalence between a forward neural circuit and a fuzzy logical inference. So it is not difficult to prove that any continuous function $F: R^n \to [0, 1]^n$ can be simulated by such a nested layered granular computing with arbitrarily small error.
A level $l+1$ adjoint function $F^{l+1}$ receives its input from the outputs of level $l+p$, $p > 1$, adjoint functions. That is, if a leveled granular system Gsys has $k$ levels, GrC proceeds from level $k$ to level 1, so the level order of a GrC is the reverse of the level order of the Gsys: the 1st level of the GrC takes place in the smallest, kth level granules' convex regions of the Gsys. Two kinds of layered computing can take place over a granular system. In the first kind, the adjoint feature vectors of larger-scale level $n$ granules are computed from the adjoint feature vectors of smaller-scale level $n+1$ granules; this kind of layered computing has a strictly nested structure (Fig. 1(a)) and is called "nested layered computing". In the second kind, the adjoint feature vectors of level $n$ granules can be computed from the adjoint feature vectors of all smaller granules, whose levels are greater than $n$ (Fig. 1(b)); this kind is called "unnested layered computing". Nested layered computing is a special case of unnested layered computing.

(3) Radii of convex regions: A granular system can have finitely many or countably infinitely many levels. The radii of the granules' convex regions decrease and tend to zero as the level goes to infinity.

(4) Center grid: The centers of granules are distributed on a so-called center grid. We call the set of all centers of level $l+1$ granules on the granule $G(coeG^l)$ the center grid of the level $l$ granule $G(coeG^l)$, denoted $Gc^{l+1}(G(coeG^l))$. We denote the set of all centers of level $l+1$ granules over $X$ as $Gc^{l+1}(X)$, and the set of all centers of level $l+1$ granules over a level $k < l$ granule $G(coeG^k)$ as $Gc^{l+1}(G(coeG^k))$.
The center grid is usually discrete, but it can also be a continuous set, e.g. the whole metric space $X$.

(5) Shape of granules: The shapes of granules are defined by their distance functions $dis()$. Since $dis()$ can be an abstract distance function, a granule's convex region can be an arbitrary convex region. Each level uses a single distance function, so the granules in the same level have the same shape, but granules' regions in different levels may have different shapes.

(6) The cover over a granule: In order to create a cover over $G(coeG^l)$, the elements of the center grid $Gc^{l+1}(G(coeG^l))$ should be dense enough. Such a cover can be formally defined as

$$C^{l+1}(G(coeG^l)) = \Big\{ \bigwedge_{1 \le i \le k_{l+1}} sp_r(x, c_i^{l+1} \mid dis^{l+1}, d_i^{l+1}, \omega) \wedge g(coeG^l) \;\Big|\; c_i^{l+1} \in Gc^{l+1}(G(coeG^l)),\ sp_r(x, c_i^{l+1} \mid dis^{l+1}, d_i^{l+1}, \omega) \wedge g(coeG^l) \ne \emptyset \Big\}$$

All level $l+1$ granules create a cover of the whole space $X$, called the level $l+1$ cover or the level $l+1$ layer of $X$:

$$C^{l+1}(X) = \bigcup_{G(coeG^l) \in C^l(X)} \{C^{l+1}(G(coeG^l))\}.$$
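Items (3), (4) and (6) together require that the center grid of each level be dense enough for the shrinking granules to still cover the space. The sketch below illustrates this in one dimension. It is a toy setting, not the construction used in the paper: the space is assumed to be the interval [0, 1], granules are open intervals given by a center and a radius, and coverage is tested on a finite sample of points.

```python
def covers(centers, radius, lo=0.0, hi=1.0, steps=1000):
    """Return True if every sampled point of [lo, hi] lies in some open ball."""
    for k in range(steps + 1):
        x = lo + (hi - lo) * k / steps
        if not any(abs(x - c) < radius for c in centers):
            return False
    return True


# Level-1 center grid with spacing 0.25 (hypothetical values).
grid_1 = [i / 4 for i in range(5)]
tight = covers(grid_1, 0.15)   # radius larger than half the spacing
loose = covers(grid_1, 0.10)   # radius smaller than half the spacing: gaps appear

# Level 2 halves both the spacing and the radius, so the radii shrink
# with the level while the finer grid still covers the space.
grid_2 = [i / 8 for i in range(9)]
finer = covers(grid_2, 0.075)

print(tight, loose, finer)
```

In this toy setting the cover exists exactly when the radius exceeds half the grid spacing, which is one way to read "dense enough" in item (6).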

A radial basis neural network (Haykin, 2008), which can be used to simulate continuous functions, is an example of a two-layer granular system.

Definition 3 (Hyper-granules and mini-granules). All level $n+1$ granules $G(coeG^{n+1})$ contained in a level $n$ granule $G(coeG^n)$ are called "mini-granules", and the level $n$ granule $G(coeG^n)$ is called a "hyper-granule".
Fig. 1. Two kinds of layered computing over a granular system.
After the theory of fuzzy logic was conceived by Zadeh (1965), many fuzzy logical systems have been presented, for example, the Zadeh system, the probability system, the algebraic system, and the bounded operator system. In view of the universal approximation theorem (Haykin, 1994), the extended bounded operator, denoted the "q-value weighted bounded operator", is selected in this paper. It is not difficult to prove that q-value weighted fuzzy logical formulas can simulate any continuous function $F: R^n \to [0, 1]^n$ with arbitrarily small error, and vice versa, i.e. every GrC can be realized by a set of fuzzy logical functions of the q-value weighted bounded operator with arbitrarily small error.

Definition 4 (Bounded operator $F(\otimes_f, \oplus_f)$). Bounded product: $x \otimes_f y = \max(0, x + y - 1)$, and bounded sum: $x \oplus_f y = \min(1, x + y)$, where $0 \le x, y \le 1$.

In order to simulate GrC, it is necessary to extend the bounded operator to the weighted bounded operator. The fuzzy formulas defined by q-value weighted bounded operators are called q-value weighted fuzzy logical functions.

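Definition 5 (q-value weighted bounded operator $F(\otimes_f, \oplus_f)$). q-value weighted bounded product:

$$p_1 \otimes_f p_2 = F_{\otimes_f}(p_1, p_2, w_1, w_2) = \max(0, w_1 p_1 + w_2 p_2 - (w_1 + w_2 - 1)q) \tag{3}$$

q-value weighted bounded sum:

$$p_1 \oplus_f p_2 = F_{\oplus_f}(p_1, p_2, w_1, w_2) = \min(q, w_1 p_1 + w_2 p_2) \tag{4}$$

where $0 \le p_1, p_2 \le q$.
For association and distribution rules, we define $(p_1 \Delta_f p_2) \nabla_f p_3 = F_{\nabla_f}(F_{\Delta_f}(p_1, p_2, w_1, w_2), p_3, 1, w_3)$ and $p_1 \Delta_f (p_2 \nabla_f p_3) = F_{\Delta_f}(p_1, F_{\nabla_f}(p_2, p_3, w_2, w_3), w_1, 1)$, where $\Delta_f, \nabla_f = \oplus_f$ or $\otimes_f$. We can prove that $\oplus_f$ and $\otimes_f$ satisfy the associative condition (see Appendix C) and

$$x_1 \oplus_f x_2 \oplus_f x_3 \cdots \oplus_f x_n = \min\Big(q, \sum_{1 \le i \le n} w_i x_i\Big) \tag{5}$$

$$x_1 \otimes_f x_2 \otimes_f x_3 \cdots \otimes_f x_n = \max\Big(0, \sum_{1 \le i \le n} w_i x_i - \Big(\sum_{1 \le i \le n} w_i - 1\Big) q\Big) \tag{6}$$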

Moreover, the above q-value weighted bounded operator $F(\otimes_f, \oplus_f)$ follows De Morgan's law, i.e.

$$\begin{aligned}
N(x_1 \oplus_f x_2 \oplus_f x_3 \cdots \oplus_f x_n) &= q - \min\Big(q, \sum_{1 \le i \le n} w_i x_i\Big) \\
&= \max\Big(0, q - \sum_{1 \le i \le n} w_i x_i\Big) \\
&= \max\Big(0, \sum_{1 \le i \le n} w_i (q - x_i) - \Big(\sum_{1 \le i \le n} w_i - 1\Big) q\Big) \\
&= N(x_1) \otimes_f N(x_2) \otimes_f N(x_3) \cdots \otimes_f N(x_n).
\end{aligned} \tag{7}$$
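Eqs. (5)-(7) are easy to check numerically. The sketch below is a minimal illustration with hypothetical function names and sample values; it takes the negation to be $N(x) = q - x$, which is what the first equality of Eq. (7) implies, and verifies De Morgan's law for one weighted example. With $q = 1$ and unit weights the two functions reduce to the bounded product and bounded sum of Definition 4.

```python
def bounded_sum(xs, ws, q=1.0):
    """n-ary q-value weighted bounded sum, Eq. (5)."""
    return min(q, sum(w * x for w, x in zip(ws, xs)))


def bounded_product(xs, ws, q=1.0):
    """n-ary q-value weighted bounded product, Eq. (6)."""
    s = sum(w * x for w, x in zip(ws, xs))
    return max(0.0, s - (sum(ws) - 1.0) * q)


def negate(x, q=1.0):
    """Negation N(x) = q - x, as implied by Eq. (7) (an assumed reading)."""
    return q - x


q = 1.0
xs = [0.2, 0.5, 0.1]   # hypothetical truth values in [0, q]
ws = [1.0, 0.5, 2.0]   # hypothetical weights

lhs = negate(bounded_sum(xs, ws, q), q)
rhs = bounded_product([negate(x, q) for x in xs], ws, q)
print(abs(lhs - rhs) < 1e-12)  # De Morgan's law holds for this example
```

The comparison uses a small tolerance because the two sides are computed through different floating-point sums.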

However, for the q-value weighted bounded operator $F(\otimes_f, \oplus_f)$, the distributive condition usually does not hold, and the boundary condition holds only when all weights equal 1, since $p_1 \otimes_f q = F_{\otimes_f}(p_1, q, w_1, w_2) = \max(0, w_1 p_1 + (1 - w_1) q)$ and $p_1 \oplus_f q = F_{\oplus_f}(p_1, q, w_1, w_2) = \min(q, w_1 p_1 + w_2 q)$. In this paper, we show that the haze-free task can be completed by a common GrC based on fuzzy logical formulas of the bounded fuzzy operator.

3. Hybrid designing of leveled perception granular system based
on fuzzy logic and PSVM

Owing to limited space, only nested layered GrC is discussed in this paper. A nested layered GrC is defined by the input-output relation of a granular computing on a granular system. There are three kinds of relations between adjacent layers (layers $k$ and $k+1$) of a nested GrC: (1) binary logic; (2) fuzzy logic; (3) alogical relation.

Because fuzzy logic and binary logic are both realized by the sigmoid function, the back propagation method can be used to modify the weights of all layers. In order to speed up the learning process for a layered GrC, we combine logical design with PSVM (Fung & Mangasarian, 2001); this novel approach is called the "Logical support vector machine (LPSVM)". For nested layered GrC, parameters in the binary logical layers can be designated directly according to the binary relation. For the fuzzy logical layers, parameters can also be set according to the layers' functions, but a suitable small adjustment by back propagation is necessary. This is similar to the deep learning proposed by Geoffrey Hinton, in which a many-layered neural network can be effectively pre-trained one layer at a time, treating each layer in turn as an unsupervised restricted Boltzmann machine, and then fine-tuned using supervised back propagation. For the non-logical (alogical) layers, parameters should be learned from samples according to the input-output relation function $f_i(x_1, x_2, x_3, \ldots, x_n)$; we can use either the back propagation method or PSVM to learn the weights for $f_i(x_1, x_2, x_3, \ldots, x_n)$.

The design strategy of LPSVM:

• Step 1: Except for the alogical layers' weights, design the layers' weights according to the logical (binary or fuzzy) relations; for fuzzy logical relations, a suitable modification of the weights may be necessary according to the task of the layer.
• Step 2: Alogical layers' weights are computed from the input layer to the last output layer. For an alogical layer $i$, if $X$ is the input training set, compute the inner layers' outputs from the 1st layer to the $(i-1)$th layer based on $X$.
• Step 3: Use PSVM to compute the $i$th layer's weights $W_i$ according to (8).
• Step 4: Repeat Step 2 and Step 3 until the output error is small enough.
• Step 5: Use the back propagation approach to modify all layers' weights.

$$W_i = X' D U \tag{8}$$

where $W_i$ is the weight vector of the node, $U$ is computed by (9), and $X$ and $D$ are the problem data, i.e. $X = [X_1, \ldots, X_n]$ and $D$ is the diagonal matrix

$$D = \begin{bmatrix} y_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & y_n \end{bmatrix};$$

$(X_i, y_i)$ is a training sample with feature vector $X_i$ and target $y_i$.

$$U = \Big(\frac{I}{\nu} + D(XX' + EE')D\Big)^{-1} E \tag{9}$$

where $\nu$ is a positive parameter selected to guarantee a small magnitude $\|W_i\|$, $I$ is the identity matrix, and $E$ is a vector whose elements are all 1.
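Step 3 can be illustrated with a direct implementation of Eqs. (8) and (9). The following is a sketch under stated assumptions, not the authors' code: it treats each row of X as one training sample, uses targets of +1 or -1, solves the linear system with a small hand-written Gaussian elimination so that only the standard library is needed, and also returns a bias term gamma = -E'DU taken from the standard linear PSVM formulation, which Eq. (8) itself does not mention. The function names and sample values are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def psvm_weights(X, y, nu=1.0):
    """Eqs. (8)-(9): U = (I/nu + D(XX' + EE')D)^-1 E and W = X'DU."""
    n = len(X)
    # D is diagonal, so (D(XX' + EE')D)[i][j] = y_i * (x_i . x_j + 1) * y_j.
    A = [[y[i] * (sum(p * q for p, q in zip(X[i], X[j])) + 1.0) * y[j]
          + (1.0 / nu if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    U = solve(A, [1.0] * n)
    W = [sum(X[i][k] * y[i] * U[i] for i in range(n)) for k in range(len(X[0]))]
    gamma = -sum(y[i] * U[i] for i in range(n))  # assumed bias term, not in Eq. (8)
    return W, gamma


# Two hypothetical one-dimensional samples, one per class.
W, gamma = psvm_weights([[1.0], [-1.0]], [1.0, -1.0], nu=1.0)
print(W, gamma)
```

A sample x is then assigned to the positive class when the dot product of x and W minus gamma is positive. Because the matrix in Eq. (9) is the identity scaled by 1/nu plus a positive semidefinite matrix, the linear system always has a unique solution.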

4. Granular system for visual task

The columnar organization of the brain's primary visual cortex strongly supports the granular system defined above. Many functions of the primary visual cortex are still unknown, but its columnar organization is well understood. The lateral geniculate nucleus (LGN) transfers information from the eyes to the brain stem and the primary visual cortex (V1) (Mountcastle, 1997). The columnar organization of V1 plays an important role in the processing of visual information.

Local similarity of information processing gives the columnar organization a granular structure. V1 is composed of a grid of neural areas of about $1 \times 1\ \mathrm{mm}^2$, called hypercolumns (hc). Every hypercolumn contains a set of minicolumns (mc), which have the same focus. Each hypercolumn analyzes information from one small region (described by a distance function) of the retina. Adjacent hypercolumns analyze information from adjacent areas of the retina, so the structure of a columnar organization can be described by a set of fuzzy logical formulas similar to a granular system. Hypercolumns (or supercolumns) and minicolumns (mc) can be viewed as granules. Similar to the primary visual cortex, our granular system has two kinds of granules in some of its levels: hyper-granules and mini-granules. A hyper-granule contains a bundle of mini-granules.

Definition 6 (Perception Granular system of Columnar Organization (COGsys)). A perception columnar organization is a special perception granular system in which there is at least one hyper-granule $G(coeG^{n+1})$ such that all mini-granules included in it have the same convex region but different adjoint functions.

In this paper, in order to simulate the visual cortex, a granular system of columnar organization (COGsys) is designed for the haze-free task. In our hybrid design approach (LPSVM), we first design leveled granular systems with the help of fuzzy logic, and then use PSVM to accomplish the learning for concrete visual tasks.

4.1. The theory of image matting

According to Levin, Lischinski, and Weiss (2008), image matting
refers to the problem of softly extracting the foreground object
from an input image and a trimap image. ‘‘Tripmap’’ means three
kinds of regions, white denotes definite foreground region, black
denotes definite background region and gray denotes undefined
region. Formally, image matting methods take I as an input, which
is assumed to be a composite of a foreground image F o and a back-
ground B in a linear form and can be written as I ¼ aF o þð1 � aÞB .
For the haze-free task, the fuzzy label of haze or non-haze is
described by the parameter a . And the task of image matting tries
to find a function Fo ¼ fFoðIÞ . Closed form solution assumes that a is
a linear function of the input image I in a small window


2734 H. Hu et al. / Expert Systems with Applications 41 (2014) 2729–2741
w : ai 
 aIi þ b;8i 2 w . Then to solve a spare linear system to get
the alpha matte. Our GrC approach gets rid of the linear assump-
tion between a and I. Instead, we try to introduce nonlinear rela-
tion between a and I:

$$\alpha_w = F(W_{I_w}) \qquad (10)$$

Here $W_{I_w}$ is the image block included in the small window $w$, and $\alpha_w$ is the fuzzy label of its center pixel. We take the color or texture in the local window as our input feature and the trimap image as the target. After training, the neural fuzzy logical network generates the alpha matte. In the application of alpha matting, our method can remove the haze by using the dark channel prior as the trimap.
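As a minimal sketch of the data flow behind Eq. (10), the Python fragment below evaluates the linear compositing model for a single value and builds one (local window, center label) pair per pixel. The function names, the clamped borders and the default 3 × 3 window are illustrative assumptions; the paper does not specify how borders are handled or how the pairs are stored.

```python
def composite(alpha, fg, bg):
    """Linear matting model I = alpha*F_o + (1 - alpha)*B for one pixel value."""
    return alpha * fg + (1.0 - alpha) * bg


def window(image, row, col, radius=1):
    """Flattened (2*radius+1)^2 block around (row, col); borders are clamped."""
    h, w = len(image), len(image[0])
    return [
        image[min(max(r, 0), h - 1)][min(max(c, 0), w - 1)]
        for r in range(row - radius, row + radius + 1)
        for c in range(col - radius, col + radius + 1)
    ]


def training_pairs(image, trimap, radius=1):
    """One (window, center label) pair per pixel: the input and target of F."""
    return [
        (window(image, r, c, radius), trimap[r][c])
        for r in range(len(image))
        for c in range(len(image[0]))
    ]
```

Any learner that maps the window to the label can then play the role of $F$; the closed-form matting approach corresponds to restricting that map to be linear inside each window.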

4.2. Leveled perception granular system for haze-free task

In this section, we design a Perception Granular system of Columnar Organization (COGsys) for the haze-free task; here only nested layered GrC is needed. The recognition performed by our Leveled Granular System (see Fig. 2) starts by recognizing the orientation or simple structure of local patterns; the trimap image is then computed based on these local patterns. Eq. (11), which has a high ability to simulate fuzzy logic operators (see the details in the appendix), is used to design the GrC. The weights $w_i$ in Eq. (11) can be viewed as connections among granules. A nested layered GrC is defined by the input and output relation of a granular computing on a granular system. As mentioned above, there are three ways to design the weights of a layered GrC: according to the binary logical relation of this layered GrC, according to its fuzzy logical relation, or according to the input and output relation function $f_i(x_1, x_2, x_3, \ldots, x_n)$ obtained from training samples.

Fig. 2. A 4 layers' structure of a granular system for haze-background separation.

$$U_{l+1,i} = \sum_{k} w_{l+1,i,k} \cdot I_{l+1,k,i}, \qquad O_{l+1,i} = \mathrm{sigm}(U_{l+1,i}, T_{l+1,i}, \lambda) \qquad (11)$$

where $\mathrm{sigm}(\cdot)$ is the sigmoid function of Eq. (12), and $O_{l+1,i}$ is the output of a level $l+1$ granule.

$$\mathrm{sigm}(x, t, \lambda) = 1/\{1 + \exp\{-\lambda \cdot (x - t)\}\} \qquad (12)$$
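The granule computation of Eqs. (11) and (12) can be sketched in a few lines of Python. The names `sigm`, `granule_output` and `lam` (the steepness parameter of the sigmoid) are chosen here for illustration, and the overflow guard is an implementation detail that the paper does not discuss.

```python
import math


def sigm(x, t, lam):
    """Eq. (12): logistic function with threshold t and steepness lam."""
    z = -lam * (x - t)
    if z > 700.0:  # math.exp would overflow; the limiting value is 0
        return 0.0
    return 1.0 / (1.0 + math.exp(z))


def granule_output(weights, inputs, threshold, lam):
    """Eq. (11): weighted sum of the inputs followed by the sigmoid."""
    u = sum(w * i for w, i in zip(weights, inputs))
    return sigm(u, threshold, lam)
```

With a small steepness the output varies smoothly with the weighted sum, which is the fuzzy regime; with a large steepness the same granule behaves almost like a hard threshold at $T$, which is the binary regime.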

Theorem 1 in Appendix A guarantees that the granule computing defined above can simulate a Boolean function with arbitrarily small error. The design of a Gsys contains two parts: (1) convex regions, and (2) adjoint fuzzy logical functions for GrC. The following Gsys for the haze-free task uses very simple convex regions that can be described by a distance function.

The input space is just an image, which is a 5-dimensional space $X = \{(x, y, r, g, b)\}$; here every example $(x, y, r, g, b)$ represents a pixel of this image, $(x, y)$ is the pixel's location and $(r, g, b)$ is the pixel's color value. The nested granular system is built on the image with the fuzzy logical formula $sp(a, c \mid dis, \delta, \omega)$, where $\omega = (1, 1, 0, 0, 0)$, $\delta = 0$ for level 1 GrC, and $\delta > 1$ for higher level GrCs. The centers of all levels are located on the whole image plane, so every grid of centers is just the image plane, and the granules overlap.

In the following pages, we focus on the design of adjoint fuzzy logical functions for GrC.

If there are $k$ levels in our Gsys, the $k$th level receives the input image $I$, and the first-level granule outputs the resulting haze-free image. The relation between the input and output of a level-$l$ granule is described by Eq. (11). The weights among granules are designed by LPSVM: the weights of the 1st and 2nd layers are designed by fuzzy logic, and the weights of the 3rd layer are learned by PSVM with the trimap image as the target.

For the sake of simplicity, in the following Gsys we number the GrC levels in the reverse order of the granular system levels, and one layer may contain two GrC levels.

The GrC of COGsys is formally defined as below:

(1) The 1st layer – fuzzy logical layer. Every hyper-granule (Fig. 3) in the 1st layer tries to change a 3 × 3 pixel image block $\{I_b\}_{3\times3}$ into a binary 3 × 3 pixel texture pattern. The input image is normalized. A hyper-granule $HG = (g, SF)$ in the 1st layer contains 3 × 3 mini-granules that focus on a 3 × 3 small window; every mini-granule focuses on only one pixel, so the convex region of a hyper-granule is described by $dis(x, c) \le 0$. A hyper-granule completes the task of image processing. There are three kinds of fuzzy logical functions in a hyper-granule's $SF$:
Fig. 3. Every 1st layer granule tries to change a local image into a binary texture pattern. A hyper-granule is defined by a distance function $dis(x, c) < 3$, which is just a 3 × 3 small window; a hyper-granule in the 1st layer contains 9 granules, and every granule focuses on only one pixel.

(1) In a local image pattern recognition way (LIPW): the 1st processing directly transforms every pixel's value to a fuzzy logical one by a sigmoid function,

$$F_1(\{I_b\}_{3\times3}) = \{\mathrm{sigm}(p_{i,j})\}_{3\times3} \qquad (13)$$

here $p_{i,j}$, $i, j = 1, 2, 3$, is the RGB pixel value in a small 3 × 3 window;

(2) In a Local Binary Pattern operator simulating way (LBPW): the 2nd processing is also completed by a sigmoid function; the difference is that every boundary pixel's value is combined by a fuzzy exclusive OR ($\oplus$) with the center pixel's value before it is sent to the sigmoid function,

$$F_2(\{I_b\}_{3\times3}) = \{f(p_{i,j})\}_{3\times3} \qquad (14)$$

Here $f(p_{i,j}) = \mathrm{sigm}(p_{i,j} \oplus p_{2,2})$ when $(i, j) \ne (2, 2)$, and $f(p_{2,2}) = 0$. $F_2$ is similar to the Local Binary Pattern operator (LBP) described in Ojala, Pietikäinen, and Harwood (1996) as a means of summarizing local gray-level structure. The operator takes a local neighborhood around each pixel, thresholds the pixels of the neighborhood at the value of the central pixel, and uses the resulting binary-valued image patch as a local image descriptor. It was originally defined for 3 × 3 neighborhoods, giving 8-bit codes based on the 8 pixels around the central one. Such processing emphasizes the contrast of texture, and our experiments support this fact.

(3) Hybrid LIPW and LBPW (LBIPW): the adjoint function $F_3(\cdot)$ in LBIPW is the same as $F_2(\cdot)$ in LBPW, except that $f(p_{2,2}) = p_{2,2}$ in $F_3(\cdot)$, while $f(p_{2,2}) = 0$ in $F_2(\cdot)$.

Every mini-granule in a 1st-layer hyper-granule has only one input weight $w_{ij}$ in Fig. 3, which equals 1; when $\lambda \to +\infty$, the steepness coefficient $\lambda$ in Eq. (11) changes the outputs from fuzzy values to binary numbers.

(2) The 2nd layer – binary logical layer. Every 2nd-layer mini-granule tries to recognize a definite shape (see Fig. 4), so these mini-granules share the same convex region with a 1st-layer hyper-granule, which focuses on the same small 3 × 3 window in an image and can be described by $dis(x, c) < 2$. If there are $q$ local small patterns in total, a hyper-granule in the 2nd layer contains $q$ (in our system $q = 256$ or $512$) mini-granules of the 2nd layer, which have the same receptive field but different adjoint fuzzy logical functions; each of them tries to recognize a definite shape from the output of a 1st-layer hyper-granule. For example, the ‘‘\’’ shape in Fig. 4 can be described by the adjoint fuzzy logical formula of Eq. (15). The ‘‘and’’ operator for the 9 inputs in Eq. (15) can be created by a granule $mc$ (see Fig. 4). In Eq. (15), every pixel $P_{ij}$ has two states, $m_{ij}$ and $\bar{m}_{ij}$. Suppose the unified gray value (or RGB value) of $P_{ij}$ is $g_{ij}$, and an image module needs a high value $g_{ij}$ at the place of $m_{ij}$ and a low value at $\bar{m}_{ij}$. Then the input for the granule $mc$ at $m_{ij}$ is $I_{ij} = g_{ij}$, and at $\bar{m}_{ij}$ it is $I_{ij} = -(1.0 - g_{ij})$. A not gate $mc'$ is needed for $I_{ij} = -(1.0 - g_{ij})$; here $g_{ij}$, $i, j = 1, 2, 3$, is the output of a 1st-layer hyper-granule.

$$P = m_{11} \wedge \bar{m}_{12} \wedge \bar{m}_{13} \wedge \bar{m}_{21} \wedge \bar{m}_{23} \wedge \bar{m}_{31} \wedge m_{33} \wedge m_{22} \wedge \bar{m}_{32} \qquad (15)$$

$$w_{ij} = \begin{cases} 1, & \text{if the } j\text{th bit of a binary pattern} = 1 \\ -1, & \text{if the } j\text{th bit of a binary pattern} = 0 \end{cases} \qquad (16)$$

where for LIPW and LBIPW, $j = 1, 2, 3, \ldots, 9$; for LBPW, the center 1st-layer granule is not used, so $j = 1, 2, 3, \ldots, 8$. There are three kinds of hyper-granules in the 2nd layer, which receive the three different outputs of a 1st-layer hyper-granule, so a hyper-granule in the 2nd layer may work in one of the following three ways:

1. In the local image pattern recognition way (LIPW): every 2nd-layer hyper-granule contains 512 2nd-layer mini-granules, and the inputs of these mini-granules come from a 1st-layer hyper-granule that works in the LIPW way. Every 2nd-layer hyper-granule tries to classify the image block in this window into 512 binary texture patterns (BTP); e.g., eight important BTPs are shown in Fig. 5. The pixel value is ‘‘1’’ for white and ‘‘0’’ for black. In this mode, the 3 × 3 granules of the 1st layer output a 3 × 3 vector, i.e., a 3 × 3 fuzzy logical pattern of a BTP, which is computed by a sigmoid function.

2. In the Local Binary Pattern operator simulating way (LBPW): a 2nd-layer hyper-granule contains 256 2nd-layer mini-granules, which receive their input from the output of a 1st-layer hyper-granule that works in the LBPW way.

3. In the hybrid LIPW and LBPW (LBIPW) way: a 2nd-layer hyper-granule contains 512 2nd-layer mini-granules, which receive their input from the output of a 1st-layer hyper-granule, which has 9 dimensions.

In our system, a Gsys is built for every color channel R, G or B, so a hyper-granule in the 2nd layer has a 512 × 3 dimensional output or a 256 × 3 dimensional output.

Fig. 4. A hyper-granule in the 2nd layer contains q granules, which have the same receptive field and try to recognize q definite small shapes. An ‘and’ granule is needed for every 2nd layer granule.

Fig. 5. Every 2nd layer granule contains 256 or 512 granules, which correspond to the 256 or 512 modules in the above picture.
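The three first-layer modes (LIPW, LBPW, LBIPW) that feed these 2nd-layer hyper-granules can be sketched in Python as below. Several details are assumptions made only for this illustration: the paper does not define the fuzzy exclusive OR, so the absolute difference $|a - b|$ is used; and the sigmoid threshold 0.5 and steepness 10 are placeholder values for pixel values normalized to $[0, 1]$.

```python
import math


def sigm(x, t=0.5, lam=10.0):
    """Eq. (12); the defaults t=0.5 and lam=10 are illustrative assumptions."""
    return 1.0 / (1.0 + math.exp(-lam * (x - t)))


def fuzzy_xor(a, b):
    """Assumed fuzzy exclusive OR for values in [0, 1]: |a - b|."""
    return abs(a - b)


def first_layer(block, mode):
    """Map a normalized 3x3 block (list of 3 rows) to 9 first-layer outputs.

    mode is "LIPW" (Eq. (13)), "LBPW" (Eq. (14)) or "LBIPW" (the hybrid).
    """
    if mode not in ("LIPW", "LBPW", "LBIPW"):
        raise ValueError(mode)
    center = block[1][1]
    out = []
    for i in range(3):
        for j in range(3):
            p = block[i][j]
            if mode == "LIPW":
                out.append(sigm(p))
            elif (i, j) == (1, 1):
                # center granule: 0 in LBPW, raw center value in LBIPW
                out.append(0.0 if mode == "LBPW" else center)
            else:
                out.append(sigm(fuzzy_xor(p, center)))
    return out
```

Since the center output is fixed at 0 in LBPW, only 8 outputs carry information there, which matches the $2^8 = 256$ patterns of that mode against the $2^9 = 512$ patterns of LIPW and LBIPW; applying the function once per color channel gives the 256 × 3 or 512 × 3 dimensional outputs mentioned above.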

As a binary logical layer, in order to recognize a binary pattern, an ‘and’ granule with index $i$ is needed (see Fig. 4) for every 2nd-layer granule. The weights from this ‘and’ granule to the 1st-layer granules are set as in Eq. (16), and the corresponding parameters in Eq. (11) are set to the threshold $T_i = 5.1$ and the steepness $\lambda = 0.9$.
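A hedged Python sketch of such an ‘and’ granule follows. It takes the weights from Eq. (16) and the parameters $T_i = 5.1$ and steepness 0.9 from the text. The reading that a weight of $-1$ is paired with the input $-(1 - g_j)$, so that every pixel agreeing with the pattern contributes positively, is an interpretation of the description above; the bit ordering and the function names are assumptions as well.

```python
import math


def sigm(x, t, lam):
    """Eq. (12): logistic function with threshold t and steepness lam."""
    return 1.0 / (1.0 + math.exp(-lam * (x - t)))


def pattern_bits(index, n=9):
    """Bits of a binary texture pattern, most significant bit first."""
    return [(index >> (n - 1 - j)) & 1 for j in range(n)]


def and_granule(g, bits, threshold=5.1, lam=0.9):
    """Soft match of first-layer outputs g against one binary pattern.

    Weights follow Eq. (16): +1 where the pattern bit is 1, -1 where it is 0.
    Inputs follow the text: g_j at a 1-bit, -(1 - g_j) at a 0-bit.
    """
    u = 0.0
    for g_j, bit in zip(g, bits):
        weight = 1.0 if bit else -1.0
        value = g_j if bit else -(1.0 - g_j)
        u += weight * value
    return sigm(u, threshold, lam)


def second_layer(g):
    """Outputs of all 2**len(g) mini-granules of one 2nd-layer hyper-granule."""
    n = len(g)
    return [and_granule(g, pattern_bits(k, n)) for k in range(2 ** n)]
```

Under this reading the weighted sum equals 9 minus the (fuzzy) number of mismatching pixels, so with $T_i = 5.1$ the granule responds strongly only when most of the 9 pixels agree with its pattern, and the mini-granule whose pattern matches exactly gives the largest response.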

4.2.1. The 3rd layer – alogical layer
The convex region of this layer can also be described by $dis(x, c) < 2$. The output of a hyper-granule in the 2nd layer, which has 3 × 256 or 3 × 512 dimensions, is passed to the 3rd-layer granules to compute the similarity parameter or fuzzy value $\alpha_i$ in Eq. (10). The weights of this layer are computed by PSVM, and the target is provided by the so-called dark channel prior, which is computed by the approach described in He, Sun, and Tang (2011). As all $\alpha_i$ are optimised on the whole image, the whole image is the only convex region in this layer. As the small windows focused on by hyper-granules in the 2nd layer overlap, the focuses of the 3rd-layer granules also overlap.
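For orientation, the dark channel target used for training can be sketched as follows. The formula, the patch minimum over the color channels followed by $1 - \omega \cdot \mathrm{dark}(I/A)$, is the transmission estimate of He et al. (2011); the values $\omega = 0.95$ and the 15 × 15 patch (radius 7) are the ones used in that work and are not stated in this paper, and the pure-Python data layout is an assumption made for readability rather than speed.

```python
def dark_channel(image, radius=7):
    """Per-pixel minimum over color channels and a square neighborhood.

    image is a list of rows, each row a list of (r, g, b) tuples.
    """
    h, w = len(image), len(image[0])
    channel_min = [[min(px) for px in row] for row in image]
    dark = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            rows = range(max(r - radius, 0), min(r + radius + 1, h))
            cols = range(max(c - radius, 0), min(c + radius + 1, w))
            dark[r][c] = min(channel_min[i][j] for i in rows for j in cols)
    return dark


def transmission_target(image, atmosphere, omega=0.95, radius=7):
    """Target alpha = 1 - omega * dark_channel(I / A), after He et al. (2011)."""
    scaled = [
        [tuple(v / a for v, a in zip(px, atmosphere)) for px in row]
        for row in image
    ]
    dark = dark_channel(scaled, radius)
    return [[1.0 - omega * d for d in row] for row in dark]
```

The resulting map plays the role of the trimap-like target: values near 1 mark haze-free pixels and smaller values mark pixels dominated by atmospheric light.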

4.2.2. The 4th layer – fuzzy logical layer
In this layer, a granule tries to remove the haze from the original image. A granule in the 4th layer computes a pixel of the haze-free image according to the fuzzy logical equation Eq. (17):

$$I_i(x) = \min\{q,\ \alpha_i(x) \cdot J_i(x) + (1 - \alpha_i(x)) \cdot A_i\} = J_i(x) \oplus_f A_i \qquad (17)$$

where $J_i$ is the haze-free image, $I_i$ is the original image, $A_i$ is the global atmospheric light, which can be estimated from the dark channel prior, $\alpha_i$ is the alpha matte generated by the 3rd layer, and $\oplus_f$ is the $q$-value weighted bounded sum with weights $w_1 = \alpha_i(x)$ and $w_2 = 1 - \alpha_i(x)$; here $q$ is the maximum gray or RGB value of a pixel. Although we could use a back-propagation approach to compute the pixel values $J_i(x)$ from the hazy image pixel values $I_i(x)$ based on Eq. (17), for the sake of simplicity we directly use Eq. (18), given by He et al. (2011), to compute the haze-free image. As every $\alpha_i(x)$ is computed upon the whole image, each pixel of the haze-free image is also computed upon the whole image, so the whole image is also the convex region of this layer.

$$J_i(x) = \frac{I_i(x) - A_i}{\max(\alpha_i(x), \alpha_0)} + A_i \qquad (18)$$

where $\alpha_0$ is a threshold that bounds the denominator from below; a typical value is 0.1 (He et al., 2011), and the other symbols are as defined for Eq. (17).
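The forward model of Eq. (17) and the recovery step of Eq. (18) can be checked against each other with a short Python sketch. The function names and the per-channel scalar interface are assumptions for illustration; the default lower bound 0.1 is the value suggested by He et al. (2011).

```python
def hazy_pixel(j, alpha, a, q=1.0):
    """Eq. (17): I = min(q, alpha*J + (1 - alpha)*A) for one channel value."""
    return min(q, alpha * j + (1.0 - alpha) * a)


def recover_pixel(i, alpha, a, alpha0=0.1):
    """Eq. (18): J = (I - A) / max(alpha, alpha0) + A for one channel value."""
    return (i - a) / max(alpha, alpha0) + a
```

As long as the bounded sum does not saturate at $q$ and the alpha value stays above the lower bound, recovering a pixel that was produced by the forward model returns the original haze-free value. The lower bound only takes effect for very small alpha values, where dividing by the alpha value itself would amplify noise.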

4.3. Experimental results

The results of the haze-free experiments are as follows.

(1) The haze-free and texture information entropy
Texture information gives a rough measure of the effect of haze removal, so we use the entropy of the texture histogram to measure the effect of removing haze from images. The entropy of the histogram is given in Eq. (19). Haze makes the texture of an image unclear, so, theoretically speaking, haze removal will increase the entropy of the texture histogram.

$$H = -\sum_{i=0}^{G-1} p(i) \log_2 [p(i)] \qquad (19)$$
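The entropy measure can be sketched in Python as follows, taking $p(i)$ as the normalized histogram of 8-bit local binary pattern codes with the hard threshold "neighbor greater than center plus 10" that the paper states for this measurement. Restricting the histogram to interior pixels, the bit ordering of the code and the function names are assumptions of this sketch.

```python
import math
from collections import Counter


def lbp_code(block, margin=10):
    """8-bit code of a 3x3 gray-level block: bit = 1 if neighbor > center + margin."""
    center = block[1][1]
    code = 0
    for i in range(3):
        for j in range(3):
            if (i, j) != (1, 1):
                code = (code << 1) | (1 if block[i][j] > center + margin else 0)
    return code


def texture_entropy(image, margin=10):
    """Entropy (in bits) of the histogram of LBP codes over interior pixels."""
    codes = [
        lbp_code([row[c - 1:c + 2] for row in image[r - 1:r + 2]], margin)
        for r in range(1, len(image) - 1)
        for c in range(1, len(image[0]) - 1)
    ]
    total = len(codes)
    return -sum(
        (n / total) * math.log2(n / total) for n in Counter(codes).values()
    )
```

A perfectly flat region yields a single code and therefore zero entropy, whereas a region with many distinct local patterns yields a higher value, which is why the measure is used as a rough indicator of recovered texture.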


Fig. 6. The processing result of granular system for visual haze-free task.

Here $p(i)$ denotes the rate of each pattern in the histogram; in general, $p(i)$ is defined by Eq. (20). The patterns we use here are the LBPs of Eq. (14), where $\mathrm{Sigm}(i_n - i_C) = 1$ if $i_n > i_C + 10$, and $\mathrm{Sigm}(i_n - i_C) = 0$ otherwise.

$$p(i) = h(i)/(NM), \quad i = 0, 1, \ldots, G - 1 \qquad (20)$$

In Fig. 6, (a), (b), (c), (d) and (e) are the results of LBPW, LBIPW, LIPW, the linear mode (LMKH) of He et al. (2011), and the original image, respectively. From Fig. 6, we can see that the texture structure in the waist of the mountain becomes vaguer from LBPW, LBIPW, LIPW to LMKH. Because the 2nd kind of processing in the 1st-layer granules pays much more attention to contrast, LBPW has the highest ability to remove the haze. LBPW and LIPW are complementary approaches; LBIPW, which combines them, has an ability similar to that of the linear approach proposed by He et al. (2011). According to the results in Table 1, the texture information entropy is increased after haze-free processing and, except for LBPW in Area2, our approaches increase the texture information entropy more than the linear approach proposed by He et al. (2011).

Table 1
The texture information entropy of the image blocks (Area1: the waist of a mountain; Area2: right bottom corner) in Fig. 6.

         LBPW     LBIPW     LIPW      LMKH     Original
Area1    5.4852   5.2906    5.1593    4.8323   1.0893
Area2    6.1091   10.3280   10.2999   9.1759   8.3718

Theoretically speaking, LBPW is pure texture processing, so LBPW has the highest value; LIPW is much weaker than LBPW; and LBIPW is the hybrid of LBPW and LIPW, so it has an intermediate ability. The texture information entropy of Area1 correctly reflects this fact. For Area2, however, the original image already has a clear texture structure, so haze removal may be overdone. The texture information is over-emphasized by LBPW in Area2, so it has the lowest texture information entropy and almost becomes a dark area. This means that overtreatment appears more easily in nonlinear processing than in linear processing in the haze-free task.

(2) The effect of the degree of fuzziness
As stated in Theorem 1, the steepness parameter $\lambda$ in Eq. (11) controls the fuzziness of a granule: when $\lambda$ tends to infinity, a granule changes from a fuzzy logical formula to a binary logical formula. This experiment concerns the relation between the precision (rmse) of PSVM learning and the $\lambda$ parameters in the first and second layers. LBPW is pure texture processing and pays much more attention to the contrast between nearby pixels of an image, so a set of large $\lambda$, which corresponds to binary logic, is necessary for a low rmse; LBIPW and LIPW, however, appear to prefer fuzzy logic, since their rmse is small for a set of small $\lambda$. A possible explanation is that the LBP proposed by Ojala et al. (1996) is binary, not fuzzy, and has a sound classification ability for image understanding under binary patterns, whereas LBIPW and LIPW are not binary: they carry fuzzy information at least for the center pixel of a 3 × 3 small window (Fig. 7).

Fig. 7. The relation between the precision (rmse) of PSVM learning and the $\lambda$ parameters in the first and second layers; the parameter $\lambda$ in Eq. (11) controls the fuzziness of a granule.


Fig. 9. Simulating fuzzy logical and-or by changing the threshold in Eq. (11). The X-axis is the threshold value divided by 0.02, and the Y-axis is errG. The solid line is errAnd between $I_1 \otimes_f I_2$ and $V_i$, and the dotted line is errOr between $I_1 \oplus_f I_2$ and $V_i$.

Fig. 8. More results of the granular system for the visual haze-free task.

(3) The comparison between our approach and LMKH
To illustrate the effect of our approach on the haze-free task, we apply it to other images and compare the results with LMKH (Fig. 8). By manual evaluation, half of the results are better than those of LMKH, and the rest are as good as those of LMKH.

5. Discussion

In this paper, we give a concrete example to show that the theory of GrC can help us to design a brain-like computer. The experimental results show that LPSVM is a promising approach for designing a granular system similar to a columnar organization for the image haze-removal task. The concept of granular computing was proposed by Bargiela and Pedrycz (2006). As they state, a granule is a clump of objects (points) drawn together by indistinguishability, similarity, proximity or functionality. Granular computing is necessary for studying information transformation in pattern recognition because of the indistinguishability, similarity, proximity or functionality of sensed information. Owing to the local similarity in the information processing of pattern recognition, multi-scale information processing is a common phenomenon in pattern recognition. In fact, the GrC based on the leveled granular system described above can simulate all multi-scale information processing with arbitrarily small error. This fact is very important for deep learning, which is currently a popular approach. In this paper, we use a novel design approach (LPSVM) to design a granular system similar to the columnar organization of the visual cortex, and we demonstrate that fuzzy logic and machine learning can easily be combined to design a granular system.

This approach not only gives a novel concrete realization of the abstract models for granular computing mentioned in Lin (2012), but also gives a new focus for deep learning. Furthermore, the corresponding GrC can simulate multi-scale information processing for the task of image haze removal, and our experiments show that our approach brings some improvement on the haze-free task compared with the linear approach proposed by He et al. (2011).

As for future directions, although our LPSVM gives a concrete example of designing a granular system for the haze-free task, many details of LPSVM should be studied for pattern recognition under the framework of deep learning, especially for layered feature abstraction. Furthermore, we will extend the investigation by looking at other nested layered computing for more complex tasks. However, since layered computing has no feedback, which is important for many visual tasks in dynamical situations, we also plan to extend our layered granular computing to a more general one that allows for both feedback and dynamical regulation in computer vision tasks.

Acknowledgments

This work is partially supported by the National Program on
Key Basic Research Project (973 Program) (No. 2013CB329502),
the National Natural Science Foundation of China (Nos.
61072085, 61035003, 61202212, 60933004), the National High-
tech R and D Program of China (863 Program) (No.
2012AA011003), the National Science and Technology Support
Program (2012BA107B02) and the China Information Technology
Security Evaluation Center (CNITSEC-KY-2012-006/1).

Appendix A. Sigmoid function and Binary Logic

Theorem 1. Suppose that in Eq. (11) every $w_{lik} = b_k T_i$ with $b_k > 0$, $T_i > 0$, $1 \le k \le K$; furthermore, $C = \{S_i \mid i = 1, \ldots, L\}$ is a class of index sets, and every index set $S_i$ is a subset of $\{1, 2, 3, \ldots, K\}$. Then we have:

(1) If $f(x_1, x_2, \ldots, x_K) = \bigvee_{l=1,\ldots,L} \bigl(\bigwedge_{j \in S_l} x_j\bigr)$ is a disjunctive normal form (DNF) formula, and the class $C = \{S_i \mid i = 1, \ldots, L\}$ has the following two characters: (i) for all $S_i, S_j \in C$ with $i \ne j$, $S_i \cap S_j \ne S_k$ for every $S_k \in C$ (this condition assures that $f(x_1, x_2, \ldots, x_K)$ has a simplest form); (ii) every $S_i \in C$ satisfies $\sum_{j \in S_i} b_j > 1$, and any index set $S' \notin C$ satisfies either $\sum_{j \in S'} b_j < 1$ or, if $\sum_{j \in S'} b_j > 1$, there is an index set $S_i \in C$ such that $S' \cap S_i = S_i$ (this condition assures that $C$ is the largest); then the output described by Eq. (11) can simulate the DNF formula $f(x_1, x_2, \ldots, x_K) = \bigvee_{l=1,\ldots,L} \bigl(\bigwedge_{i \in S_l} x_i\bigr)$ with arbitrarily small error, where $x_i = z_i$ if the corresponding input $I_i = z_i$, or $x_i = \bar{z}_i$ if $I_i = 1 - z_i$.

(2) If a neural cell described by Eq. (11) can simulate the Boolean formula $f(x_1, x_2, \ldots, x_K)$ with arbitrarily small error, and $\bigwedge_{i \in S_l} x_i$ is an item in the disjunctive normal form of $f(x_1, x_2, \ldots, x_K)$, i.e. $f(x_1, x_2, \ldots, x_K) = 1$ at $x_j = 1$ for all $j \in S_l$ and $x_j = 0$ for all $j \notin S_l$, then $\sum_{i \in S_l} b_i > 1$.

(3) If a couple of index sets $S_{l_1}$ and $S_{l_2}$ can be found in the formula $f(x_1, x_2, \ldots, x_K) = \bigvee_{l=1,\ldots,L} \bigl(\bigwedge_{t \in S_l} x_t\bigr)$ such that $\bigl(\bigwedge_{t_1 \in S_{l_1}} x_{t_1}\bigr) \wedge \bigl(\bigwedge_{t_2 \in S_{l_2}} x_{t_2}\bigr) = z_i \wedge \bar{z}_i = \text{false}$, then the output described by Eq. (11) cannot simulate the formula $f(x_1, x_2, \ldots, x_K)$.
Proof.

(1) If $I_t = 1$ for all $t \in S_l$ and $I_t = 0$ for all $t \notin S_l$, then, because $\sum_{k \in S_l} b_k > 1$ and the index set $S_l$ is a subset of $\{1, 2, 3, \ldots, K\}$, we have

$$\begin{aligned} V_i &= 1/[\exp(-\lambda (U_i - T_i)) + 1] \\ &= 1/\Bigl[\exp\Bigl(-\lambda \Bigl(\sum_{1 \le k \le K} w_{ik} I_k - T_i\Bigr)\Bigr) + 1\Bigr] \\ &= 1/\Bigl[\exp\Bigl(-\lambda \Bigl(\sum_{k \in S_l} b_k - 1\Bigr) T_i\Bigr) + 1\Bigr], \end{aligned}$$

so $\lim_{\lambda \to +\infty} V_i = 1 = f(x_1, x_2, \ldots, x_K)$. If $I_t = 1\ \forall t \in S'$, $I_t = 0\ \forall t \notin S'$ and $S' \notin C$, then, according to the condition of this theorem: if $\sum_{k \in S'} b_k < 1$, then $\lim_{\lambda \to +\infty} V_i = 0 = f(x_1, x_2, \ldots, x_K)$; if $\sum_{k \in S'} b_k > 1$, then there is an index set $S_i \in C$ such that $S' \cap S_i = S_i$, so $\lim_{\lambda \to +\infty} V_i = 1 = f(x_1, x_2, \ldots, x_K)$. So when $\lambda \to \infty$, the error between the output described by Eq. (11) and $f(x_1, x_2, \ldots, x_K)$ tends to 0.

(2) Suppose the output described by Eq. (11) can simulate the Boolean formula $f(x_1, x_2, \ldots, x_K)$, which is not a constant, with arbitrarily small error. Then, for a definite binary input $x_1, x_2, \ldots, x_K$, the arbitrarily small error is achieved when $\lambda$ tends to infinity and $(U_i - T_i) = \sum_{k \in S_l} w_{ik} I_k - T_i \ne 0$, where $S_l$ is the set of labels with $I_i = 1$ for all $i \in S_l$ and $I_i = 0$ for all $i \notin S_l$. The theorem's condition supposes that every $w_{ik} = b_k T_i$, $b_k > 0$, $T_i > 0$, $1 \le k \le K$, and that $x_1, x_2, \ldots, x_K$ are binary numbers 0 or 1. So if $f(x_1, x_2, \ldots, x_K)$ is not a constant, then when $f(x_1, x_2, \ldots, x_K) = 0$ there must be $\lim_{\lambda \to +\infty} V_i = 0$, and when $f(x_1, x_2, \ldots, x_K) = 1$ it is necessary that $\lim_{\lambda \to +\infty} V_i = 1$. $\lim_{\lambda \to +\infty} V_i = 0$ requires that $-\lambda \bigl(\sum_{k \in S_l} b_k T_i - T_i\bigr)$ tends to plus infinity, and $\lim_{\lambda \to +\infty} V_i = 1$ requires that $-\lambda \bigl(\sum_{k \in S_l} b_k T_i - T_i\bigr)$ tends to minus infinity. So if $f(x_1, x_2, \ldots, x_K) = 1$ at $x_j = 1$ for all $j \in S_l$ and $x_j = 0$ for all $j \notin S_l$, then, in order to guarantee $\lim_{\lambda \to +\infty} \mathrm{err}_{f(x_1, x_2, \ldots, x_K)}(w_{i,1}, w_{i,2}, \ldots, w_{i,K}, T_i) = 0$, $\sum_{k \in S_l} b_k > 1$ must hold; here $\mathrm{err}_{f(x_1, x_2, \ldots, x_K)}(w_{i,1}, w_{i,2}, \ldots, w_{i,K}, T_i)$ is the error between the output described by Eq. (11) and $f(x_1, x_2, \ldots, x_K)$.

(3) The third part of the theorem is based on the simple fact that, for a single neuron, $V_i$ is monotone in every input $I_i$, which can be $z_i$ or $1 - z_i$. $\square$
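The statement can be illustrated numerically with a small Python check. The coefficient vectors $b = (0.6, 0.6)$ and $b = (1.5, 1.5)$, the threshold 1 and the steepness 200 are example values chosen here, not values from the paper; the function names are likewise hypothetical.

```python
import math
from itertools import product


def neuron(inputs, b, threshold, lam):
    """Eq. (11) with weights w_k = b_k * T, as assumed by Theorem 1."""
    u = sum(b_k * threshold * x for b_k, x in zip(b, inputs))
    return 1.0 / (1.0 + math.exp(-lam * (u - threshold)))


def max_error(boolean_fn, b, threshold=1.0, lam=200.0):
    """Largest |neuron - f| over all binary inputs."""
    return max(
        abs(neuron(x, b, threshold, lam) - boolean_fn(*x))
        for x in product((0, 1), repeat=len(b))
    )


# x1 AND x2: only S = {1, 2} has a coefficient sum above 1.
err_and = max_error(lambda x1, x2: x1 & x2, b=(0.6, 0.6))
# x1 OR x2: S1 = {1} and S2 = {2} each exceed 1 on their own.
err_or = max_error(lambda x1, x2: x1 | x2, b=(1.5, 1.5))
# XOR needs both a variable and its negation (part (3)); this b fails on (1, 1).
err_xor = max_error(lambda x1, x2: x1 ^ x2, b=(1.5, 1.5))
```

For the conjunction and the disjunction the error is negligible at this steepness, in line with part (1); for the exclusive OR the error stays close to 1, which is the kind of formula excluded by part (3).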

Appendix B. Sigmoid function and fuzzy logic

Furthermore, the above granular computing can approximately simulate the Bounded operator $F(\otimes_f, \oplus_f)$, with Bounded product $p \otimes_f q = \max(0, p + q - 1)$ and Bounded sum $p \oplus_f q = \min(1, p + q)$. Based on Eq. (11), the membrane potential's fixed point under input $I_k$ is $U_i = \sum_k w_{ik} I_k$, and the output at the fixed point is $V_i = 1/(\exp(-U_i + T_i) + 1)$. If there are only two inputs $I_1, I_2$ ($I_1, I_2 \in [0, 1]$) in Eq. (11), we set $w_1 = 1.0$ and $w_2 = 1.0$; then $U_i = I_1 + I_2$.
Now we try to prove that the Bounded operator $F(\otimes_f, \oplus_f)$ is the best fuzzy operator to simulate neural cells described by (3), and that the threshold $T_i$ can change the neural cell from the bounded operator $\otimes_f$ to $\oplus_f$, by analyzing the output at the fixed point $V_i = 1/(\exp(-U_i + T_i) + 1)$. If $C > 0$ is a constant and $U_i = I_1 + I_2 \ge C$, then $1/(\exp(-C + T_i) + 1) \le V_i < 1$. When $U_i = I_1 + I_2 \to +\infty$, $V_i \to 1$, so in this case, if $C$ is large enough, $V_i \approx 1$. If $-C \le U_i = I_1 + I_2 \le C$, then $1/(\exp(C + T_i) + 1) \le V_i \le 1/(\exp(-C + T_i) + 1)$, and $V_i$ can be expanded as

$$\begin{aligned} V_i &= 1/(\exp(-U_i + T_i) + 1) \\ &= 1 - \exp(-U_i + T_i) + \sum_{k=2}^{\infty} (-1)^k \exp(-k(U_i - T_i)) \\ &= U_i - T_i - \sum_{j=2}^{\infty} (-U_i + T_i)^j / j! + \sum_{k=2}^{\infty} (-1)^k \exp(-k(U_i - T_i)). \end{aligned} \qquad (a)$$

According to equation (a), we can select a $T_i$ that makes $\bigl| T_i + \sum_{j=2}^{\infty} (-U_i + T_i)^j / j! - \sum_{k=2}^{\infty} (-1)^k \exp(-k(U_i - T_i)) \bigr|$ small enough; then $V_i \approx I_1 + I_2$. So in this case, $V_i \approx I_1 \oplus_f I_2 = \min(1, I_1 + I_2)$.

Similarly, if $U_i = I_1 + I_2 \to -\infty$, $V_i \to 0$. So when $C$ is large enough and $U_i = I_1 + I_2 \le -C < 0$, then $V_i \approx 0$. When $-C \le U_i = I_1 + I_2 \le C$, if we select a suitable $T_i$ which makes $T_i + \sum_{j=2}^{\infty} (-U_i + T_i)^j / j! - \sum_{k=2}^{\infty} (-1)^k \exp(-k(U_i - T_i)) \approx 1$, then $V_i \approx I_1 \otimes_f I_2 = \max(0, I_1 + I_2 - 1)$.

Based on the above analysis, the Bounded operator fuzzy system is suitable for the GrC described by Eq. (11) when $a_i = 1.0$, $w_1 = 1.0$ and $w_2 = 1.0$. For arbitrary positive $a_i$, $w_1$ and $w_2$, we can use the corresponding $q$-value weighted universal fuzzy logical function based on the Bounded operator to simulate such neural cells. If a weight $w$ is negative, an N-norm operator $N(x) = 1 - x$ should be used.

Experiments that scan the whole region of $(I_1, I_2)$ in $[0, 1]^2$ to find suitable coefficients for $\otimes_f$ and $\oplus_f$ show that the above analysis is sound. We denote the input in Eq. (11) as $\vec{x} = (I_1, I_2)$. The ‘‘errOr’’ for $\oplus_f$ and the ‘‘errAnd’’ for $\otimes_f$ are shown in Fig. 9 as the solid line and the dotted line respectively. In Fig. 9, the threshold $T_i$ is scanned from 0 to 4.1 with step size 0.01. The best $T_i$ in Eq. (4) for $\otimes_f$ is 2.54 and the best $T_i$ in Eq. (4) for $\oplus_f$ is 0, when $a = 1.0$, $w_1 = 1.0$ and $w_2 = 1.0$. In this case the ‘‘errOr’’ and ‘‘errAnd’’ are less than 0.01. Our experiments show that a suitable $T_i$ can be found. So in most cases, the bounded operator $F(\otimes_f, \oplus_f)$ mentioned above is a suitable fuzzy logical framework for the neuron defined by Eq. (3). If the weights satisfy $0 < w_1$ and $0 < w_2$, we should use a $q$-value weighted bounded operator $F(\otimes_f, \oplus_f)$ to represent the above neuron.
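One way to run such a threshold scan is sketched below in Python. The error measure (mean squared difference on a regular grid over $[0, 1]^2$) and the grid resolution are assumptions, since the paper does not define errOr and errAnd precisely, so the numbers produced by this sketch need not coincide with those reported for Fig. 9.

```python
import math


def neuron(i1, i2, threshold):
    """Fixed-point output V = 1 / (exp(-U + T) + 1) with U = I1 + I2."""
    return 1.0 / (math.exp(-(i1 + i2) + threshold) + 1.0)


def bounded_sum(p, q):
    return min(1.0, p + q)


def bounded_product(p, q):
    return max(0.0, p + q - 1.0)


def mean_squared_error(target, threshold, steps=20):
    """Mean squared gap between the neuron and a target operator on a grid."""
    grid = [k / steps for k in range(steps + 1)]
    total = sum(
        (neuron(a, b, threshold) - target(a, b)) ** 2 for a in grid for b in grid
    )
    return total / len(grid) ** 2


def best_threshold(target, start=0.0, stop=4.1, step=0.01):
    """Threshold in [start, stop] with the smallest mean squared error."""
    count = int(round((stop - start) / step)) + 1
    candidates = [start + k * step for k in range(count)]
    return min(candidates, key=lambda t: mean_squared_error(target, t))
```

Calling `best_threshold(bounded_sum)` and `best_threshold(bounded_product)` gives one threshold per operator; a low threshold pushes the neuron towards the bounded sum and a higher one towards the bounded product, which is the qualitative behavior described above.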
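Appendix C. Associative condition and De Morgan law of q-weighted bounded operator

It is easy to see that $\oplus_f$ follows the associative condition and $x_1 \oplus_f x_2 \oplus_f x_3 \ldots \oplus_f x_n = \min\bigl(q, \sum_{1 \le i \le n} w_i x_i\bigr)$.

For $\otimes_f$, we can prove that the associative condition also holds. The proof is as follows.

If $w_1 p_1 + w_2 p_2 - (w_1 + w_2 - 1) q \ge 0$, we have:

$$\begin{aligned} (p_1 \otimes_f p_2) \otimes_f p_3 &= F_{\otimes_f}(F_{\otimes_f}(p_1, p_2, w_1, w_2), p_3, 1, w_3) \\ &= F_{\otimes_f}(w_1 p_1 + w_2 p_2 - (w_1 + w_2 - 1) q,\ p_3, 1, w_3) \\ &= \max(0,\ w_1 p_1 + w_2 p_2 - (w_1 + w_2 - 1) q + w_3 p_3 - (1 + w_3 - 1) q) \\ &= \max(0,\ w_1 p_1 + w_2 p_2 + w_3 p_3 - (w_1 + w_2 + w_3 - 1) q); \end{aligned}$$

if $w_1 p_1 + w_2 p_2 - (w_1 + w_2 - 1) q < 0$, we have

$$\begin{aligned} (p_1 \otimes_f p_2) \otimes_f p_3 &= F_{\otimes_f}(F_{\otimes_f}(p_1, p_2, w_1, w_2), p_3, 1, w_3) \\ &= F_{\otimes_f}(0, p_3, 1, w_3) = \max(0,\ 0 + w_3 p_3 - (1 + w_3 - 1) q) \\ &= \max(0,\ w_3 p_3 - w_3 q) = 0 \quad (\text{for } 0 \le p_3 \le q) \\ &= \max(0,\ w_1 p_1 + w_2 p_2 + w_3 p_3 - (w_1 + w_2 + w_3 - 1) q); \end{aligned}$$

so $(p_1 \otimes_f p_2) \otimes_f p_3 = p_1 \otimes_f (p_2 \otimes_f p_3) = \max(0,\ w_1 p_1 + w_2 p_2 + w_3 p_3 - (w_1 + w_2 + w_3 - 1) q)$.

By induction, we can prove that $\otimes_f$ also follows the associative condition and $x_1 \otimes_f x_2 \otimes_f x_3 \ldots \otimes_f x_n = \max\bigl(0,\ \sum_{1 \le i \le n} w_i x_i - \bigl(\sum_{1 \le i \le n} w_i - 1\bigr) q\bigr)$.

Furthermore, if we define $N(p) = q - p$ (usually, a negative weight $w_i$ corresponds to an N-norm), the above weighted bounded operator $F(\otimes_f, \oplus_f)$ follows the De Morgan law, i.e.

$$\begin{aligned} N(x_1 \oplus_f x_2 \oplus_f x_3 \ldots \oplus_f x_n) &= q - \min\Bigl(q,\ \sum_{1 \le i \le n} w_i x_i\Bigr) \\ &= \max\Bigl(0,\ q - \sum_{1 \le i \le n} w_i x_i\Bigr) \\ &= \max\Bigl(0,\ \sum_{1 \le i \le n} w_i (q - x_i) - \Bigl(\sum_{1 \le i \le n} w_i - 1\Bigr) q\Bigr) \\ &= N(x_1) \otimes_f N(x_2) \otimes_f N(x_3) \ldots \otimes_f N(x_n). \end{aligned}$$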
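Both identities can be spot-checked numerically. The Python sketch below implements the n-ary forms of the two operators as given above; the function names and the sample values ($q = 255$ with three inputs and weights) are illustrative assumptions.

```python
def w_sum(values, weights, q):
    """q-weighted bounded sum: min(q, sum_i w_i * x_i)."""
    return min(q, sum(w * x for w, x in zip(weights, values)))


def w_product(values, weights, q):
    """q-weighted bounded product: max(0, sum_i w_i*x_i - (sum_i w_i - 1)*q)."""
    total = sum(w * x for w, x in zip(weights, values))
    return max(0.0, total - (sum(weights) - 1.0) * q)


def negate(value, q):
    """N-norm N(p) = q - p."""
    return q - value


q = 255.0
ws = [1.0, 0.5, 1.0]
high = [250.0, 240.0, 230.0]
low = [20.0, 30.0, 40.0]

# Associativity: combine the first two, then fold in the third with weight 1.
inner = w_product(high[:2], ws[:2], q)
nested = w_product([inner, high[2]], [1.0, ws[2]], q)
flat = w_product(high, ws, q)

# De Morgan: N(x1 (+) x2 (+) x3) equals N(x1) (x) N(x2) (x) N(x3).
lhs = negate(w_sum(low, ws, q), q)
rhs = w_product([negate(x, q) for x in low], ws, q)
```

The nested and flat products agree, and so do the two sides of the De Morgan identity; inputs near $q$ are used for the product and small inputs for the sum so that neither result is clipped to a trivial 0.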
References

Bargiela, A., & Pedrycz, W. (2006). The roots of granular computing. In GrC (pp. 806–809).

Castro, J. L. (1995). Fuzzy logic controllers are universal approximators. IEEE
Transactions on Systems, Man and Cybernetics, 25(4), 629–635.

Fung, G., & Mangasarian, O. L. (2001). Proximal support vector machine classifiers.
In Proceedings of the seventh ACM SIGKDD international conference on Knowledge
discovery and data mining (pp. 77–86). ACM.

Haykin, S. (1994). Neural networks: A comprehensive foundation. Prentice Hall PTR.

Haykin, S. (2008). Neural networks: A comprehensive foundation. Englewood Cliffs: Prentice-Hall.

He, K., Sun, J., & Tang, X. (2011). Single image haze removal using dark channel prior. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(12), 2341–2353.

Levin, A., Lischinski, D., & Weiss, Y. (2008). A closed-form solution to natural image
matting. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(2),
228–242.

Lin, T. Y. (1998). Granular computing on binary relations I: Data mining and
neighborhood systems. Rough Sets in Knowledge Discovery, 1, 107–121.

Lin, T. Y. (1999). Granular computing: Fuzzy logic and rough sets. Computing with
words in information/intelligent systems (Vol. 1, pp. 183–200). Springer.

Lin, T. Y. (2007). Neighborhood systems: A qualitative theory for fuzzy and rough sets.
Berkeley: University of California. 94720.

Lin, T. Y. (2012). Granular computing: Practices, theories, and future directions. In
Computational complexity (pp. 1404–1420). Springer.

Li, H.-X., & Philip Chen, C. L. (2000). The equivalence between fuzzy logic systems
and feedforward neural networks. IEEE Transactions on Neural Networks, 11(2),
356–365.

Liu, H., Xiong, S., & Wu, C.-a. (2012). Hyperspherical granular computing
classification algorithm based on fuzzy lattices. Mathematical and Computer
Modelling.

Mountcastle, V. B. (1997). The columnar organization of the neocortex. Brain,
120(4), 701–722.

Ojala, Timo, Pietikäinen, Matti, & Harwood, David (1996). A comparative study of
texture measures with classification based on featured distributions. Pattern
Recognition, 29(1), 51–59.

Pedrycz, Adam, Hirota, Kaoru, Pedrycz, Witold, & Dong, Fangyan (2012). Granular
representation and granular computing with fuzzy sets. Fuzzy Sets and Systems,
203, 17–32.

Yao, Y. Y. (1998). Relational interpretations of neighborhood operators and rough
set approximation operators. Information Sciences, 111(1), 239–259.

Yao, Y. Y. (1999). Granular computing using neighborhood systems. In Advances in
soft computing (pp. 539–553). Springer.

Yao, Y. Y. (2000). Granular computing: Basic issues and possible solutions. In
Proceedings of the 5th joint conference on information sciences (Vol. 1, pp. 186–
189) Citeseer.

Yao, Y. Y. (2001). On modeling data mining with granular computing. In 25th Annual
international computer software and applications conference, 2001. COMPSAC 2001
(pp. 638–643). IEEE.

Yao, Y. Y. (2001). Information granulation and rough set approximation.
International Journal of Intelligent Systems, 16(1), 87–104.

Yao, Y. (2006). Granular computing for data mining. In Defense and security
symposium (pp. 624105). International Society for Optics and Photonics.

Yao, Y., & Deng, X. (2013). A granular computing paradigm for concept learning. In
Emerging paradigms in machine learning (pp. 307–326). Springer.

Zadeh, L. A. (1965). Fuzzy sets. Information and Control, 8(3), 338–353.

Zadeh, L. A. (1997). Toward a theory of fuzzy information granulation and its
centrality in human reasoning and fuzzy logic. Fuzzy Sets and Systems, 90(2),
111–127.

Zhang, L., & Zhang, B. (2003). Theory of fuzzy quotient space (methods of fuzzy
granular computing). Journal of Software, 14(4), 770–776.

Zhang, L., & Zhang, B. (2004a). The quotient space theory of problem solving.
Fundamenta Informaticae, 59(2), 287–298.

Zhang, L., & Zhang, B. (2004b). The quotient space theory of problem solving.
Fundamenta Informaticae, 59(2), 287–298.

Zhang, L., & Zhang, B. (2005). Quotient space model based hierarchical machine
learning. In International conference on neural networks and brain, 2005. ICNN&B’05
(Vol. 1, pp. xiv–xiv). IEEE.

