Evidence Sets: Contextual Categories

L.M. Rocha
Computer Research Group, MS P990
Los Alamos National Laboratory
Los Alamos, NM 87545

In: Proceedings of the meeting on Control Mechanisms for Complex Systems, Physical Science Laboratory, New Mexico State University, Las Cruces, New Mexico, January 1997. M. Coombs (ed.). NMSU Press, pp. 339-357.

Abstract

Evidence Sets are set formalisms which extend fuzzy sets and interval valued fuzzy sets through the Dempster-Shafer theory of evidence (DST). The membership of an element of an evidence set is defined as a collection of weighted subintervals of the real unit interval. The weighting of the intervals of membership is implemented by the basic probability function of DST and interpreted as an observer's subjective gradation of membership according to available evidence. Evidence sets are well positioned to model human linguistic categorization processes because they offer a mechanism to explicitly contextualize uncertainty. Each of the graded intervals of membership is associated with a particular context, which can then be reduced or eliminated as more evidence is gained through an extended theory of approximate reasoning or through the incorporation of more knowledge into the model by pragmatic, evolutionary, strategies. Evidence sets also capture the full scope of uncertainty forms usually recognized in generalized information theory.

1. Mathematical Background

Let X denote a nonempty universal set under consideration, and let P(X) denote the power set of X. An element of X represents a possible value for a variable x. X can be countable or uncountable; the term continuous domain is often used to refer to the latter case. An uncountable set is by definition infinite, whereas a countable set can be either finite or infinite.

1.1. Dempster-Shafer Theory of Evidence

Evidence theory, or Dempster-Shafer Theory (DST) [Shafer, 1976], is defined in terms of the set function m referred to as the basic probability assignment: $m: P(X) \to [0, 1]$, such that $m(\emptyset) = 0$ and $\sum_{A \subseteq X} m(A) = 1$. m(A) denotes the proportion of all available evidence which supports the claim that $A \in P(X)$ accurately represents the prospective value of our variable x. Further, m(A) qualifies A alone; it does not imply any additional claims regarding other subsets of X, including subsets of A or the complement of A. DST is based on a pair of nonadditive measures, belief (Bel) and plausibility (Pl), uniquely obtained from m. Note that "m(A) measures the belief one commits exactly to A, not the total belief that one commits to A." Bel(A), the total belief committed to A, is instead given by the sum of all the values of m for all subsets of A. Any set $A \in P(X)$ with m(A) > 0 is called a focal element. A body of evidence is defined by the pair (F, m), where F represents the set of all focal elements in X, and m the associated basic probability assignment. F is assumed to be finite, that is, there is a finite number of focal elements in a body of evidence, even if the universal set X is infinite. The set of all bodies of evidence is denoted by B. Total ignorance is expressed in DST by m(X) = 1 and m(A) = 0 for all $A \neq X$. Full certainty is expressed by m({x}) = 1 for one particular element of X, and m(A) = 0 for all $A \neq \{x\}$.
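The definitions in this subsection can be illustrated with a minimal Python sketch. The representation (a dictionary from frozensets to masses), the function names, and the numerical masses are assumptions made for illustration only. The plausibility formula used, the sum of m over all focal elements that intersect A, is the standard one from Shafer [1976] and is not spelled out in the text above.

```python
def check_bpa(m, tol=1e-9):
    """Check the constraints on a basic probability assignment:
    m(empty set) = 0, every mass in [0, 1], and all masses sum to 1."""
    if m.get(frozenset(), 0.0) != 0.0:
        return False
    if any(w < 0.0 or w > 1.0 for w in m.values()):
        return False
    return abs(sum(m.values()) - 1.0) <= tol


def focal_elements(m):
    """Subsets with strictly positive mass."""
    return {a for a, w in m.items() if w > 0.0}


def belief(m, a):
    """Bel(A): sum of m(B) over all focal elements B that are subsets of A."""
    return sum(w for b, w in m.items() if b <= a)


def plausibility(m, a):
    """Pl(A): sum of m(B) over all focal elements B that intersect A."""
    return sum(w for b, w in m.items() if b & a)


# Hypothetical universal set and masses, chosen only for illustration.
X = frozenset({"a", "b", "c"})
m = {frozenset({"a"}): 0.5, frozenset({"a", "b"}): 0.3, X: 0.2}

# Total ignorance: all the evidence is committed to X itself.
ignorance = {X: 1.0}

if __name__ == "__main__":
    print(check_bpa(m))
    print(belief(m, frozenset({"a", "b"})), plausibility(m, frozenset({"b"})))
    print(belief(ignorance, frozenset({"a"})), plausibility(ignorance, frozenset({"a"})))
```

Under total ignorance every proper subset has belief 0 and every nonempty subset has plausibility 1, which is one way to see that m(X) = 1 commits to nothing more specific than X.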

1.2 Uncertainty

Klir [1993] classifies uncertainty into two main forms: ambiguity and fuzziness. Ambiguity is further divided into the categories of nonspecificity and conflict. Mathematically, ambiguity is identified with the existence of one-to-many relations, that is, when several alternatives exist for the same question or proposition. Nonspecificity is associated with unspecified alternatives, and conflict with the existence of several alternatives with some distinctive characteristic. DST provides an ideal framework for the study of ambiguity, as it enlarges the scope of traditional probability theory, and it can be interpreted in terms of other uncertainty theories [Resconi, et al, 1993]. Fuzziness is usually identified with lack of sharp distinctions. Fuzzy sets are usually used to formalize this kind of uncertainty: the elements of a fuzzy set are included in it according to a membership degree between 0 and 1. In Fuzzy Logic terms, the truth value of a proposition, now a possibility value, ranges between 0 and 1. A measure of fuzziness is the lack of distinction between a set and its complement [Yager, 1979, 1980].

In Rocha [1997a] I presented all the measures of uncertainty needed to deal with the evidence sets presented next. In particular, I developed measures of uncertainty for both discrete and nondiscrete domains, and introduced the notion of relative uncertainty as opposed to absolute uncertainty. The former relates the uncertainty present in a given situation to the maximum uncertainty possible in its universal set. Relative uncertainty requires measures which vary between 0 and 1, for no uncertainty and maximum uncertainty respectively. In this paper I will simply utilize these measures; please refer to Rocha [1997a] for more detailed definitions.

1.3 Fuzzy Sets and Interval Valued Fuzzy Sets

A crisp set entails no uncertainty whatsoever; if an element x of X is a member of a set $A \subseteq X$, then it will not be a member of its complement $A^c \subseteq X$. A fuzzy set introduces fuzziness as the above law of contradiction is violated: x can both be a member (to a degree) of A and of $A^c$. A fuzzy set A is defined by a membership function

$$A: X \to [0, 1].$$

Fuzzy sets can be extended to interval valued fuzzy sets (IVFS), a case of probabilistic sets [Hirota, 1981] and type 2 fuzzy sets [Zadeh, 1975]. However, we do not need a probabilistic/possibilistic representation to define an IVFS; all we need is to assign a sub-interval of [0, 1] to each element x of X: $A: X \to I([0, 1])$, where I represents the set of intervals in [0, 1].

IVFS offer, in addition to fuzziness, a nonspecific description of membership in a set. An IVFS A, for each x in X, captures two forms of uncertainty: vagueness (as in the case of ordinary fuzzy sets) and nonspecificity. Vagueness, or fuzziness, is absolutely specific; when we create a fuzzy set we have perfect knowledge of the degree to which a certain element x of X belongs to A. In contrast, when we create an IVFS we have nonspecific knowledge of the degree of membership; hence the utilization of an interval to describe the membership of x in A.
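As a rough illustration of the difference between the two structures, the following Python sketch stores a fuzzy set as a map from elements to degrees and an IVFS as a map from elements to (lower, upper) pairs. The element names, the degrees, and the use of the standard complement 1 - A(x) are assumptions introduced here, not definitions taken from this paper.

```python
def overlap_with_complement(fuzzy_set):
    """Degree to which each element belongs to both A and its standard
    complement (assumed here to be 1 - A(x)). A nonzero value shows the
    law of contradiction being violated."""
    return {x: min(d, 1.0 - d) for x, d in fuzzy_set.items()}


def as_ivfs(fuzzy_set):
    """A fuzzy set seen as a degenerate IVFS: each degree d becomes [d, d]."""
    return {x: (d, d) for x, d in fuzzy_set.items()}


def is_ivfs(ivfs):
    """Every membership value must be a sub-interval of [0, 1]."""
    return all(0.0 <= lo <= hi <= 1.0 for lo, hi in ivfs.values())


# Hypothetical membership degrees, for illustration only.
fuzzy_a = {"x1": 1.0, "x2": 0.7, "x3": 0.0}
ivfs_a = {"x1": (0.9, 1.0), "x2": (0.5, 0.8), "x3": (0.0, 0.1)}

if __name__ == "__main__":
    print(overlap_with_complement(fuzzy_a))
    print(as_ivfs(fuzzy_a))
    print(is_ivfs(ivfs_a))
```

Only the crisp elements x1 and x3 have zero overlap with the complement. The degenerate intervals produced by `as_ivfs` carry fuzziness but no nonspecificity, whereas the proper intervals in `ivfs_a` carry both.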

2. Cognitive Categorization

2.1 Cognitive Categorization and Embodied Constructivism

"Most of our words and concepts designate categories. [...] Categorization is not a matter to be taken lightly. There is nothing more basic than categorization to our thought, perception, action, and speech. Every time we see something as a kind of thing, for example, a tree, we are categorizing. [...] An understanding of how we categorize is central to any understanding of how we think and how we function, and therefore central to an understanding of what makes us human". [Lakoff, 1987, pages xiii, 5, and 6]

Categories are bundles of somehow, in some context, associated concepts. Cognitive agents survive in a particular environment by categorizing their perceptions, feelings, thoughts, and language. The evolutionary value of categorization skills is related to the ability cognitive agents have to discriminate, and group, relevant events in their environments which may demand reactions necessary for their survival. If organisms can map a potentially infinite number of events in their environments to a relatively small number of categories of events demanding a particular reaction, and if this mapping allows them to respond effectively to relevant aspects of their environment, then only a finite amount of memory is necessary for an organism to respond to a potentially infinitely complex environment. In other words, only through effective categorization can knowledge exist in complicated environments.

Thus, knowledge is equated with the survival of organisms capable of using memories of categorization processes to choose suitable actions in different environmental contexts. It is not the purpose of this article to delve into the interesting issues of evolutionary epistemology [Campbell, 1974; Lorenz, 1971]; I simply want to start this discussion by positioning categorization not only as a very important aspect of the survival of memory empowered organisms but, in fact, as a necessary aspect of such organisms in the context of natural selection. Understanding categorization as an evolutionary (control) relationship between a memory empowered organism and its environment implies the understanding of knowledge not as an observer independent mapping of real world categories into an organism's memory, but rather as the organism's own embodied, and thus subjective, construction of the distinctions in its environment that are relevant to its survival. This is the basis for the constructivist position of systems theory and second order cybernetics [Piaget 1971; Maturana and Varela, 1987; Glanville, 1988; Von Glasersfeld, 1990; Klir, 1991], which I have discussed elsewhere [Rocha, 1996a, 1997a; Henry and Rocha, 1996].

Since effective categorization of a potentially infinitely complex environment allows an organism to survive with limited amounts of memory, we can also see a connection between uncertainty and categorization. Klir [1991] has argued that the utilization of uncertainty is an important tool to tackle complexity. If the embodiment of an organism allows it to recognize (construct) relevant events in its environment, but if all the recognizable events are still too complex to grasp by a limited memory system, the establishment of one-to-many relations between tokens of these events and the events themselves, might be advantageous for its survival. In other words, the introduction of uncertainty may be a necessity for systems with a limited amount of memory, in order to maintain relevant information about their environment. Thus, it is considered important for models of human categories to capture all recognized forms of uncertainty.

George Lakoff [1987] has stressed the relevance of the idea of categories as subjective constructions of any beings doing the categorizing, and how it is at odds with the traditional objectivist scientific paradigms. In the following, I will address the historical relation between set theory and our understanding of categories; in particular, I will discuss what kind of extensions we need to impose on fuzzy sets so that they may become better tools in the modeling of subjective, uncertain, cognitive categories.

2.2 Models of Cognitive Categorization

It is important to separate the idea of a model of cognitive categorization from that of a model of a category. Though the two are obviously dependent on one another, categories are included in more general models of cognitive categorization and knowledge representation. Agreeing on what the structure of a category might be is far from agreeing on what the structure and workings of cognitive categorization models should be. It is also a simpler problem. Though, undoubtedly, the specific model of knowledge organization selected will dictate some of the properties of categories, the particular structure chosen to represent categories in such models does not have to offer an explanation for knowledge organization. All that is asked of a good category representation is that it allow the larger imbedding model of knowledge representation to function. For instance, if we use mathematical sets to represent categories, our models of knowledge representation may use set theory connectives and/or they may use more complicated sets of mappings and operations. Thus, evaluating sets as prospective representations of categories should be done by analyzing the kinds of limitations they necessarily impose on any kind of model, and not simply on models restricted to basic set-theoretic operations.

2.2.1. The Classical View

The classical theory of categorization defines categories as containers of elements with common properties. Naturally, the classic, crisp, set structure was ideal to represent such containers: an element of a universe of observation is either inside or outside a certain category, depending on whether or not it has the defining properties of the category in question. Further, all elements have equal standing in the category: there are no preferred representatives of a category, since membership is all or nothing.

One other characteristic of the classical view of categorization has to do with an observer independent epistemology, or objectivism. Cognitive categories were thought to represent objective distinctions in the real world. Frequently, this objectivism is linked to the way classical categories are constructed on all-or-nothing sets of objects: "if categories are defined only by properties inherent in the members, then categories should be independent of the peculiarities of any beings doing the categorizing" [Lakoff, 1987, page 7]. I do not subscribe to this point of view; we can use classical categories in either realist or constructivist epistemologies. The properties of classical, all-or-nothing, categories need not be considered inherent in the members. The question is who or what is to establish the shared properties of a particular category. A model in which these shared properties are regarded as observer dependent, that is, established in reference to the particular physiology and cognition of the agent doing the categorizing, is built under a constructivist epistemology. If, on the other hand, these properties are considered to be the one and ultimate truth of the real world, then the aim is the definition of an objectivist model of reality.

Any modern theory of categorization will include classical categories as a special case of a more complex scheme, and that does not mean some categories are objective and others are subjective. Thus, classical categories have to do with an all-or-nothing description of sets, based on a list of shared properties defined in some model. The chosen structure of categories and the chosen model of knowledge representation/manipulation, which can be objectivist or constructivist, may be independent concerns when modeling cognitive categorization.

2.2.2. Prototype Theory and Fuzzy Sets

"Prototypes do not constitute any particular processing model for categories [...]. What the facts about prototypicality do contribute to processing notions is a constraint -- process models should not be inconsistent with the known facts about prototypes. [...] As with processing models, the facts about prototypes can only constrain, but do not determine, models of representation." [Rosch, 1978, pg. 40]

Eleanor Rosch [1975, 1978] proposed a theory of category prototypes in which, basically, some elements are considered better representatives of a category than others. It was also shown that most categories cannot be defined by a mere listing of properties shared by all elements. Naturally, fuzzy sets became candidates for the simulation of prototype categories on two counts: (i) membership degrees could represent the degree of prototypicality of a concept regarding a particular category; (ii) a category could also be defined as the degree to which its elements observe a number of properties, in particular, these properties may represent relevant characteristics of the prototype -- the element that best represents a category. These two points are distinct. The first one does not collide with the present day concept of categories because it makes no claim whatsoever on the mechanisms of creation and manipulation of categories. It may be challenged, as I will do in the sequel, on the grounds that due to its simplicity, models using it must be extremely complicated. Nonetheless, it does offer the minimum requirement a category must observe: a group (set) of elements with varying degrees of representativeness of the category itself.

Now, the second point goes beyond the definition of a category and enters the domain of modeling the creation of categories. As in the classic case, categories are seen as groups of elements observing a list of properties, the only difference is that elements are allowed to observe these properties to a degree. However, the so called radial categories [Lakoff, 1987] cannot be formed by a listing of properties shared by all its elements, even if to a degree. They refer to categories possessing a central subcategory core, defined by some coherent (to a model or context) listing of properties, plus some other elements which must be learned one by one once different contexts are introduced, but which are unpredictable from the core's context and its listing of shared properties⁽¹⁾. Thus, the second interpretation of fuzzy sets as categories leads fuzzy logic to a corner which renders it uninteresting to the modeling of cognitive categorization.

2.2.3 Fuzzy Objectivism

With fuzzy sets and approximate reasoning Zadeh [1965, 1971] substitutes a classic logic of truth by a logic of degrees of truth: instead of having members of classes/categories which either belong or do not belong to them, we have members which possibly belong to a category to a certain degree. Lakoff [1987] believes that the utilization of degrees of truth does nothing to address the main shortcoming of classical categories, as these degrees are usually thought of as objective graded degrees that exist in the real world; objectivism is merely replaced by fuzzy objectivism. Now, even if Zadeh's initial formulation of fuzzy sets may have been indeed a pragmatically objectivist one, nothing prevents us from using fuzzy sets as representations of categories within a constructivist epistemology. Categories defined by fuzzy sets may represent degrees of prototypicality which may vary according to contexts introduced in imbedding models of categorization processes. In particular, a model may take into account levels of physiological subjectivity as desired by Lakoff [1987]. A computational example of such a model has been developed by Medina-Martins and Rocha [1992; Medina-Martins, Rocha, et al, 1994; Medina-Martins, 1995].

Since fuzzy sets, at least to a degree, can be included in objectivist or constructivist frameworks, their dismissal as good models of cognitive categories has to be made on different grounds. In the following I will maintain that fuzzy sets are unsatisfactory because they (i) lead to very complicated models, (ii) do not capture all forms of uncertainty necessary to model mental behavior, and (iii) leave all the considerations of a logic of subjective belief to the larger imbedding model, which makes them poor tools in true constructivist approaches. A formal extension based on evidence theory is proposed next.

3. Sets and Cognitive Categorization

3.1 Fuzzy Sets

Whenever fuzzy set models of cognitive categories have been proposed, a model of cognitive categorization or human reasoning has also been included in the package. Zadeh [1975] proposed a theory of approximate reasoning based on fuzzy predicate logic. Gorzaczany [1987] proposed a method of inference in approximate reasoning based on interval-valued fuzzy sets. As previously discussed, fuzzy sets are actually fairly accurate representations of categories simply because they are able to represent prototypicality (understood as degree of representativeness); how the prototype effects are constructed is, on the other hand, a different matter. Critics [Osherson and Smith, 1981; Smith and Osherson, 1984; Lakoff, 1987] have shown that the several fuzzy logic connectives (e.g. conjunction and disjunction) cannot conveniently account for the prototypicality of the elements of a complex category, which may depend only partially on the prototypicality of these elements in several of its constituent categories and may even be larger (or smaller) than in any of these.

A complex category is assumed to be formed by the connection of several other categories; approximate reasoning defines the sort of operations that can be used to instantiate this association. Smith and Osherson's [1984] results showed that a single fuzzy connective cannot model the association of entire categories into more complex ones. Their analysis centered on the traditional fuzzy set connectives of (max-min) union and intersection. They observed that max-min rules cannot account for the membership degrees of elements of a complex category which may be lower than the minimum or higher than the maximum of their membership degrees in the constituent categories. Their analysis is very incomplete regarding the full scope of fuzzy set connectives, since we can use other operators [see Dubois and Prade, 1985] to obtain any desired value of membership in the [0, 1] interval of membership. However, their basic criticism remains valid: even if we find an appropriate fuzzy set connective for a particular element, this connective will not yield an accurate value of membership for other elements of the same category. Hence, a model of cognitive categorization which uses fuzzy sets as categories will need several fuzzy set connectives to associate two categories into a more complex one (in the limit, one for each element). Such a model will have to define the mechanisms which choose an appropriate connective for each element of a category. Therefore, a model of cognitive categorization based solely on fuzzy sets and their connectives will be very complicated and cumbersome. No single fuzzy set connective can account for the exceptions of different contexts, hence the necessity of a complex model which recognizes these several contexts before applying a particular connective to a particular element.
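The argument can be made concrete with a small Python sketch. Everything numerical here is hypothetical: the two elements, their membership degrees in the constituent categories, and the degrees they are assumed to have in the complex category. The one-parameter family (a*b)^p is likewise only a stand-in for "other operators", chosen for simplicity, and is not a connective discussed in the literature cited above.

```python
import math


def max_min_range(a, b):
    """Max-min intersection yields min(a, b) and max-min union yields
    max(a, b), so neither can produce a value outside this range."""
    return (min(a, b), max(a, b))


def power_product(a, b, p):
    """Illustrative one-parameter connective (an assumption): (a*b)**p."""
    return (a * b) ** p


def fit_exponent(a, b, target):
    """Exponent p for which power_product(a, b, p) equals target.
    Requires 0 < a*b < 1 and 0 < target < 1."""
    return math.log(target) / math.log(a * b)


# element -> (degree in category 1, degree in category 2,
#             assumed degree in the complex category); all hypothetical.
observations = {
    "x1": (0.3, 0.4, 0.9),  # higher than the maximum of its constituents
    "x2": (0.8, 0.7, 0.2),  # lower than the minimum of its constituents
}

if __name__ == "__main__":
    for name, (a, b, target) in observations.items():
        lo, hi = max_min_range(a, b)
        p = fit_exponent(a, b, target)
        print(name, "within max-min range:", lo <= target <= hi, "fitted p:", p)
```

Each element can be matched by tuning p, but the two fitted exponents differ widely, so the connective that suits x1 misrepresents x2. In this toy setting, that is the sense in which one connective per element may be needed.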

3.2 Interval Valued Fuzzy Sets

The introduction of a theory of approximate reasoning based on interval valued fuzzy sets [Gorzaczany, 1987; Turksen, 1986] represents a step forward in the modeling of cognitive categorization, as it offers a second level of uncertainty, but it only slightly improves the contextual problem referred to above. The membership degrees of IVFS are nonspecific. This second dimension of uncertainty allows us to interpret the interval of membership of an element in a category as the membership degree of this element according to several different contexts, which we cannot know a priori. In particular, Turksen's concept combination mechanisms are based on the separation of the disjunctive (DNF) and conjunctive normal forms (CNF) of logic compositions in fuzzy logic. In two-valued logic the CNF and DNF of a logic composition are equivalent: CNF = DNF. But in fuzzy logic, for certain families of conjugate pairs of conjunctions and disjunctions, we have instead $\mathrm{DNF} \subseteq \mathrm{CNF}$ for some of the fuzzy logic connectives. Turksen [1986] proposed that fuzzy logic compositions could be represented by IVFS given by the interval [DNF, CNF] of the fuzzy set connective chosen. With IVFS based connectives, he was able to deal more effectively with the shortcomings of a pure fuzzy set approach.
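A minimal Python sketch of the [DNF, CNF] interval for the composition "A AND B" follows. It assumes the max-min connectives with the standard negation 1 - a, and it uses the usual Boolean normal forms of conjunction: DNF = A AND B, and CNF = (A OR B) AND (A OR NOT B) AND (NOT A OR B). The function names and sample degrees are illustrative and are not taken from Turksen [1986].

```python
def neg(a):
    """Standard fuzzy negation (an assumed choice)."""
    return 1.0 - a


def dnf_and(a, b, t=min):
    """Disjunctive normal form of 'A AND B': the single conjunct A AND B."""
    return t(a, b)


def cnf_and(a, b, t=min, s=max):
    """Conjunctive normal form of 'A AND B':
    (A OR B) AND (A OR NOT B) AND (NOT A OR B)."""
    return t(t(s(a, b), s(a, neg(b))), s(neg(a), b))


def ivfs_and(a, b, t=min, s=max):
    """Interval [DNF, CNF] representing the combined membership."""
    return (dnf_and(a, b, t), cnf_and(a, b, t, s))


if __name__ == "__main__":
    print(ivfs_and(0.4, 0.3))  # a proper interval for fuzzy degrees
    print(ivfs_and(1.0, 0.0))  # collapses to a point for crisp degrees
    print(ivfs_and(1.0, 1.0))
```

For crisp degrees the two normal forms coincide, as in two-valued logic. For intermediate degrees they can separate, and the gap between them is the nonspecificity the IVFS representation captures.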

Turksen's model simplifies the pure fuzzy set approach since we will find more categories which can be combined into complex categories with a single connective used for all elements of the universal set, though it does not apply to all categories. The problem is that categories demand membership values which, more than nonspecific, can be conflicting. That is, the contextual effects may need more than an interval of variance to be accurately represented. Also, even though IVFS use nonspecific membership, thus allowing a certain amount of contextual variance, the several contexts are not explicitly accounted for in the categorical representation.

4. Evidence Sets: Membership and Belief

An evidence set A of X, is defined by a membership function of the form [Rocha, 1994, 1995b]:

$$A(x): X \to B[0, 1]$$

where B[0, 1] is the set of all possible bodies of evidence (F^x, m^x) on I([0, 1]). Such bodies of evidence are defined by a basic probability assignment m^x on I([0, 1]), for every x in X (focal elements must be intervals). Notice that [0, 1] is an infinite, uncountable, set, while X can be countable or uncountable. Thus, evidence sets are set structures which provide interval degrees of membership, weighted by the probability constraint of DST. They are defined by two complementary dimensions: membership and belief. The first represents a fuzzy, nonspecific, degree of membership, and the second a subjective degree of belief in that membership, which introduces conflict of evidence as several, subjectively defined, competing membership intervals weighted by the basic probability constraint are created.
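One possible way to hold such a structure in code is sketched below in Python. The layout (each element mapped to a list of ((lower, upper), weight) pairs), the helper names, and all numerical values are assumptions made for illustration; they are not the representation used in the cited work.

```python
def is_body_of_evidence(body, tol=1e-9):
    """A body of evidence on I([0, 1]): a finite, nonempty list of interval
    focal elements, each with positive weight, with weights summing to 1."""
    if not body:
        return False
    for (lo, hi), weight in body:
        if not (0.0 <= lo <= hi <= 1.0):
            return False
        if weight <= 0.0:
            return False
    return abs(sum(weight for _, weight in body) - 1.0) <= tol


def from_fuzzy(degree):
    """A fuzzy membership degree as a single point interval with weight 1."""
    return [((degree, degree), 1.0)]


def from_interval(lo, hi):
    """An IVFS membership interval as a single focal element with weight 1."""
    return [((lo, hi), 1.0)]


# Hypothetical evidence set: each element of X maps to its body of evidence.
evidence_set = {
    "x1": [((0.2, 0.4), 0.6), ((0.7, 0.9), 0.4)],  # two competing intervals
    "x2": from_interval(0.5, 0.8),                 # IVFS-like membership
    "x3": from_fuzzy(1.0),                         # fuzzy/crisp membership
}

if __name__ == "__main__":
    for x, body in evidence_set.items():
        print(x, is_body_of_evidence(body))
```

In this layout the interval bounds carry the membership dimension and the weights carry the belief dimension, so fuzzy and interval valued memberships appear as the special case of a single focal element with weight 1.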

4.1 Consonant Evidence Sets

An interesting case occurs when we restrict F^x to consonant bodies of evidence, that is, to a nested structure of interval focal elements:

$$I_1^x \subseteq I_2^x \subseteq \cdots \subseteq I_n^x.$$

In this instance we obtain a sort of graded and nested structure of several IVFS (Figure 1), which leads to consonant belief measures. Instead of using a single interval with maximum degree of belief to formalize the nonspecificity of the degree of membership of element x of X in a set A, as is the case of IVFS, a consonant evidence set uses several nested intervals (three in the case of Figure 1) with different degrees of belief, stating our graded evidence claims regarding the membership of element x of X in A.

4.2 Non Consonant Evidence Sets

When F^x is no longer restricted to consonant bodies of evidence, we obtain evidence sets that are a bit more "incoherent", that is, disjoint intervals of membership exist for the same membership degree in an evidence set. In other words, the evidence we possess leads to a conflicting characterization of the membership value of x. Figure 2 shows an example of a nonconsonant evidence set.
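The distinction between consonant and nonconsonant focal elements can be checked mechanically. The Python sketch below is an illustration under stated assumptions: intervals are (lower, upper) tuples, the function names are invented here, and the sample intervals are hypothetical and do not reproduce Figures 1 or 2.

```python
def is_consonant(intervals):
    """True if the intervals form a nested chain I1 within I2 within ... In.
    Sorting by descending lower bound, then ascending upper bound, puts the
    innermost interval first whenever such a chain exists."""
    ordered = sorted(intervals, key=lambda iv: (-iv[0], iv[1]))
    return all(
        outer[0] <= inner[0] and inner[1] <= outer[1]
        for inner, outer in zip(ordered, ordered[1:])
    )


def has_disjoint_pair(intervals):
    """True if at least two intervals do not overlap at all."""
    return any(
        a[1] < b[0] or b[1] < a[0]
        for i, a in enumerate(intervals)
        for b in intervals[i + 1:]
    )


# Hypothetical focal elements for a single element x.
nested = [(0.2, 0.8), (0.4, 0.5), (0.3, 0.6)]
conflicting = [(0.1, 0.2), (0.6, 0.9)]
overlapping = [(0.2, 0.5), (0.4, 0.8)]

if __name__ == "__main__":
    for name, ivs in [("nested", nested), ("conflicting", conflicting),
                      ("overlapping", overlapping)]:
        print(name, is_consonant(ivs), has_disjoint_pair(ivs))
```

The third case shows that failing the nesting test does not require disjoint intervals: partially overlapping focal elements are also nonconsonant, while disjoint ones are the most clear-cut case of conflict.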

5. Evidence Sets and Uncertainty

A fuzzy set captures vagueness in a specific way; an IVFS introduces nonspecificity; a consonant evidence set introduces grades or shades of nonspecificity; and finally, a nonconsonant evidence set introduces conflict as we have cases where the degree to which an element is a member of a set is represented by disjoint sub-intervals of [0, 1] with different evidential strengths. The three forms of uncertainty are clearly present in human cognitive processes. More than simply measuring fuzziness, as approximate reasoning models do, models of uncertain reasoning based on evidence sets need to effectively measure all the three uncertainty forms. Hence, we need a 3-tuple of measures of the 3 main kinds of uncertainty to aid us in the decision making steps of our uncertain reasoning models. Each situation, each set, should be qualified in its uncertainty content with something like: (Fuzziness, Nonspecificity, Conflict) [Rocha et al, 1996; Rocha, 1997a] .

The three forms of uncertainty define a 3 dimensional uncertainty space for set structures, where crisp sets occupy the origin, fuzzy sets the fuzziness axis, IVFS the fuzziness-nonspecificity plane, and evidence sets most of the rest of this space. The total uncertainty, U, of an evidence set A is defined by U(A) = (IF(A), IN(A), IS(A)). The three indices of uncertainty, which vary between 0 and 1, IF (fuzziness), IN (nonspecificity), and IS (conflict), were introduced in Rocha [1996a, 1997a], where it was also proven that IN and IS possess the good mathematical properties desired of information measures. These indices are identified with a different way of measuring information, referred to as relative uncertainty, better suited for nondiscrete domains. For a complete discussion of these issues, please refer to [Rocha, 1996a, 1997a; Rocha et al 1996]. The uncertainty situation of the several set structures known is summarized in Table I.

6. Contextual Interpretation of Evidence Sets

"To speak of a prototype at all is simply a convenient grammatical fiction; what is really referred to are judgements of degree of prototypicality." [Rosch, 1978, page 40 second italics added]

None of the fuzzy set and IVFS approaches to cognitive categorization explicitly considers the notion of subjectivity. This is so because fuzzy sets do not offer an explicit account of belief in evidence; in other words, we have degrees of prototypicality and not judgements of degrees of prototypicality as Eleanor Rosch required in the previous quote. The interpretation I suggest [Rocha, 1994] for the multiple intervals of evidence sets, in light of the problem of human categorization processes, considers each interval of membership I_j^x, with its corresponding evidential weight m^x(I_j^x), as the representation of the prototypicality of a particular element x of X in category A according to a particular perspective. In other words, each interval I_j^x represents a particular perspective on the element x of a category represented by an evidence set A. Thus, each element x of our evidence set A will have its membership varying within several intervals representing different, possibly conflicting, perspectives. An IVFS, for instance, represents the case where we have a single perspective on the category in question, even if it admits a nonspecific representation (an interval)⁽²⁾. The ability to maintain several of these perspectives, which may conflict at times, in representations of categories such as evidence sets, allows a model of cognitive categorization or knowledge representation to directly access particular contexts affecting the definition of a particular category, which is essential for radial categories. In other words, the several intervals of membership of evidence sets refer to different perspectives which explicitly point to particular contexts. In so doing, evidence sets facilitate the inclusion of subjectivity in models of cognitive categorization in addition to the inclusion of the several forms of uncertainty.

"Whenever I write in this essay 'degree of support' that given evidence provides for a proposition or the 'degree of belief' that an individual accords the proposition, I picture in my mind an act of judgment. I do not pretend that there exists an objective relation between given evidence and a given proposition that determines a precise numerical degree of support. Nor do I pretend that an actual human being's state of mind with respect to a proposition can ever be described by a precise real number called his degree of belief, nor even that it can ever determine such a number. Rather, I merely suppose that an individual can make a judgement. Having surveyed the sometimes vague and sometimes confused perception and understanding that constitutes a given body of evidence, he can announce a number that represents the degree to which he judges that evidence to support a given proposition and, hence, the degree of belief he wishes to accord the proposition." [Shafer, 1976, p. 21, italics added]

Shafer's intent captured in the previous quotation seems to follow Rosch's earlier quotation in the context of cognitive categorization. The degrees of belief on which evidence theory is based do not aspire to be objective claims about some real evidence; they are rather proposed as judgements, formalized in the form of a degree. Likewise, Rosch's prototypes are not assumed to be an objective grading of concepts in a category, but rather judgements of some uncertain, highly context-dependent, grading. Evidence sets offer a way to model these ideas since an independent membership grading of elements (concepts) in a category is offered together with an explicit formalization of the belief posited on this membership. In a sense, in evidence sets, membership in a category and judgments over membership are different, complementary, qualities of the same phenomenon. None of the other structures so far presented is able to offer both this independent characterization of membership and a formalization of judgments imposed on this membership: traditional set structures (crisp, fuzzy, or interval-valued) alone offer only an independent degree of membership, while evidence theory by itself offers primarily a formalization of belief which constrains the elements of a universal set with a probability restriction.

7. Belief-Constrained Approximate Reasoning

7.1 Uncertainty Increasing Operations Between Evidence Sets with Context Preservation

Recently, Zhu and Lee [1995] have proposed a belief based multi valued logic which defines a connection between evidence theory and multi valued logics in much the same way as evidence sets do, that is, by establishing degrees of belief on truth values given by intervals of the unit interval. While evidence sets [Rocha, 1994] were defined in the context of set theory, Zhu and Lee thought of this extension in terms of multi valued logics. Thus, in the former we speak of belief based, interval valued membership in a set, while in the latter we speak of the belief based, interval valued truth value of a proposition. Most of the operators discussed in this section are equivalent to Zhu and Lee's formulation, though their interpretation might differ.

The operations of complementation, intersection, and union are the most basic connectives in a theory of approximate reasoning. All other connectives can be easily constructed from these; therefore, I discuss only these three operators here. Naturally, complementation, intersection, and union as defined below for evidence sets subsume, as special cases, the same operations for IVFS and fuzzy sets.

7.1.1 Complementation

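The interval valued membership function of elements of X in an IVFS A is given by an interval of [0, 1] with a lower and an upper bound:

$$A(x) = [A^L(x), A^U(x)] \subseteq [0, 1]$$

Its complement can be defined as the negation of the interval limits in reverse order [e.g. Gorzalczany, 1987]:

$$\bar{A}(x) = [1 - A^U(x), 1 - A^L(x)]$$

The membership function of an evidence set A of X is given, for each x, by n intervals weighted by a basic probability assignment m^x:

$$A(x) = \{ (I_1^x, m^x(I_1^x)), \ldots, (I_n^x, m^x(I_n^x)) \}, \qquad I_i^x \subseteq [0, 1], \qquad \sum_{i=1}^{n} m^x(I_i^x) = 1$$

The complement of an evidence set [Rocha, 1995b] is defined as the complement of each of its interval focal elements with the preservation of their respective evidential strengths:

$$\bar{A}(x) = \{ (\bar{I}_1^x, m^x(I_1^x)), \ldots, (\bar{I}_n^x, m^x(I_n^x)) \}$$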
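The complement just described can be illustrated with a minimal Python sketch. The representation is an assumption made only for illustration: the membership of a single element x is stored as a dictionary from interval (a pair of bounds) to evidential weight, and the names `Membership`, `complement_interval`, and `complement` are hypothetical. Bounds are rounded to ten decimals so that floating-point noise does not produce spurious intervals.

```python
from typing import Dict, Tuple

Interval = Tuple[float, float]      # (lower, upper) bounds within [0, 1]
Membership = Dict[Interval, float]  # focal interval -> evidential weight


def complement_interval(interval: Interval) -> Interval:
    """Negate the interval limits in reverse order: [l, u] -> [1 - u, 1 - l]."""
    lower, upper = interval
    # Rounding is an illustrative choice to hide floating-point noise.
    return (round(1.0 - upper, 10), round(1.0 - lower, 10))


def complement(membership: Membership) -> Membership:
    """Complement each focal interval while preserving its weight."""
    return {complement_interval(i): w for i, w in membership.items()}


a_x = {(0.2, 0.5): 0.7, (0.6, 0.9): 0.3}
print(complement(a_x))  # {(0.5, 0.8): 0.7, (0.1, 0.4): 0.3}
```

Applying `complement` twice returns the original membership, as expected of an involutive negation.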

7.1.2 Intersection

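The intersection of two IVFS [Gorzalczany, 1987] is defined as the minimum of the respective lower and upper bounds of their membership intervals. Given two intervals of [0, 1], $I = [i^L, i^U]$ and $J = [j^L, j^U]$, the minimum of both intervals is the interval

$$\mathrm{MIN}(I, J) = [\min(i^L, j^L), \min(i^U, j^U)]$$

Given two evidence sets A and B defined for each x of X by:

$$A(x) = \{ (I_i, m_A(I_i)) \} \qquad (1)$$

and

$$B(x) = \{ (J_j, m_B(J_j)) \} \qquad (2)$$

where I_i and J_j are intervals of [0,1], their intersection is an evidence set C(x) = A(x) ∩ B(x), whose intervals of membership K_k and respective basic probability assignment m_C(K_k) are defined by:

$$K_k = \mathrm{MIN}(I_i, J_j), \qquad m_C(K_k) = \sum_{\mathrm{MIN}(I_i, J_j) = K_k} m_A(I_i) \cdot m_B(J_j)$$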
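As a minimal sketch of this intersection, the following Python fragment combines two memberships of the same element x. It assumes that every pair of intervals (I_i, J_j) contributes the product of its weights to the interval MIN(I_i, J_j), and that pairs yielding the same interval pool their weights, so that the combined assignment still adds up to 1. The dictionary representation and the function names are hypothetical.

```python
from itertools import product
from typing import Dict, Tuple

Interval = Tuple[float, float]      # (lower, upper) bounds within [0, 1]
Membership = Dict[Interval, float]  # focal interval -> evidential weight


def interval_min(i: Interval, j: Interval) -> Interval:
    """MIN of two intervals: minimum of the lower and of the upper bounds."""
    return (min(i[0], j[0]), min(i[1], j[1]))


def intersection(a: Membership, b: Membership) -> Membership:
    """Intersect two evidence-set memberships of the same element x."""
    result: Membership = {}
    for (i, m_i), (j, m_j) in product(a.items(), b.items()):
        k = interval_min(i, j)
        # Assumption: pairs yielding the same interval pool their joint weight.
        result[k] = result.get(k, 0.0) + m_i * m_j
    return result


a_x = {(0.2, 0.4): 0.5, (0.6, 0.8): 0.5}
b_x = {(0.3, 0.5): 0.25, (0.7, 0.9): 0.75}
c_x = intersection(a_x, b_x)
print(c_x)  # {(0.2, 0.4): 0.5, (0.3, 0.5): 0.125, (0.6, 0.8): 0.375}
```

In this example both operands carry two intervals and the result carries three, so no perspective is discarded outright; the weights are merely redistributed.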

7.1.3 Union

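The union of two IVFS [Gorzalczany, 1987] is defined as the maximum of the respective lower and upper bounds of their membership intervals. Given two intervals of [0, 1], $I = [i^L, i^U]$ and $J = [j^L, j^U]$, the maximum of both intervals is the interval

$$\mathrm{MAX}(I, J) = [\max(i^L, j^L), \max(i^U, j^U)]$$

Given two evidence sets A and B defined by (1) and (2), with basic probability assignments m_A and m_B, their union is an evidence set C(x) = A(x) ∪ B(x), whose intervals of membership K_k and respective basic probability assignment m_C(K_k) are defined by:

$$K_k = \mathrm{MAX}(I_i, J_j), \qquad m_C(K_k) = \sum_{\mathrm{MAX}(I_i, J_j) = K_k} m_A(I_i) \cdot m_B(J_j)$$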
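The union can be sketched in Python in the same spirit, here also checking that IVFS and fuzzy sets behave as special cases. The sketch assumes that each pair of intervals contributes the product of its weights to the interval MAX(I_i, J_j); an IVFS is represented as a single interval with weight 1, and a fuzzy set as a single degenerate interval with weight 1. All names are hypothetical.

```python
from itertools import product
from typing import Dict, Tuple

Interval = Tuple[float, float]      # (lower, upper) bounds within [0, 1]
Membership = Dict[Interval, float]  # focal interval -> evidential weight


def interval_max(i: Interval, j: Interval) -> Interval:
    """MAX of two intervals: maximum of the lower and of the upper bounds."""
    return (max(i[0], j[0]), max(i[1], j[1]))


def union(a: Membership, b: Membership) -> Membership:
    """Unite two evidence-set memberships of the same element x."""
    result: Membership = {}
    for (i, m_i), (j, m_j) in product(a.items(), b.items()):
        k = interval_max(i, j)
        # Assumption: pairs yielding the same interval pool their joint weight.
        result[k] = result.get(k, 0.0) + m_i * m_j
    return result


a_x = {(0.2, 0.4): 0.5, (0.6, 0.8): 0.5}
b_x = {(0.3, 0.5): 0.25, (0.7, 0.9): 0.75}
print(union(a_x, b_x))  # {(0.3, 0.5): 0.125, (0.7, 0.9): 0.75, (0.6, 0.8): 0.125}

# Special cases: one interval with weight 1 (IVFS), one point with weight 1 (fuzzy set).
print(union({(0.2, 0.4): 1.0}, {(0.3, 0.5): 1.0}))  # {(0.3, 0.5): 1.0}
print(union({(0.3, 0.3): 1.0}, {(0.6, 0.6): 1.0}))  # {(0.6, 0.6): 1.0}
```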

7.1.4 Increasing Uncertainty

By utilizing the connectives of sections 7.1.2 and 7.1.3, the uncertainty of our model tends to increase, as two bodies of evidence on the unit interval are combined into a new one that preserves most of the perspectives (contexts) involved. There will be at least as many intervals in the combined set as in the combining set with fewer intervals. In other words, if i^x and j^x represent the number of intervals (perspectives) present, respectively, in combining sets A and B for element x, then the combined set C will have at least MIN(i^x,j^x) intervals for concept x. An alternative to this way of combining evidence sets is described below.

7.2 Uncertainty Decreasing Operation Between Evidence Sets

We can combine evidence sets by preserving all their perspectives (though with reduced weights, as the joint basic assignment must still add up to 1) as above, thus increasing the uncertainty complexity, or we can combine them only according to the coherent perspectives (those aiming at the same intervals) by utilizing Dempster's rule of combination, and so decrease the uncertainty complexity. Given two evidence sets A and B defined by (1) and (2), with basic probability assignments m_A and m_B, their uncertainty decreasing combination is an evidence set C(x) = A(x) ⊕ B(x), whose intervals of membership K_k and respective basic probability assignment m_C(K_k) are defined by:

$$K_k = I_i \cap J_j \neq \emptyset, \qquad m_C(K_k) = \frac{\sum_{I_i \cap J_j = K_k} m_A(I_i) \cdot m_B(J_j)}{1 - \sum_{I_i \cap J_j = \emptyset} m_A(I_i) \cdot m_B(J_j)}$$

Dempster's rule of combination eliminates all focal elements which do not coincide (or intersect) in both bodies of evidence being combined, while the operations of section 7.1 maintain some evidential weight for these, though enhancing those that do intersect.
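Dempster's rule applied to interval focal elements can be sketched as follows. This is a minimal illustration under stated assumptions: memberships are dictionaries from interval to weight, two intervals that merely touch are treated as intersecting in a single point, and total conflict is signalled with a `ValueError`. The function names are hypothetical.

```python
from itertools import product
from typing import Dict, Optional, Tuple

Interval = Tuple[float, float]      # (lower, upper) bounds within [0, 1]
Membership = Dict[Interval, float]  # focal interval -> evidential weight


def overlap(i: Interval, j: Interval) -> Optional[Interval]:
    """Set intersection of two intervals, or None when they are disjoint."""
    lower, upper = max(i[0], j[0]), min(i[1], j[1])
    return (lower, upper) if lower <= upper else None


def dempster(a: Membership, b: Membership) -> Membership:
    """Combine two memberships of the same element x with Dempster's rule."""
    focal: Membership = {}
    conflict = 0.0
    for (i, m_i), (j, m_j) in product(a.items(), b.items()):
        k = overlap(i, j)
        if k is None:
            conflict += m_i * m_j  # weight lost to disjoint pairs
        else:
            focal[k] = focal.get(k, 0.0) + m_i * m_j
    if not focal:
        raise ValueError("totally conflicting evidence: no combination possible")
    # Renormalize so that the surviving weights add up to 1.
    return {k: w / (1.0 - conflict) for k, w in focal.items()}


a_x = {(0.2, 0.4): 0.5, (0.6, 0.8): 0.5}
b_x = {(0.3, 0.5): 0.25, (0.7, 0.9): 0.75}
print(dempster(a_x, b_x))  # {(0.3, 0.4): 0.25, (0.7, 0.8): 0.75}
```

With the same operands, the MIN-based intersection of section 7.1.2 would keep three wider intervals, whereas here only the two narrower overlapping regions survive and half of the joint weight (the conflict) is normalized away.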

7.3 Choosing an operation: fast decision making and metaphor

Dempster's rule of combination is used to combine different bodies of evidence over the same frame of discernment. It is an all or nothing rule, that is, if the focal elements of two distinct bodies of evidence being combined are disjoint, no combination is possible. In this situation, in DST, if we still consider that there is relevant interaction between the two bodies of evidence which our frame of discernment cannot capture, then we either rethink our basic probability assignments or the frame of discernment is changed by introducing new primitives common to both bodies of evidence. Now consider that our model of categorization, by utilizing Dempster's rule, reaches a combination of categories whose bodies of evidence are completely incoherent. That is, no new category is obtainable. If this result is reached in some intermediate step of an approximate reasoning process, the process is naturally stopped. To be able to continue with this process, we have to obtain some transitional category. Since the frame of discernment of the belief attributes of an evidence set is the unit interval, we cannot aim to refine it in any way. For this reason, I have proposed uncertainty decreasing and increasing operations for evidence sets [Rocha, 1994]. If the evidence sets being combined are at least partially coherent, we can use Dempster's rule which will reduce the uncertainty present. If this coherency is not attainable, we can choose an uncertainty increasing operation which largely maintains the evidence from both structures being combined, until a more coherent state of evidence is encountered at a later stage.

The uncertainty decreasing operation can be used when we have coherent evidence of membership in the combining evidence sets, and when we wish to reduce dramatically the amount of uncertainty present in some simulation of human reasoning processes. In an artificial system, this operation might be identified with fast decision-making processes. Say, if we possess two categories which must be combined in order to make a fast decision, then uncertainty must be reduced and the most coherent result chosen. On the other hand, if we do not have coherent membership evidence, or if we do not need to engage in fast decision making, but instead desire to search for more conflicting, far-fetched associations (from wildly different contexts), then the uncertainty increasing operations should be chosen. These operations, by enabling the maintenance of many different contexts of the same category, which may later be reduced by the uncertainty decreasing operation, may open the way for the modeling of metaphor in models of human categorization, as previously uncorrelated contexts, considered incoherent by Dempster's rule, may be bridged together.
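The choice between the two kinds of operation can be sketched as a simple fallback policy. The sketch below is only one possible reading: it tries Dempster's rule when a fast decision is requested and the evidence is at least partially coherent, and otherwise falls back to the MIN-based intersection as the uncertainty increasing operation (the union could be substituted). The `fast_decision` flag and all function names are hypothetical.

```python
from itertools import product


def pairwise(a, b, pair_op):
    """Combine every pair of focal intervals; return (focal dict, lost weight)."""
    focal, lost = {}, 0.0
    for (i, m_i), (j, m_j) in product(a.items(), b.items()):
        k = pair_op(i, j)
        if k is None:
            lost += m_i * m_j  # weight of incoherent (disjoint) pairs
        else:
            focal[k] = focal.get(k, 0.0) + m_i * m_j
    return focal, lost


def overlap(i, j):
    """Set intersection of two intervals, or None when they are disjoint."""
    lower, upper = max(i[0], j[0]), min(i[1], j[1])
    return (lower, upper) if lower <= upper else None


def interval_min(i, j):
    """MIN of two intervals, used by the uncertainty increasing intersection."""
    return (min(i[0], j[0]), min(i[1], j[1]))


def combine(a, b, fast_decision=True):
    """Return the kind of operation used and the combined membership."""
    if fast_decision:
        focal, conflict = pairwise(a, b, overlap)
        if focal:  # at least partially coherent: Dempster's rule applies
            return "decreasing", {k: w / (1.0 - conflict) for k, w in focal.items()}
    # Incoherent evidence, or no fast decision needed: keep the contexts.
    focal, _ = pairwise(a, b, interval_min)
    return "increasing", focal


print(combine({(0.2, 0.4): 1.0}, {(0.3, 0.5): 1.0}))  # ('decreasing', {(0.3, 0.4): 1.0})
print(combine({(0.1, 0.2): 1.0}, {(0.7, 0.9): 1.0}))  # ('increasing', {(0.1, 0.2): 1.0})
```

In the second call the two bodies of evidence are disjoint, so a transitional category is still produced and the reasoning process need not stop.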

8. Applications: Data-Mining and Cognitive Categorization

The notion of uncertainty is very relevant to any discussion of the modeling of linguistic/mental abilities. From Zadeh's [1971, 1975] approximate reasoning to probabilistic and even evidential reasoning [Schum, 1994], uncertainty is more and more recognized as a very important issue in cognitive science and artificial intelligence with respect to the problems of knowledge representation and the modeling of reasoning abilities [Shafer and Pearl, 1990]. Engineers of knowledge based systems can no longer be solely concerned with issues of linguistic or cognitive representation; they must describe "reasoning" procedures which enable an artificial system to answer queries. In many artificial intelligence systems, the choice of the next step in a reasoning procedure is based upon the measurement of the system's current uncertainty state [Nakamura and Iwai, 1982; Rocha, 1991; Medina-Martins et al, 1992, 1994]. Evidence sets include fuzzy sets and interval valued fuzzy sets as special cases; however, since they capture all recognized forms of uncertainty, and include a logic of belief and context-dependency, we do not need to keep on extending their mathematical formalism to type II and type III sets in order to model linguistic categories. Now that a theory of evidential approximate reasoning and measures of uncertainty are defined for evidence sets, we can extend fuzzy data-retrieval systems to an evidence set formulation. In particular, a conversational relational database based on Nakamura and Iwai's [1982] work has been developed [Rocha, 1991, 1997c]. Its implications for linguistic modeling have been discussed in [Henry and Rocha, 1996].

Nakamura and Iwai's data-retrieval system is based on a structure with two different kinds of objects: Concepts (e.g. key-words) and Properties (e.g. records such as books). Each concept is associated with a number of properties, which may be shared with other concepts. Based on the number of properties two concepts share, a measure of similarity is established between them (figure 3). The inverse of similarity creates a (usually non-Euclidean) distance measure.
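The exact similarity formula is not reproduced here, so the following Python sketch relies on explicit assumptions: similarity is taken as the number of shared properties divided by the number of properties held by either concept, and distance as the inverse of similarity minus one, so that identical concepts are at distance zero and concepts sharing nothing are infinitely far apart. The concept names, record identifiers, and function names are hypothetical.

```python
import math
from typing import Set


def similarity(props_a: Set[str], props_b: Set[str]) -> float:
    """Assumed similarity: shared properties over all properties of either concept."""
    either = props_a | props_b
    if not either:
        return 0.0
    return len(props_a & props_b) / len(either)


def distance(props_a: Set[str], props_b: Set[str]) -> float:
    """Assumed distance: inverse of similarity, shifted so identical concepts give 0."""
    s = similarity(props_a, props_b)
    return math.inf if s == 0.0 else 1.0 / s - 1.0


# Hypothetical key-words, each associated with a set of records.
records = {
    "fuzzy logic": {"b1", "b2", "b3"},
    "logic": {"b2", "b3", "b4", "b5"},
    "control": {"b1", "b6"},
}

print(similarity(records["fuzzy logic"], records["logic"]))  # 0.4
print(distance(records["logic"], records["control"]))        # inf
```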

Users input an initial concept of interest. The system creates a bell-shaped fuzzy membership function centered on this concept and affecting all its close neighbors. After this, the system enters an interactive question-answering stage to try to reduce the fuzziness of this fuzzy set, referred to as a knowledge space. Concepts with a high degree of fuzziness are selected, and the user is asked if she is interested in them. If the answer is "YES", another bell-shaped membership function is created over this new concept, and fuzzy union is performed with the previous state of the knowledge space. If the answer is "NO", the inverse of the bell-shaped membership function is created, and fuzzy union is performed. After a few steps the system reaches a less fuzzy state of the knowledge space which captures the user's interests. Associated records are then retrieved (figure 3).
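The first steps of this dialogue can be sketched in Python: building the initial knowledge space, selecting the fuzziest concept for the next question, and updating the space after a "YES" answer. Several ingredients are assumptions made for illustration only: the distance table is invented, the bell shape is taken to be Gaussian, and the fuzziness of a single membership degree is taken to be largest at 0.5 and zero at 0 or 1.

```python
import math

# Hypothetical distances between concepts (symmetric; illustration only).
DISTANCES = {
    ("fuzzy logic", "logic"): 0.8,
    ("fuzzy logic", "control"): 1.0,
    ("fuzzy logic", "ethics"): 3.0,
    ("logic", "control"): 2.0,
    ("logic", "ethics"): 1.0,
    ("control", "ethics"): 3.0,
}
CONCEPTS = ["fuzzy logic", "logic", "control", "ethics"]


def distance(a, b):
    """Look up the distance between two concepts in either order."""
    if a == b:
        return 0.0
    return DISTANCES.get((a, b), DISTANCES.get((b, a)))


def bell(center):
    """Bell-shaped membership centered on `center` (Gaussian form assumed)."""
    return {c: math.exp(-distance(center, c) ** 2) for c in CONCEPTS}


def fuzziness(mu):
    """Assumed fuzziness of one degree: 1 at mu = 0.5, 0 at mu = 0 or 1."""
    return 1.0 - abs(2.0 * mu - 1.0)


def most_fuzzy(space):
    """Concept whose membership is the most uncertain; the next question."""
    return max(space, key=lambda c: fuzziness(space[c]))


def answer_yes(space, concept):
    """Fuzzy union (pointwise maximum) with a new bell centered on `concept`."""
    new = bell(concept)
    return {c: max(space[c], new[c]) for c in space}


space = bell("fuzzy logic")          # initial knowledge space
question = most_fuzzy(space)         # concept the user is asked about
space = answer_yes(space, question)  # user answers "YES"
print(question, space)
```

With these invented distances the system asks about "logic" first, and the summed fuzziness of the knowledge space is lower after the answer than before it.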

By extending Nakamura and Iwai's scheme to evidence sets, more forms of uncertainty can be introduced and several contexts can be treated simultaneously. Consider that instead of one single database (with concepts and properties) we have several databases which share at least some of their concepts. Since they have different properties, the similarity between concepts is different from database to database. For instance, key-word "fuzzy logic" will be differently related to key-word "logic" in the library databases of a control engineering research laboratory or a philosophy academic department. We may desire however to search for materials in both of these contexts. Evidence sets can be used to quantify the relative interest in each of these contexts (the probability restriction from DST), and the extended approximate reasoning operations of intersection and union can be used in very much the same process as the one for the fuzzy set case. Instead of reducing only the fuzziness of the knowledge space, this system aims at reducing all the three main forms of uncertainty discussed earlier.

In figure 4 the bell-shaped membership function is depicted for two different databases, with probabilistic weightings of 0.3 and 0.7 respectively. Notice that concepts that are close to the central concept x in one context may not be close in the other context, hence the introduction of contextual conflict. Figure 4 shows the membership functions as specific rather than interval valued; this is just to facilitate understanding. In reality, membership functions would be created from the two different database metrics by utilizing Turksen's DNF-CNF interval-valued relationships. Such a database retrieval system has been developed and is discussed in detail in Rocha [1997c].
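The two-context situation can be sketched in Python as a small data structure. The sketch follows the simplification of specific (point) memberships rather than intervals, so each focal element is a degenerate interval. The distances, the Gaussian bell shape, the context names, and the function names are all hypothetical, and every concept is assumed to be present in both databases.

```python
import math


def bell(d, width=1.0):
    """Bell-shaped membership as a function of distance (Gaussian form assumed)."""
    return math.exp(-(d / width) ** 2)


def contextual_membership(distances_by_context, weights):
    """Build, for each concept, focal elements weighted by context."""
    space = {}
    for context, distances in distances_by_context.items():
        for concept, d in distances.items():
            mu = round(bell(d), 6)
            focal = space.setdefault(concept, {})
            # Degenerate interval (mu, mu); equal degrees pool their weights.
            focal[(mu, mu)] = focal.get((mu, mu), 0.0) + weights[context]
    return space


# Hypothetical distances to the central concept "fuzzy logic" in two databases.
distances_by_context = {
    "engineering": {"fuzzy logic": 0.0, "logic": 2.0, "control": 0.5},
    "philosophy": {"fuzzy logic": 0.0, "logic": 0.5, "control": 2.0},
}
weights = {"engineering": 0.3, "philosophy": 0.7}

space = contextual_membership(distances_by_context, weights)
for concept, focal in space.items():
    print(concept, focal)
```

Here "logic" is far from the central concept in the engineering context and close to it in the philosophy context, so its membership carries two conflicting focal elements, weighted 0.3 and 0.7.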

References

Campbell, D.T. [1974]."Evolutionary Epistemology." In: The Philosophy of Karl Popper. P.A. Schilpp (Ed.). Open Court Publishers. pp. 413-463.

Dubois, D., and H. Prade [1985]."A review of fuzzy set aggregation connectives." Information Sciences V. 36, 85-121.

Glanville, Ranulph [1988]. Objekte. Merve Verlag.

Gorzalczany, M. B. [1987]."A method of inference in approximate reasoning based on interval-valued fuzzy sets." In: Fuzzy Sets and Systems Vol. 21, pp. 1-17.

Henry, C. and L.M. Rocha [1996]."Language theory: consensual selection of dynamics." Cybernetics and Systems '96. R. Trappl (Ed.). Austrian Society for Cybernetics, Vienna, pp. 477-482. To be reprinted in the journal Cybernetics and Systems.

Hirota, K. [1981]."Concepts of probabilistic sets." In: Fuzzy Sets and Systems Vol. 5, pp. 31-46.

Holland, J.H., K. Holyoak, R. Nisbett, and P. Thagard [1986]. Induction. MIT Press.

Klir, George J. [1991]. Facets of Systems Science. Plenum Press.

Klir, George J. [1993]."Developments in uncertainty-based information." In: Advances in Computers. M. Yovits (Ed.). Vol. 36, pp 255-332.

Lakoff, G. [1987]. Women, Fire, and Dangerous Things: What Categories Reveal About the Mind. Univ. Chicago Press.

Lorenz, K. [1971]."Knowledge, beliefs and freedom." In: Hierarchically Organized Systems in theory and practice. P. Weiss (Ed.). Hafner.

Maturana, H., and F. Varela [1987]. The Tree of Knowledge: The Biological Roots of Human Understanding. New Science Library.

Medina-Martins, Pedro R. [1995]."Metalogues: an abridge of a genetic psychology of non-natural systems." In: Communication and Cognition - Artificial Intelligence Vol. 12, nos 1-2. pp. 111-156.

Medina-Martins, Pedro R. and Luis Rocha [1992]."The in and the out: an evolutionary approach." In: Cybernetics and Systems Research '92. Robert Trappl. World Scientific Press. pp 681-689.

Medina-Martins, Pedro R., Luis Rocha, et al [1994]."Metalogues: an essay on computers' psychology -- from childhood to adulthood." In: Cybernetics and Systems 94. R. Trappl (Ed.). World Scientific Press. pp. 565-572.

Nakamura, K., and S. Iwai [1982]."A representation of analogical inference by fuzzy sets and its application to information retrieval systems." In: Fuzzy Information and Decision Processes. Gupta and Sanchez (Eds.). North-Holland. pp. 373-386.

Osherson, D., and E. Smith [1981]."On the adequacy of prototype theory as a theory of concepts." In: Cognition Vol. 9, No. 1, pp. 35-58.

Pask, Gordon [1975]. Conversation, Cognition, and Learning: A Cybernetic Theory and Methodology. Elsevier.

Piaget, J. [1971]. The Construction of Reality in the Child. Ballantine Books.

Resconi, G., G. Klir, U. St.Clair, and D. Harmanec [1993]."On the Integration of Uncertainty Theories." International Journal Of Uncertainty, Fuzziness and Knowledge-Based Systems 1(1), pp. 1-18.

Rocha, Luis M. [1991]."Fuzzification of Conversation Theory." Paper delivered at Principia Cybernetica Conference Free University of Brussels. Heylighen (ed.)

Rocha, Luis M. [1994]."Cognitive categorization revisited: extending interval valued fuzzy sets as simulation tools for concept combination." In: Proc. of the 1994 Int. Conference of NAFIPS/IFIS/NASA. IEEE Press, pp. 400-404.

Rocha, Luis M. [1995a]."Contextual genetic algorithms: evolving developmental rules." In: Advances in Artificial Life. J. Moran, A. Moreno, J.J. Merelo, and P. Chacon (Eds.). Springer-Verlag, pp. 368-382.

Rocha, Luis M. [1995b]."Interval based evidence sets." In: Proc. of ISUMA-NAFIPS'95 IEEE Press, pp. 624-629.

Rocha, Luis M. [1996a]. "Relative Uncertainty: Measuring Information in Discrete and Nondiscrete Domains". Proceedings of the North American Fuzzy Information Processing Society - NAFIPS96. IEEE Press.

Rocha, Luis M. [1996b]."Eigenbehavior and symbols." Systems Research 12 (3) (In Press).

Rocha, L.M. [1997a]. "Relative Uncertainty and Evidence Sets: A Constructivist Framework". International Journal of General Systems. Vol. 25. (In Press).

Rocha, L.M. [1997b]. "Emergent Morphology: Developing Evolutionary Strategies". Evolutionary Computation. (Submitted).

Rocha, L.M. [1997c]. Evidence Sets and Contextual Genetic Algorithms: Exploring Uncertainty, Context, and Embodiment in Biological and Cognitive Systems. PhD Dissertation. SUNY-Binghamton.

Rocha, L.M., V. Kreinovich, and K. B. Kearfott [1996]."Computing uncertainty in interval based sets." In: Applications of Interval Computations. V. Kreinovich and K.B. Kearfott (Eds.). Kluwer, pp. 337-380.

Rosch, E. [1975]."Cognitive representations of semantic categories." Journal of Experimental Psychology: General Vol. 104. pp. 192-233.

Rosch, E. [1978]."Principles of Categorization." In: Cognition and Categorization. E. Rosch and B. Lloyd (Eds.). Hillsdale. pp. 27-48.

Shafer, G. and J. Pearl (Editors) [1990]. Readings in Uncertain Reasoning. Morgan Kaufmann.

Shafer, Glenn [1976]. A Mathematical Theory of Evidence. Princeton University Press.

Smith, E. and D. Osherson [1984]."Conceptual combination with prototype concepts." Cognitive Science V. 8, 337-361.

Turksen, I.B. [1986]."Interval valued fuzzy sets based on normal forms." Fuzzy Sets and Systems Vol. 20, pp. 191-210.

von Glasersfeld, E. [1990]."An exposition of constructivism: why some like it radical." Constructivist Views on the Teaching and Learning of Mathematics. R. Davis (Ed). JRME Monographs. Reprinted in Klir [1991] 229-238.

Yager, R. [1979]."On the measure of fuzziness and negation. part I: membership in the unit interval." In: International Journal of Man-Machine Studies Vol. 11, pp. 189-200.

Yager, R. [1980]."On the measure of fuzziness and negation. part II: Lattices." Information and Control V. 44, 236-260.

Zadeh, Lotfi A. [1965]."Fuzzy Sets." In: Information and Control Vol. 8, pp. 338-353.

Zadeh, Lotfi A. [1971]."Quantitative fuzzy semantics." In: Information Sciences Vol. 3, pp. 159-176.

Zadeh, Lotfi A. [1975]."The concept of a linguistic variable and its application to approximate reasoning, I, II, and III." In: Information Sciences Vol. 8, pp. 199-249, pp. 301-357, Vol. 9, pp. 43-80.

Zadeh, Lotfi A. [1978]."PRUF-a meaning representation language for natural languages." In: International Journal of Man-Machine Studies Vol. 10, pp. 383-410.

Zhu, Q. and E. S. Lee [1995]."Evidence theory in multivalued logic systems." In: International Journal of Intelligent Systems Vol. 10, pp. 185-199.

1. An example of a radial category [after Lakoff, 1987] is the category of mother. A listing of core properties, coherent in the context of birth, would be, for instance: woman who gives birth, raises, nurtures, educates a child. However, members of the category of mother exist which do not obey such a listing: adoptive mother, surrogate mother, etc. These members do not obey the entire list; however, they are elements of the category mother. They are also not random elements, but are unpredictable until a different context is introduced.

2. This idea of interpreting bodies of evidence as perspectives derives from a generalization of Gordon Pask's [1975] Conversation Theory, which I have proposed together with the construction of a data-retrieval system [Rocha, 1991; Medina-Martins and Rocha, 1992].