Handling Fuzzy Similarity for Data Classification

Representing and consequently processing fuzzy data in standard and binary databases is problematic. The problem is amplified in binary databases, where continuous data is represented by means of discrete ‘1’ and ‘0’ bits, and it becomes even more acute in classification. There we may want to group objects based on fuzzy attributes, but an appropriate fuzzy similarity measure is not always easy to find. The current paper proposes a novel model and measure for representing fuzzy data, which lends itself to both classification and data mining.

Classification algorithms and data mining attempt to set up hypotheses regarding the assignment of different objects to groups and classes on the basis of the similarity or distance between them (Estivill-Castro & Yang, 2004; Lim, Loh & Shih, 2000; Zhang & Srihari, 2004). They are widely used in numerous fields, including: social sciences, where observations and questionnaires are used in learning mechanisms of social behavior; marketing, for segmentation and customer profiling; finance, for fraud detection; computer science, for image processing and expert systems applications; medicine, for diagnostics; and many other fields. These methodologies rest on two components: a procedure that calculates a similarity matrix from a similarity index between objects, and a grouping technique. Previous research has shown that a similarity measure based upon binary data representation yields better results than regular similarity indexes (Erlich, Gelbard & Spiegler, 2002; Gelbard, Goldman & Spiegler, 2007). However, binary representation is currently limited to nominal discrete attributes such as gender and marital status (Zhang & Srihari, 2003). This makes the binary approach to data representation unattractive for many commonly used data types.

The current research describes a novel approach to binary representation, referred to as Fuzzy Binary Representation. This new approach is suitable for all data types: nominal, ordinal, and continuous. We propose that there is meaning not only in the actual explicit attribute value, but also in its implicit similarity to other possible attribute values. These similarities can be determined either by a problem domain expert or automatically, by analyzing fuzzy functions that represent the problem domain. The added fuzzy similarity yields improved classification and data mining results. More generally, Fuzzy Binary Representation and related similarity measures illustrate that refined and carefully designed handling of data, including the elicitation of domain expertise regarding similarity, may add both value and knowledge to existing databases.
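
The idea of replacing crisp bits with similarity degrees can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the attribute values, the expert-supplied similarity table, and the choice of a fuzzy (weighted) Jaccard index are hypothetical, since the abstract does not specify the authors' actual encoding or measure.

```python
from typing import Dict, List, Sequence

# Hypothetical ordinal attribute with three possible values (assumption).
VALUES: List[str] = ["low", "medium", "high"]

# Hypothetical expert-supplied similarity between attribute values (assumption).
# Each value is fully similar to itself and partially similar to its neighbors.
VALUE_SIMILARITY: Dict[str, Dict[str, float]] = {
    "low":    {"low": 1.0, "medium": 0.5, "high": 0.0},
    "medium": {"low": 0.5, "medium": 1.0, "high": 0.5},
    "high":   {"low": 0.0, "medium": 0.5, "high": 1.0},
}


def crisp_bits(value: str) -> List[float]:
    """Standard binary representation: 1 for the explicit value, 0 elsewhere."""
    return [1.0 if v == value else 0.0 for v in VALUES]


def fuzzy_bits(value: str) -> List[float]:
    """Fuzzy variant: each position holds the similarity of `value` to that position's value."""
    return [VALUE_SIMILARITY[value][v] for v in VALUES]


def fuzzy_jaccard(a: Sequence[float], b: Sequence[float]) -> float:
    """Fuzzy Jaccard index: sum of element-wise minima over sum of element-wise maxima.

    Chosen here only as one plausible similarity index; on 0/1 vectors it
    reduces to the ordinary Jaccard index.
    """
    numerator = sum(min(x, y) for x, y in zip(a, b))
    denominator = sum(max(x, y) for x, y in zip(a, b))
    return numerator / denominator if denominator else 1.0


def similarity_matrix(vectors: Sequence[Sequence[float]]) -> List[List[float]]:
    """Pairwise similarity matrix, the input a grouping technique would consume."""
    return [[fuzzy_jaccard(a, b) for b in vectors] for a in vectors]


if __name__ == "__main__":
    objects = ["low", "medium", "high"]
    print(similarity_matrix([crisp_bits(o) for o in objects]))
    print(similarity_matrix([fuzzy_bits(o) for o in objects]))
```

With crisp bits, every pair of distinct values has similarity 0, so "low" is as far from "medium" as it is from "high". With the fuzzy bits, the implicit similarity between neighboring values survives into the matrix, which is the kind of extra information the abstract argues can improve grouping.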

IGI Global

Roy Gelbard Avichai Meged

Encyclopedia of Artificial Intelligence

2011
