Search engine for discovering works of Art, research articles, and books related to Art and Culture
ShareThis
Javascript must be enabled to continue!

Subgraph Mining

View through CrossRef
The amount of available data is increasing very fast. With this data, the desire for data mining is also growing. More and larger databases have to be searched to find interesting (and frequent) elements and connections between them. Most often the data of interest is very complex. It is common to model complex data with the help of graphs consisting of nodes and edges that are often labeled to store additional information. Having a graph database, the main goal is to find connections and similarities between its graphs. Based on these connections and similarities, the graphs can be categorized, clustered or changed according to the application area. Regularly occurring patterns in the form of subgraphs —called fragments in this context—that appear at least in a certain percentage of graphs, are a common method to analyze graph databases. The actual occurrence of a fragment in a database graph is called embedding. Finding the fragments and their embeddings is the goal of subgraph mining described in detail in this chapter. The first published graph mining algorithm, called Subdue, appeared in the mid-1990s and is still used in different application areas and was extended in several ways. (Cook & Holder, 2000). Subdue is based on a heuristic search and does not find all possible fragments and embeddings. It took a few more years before more and faster approaches appeared. In (Helma, Kramer, & de Raedt, 2002) graph databases are mined for simple paths, for a lot of other applications only trees are of interest (Rückert & Kramer, 2004). Also Inductive Logic Programming (Finn et al., 1998) was applied in this area. At the beginning of the new millennium finally more and more and every time faster approaches for general mining of graph databases were developed that were able to find all possible fragments. (Borgelt & Berthold, 2002; Yan & Han, 2002; Kuramochi & Karypis, 2001; Nijssen & Kok, 2004). Several different application areas for graph mining are researched. The most common area is mining molecular databases where the molecules are displayed by their two-dimensional structure. When analyzing molecules it is interesting to find patterns that might explain why a certain set of molecules is useful as a drug against certain diseases (Borgelt & Berthold, 2002). Similar problems occur for protein databases. Here graph data mining can be used to find structural patterns in the primary, secondary and tertiary structure of protein categories (Cook & Holder, 2000). Another application area are web searches (Cook, Manocha, & Holder, 2003). Existing search engines use linear feature matches. Using graphs as underlying data structure, nodes represent pages, documents or document keywords and edges represent links between them. Posing a query as a graph means a smaller graph has to be embedded in the larger one. The graph modeling the data structure can be mined to find similar clusters. Quite new is the application of subgraph mining in optimizing code for embedded devices. With the help of so-called procedural abstraction, the size of pre-compiled binaries can be reduced which is often crucial because of the limited storage capacities of embedded systems. There, subgraph mining helps identifying common structures in the program’s control flow graph which can then be combined (“abstracted”) into a single procedure (Dreweke et.al., 2007).
Title: Subgraph Mining
Description:
The amount of available data is increasing very fast.
With this data, the desire for data mining is also growing.
More and larger databases have to be searched to find interesting (and frequent) elements and connections between them.
Most often the data of interest is very complex.
It is common to model complex data with the help of graphs consisting of nodes and edges that are often labeled to store additional information.
Having a graph database, the main goal is to find connections and similarities between its graphs.
Based on these connections and similarities, the graphs can be categorized, clustered or changed according to the application area.
Regularly occurring patterns in the form of subgraphs —called fragments in this context—that appear at least in a certain percentage of graphs, are a common method to analyze graph databases.
The actual occurrence of a fragment in a database graph is called embedding.
Finding the fragments and their embeddings is the goal of subgraph mining described in detail in this chapter.
The first published graph mining algorithm, called Subdue, appeared in the mid-1990s and is still used in different application areas and was extended in several ways.
(Cook & Holder, 2000).
Subdue is based on a heuristic search and does not find all possible fragments and embeddings.
It took a few more years before more and faster approaches appeared.
In (Helma, Kramer, & de Raedt, 2002) graph databases are mined for simple paths, for a lot of other applications only trees are of interest (Rückert & Kramer, 2004).
Also Inductive Logic Programming (Finn et al.
, 1998) was applied in this area.
At the beginning of the new millennium finally more and more and every time faster approaches for general mining of graph databases were developed that were able to find all possible fragments.
(Borgelt & Berthold, 2002; Yan & Han, 2002; Kuramochi & Karypis, 2001; Nijssen & Kok, 2004).
Several different application areas for graph mining are researched.
The most common area is mining molecular databases where the molecules are displayed by their two-dimensional structure.
When analyzing molecules it is interesting to find patterns that might explain why a certain set of molecules is useful as a drug against certain diseases (Borgelt & Berthold, 2002).
Similar problems occur for protein databases.
Here graph data mining can be used to find structural patterns in the primary, secondary and tertiary structure of protein categories (Cook & Holder, 2000).
Another application area are web searches (Cook, Manocha, & Holder, 2003).
Existing search engines use linear feature matches.
Using graphs as underlying data structure, nodes represent pages, documents or document keywords and edges represent links between them.
Posing a query as a graph means a smaller graph has to be embedded in the larger one.
The graph modeling the data structure can be mined to find similar clusters.
Quite new is the application of subgraph mining in optimizing code for embedded devices.
With the help of so-called procedural abstraction, the size of pre-compiled binaries can be reduced which is often crucial because of the limited storage capacities of embedded systems.
There, subgraph mining helps identifying common structures in the program’s control flow graph which can then be combined (“abstracted”) into a single procedure (Dreweke et.
al.
, 2007).

Related Results

Optimisation of potash mining technology for cell and pillar mining method
Optimisation of potash mining technology for cell and pillar mining method
The diverse demand for inorganic fertilizers has predetermined the intensification of potash mining, which is a raw material for their production. In this regard, it has become nec...
PENGEMBANGAN MASYARAKAT LINGKAR TAMBANG DALAM PENGUSAHAAN PERTAMBANGAN
PENGEMBANGAN MASYARAKAT LINGKAR TAMBANG DALAM PENGUSAHAAN PERTAMBANGAN
Indonesia is a country rich in mining resources. Mining resources include gold, silver, copper, oil and gas, coal and others. There are a large number of companies operating in the...
Efficient Frequent Subgraph Mining for Dynamic Network Graphs Using Golden Dung Graph Hybridization
Efficient Frequent Subgraph Mining for Dynamic Network Graphs Using Golden Dung Graph Hybridization
ABSTRACTFrequent subgraph mining (FSM) is one of the most critical procedures for mining meaningful patterns in large and dynamic graph datasets, common in several applications, su...
The use or abuse of thematic mining information maps
The use or abuse of thematic mining information maps
Abstract Thematic and environmental geology mapping has been applied in recent years by the British Geological Survey (BGS) in cities, towns, urban fringes, rural areas a...
Substantiation of resource-saving technology when mining the deposits for the production of crushed-stone products
Substantiation of resource-saving technology when mining the deposits for the production of crushed-stone products
Purpose. Scientific substantiation of the expedient depth of mining the non-metallic deposits of rocky minerals on the basis of mathematical and statistical methods, which will ens...
ON THE DEVELOPMENT OF A GENERAL METHOD FOR FORECASTING THE DANGEROUS PROPERTIES OF COAL SEAMS
ON THE DEVELOPMENT OF A GENERAL METHOD FOR FORECASTING THE DANGEROUS PROPERTIES OF COAL SEAMS
Purpose: to establish a quantitative effect on the dust-generating ability of mine layers of the degree of metamorphic transformations of fossil coals, mining-geological and mining...
Multi-Subgraph Fusion: An Innovative Approach for Block Matrix Graph Convolutional Networks
Multi-Subgraph Fusion: An Innovative Approach for Block Matrix Graph Convolutional Networks
Abstract Graph Convolutional Networks (GCNs) is a dominant approach for graph representation learning through neighborhood aggregation.However, existing GCN methods rely on...

Back to Top