Search engine for discovering works of Art, research articles, and books related to Art and Culture
ShareThis
Javascript must be enabled to continue!

Optimized Distributed Subgraph Matching Algorithm Based on Partition Replication

View through CrossRef
At present, with the explosive growth of data scale, subgraph matching for massive graph data is difficult to satisfy with efficiency. Meanwhile, the graph index used in existing subgraph matching algorithm is difficult to update and maintain when facing dynamic graphs. We propose a distributed subgraph matching algorithm based on Partition Replica (noted as PR-Match) to process the partition and storage of large-scale data graphs. The PR-Match algorithm first splits the query graph into sub-queries, then assigns the sub-query to each node for sub-graph matching, and finally merges the matching results. In the PR-Match algorithm, we propose a heuristic rule based on prediction cost to select the optimal merging plan, which greatly reduces the cost of merging. In order to accelerate the matching speed of the sub-query graph, a vertex code based on the vertex neighbor label signature is proposed, which greatly reduces the search space for the subquery. As the vertex code is based on the increment, the problem that the feature-based graph index is difficult to maintain in the face of the dynamic graph is solved. An abundance of experiments on real and synthetic datasets demonstrate the high efficiency and strong scalability of the PR-Match algorithm when handling large-scale data graphs.
Title: Optimized Distributed Subgraph Matching Algorithm Based on Partition Replication
Description:
At present, with the explosive growth of data scale, subgraph matching for massive graph data is difficult to satisfy with efficiency.
Meanwhile, the graph index used in existing subgraph matching algorithm is difficult to update and maintain when facing dynamic graphs.
We propose a distributed subgraph matching algorithm based on Partition Replica (noted as PR-Match) to process the partition and storage of large-scale data graphs.
The PR-Match algorithm first splits the query graph into sub-queries, then assigns the sub-query to each node for sub-graph matching, and finally merges the matching results.
In the PR-Match algorithm, we propose a heuristic rule based on prediction cost to select the optimal merging plan, which greatly reduces the cost of merging.
In order to accelerate the matching speed of the sub-query graph, a vertex code based on the vertex neighbor label signature is proposed, which greatly reduces the search space for the subquery.
As the vertex code is based on the increment, the problem that the feature-based graph index is difficult to maintain in the face of the dynamic graph is solved.
An abundance of experiments on real and synthetic datasets demonstrate the high efficiency and strong scalability of the PR-Match algorithm when handling large-scale data graphs.

Related Results

Distributed subgraph matching on timely dataflow
Distributed subgraph matching on timely dataflow
Recently there emerge many distributed algorithms that aim at solving subgraph matching at scale. Existing algorithm-level comparisons failed to provide a systematic view of distri...
A Fast Pattern Matching Algorithm Based on Middle Characters of Pattern String
A Fast Pattern Matching Algorithm Based on Middle Characters of Pattern String
String pattern matching is one of the important string operation. At present, the pattern matching algorithm of strings mainly includes BF algorithm, KMP algorithm, and improved KM...
2021 Census to Census Coverage Survey Matching Results.
2021 Census to Census Coverage Survey Matching Results.
The 2021 England and Wales Census was matched to the Census Coverage Survey (CCS). This was an essential requisite for estimating undercount in the Census. To ensure outputs could ...
RDF Subgraph Matching by Means of Star Decomposition
RDF Subgraph Matching by Means of Star Decomposition
<p>With the continuous development of the network, the scale of RDF data is becoming larger and larger. In the face of large-scale RDF data processing, the traditional databa...
Single‐Molecule Optical Replication Mapping (ORM) Suggests Human Replication Timing is Regulated by Stochastic Initiation
Single‐Molecule Optical Replication Mapping (ORM) Suggests Human Replication Timing is Regulated by Stochastic Initiation
DNA replication timing is regulated by the timing of initiation across the genome. However, there is no consensus as to how initiation timing is regulated. Deterministic models con...
Hepatitis C Virus Replication Depends on Endosomal Cholesterol Homeostasis
Hepatitis C Virus Replication Depends on Endosomal Cholesterol Homeostasis
ABSTRACT Similar to other positive-strand RNA viruses, hepatitis C virus (HCV) causes massive rearrangements of intracellular membranes, resulting in a membranous web (MW...
Identification of 1600 replication origins in S. cerevisiae
Identification of 1600 replication origins in S. cerevisiae
Abstract There are approximately 500 known origins of replication in the yeast genome, and the process by which DNA replication initiates at these locations is well understood. In ...
Map-Matching Algorithm Based on Hidden Markov and Constraint Value Pruning
Map-Matching Algorithm Based on Hidden Markov and Constraint Value Pruning
Map matching is the process of matching global positioning system (GPS) trajectory data with map data. Its purpose is to determine the actual route of the moving object. Because of...

Back to Top