A Scalable Data Structure for Efficient Graph Analytics and In-Place Mutations
The graph model enables a broad range of analyses; thus, graph processing (GP) is an invaluable tool in data analytics. At the heart of every GP system lies a concurrent graph data structure that stores the graph. Such a data structure needs to be highly efficient for both graph algorithms and queries. Due to the continuous evolution, the sparsity, and the scale-free nature of real-world graphs, GP systems face the challenge of providing an appropriate graph data structure that enables both fast analytical workloads and fast, low-memory graph mutations. Existing graph structures offer a hard tradeoff among read-only performance, update friendliness, and memory consumption upon updates. In this paper, we introduce CSR++, a new graph data structure that removes these tradeoffs and enables both fast read-only analytics and quick, memory-friendly mutations. CSR++ combines ideas from CSR, the fastest read-only data structure, and adjacency lists (ALs) to achieve the best of both worlds. We compare CSR++ to CSR, ALs from the Boost Graph Library (BGL), and the following state-of-the-art update-friendly graph structures: LLAMA, STINGER, GraphOne, and Teseo. In our evaluation, which is based on popular GP algorithms executed over real-world graphs, we show that CSR++ remains close to CSR in read-only concurrent performance (within 10% on average) while significantly outperforming CSR (by an order of magnitude) and LLAMA (by almost 2×) with frequent updates. We also show that both CSR++’s update throughput and analytics performance exceed those of several state-of-the-art graph structures while maintaining low memory consumption when the workload includes updates.
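To make the tradeoff named in the abstract concrete, the following minimal Python sketch contrasts a textbook compressed sparse row (CSR) layout with plain adjacency lists. It is an illustration only: the abstract does not describe the internal layout of CSR++, so nothing here should be read as the authors' design. All function names (`build_csr`, `csr_neighbors`, `csr_insert_edge`, `al_insert_edge`) are hypothetical.

```python
from typing import List, Tuple


def build_csr(num_vertices: int, edges: List[Tuple[int, int]]):
    """Build textbook CSR arrays (offsets, targets) from a directed edge list."""
    offsets = [0] * (num_vertices + 1)
    for src, _ in edges:
        offsets[src + 1] += 1          # count the out-degree of each vertex
    for v in range(num_vertices):
        offsets[v + 1] += offsets[v]   # prefix sum gives each vertex's start index
    targets = [0] * len(edges)
    cursor = offsets[:-1].copy()       # next free slot for each vertex
    for src, dst in edges:
        targets[cursor[src]] = dst
        cursor[src] += 1
    return offsets, targets


def csr_neighbors(offsets: List[int], targets: List[int], v: int) -> List[int]:
    """Neighbors of v occupy one contiguous slice, which favors fast scans."""
    return targets[offsets[v]:offsets[v + 1]]


def csr_insert_edge(offsets: List[int], targets: List[int], src: int, dst: int) -> None:
    """In-place insertion into plain CSR: shifts every later edge and
    bumps every later offset, so the cost grows with the graph size."""
    targets.insert(offsets[src + 1], dst)
    for v in range(src + 1, len(offsets)):
        offsets[v] += 1


def al_insert_edge(adj: List[List[int]], src: int, dst: int) -> None:
    """Adjacency-list insertion touches only the list of the source vertex."""
    adj[src].append(dst)


edges = [(0, 1), (0, 2), (1, 2), (2, 0)]
offsets, targets = build_csr(3, edges)
csr_insert_edge(offsets, targets, 1, 0)

adj = [[1, 2], [2], [0]]
al_insert_edge(adj, 1, 0)
```

In this sketch, the CSR insertion rewrites state for every vertex after the source, while the adjacency-list insertion is local to one vertex; that difference is the kind of read-versus-update tension the abstract refers to.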
Related Results
Precision Farming and Predictive Analytics in …
The scope of sensor networks and the Internet of Things is expanding rapidly to diverse domains including, but not limited to, sports, health, and business trading. In the recent past, the sensors...
Graph convolutional neural networks for 3D data analysis
Deep Learning allows the extraction of complex features directly from raw input data, eliminating the need for hand-crafted features from the classical Machine Learning p...
Dynamics of Mutations in Patients with ET Treated with Imetelstat
Abstract
Background: Imetelstat, a first in class specific telomerase inhibitor, induced hematologic responses in all patients (pts) with essential thrombocythemia (...
High Resolution Melt Analysis for Rapid and Cost-Effective Screening of TP53 Mutations in Patients with Myeloid Malignancies
Abstract
Background
Recent reports have highlighted an adverse impact of TP53 mutations on the prognosis of patients with myeloid malignancies. TP53 m...
Clinical and Biological Implications of CUX1 Mutations in Myeloid Neoplasms
Abstract
Recurrent somatic mutations of CUX1 are described in myeloid neoplasms. CUX1 is located at chromosome 7q22.1; -7/del(7q) involving CUX1 locus are common abn...
Data Analytics on Graphs Part I: Graphs and Spectra on Graphs
The area of Data Analytics on graphs promises a paradigm shift, as we approach information processing of new classes of data which are typically acquired on irregular but structure...
Service Quality Improvement in the Banking Sector: A Data Analytics Perspective
Service quality in the banking sector is a critical determinant of customer satisfaction, loyalty, and competitive advantage. As banks strive to meet the evolving expectations of c...

