Differential privacy and SPARQL
Differential privacy is a framework that provides formal tools for developing algorithms that access databases and answer statistical queries with quantifiable accuracy and privacy guarantees. The notions of differential privacy are defined independently of the data model and the query language at stake. Most differential privacy results have been obtained for aggregation queries, such as counting or finding maximum or average values, and for grouping queries over aggregations, such as the creation of histograms. So far, research on the framework has typically used the relational data model and the SQL query language. However, effective realizations of differential privacy for SQL queries that require joins have been limited. This has severely restricted the application of differential privacy to RDF knowledge graphs and SPARQL queries: because RDF data is structurally simple, most useful queries over RDF graphs require intensive use of joins. Recently, new differential privacy techniques have been developed that can be applied to many types of joins in SQL with reasonable results. This raises the question of whether these new results carry over to RDF and SPARQL. In this paper we answer this question positively by presenting an algorithm that answers counting queries over a large class of SPARQL queries while guaranteeing differential privacy, provided that the RDF graph is accompanied by semantic information about its structure. We have implemented our algorithm and conducted several experiments that show the feasibility of our approach for large graph databases. Our aim has been to present an approach that can serve as a stepping stone towards extensions and other realizations of differential privacy for SPARQL and RDF.
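To make the notion of a differentially private counting query concrete, the following is a minimal sketch of the standard Laplace mechanism applied to a count over a toy set of triples. It is not the algorithm presented in the paper. The names `laplace_noise`, `private_count`, `TRIPLES`, and the `sensitivity` parameter are hypothetical and chosen only for illustration. The sketch assumes the caller already knows a valid bound on the query's sensitivity; for queries with joins that bound can be much larger than 1, which is the difficulty the abstract refers to.

```python
import random

# Toy graph as (subject, predicate, object) triples; the data is made up.
TRIPLES = [
    ("alice", "worksAt", "acme"),
    ("bob", "worksAt", "acme"),
    ("carol", "worksAt", "initech"),
    ("alice", "knows", "bob"),
]


def laplace_noise(scale, rng):
    """Draw one sample from Laplace(0, scale).

    The difference of two independent Exp(1) draws is Laplace(0, 1),
    so multiplying by `scale` gives Laplace(0, scale).
    """
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def private_count(true_count, epsilon, sensitivity=1.0, rng=None):
    """Return a noisy count using the Laplace mechanism.

    `sensitivity` is an assumed upper bound on how much the count can
    change when one individual's data is added or removed. The default
    of 1.0 is only appropriate for simple, join-free counts.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if sensitivity <= 0:
        raise ValueError("sensitivity must be positive")
    rng = rng or random.Random()
    return true_count + laplace_noise(sensitivity / epsilon, rng)


# Count the triples whose predicate is "worksAt", then add noise.
true_count = sum(1 for (_, p, _) in TRIPLES if p == "worksAt")
noisy_count = private_count(true_count, epsilon=0.5, rng=random.Random(0))
print(true_count, noisy_count)
```

Smaller values of `epsilon` give stronger privacy but noisier answers, and a larger sensitivity bound increases the noise by the same factor, which is why loose bounds on join queries degrade accuracy.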
Related Results
Augmented Differential Privacy Framework for Data Analytics
Differential privacy has emerged as a popular privacy framework for providing privacy preserving noisy query answers based on statistical properties of databases. ...
Creating RESTful APIs over SPARQL endpoints using RAMOSE
Semantic Web technologies are widely used for storing RDF data and making them available on the Web through SPARQL endpoints, queryable using the SPARQL query language. While the u...
Privacy Risk in Recommender Systems
Nowadays, recommender systems are mostly used in many online applications to filter information and help users in selecting their relevant requirements. It avoids users to become o...
Skyline Queries in SPARQL: An Overview
The growth of RDF (Resource Description Framework) datasets and the expansion of their use in conjunction with the definition of SPARQL, a declarative query language, have made RDF...
Automated extraction of attributes of IFC objects based on graph theory and SPARQL query
Building Information Modelling (BIM) has been widely adopted as an effective means for supporting information exchange in Architectural, Engineering and Con...
Differential privacy learned index
Indexes are fundamental components of database management systems, traditionally implemented through structures like B-Tree, Hash, and BitMap indexes. These index structures map ke...
THE SECURITY AND PRIVACY MEASURING SYSTEM FOR THE INTERNET OF THINGS DEVICES
The purpose of the article: elimination of the gap in existing need in the set of clear and objective security and privacy metrics for the IoT devices users and manufacturers and a...
Heterogeneous Differential Privacy
The massive collection of personal data by personalization systems has rendered the preservation of privacy of individuals more and more difficult. Most of the proposed approaches ...

