Qrlew framework
Open source SQL manipulation framework written in Rust
What is Qrlew?
Qrlew is an open source library that aims to parse and compile SQL queries into an Intermediate Representation (IR) that is well-suited for various rewriting tasks. Although it was originally designed for privacy-focused applications, it can be utilized for a wide range of purposes.
SQL Query IR
Qrlew transforms a SQL query into a combination of simple operations such as Map
, Reduce
and Join
that are applied to Tables
. This representation simplifies the process of rewriting queries and reduces dependencies on the diverse range of syntactic constructs present in SQL.
Type Inference Engine
Differential Privacy (DP) guaranrtees are hard to obtain without destroying too much information. In many mechanisms having prior bounds on values can improve the utility of DP results dramatically. By propagating types cleverly, Qrlew can returns bounds for all values.
Differential Privacy compiler
Qrlew can compile SQL queries into Differentially Private ones. The process is inspired by Wilson et al. 2020. The complexity of the compilation process makes Qrlew IR very useful at delivering clean, readable and reliable code.
Get Started
Qrlew is a Rust library. To add it to your project, simply type:
bash# Create a new project
cargo new super-sql-app
# Change directory
cd super-sql-app
# Add qrlew library
cargo add qrlew
Qrlew comes as a python library. For now it has limited features. To install it simply type:
bashpip install pyqrlew
Parse SQL queries into Qrlew intermediate representation
Qrlew transforms a SQL query into a combination of simple operations such as Map, Reduce and Join that are applied to Tables. This representation simplifies the process of rewriting queries and reduces dependencies on the diverse range of syntactic constructs present in SQL.
SQLSELECT * FROM order_table JOIN item_table
ON id=order_id;
Rustuse qrlew::{sql::parse, Relation};
let relation = Relation::try_from(
parse("SELECT * FROM order_table JOIN
item_table ON id=order_id;")
.unwrap()
.with(&relations),
).unwrap();
Output SQL
After the rewritting process has happened in the intermediate representation, Qrlew can outputs plain SQL.
Rustuse sqlparser::ast::Query;
let query = Query::from(&relation);
SQLWITH
join__e_y (field_eygr, field_0wjz, field_cg0j,
field_idxm, field_0eqn, field_3ned, field_gwco)
AS (SELECT * FROM order_table JOIN item_table
ON (order_table.id) = (item_table.order_id)),
map_8r2s (field_eygr, field_0wjz, field_cg0j,
field_idxm, field_0eqn, field_3ned, field_gwco)
AS (SELECT field_eygr AS field_eygr, field_0wjz AS
field_0wjz, field_cg0j AS field_cg0j, field_idxm
AS field_idxm, field_0eqn AS field_0eqn,
field_3ned AS field_3ned, field_gwco AS field_gwco
FROM join__e_y)
SELECT * FROM map_8r2s;
Track the privacy unit accross SQL queries
Differential Privacy (DP) is defined with respect to a distance between datasets. A DP mechanisms is such that computations on neighboring datasets yield indistinguishable results. Neighboring datasets are datasets at distance 1. In Qrlew, the distance is defined in terms of number if differing protected entities (PE). The PE is defined as an identifying string _PROTECTED_ENTITY_ID_
added to each table.
SQLSELECT * FROM order_table JOIN item_table
ON id=order_id;
Rustuse qrlew::{sql::parse, Relation};
let relation = Relation::try_from(
parse("SELECT * FROM order_table JOIN
item_table ON id=order_id;")
.unwrap()
.with(&relations),
).unwrap();
// Define the privacy unit: here the name of the user
let relation = relation.force_protect_from_field_paths(
&relations,
&[
(
"item_table",
&[
("order_id", "order_table", "id"),
("user_id", "user_table", "id"),
],
"name",
),
("order_table", &[("user_id", "user_table", "id")], "name"),
("user_table", &[], "name"),
],
);
Compile SQL queries into Differentially Private ones
The process is inspired by Wilson et al. 2020. The complexity of the compilation process makes Qrlew IR very useful at delivering clean, readable and reliable code.
SQLSELECT * FROM order_table JOIN item_table
ON id=order_id;
Rustuse qrlew::{sql::parse, Relation};
let relation = Relation::try_from(
parse("SELECT * FROM order_table JOIN
item_table ON id=order_id;")
.unwrap()
.with(&relations),
).unwrap();
// Add noise to get 𝜀,𝛿-DP
let relation = relation.dp_compilation(
&relations,
&[
(
"item_table",
&[
("order_id", "order_table", "id"),
("user_id", "user_table", "id"),
],
"name",
),
("order_table", &[("user_id", "user_table", "id")], "name"),
("user_table", &[], "name"),
],
1., // epsilon
1e-5 // delta
);
Why Qrlew?
Sarus Technologies builds a state-of-the art product to access private data without seeing it. Giving the possibility to run SQL queries safely is an important part of Sarus. Qrlew is at the core of the next iteration of Sarus SQL engine. It is not yet in production but should be gradually integrated in Sarus and fully integrated by the end of the year. Besides the DP algorithm need to be trusted, hence the open source release.
Where is it Going?
Qrlew is actively developped as the core of Sarus SQL offer, but it aims at being used elsewhere. Many connectors to other tools in Differential Privacy should be developped in the coming months.