From 9fa26b6edc9be6b46957daad74084bfaa2e05c3c Mon Sep 17 00:00:00 2001
From: kateandrews
Date: Wed, 27 Nov 2024 16:11:17 +1100
Subject: [PATCH 1/5] EQL docs edits

Edit for consistency and typos
---
 docs/concepts/WHY.md | 39 ++++++++++++++++-----------------------
 1 file changed, 16 insertions(+), 23 deletions(-)

diff --git a/docs/concepts/WHY.md b/docs/concepts/WHY.md
index a717acf2..201fd0c1 100644
--- a/docs/concepts/WHY.md
+++ b/docs/concepts/WHY.md
@@ -1,16 +1,13 @@
 # Postgres data security with CipherStash

-This article gives a high-level overview of CipherStash's encryption in use solution, including the CipherStash Proxy and the Encrypt Query Language (EQL).
+This page gives a high-level overview of CipherStash's encryption in use solution, including CipherStash Proxy and the Encrypt Query Language (EQL). It's designed for developers and engineers who need to implement robust data security in PostgreSQL without sacrificing performance or usability.

-It is designed for developers and engineers who need to implement robust data security in PostgreSQL without sacrificing performance or usability.
-
-## Table of Contents
+## On this page

 1. [Encryption in use](#encryption-in-use)
    - [What is encryption in use?](#what-is-encryption-in-use)
    - [Why use encryption in use?](#why-use-encryption-in-use)
 2. [CipherStash Proxy](#cipherstash-proxy)
-   - [Proxy overview](#proxy-overview)
    - [How it works](#how-it-works)
 3. [Encrypt Query Language (EQL)](#encrypt-query-language-eql)
 4. [Best practices](#best-practices)
@@ -20,7 +17,8 @@ It is designed for developers and engineers who need to implement robust data se

 ## Encryption in use

-EQL enables encryption in use, without significant changes to your application code.
+CipherStash's encryption in use solution, comprising CipherStash Proxy and EQL, provides a practical way to enhance data security in Postgres databases.
+EQL enables encryption in use without significant changes to your application code.
 A variety of searchable encryption techniques are available, including:

 - **Matching** - Equality or partial matches
@@ -44,8 +42,6 @@ Encryption in use mitigates this risk by ensuring that:

 ## CipherStash Proxy

-### Proxy overview
-
 CipherStash Proxy is a transparent proxy that sits between your application and your PostgreSQL database.
 It intercepts SQL queries and handles the encryption and decryption of data on-the-fly.
 This enables encryption in use without significant changes to your application code.
@@ -63,19 +59,19 @@ This enables encryption in use without significant changes to your application c
 Encrypt Query Language (EQL) is a set of PostgreSQL functions and data types provided by CipherStash to work with encrypted data and indexes.
 EQL allows you to perform queries on encrypted data without decrypting it, supporting operations like equality checks, range queries, and unique constraints.

-To get started, view the [Getting Started](https://github.com/cipherstash/encrypt-query-language/blob/main/GETTINGSTARTED.md) guide.
+To get started, read the [Getting started](https://github.com/cipherstash/encrypt-query-language/blob/main/GETTINGSTARTED.md) guide.

-## Best Practices
+## Best practices

-- **Leverage CipherStash Proxy**: Use CipherStash Proxy to handle encryption/decryption transparently.
-- **Utilize EQL functions**: Always use EQL functions when interacting with encrypted data.
-- **Define constraints**: Apply database constraints to maintain data integrity.
-- **Secure key management**: Ensure encryption keys are securely managed and stored.
-- **Monitor performance**: Keep an eye on query performance and optimize as needed.
+- **Use CipherStash Proxy** to handle encryption/decryption transparently.
+- **Use EQL functions** when interacting with encrypted data.
+- **Define database constraints** to maintain data integrity.
+- **Secure key management** of encryption keys.
+- **Monitor query performance** and optimize as needed.

-## Advanced Topics
+## Advanced topics

-### Integrating without CipehrStash Proxy
+### Integrating without CipherStash Proxy

 > The SDK approach is currently in development, but if you're interested in contributing, please start a discussion [here](https://github.com/cipherstash/encrypt-query-language/discussions).

@@ -88,11 +84,8 @@ For advanced users who prefer to handle encryption within their application:

 **Note**: This approach increases complexity and is recommended only if CipherStash Proxy does not meet specific requirements.

-## Conclusion
-
-CipherStash's encryption in use solution, comprising CipherStash Proxy and EQL, provides a practical way to enhance data security in Postgres databases.
-By keeping data encrypted even during processing, you minimize the risk of data breaches and comply with stringent security standards without significant changes to your application logic.
+## Getting started

-To get started, see the [Getting Started](https://github.com/cipherstash/encrypt-query-language/blob/main/GETTINGSTARTED.md) guide.
+To get started using CipherStash's encryption in use solution, see the [Getting Started](https://github.com/cipherstash/encrypt-query-language/blob/main/GETTINGSTARTED.md) guide.

-**Contact Support:** For further assistance, raise an issue [here](https://github.com/cipherstash/encrypt-query-language/issues).
+For further help, raise an issue [here](https://github.com/cipherstash/encrypt-query-language/issues).

From 7d634722b5761025aafe4d4f3a88f1f1f9b4cde1 Mon Sep 17 00:00:00 2001
From: kateandrews
Date: Wed, 27 Nov 2024 16:39:29 +1100
Subject: [PATCH 2/5] Edits to README.md

---
 docs/README.md | 24 +++++++++---------------
 1 file changed, 9 insertions(+), 15 deletions(-)

diff --git a/docs/README.md b/docs/README.md
index 4462be55..c34f3d53 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -1,25 +1,19 @@
-# EQL Documentation
+# EQL documentation

 This directory contains the documentation for the Encrypt Query Language (EQL).

-## Concepts
+## About

-The following concepts are available:
+- [Postgres data security with CipherStash](concepts/WHY.md)

-- [Why we built EQL](concepts/WHY.md)
+## How-to guides

-## Reference
+- [Getting started](tutorials/GETTINGSTARTED.md)
+- [Using CipherStash Proxy](tutorials/PROXY.md)

-The following reference guides are available:
+## Reference

 - [EQL index configuration](reference/INDEX.md)
-- [JSONB and JSON support](reference/JSON.md)
-- [Migrating plaintext data](reference/MIGRATOR.md)
+- [EQL with JSON and JSONB](reference/JSON.md)
+- [CipherStash Migrator](reference/MIGRATOR.md)
 - [EQL payload data format](reference/PAYLOAD.md)
-
-## Tutorials
-
-The following tutorials are available:
-
-- [Getting started](tutorials/GETTINGSTARTED.md)
-- [Using CipherStash Proxy](tutorials/PROXY.md)
\ No newline at end of file

From 229cb9c5076e28cb201e82091e04e568a113771c Mon Sep 17 00:00:00 2001
From: kateandrews
Date: Wed, 27 Nov 2024 16:59:03 +1100
Subject: [PATCH 3/5] INDEX.md edits

---
 docs/reference/INDEX.md | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/docs/reference/INDEX.md b/docs/reference/INDEX.md
index a80d236d..d2554b00 100644
--- a/docs/reference/INDEX.md
+++ b/docs/reference/INDEX.md
@@ -1,7 +1,7 @@
 # EQL index configuration

 The following functions allow you to configure indexes for encrypted columns.
-All these functions modify the `cs_configuration_v1` table in your database, and is added during the EQL installation.
+All these functions modify the `cs_configuration_v1` table in your database, and are added during the EQL installation.

 > **IMPORTANT:** When you modify or add an index, you must re-encrypt data that's already been stored in the database.
 The CipherStash encryption solution will encrypt the data based on the current state of the configuration.
@@ -24,7 +24,7 @@ SELECT cs_add_index_v1(
 | ------------- | -------------------------------------------------- | ------------------------------------------------------------------------ |
 | `table_name` | Name of target table | Required |
 | `column_name` | Name of target column | Required |
-| `index_name` | The index kind | Required. |
+| `index_name` | The index kind | Required |
 | `cast_as` | The PostgreSQL type decrypted data will be cast to | Optional. Defaults to `text` |
 | `opts` | Index options | Optional for `match` indexes, required for `ste_vec` indexes (see below) |

@@ -44,7 +44,7 @@ Supported types:

 A match index enables full text search across one or more text fields in queries.

-The default Match index options are:
+The default match index options are:

 ```json
 {
@@ -93,7 +93,7 @@ Specifically, searching for strings _shorter_ than the `tokenLength` parameter w

 If you're using n-gram as a token filter, then a token that is already shorter than the `tokenLength` parameter will be kept as-is when indexed, and so a search for that short token will match that record.
 However, if that same short string only appears as a part of a larger token, then it will not match that record.
-In general, therefore, you should try to ensure that the string you search for is at least as long as the `tokenLength` of the index, except in the specific case where you know that there are shorter tokens to match, _and_ you are explicitly OK with not returning records that have that short string as part of a larger token.
+Try to ensure that the string you search for is at least as long as the `tokenLength` of the index, except in the specific case where you know that there are shorter tokens to match, _and_ you are explicitly OK with not returning records that have that short string as part of a larger token.

 #### Options for ste_vec indexes (`opts`)

@@ -101,13 +101,13 @@ An ste_vec index on a encrypted JSONB column enables the use of PostgreSQL's `@>

 An ste_vec index requires one piece of configuration: the `context` (a string) which is passed as an info string to a MAC (Message Authenticated Code).
 This ensures that all of the encrypted values are unique to that context.
-It is generally recommended to use the table and column name as a the context (e.g. `users/name`).
+We recommend that you use the table and column name as the context (e.g. `users/name`).

-Within a dataset, encrypted columns indexed using an `ste_vec` that use different contexts cannot be compared.
+Within a dataset, encrypted columns indexed using an `ste_vec` that use different contexts can't be compared.
 Containment queries that manage to mix index terms from multiple columns will never return a positive result.
 This is by design.

-The index is generated from a JSONB document by first flattening the structure of the document such that a hash can be generated for each unique path prefix to a node.
+The index is generated from a JSONB document by first flattening the structure of the document so that a hash can be generated for each unique path prefix to a node.
 The complete set of JSON types is supported by the indexer.
 Null values are ignored by the indexer.

@@ -182,7 +182,7 @@ The hashes would be generated for all prefixes of the full path to the leaf node

 Query terms are processed in the same manner as the input document.

-A query prior to encrypting & indexing looks like a structurally similar subset of the encrypted document, for example:
+A query prior to encrypting and indexing looks like a structurally similar subset of the encrypted document. For example:

 ```json
 {
@@ -238,4 +238,4 @@ SELECT cs_remove_index_v1(
   column_name text,
   index_name text
 );
-```
\ No newline at end of file
+```

From ebb480d72fe076ad67ff48ade201be5c8294cddd Mon Sep 17 00:00:00 2001
From: kateandrews
Date: Wed, 27 Nov 2024 17:27:31 +1100
Subject: [PATCH 4/5] JSON.md edits

---
 docs/reference/JSON.md | 63 ++++++++++++++++++++----------------------
 1 file changed, 30 insertions(+), 33 deletions(-)

diff --git a/docs/reference/JSON.md b/docs/reference/JSON.md
index 8ca10282..62aa50b0 100644
--- a/docs/reference/JSON.md
+++ b/docs/reference/JSON.md
@@ -2,20 +2,19 @@

 EQL supports encrypting, decrypting, and searching JSON and JSONB objects.

-## Table of contents
-
-- [Configuring the Index](#configuring-the-index)
-  - [Inserting JSON Data](#inserting-json-data)
-  - [Reading JSON Data](#reading-json-data)
-- [Querying JSONB Data with EQL](#querying-jsonb-data-with-eql)
-  - [Containment Queries (`cs_ste_vec_v1`)](#containment-queries-cs_ste_vec_v1)
-  - [Field Extraction (`cs_ste_vec_value_v1`)](#field-extraction-cs_ste_vec_value_v1)
-  - [Field Comparison (`cs_ste_vec_term_v1`)](#field-comparison-cs_ste_vec_term_v1)
-  - [Grouping Data](#grouping-data)
-- [Reference](#reference)
-  - [EQL Functions for JSONB and `ste_vec`](#eql-functions-for-jsonb-and-ste_vec)
-  - [EJSON Paths](#ejson-paths)
-- [Native PostgreSQL JSON(B) Compared to EQL](#native-postgresql-jsonb-compared-to-eql)
+## On this page
+
+- [Configuring the index](#configuring-the-index)
+  - [Inserting JSON data](#inserting-json-data)
+  - [Reading JSON data](#reading-json-data)
+- [Querying JSONB data with EQL](#querying-jsonb-data-with-eql)
+  - [Containment queries (`cs_ste_vec_v1`)](#containment-queries-cs_ste_vec_v1)
+  - [Field extraction (`cs_ste_vec_value_v1`)](#field-extraction-cs_ste_vec_value_v1)
+  - [Field comparison (`cs_ste_vec_term_v1`)](#field-comparison-cs_ste_vec_term_v1)
+  - [Grouping data](#grouping-data)
+- [EQL functions for JSONB and `ste_vec`](#eql-functions-for-jsonb-and-ste_vec)
+- [EJSON paths](#ejson-paths)
+- [Native PostgreSQL JSON(B) compared to EQL](#native-postgresql-jsonb-compared-to-eql)
   - [`json ->> text` → `text` and `json -> text` → `jsonb`/`json`](#json--text--text-and-json---text--jsonbjson)
     - [Decryption Example](#decryption-example)
     - [Comparison Example](#comparison-example)
@@ -116,15 +115,15 @@ Data is returned as:
 }
 ```

-## Querying JSONB Data with EQL
+## Querying JSONB data with EQL

 EQL provides specialized functions to interact with encrypted JSONB data, supporting operations like containment queries, field extraction, and comparisons.

-### Containment Queries (`cs_ste_vec_v1`)
+### Containment queries (`cs_ste_vec_v1`)

 Retrieve the Structured Encryption Vector for JSONB containment queries.

-**Example: Containment Query**
+**Example: Containment query**

 Suppose we have the following encrypted JSONB data:

@@ -138,7 +137,7 @@ Suppose we have the following encrypted JSONB data:

 We can query records that contain a specific structure.

-**SQL Query:**
+**SQL query:**

 ```sql
 SELECT * FROM examples
@@ -162,11 +161,11 @@ WHERE jsonb_column @> '{"top":{"nested":["a"]}}';

 **Note:** The `@>` operator checks if the left JSONB value contains the right JSONB value.

-**Negative Example:**
+**Negative example:**

 If we query for a value that does not exist in the data:

-**SQL Query:**
+**SQL query:**

 ```sql
 SELECT * FROM examples
@@ -183,7 +182,7 @@ WHERE cs_ste_vec_v1(encrypted_json) @> cs_ste_vec_v1(

 This query would return no results, as the value `"d"` is not present in the `"nested"` array.

-### Field Extraction (`cs_ste_vec_value_v1`)
+### Field extraction (`cs_ste_vec_value_v1`)

 Extract a field from an encrypted JSONB object.

@@ -201,7 +200,7 @@ Suppose we have the following encrypted JSONB data:

 We can extract the value of the `"top"` key.

-**SQL Query:**
+**SQL query:**

 ```sql
 SELECT cs_ste_vec_value_v1(encrypted_json,
@@ -231,7 +230,7 @@ FROM examples;
 }
 ```

-### Field Comparison (`cs_ste_vec_term_v1`)
+### Field comparison (`cs_ste_vec_term_v1`)

 Select rows based on a field value in an encrypted JSONB object.

@@ -247,7 +246,7 @@ Suppose we have encrypted JSONB data with a numeric field:

 We can query records where the `"num"` field is greater than `2`.

-**SQL Query:**
+**SQL query:**

 ```sql
 SELECT * FROM examples
@@ -277,7 +276,7 @@ SELECT * FROM examples
 WHERE (jsonb_column->>'num')::int > 2;
 ```

-### Grouping Data
+### Grouping data

 Use `cs_ste_vec_term_v1` along with `cs_grouped_value_v1` to group by a field in an encrypted JSONB column.

@@ -296,7 +295,7 @@ Suppose we have records with a `"color"` field:

 We can group the data by the `"color"` field and count occurrences.

-**SQL Query:**
+**SQL query:**

 ```sql
 SELECT cs_grouped_value_v1(cs_ste_vec_value_v1(encrypted_json,
@@ -336,16 +335,14 @@ GROUP BY jsonb_column->>'color';
 | green | 2 |
 | red | 1 |

-## Reference
+## EQL Functions for JSONB and `ste_vec`

-### EQL Functions for JSONB and `ste_vec`
-
-- **Index Management**
+- **Index management**

   - `cs_add_index_v1(table_name text, column_name text, 'ste_vec', 'jsonb', opts jsonb)`: Adds an `ste_vec` index configuration.
     - `opts` must include the `"context"` key.

-- **Query Functions**
+- **Query functions**

   - `cs_ste_vec_v1(val jsonb)`: Retrieves the STE vector for JSONB containment queries.
   - `cs_ste_vec_term_v1(val jsonb, epath jsonb)`: Retrieves the encrypted term associated with an encrypted JSON path.
@@ -353,7 +350,7 @@ GROUP BY jsonb_column->>'color';
   - `cs_ste_vec_terms_v1(val jsonb, epath jsonb)`: Retrieves an array of encrypted terms for elements in an array at the given JSON path (used for comparisons).
   - `cs_grouped_value_v1(val jsonb)`: Used with `ste_vec` indexes for grouping.

-### EJSON Paths
+## EJSON paths

 EQL uses an extended JSONPath syntax called EJSONPath for specifying paths in JSONB data.

@@ -363,7 +360,7 @@ EQL uses an extended JSONPath syntax called EJSONPath for specifying paths in JS
 - Wildcards are supported: `$.some_array_field[*]`
 - Array indexing is **not** supported: `$.some_array_field[0]`

-**Example Paths:**
+**Example paths:**

 - `$.top.nested` selects the `"nested"` key within the `"top"` object.
 - `$.array[*]` selects all elements in the `"array"` array.

From 2f86845901f7cc687ed3c20b842495d8adcd52bd Mon Sep 17 00:00:00 2001
From: kateandrews
Date: Wed, 27 Nov 2024 17:29:39 +1100
Subject: [PATCH 5/5] MIGRATOR.md edits

---
 docs/reference/MIGRATOR.md | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/docs/reference/MIGRATOR.md b/docs/reference/MIGRATOR.md
index f5f06672..2abb12b1 100644
--- a/docs/reference/MIGRATOR.md
+++ b/docs/reference/MIGRATOR.md
@@ -1,8 +1,8 @@
 # CipherStash Migrator

-The CipherStash Migrator is a tool that can be used to migrate plaintext data in a database to its encrypted equivalent.
+CipherStash Migrator is a tool that can be used to migrate plaintext data in a database to its encrypted equivalent.
 It works inside the CipherStash Proxy Docker container and can handle different data types such as text, JSONB, integers, booleans, floats, and dates.
-By specifying the relevant columns in your table, the migrator will seamlessly encrypt the existing data and store it in designated encrypted columns.
+By specifying the relevant columns in your table, CipherStash Migrator will seamlessly encrypt the existing data and store it in designated encrypted columns.

 ## Prerequisites

@@ -10,8 +10,6 @@ By specifying the relevant columns in your table, the migrator will seamlessly e
 - [Have set up EQL in your database](GETTINGSTARTED.md)
 - Ensure that the columns where data will be migrated already exist.

-Here’s a draft for the technical usage documentation for the CipherStash Migrator tool:
-
 ## Usage

 The CipherStash Migrator allows you to specify key-value pairs where the key is the plaintext column, and the value is the corresponding encrypted column.