-
Notifications
You must be signed in to change notification settings - Fork 0
Update docs and JSON schema with "query-only" encryption changes #18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
5afdfbd to
b67e940
Compare
| "title": "kind", | ||
| "type": "string", | ||
| "enum": ["pt", "ct", "en"] | ||
| "enum": ["pt", "ct", "en", "qm", "qo", "qu", "qejp", "qsv"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When "for query" is set on the plaintext payload, Proxy will encrypt the payload as one of the new variants here:
qm= query matchqo= query OREqu= query uniqueqejp= query EJSON pathqsv= query STE vec
Proxy will only generate the index needed for the specific query and set the relevant kind/index field in the payload.
For example, if the client sends through a plaintext payload with "for query" set to "match":
{
"v": 1,
"k": "pt",
"p": "abc",
"i": {
"t": "users",
"c": "email_encrypted"
},
"q": "match"
}Proxy would encrypt that as the match query variant:
{
"v": 1,
"k": "qm",
"m": [1, 2, 3]
}
Note that the "m" key in the match query payload lines up with the same key used for the match index that's stored in the column. This allows the EQL SQL funcs to look up the index field using the same key (whether the value comes from the column or from a param):
-- $1 here should be encrypted for a match query
SELECT * FROM users WHERE cs_match_v1(email_encrypted) @> cs_match_v1($1)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are the new kinds required?
The postgres function shouldn't need the kind as long as the actual field is encoded correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They aren't strictly required (we could re-use the existing "ct"/ciphertext variant), but I'd prefer to model this as new variants for a few reasons.
On the Proxy/Rust side:
- The source ciphertext (
"c"field) in the existing"ct"/ciphertext variant would have to become optional (since queries wouldn't use it) - All of the indexes are optional in the existing
"ct"/ciphertext variant, which is bit awkward to work with when encrypting for a query because behaviour-wise only one (required) index field should be set in the payload depending on the operation - Basically, specific variants for each makes it harder for us to construct something incorrectly and means that we can lean more on the compiler to catch problems
On the PG side:
- If PG attempts to store a
q*variant, something has gone wrong. It's convenient to be able to reject anything that isn't thectvariant - It's nice for debugging to be able to be able to identify with specific variant was chosen
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think having variants for every option adds complexity.
We could have a single "query" variant to disambiguate from ct and pt.
Coupled with having the "q" value be the identifier, we can then validate.
"knd": "qt",
"qry": "u",
"u": "etc",
if knd == qt
qry is required and qry.value (u) should exist as a field.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that would be an OK middle ground if we're concerned about the number of variants.
Conceptually there's a variant for each and I rather make that explicit than have to model it as a single type that requires extra validation logic to handle what the compiler could catch for us.
README.md
Outdated
| Note that plaintext payloads for query operations should set the `"q"` (for query) property. | ||
| This property tells CipherStash Proxy to only perform encryption necessary for a specific operation. | ||
| Otherwise, Proxy will perform source encryption and encryption for all indexes configured for the given column. | ||
|
|
||
| For reference, the EQL payload is defined as a `jsonb` with a specific schema: | ||
|
|
||
| ```json | ||
| { | ||
| "v": 1, | ||
| "k": "pt", | ||
| "p": "[email protected]", | ||
| "i": { | ||
| "t": "users", | ||
| "c": "email_encrypted" | ||
| }, | ||
| "q": "match" | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice. I like that.
| ```rb | ||
| # Create the EQL payload using helper functions | ||
| payload = eqlPayload("users", "encrpyted_field", "plaintext value") | ||
| payload = EQL.for_match("users", "encrypted_field", "plaintext value") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are probably more idiomatic ways to do this in Ruby but not something we need to worry about now. Just sharing thoughts while they're fresh.
You could do:
EQL.table("users").column("encrypted_field").match("plaintext value")If you made EQL.table("users") a class method:
class User
def self.eql
EQL.table("users")
end
endAnd used method_missing to handle the column name, you could do something like:
User.eql.encrypted_field = "foo"
User.eql.encrypted_field > 10
# etcJust ideas for later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very fair. At the moment I'm mostly aiming for examples that get the point across and would be easy to port to other languages. Agreed that it'll make sense to implement a more idiomatic/ergonomic interface for Ruby at some point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Worth just writting all examples in psuedo code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possibly! Or maybe we can go with one of the languages included in this repo (Go or JS).
coderdan
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice job! I like the approach. Left some musings on Ruby DSL improvements but not needed for the moment.
|
I would make the |
I think this might not work for use cases like |
|
@CDThomas Do you have an example of what that means? |
For example, the search In our implementation, the search |
| **Example:** | ||
|
|
||
| ```rb | ||
| Users.findAll(&:encrypted_email) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@CDThomas I was trying to show a basic SELECT statement here withour any WHERE clauses and highlighting that responses are simply decrypted. I like what you've done here but maybe for the cs_match_v1 example?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense @calvinbrewer. I reverted the changes in this section (and already updated the match example).
This change updates the EQL query docs to use the new field `"q"` (for query). This field tells proxy to perform encryption for a specific query operation (instead of performing source encryption and encryption for all indexes). This change also updates the JSON schema: - add new "for query" field - add missing STE vec index field - remove ".1" suffix from m, u, and o fields
This change adds variants for the encrypted payloads for queries to the EQL JSON schema. These variants are used internally by proxy for query-only encryption (for filters in `WHERE` clauses, for example).
Update the match filter in the select diagram to reflect actual usage.
These aren't actually needed since STE term plaintext queries will be encrypted as the same payload as ORE queries and websearch_to_match queries will be the same payload as match queries. Also add missing query variant for EJSON path queries.
bd33ebf to
6122f41
Compare
|
Merging this one since the docs here reflect what we've already started to implement on the Proxy side, but we can adjust the docs if we end up simplifying more. |
…tion Update docs and JSON schema with "query-only" encryption changes
This change updates the EQL query docs to use the new field
"q"(for query). This field tells proxy to perform encryption for a specific query operation (instead of performing source encryption and encryption for all indexes).This change also updates the JSON schema:
WHEREclauses, for example).