Skip to content

Conversation

kamilogorek
Copy link
Contributor

Resolves #296

It was quite requested feature by some of the clients recently, so we may get it in, in one form or another. I'm open for the feedback.

@kamilogorek kamilogorek requested a review from a team March 19, 2018 12:56
@MaxBittker
Copy link
Contributor

MaxBittker commented Mar 19, 2018

On a roll Kamil! haven't reviewed the implementation in depth yet, but I think it's worth discussing the API/behavior/guarantees/terminology of this feature, because it's really important and this client could become the gold standard for our other clients going forward.

This is not to say that we need to get this perfect in this PR, just that it would be good to identify our goals for the best possible API here so this can work towards that in a backwards-compatible way.

Afaik (and I could be very wrong, @bretthoerner @HazAT ) the only clients which support this right now are python and ruby.

Python

docs:
https://docs.sentry.io/clients/python/advanced/#sanitizing-data
implementation:
https://github.com/getsentry/raven-python/blob/master/raven/processors.py

  • sanitizes a fixed list of keys (AFAIK, not configurable by the user without making a new instance of the processor and passing that in)
  • filters extra data and http data including query params

Ruby:

docs:
https://docs.sentry.io/clients/ruby/config/#configuration
implementation:
https://github.com/getsentry/raven-ruby/blob/master/lib/raven/processor/sanitizedata.rb

  • has some default patterns including password and credit card-like strings.
  • also accepts a few configs including regexes and predicates I think?
  • filters query strings & json blob

@dcramer
Copy link
Member

dcramer commented Mar 19, 2018

I'd prefer something simple like sanitizeKeys over the general processors pattern we implemented elsewhere. I think also supporting regexp is good.

@kamilogorek
Copy link
Contributor Author

kamilogorek commented Mar 19, 2018

The main issue I have with RegExp is that it can help people easily shoot themselves in the foot, as this sanitizer goes through all keys of the payload sent to Sentry. Including exception, message, frames, etc. etc.

@MaxBittker I added defaults at first, but then thought that there may be scenarios where people actually want to let them through, and changing this behavior later, would mean a breaking API change.

On the other hand, we already have some default values in the Sentry, so even if a user sends them through, they'll get sanitized. This means that adding them as a default value and merging passed config keys in the SDK as well shouldn't do any harm.

@ndmanvar
Copy link
Contributor

The main issue I have with RegExp is that it can help people easily shoot themselves in the foot,
as this sanitizer goes through all keys of the payload sent to Sentry. Including exception, message, frames, etc. etc.

I can understand why this can be an example of "shooting yourself in the foot ", but many companies/organization want to scrub PII everywhere (frames, tags, extra information, error title/msg, etc.). Maybe a config that specifies to look through everything / etc.??

Copy link
Contributor

@MaxBittker MaxBittker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks good to me, but could you help me feel confident that cyclic data will be handled safely?

another good follow up would be sanitizing http data

src/utils.js Outdated
}

function sanitize(input, sanitizeKeys) {
sanitizeKeys = isArray(sanitizeKeys) ? sanitizeKeys : [];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if it's not an array, maybe just bail out here and return array? to reduce the risk involved in shipping this code?


function sanitize(input, sanitizeKeys) {
sanitizeKeys = isArray(sanitizeKeys) ? sanitizeKeys : [];
var sanitizeMask = '********';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i know we use asterisks elsewhere, but i think something explicit like [value sanitized by raven.js]could be better.

probably not worth breaking with the precedent though

return '';
}

function sanitize(input, sanitizeKeys) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how does this deal with cycles? or will the data be guaranteed by this point not to have any? how strong is that guarantee? (afraid of freezing a browser in some weird scenario)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It didn't. Thanks for catching that!

docs/config.rst Outdated
.. describe:: sanitizeKeys

An array of strings representing keys that should be scrubbed from the payload sent to Sentry.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(worth being explicit here that this doesn't scrub or match on string values or query encoded data)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also worth mentioning that sentry itself can do do server side sanitizing and this is different

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

basically i think people are really gonna want to know what guarantees they get from this feature so it's worth writing a longer explanation

@kamilogorek
Copy link
Contributor Author

@dcramer added support for RegExp.
@MaxBittker handled cyclic references, updated docs and changed behavior to bail quickly with the input if incorrect keys type is used.

@kamilogorek kamilogorek merged commit cf87968 into master Mar 23, 2018
@kamilogorek kamilogorek deleted the sanitize branch March 23, 2018 09:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants