Skip to content

Filter event reporter payloads #38

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

zzak
Copy link

@zzak zzak commented Jul 20, 2025

This is another option for filtering sensitive payload data.

We filter the data early when resolving the payload or kwargs, but doesn't handle custom objects (which are not yet hashes).

/cc @adrianna-chang-shopify

p8 and others added 5 commits January 22, 2025 14:59
The ConsoleLexer allows marking lines starting with `#` as comments
instead of a prompt (which is the default). The improves the formatting
of bash examples.
https://www.rubydoc.info/gems/rouge/Rouge/Lexers/ConsoleLexer
The CSP hash-source is not enclosed in ' and is ignored by the browser.

According to the official W3C definition of a hash-source, it must be enclosed in '.

> hash-source = "'" hash-algorithm "-" base64-value "'"
> hash-algorithm = "sha256" / "sha384" / "sha512"
> https://www.w3.org/TR/CSP3/#grammardef-hash-source
On mobile devices, when reading the Rails Guides I noticed horizontal
scroll. This can be fixed with a few small CSS tweaks around absolute
width in .code, white-space: normal, overflow: scroll
and white-space: pre respectively.

Co-authored-by: Juan Vasquez <[email protected]>
Capybara supports a range of global [selectors][]:

> All Selectors below support the listed selector specific filters in
> addition to the following system-wide filters

> * `:id` (String, Regexp, XPath::Expression) - Matches the id attribute
> * `:class` (String, Array, Regexp, XPath::Expression) - Matches the class(es) provided
> * `:style` (String, Regexp, Hash) - Match on elements style
> * `:above` (Element) - Match elements above the passed element on the page
> * `:below` (Element) - Match elements below the passed element on the page
> * `:left_of` (Element) - Match elements left of the passed element on the page
> * `:right_of` (Element) - Match elements right of the passed element on the page
> * `:near` (Element) - Match elements near (within 50px) the passed element on the page
> * `:focused` (Boolean) - Match elements with focus (requires driver support)

While the Action Text-provided `#fill_in_rich_text_area` System Test
helper is capable of resolving `<trix-editor>` elements (through the
`:rich_textarea` custom Capybara selector), it doesn't support
additional options (except for `:with`).

This commit forwards any available options to the underlying call to
[find][] as filters.

This enables combining `<label>`-based locator text alongside global
filters like `:focused` or `:id`:

```ruby
fill_in_rich_text_area "Label text with [id]", id: "trix_editor_1", with: "Hello"
fill_in_rich_text_area "Label text matching :focus", focused: true, with: "Hello"
fill_in_rich_text_area "Label text not matching :focus", focused: false, with: "Hello"
```

[selectors]: https://rubydoc.info/github/teamcapybara/capybara/master/Capybara/Selector
[find]: https://rubydoc.info/github/teamcapybara/capybara/master/Capybara/Node/Finders:find
Copy link

@adrianna-chang-shopify adrianna-chang-shopify left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on these, Zak!

I think I'm leaning towards this one, where we filter at payload resolution time. This way, subscribers consistently receive filtered data vs relying on the use of a default encoder to handle redacting data. This seems to align a bit more closely with how e.g. Rails filters request parameters -- we filter as soon as the request is processed, and the controllers, log subscribers, instrumentation, etc. all receive the filtered params.

I do wonder if we could combine both approaches 🤔 Filter at payload resolution time for hash-based payloads, and then again at encode time for objects:

module Encoders
  class Base
    def self.transform_event(event)
      unless event[:payload].is_a?(Hash) # already filtered
        parameter_filter = ActiveSupport::ParameterFilter.new(
          ActiveSupport.filter_parameters,
          mask: ActiveSupport::ParameterFilter::FILTERED
        )
        event[:payload] = parameter_filter.filter(event[:payload].to_h)
      end

      event[:tags] = event[:tags].transform_values do |value|
        value.respond_to?(:to_h) ? value.to_h : value
      end
      event
    end
  end

Do you think we should also filter tags?

@adrianna-chang-shopify adrianna-chang-shopify force-pushed the ac-structured-events branch 2 times, most recently from a320b9f to a266aef Compare July 22, 2025 15:06
This automates sigstore signing & uploading of attestations with gem push

This is identical to the behavior added by the rubygems/release-gem official release action

--attestations is supported by rubygems >= 3.6.0

Signed-off-by: Samuel Giddins <[email protected]>
@zzak
Copy link
Author

zzak commented Jul 24, 2025

I think filtering the data earlier is better for the reasons you mentioned, but for objects there is a caveat that they don't get filtered until later, or you can filter them yourself... once you stop using one default encoders you're on your own.

Do you think we should also filter tags?

Yeah I thought about that, but they seemed more intentional and not relying on user data. I could see a case for adding it just to be sure though. We typically use tags to make searching for events easier, so like org / team / project / etc, while the payload contains request data, user/client data, etc.

@zzak zzak force-pushed the event-reporter-filter-params-payload branch from 19a763e to 8d08426 Compare July 26, 2025 06:50
@zzak
Copy link
Author

zzak commented Jul 26, 2025

I'm working on combining these two PRs for filtering params, just don't want to hold you up 🙇

@adrianna-chang-shopify
Copy link

but for objects there is a caveat that they don't get filtered until later, or you can filter them yourself... once you stop using one default encoders you're on your own.

I think this is a super reasonable stance to take, given event objects can essentially be anything (and if users aren't using the default encoders, there's no guarantee we can even to_h them).

Yeah I thought about that, but they seemed more intentional and not relying on user data.

Okay cool, the usages of tags we've seen so far are similar (e.g. tagging that an event came from a particular area of the codebase). Filtering tag objects might prove to be a bit of a pain -- if we want to leave this off for now, I'm fine with that!

Absolutely no rush, I have a good chunk of feedback to work through on the main PR still!

@adrianna-chang-shopify adrianna-chang-shopify force-pushed the ac-structured-events branch 3 times, most recently from ebd42e3 to 2e8ae57 Compare August 5, 2025 18:32
@adrianna-chang-shopify adrianna-chang-shopify force-pushed the ac-structured-events branch 3 times, most recently from 947cc00 to 2731985 Compare August 12, 2025 13:48
lazylester and others added 16 commits August 12, 2025 15:02
* Update action_view_overview.md

Clarifies that a single counter variable is available when instance of different ActiveRecord subclasses are included in the collection. This is the behaviour specified in the test in rails/actionpack/test/controller/render.rb called test_partial_collection_shorthand_with_different_types_of_records

* Update guides/source/action_view_overview.md

---------

Co-authored-by: Matheus Richard <[email protected]>
Co-authored-by: Rafael Mendonça França <[email protected]>
Enable comments formatting on bash examples in guides
…-error-pages

Add Copy as Text button to error pages
…ags-in-scaffold

scaffold: generate valid html when the model has_many_attached
Support hash-source in Content Security Policy
Prevent double scrollbar issue on mailer previews
I wrote at length on the impact of using instance variables in an
Array subclass (or some other core types) in
https://byroot.github.io/ruby/performance/2025/08/11/unlocking-ractors-generic-variables.html

Since uncountables are accessed by some low-level hotspots like `pluralize`,
it's beneficial not to rely on the generic ivar table.

the difference isn't huge, because `pluralize` is quite a costly
operation, but having less generic ivars has non-local performance
benefits.

NB: don't read this as pluralize becoming 9% faster.
That's not what this benchmark shows.

```
ruby 3.4.2 (2025-02-15 revision d2930f8e7a) +YJIT +PRISM [arm64-darwin24]
Warming up --------------------------------------
               after   134.737k i/100ms
Calculating -------------------------------------
               after      1.394M (± 0.3%) i/s  (717.60 ns/i) -      7.006M in   5.027775s

Comparison:
              before:  1282919.6 i/s
               after:  1393537.2 i/s - 1.09x  faster
```

```ruby
require "bundler/inline"

gemfile do
  gem "rails", path: "."
  gem "benchmark-ips"
end

class GenIvar < Array
  def initialize
    @ivar = 1
  end
end

gens = 500_000.times.map { GenIvar.new }

ActiveSupport::Inflector.inflections.uncountable << "test"

STAGE = ENV["STAGE"]

Benchmark.ips do |x|
  x.report(STAGE || "pluralize") { ActiveSupport::Inflector.pluralize("test") }
  x.compare!(order: :baseline)
  if STAGE
    x.save!("/tmp/bench-pluralize")
  end
end
```
It's much faster to match a single big regexp built with `Regexp.union`
than to match many smaller regexps with `.any?`.

Before:

```
ruby 3.4.2 (2025-02-15 revision d2930f8e7a) +YJIT +PRISM [arm64-darwin24]
Calculating -------------------------------------
             regular    275.598k (± 0.8%) i/s    (3.63 μs/i) -      1.401M in   5.082747s
           irregular      1.319M (± 0.9%) i/s  (758.29 ns/i) -      6.658M in   5.049382s
         uncountable      2.428M (± 0.8%) i/s  (411.87 ns/i) -     12.315M in   5.072615s
```

After:

```
ruby 3.4.2 (2025-02-15 revision d2930f8e7a) +YJIT +PRISM [arm64-darwin24]
Calculating -------------------------------------
             regular    255.878k (± 1.0%) i/s    (3.91 μs/i) -      1.302M in   5.088179s
           irregular    883.098k (± 0.7%) i/s    (1.13 μs/i) -      4.438M in   5.025973s
         uncountable      1.523M (± 1.1%) i/s  (656.60 ns/i) -      7.676M in   5.040342s
```

Benchmark:

```ruby
require "bundler/inline"

gemfile do
  gem "rails", path: "."
  gem "benchmark-ips"
end

Benchmark.ips do |x|
  x.report("regular") { ActiveSupport::Inflector.pluralize("test") }
  x.report("irregular") { ActiveSupport::Inflector.pluralize("zombie") }
  x.report("uncountable") { ActiveSupport::Inflector.pluralize("sheep") }
end
```
The `en` inflector is almost always the one used for itnernal
framework inflections, hence it's called much more than any other.

Hence it's worth having a fast path for it.

Before:

```
ruby 3.4.2 (2025-02-15 revision d2930f8e7a) +YJIT +PRISM [arm64-darwin24]
Calculating -------------------------------------
             regular    278.450k (± 1.1%) i/s    (3.59 μs/i) -      1.401M in   5.030803s
           irregular      1.320M (± 0.9%) i/s  (757.36 ns/i) -      6.617M in   5.012038s
         uncountable      2.417M (± 0.5%) i/s  (413.72 ns/i) -     12.172M in   5.035810s
```

After:

```
ruby 3.4.2 (2025-02-15 revision d2930f8e7a) +YJIT +PRISM [arm64-darwin24]
Calculating -------------------------------------
             regular    297.666k (± 1.3%) i/s    (3.36 μs/i) -      1.509M in   5.068638s
           irregular      1.975M (± 0.6%) i/s  (506.28 ns/i) -      9.939M in   5.031926s
         uncountable      6.078M (± 0.5%) i/s  (164.54 ns/i) -     30.504M in   5.019120s
```

Benchmark:

```
require "bundler/inline"

gemfile do
  gem "rails", path: "."
  gem "benchmark-ips"
end

Benchmark.ips do |x|
  x.report("regular") { ActiveSupport::Inflector.pluralize("test") }
  x.report("irregular") { ActiveSupport::Inflector.pluralize("zombie") }
  x.report("uncountable") { ActiveSupport::Inflector.pluralize("sheep") }
end
```
This prevents some unexpected line wrapping.
Fix Horizontal Scroll on Mobile [ci skip]
…stom-methods-clean

docs: document alias_attribute behavior change in Rails 7.2 upgrading guide
…t-helper-options

Forward `fill_in_rich_text_area` options to Capybara
rafaelfranca and others added 8 commits August 13, 2025 14:33
[Fixes rails#49745 rails#52876]
[Related rails#51883 rails#49771]

This commit refactors the implementation of dispatching method calls
to broadcasted loggers. The updated implementation ensures that any
block is only executed once. The first logger would recieve the block
as per normal, but subsequent loggers would only yield the result of
the initial execution.

The updated implementation of `dispatch` opened up an opportunity to
refactor the way each Logger method is delegatated to broadcasted
loggers - simplifying the delegator definitions.

Prior to these changes, BroadcastLoggers would iterate each broadcast
and re-execute the user provided block for each.

The consumer of any Logger would reasonably expect than when calling a
method with a block, that the block would only execute a single time.
That is, the fact that a Logger is a BroadcastLogger should be
irrelevant to consumer.

The most common example of this is when using
ActiveSupport::TaggedLogging and wrapping behaviour in a
`tagged(*tags) { }` block. But this also applies when using the block
form `info`, `warn` etc. If a BroadcastLogger is used, and there are
multiple loggers being broadcast to, then calling one of these methods
with a block would result in the block being executed multiple times.

For example:

```ruby
  broadcasts = ActiveSupport::BroadcastLogger.new(
    *Array.new(2) { ActiveSupport::Logger.new(STDOUT) }
  )
  number = 0
  broadcasts.info {
    number += 1
    "Updated number to #{number}"
  }
  # Expected:
  # Updated number to 1
  # Updated number to 1
  #
  # Actual:
  # Updated number to 1
  # Updated number to 2
```

After these changes, the behaviour of BroadcastLogger reflects the
expected behaviour above.
…with-attestations-from-github-actions

Release gems with attestations from GitHub Actions
…block

Fix ActiveSupport::BroadcastLogger from executing a block argument for each logger (tagged, info, etc.)
AS::TimeZone exposes MAPPING as way to normalize TZ name
But using it is pretty inconvinient

    zone = ActiveSupport::TimeZone['Hawaii']
    # Old way
    ActiveSupport::TimeZone::MAPPING[zone.name]
    # New way
    zone.standard_name
[CVE-2025-24293]

A subset of transformation methods included in the default allowed list
still present potential command injection risk to applications accepting
arbitrary user input for transformations or their parameters.

Doing so is unsupported behavior and should be considered dangerous.
@zzak zzak force-pushed the event-reporter-filter-params-payload branch from 8d08426 to 19a763e Compare August 14, 2025 07:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.