Skip to content

Default values for entries #89

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 28 commits into from
Closed

Conversation

tleish
Copy link
Contributor

@tleish tleish commented Jun 11, 2022

fixes #66

Default values can be static or Proc.

Types with default:

  • list
  • uniq_list
  • string
  • integer
  • decimal
  • datetime
  • float
  • set
  • proxy
  • json
  • counter
  • string
  • hash
  • boolean

Types without default:

  • cycle
  • enum
  • flag
  • slots

TODO:
Looking for feedback. After getting feedback I can update the README to include examples of using default

@tleish tleish force-pushed the add-default-values branch from 40095b0 to 10ceac9 Compare June 20, 2022 16:03
@@ -4,7 +4,8 @@ class Kredis::Types::Set < Kredis::Types::Proxying
attr_accessor :typed

def members
strings_to_types(smembers || [], typed).sort
value = exists? ? smembers : initialize_with_default || []
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This adds an additional redis command call on every call to #members. Why wouldn't smembers || initialize_with_default || [] work?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my testing, when a set has not been created, redis returns the following:

  • #exists? => false
  • #smembers => [] # empty array

I assumed we should not use just an empty array to decide if the code should apply the default value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to this answer on StackOverflow,

When a set is empty, it is removed from the namespace.

see: https://stackoverflow.com/questions/13817865/redis-nil-or-empty-list-or-set#13820112

So, maybe we can assume with:

strings_to_types(smembers.presence || initialize_with_default || [], typed).sort

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That looks right to me 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question might go for list?

  def elements
-   value = exists? ? lrange(0, -1) : initialize_with_default || []
+   value = lrange(0, -1).presence || initialize_with_default || []
    strings_to_types(value, typed)
  end

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe @jeremy has an idea.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did a fairly significant refactor.

  • Used the same pattern throughout of #initialize_with_default defined in the base class. #initialize_with_default calls #set_default, which is overridden in many of the child classes.
  • Created a custom #callnx method which executes a redis lua script. The lua script calls a dynamic method with values if the redis key does not exist. (Note: the script uses redis#eval, so additional care was taken to validate formatting of the method call.) Using this custom redis method allows the code to run methods like hset and rpush only if key does not exist, and do this logic without adding redis command calls as it can be executed in a multi block
  • Updated the custom redis #multi method to allow nested multi blocks (when one method using a multi block calls another method which also uses a multi block)

@dhh @jeremy - let me know if this a direction you want to take.

Copy link
Contributor Author

@tleish tleish Jul 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, I tested performance on my local environment with the following benchmarks

require 'benchmark/ips'

proxy = Kredis.proxy 'mykey'
proxy.set 'myvalue'

Benchmark.ips(7) do |x|
  x.report("get only") do
    proxy = Kredis.proxy 'mykey'
    proxy.get
  end
  x.report("exists?") do
    proxy = Kredis.proxy 'mykey'
    proxy.set 'myvalue' unless proxy.exists?
    proxy.get
  end
  x.report("nx: true") do
    proxy = Kredis.proxy 'mykey'
    proxy.set 'myvalue', nx: true
    proxy.get
  end
  x.report("callnx") do
    proxy = Kredis.proxy 'mykey'
    proxy.callnx(:set, 'myvalue')
    proxy.get
  end
  x.report("multi > nx: true") do
    proxy = Kredis.proxy 'mykey'
    proxy.multi do
      proxy.set 'myvalue', nx: true
      proxy.get
    end
  end
  x.report("multi > callnx") do
    proxy = Kredis.proxy 'mykey'
    proxy.multi do
      proxy.callnx(:set, 'myvalue')
      proxy.get
    end
  end
  x.compare!
end

The results show:

Warming up --------------------------------------
            get only     1.064k i/100ms
             exists?   731.000  i/100ms
            nx: true   644.000  i/100ms
              callnx   621.000  i/100ms
    multi > nx: true   702.000  i/100ms
      multi > callnx   664.000  i/100ms
Calculating -------------------------------------
            get only     10.617k (± 6.7%) i/s -     74.480k in   7.053390s
             exists?      7.322k (± 2.9%) i/s -     51.901k in   7.094167s
            nx: true      7.030k (± 5.7%) i/s -     49.588k in   7.080403s
              callnx      6.810k (± 7.1%) i/s -     47.817k in   7.059580s
    multi > nx: true      6.891k (± 4.7%) i/s -     48.438k in   7.047004s
      multi > callnx      6.608k (± 4.2%) i/s -     46.480k in   7.046974s

Comparison:
            get only:    10617.3 i/s
             exists?:     7322.1 i/s - 1.45x  (± 0.00) slower
            nx: true:     7030.3 i/s - 1.51x  (± 0.00) slower
    multi > nx: true:     6891.2 i/s - 1.54x  (± 0.00) slower
              callnx:     6810.3 i/s - 1.56x  (± 0.00) slower
      multi > callnx:     6607.6 i/s - 1.61x  (± 0.00) slower

Observations:

  • As expected, that get only is the fastest
  • As expected, defining a default value adds cost in checking the existence and setting if not exist
  • Perhaps something is wrong with the test setup, but in these scenarios using multi adds additional overhead vs not using multi. I expected the opposite. So, I setup a 2nd test around the usage of multi.
require 'benchmark/ips'

Benchmark.ips(7) do |x|
  x.report("1 cmd") do
    proxy = Kredis.proxy 'mykey'
    proxy.get
  end
  x.report("2 cmds") do
    proxy = Kredis.proxy 'mykey'
    proxy.set 'myvalue'
    proxy.get
  end
  x.report("3 cmds") do
    proxy = Kredis.proxy 'mykey'
    proxy.set 'myvalue'
    proxy.set 'myvalue'
    proxy.get
  end
  x.report("multi > 1 cmd") do
    proxy = Kredis.proxy 'mykey'
    proxy.multi do
      proxy.get
    end
  end
  x.report("multi > 2 cmds") do
    proxy = Kredis.proxy 'mykey'
    proxy.multi do
      proxy.set 'myvalue'
      proxy.get
    end
  end
  x.report("multi > 3 cmds") do
    proxy = Kredis.proxy 'mykey'
    proxy.multi do
      proxy.set 'myvalue'
      proxy.set 'myvalue'
      proxy.get
    end
  end
  x.compare!
end
Warming up --------------------------------------
               1 cmd     1.059k i/100ms
              2 cmds   709.000  i/100ms
              3 cmds   552.000  i/100ms
       multi > 1 cmd   797.000  i/100ms
      multi > 2 cmds   730.000  i/100ms
      multi > 3 cmds   610.000  i/100ms
Calculating -------------------------------------
               1 cmd     10.290k (± 6.8%) i/s -     72.012k in   7.034624s
              2 cmds      7.160k (± 3.1%) i/s -     50.339k in   7.037131s
              3 cmds      5.289k (± 6.5%) i/s -     36.984k in   7.026510s
       multi > 1 cmd      7.937k (± 5.1%) i/s -     55.790k in   7.049686s
      multi > 2 cmds      6.976k (± 3.7%) i/s -     48.910k in   7.021257s
      multi > 3 cmds      6.317k (± 5.0%) i/s -     44.530k in   7.069582s

Comparison:
               1 cmd:    10289.5 i/s
       multi > 1 cmd:     7937.3 i/s - 1.30x  (± 0.00) slower
              2 cmds:     7160.2 i/s - 1.44x  (± 0.00) slower
      multi > 2 cmds:     6975.8 i/s - 1.48x  (± 0.00) slower
      multi > 3 cmds:     6316.9 i/s - 1.63x  (± 0.00) slower
              3 cmds:     5289.0 i/s - 1.95x  (± 0.00) slower

Observations:

  • Using multi adds overhead
  • Using multi is only faster when the code sends 3+ commands. The overhead of using multi for 1 or 2 commands is actually slower.
  • In the latest updated version there's lots of usage of multi (which I assumed would be faster). After seeing these benchmarks, it's going to slightly slow down the current performance. For example:
def value
-     value_after_casting = string_to_type(get, typed)

-     if value_after_casting.nil?
-       default
-     else
-       value_after_casting
-     end
+    get_value = multi do
+         set type_to_string(value, typed), ex: expires_in, nx: true if default_value.present?
+         get
+    end[-1]
+    string_to_type(get_value, typed)
end

In the above example, even if default is nil the code will run the get inside of multi, which goes from 10289.5 i/s to 7937.3 i/s.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ran the previous results in my local environment where both ruby and redis run on localhost. I then ran the same benchmarks on AWS where redis does not run on localhost. Here's the result.

AWS Network test

Warming up --------------------------------------
            get only   109.000  i/100ms
             exists?    61.000  i/100ms
            nx: true    62.000  i/100ms
              callnx    57.000  i/100ms
    multi > nx: true   105.000  i/100ms
      multi > callnx   103.000  i/100ms
Calculating -------------------------------------
            get only      1.255k (± 7.7%) i/s -      8.829k in   7.089845s
             exists?    606.765  (±10.2%) i/s -      4.209k in   7.049791s
            nx: true    614.566  (± 7.2%) i/s -      4.278k in   7.011848s
              callnx    604.887  (± 8.1%) i/s -      4.218k in   7.032741s
    multi > nx: true      1.071k (± 5.9%) i/s -      7.560k in   7.090091s
      multi > callnx      1.075k (± 4.4%) i/s -      7.622k in   7.101791s

Comparison:
            get only:     1255.3 i/s
      multi > callnx:     1075.5 i/s - 1.17x  (± 0.00) slower
    multi > nx: true:     1070.6 i/s - 1.17x  (± 0.00) slower
            nx: true:      614.6 i/s - 2.04x  (± 0.00) slower
             exists?:      606.8 i/s - 2.07x  (± 0.00) slower
              callnx:      604.9 i/s - 2.08x  (± 0.00) slower

AWS Network test multi

Warming up --------------------------------------
               1 cmd   132.000  i/100ms
              2 cmds    65.000  i/100ms
              3 cmds    44.000  i/100ms
       multi > 1 cmd   113.000  i/100ms
      multi > 2 cmds   115.000  i/100ms
      multi > 3 cmds   109.000  i/100ms
Calculating -------------------------------------
               1 cmd      1.296k (± 7.2%) i/s -      9.108k in   7.074883s
              2 cmds    632.662  (±11.7%) i/s -      4.355k in   7.013278s
              3 cmds    423.400  (± 8.0%) i/s -      2.948k in   7.020112s
       multi > 1 cmd      1.186k (± 7.1%) i/s -      8.249k in   7.005603s
      multi > 2 cmds      1.156k (± 3.2%) i/s -      8.165k in   7.072965s
      multi > 3 cmds      1.085k (± 4.4%) i/s -      7.630k in   7.044975s

Comparison:
               1 cmd:     1296.3 i/s
       multi > 1 cmd:     1186.0 i/s - same-ish: difference falls within error
      multi > 2 cmds:     1155.6 i/s - 1.12x  (± 0.00) slower
      multi > 3 cmds:     1085.4 i/s - 1.19x  (± 0.00) slower
              2 cmds:      632.7 i/s - 2.05x  (± 0.00) slower
              3 cmds:      423.4 i/s - 3.06x  (± 0.00) slower

Observations:

In an environment where Redis NOT is hosted on the same environment

  • multi > callnx and multi > nx: true have the same performance, and
  • multi is faster for 2+ commands.
  • My original implementation is the best option, but I want to look and consider NOT using multi for cases when default is not defined and only 1 call will be made.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have updated the code to be more efficient in that it only uses multi in the event that a default value has been defined, otherwise it reverts back to the prior behavior.

At this point, the code is ready for another review.

proxying :hget, :hset, :hmget, :hdel, :hgetall, :hkeys, :hvals, :del, :exists?
ZERO_FIELDS_ADDED = 0

proxying :hset, :hdel, :hgetall, :del, :exists?, :multi, :callnx
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed some of these proxying methods because they were no longer used in the class. Do they need to be kept because of the public api others may have tied into?

@lewispb
Copy link
Contributor

lewispb commented Jul 8, 2023

I wonder if we could simplify this significantly.

Rather than check (and set) a default value on type's read and write operations, could we initialize the default value once upon type instance initialization?

Here's what I mean:

class Person
  include Kredis::Attributes

  kredis_set :names, default: [ "Lewis", "Jeremy" ]

  def id = 1
end

Setting default value on each operation (this PR)

>> person = Person.new
>> person.names

>> person.names.add "David" # every operation must be aware of defaults
  Kredis Proxy (0.0ms)  CALLNX people:1:names ["sadd", "Lewis", "Jeremy"]
  Kredis Proxy (0.0ms)  SADD people:1:names ["David"]
=> 1

>> person.names.members
  Kredis Proxy (0.0ms)  CALLNX people:1:names ["sadd", "Lewis", "Jeremy"]
  Kredis Proxy (0.0ms)  SMEMBERS people:1:names
=> ["David", "Jeremy", "Lewis"]

vs Setting default value on initialization

>> person = Person.new
>> person.names
  Kredis  (0.1ms)  Connected to shared
  Kredis Proxy (1.6ms)  EXISTS? people:1:names # One-time check...
  Kredis Proxy (0.2ms)  SADD people:1:names ["Lewis", "Jeremy"] # and set default

>> person.names.add "David" # subsequent operations don't need to concern themselves with defaults
  Kredis Proxy (0.7ms)  SADD people:1:names ["David"]
=> 1

>> person.names.members
  Kredis Proxy (0.6ms)  SMEMBERS people:1:names
=> ["David", "Jeremy", "Lewis"]

I think setting the default values on initialization has a few benefits:

  • Simpler Ruby implementation, only type initialization needs to consider defaults rather than every operation
  • No need for the Redis Lua script which adds overhead to the Redis server performance
  • 1-2 additional Redis operations when a class is initialized, rather than N operations depending on usage

@lewispb lewispb mentioned this pull request Jul 8, 2023
2 tasks
@lewispb
Copy link
Contributor

lewispb commented Jul 8, 2023

@tleish I hope you don't mind but I kept your tests but adjusted the implementation to fit my idea of setting defaults on initialization: #119

@dhh dhh closed this Aug 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

Default proc if empty
3 participants