Commit 37d633e

Add multiple saves / change event proportions (#149)
* feat: Add multiple rows per block / profile

  Adds the ability to fake how many times a course was published or a user profile saved. In the real world these are drivers of table size and we should exercise the "most recent block" / "most recent profile" code more.

* refactor: Don't enforce backend type on S3 load

  When doing S3 CSV generation and loads it required a lot of unnecessary switching back and forth between types in the config. Now as long as the correct keys exist it won't complain.

* refactor: Change event proportions

  Based on feedback from live sites, we've recalculated the proportions of event types. This takes an average of 4 different live sites to find a reasonable middle ground. Site configuration and course design play large roles in these distributions in real life.

1 parent 77cb606 commit 37d633e
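
The S3 refactor described in the commit message (check that the needed keys exist rather than requiring a particular backend type) could look roughly like the sketch below. This is not the project's code: the function names are hypothetical, and the required key list is an assumption built only from the `s3_key` and `s3_secret` settings visible in the configs; the real set of required keys may differ.

```python
# Assumed required keys; only these two S3 settings appear in the configs
# shown in this commit, so the real list may be longer or different.
REQUIRED_S3_KEYS = ("s3_key", "s3_secret")


def missing_s3_keys(config, required=REQUIRED_S3_KEYS):
    """Return the required keys that are absent or empty.

    Deliberately ignores config["backend"], so the same config can be used
    for CSV generation and for the S3 load without switching types.
    """
    return [key for key in required if not config.get(key)]


def validate_s3_load(config):
    """Raise if any key needed for the S3 load is missing (hypothetical helper)."""
    missing = missing_s3_keys(config)
    if missing:
        raise ValueError(f"Missing S3 settings: {', '.join(missing)}")


# Any backend value is accepted as long as the S3 keys are present.
config = {"backend": "ralph_clickhouse", "s3_key": "example", "s3_secret": "example"}
validate_s3_load(config)
```

Checking for keys instead of a backend name keeps the validation tied to what the load step actually needs.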

16 files changed, +306 -193 lines changed

default_config.yaml

Lines changed: 9 additions & 1 deletion
@@ -19,7 +19,6 @@ csv_load_from_s3_after: false
 # s3_key:
 # s3_secret:
 
-
 # Ralph / ClickHouse backend configuration
 # ########################################
 # backend: ralph_clickhouse
@@ -53,6 +52,10 @@ course_length_days: 120
 num_organizations: 3
 num_actors: 10
 
+# This replicates users updating their profiles several times, creating
+# more rows
+num_actor_profile_changes: 5
+
 # How many of each size course to create. The sum of these is the total number
 # of courses created for the test.
 num_course_sizes:
@@ -61,6 +64,11 @@ num_course_sizes:
   large: 1
   huge: 1
 
+# How many times each course will be "published", this creates a more realistic
+# distribution of course blocks where each course can be published dozens or
+# hundreds of times while it is being developed.
+num_course_publishes: 100
+
 # Course size configurations, how many of each type of object are created for
 # each course of this size. "actors" must be less than or equal to "num_actors".
 # For a course of this size to be created it needs to exist both here and in
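
As a rough illustration of what `num_actor_profile_changes` is meant to exercise, the sketch below generates several timestamped rows per actor and then picks the latest one, the way a "most recent profile" query would. It is a minimal sketch, not the generator in this repository: the function names, row fields, and the random spacing between saves are all assumptions. The same pattern would apply to `num_course_publishes` and course blocks.

```python
import random
from datetime import datetime, timedelta

# Values taken from the config keys added in this commit.
NUM_ACTORS = 10
NUM_ACTOR_PROFILE_CHANGES = 5


def generate_profile_rows(num_actors, num_changes, start, seed=0):
    """Return one row per (actor, save), each save later than the previous one."""
    rng = random.Random(seed)
    rows = []
    for actor_id in range(num_actors):
        saved_at = start
        for version in range(num_changes):
            # Assumed spacing: each save lands 1 minute to 1 day after the last.
            saved_at += timedelta(minutes=rng.randint(1, 60 * 24))
            rows.append(
                {"actor_id": actor_id, "version": version, "saved_at": saved_at}
            )
    return rows


def most_recent_profiles(rows):
    """Keep only the latest row per actor."""
    latest = {}
    for row in rows:
        current = latest.get(row["actor_id"])
        if current is None or row["saved_at"] > current["saved_at"]:
            latest[row["actor_id"]] = row
    return latest


rows = generate_profile_rows(
    NUM_ACTORS, NUM_ACTOR_PROFILE_CHANGES, datetime(2024, 1, 1)
)
latest = most_recent_profiles(rows)
print(len(rows), len(latest))  # 50 rows in the table, 10 "current" profiles
```

The table grows by a factor of `num_actor_profile_changes`, while the number of current profiles stays equal to `num_actors`.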

example_configs/clickhouse_example.yaml

Lines changed: 9 additions & 0 deletions
@@ -27,6 +27,10 @@ course_length_days: 120
 num_organizations: 3
 num_actors: 10
 
+# This replicates users updating their profiles several times, creating
+# more rows
+num_actor_profile_changes: 5
+
 # How many of each size course to create. The sum of these is the total number
 # of courses created for the test.
 num_course_sizes:
@@ -35,6 +39,11 @@ num_course_sizes:
   large: 1
   huge: 1
 
+# How many times each course will be "published", this creates a more realistic
+# distribution of course blocks where each course can be published dozens or
+# hundreds of times while it is being developed.
+num_course_publishes: 100
+
 # Course size configurations, how many of each type of object are created for
 # each course of this size. "actors" must be less than or equal to "num_actors".
 # For a course of this size to be created it needs to exist both here and in

example_configs/large_csv_load.yaml

Lines changed: 9 additions & 0 deletions
@@ -39,13 +39,22 @@ course_length_days: 120
 num_organizations: 10
 num_actors: 2000000
 
+# This replicates users updating their profiles several times, creating
+# more rows
+num_actor_profile_changes: 5
+
 # The sum of these is the total number of courses created for the test
 num_course_sizes:
   small: 100
   medium: 400
   large: 480
   huge: 120
 
+# How many times each course will be "published", this creates a more realistic
+# distribution of course blocks where each course can be published dozens or
+# hundreds of times while it is being developed.
+num_course_publishes: 100
+
 course_size_makeup:
   small:
     actors: 30

example_configs/local_csv.yaml

Lines changed: 9 additions & 0 deletions
@@ -20,6 +20,10 @@ course_length_days: 120
 num_organizations: 3
 num_actors: 10
 
+# This replicates users updating their profiles several times, creating
+# more rows
+num_actor_profile_changes: 5
+
 # How many of each size course to create. The sum of these is the total number
 # of courses created for the test.
 num_course_sizes:
@@ -28,6 +32,11 @@ num_course_sizes:
   large: 1
   huge: 1
 
+# How many times each course will be "published", this creates a more realistic
+# distribution of course blocks where each course can be published dozens or
+# hundreds of times while it is being developed.
+num_course_publishes: 100
+
 # Course size configurations, how many of each type of object are created for
 # each course of this size. "actors" must be less than or equal to "num_actors".
 # For a course of this size to be created it needs to exist both here and in

example_configs/ralph_clickhouse_example.yaml

Lines changed: 9 additions & 0 deletions
@@ -27,6 +27,10 @@ course_length_days: 120
 num_organizations: 3
 num_actors: 10
 
+# This replicates users updating their profiles several times, creating
+# more rows
+num_actor_profile_changes: 5
+
 # How many of each size course to create. The sum of these is the total number
 # of courses created for the test.
 num_course_sizes:
@@ -35,6 +39,11 @@ num_course_sizes:
   large: 1
   huge: 1
 
+# How many times each course will be "published", this creates a more realistic
+# distribution of course blocks where each course can be published dozens or
+# hundreds of times while it is being developed.
+num_course_publishes: 100
+
 # Course size configurations, how many of each type of object are created for
 # each course of this size. "actors" must be less than or equal to "num_actors".
 # For a course of this size to be created it needs to exist both here and in

example_configs/s3_csv.yaml

Lines changed: 9 additions & 0 deletions
@@ -24,6 +24,10 @@ course_length_days: 120
 num_organizations: 3
 num_actors: 10
 
+# This replicates users updating their profiles several times, creating
+# more rows
+num_actor_profile_changes: 5
+
 # How many of each size course to create. The sum of these is the total number
 # of courses created for the test.
 num_course_sizes:
@@ -32,6 +36,11 @@ num_course_sizes:
   large: 1
   huge: 1
 
+# How many times each course will be "published", this creates a more realistic
+# distribution of course blocks where each course can be published dozens or
+# hundreds of times while it is being developed.
+num_course_publishes: 100
+
 # Course size configurations, how many of each type of object are created for
 # each course of this size. "actors" must be less than or equal to "num_actors".
 # For a course of this size to be created it needs to exist both here and in

example_configs/s3_csv_clickhouse.yaml

Lines changed: 9 additions & 0 deletions
@@ -39,6 +39,10 @@ course_length_days: 120
 num_organizations: 3
 num_actors: 10
 
+# This replicates users updating their profiles several times, creating
+# more rows
+num_actor_profile_changes: 5
+
 # How many of each size course to create. The sum of these is the total number
 # of courses created for the test.
 num_course_sizes:
@@ -47,6 +51,11 @@ num_course_sizes:
   large: 1
   huge: 1
 
+# How many times each course will be "published", this creates a more realistic
+# distribution of course blocks where each course can be published dozens or
+# hundreds of times while it is being developed.
+num_course_publishes: 100
+
 # Course size configurations, how many of each type of object are created for
 # each course of this size. "actors" must be less than or equal to "num_actors".
 # For a course of this size to be created it needs to exist both here and in

xapi_db_load/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -2,4 +2,4 @@
 Scripts to generate fake xAPI data against various backends.
 """
 
-__version__ = "1.4.1"
+__version__ = "1.5.0"
