A gem for archiving (deleting) old records you no longer need. Send them straight to tartarus!
Add this line to your application's Gemfile:
gem 'tartarus-rb'
And then execute:
$ bundle install
Or install it yourself as:
$ gem install tartarus-rb
This gem is based on sidekiq-cron, which means you can manage (e.g. disable/enable) jobs from the sidekiq-cron UI.
Here are some examples of how to use it.
Put it in the initializer, e.g. in config/initializers/sidekiq.rb, right after loading the schedule for sidekiq-cron:
Sidekiq.configure_server do |config|
  config.on(:startup) do
    schedule_file = "config/schedule.yml"
    if File.exist?(schedule_file) && Sidekiq.server?
      Sidekiq::Cron::Job.load_from_hash YAML.load_file(schedule_file)
      tartarus = Tartarus.new
      tartarus.register do |item|
        item.model = ModelThatYouWantToArchive
        item.cron = "5 4 * * *"
        item.queue = "default"
        item.tenants_range = -> { Account.active }
        item.tenant_value_source = :uuid
        item.tenant_id_field = :account_uuid
        item.archive_items_older_than = -> { 30.days.ago }
        item.timestamp_field = :created_at
        item.archive_with = :destroy_all
      end
      tartarus.register do |item|
        item.model = OtherModelThatYouWantToArchive
        item.cron = "5 5 * * *"
        item.queue = "default"
        item.tenants_range = -> { ["Account", "User"] }
        item.tenant_id_field = :model_type
        item.archive_items_older_than = -> { 30.days.ago }
        item.timestamp_field = :created_at
      end
      glacier_configuration = Tartarus::RemoteStorage::Glacier::Configuration.build(
        aws_key: ENV.fetch("AWS_KEY"),
        aws_secret: ENV.fetch("AWS_SECRET"),
        aws_region: ENV.fetch("AWS_REGION"),
        vault_name: ENV.fetch("GLACIER_VAULT_NAME"),
        root_path: Rails.root.to_s,
        archive_registry_factory: ArchiveRegistry,
      )
      # don't forget about installing `aws-sdk-glacier` gem
      tartarus.register do |item|
        item.model = YetAnotherModel
        item.cron = "5 6 * * *"
        item.queue = "default"
        item.timestamp_field = :created_at
        item.archive_items_older_than = -> { 1.week.ago }
        item.remote_storage = Tartarus::RemoteStorage::Glacier.new(glacier_configuration)
      end
      tartarus.schedule #  this method must be called to create jobs for sidekiq-cron!
    end
  end
end
You can use the following config params:
- model - the ActiveRecord model you want to archive, required
- name - name of your strategy, optional. It falls back to model.to_s. It's important to set it when you have several strategies for the same model:
  tartarus.register do |item|
    item.model = InternalEvent
    item.name = "archive_account_and_user_internal_events"
    item.cron = "5 5 * * *"
    item.queue = "default"
    item.tenants_range = -> { ["Account", "User"] }
    item.tenant_id_field = :model_type
    item.archive_items_older_than = -> { 30.days.ago }
    item.timestamp_field = :created_at
  end
  tartarus.register do |item|
    item.model = InternalEvent
    item.name = "archive_post_and_comment_internal_events"
    item.cron = "5 15 * * *"
    item.queue = "default"
    item.tenants_range = -> { ["Post", "Comment"] }
    item.tenant_id_field = :model_type
    item.archive_items_older_than = -> { 10.days.ago }
    item.timestamp_field = :created_at
  end
- cron - cron syntax, required
- queue - name of the Sidekiq queue you want to use for execution of the jobs, required
- tenants_range - optional, use it if you want to scope items by a tenant (or any field that can be used for partitioning). It doesn't have to be an ActiveRecord collection; it could be just an array. Must be a proc/lambda/object responding to the call method. For an ActiveRecord collection, a find_each loop will be used for optimization.
- tenant_value_source - optional, but required if you want scoping by a tenant/partitioning field. Specifying :uuid here means that the ModelThatYouWantToArchive collection will be scheduled for archiving by the uuid of each Account. It defaults to id.
- tenant_id_field - required when using tenants_range/tenant_value_source. It's a DB column that will be used for scoping records by a tenant. For example, here it would be: ModelThatYouWantToArchive.where(account_uuid: value_of_uuid_from_some_active_account)
- archive_items_older_than - required, defines the retention policy
- timestamp_field - required, used for performing a query using the value from archive_items_older_than
- archive_with - optional (defaults to delete_all). Could be delete_all, destroy_all, delete_all_without_batches, destroy_all_without_batches, delete_all_using_limit_in_batches
- batch_size - optional (defaults to 10_000, used with the delete_all_using_limit_in_batches strategy)
- remote_storage - optional (defaults to Tartarus::RemoteStorage::Null, which does nothing). Use this option if you want to store the data somewhere before deleting it.
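To make the interplay of tenants_range, tenant_value_source, tenant_id_field, archive_items_older_than and timestamp_field more concrete, here is a minimal in-memory sketch. It is not the gem's implementation: Account, Event, ActiveAccounts and archivable_by_tenant are hypothetical stand-ins, and plain arrays replace ActiveRecord queries. It only illustrates the selection rule described above (match the tenant column against each tenant's value, then keep records older than the cutoff).

```ruby
# Hypothetical in-memory stand-ins for ActiveRecord models; not part of tartarus-rb.
Account = Struct.new(:id, :uuid)
Event   = Struct.new(:account_uuid, :created_at)

# Any object responding to `call` can play the role of tenants_range.
class ActiveAccounts
  def call
    [Account.new(1, "uuid-a"), Account.new(2, "uuid-b")]
  end
end

# Same option names as in the list above, held in a plain hash for illustration.
CONFIG = {
  tenants_range: ActiveAccounts.new,
  tenant_value_source: :uuid,
  tenant_id_field: :account_uuid,
  archive_items_older_than: -> { Time.utc(2024, 2, 1) },
  timestamp_field: :created_at
}.freeze

# Returns { tenant_value => records that would be selected for that tenant }.
def archivable_by_tenant(records, config)
  cutoff = config[:archive_items_older_than].call
  config[:tenants_range].call.to_h do |tenant|
    value = tenant.public_send(config[:tenant_value_source])
    matching = records.select do |record|
      record.public_send(config[:tenant_id_field]) == value &&
        record.public_send(config[:timestamp_field]) < cutoff
    end
    [value, matching]
  end
end

EVENTS = [
  Event.new("uuid-a", Time.utc(2024, 1, 1)),  # old, tenant a
  Event.new("uuid-a", Time.utc(2024, 3, 1)),  # recent, kept
  Event.new("uuid-b", Time.utc(2023, 12, 1)), # old, tenant b
  Event.new("uuid-c", Time.utc(2023, 1, 1))   # old, but its tenant is not in the range
].freeze

archivable_by_tenant(EVENTS, CONFIG).each do |tenant_value, records|
  puts "#{tenant_value}: #{records.size} record(s) to archive"
end
```

Note that the record for "uuid-c" is never selected, because its tenant is not returned by tenants_range. In the sketch, as in the description above, records outside the tenant range are simply not considered.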
Currently, only Glacier (for AWS Glacier) is supported as remote storage. It works only with a Postgres database and requires the postgres-copy gem.
To take advantage of this feature you will need a couple of things:
- Apply acts_as_copy_target to the archivable model (from the postgres-copy gem).
- Create a model that will be used as a registry for all uploads that happened.
- Install the aws-sdk-glacier gem.
If you want to make Version model archivable and use ArchiveRegistry as the registry, you will need the following models and tables:
database.create_table(:archive_registries) do |t|
  t.string :glacier_location, null: false
  t.string :glacier_checksum, null: false
  t.string :glacier_archive_id, null: false
  t.string :archivable_model, null: false
  t.string :tenant_id_field
  t.string :tenant_id
  t.datetime :completed_at, null: false
end
database.create_table(:versions) do |t|
end
class Version < ApplicationRecord
  acts_as_copy_target
end
class ArchiveRegistry < ApplicationRecord
end
You can use the above schema for the registry model as it contains all needed fields.
To initialize the service:
glacier_configuration = Tartarus::RemoteStorage::Glacier::Configuration.build(
  aws_key: ENV.fetch("AWS_KEY"),
  aws_secret: ENV.fetch("AWS_SECRET"),
  aws_region: ENV.fetch("AWS_REGION"),
  vault_name: ENV.fetch("GLACIER_VAULT_NAME"),
  root_path: Rails.root.to_s,
  archive_registry_factory: ArchiveRegistry,
)
Tartarus::RemoteStorage::Glacier.new(glacier_configuration)
You can also pass account_id (by default the "-" string will be used):
glacier_configuration = Tartarus::RemoteStorage::Glacier::Configuration.build(
  aws_key: ENV.fetch("AWS_KEY"),
  aws_secret: ENV.fetch("AWS_SECRET"),
  aws_region: ENV.fetch("AWS_REGION"),
  vault_name: ENV.fetch("GLACIER_VAULT_NAME"),
  root_path: Rails.root.to_s,
  archive_registry_factory: ArchiveRegistry,
  account_id: "some_account_id"
)
Tartarus::RemoteStorage::Glacier.new(glacier_configuration)
Important: do not use Glacier storage for large batches (> 4 GB), as multipart uploads are not supported yet.
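If you produce export files yourself (for example in a custom storage), a size check along these lines could help you stay under that limit. This is only a sketch: small_enough_for_single_upload? and MAX_SINGLE_UPLOAD_BYTES are hypothetical names, not part of tartarus-rb, and the limit is deliberately read conservatively as 4 * 1000**3 bytes.

```ruby
require "tempfile"

# Assumption: "4 GB" is interpreted conservatively as decimal gigabytes.
MAX_SINGLE_UPLOAD_BYTES = 4 * 1000**3

# Hypothetical helper: true when the file at `path` fits into a single upload.
def small_enough_for_single_upload?(path, limit: MAX_SINGLE_UPLOAD_BYTES)
  File.size(path) <= limit
end

# Usage with a throwaway file standing in for an exported batch.
Tempfile.create("export") do |file|
  file.write("id,created_at\n1,2024-01-01\n")
  file.flush
  puts small_enough_for_single_upload?(file.path)
end
```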
If you know what you are doing, you can add your own storage, as long as it complies with the following interface:
class Glacier
  attr_reader :configuration
  private     :configuration
  def initialize(configuration)
    @configuration = configuration
  end
  def store(collection, archivable_model, tenant_id: nil, tenant_id_field: nil)
  end
end
You might want to verify that the gem works the way you expect. For that, you will mostly be interested in two use cases:
- scheduling/enqueueing: use Tartarus::ScheduleArchivingModel#schedule, for example Tartarus::ScheduleArchivingModel.new.schedule("PaperTrailVersion"). It is going to enqueue either Tartarus::Sidekiq::ArchiveModelWithTenantJob or Tartarus::Sidekiq::ArchiveModelWithoutTenantJob, depending on the config.
- execution of the archiving logic: use Tartarus::ArchiveModelWithTenant#archive (for example, Tartarus::ArchiveModelWithTenant.new.archive("PaperTrailVersion", "User")) or Tartarus::ArchiveModelWithoutTenant#archive (for example, Tartarus::ArchiveModelWithoutTenant.new.archive("PaperTrailVersion"))
You might also want to check spec/integration to get an idea of how the integration tests were written.
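When verifying the archiving flow in your own tests, a storage that follows the interface shown earlier (initialize(configuration) plus store(collection, archivable_model, tenant_id:, tenant_id_field:)) can record calls instead of uploading anything. The sketch below is one possible test double. RecordingStorage and its calls reader are hypothetical names, and how the gem uses the return value of store is not covered here.

```ruby
# Hypothetical test double; it follows the storage interface but uploads nothing.
class RecordingStorage
  attr_reader :calls

  def initialize(configuration = nil)
    @configuration = configuration
    @calls = []
  end

  # Same signature as the interface; only remembers what it was asked to store.
  def store(collection, archivable_model, tenant_id: nil, tenant_id_field: nil)
    @calls << {
      count: collection.count, # works for arrays and ActiveRecord relations
      model: archivable_model,
      tenant_id: tenant_id,
      tenant_id_field: tenant_id_field
    }
    nil
  end
end

storage = RecordingStorage.new
storage.store([:a, :b, :c], "Version", tenant_id: "uuid-a", tenant_id_field: :account_uuid)
puts storage.calls.first[:count]
```

You could then assign such an object to item.remote_storage in a test-only registration and assert on storage.calls.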
After checking out the repo, run bin/setup to install dependencies. Then, run rake spec to run the tests. You can also run bin/console for an interactive prompt that will allow you to experiment.
To install this gem onto your local machine, run bundle exec rake install. To release a new version, update the version number in version.rb, and then run bundle exec rake release, which will create a git tag for the version, push git commits and tags, and push the .gem file to rubygems.org.
Bug reports and pull requests are welcome on GitHub at https://github.com/[USERNAME]/tartarus-rb.
The gem is available as open source under the terms of the MIT License.