Out of memory caching #25
Closed
30 commits
fb271f1 Add google datastore memoization option. (danielecook)
e3569a6 add error messages and compression. (danielecook)
51a4544 Add to do and additional cache types. (danielecook)
56a6c8f update readme. (danielecook)
1e8e809 Rewrite to use googleauthR (danielecook)
514514a Fix cache return value and don't automatically reset cache. (danielecook)
20ce739 Finish datastore cache. (danielecook)
2e690c6 cleanup namespace (danielecook)
fc44834 Add AWS Service (danielecook)
0448ff8 Cleanup and readme. (danielecook)
9e1ac4c Update README.md (danielecook)
27fe9e8 Rename, update (danielecook)
c2f0bce Update description. (danielecook)
6fad8e3 fix description (danielecook)
73c11b4 fix remotes (danielecook)
cd16c33 fix remotes. (danielecook)
df3b92f fix remotes. (danielecook)
4c6df0d use git repo instead (danielecook)
25cbe38 Remotes were fine... (danielecook)
68e8636 Update README.md (danielecook)
f435e95 Restructure for fork (danielecook)
a952d5d Merge branch 'master' of https://github.com/danielecook/xmemoise (danielecook)
c32fd35 Improve documentation. (danielecook)
324e5a5 remove documentation items not allowed... (danielecook)
9f4a914 Fixed final warnings. (danielecook)
5b3e9ee improve documentation with examples. (danielecook)
f60558d Update README.md (danielecook)
f6ddf99 Update README.md (danielecook)
d86494d Update README.md (danielecook)
baed71c Update README.md (danielecook)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
@@ -3,3 +3,4 @@ | |
| ^\.Rproj\.user$ | ||
| ^revdep$ | ||
| ^cran-comments\.md$ | ||
| ^\.httr-oauth$ | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,3 +1,4 @@ | ||
| .Rproj.user | ||
| .Rhistory | ||
| .RData | ||
| .httr-oauth |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,18 +1,27 @@ | ||
| Encoding: UTF-8 | ||
| Package: memoise | ||
| Title: Memoisation of Functions | ||
| Version: 1.0.0.9000 | ||
| Description: Memoisation allows the results of functions to be cached based on input parameters. | ||
| xmemoise offers both local and remote caches, enabling memoisation across computers. | ||
| Additional cache types include Google Datastore and Amazon AWS. | ||
| Version: 1.0.1 | ||
| Authors@R: c( | ||
| person("Hadley", "Wickham", , "[email protected]", role = "aut"), | ||
| person("Jim", "Hester", , "[email protected]", role = c("aut", "cre")), | ||
| person("Kirill", "Müller", , "[email protected]", role = "aut")) | ||
| person("Kirill", "Müller", , "[email protected]", role = "aut"), | ||
| person("Daniel", "Cook", , "[email protected]", role = "aut")) | ||
| Description: Cache the results of a function so that when you call it | ||
| again with the same arguments it returns the pre-computed value. | ||
| URL: https://github.com/hadley/memoise | ||
| URL: https://github.com/danielecook/xmemoise | ||
| BugReports: https://github.com/hadley/memoise/issues | ||
| Imports: | ||
| digest (>= 0.6.3) | ||
| digest (>= 0.6.3), | ||
| base64enc | ||
| Suggests: | ||
| testthat | ||
| testthat, | ||
| googleAuthR, | ||
| aws.s3 | ||
| Remotes: | ||
| cloudyr/aws.s3 | ||
| License: MIT + file LICENSE | ||
| RoxygenNote: 5.0.1 | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,72 @@ | ||
| #' @name cache_aws_s3 | ||
| #' @title Amazon Web Services S3 Cache | ||
| #' @description Initiate an Amazon Web Services Cache | ||
| #' | ||
| #' @examples | ||
| #' | ||
| #' \dontrun{ | ||
| #' # Set AWS credentials. | ||
| #' Sys.setenv("AWS_ACCESS_KEY_ID" = "<access key>", | ||
| #' "AWS_SECRET_ACCESS_KEY" = "<access secret>") | ||
| #' | ||
| #' # Set up a unique bucket name. | ||
| #' s3 <- cache_aws_s3("unique-bucket-name") | ||
| #' mem_runif <- memoise(runif, cache = s3) | ||
| #' } | ||
| #' | ||
| #' | ||
| #' @param cache_name Bucket name for storing cache files. | ||
| #' @export | ||
|
|
||
| cache_aws_s3 <- function(cache_name) { | ||
|
|
||
| # Can't get this check to pass... | ||
| # if (!("aws.s3" %in% installed.packages()[,"Package"])) { stop("aws.s3 required for datastore cache.") } | ||
|
|
||
| if (!(aws.s3::bucket_exists(cache_name))) { | ||
| aws.s3::put_bucket(cache_name) | ||
| if (!(aws.s3::bucket_exists(cache_name))) { | ||
| stop("Cache name must use unique bucket name") | ||
| } | ||
| } | ||
|
|
||
| cache <- NULL | ||
| cache_reset <- function() { | ||
| aws.s3::delete_bucket(cache_name) | ||
| aws.s3::put_bucket(cache_name) | ||
| } | ||
|
|
||
| cache_set <- function(key, value) { | ||
| tfile <- tempfile() | ||
| on.exit(unlink(tfile)) | ||
| save(value, file = tfile) | ||
| aws.s3::put_object(tfile, object = key, bucket = cache_name) | ||
| } | ||
|
|
||
| cache_get <- function(key) { | ||
| suppressWarnings(aws.s3::s3load(object = key, bucket = cache_name)) | ||
| base::get(ls()[ls() != "key"][[1]]) | ||
| } | ||
|
|
||
| cache_has_key <- function(key) { | ||
| aws.s3::head_object(object = key, bucket = cache_name) | ||
| } | ||
|
|
||
| cache_keys <- function() { | ||
| items <- lapply(aws.s3::get_bucket(bucket = cache_name), function(x) { | ||
| if ("Key" %in% names(x)) { | ||
| return(x$Key) | ||
| } else { | ||
| return(NULL) | ||
| } | ||
| }) | ||
| unlist(Filter(Negate(is.null), items)) | ||
| } | ||
|
|
||
| list( | ||
| reset = cache_reset, | ||
| set = cache_set, | ||
| get = cache_get, | ||
| has_key = cache_has_key, | ||
| keys = cache_keys | ||
| ) | ||
| } |
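The `cache_get` function above loads the stored object and then finds it by listing the local environment. A minimal local sketch of the same save/load round trip, assuming the object is always saved under the name `value` (as `cache_set` does). `store_value` and `restore_value` are hypothetical helper names, and no S3 calls are made:

```r
# Hypothetical helpers, not part of the PR: store and restore one value
# with save()/load(), reading it back by its known name.
store_value <- function(value, file) {
  save(value, file = file)  # the object is always saved under the name "value"
  invisible(file)
}

restore_value <- function(file) {
  env <- new.env(parent = emptyenv())
  load(file, envir = env)   # load into a private environment
  get("value", envir = env, inherits = FALSE)
}

tmp <- tempfile(fileext = ".rda")
store_value(list(a = 1, b = "x"), tmp)
restored <- restore_value(tmp)
unlink(tmp)
```

Loading into a private environment means the lookup does not depend on which other variables happen to exist in the calling function.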
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,136 @@ | ||
| #' @name cache_datastore | ||
| #' @title Google Datastore Cache | ||
| #' @description Initiate a Google Datastore cache. | ||
| #' @param project Google Cloud project | ||
| #' @param cache_name datastore kind to use for storing cache entities. | ||
| #' | ||
| #' @examples | ||
| #' \dontrun{ | ||
| #' ds <- cache_datastore(project = "<project-id>", cache_name = "rcache") | ||
| #' mem_runif <- memoise(runif, cache = ds) | ||
| #' } | ||
| #' | ||
| #' @seealso \url{https://cloud.google.com/} | ||
| #' @seealso \url{https://cloud.google.com/datastore/docs/concepts/overview} | ||
| #' | ||
| #' @export | ||
|
|
||
| cache_datastore <- function(project, cache_name = "rcache") { | ||
|
|
||
| if (!("googleAuthR" %in% installed.packages()[,"Package"])) { stop("googleAuthR required for datastore cache.") } | ||
|
|
||
| options("googleAuthR.scopes.selected" = c("https://www.googleapis.com/auth/datastore", | ||
| "https://www.googleapis.com/auth/userinfo.email")) | ||
|
|
||
| googleAuthR::gar_auth() | ||
|
|
||
| base_url <- paste0("https://datastore.googleapis.com/v1beta3/projects/", project) | ||
|
|
||
| transaction <- googleAuthR::gar_api_generator(paste0(base_url, ":beginTransaction"), | ||
| "POST", | ||
| data_parse_function = function(x) x$transaction) | ||
|
|
||
| commit_ds <- googleAuthR::gar_api_generator(paste0(base_url, ":commit"), | ||
| "POST", | ||
| data_parse_function = function(x) x) | ||
|
|
||
| load_ds <- googleAuthR::gar_api_generator(paste0(base_url, ":lookup"), | ||
| "POST", | ||
| data_parse_function = function(resp) { | ||
| # Unserialize and return | ||
| if ("found" %in% names(resp)) { | ||
| resp <- resp$found | ||
| value <- resp$entity$properties$object$blobValue | ||
| response <- unserialize(memDecompress(base64enc::base64decode(value), type = "gzip")) | ||
| } else if ("missing" %in% names(resp)) { | ||
| "!cache-not-found" | ||
| } else { | ||
| stop("Unexpected response from Datastore lookup.") | ||
| } | ||
| }) | ||
|
|
||
| query_ds <- googleAuthR::gar_api_generator(paste0(base_url, ":runQuery"), | ||
| "POST", | ||
| data_parse_function = function(resp) resp) | ||
|
|
||
|
|
||
| cache_reset <- function() { | ||
| query_results <- query_ds(the_body = list(gqlQuery = list(queryString = paste0("SELECT * FROM ", cache_name)))) | ||
| while ((query_results$batch$moreResults != "NO_MORE_RESULTS") || !is.null(query_results$batch$entityResults)) { | ||
|
|
||
|
|
||
| ids <- (sapply(query_results$batch$entityResults$entity$key$path, function(x) x$name)) | ||
|
|
||
| item_groups <- split(ids, seq_along(ids) %/% 25) | ||
| sapply(item_groups, function(idset) { | ||
| mutations <- lapply(idset, function(x) { | ||
| c(list("delete" = list(path = list(kind = cache_name, name = x)))) | ||
| }) | ||
| body <- list(mutations = mutations, transaction = transaction()) | ||
| resp <- try(commit_ds(the_body = body), silent = TRUE) | ||
| message("Clearing Cache") | ||
| }) | ||
| query_results <- query_ds(the_body = list(gqlQuery = list(queryString = paste0("SELECT * FROM ", cache_name)))) | ||
| } | ||
| } | ||
|
|
||
|
|
||
| cache_set <- function(key, value) { | ||
| # Serialize value | ||
| svalue <- base64enc::base64encode(memCompress(serialize(value, NULL, ascii = TRUE), type = "gzip")) | ||
| path_item <- list( | ||
| kind = cache_name, | ||
| name = key | ||
| ) | ||
| prop <- list( | ||
| object = list(blobValue = svalue, excludeFromIndexes = TRUE) | ||
| ) | ||
|
|
||
| transaction_id <- transaction() | ||
|
|
||
| key_obj <- c(list(key = list(path = path_item), | ||
| properties = prop)) | ||
| mutation = list() | ||
| mutation[["upsert"]] = key_obj | ||
| body <- list(mutations = mutation, | ||
| transaction = transaction_id | ||
| ) | ||
|
|
||
| resp <- try(commit_ds(the_body = body), silent = TRUE) | ||
| if (inherits(resp, "try-error")) { | ||
| warning(attr(resp, "condition")) | ||
| } | ||
| } | ||
|
|
||
| cache_get <- function(key) { | ||
| path_item <- list( | ||
| kind = cache_name, | ||
| name = key | ||
| ) | ||
|
|
||
| resp <- load_ds(the_body = list(keys = list(path = path_item))) | ||
| # identical() is safe when the cached value is a vector of length > 1 | ||
| if (identical(resp, "!cache-not-found")) { | ||
| stop("Cache Not Found") | ||
| } | ||
| resp | ||
| } | ||
|
|
||
| cache_has_key <- function(key) { | ||
| res <- try(suppressWarnings(cache_get(key)), silent = TRUE) | ||
| found <- !inherits(res, "try-error") | ||
| if (found) { | ||
| message("Using Cached Version") | ||
| } | ||
| found | ||
| } | ||
|
|
||
| list( | ||
| reset = cache_reset, | ||
| set = cache_set, | ||
| get = cache_get, | ||
| has_key = cache_has_key, | ||
| keys = function() message("Keys can't be listed with the google datastore cache.") | ||
| ) | ||
| } | ||
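The datastore cache stores each value as a serialized, gzip-compressed blob and reverses those steps on lookup. A minimal sketch of that round trip using only base R, assuming the base64 step (done with `base64enc` in the diff) is left out. `encode_value` and `decode_value` are hypothetical names, not functions from the PR:

```r
# Encode: serialize to a raw vector, then gzip-compress it.
encode_value <- function(value) {
  memCompress(serialize(value, connection = NULL), type = "gzip")
}

# Decode: decompress, then unserialize back into an R object.
decode_value <- function(raw_bytes) {
  unserialize(memDecompress(raw_bytes, type = "gzip"))
}

original <- data.frame(x = 1:3, y = c("a", "b", "c"))
encoded <- encode_value(original)
round_trip <- decode_value(encoded)
```

The encoded value is a raw vector, which is why the diff base64-encodes it before placing it in the JSON request body.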
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,65 @@ | ||
| #' @name cache_filesystem | ||
| #' @title Filesystem Cache | ||
| #' @description | ||
| #' Initiate a filesystem cache. | ||
| #' | ||
| #' @param path Directory in which to store cached items. | ||
| #' | ||
| #' @examples | ||
| #' | ||
| #' \dontrun{ | ||
| #' # Use with Dropbox | ||
| #' | ||
| #' db <- cache_filesystem("~/Dropbox/.rcache") | ||
| #' | ||
| #' mem_runif <- memoise(runif, cache = db) | ||
| #' | ||
| #' # Use with Google Drive | ||
| #' | ||
| #' gd <- cache_filesystem("~/Google Drive/.rcache") | ||
| #' | ||
| #' mem_runif <- memoise(runif, cache = gd) | ||
| #' | ||
| #' } | ||
| #' | ||
| #' @export | ||
|
|
||
| cache_filesystem <- function(path) { | ||
|
|
||
| dir.create(file.path(path), showWarnings = FALSE) | ||
|
|
||
| cache_reset <- function() { | ||
| cache_files <- list.files(path, full.names = TRUE) | ||
| if (length(cache_files) > 0) { | ||
| rm_status <- file.remove(cache_files) | ||
| if (all(rm_status)) { | ||
| message("Cached files removed.") | ||
| } | ||
| } else { | ||
| message("No files in Cache.") | ||
| } | ||
| } | ||
|
|
||
| cache_set <- function(key, value) { | ||
| save(value, file = paste(path, key, sep="/")) | ||
| } | ||
|
|
||
| cache_get <- function(key) { | ||
| load(file = paste(path, key, sep="/")) | ||
| value | ||
| } | ||
|
|
||
| cache_has_key <- function(key) { | ||
| file.exists(paste(path, key, sep="/")) | ||
| } | ||
|
|
||
| list( | ||
| reset = cache_reset, | ||
| set = cache_set, | ||
| get = cache_get, | ||
| has_key = cache_has_key, | ||
| keys = function() list.files(path) | ||
| ) | ||
| } |
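Each cache constructor in this diff returns the same five-function list: `reset`, `set`, `get`, `has_key`, and `keys`. A minimal sketch of that interface backed by an in-memory environment, with a simplified wrapper showing how a memoised function might consult it. `cache_memory_sketch` and `memoise_sketch` are hypothetical names, and the `deparse()`-based key is a stand-in assumption for the digest-based keys the package uses:

```r
# Hypothetical in-memory cache exposing the same five functions as the
# caches in this diff.
cache_memory_sketch <- function() {
  store <- new.env(parent = emptyenv())
  cache_reset <- function() rm(list = ls(store, all.names = TRUE), envir = store)
  cache_set <- function(key, value) assign(key, value, envir = store)
  cache_get <- function(key) base::get(key, envir = store, inherits = FALSE)
  cache_has_key <- function(key) exists(key, envir = store, inherits = FALSE)
  cache_keys <- function() ls(store, all.names = TRUE)
  list(reset = cache_reset, set = cache_set, get = cache_get,
       has_key = cache_has_key, keys = cache_keys)
}

# Simplified wrapper: look up the key, compute and store on a miss.
memoise_sketch <- function(f, cache) {
  function(...) {
    key <- paste(deparse(list(...)), collapse = "")  # stand-in for a hash
    if (cache$has_key(key)) {
      return(cache$get(key))
    }
    value <- f(...)
    cache$set(key, value)
    value
  }
}

calls <- 0
slow_square <- function(x) {
  calls <<- calls + 1  # count how often the real function runs
  x^2
}

mem <- cache_memory_sketch()
fast_square <- memoise_sketch(slow_square, mem)
first <- fast_square(4)
second <- fast_square(4)  # served from the cache
```

Because the wrapper only touches the five list entries, swapping the in-memory store for a filesystem or remote cache would not change the wrapper.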
We have to be somewhat careful about dependencies not yet on CRAN;
`aws.s3` also should be in `Suggests:` as well as `Remotes:`.
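Packages listed under `Suggests:` are not guaranteed to be installed, so code that uses them usually checks at run time. A minimal sketch of such a guard using base R's `requireNamespace()`. `require_suggested` is a hypothetical helper name, not something defined in this PR:

```r
# Hypothetical helper: stop with a clear message when an optional
# (Suggests) package is not installed.
require_suggested <- function(pkg, feature) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop(sprintf("Package '%s' is required for the %s. Please install it.",
                 pkg, feature), call. = FALSE)
  }
  invisible(TRUE)
}

# A package that ships with R passes the check.
ok <- require_suggested("stats", "demo")

# A missing package produces an informative error instead of a failure
# deep inside the cache code.
msg <- tryCatch(
  require_suggested("notARealPackage12345", "demo"),
  error = function(e) conditionMessage(e)
)
```

A cache constructor could call such a guard as its first step, for example with `"aws.s3"` or `"googleAuthR"` as the package name.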