Administration tools for managing a data library.

get_raad_data_roots()

get_raad_filenames()

set_raad_data_roots(..., replace_existing = TRUE,
  use_known_candidates = FALSE, verbose = TRUE)

raad_filedb_path(...)

set_raad_filenames(clobber = FALSE)

run_build_raad_cache()

Arguments

...

input file paths to set

replace_existing

replace existing paths, defaults to TRUE

use_known_candidates

apply internal logic for known candidates (for internal use at raad-hq), defaults to FALSE

clobber

if FALSE (the default), an existing file cache is not ignored; set to TRUE to ignore the existing cache and set it again

Details

These management functions are aimed at raadtools users, but can be used for any file collection. The administration tools consist of **data roots** and control over the building, reading, and caching of the available file list. No interpretation of the underlying files is provided in the administration tools.

A typical user won't use these functions but may want to investigate the contents of the raw file list with `get_raad_filenames()`.

A user setting up a raadfiles collection will typically set the root directory/directories with `set_raad_data_roots()`, then run the file cache list builder with `run_build_raad_cache()`, and then `set_raad_filenames()` to actually load the file cache into memory.
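The three setup steps can be sketched as follows. This is a minimal illustration, not a prescribed procedure: the path `/path/to/data` is a hypothetical placeholder, and the guard simply skips the calls when raadfiles is not installed or the directory does not exist.

```r
# Hypothetical root directory (an assumption for illustration);
# substitute the top-level folder of your own file collection.
data_root <- "/path/to/data"

if (requireNamespace("raadfiles", quietly = TRUE) && dir.exists(data_root)) {
  # 1. Register the root directory (replaces existing roots by default).
  raadfiles::set_raad_data_roots(data_root)

  # 2. Scan the root(s) and build the file cache list.
  raadfiles::run_build_raad_cache()

  # 3. Load the file cache into memory.
  raadfiles::set_raad_filenames()

  # Inspect the raw file list.
  str(raadfiles::get_raad_filenames())
}
```

Steps 1 and 2 are usually needed only when the collection is first set up or when the files on disk change; step 3 happens automatically when the package loads in a later session.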

In a new R session there is no need to run `set_raad_filenames()` directly as this will be done as the package loads. To disable this automatic behaviour use `options(raadfiles.file.cache.disable = TRUE)` *before* the package is used or loaded. This is typically done when calling `run_build_raad_cache()` in a cron task.
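A script run from cron might look roughly like the sketch below. It assumes the data roots are already configured for the session that runs it; the guard on `get_raad_data_roots()` is an added precaution for illustration, not something the package requires.

```r
# Set this BEFORE raadfiles is used or loaded, so the package
# does not load the file cache into memory as it loads.
options(raadfiles.file.cache.disable = TRUE)

if (requireNamespace("raadfiles", quietly = TRUE)) {
  roots <- raadfiles::get_raad_data_roots()
  if (length(roots) > 0) {
    # Rescan the root directories and update the file listing.
    raadfiles::run_build_raad_cache()
  }
}
```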

Every raadfiles file collection function (e.g. `oisst_daily_files()`) runs `get_raad_filenames()` to obtain the full raw list of available files from the global in-memory option `getOption("raadfiles.filename.database")`. Each such call also has a small probability, set by a threshold, of triggering a re-read of the file listing from the root directories. To avoid this trigger, either use `getOption("raadfiles.filename.database")` directly to get the in-memory file list, or set `options(raadfiles.file.refresh.threshold = 0)`. Setting the threshold to 1 forces the listing to be re-read every time; a re-read can also be forced with `set_raad_filenames(clobber = TRUE)`.
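To make the threshold behaviour concrete, here is a base-R illustration of a probabilistic trigger. The helper `should_refresh()` is hypothetical and is not the package's code; it only assumes that a refresh happens when a uniform random draw falls below the threshold, which matches the described endpoints (0 = never, 1 = every time).

```r
# Hypothetical stand-in for the refresh decision (not raadfiles code):
# refresh when a uniform draw on (0, 1) falls below the threshold.
should_refresh <- function(threshold) {
  stats::runif(1) < threshold
}

set.seed(1)
never  <- replicate(1000, should_refresh(0))     # threshold 0: no refreshes
always <- replicate(1000, should_refresh(1))     # threshold 1: refresh on every call
some   <- replicate(1000, should_refresh(0.01))  # low threshold: occasional refresh

# Turn the trigger off for this session, then read the in-memory
# file list directly (NULL if no file list has been loaded).
old <- options(raadfiles.file.refresh.threshold = 0)
db  <- getOption("raadfiles.filename.database")
options(old)  # restore the previous setting
```

Reading the option directly bypasses `get_raad_filenames()` altogether, so it never triggers a re-read regardless of the threshold.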

There is a family of functions and global options used for administration.

Administration functions

set_raad_data_roots
set data root paths; for normal use only one data root is needed
set_raad_filenames
runs the system to update the file listing and refresh it
get_raad_data_roots
returns the current list of visible root directories
get_raad_filenames
returns the entire list of all files found in visible root directories
run_build_raad_cache
scan all root directories and update the file listing in each

Options for use by administrators

raadfiles.data.roots
the list of paths to root directories
raadfiles.file.cache.disable
disable on-load setting of the in-memory file cache (never set automatically by the package)
raadfiles.file.refresh.threshold
threshold probability of how often to refresh in-memory file cache (0 = never, 1 = every time `get_raad_filenames()` is called)

Internal options, used by the package

Options used internally, and subject to control by administrator options and the running of admin functions (they may not be set).

raadfiles.filename.database
the data frame of all file names from the data roots
raadfiles.database.status
a status record of the in-memory filename database (timestamp)