raadfiles-admin.Rd
Administration tools for managing a data library.
get_raad_data_roots()
get_raad_filenames()
set_raad_data_roots(..., replace_existing = TRUE, use_known_candidates = FALSE, verbose = TRUE)
raad_filedb_path(...)
set_raad_filenames(clobber = FALSE)
run_build_raad_cache()
Argument | Description |
---|---|
... | input file paths to set |
replace_existing | replace existing paths, defaults to TRUE |
use_known_candidates | apply internal logic for known candidates (for internal use at raad-hq), defaults to FALSE |
clobber | if FALSE (the default), the existing file cache is kept; set to TRUE to ignore it and set the file list anew |
These management functions are aimed at raadtools users, but can be used for any file collection. The administration tools consist of **data roots** and control over the building, reading, and caching of the available file list. No interpretation of the underlying files is provided in the administration tools.
A typical user won't use these functions, but may want to investigate the contents of the raw file list with `get_raad_filenames()`.
A user setting up a raadfiles collection will typically set the root directory/directories with `set_raad_data_roots()`, then run the file cache list builder with `run_build_raad_cache()`, and then `set_raad_filenames()` to actually load the file cache into memory.
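The setup sequence above can be sketched as a small helper. This is only a sketch under stated assumptions: `setup_raad_collection()` and the placeholder path are hypothetical and not part of raadfiles, the raadfiles package is assumed to be installed, and only the three admin functions named above are called.

```r
# Hypothetical helper (not part of raadfiles): wraps the three setup steps.
setup_raad_collection <- function(roots) {
  if (!requireNamespace("raadfiles", quietly = TRUE)) {
    stop("the raadfiles package is required")
  }
  # 1. register the root directory/directories; each path is passed
  #    as a separate argument to match the `...` signature
  do.call(raadfiles::set_raad_data_roots, as.list(roots))
  # 2. scan the roots and build the file cache listing
  raadfiles::run_build_raad_cache()
  # 3. load the file cache into memory
  raadfiles::set_raad_filenames()
  # return the roots now visible to the package
  invisible(raadfiles::get_raad_data_roots())
}

# Example call with a placeholder path (not run here):
# setup_raad_collection("/path/to/data/root")
```

In a later R session only the package load is needed, since the file cache is then set automatically.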
In a new R session there is no need to run `set_raad_filenames()` directly as this will be done as the package loads. To disable this automatic behaviour use `options(raadfiles.file.cache.disable = TRUE)` *before* the package is used or loaded. This is typically done when calling `run_build_raad_cache()` in a cron task.
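A script run from cron might look roughly like the following. This is a sketch, not a recommended configuration: `rebuild_raad_cache()` is a hypothetical wrapper, and it assumes raadfiles is installed and the data roots are already configured for the user running the job.

```r
# Set the option first, before raadfiles is loaded in any way
# (including via `::` or requireNamespace()).
options(raadfiles.file.cache.disable = TRUE)

# Hypothetical wrapper (not part of raadfiles) for a scheduled rebuild.
rebuild_raad_cache <- function() {
  if (!requireNamespace("raadfiles", quietly = TRUE)) {
    stop("the raadfiles package is required")
  }
  # scan the root directories and update the file listing in each
  raadfiles::run_build_raad_cache()
}

# The cron script would end by calling the wrapper (not run here):
# rebuild_raad_cache()
```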
Every raadfiles file collection function (e.g. `oisst_daily_files`) runs `get_raad_filenames()` to obtain the full raw list of available files from the global in-memory option `getOption("raadfiles.filename.database")`. Each such call also has a small probability, set by the threshold option `raadfiles.file.refresh.threshold`, of triggering a re-read of the file listing from the root directories. To avoid this trigger, either use `getOption("raadfiles.filename.database")` directly to get the in-memory file list, or set `options(raadfiles.file.refresh.threshold = 0)`. (Set the threshold to 1 to force a re-read every time; a re-read can also be forced with `set_raad_filenames(clobber = TRUE)`.)
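The two ways of avoiding the refresh can be sketched with base R alone. The `should_refresh()` function below is purely illustrative: it is an assumption about how a probability threshold could gate a refresh, not the package's actual implementation.

```r
# Option 1: never trigger an automatic re-read of the file listing.
options(raadfiles.file.refresh.threshold = 0)

# Option 2: read the in-memory file list directly.
# This is NULL if no file cache has been loaded in this session.
files <- getOption("raadfiles.filename.database")

# Illustration only (hypothetical, not raadfiles code): a uniform random
# draw compared against the threshold refreshes with that probability.
should_refresh <- function(threshold) {
  stats::runif(1) < threshold
}

# runif() does not return exactly 0 or 1 here, so a threshold of 0
# never refreshes and a threshold of 1 always does.
never  <- any(replicate(1000, should_refresh(0)))
always <- all(replicate(1000, should_refresh(1)))
```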
There is a family of functions and global options used for administration.
Function | Description |
---|---|
set_raad_data_roots | set data root paths, for normal use only one data root is needed |
set_raad_filenames | runs the system to update the file listing and refresh it |
get_raad_data_roots | returns the current list of visible root directories |
get_raad_filenames | returns the entire list of all files found in visible root directories |
run_build_raad_cache | scan all root directories and update the file listing in each |

Option | Description |
---|---|
raadfiles.data.roots | the list of paths to root directories |
raadfiles.file.cache.disable | disable on-load setting of the in-memory file cache (never set automatically by the package) |
raadfiles.file.refresh.threshold | threshold probability of how often to refresh in-memory file cache (0 = never, 1 = every time `get_raad_filenames()` is called) |
Options used internally, and subject to control by administrator options and the running of admin functions (they may not be set).
Option | Description |
---|---|
raadfiles.filename.database | the data frame of all file names from the data roots |
raadfiles.database.status | a status record of the in-memory filename database (timestamp) |
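To check which of these options are currently set in a session, something like the following could be used. `raad_admin_options()` is a hypothetical helper, not part of raadfiles; it relies only on base R's `getOption()` and the option names listed on this page.

```r
# Hypothetical helper (not part of raadfiles): collect the documented
# options into a named list; unset options appear as NULL.
raad_admin_options <- function() {
  nms <- c(
    "raadfiles.data.roots",
    "raadfiles.file.cache.disable",
    "raadfiles.file.refresh.threshold",
    "raadfiles.filename.database",
    "raadfiles.database.status"
  )
  stats::setNames(lapply(nms, getOption), nms)
}

opts <- raad_admin_options()

# names of the options that are not set in this session
unset <- names(opts)[vapply(opts, is.null, logical(1))]
```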