R/cansim_parquet.R
get_cansim_connection.Rd
Retrieves a data table by its NDM catalogue number as a parquet, feather, or SQLite database connection. Retrieved table data is cached permanently if a cache path is supplied; otherwise it is cached for the duration of the current R session. If the table is already cached, the function checks whether a newer version is available and emits a warning if the cached table is out of date.
The NDM table number to load.
`"en"` or `"english"` for English and `"fr"` or `"french"` for French language versions (defaults to English).
(Optional) The format of the data table to retrieve. One of `"parquet"`, `"feather"`, or `"sqlite"` (default is `"parquet"`).
(Optional) Partition columns to use for parquet or feather formats.
(Optional) Valid options are `FALSE` (the default), `TRUE`, and `"auto"`. When set to `TRUE`, forces a reload of the data table. When set to `"auto"`, refreshes the table by downloading the newest version from StatCan if the cached table is out of date. When set to `FALSE` and the table is out of date, a warning is emitted to alert the user that the data is outdated.
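The three refresh options can be summarised as a small decision table. The sketch below is illustrative only: `refresh_action()` is a hypothetical helper that is not part of the package, and the returned labels are invented names for the behaviours described above.

```r
# Hypothetical helper (not part of cansim): maps the refresh option and
# the state of the cached table to the behaviour described in the text.
refresh_action <- function(refresh, out_of_date) {
  if (isTRUE(refresh)) {
    return("reload")                       # TRUE always forces a reload
  }
  if (identical(refresh, "auto")) {
    # "auto" re-downloads only when the cached copy is stale
    return(if (out_of_date) "reload" else "use_cache")
  }
  # FALSE (the default): keep the cached copy, warn if it is stale
  if (out_of_date) "warn_and_use_cache" else "use_cache"
}

refresh_action(FALSE, out_of_date = TRUE)
refresh_action("auto", out_of_date = TRUE)
```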
(Optional) Timeout in seconds for downloading cansim table to work around scenarios where StatCan servers drop the network connection.
(Optional) Path where the table is cached permanently. By default, the data is cached in the path specified by `getOption("cansim.cache_path")`, if this is set. Otherwise `tempdir()` is used, so the cache lasts only for the current R session.
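The fallback described above can be sketched in base R. This is a minimal illustration under assumptions, not the package's actual implementation: `resolve_cache_path()` is a hypothetical helper, and the `cansim_cache` directory name is invented for the example.

```r
# Hypothetical helper (not part of cansim): mirrors the documented
# fallback from the cansim.cache_path option to tempdir().
resolve_cache_path <- function(cache_path = getOption("cansim.cache_path")) {
  if (is.null(cache_path)) tempdir() else cache_path
}

# Temporarily clear the option, remembering its previous value.
old <- options(cansim.cache_path = NULL)
session_path <- resolve_cache_path()       # option unset: tempdir() is used

# Set the option to an example directory (assumed name).
options(cansim.cache_path = file.path(tempdir(), "cansim_cache"))
permanent_path <- resolve_cache_path()     # option set: its value is used

options(old)                               # restore the previous setting
```

Setting `options(cansim.cache_path = ...)` once, for example in a startup file, lets the cache persist across sessions without passing a path on every call.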
A database connection to a local parquet, feather, or SQLite database containing the StatCan table data. The data frames obtained by calling `collect()` or `collect_and_normalize()` on the connection are identical up to possibly different row order.
if (FALSE) {
library(cansim)
library(dplyr) # provides glimpse()

con <- get_cansim_connection("34-10-0013")
# Work with the data connection
glimpse(con)
}