This vignette details how to use the internal table search functions in the cansim package with a simple example using employment data for economic regions in British Columbia.

A note on caching and refreshing table lists

Because the table search function requires a full scrape of Statistics Canada’s data repository webpages, generating the list of tables can be quite slow, so a saved list is included with the package. As Statistics Canada adds tables and data products, the bundled list will go out of date and need refreshing. Refresh it by specifying refresh=TRUE when calling list_cansim_tables. The full list can also be cached locally to avoid delays and unnecessary web scraping. This can (and should) be enabled by setting options(cache_path="your cache path") so that table information is cached across R sessions.
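As a minimal sketch of the caching and refresh settings described above: the cache directory below is a hypothetical location chosen for illustration, and the run_online flag is an assumption added here so the snippet does not contact Statistics Canada unless asked to. It also assumes the installed version of cansim exports list_cansim_tables with the refresh argument as described.

```r
# Hypothetical cache location for illustration. tempdir() is cleared when R
# exits, so a real setup would point at a persistent, writable directory.
cache_dir <- file.path(tempdir(), "cansim_cache")
dir.create(cache_dir, showWarnings = FALSE, recursive = TRUE)

# Tell the package where to keep cached table information.
options(cache_path = cache_dir)

# Assumption: a manual switch so the slow web scrape only runs on request.
run_online <- FALSE
if (run_online && requireNamespace("cansim", quietly = TRUE)) {
  # Rebuild the table list instead of using the copy shipped with the package.
  tables <- cansim::list_cansim_tables(refresh = TRUE)
}
```

Placing the options() call in a project or user .Rprofile is one way to have the setting apply to every session.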

Listing and filtering tables

Calling list_cansim_tables returns a data frame with useful metadata for available tables. Each table has 21 metadata fields, including titles in English and French, keyword sets, notes, and table numbers.

To find the appropriate table, subset or filter this data frame on the metadata fields of interest, such as the title.
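The filtering step might look like the sketch below. To keep it runnable offline, it uses a small made-up data frame in place of the output of list_cansim_tables. The column names title and cansim_table_number are assumptions about the metadata fields, and the rows are placeholders, not real Statistics Canada tables.

```r
# Placeholder stand-in for the data frame returned by list_cansim_tables().
# Table numbers and titles are invented for illustration only.
tables <- data.frame(
  cansim_table_number = c("00-00-0001", "00-00-0002", "00-00-0003"),
  title = c(
    "Labour force characteristics by economic region (placeholder A)",
    "Labour force characteristics by province (placeholder B)",
    "Retail trade sales by economic region (placeholder C)"
  ),
  stringsAsFactors = FALSE
)

# Keep rows whose title mentions both phrases of interest.
is_labour_force <- grepl("Labour force characteristics", tables$title, fixed = TRUE)
is_econ_region  <- grepl("economic region", tables$title, fixed = TRUE)
matches <- tables[is_labour_force & is_econ_region,
                  c("cansim_table_number", "title")]

print(matches)
```

The same logical conditions could be passed to dplyr::filter for those who prefer a pipeline style.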

The search came up with two tables. In this example we are interested in the unemployment rate for 2015 onwards for the Lower Mainland, Vancouver Island, and Okanagan economic regions from the Labour Force Characteristics table. We use the tidyr package here to reshape data from a long format to a wider format.
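The reshaping step can be sketched as follows, assuming tidyr 1.0.0 or later for pivot_wider. The data frame is a made-up stand-in for the downloaded table: the region label and values are placeholders, and the column names Statistics and VALUE are assumptions about how the long-format data is laid out.

```r
library(tidyr)

# Placeholder long-format data: one row per date, region, and statistic.
# Values are invented for illustration and are not real estimates.
long <- data.frame(
  Date = as.Date(rep(c("2014-12-01", "2015-01-01", "2015-02-01"), each = 2)),
  GEO = "Region A (placeholder)",
  Statistics = rep(c("Estimate", "Standard error of estimate"), times = 3),
  VALUE = c(0.061, 0.004, 0.060, 0.004, 0.058, 0.004),
  stringsAsFactors = FALSE
)

# Keep observations from 2015 onwards.
recent <- long[long$Date >= as.Date("2015-01-01"), ]

# Spread the statistics into their own columns: one row per date and region.
wide <- pivot_wider(recent, names_from = Statistics, values_from = VALUE)

print(wide)
```

After reshaping, the estimate and its standard error sit side by side in columns named Estimate and Standard error of estimate, which is the layout a ribbon plot needs.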

We can then visualize the results with ggplot2.

library(ggplot2)
# `data` and `selected_table` are assumed to come from the search and reshaping steps described above
ggplot(data, aes(x = Date, y = Estimate, group = GEO)) +
  # Shaded band of one standard error on either side of the estimate
  geom_ribbon(aes(ymin = Estimate - `Standard error of estimate`,
                  ymax = Estimate + `Standard error of estimate`, fill = ""),
              alpha = 0.8) +
  geom_line(aes(color = GEO)) +
  scale_y_continuous(labels = scales::percent) +
  scale_fill_manual(name = "", values = "grey80", labels = "Standard error") +
  theme_bw() +
  labs(title = "Comparison of unemployment rate by economic region",
       y = "Unemployment Rate",
       x = "",
       color = "",
       caption = paste0("CANSIM ", selected_table))