---
title: "Basic HLI queries"
description: "This vignette shows the functions for querying the CAX tables, with a focus on the HLI indicators for Endangered Species Act Listed ESUs/DPSs."
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Basic HLI queries}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

This vignette focuses on the Coordinated Assessment Program (CAP) standardized high-level indicators (HLI) tables. See the CAP [HLI](https://www.streamnet.org/cap/current-hli/) webpage for details on each indicator. The tables returned in this vignette contain the same data returned by the [CAP Fish HLIs Tabular Query](https://www.streamnet.org/data/hli/?index=1&perpage=10) GUI. Note that CAP includes other datasets (tables), and {rCAX} will return these as well. See the [CAX Datasets](CAX_Tables.html) vignette.

## Getting started

See the [Getting Started vignette](setup.html) for instructions on installing {rCAX}. Once you have installed {rCAX}, begin by loading the library.

```{r setup}
library(rCAX)
```

Read the terms of use:

```
rcax_termsofuse()
```

## HLI queries

The main user function is `rcax_hli()`, which returns HLI tables from the Coordinated Assessments data eXchange (CAX), with metadata such as NMFS_PopID, as returned from the [CAP Fish HLIs Tabular Query](https://www.streamnet.org/data/hli/) or [CAX](https://cax.streamnet.org). The basic `rcax_hli()` functionality is shown with the NOSA HLI, but the syntax is the same for all the HLI tables:

* NOSA: Natural Origin Spawner Abundance
* SAR: Smolt to Adult Ratios
* PNI: Proportionate Natural Influence of supplementation hatcheries
* RperS: Recruits per Spawner
* JuvOut: Juvenile Outmigrants
* PreSmolt: Presmolt Abundance

## Show the columns for the NOSA table

Get the table and show the column names with their definitions. Only the first six rows (the `head()` default) are shown.

```{r}
head(rcax_hli("NOSA", type="colnames"))
```

## Get records for the NOSA HLI table

Here the columns returned are restricted by `cols`. The table is filtered with `flist` to just the records with `nmfs_popid` equal to 7. Note that the `cols` argument is case insensitive (`NMFS_PopID` and `nmfs_popid` are the same), but the column names in the returned tables will all be lower case.

```{r}
tab <- rcax_hli("NOSA",
                flist = list(nmfs_popid = 7),
                cols=c("nmfs_popid", "spawningyear", "tsaej", "nosaej"))
head(tab)
```

Return data for a single ESU. The ESU/DPS names must be exact and are case sensitive. Use `rCAX:::caxesu` to see the ESU/DPS names.

```{r}
tab <- rcax_hli("NOSA",
                flist = list(esu_dps = "Salmon, chum (Columbia River ESU)")
                )
```

We can then plot:

```{r}
library(ggplot2)
# Convert tsaej to a number
tab$tsaej <- as.numeric(tab$tsaej)
# plot
ggplot(
  subset(tab, spawningyear>2000),
  aes(x=spawningyear, y=log(tsaej), color=waterbody)) +
  geom_line(na.rm = TRUE) +
  ggtitle("log(total spawners)")
```

Keep in mind that not all ESUs or DPSs are in the CAX database for each HLI, nor are all populations for each ESU or DPS. Search at https://www.streamnet.org/data/hli/ to quickly see what is available in the different HLI tables.

## Filtering

Filtering means querying for records in which particular columns have specific values. Filters are specified via the `flist` argument, which is a list of the columns and the values you want to filter on. A query with the `flist` argument takes this form:

```
tab <- rcax_hli("NOSA", flist = list(...))
```

### Single value

Filter based on one value. These examples show how you might specify the `flist` argument in an `rcax_hli()` call:

```
flist = list(esu_dps = "Steelhead (Middle Columbia River DPS)")
flist = list(popid = 7)
```

For example, you could use

```
tab <- rcax_hli("NOSA", flist = list(popid = 7))
```

to retrieve only data for popid 7.

### Multiple values

You can also filter based on multiple values.
In this case, data with popid 7, 8 or 9 are returned.

```
flist = list(popid = c(7,8,9))
```

Filter based on two columns. Here we get the summer-run data for one ESU. The values in `flist` are not case sensitive, so "Summer" will return both "Summer" and "summer".

```
flist = list(esu_dps = "Salmon, Chinook (Snake River spring/summer-run ESU)", run = c("Summer"))
```

Unfortunately, there seems to be a server-side problem with passing multiple values together with multiple columns. This works:

```
flist = list(run = c("Summer", "Spring"))
```

But this throws an error:

```
flist = list(esu_dps = "Salmon, Chinook (Snake River spring/summer-run ESU)", run = c("Summer", "Spring"))
```

## Change the number of records returned

The default maximum number of records is 1000. You can increase (or decrease) this by passing the `limit` query parameter in the `qlist` argument.

```{r}
tab <- rcax_hli("NOSA",
                qlist = list(limit=1),
                cols=c("popid", "spawningyear", "tsaej"))
tab
```

Increase the limit to 2000 so that a query matching more than 1000 records is not cut off at the default. This chunk is not run.

```{r eval=FALSE}
tab <- rcax_hli("NOSA",
                flist = list(esu_dps="Salmon, Chinook (Snake River spring/summer-run ESU)"),
                qlist = list(limit=2000),
                cols=c("popid", "spawningyear", "tsaej"))
tab
```

## Show the available tables

Only the name and id columns are shown.

```{r}
tab <- rcax_datasets(cols=c("name", "id"))
head(tab)
```

## Show internal data

These are internal data sets. Access them with `rCAX:::`.

* `caxesu` The ESU and DPS names, which appear in the `esu_dps` column in tables.
* `caxpops` The Populations table with all the population metadata, like NMFS_PopID and MPG.
* `caxsuperpops` The SuperPopulations table with the metadata describing which populations are included in each superpopulation. These are used when the data, e.g. Genetic Stock Identification, cannot separate data to the population level.

```{r}
rCAX:::caxesu[1:5]
```

```{r}
colnames(rCAX:::caxpops)
```
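
Because the ESU/DPS names used in `flist` must be exact, it can help to search the name list rather than type a name by hand. The following is a minimal sketch, not part of {rCAX}: `find_esu()` is a hypothetical helper, and it assumes that `rCAX:::caxesu` is a character vector of names. A short stand-in vector, built from names quoted in this vignette, is used so that the example runs without the package or a network connection.

```r
# Hypothetical helper (not part of rCAX): return the exact ESU/DPS names
# that contain a search term, ignoring case.
find_esu <- function(pattern, esu_names) {
  # Lower-case both sides and match the term literally (fixed = TRUE),
  # so parentheses and slashes in the names are not treated as regex syntax.
  hits <- grepl(tolower(pattern), tolower(esu_names), fixed = TRUE)
  esu_names[hits]
}

# Stand-in for rCAX:::caxesu (assumed to be a character vector),
# using names that appear in this vignette.
example_names <- c(
  "Salmon, chum (Columbia River ESU)",
  "Steelhead (Middle Columbia River DPS)",
  "Salmon, Chinook (Snake River spring/summer-run ESU)"
)

chum_esu <- find_esu("CHUM", example_names)
chinook_esu <- find_esu("chinook", example_names)
```

With the package installed, the same call would be `find_esu("chum", rCAX:::caxesu)`, and a single matching name could then be passed on as `flist = list(esu_dps = chum_esu)`.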