Extending cardioStatsUSA with modules
library(cardioStatsUSA)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(tibble)
Modules
Definition: We define a module as a subpopulation of NHANES participants paired with a collection of variables. Each variable in the collection has designated roles. For example, hypertension can be an outcome and can also be used to define groups when analyzing another outcome. This module definition is general enough to allow for diverse extensions of the web application, but may not be specific enough to illustrate how the web application works. Therefore, in the current analysis, we define and present results from the “BP and hypertension” module, defined as follows.
The Blood Pressure and Hypertension Module
Subpopulation: Beginning with 107,622 US individuals who participated in NHANES cycles from 1999-2000 through 2017-March 2020, we restricted the subpopulation to adults ≥ 18 years of age. This exclusion was applied because statistics for hypertension and BP levels in children and adolescents are markedly different from those for adults. We further restricted the subpopulation to participants who completed the in-home interview and study examination, had one or more SBP and DBP measurements, and had data on self-reported antihypertensive medication use. After these exclusions were applied, the subpopulation included data from 56,035 participants (Figure S1).
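The exclusions above can be encoded as a 0/1 subpopulation indicator rather than by dropping rows, which is the same pattern as the `svy_subpop_sbp` variable derived later in this vignette. The sketch below is illustrative only: it uses a small made-up data frame, and the column names (`age_years`, `completed_exam`, `n_bp_readings`, `bp_med_reported`) and the indicator name `svy_subpop_toy` are hypothetical, not columns of `nhanes_data`.

```r
library(dplyr)
library(tibble)

# Toy data; all column names here are hypothetical, not taken from nhanes_data
toy <- tribble(
  ~id, ~age_years, ~completed_exam, ~n_bp_readings, ~bp_med_reported,
    1,         45,            TRUE,              3,             TRUE,
    2,         16,            TRUE,              3,             TRUE,
    3,         60,           FALSE,              0,             TRUE,
    4,         33,            TRUE,              0,             TRUE,
    5,         71,            TRUE,              2,               NA
)

toy_flagged <- toy %>%
  mutate(
    # 1 = in the subpopulation, 0 = excluded; rows are kept either way
    svy_subpop_toy = if_else(
      age_years >= 18 &
        completed_exam &
        n_bp_readings >= 1 &
        !is.na(bp_med_reported),
      1, 0
    )
  )

# Tally how many participants fall inside and outside the subpopulation
count(toy_flagged, svy_subpop_toy)
```

Flagging excluded participants instead of deleting them keeps the full sampling design available, which is the commonly recommended approach when estimating variances for a subpopulation of a complex survey.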
Variables: The primary variables from the BP and hypertension module are listed in Table 1, with full definitions provided in Supplemental Table 1. Briefly, mean SBP and DBP levels were computed over all available measurements for each participant. Oscillometric BP values were calibrated to the mercury device as described previously. Antihypertensive medication classes were defined using recommendations from the 2017 ACC/AHA BP guideline.
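As an illustration of averaging over all available measurements, the base R sketch below computes a per-participant mean from whichever readings are present. The data and the column names `sbp_1` to `sbp_3` are made up for this example, and the calibration step for oscillometric values is not included.

```r
# Toy readings; sbp_1 to sbp_3 are hypothetical column names
readings <- data.frame(
  id    = 1:3,
  sbp_1 = c(120, NA, NA),
  sbp_2 = c(124, 136, NA),
  sbp_3 = c(122, 140, NA)
)

sbp_cols <- c("sbp_1", "sbp_2", "sbp_3")

# Average whatever readings exist for each participant
readings$sbp_mean <- rowMeans(readings[, sbp_cols], na.rm = TRUE)

# rowMeans() returns NaN when every reading is missing; store that as NA
readings$sbp_mean[is.nan(readings$sbp_mean)] <- NA_real_

readings
```

A participant with no readings ends up with a missing mean, which is what later allows a subpopulation indicator to be derived from `is.na()`.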
Example
my_key <- tibble::tribble(
  ~class, ~variable, ~label, ~type, ~outcome, ~group, ~subset, ~stratify, ~module, ~description,
  "Survey", "svy_id", "participant identifier", "svy", FALSE, FALSE, FALSE, FALSE, "none", NA_character_,
  "Survey", "svy_psu", "primary sampling unit", "svy", FALSE, FALSE, FALSE, FALSE, "none", NA_character_,
  "Survey", "svy_strata", "strata", "svy", FALSE, FALSE, FALSE, FALSE, "none", NA_character_,
  "Survey", "svy_weight_mec", "Mobile examination center weights", "svy", FALSE, FALSE, FALSE, FALSE, "none", NA_character_,
  "Survey", "svy_subpop_sbp", "Subpopulation for the SBP module", "svy", FALSE, FALSE, FALSE, FALSE, "none", NA_character_,
  "Survey", "svy_year", "NHANES cycle", "time", FALSE, FALSE, FALSE, FALSE, "none", NA_character_,
  "SBP", "sbp_mmhg", "SBP, mm Hg", "ctns", TRUE, TRUE, TRUE, TRUE, "sbp", "Systolic blood pressure in mm Hg",
  "SBP", "sbp_cat", "SBP category", "catg", TRUE, TRUE, TRUE, TRUE, "sbp", "Systolic blood pressure categories",
  "SBP", "sbp_gteq_130", "SBP >= 130 mm Hg", "bnry", TRUE, TRUE, TRUE, TRUE, "sbp", "Systolic blood pressure in hypertensive range?"
)
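To see which roles a key assigns, the logical role columns can be filtered and counted directly. The following is a minimal sketch using a cut-down toy key (`toy_key`, a hypothetical name) that carries only a subset of the columns in `my_key`.

```r
library(dplyr)
library(tibble)

# Toy key with a subset of the columns used in my_key
toy_key <- tribble(
  ~variable,  ~type,  ~outcome, ~group, ~subset, ~stratify,
  "svy_year", "time", FALSE,    FALSE,  FALSE,   FALSE,
  "sbp_mmhg", "ctns", TRUE,     TRUE,   TRUE,    TRUE,
  "sbp_cat",  "catg", TRUE,     TRUE,   TRUE,    TRUE
)

# Variables that may be selected as an outcome
outcome_vars <- toy_key %>%
  filter(outcome) %>%
  pull(variable)

# Number of variables assigned to each role
role_counts <- toy_key %>%
  summarise(across(c(outcome, group, subset, stratify), sum))

outcome_vars
role_counts
```

The same two pipelines could be pointed at `my_key` to confirm that the survey design variables carry no analytic roles while the SBP variables carry all four.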
Create the data that will be used in the new module
my_nhanes_init <- nhanes_data %>%
  as_tibble() %>%
  select(svy_id,
         svy_psu,
         svy_strata,
         svy_weight_mec,
         svy_year,
         sbp_mmhg = bp_sys_mean)
Derive variables to be used in your module
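my_nhanes_derived <- my_nhanes_init %>%
  mutate(
    # 1 = participant has a mean SBP value, 0 = excluded from the subpopulation
    svy_subpop_sbp = if_else(is.na(sbp_mmhg), 0, 1),
    # case_when() checks conditions in order, so "< 130 mm Hg" covers 120 to
    # < 130 and "< 140 mm Hg" covers 130 to < 140; missing SBP stays NA
    sbp_cat = case_when(
      sbp_mmhg < 120 ~ "< 120 mm Hg",
      sbp_mmhg < 130 ~ "< 130 mm Hg",
      sbp_mmhg < 140 ~ "< 140 mm Hg",
      sbp_mmhg >= 140 ~ ">= 140 mm Hg"
    ),
    # Binary outcome matching the "bnry" type declared in the key
    sbp_gteq_130 = if_else(sbp_mmhg >= 130, "Yes", "No")
  )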
nhanes_summarize(data = my_nhanes_derived,
                 key = my_key,
                 outcome_variable = 'sbp_mmhg')
#> svy_year statistic estimate std_error ci_lower ci_upper n_obs
#> <fctr> <char> <num> <num> <num> <num> <int>
#> 1: 1999-2000 mean 122.8637 0.7063396 121.4793 124.2481 4805
#> 2: 2001-2002 mean 122.5326 0.4679821 121.6153 123.4498 5290
#> 3: 2003-2004 mean 122.7715 0.5106348 121.7707 123.7723 4943
#> 4: 2005-2006 mean 122.3453 0.4492387 121.4648 123.2258 5057
#> 5: 2007-2008 mean 121.6019 0.3776369 120.8617 122.3420 5700
#> 6: 2009-2010 mean 120.5063 0.4843604 119.5569 121.4556 6072
#> 7: 2011-2012 mean 121.6048 0.6520077 120.3269 122.8827 5356
#> 8: 2013-2014 mean 121.4424 0.3140021 120.8270 122.0579 5716
#> 9: 2015-2016 mean 123.3737 0.4620068 122.4682 124.2792 5571
#> 10: 2017-2020 mean 123.1444 0.3692445 122.4207 123.8681 8024
#> unreliable_status unreliable_reason review_needed review_reason
#> <lgcl> <char> <lgcl> <char>
#> 1: FALSE <NA> FALSE <NA>
#> 2: FALSE <NA> FALSE <NA>
#> 3: FALSE <NA> FALSE <NA>
#> 4: FALSE <NA> FALSE <NA>
#> 5: FALSE <NA> FALSE <NA>
#> 6: FALSE <NA> FALSE <NA>
#> 7: FALSE <NA> FALSE <NA>
#> 8: FALSE <NA> FALSE <NA>
#> 9: FALSE <NA> FALSE <NA>
#> 10: FALSE <NA> FALSE <NA>
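Before launching the application, it may be worth confirming that every variable named in the key is present in the data. This check is not part of cardioStatsUSA; it is a base R sketch with a hypothetical helper name (`missing_from_data`) and toy inputs standing in for `my_key` and `my_nhanes_derived`.

```r
# Hypothetical helper: key variables that have no matching column in the data
missing_from_data <- function(data, key) {
  setdiff(key$variable, names(data))
}

# Toy stand-ins for my_key and my_nhanes_derived
toy_key <- data.frame(
  variable = c("svy_id", "sbp_mmhg", "sbp_cat"),
  stringsAsFactors = FALSE
)
toy_data <- data.frame(svy_id = 1:2, sbp_mmhg = c(118, 142))

# sbp_cat is listed in the toy key but was never derived in the toy data
missing_from_data(toy_data, toy_key)
```

With the objects from this vignette, the call would be `missing_from_data(my_nhanes_derived, my_key)`; an empty character vector means the key and the data agree.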
Now you can run the application locally with your customized data using:
app_run(nhanes_data = my_nhanes_derived, nhanes_key = my_key)