Extending cardioStatsUSA with modules

library(cardioStatsUSA)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(tibble)

Modules

Definition: We define a module as a subpopulation of NHANES participants paired with a collection of variables. Each variable in the collection has designated roles. For example, hypertension can be an outcome and can also be used to define groups when analyzing another outcome. This definition is general enough to allow for diverse extensions of the web application, but may not be specific enough to illustrate how the web application works. Therefore, in the current analysis, we present results from the “BP and hypertension” module, defined as follows.

The Blood Pressure and Hypertension Module

Subpopulation: Beginning with 107,622 US individuals who participated in NHANES cycles from 1999-2000 through 2017-March 2020, we restricted the subpopulation to adults ≥ 18 years of age. This exclusion was applied because hypertension statistics and BP levels in children and adolescents differ markedly from those in adults. We further restricted the subpopulation to participants who completed the in-home interview and study examination, had one or more SBP and DBP measurements, and had data on self-reported antihypertensive medication use. After these exclusions were applied, the subpopulation included data from 56,035 participants (Figure S1).
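
The sequence of restrictions described above can be sketched on made-up data. This is only an illustration in base R: the data frame and its column names (age, examined, n_bp_readings, meds_reported) are hypothetical and are not the variables used in NHANES or in nhanes_data. It shows how each restriction builds on the previous one, so the running counts can be reported as in a flow diagram.

```r
# Toy data for five hypothetical participants (all columns are assumptions).
toy <- data.frame(
  age           = c(12, 25, 40, 67, 33),
  examined      = c(TRUE, TRUE, FALSE, TRUE, TRUE),
  n_bp_readings = c(3, 2, 0, 0, 1),
  meds_reported = c(TRUE, TRUE, TRUE, TRUE, NA)
)

# Apply each restriction on top of the previous one.
keep_adult <- toy$age >= 18
keep_exam  <- keep_adult & toy$examined
keep_bp    <- keep_exam & toy$n_bp_readings >= 1
keep_meds  <- keep_bp & !is.na(toy$meds_reported)

# Number of participants remaining after each step.
counts <- c(
  start     = nrow(toy),
  adults    = sum(keep_adult),
  examined  = sum(keep_exam),
  with_bp   = sum(keep_bp),
  with_meds = sum(keep_meds)
)
counts
```

Keeping the restrictions as logical vectors, rather than dropping rows, leaves the full data available, which matters for survey analyses where the subpopulation is typically flagged rather than filtered.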

Variables: The primary variables from the BP and hypertension module are listed in Table 1, with full definitions provided in Supplemental Table 1. Briefly, mean SBP and DBP levels were computed over all available measurements for each participant. Oscillometric BP values were calibrated to the mercury device as described previously. Antihypertensive medication classes were defined using recommendations from the 2017 ACC/AHA BP guideline.
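
Averaging over all available measurements for each participant can be sketched as follows. This is a minimal base R illustration on invented readings; the column names sbp_1 to sbp_3 are hypothetical, and the calibration of oscillometric values is not shown.

```r
# Toy readings for three hypothetical participants; NA marks a missing reading.
bp_readings <- data.frame(
  id    = c(1, 2, 3),
  sbp_1 = c(118, 142, NA),
  sbp_2 = c(122, NA, NA),
  sbp_3 = c(120, 138, NA)
)

reading_cols <- c("sbp_1", "sbp_2", "sbp_3")

# Average over whichever readings are available for each participant.
bp_readings$sbp_mean <- rowMeans(bp_readings[, reading_cols], na.rm = TRUE)

# rowMeans() returns NaN when every reading is missing; recode those to NA.
bp_readings$sbp_mean[is.nan(bp_readings$sbp_mean)] <- NA

bp_readings
```

A participant with no readings ends up with a missing mean, which is the kind of record the subpopulation restriction above would exclude.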

Example

my_key <- tibble::tribble(
   ~class,        ~variable,                              ~label,  ~type, ~outcome, ~group, ~subset, ~stratify, ~module, ~description,
 "Survey",         "svy_id",            "participant identifier",  "svy",    FALSE,  FALSE,   FALSE,     FALSE,  "none",  NA_character_,
 "Survey",        "svy_psu",             "primary sampling unit",  "svy",    FALSE,  FALSE,   FALSE,     FALSE,  "none",  NA_character_,
 "Survey",     "svy_strata",                            "strata",  "svy",    FALSE,  FALSE,   FALSE,     FALSE,  "none",  NA_character_,
 "Survey", "svy_weight_mec", "Mobile examination center weights",  "svy",    FALSE,  FALSE,   FALSE,     FALSE,  "none",  NA_character_,
 "Survey", "svy_subpop_sbp",  "Subpopulation for the SBP module",  "svy",    FALSE,  FALSE,   FALSE,     FALSE,  "none",  NA_character_,
 "Survey",       "svy_year",                      "NHANES cycle", "time",    FALSE,  FALSE,   FALSE,     FALSE,  "none",  NA_character_,
    "SBP",       "sbp_mmhg",                        "SBP, mm Hg", "ctns",     TRUE,   TRUE,    TRUE,      TRUE,   "sbp",  "Systolic blood pressure in mm Hg",
    "SBP",        "sbp_cat",                      "SBP category", "catg",     TRUE,   TRUE,    TRUE,      TRUE,   "sbp",  "Systolic blood pressure categories",
    "SBP",   "sbp_gteq_130",                  "SBP >= 130 mm Hg", "bnry",     TRUE,   TRUE,    TRUE,      TRUE,   "sbp",  "Systolic blood pressure in hypertensive range?"
 )
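
Each logical column in a key like this marks a role that a variable may play. As a rough sketch of how those roles can be read back out, the following base R snippet uses a small stand-in key built as a plain data frame (not my_key itself, and not any cardioStatsUSA function) and lists the variables eligible for each role.

```r
# Stand-in key with a subset of the columns used above (toy example).
toy_key <- data.frame(
  variable = c("svy_year", "sbp_mmhg", "sbp_cat"),
  type     = c("time", "ctns", "catg"),
  outcome  = c(FALSE, TRUE, TRUE),
  group    = c(FALSE, TRUE, TRUE),
  stringsAsFactors = FALSE
)

# For each role column, collect the variables flagged TRUE.
role_cols <- c("outcome", "group")
roles <- lapply(role_cols, function(role) toy_key$variable[toy_key[[role]]])
names(roles) <- role_cols

roles
```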

Create the data that will be used in the new module


my_nhanes_init <- nhanes_data %>%
 as_tibble() %>%
 select(svy_id,
        svy_psu,
        svy_strata,
        svy_weight_mec,
        svy_year,
        sbp_mmhg = bp_sys_mean)

Derive variables to be used in your module


my_nhanes_derived <- my_nhanes_init %>%
 mutate(
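  # Flag participants with a non-missing SBP value as the module subpopulation
  svy_subpop_sbp = if_else(is.na(sbp_mmhg), 0, 1),
  # Conditions are evaluated in order, so each label covers the range from
  # the previous cutoff up to its own; a missing SBP yields a missing category
  sbp_cat = case_when(
   sbp_mmhg <  120 ~ "< 120 mm Hg",
   sbp_mmhg <  130 ~ "< 130 mm Hg",
   sbp_mmhg <  140 ~ "< 140 mm Hg",
   sbp_mmhg >= 140 ~ ">= 140 mm Hg"
  ),
  # Binary indicator; missing SBP stays missing
  sbp_gteq_130 = if_else(sbp_mmhg >= 130, "Yes", "No")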
 )
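
Before summarizing, it can help to confirm that every variable named in the key exists in the derived data. The helper below, check_key_vars(), is a hypothetical convenience written for this illustration and is not part of cardioStatsUSA; it is shown on toy objects so it runs on its own.

```r
# Hypothetical helper: return key variables that are absent from the data.
check_key_vars <- function(data, key) {
  setdiff(key$variable, names(data))
}

# Toy data and key; "sbp_cat" is deliberately left out of the data.
toy_data <- data.frame(svy_id = 1:3, sbp_mmhg = c(118, 131, NA))
toy_key  <- data.frame(
  variable = c("svy_id", "sbp_mmhg", "sbp_cat"),
  stringsAsFactors = FALSE
)

missing_vars <- check_key_vars(toy_data, toy_key)
missing_vars
```

With the objects from this article, the equivalent call would be check_key_vars(my_nhanes_derived, my_key); an empty result means the key and the data agree on variable names.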

nhanes_summarize(data = my_nhanes_derived,
                 key = my_key,
                 outcome_variable = 'sbp_mmhg')
#>      svy_year statistic estimate std_error ci_lower ci_upper n_obs
#>        <fctr>    <char>    <num>     <num>    <num>    <num> <int>
#>  1: 1999-2000      mean 122.8637 0.7063396 121.4793 124.2481  4805
#>  2: 2001-2002      mean 122.5326 0.4679821 121.6153 123.4498  5290
#>  3: 2003-2004      mean 122.7715 0.5106348 121.7707 123.7723  4943
#>  4: 2005-2006      mean 122.3453 0.4492387 121.4648 123.2258  5057
#>  5: 2007-2008      mean 121.6019 0.3776369 120.8617 122.3420  5700
#>  6: 2009-2010      mean 120.5063 0.4843604 119.5569 121.4556  6072
#>  7: 2011-2012      mean 121.6048 0.6520077 120.3269 122.8827  5356
#>  8: 2013-2014      mean 121.4424 0.3140021 120.8270 122.0579  5716
#>  9: 2015-2016      mean 123.3737 0.4620068 122.4682 124.2792  5571
#> 10: 2017-2020      mean 123.1444 0.3692445 122.4207 123.8681  8024
#>     unreliable_status unreliable_reason review_needed review_reason
#>                <lgcl>            <char>        <lgcl>        <char>
#>  1:             FALSE              <NA>         FALSE          <NA>
#>  2:             FALSE              <NA>         FALSE          <NA>
#>  3:             FALSE              <NA>         FALSE          <NA>
#>  4:             FALSE              <NA>         FALSE          <NA>
#>  5:             FALSE              <NA>         FALSE          <NA>
#>  6:             FALSE              <NA>         FALSE          <NA>
#>  7:             FALSE              <NA>         FALSE          <NA>
#>  8:             FALSE              <NA>         FALSE          <NA>
#>  9:             FALSE              <NA>         FALSE          <NA>
#> 10:             FALSE              <NA>         FALSE          <NA>

You can now run the application locally with your customized data using app_run(nhanes_data = my_nhanes_derived, nhanes_key = my_key).