Visualize summaries of NHANES

Usage

nhanes_visualize(
  data,
  key,
  outcome_variable,
  outcome_quantiles = NULL,
  outcome_stats = NULL,
  group_variable = NULL,
  group_cut_n = NULL,
  group_cut_type = NULL,
  stratify_variable = NULL,
  time_variable = "svy_year",
  time_values = NULL,
  pool = FALSE,
  subset_calls = list(),
  standard_variable = "demo_age_cat",
  standard_weights = NULL,
  statistic_primary = NULL,
  title = NULL,
  geom = "bar",
  reorder_cats = FALSE,
  width = NULL,
  height = NULL,
  size_point = NULL,
  size_error = NULL
)

Arguments

data

[data.frame] A set of NHANES data with one row per survey participant and one column per variable. See Details for specific requirements and nhanes_data for an example.

key

[data.frame] A data set with one row per variable and with columns that describe the variable. See Details for specific requirements and nhanes_key for an example.

outcome_variable

[character(1)] The name of the outcome variable to be summarized.

outcome_quantiles

[numeric(1+)] The quantiles to be summarized for a continuous outcome. The default is c(0.25, 0.50, 0.75). For example,

  • outcome_quantiles = c(.5) will compute the 50th percentile (i.e., the median)

  • outcome_quantiles = c(.25, .5, .75) will compute the 25th, 50th, and 75th percentiles.

  • outcome_quantiles = seq(.1, .9, by = .1) will compute every 10th percentile, except for the 0th and 100th.
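
To make the three settings above concrete, the sketch below passes the same probability vectors to base R's quantile() on a made-up numeric vector. It ignores survey weights, so it only illustrates which percentiles each setting requests; it is not how nhanes_visualize computes its estimates. The vector sbp is hypothetical.

```r
# Hypothetical outcome values (not NHANES data); unweighted for simplicity
sbp <- c(102, 110, 115, 118, 121, 124, 128, 131, 137, 145, 160)

# Median only
q_median <- quantile(sbp, probs = c(.5))

# 25th, 50th, and 75th percentiles
q_quartiles <- quantile(sbp, probs = c(.25, .5, .75))

# Every 10th percentile, excluding the 0th and 100th
q_deciles <- quantile(sbp, probs = seq(.1, .9, by = .1))
```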

outcome_stats

[character(1+)]

The statistics that should be computed. Multiple statistics may be requested. Valid options depend on the type of outcome to be summarized. For continuous outcomes, valid options include

  • 'mean': estimates the mean value of the outcome

  • 'quantile': estimates the 25th, 50th, and 75th percentiles of the outcome.

For categorical outcomes, valid options include

  • 'percentage': estimates the prevalence of the outcome

  • 'percentage_kg': estimates the prevalence and uses Korn and Graubard's method to estimate a 95% confidence interval

  • 'count': estimates the number of US adults with the outcome.

group_variable

[character(1)] The name of the group variable. See Details for a description of the group variable and the stratify variable.

group_cut_n

[integer(1)] The number of groups to form using the group variable. This is only relevant if the group variable is continuous, and can be omitted. The default is 3.

group_cut_type

[character(1)] The method used to create groups with the group variable. This is only relevant if the group variable is continuous, and can be omitted. Valid options are:

  • "interval": equal interval width, e.g., three groups with ages of 0 to <10, 10 to <20, and 20 to < 30 years.

  • "frequency": equal frequency, e.g., three groups with ages of 0 to <q, q to <p, and p to <r, where q, p, and r are selected so that roughly the same number of people are in each group.
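
The difference between the two options can be sketched with base R's cut(). This is an illustration of equal-width versus equal-frequency binning in general, not the package's own implementation; the age vector and the object names are hypothetical.

```r
# Hypothetical ages (not NHANES data)
age <- c(1, 3, 4, 6, 8, 9, 12, 15, 22, 29)
n_groups <- 3

# "interval": breaks are equally spaced across the range of the variable
breaks_interval <- seq(min(age), max(age), length.out = n_groups + 1)
groups_interval <- cut(age, breaks = breaks_interval,
                       include.lowest = TRUE, right = FALSE)

# "frequency": breaks are quantiles, so group sizes are roughly equal
breaks_frequency <- quantile(age, probs = seq(0, 1, length.out = n_groups + 1))
groups_frequency <- cut(age, breaks = breaks_frequency,
                        include.lowest = TRUE, right = FALSE)

# Compare group sizes under the two methods
table(groups_interval)
table(groups_frequency)
```

With skewed data like this, equal-width groups can end up very unbalanced, while quantile-based groups stay close in size.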

stratify_variable

[character(1)] The name of the stratify variable. See Details for a description of the group variable and the stratify variable.

time_variable

[character(1)] The name of the time variable. The default, svy_year, corresponds to the variable in nhanes_data that indicates the 2-year NHANES cycle in which an observation was collected.

time_values

[character(1+)] The time values that will be included in this design object. The default is to include all time values present in data. Valid options are:

  • 'most_recent': includes the most recent time value.

  • 'last_5': includes the 5 most recent time values.

  • 'all': includes all time values present in data.

  • You can also give a vector of specific time values, e.g., c("2009-2010", "2011-2012", "2013-2014"), if these values are present in the time_variable column (they are for nhanes_data).
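
As a rough illustration of what the keyword options select, the sketch below picks cycles from a made-up vector of cycle labels. It assumes the labels sort chronologically as character strings; the vector svy_year here is hypothetical and only stands in for the time variable column.

```r
# Hypothetical cycle labels standing in for the time variable column
svy_year <- c("2005-2006", "2007-2008", "2009-2010",
              "2011-2012", "2013-2014", "2015-2016")

# Distinct cycles in chronological order
cycles <- sort(unique(svy_year))

# 'most_recent': the single latest cycle
most_recent <- tail(cycles, 1)

# 'last_5': the five latest cycles
last_5 <- tail(cycles, 5)

# 'all': every cycle present in the data
all_cycles <- cycles
```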

pool

[logical(1)] If FALSE (the default), results are presented for individual times, separately. If TRUE, data from each time value are pooled together. Note that only contiguous cycles should be pooled together, e.g., using pool = TRUE with time_values = 'last_5' is okay, but using pool = TRUE with time_values = c("2009-2010", "2013-2014") is not recommended (that would be a strange result to interpret).

subset_calls

[named list(n)]

The names of subset_calls are variable names, and the values are the values of each variable to include in the subsetted data. For example, subset_calls = list(demo_gender = "Women") will subset the data to include rows where demo_gender is equal to "Women". Multiple entries are allowed and are combined with the logical & operator. For example, subset_calls = list(demo_gender = "Women", bp_med_use = "Yes") will subset the data to include rows where demo_gender is equal to "Women" AND bp_med_use is equal to "Yes".
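
The subsetting logic described above can be sketched in base R as follows. This is an assumption about the general idea, not the package's internal code; the toy data frame and the object names keep_each, keep, and toy_subset are hypothetical.

```r
# Hypothetical toy data using the column names from the example above
toy <- data.frame(
  demo_gender = c("Women", "Men", "Women", "Women"),
  bp_med_use  = c("Yes", "Yes", "No", "Yes"),
  stringsAsFactors = FALSE
)

subset_calls <- list(demo_gender = "Women", bp_med_use = "Yes")

# One logical vector per entry: TRUE where the column takes an allowed value
keep_each <- Map(function(variable, values) toy[[variable]] %in% values,
                 names(subset_calls), subset_calls)

# Collapse the conditions with the logical & operator
keep <- Reduce(`&`, keep_each)

# Rows that satisfy every condition
toy_subset <- toy[keep, , drop = FALSE]
```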

standard_variable

[character(1)]

The name of the variable used to create standardization groups. The default is to use demo_age_cat, which leads to age standardization.

standard_weights

[numeric(n)]

The proportionate weights for each group defined by the standard variable. The number of weights should equal the number of groups defined by standard_variable and all weights must be >0.
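
For intuition, direct standardization takes a weighted average of the group-specific estimates, using the standard weights as the weights. The sketch below shows that arithmetic with made-up numbers; the group labels, prevalences, and weights are all hypothetical, and whether the package computes its standardized estimates in exactly this way is an assumption.

```r
# Hypothetical prevalence (%) of an outcome within each standardization group
prevalence_by_age <- c("18 to 44" = 10, "45 to 64" = 30, "65+" = 60)

# Hypothetical standard weights: one per group, all > 0
standard_weights <- c(0.5, 0.3, 0.2)

# The same checks the argument description asks for
stopifnot(length(standard_weights) == length(prevalence_by_age),
          all(standard_weights > 0))

# Rescale so the weights sum to 1, then take the weighted average
w <- standard_weights / sum(standard_weights)
standardized_prevalence <- sum(w * prevalence_by_age)
```

Here the standardized prevalence is 0.5 * 10 + 0.3 * 30 + 0.2 * 60 = 26 percent.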

statistic_primary

[character(1)]

The statistic that defines the geometric objects in the plot. Other statistics will be featured in the text that appears when the user's mouse hovers over the corresponding object.

title

[character(1)]

The title that will appear above the plot. If this is not supplied, the title will be generated using the information in key.

geom

[character(1)]

The type of figure that will be made. Valid options are:

  • 'bar' creates a bar plot with annotations on the bars

  • 'point' creates a scatter plot with 95% confidence interval error bars

reorder_cats

[logical(1)]

Whether to reorder the categorical group variable so that its levels are shown in increasing order by the expected outcome.

width

[numeric(1)]

The width of the plot, in pixels.

height

[numeric(1)]

The height of the plot, in pixels.

size_point

[numeric(1)]

The size of points in the plot (only relevant if geom = 'point').

size_error

[numeric(1)]

The size of error bars in the plot (only relevant if geom = 'point').

Value

A plotly object.

Examples


# Plotly objects do not render in example R code. Please see the vignettes
# for examples of nhanes_visualize.