# User Guide: The Usual Intakes for Sodium and Potassium


## Analysis of the 2015 Canadian Community Health Survey – Nutrition

### 1. Introduction

The present analysis responds to a request from the Committee to Review the Dietary Reference Intakes (DRI) for Sodium and Potassium of the National Academies of Sciences, Engineering, and Medicine. As part of the review process, the Committee requested up-to-date information on the usual dietary intake of sodium, potassium and energy for the Canadian population. Data from the 2015 Canadian Community Health Survey-Nutrition were used to estimate distributions of usual intake from food at the national level.

### 2. Statistical Methods

#### 2.1. Dataset and Variables

The 2015 Canadian Community Health Survey (CCHS)-Nutrition [1] Focus Share file was used to estimate distributions of usual intake from food for energy and several nutrients. The survey collected up to two recalls on the type and amount of food consumed in the 24 hours preceding the interview: a first recall from all respondents and a second recall from a representative subsample. Using data from only the first dietary recall produces a measure of daily intake (i.e., the quantity of nutrients or food eaten in one day). Data from both the first and second recalls can be used to produce an estimate of usual intake (i.e., the long-term average intake). In general, it is more informative to study the usual intake distribution of a population than its daily intake distribution.

For the present analysis, distributions of usual intakes and associated standard errors were calculated for the following nutrients, energy and related ratios:

• Sodium
• Potassium
• Energy
• Sodium to potassium ratio
• Sodium density
• Potassium density

For the nutrients, energy and ratios listed above, analyses were completed for each of the following DRI age, sex and life-stage groups: Males and females separately for ages 1 - 3 years; 4 - 8 years; 9 - 13 years; 14 - 18 years; 19 - 30 years; 31 - 50 years; 51 - 70 years; 71+ years; and 19+ years as well as males and females combined for ages 1-3 and 4-8. In addition, the following groups were analyzed: adults 19 years and older (males and females combined); pregnant women; lactating women; and pregnant or lactating women.

A second analysis was also completed to compute usual intakes for Sodium and Potassium, stratified by self-reported high blood pressure status. In addition, participants whose response to the question "Do you have heart disease?" was "yes", "don't know" or a refusal were excluded from this analysis. The following DRI groups were included in the second analysis, stratified by blood pressure: Males and females separately for 19 - 30 years old; 31 - 50 years old; 51 - 70 years old; 71+ years old; and 19 years and older. Estimates were also provided for adults 19 years and older (males and females combined); pregnant women; lactating women; and pregnant or lactating women.

A third analysis examined usual intakes of Sodium and Potassium by DRI age-sex group, reporting every percentile of the distribution from the 1st to the 99th. Estimates and associated standard errors were provided. All estimates were vetted to ensure adherence to Statistics Canada release guidelines.

#### 2.2. Estimation of Usual Dietary Intake

The variability in intakes among a group on a given day reflects both variability within specific individuals (who may have eaten more or less than usual on that day) and variability between different individuals (who habitually have higher or lower intakes). To obtain an estimate of a population's usual intake distribution from daily intake data, one must fit a measurement error model that reduces the effect of the within-individual variance while measuring the between-individual variance. Several methods are available to estimate a population's usual intake distribution from daily intake data [2-5]. For the analysis of the 2015 CCHS-Nutrition, a technical working group of statisticians from Health Canada and Statistics Canada recommended the use of the National Cancer Institute (NCI) method [2,3,5] for analysis of usual dietary intake. Although it is more computationally intensive than other available methods, the NCI method has advantages: it can be used to estimate intake of both episodic and non-episodic nutrients and foods, can include covariates in the model, and accounts for the correlation between probability of consumption and amount consumed [6].
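As a rough illustration of the within- and between-individual variance components described above, the following Python sketch computes simple method-of-moments estimates from respondents who have both recalls. It is not the NCI method: it assumes unweighted data, no covariates, no transformation and no day-sequence effect, and the function name `variance_components` is hypothetical.

```python
from statistics import mean, variance


def variance_components(day1, day2):
    """Method-of-moments within- and between-person variances from paired recalls.

    Illustrative only: unweighted, no covariates, no transformation.
    """
    if len(day1) != len(day2) or len(day1) < 2:
        raise ValueError("need at least two respondents with both recalls")
    # Within-person variance: half the mean squared Day 1 - Day 2 difference.
    within = mean((a - b) ** 2 for a, b in zip(day1, day2)) / 2
    # The variance of two-day person means equals between + within / 2,
    # so the within-person share is subtracted (and the result floored at 0).
    person_means = [(a + b) / 2 for a, b in zip(day1, day2)]
    between = max(variance(person_means) - within / 2, 0.0)
    return within, between
```

The distribution of one-day intakes has variance equal to the sum of the two components, which is why it is wider than the usual intake distribution, whose variance is the between-person component alone.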

The National Cancer Institute has developed SAS macros for implementation of the NCI method for usual intakes, which are available online (https://epi.grants.cancer.gov/diet/usualintakes/macros.html). Two types of NCI macros were used for the analysis: the univariate macros MIXTRAN and DISTRIB [2,3] and the bivariate macros for ratio and density calculations [5]. Usual intakes for Sodium, Potassium and Energy were calculated using the univariate macros, since each pertains to only one component. The one-part (amount-only) model was used since fewer than 5% of the 24-hour recalls had zero intake of a nutrient [7]. Similarly, for the analysis stratified by self-reported blood pressure, the univariate amount-only model was used, with self-reported blood pressure status as a covariate.

On the other hand, the ratio of usual intakes was calculated for the sodium to potassium ratio as well as for sodium and potassium density, since these variables involve two components. Sodium density and potassium density were calculated as the amount of sodium (or potassium) per 1000 kcal per day. The ratio of usual intakes is based on a joint bivariate model for the two components (numerator and denominator) [5]. The first step of the procedure is to transform each component to approximate normality using a Box-Cox transformation, which is chosen to minimize the mean squared error around a straight line fit to a weighted quantile-quantile (QQ) plot using the survey weight of each respondent. A bivariate linear mixed effects model is then fit, which assumes that the subject-specific model terms and the within-subject terms each have a bivariate normal distribution and are independent of each other. Monte Carlo simulations are then run using the parameter values estimated from the model: usual intakes are generated for a large number of pseudo-individuals, and the ratio of usual intakes is calculated for each pseudo-individual. Finally, the mean and percentiles of the distribution of the ratio of usual intakes are estimated.
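The Monte Carlo step can be sketched in Python as follows. This is a simplified illustration, not the NCI bivariate macros. It assumes that both components use a log transformation (a Box-Cox parameter of 0), so the back-transformation to usual intake has a closed form. It also assumes that the model parameters have already been estimated and that there are no covariates. The function name `simulate_usual_ratio` and its parameters are hypothetical.

```python
import math
import random
from statistics import mean, quantiles


def simulate_usual_ratio(mu, sd_between, var_within, rho, n_pseudo=20000, seed=2015):
    """Simulate the ratio of usual intakes for pseudo-individuals.

    mu, sd_between and var_within are (numerator, denominator) pairs on the
    log scale; rho is the correlation between the two person-level effects.
    Returns the mean ratio and the 1st to 99th percentiles.
    """
    rng = random.Random(seed)
    ratios = []
    for _ in range(n_pseudo):
        # Correlated standard-normal person effects for the two components.
        z_num = rng.gauss(0.0, 1.0)
        z_den = rho * z_num + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        # Under a log transformation, usual intake on the original scale is
        # exp(person-level mean + half the within-person variance).
        usual_num = math.exp(mu[0] + sd_between[0] * z_num + var_within[0] / 2)
        usual_den = math.exp(mu[1] + sd_between[1] * z_den + var_within[1] / 2)
        ratios.append(usual_num / usual_den)
    # quantiles(..., n=100) returns the 99 cut points between percentiles.
    return mean(ratios), quantiles(ratios, n=100)
```

When the denominator is energy in kcal, multiplying each simulated ratio by 1000 gives a density per 1000 kcal.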

During the analysis, the model did not converge for some DRI age/sex groups, so data from the previous 2004 CCHS-Nutrition (Cycle 2.2) were included to increase sample size and produce stable estimates. As a result, parameter estimates were obtained from a dataset combining the 2004 and 2015 CCHS-Nutrition, and survey cycle was included as a covariate. As in previous NCI analyses [2,3,5], covariates for sequence of recall and weekend/weekday were also included. Furthermore, usual intakes were obtained for each DRI age/sex group separately using a stratified approach [6,7].

#### 2.3. Statistical Assumptions

In either the univariate or bivariate case, once the model was chosen, the ratio of the within- to between-individual variance components was used to evaluate other statistical assumptions, including the choice of covariates and the handling of outliers. Large values of the within-between variance ratio suggest instability of the model parameter estimates and lead to a larger adjustment of the one-day intakes to the usual intakes. As a result, estimation of percentiles of the usual intake distribution may be affected. To ensure model accuracy, the effects of covariates and outliers were evaluated when computing usual intakes.

In cases where the difference between Day 1 and Day 2 intakes was abnormally large (i.e., the ratio of within- to between-individual variation exceeded 10), analyses were conducted to look for potential outliers. When an outlier was identified, the Day 2 value was removed, because Day 1 values are considered more reliable: they are less likely to be biased by a learning effect or by a change in diet made once the respondent knows that a recall will take place. The impact of outlier removal on the within-between variation was assessed using thresholds of ±3, ±2.5 or ±2 SD from the mean of the distribution of the difference between Day 1 and Day 2 values. The scenario which resulted in the greatest improvement in within-between variation with the fewest outliers removed was selected.
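The threshold comparison described above could be sketched in Python as follows. This is a minimal illustration that assumes an unweighted mean and standard deviation of the Day 1 minus Day 2 differences; the source does not state whether survey weights were applied at this step, and the function names are hypothetical.

```python
from statistics import mean, stdev


def flag_large_differences(day1, day2, k):
    """Return indices whose Day 1 - Day 2 difference is more than k SD from the mean difference."""
    diffs = [a - b for a, b in zip(day1, day2)]
    centre, spread = mean(diffs), stdev(diffs)
    return [i for i, d in enumerate(diffs) if abs(d - centre) > k * spread]


def compare_thresholds(day1, day2, thresholds=(3.0, 2.5, 2.0)):
    """Map each SD threshold to the indices it would flag, so the scenarios can be compared."""
    return {k: flag_large_differences(day1, day2, k) for k in thresholds}
```

For each scenario, the Day 2 recalls at the flagged indices would be dropped and the model refitted, and the resulting within-between variance ratio compared against the number of recalls removed.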

A second method for outlier removal was also considered [7]. In this case, a Box-Cox transformation of the raw non-zero values is performed, and observations lying more than 2.5 times the interquartile range below the 25th percentile or above the 75th percentile are flagged. For this approach, outliers are identified if they lead to a possible violation of the normality assumption.
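A minimal Python sketch of this interquartile-range rule is given below. It assumes that the Box-Cox parameter `lam` has already been chosen and that unweighted quartiles from the standard library are acceptable; the quartile definition and the function names are assumptions made for illustration.

```python
import math
from statistics import quantiles


def box_cox(x, lam):
    """Box-Cox transform of a positive value; lam = 0 is the log transform."""
    return math.log(x) if lam == 0 else (x ** lam - 1.0) / lam


def flag_iqr_outliers(values, lam, multiplier=2.5):
    """Return indices of non-zero values outside the IQR fences on the Box-Cox scale."""
    # Keep the original index with each transformed non-zero value.
    transformed = [(i, box_cox(v, lam)) for i, v in enumerate(values) if v > 0]
    q1, _, q3 = quantiles([t for _, t in transformed], n=4)
    iqr = q3 - q1
    lower, upper = q1 - multiplier * iqr, q3 + multiplier * iqr
    return [i for i, t in transformed if t < lower or t > upper]
```

Because the fences are applied on the transformed scale, a value that looks extreme in raw units may not be flagged once the transformation has pulled in the right tail.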

Table 1 summarizes the outliers identified as part of the various usual intake analyses.

Table 1: Summary of Outliers by Energy and Nutrient: Analysis of Nutrient Intake from Foods for the National Academies of Sciences, Engineering and Medicine

| Component | DRI Age-Sex Group | Threshold | Number of recalls removed |
|---|---|---|---|
| Energy | 1 to 3 years | 3 SD | 10 |
| Sodium (mg/d) | 19 to 30 years, males | 2 SD | 39 |
| Sodium (mg/d) | 19 to 30 years, females | 3 SD | 9 |
| Sodium (mg/d) | 31 to 50 years, females | 3 SD | 20 |
| Potassium (mg/d) | 1 to 3 years | 2.5 SD | 31 |
| Potassium (mg/d) | 31 to 50 years, males | 2 SD | 63 |
| Potassium (mg/d) | 71 years and over, males | 3 SD | 15 |

Data Source: 2015 Canadian Community Health Survey - Nutrition - Share file

#### 2.4. Standard Errors using Bootstrap Replication

The CCHS-Nutrition Focus Surveys have a complex design, implying that no mathematical formula exists to calculate the sampling variability directly. Instead, it is necessary to use a replication method to estimate this variance, and the most convenient method is bootstrap replication. Statistics Canada has provided bootstrap replicate weights to estimate the variance from complex survey sampling designs.

For simple estimates such as totals, ratios or regression parameters, it is possible to estimate the sampling variability by using the bootstrap weights with a survey procedure in software such as SUDAAN or Stata, or with PROC SURVEYMEANS in SAS. These procedures properly account for the complex survey design in the estimation of standard errors. To obtain an estimate, the parameter of interest (e.g., total, ratio) is calculated for each of the 500 replicates and then the variance between the 500 values is computed. This is the method used to estimate the average nutrient intake using day one recalls only. For estimates related to distributions of usual intake, the same process must be repeated with the NCI method: the parameters of interest are estimated with the NCI method for each replicate (using each bootstrap weight), and the variance between the 500 estimates is then calculated.

Some survey procedures compute the variance by comparing each replicate estimate to the mean of the 500 bootstrap estimates (the bootstrap mean). However, the root estimate (the estimate calculated using the survey weight) is also available from the data. Typically, since the number of replicates is large (500), the bootstrap mean will converge to the root estimate. However, since the NCI method may fail for some of the 500 replicates, it is possible that not all of the 500 distribution estimates will be available to calculate the mean of the bootstrap estimates. For this reason, when calculating the variance from the bootstrap estimates, each replicate was compared to the root estimate, not to the mean of the bootstrap estimates. As such, some of the bias caused by failing replicates is mitigated in the estimation procedure.
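The comparison against the root estimate can be sketched in Python as follows. This is an illustration only. It assumes that a failed replicate is recorded as `None` and that the sum of squared deviations is divided by the number of successful replicates; the source does not state the divisor. The function name `bootstrap_se` is hypothetical.

```python
import math


def bootstrap_se(root_estimate, replicate_estimates):
    """Standard error from bootstrap replicates, centred on the root estimate.

    Failed replicates (None) are skipped; the divisor is the number of
    replicates that succeeded (an assumption for this sketch).
    """
    ok = [r for r in replicate_estimates if r is not None]
    if not ok:
        raise ValueError("no successful bootstrap replicates")
    squared_deviations = sum((r - root_estimate) ** 2 for r in ok)
    return math.sqrt(squared_deviations / len(ok))
```

Centring on the root estimate means that a missing replicate changes only the number of terms in the sum, not the reference point against which every remaining replicate is compared.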

##### 2.4.1. Standard Errors for Ratios of Usual Dietary Intake

Compared with estimation of a single nutrient such as potassium or sodium, estimation of the ratio of usual intakes using the NCI method requires substantially more computing time to produce estimates and associated standard errors. As an example, estimation of the standard error for the sodium to potassium ratio for one DRI age/sex group (Males 9 - 13 years old) required approximately 63 hours of computing time compared with 5 hours for sodium alone, using 500 bootstrap replicates.

As a result, in order to ensure accuracy and meet project deadlines, an analysis was performed to assess the impact of using fewer than 500 bootstrap replicates on the estimation of standard error for the ratio of usual intakes. For the Sodium to Potassium ratio, four DRI age/sex groups were identified for the analysis: Males 9 - 13 years, Females 9 - 13 years, Males 14 - 18 years and Females 14 - 18 years old. Standard errors were calculated based on 200, 300, 400 and 500 bootstrap weights for the usual intake mean as well as the following percentiles: 1st, 5th, 10th, 25th, 50th, 75th, 90th, 95th and 99th. Results are summarized in Table 2.

Overall, the average difference in standard error calculated using 200 as opposed to 500 bootstrap replications was -0.0009. For most estimates, the standard errors are the same to two decimal places, yet the savings in computational time are substantial. These findings are consistent with other published research which noted little additional gain in using more than 200 bootstrap replicates for analysis with the NCI method [3].

Table 2 - Comparison of Standard Errors for Sodium to Potassium Ratio using the NCI method

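The columns from Mean to 99th give the standard errors for the mean and percentiles.

| DRI Group | No. of bootstraps used | n | Mean | 1st | 5th | 10th | 25th | 50th | 75th | 90th | 95th | 99th |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Males 9 - 13 years | 199 | 1023 | 0.02 | 0.04 | 0.03 | 0.03 | 0.03 | 0.02 | 0.03 | 0.05 | 0.07 | 0.11 |
| | 299 | 1023 | 0.02 | 0.04 | 0.03 | 0.03 | 0.02 | 0.02 | 0.03 | 0.05 | 0.07 | 0.11 |
| | 399 | 1023 | 0.02 | 0.04 | 0.04 | 0.03 | 0.03 | 0.02 | 0.03 | 0.05 | 0.07 | 0.11 |
| | 498 | 1023 | 0.02 | 0.04 | 0.04 | 0.03 | 0.03 | 0.02 | 0.03 | 0.05 | 0.07 | 0.11 |
| Females 9 - 13 years | 200 | 939 | 0.02 | 0.04 | 0.03 | 0.03 | 0.02 | 0.02 | 0.03 | 0.05 | 0.07 | 0.11 |
| | 300 | 939 | 0.02 | 0.04 | 0.03 | 0.03 | 0.02 | 0.02 | 0.03 | 0.05 | 0.07 | 0.11 |
| | 400 | 939 | 0.02 | 0.04 | 0.03 | 0.03 | 0.02 | 0.02 | 0.03 | 0.05 | 0.07 | 0.11 |
| | 500 | 939 | 0.03 | 0.04 | 0.04 | 0.03 | 0.02 | 0.02 | 0.03 | 0.06 | 0.07 | 0.11 |
| Males 14 - 18 years | 200 | 900 | 0.03 | 0.06 | 0.05 | 0.04 | 0.03 | 0.03 | 0.04 | 0.07 | 0.09 | 0.14 |
| | 299 | 900 | 0.03 | 0.06 | 0.05 | 0.04 | 0.03 | 0.03 | 0.04 | 0.07 | 0.09 | 0.14 |
| | 398 | 900 | 0.03 | 0.06 | 0.05 | 0.04 | 0.03 | 0.03 | 0.04 | 0.07 | 0.09 | 0.14 |
| | 498 | 900 | 0.03 | 0.06 | 0.05 | 0.04 | 0.03 | 0.03 | 0.04 | 0.07 | 0.09 | 0.14 |
| Females 14 - 18 years | 199 | 986 | 0.03 | 0.05 | 0.05 | 0.04 | 0.03 | 0.03 | 0.04 | 0.06 | 0.08 | 0.13 |
| | 299 | 986 | 0.03 | 0.06 | 0.05 | 0.04 | 0.03 | 0.03 | 0.04 | 0.06 | 0.08 | 0.13 |
| | 399 | 986 | 0.03 | 0.05 | 0.05 | 0.04 | 0.03 | 0.03 | 0.04 | 0.06 | 0.08 | 0.13 |
| | 499 | 986 | 0.03 | 0.05 | 0.05 | 0.04 | 0.03 | 0.03 | 0.04 | 0.06 | 0.08 | 0.13 |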

Data Source: 2015 Canadian Community Health Survey - Nutrition - Share file

For more information on interpreting data from the 2015 Canadian Community Health Survey - Nutrition, the reader is referred to the following Reference Guide to Understanding and Using the Data [8], available at:

### Table Footnotes for the Sodium and Potassium data tables

1. The survey excludes from its target population those living in the three territories, individuals living on Indian reserves, residents of institutions, full-time members of the Canadian Armed Forces and residents of certain remote regions.
2. Sample size is based on the first 24‐hour dietary recall (first day of interview) only.
3. Intakes are based on food consumption only. Intakes from vitamin and mineral supplements are not included.
4. The intake distribution (percentiles and percentage above or below a cut-off when applicable) was adjusted using the National Cancer Institute (NCI) Method as described in Tooze JA, Midthune D, Dodd KW, et al.: A new statistical method for estimating the usual intake of episodically consumed foods with application to their distribution. J Am Diet Assoc 2006;106: 1575-1587 and Tooze JA, Kipnis V, Buckman DW, et al.: A mixed-effects model approach for estimating the distribution of usual intake of nutrients: the NCI method. Stat Med 2010; 29: 2857-2868. Methodology for the ratio of usual intakes was described in Freedman LS, Guenther PM, Dodd KW, Krebs-Smith SM, Midthune D. The population distribution of ratios of usual intakes of dietary components that are consumed every day can be estimated from repeated 24-hour recalls. The Journal of Nutrition 2010; 140: 111-116.
5. Bootstrapping techniques were used to produce the standard error (SE).

### References

1. Statistics Canada. Canadian Community Health Survey 2015 - Nutrition User Guide. 2017.
2. Tooze JA, Midthune D, Dodd KW, et al.: A new statistical method for estimating the usual intake of episodically consumed foods with application to their distribution. J Am Diet Assoc 2006;106: 1575-1587.
3. Tooze JA, Kipnis V, Buckman DW, et al.: A mixed-effects model approach for estimating the distribution of usual intake of nutrients: the NCI method. Stat Med 2010; 29: 2857-2868.
4. Nusser SM, Carriquiry AL, Dodd KW, Fuller WA: A semiparametric transformation approach to estimating usual daily intake distributions. J Am Stat Assoc 1996; 91: 1440-1449.
5. Freedman LS, Guenther PM, Dodd KW, Krebs-Smith SM, Midthune D. The population distribution of ratios of usual intakes of dietary components that are consumed every day can be estimated from repeated 24-hour recalls. The Journal of Nutrition 2010; 140: 111-116.
6. Davis KA, Gonzalez A, Loukine L, Qiao C, Sadeghpour A, Vigneault M, Wang KC, Ibanez D. Early experience analyzing dietary intake data from the Canadian Community Health Survey - Nutrition using the National Cancer Institute (NCI) Method. In preparation.
7. Krebs-Smith SM, Guenther PM, Subar AF, Kirkpatrick SI, Dodd KW. Americans do not meet federal dietary recommendations. The Journal of Nutrition 2010; 140: 1832-1838.