This is a somewhat large, interesting dataset: a data frame of 15 variables (columns) on 9575 persons (rows).

data(NHANES)

Format

This data frame contains the following columns:

Cancer.Incidence

binary factor with levels No and Yes.

Cancer.Death

binary factor with levels No and Yes.

Age

numeric vector giving age of the person in years.

Smoke

a factor with levels Current, Past, Nonsmoker, and Unknown.

Ed

numeric vector of \(\{0,1\}\) codes giving the education level.

Race

numeric vector of \(\{0,1\}\) codes giving the person's race.

Weight

numeric vector giving the weight in kilograms.

BMI

numeric vector giving Body Mass Index, i.e., Weight/Height^2 where Height is in meters. Missing values (61%!) were originally coded as 0.

Diet.Iron

numeric giving Dietary iron.

Albumin

numeric giving albumin level in g/l.

Serum.Iron

numeric giving Serum iron in \(\mu\)g/l.

TIBC

numeric giving Total Iron Binding Capacity in \(\mu\)g/l.

Transferin

numeric giving transferrin saturation, which is just 100*Serum.Iron/TIBC.

Hemoglobin

numeric giving Hemoglobin level.

Sex

a factor with levels F (female) and M (male).
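
Two of the columns above carry conventions that are easy to trip over: BMI uses 0 as a placeholder for missing values, and Transferin is derived from Serum.Iron and TIBC. The following is a minimal sketch of how one might handle both. It assumes the zeros are still present in the copy of the data you load, and it uses a small made-up data frame (`toy`, with invented values) rather than the real data, so that it runs on its own.

```r
# Toy stand-in for a few NHANES columns; the values are invented for illustration.
toy <- data.frame(
  Weight     = c(70.3, 82.1, 55.0),
  BMI        = c(24.1, 0, 21.5),   # 0 marks a missing BMI, as described above
  Serum.Iron = c(100, 80, 120),
  TIBC       = c(400, 320, 300)
)

# Recode the 0 placeholder as NA so that summaries skip it
# instead of averaging it in as a real measurement.
toy$BMI[toy$BMI == 0] <- NA
mean(toy$BMI, na.rm = TRUE)

# Recompute transferrin saturation from its definition.
toy$Transferin <- 100 * toy$Serum.Iron / toy$TIBC
toy$Transferin
```

The same two assignments should carry over to the real data frame after `data(NHANES)`, with `NHANES` in place of `toy`. It is worth checking first how many BMI values are actually 0.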

Examples