Stories and Data Sets

WR Inc
Marital Status
Treatment of Drug Addiction
Undergraduates and Race
Newcomb's Measurements of the Speed of Light
Babe Ruth's Homeruns
Drug Use of Mothers and the Health of the Newborn
Olympic Men's Long Jump
1970's Military Draft
Supermarket Price Comparison
Sleep Deprivation
Calcium and Blood Pressure
Healthy and Failed Companies
Friday the 13th
Rates of AIDS in Americas (1995)
Cost and Income of Movies
Headaches and Pain Reliever
Salaries
Polio cases before the Salk Vaccine
Age by Gender in US and Puerto Rico (Census 2000)
Rogaine
Sex Ratios by State (American Community Survey 2004)
Deaths in Car Accidents (2002)
10 Leading Causes of Death
Evaluating Instructors
Study Habits of College Students
Drownings

WR Inc

WR Inc. is a large (fictitious) company. It recently surveyed all its employees, asking them to fill out a questionnaire about their gender, income, etc. In addition, the company randomly selected 500 employees and asked them some additional questions.
Data set: wrinc
Worksheets: CensusData, filled out by all employees, and SampleData, filled out by 500 randomly selected employees.

Marital Status

The marital status of all Americans 18 or older, according to the US census.
Data set: maritalstatus

Treatment of Drug Addiction

Cocaine addiction is hard to break. Addicts need cocaine to feel any pleasure, so perhaps giving them an antidepressant drug will help. A 3-year study of 72 chronic cocaine users compared an antidepressant called desipramine with the standard treatment for cocaine addiction (lithium) and a placebo. One third of the subjects, chosen at random, received each treatment. After 3 years it was determined for each addict whether he/she was drug free or had relapsed. The data are from D.M. Barnes, "Breaking the Cycle of Addiction", Science, 241 (1988).
Data set: drugaddiction

Undergraduates and Race

Breakdown by race of 12,263,000 undergraduate students in US colleges in 1994, according to the US Department of Education.
Data set: race

Newcomb's Measurements of the Speed of Light

Simon Newcomb made a series of measurements of the speed of light between July and September 1882. He measured the time in seconds that a light signal took to pass from his laboratory on the Potomac River to a mirror at the base of the Washington Monument and back, a total distance of 7400 m. His first measurement was 0.000024828 seconds, or 24,828 nanoseconds (10^9 nanoseconds = 1 second). The data are the deviations (differences) from 24,800 nanoseconds.
Data set: newcomb
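
The deviations can be converted back into passage times and into an implied speed of light. The following is a minimal sketch that uses only the figures quoted above (the 24,800 ns baseline and the 7400 m round trip); the function names are illustrative and are not part of the data set.

```python
# Constants taken from the description above.
BASELINE_NS = 24_800   # deviations are recorded relative to this time (ns)
DISTANCE_M = 7_400     # round-trip distance of the light signal (m)


def passage_time_ns(deviation):
    """Recover the measured passage time in nanoseconds from a deviation."""
    return BASELINE_NS + deviation


def speed_of_light_estimate(deviation):
    """Implied speed of light in m/s for a single recorded deviation."""
    seconds = passage_time_ns(deviation) * 1e-9  # 10^9 ns = 1 s
    return DISTANCE_M / seconds


# Newcomb's first measurement was 24,828 ns, i.e. a deviation of 28.
first_deviation = 28
print(passage_time_ns(first_deviation))          # time in ns
print(speed_of_light_estimate(first_deviation))  # roughly 2.98e8 m/s
```

Applying the same conversion to every value in the data set gives one speed estimate per measurement.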

Babe Ruth's Homeruns

Number of homeruns hit by Babe Ruth while he was with the New York Yankees (1920 - 1934).
Data set: babe

Drug Use of Mothers and the Health of the Newborn

Chasnoff and others obtained several measures and responses for newborn babies whose mothers were classified by degree of cocaine use. The study was conducted in the Perinatal Center for Chemical Dependence at Northwestern University Medical School. The measurement given here is the length of the newborn.

Source: Cocaine abuse during pregnancy: correlation between prenatal care and perinatal outcome
Authors: SN MacGregor, LG Keith, JA Bachicha, and IJ Chasnoff
Obstetrics & Gynecology 1989;74:882-885
Data set: cocain

Olympic Men's Long Jump

Data on the gold medal winning performances in the men's long jump for the modern Olympic games.
Data set: longjump

1970's Military Draft

In 1970, Congress instituted a random selection process for the military draft. All 366 possible birth dates were placed in plastic capsules in a rotating drum and were selected one by one. The first date drawn from the drum received draft number one and eligible men born on that date were drafted first. In a truly random lottery there should be no relationship between the date and the draft number.
Data set: draft
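
One simple way to look for a relationship between birth date and draft number is the correlation coefficient between the day of the year (1 to 366) and the draft number; in a truly random lottery it should be close to 0. The sketch below is only an illustration: it assumes the two columns are available as plain Python lists, and the numbers in the usage lines are toy values, not the draft data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Toy values only (NOT the draft data): a perfectly decreasing relationship.
day_of_year = [1, 2, 3, 4]
draft_number = [8, 6, 4, 2]
print(pearson_r(day_of_year, draft_number))
```

A correlation clearly different from 0 for the real data would suggest that the capsules were not well mixed.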

Supermarket Price Comparison

In order to see whether there is a difference in the prices of foods at two local supermarkets we bought a basket of the same items at each of the two stores.
Data set: foods
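
Because the same items were bought at both stores, the prices form matched pairs, and a natural first step is to look at the item-by-item differences. The sketch below assumes the two price columns are available as lists in the same item order; the function names are illustrative and the prices in the usage lines are made-up values, not the foods data.

```python
def paired_differences(store_a, store_b):
    """Item-by-item price differences (store A minus store B)."""
    if len(store_a) != len(store_b):
        raise ValueError("both stores must list the same items")
    return [a - b for a, b in zip(store_a, store_b)]


def mean_difference(store_a, store_b):
    """Average of the paired differences."""
    diffs = paired_differences(store_a, store_b)
    return sum(diffs) / len(diffs)


# Made-up prices for two items (NOT the foods data).
prices_a = [2.0, 3.5]
prices_b = [1.5, 3.5]
print(paired_differences(prices_a, prices_b))
print(mean_difference(prices_a, prices_b))
```

A mean difference near 0 would indicate that neither store is systematically more expensive for this basket.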

Sleep Deprivation

Kelby Childers asked subjects to perform several tasks before and after 24 hours of sleep deprivation. One task involved the subjects lifting weights until muscle failure. The data are the numbers of bench presses completed before and after the 24 hours. (The Effect of Sleep Deprivation on Performance of a Gross and a Fine Motor Skill)
Data set: sleep

Calcium and Blood Pressure

In a randomized comparative experiment researchers gave 10 black men a calcium supplement for 12 weeks. The control group of 11 black men received a placebo (there were also white men in this study, but we will only consider the black men). The experiment was double-blind. The data are the seated systolic (heart contracted) blood pressures for all subjects before and after the 12 weeks.
Data set: calcium

Healthy and Failed Companies

Why do some companies succeed and others fail? A study compared various characteristics of 68 healthy and 33 failed companies. One of the variables was the ratio of current assets to current liabilities, the amount a company is worth divided by what it owes. (C. Papoulias and P. Theodossiou, "Analysis and modeling of recent business failures in Greece", Managerial and Decision Economics, 1992)
Data set: company

Friday the 13th

Is Friday the 13th an unusually unlucky day, or is this just superstition? How do superstitions affect people's behavior? These questions were addressed by researchers Scanlon, et al. (1993) in a study that examined the relationship between behavior and superstition in the United Kingdom. They analyzed shopping and traffic patterns, as well as the numbers and types of accidents that occurred on past Friday the 13th's. The study, conducted in England, focused on two questions: 1) How do superstitions regarding Friday the 13th affect human behavior?, and 2) Is Friday the 13th more unlucky than other Fridays?
Scanlon, T.J., Luben, R.N., Scanlon, F.L., Singleton, N. (1993), "Is Friday the 13th Bad For Your Health?," BMJ, 307, 1584-1586.
Variable Names:
1. Dataset: Identifies source dataset (traffic, shopping, or accident)
2. Dates: year and month in which the Friday the 13th occurred
3. 6th: Number of cars passing through junction (traffic dataset), shoppers for each supermarket (shopping dataset), or admissions due to transport accidents (accident dataset) on Friday the 6th
4. 13th: Number of cars passing through junction (traffic dataset), shoppers for each supermarket (shopping dataset), or admissions due to transport accidents (accident dataset) on Friday the 13th
5. Location: Motorway junction (traffic dataset), supermarket location (shopping dataset) or hospital (accident dataset) to which the data correspond
Data set: friday

Rates of AIDS in Americas (1995)

Data on the incidence of AIDS per 100,000 population in 1995 for the Americas, as reported by the World Health Organization (WHO).
Data set: aids

Cost and Income of Movies

Cost and income of movies produced by Castle Rock Entertainment, in millions of dollars.
Data set: movies

Headaches and Pain Reliever

A pharmaceutical company set up an experiment in which patients with a common type of headache were treated with a new analgesic or pain reliever. The analgesic was given to each patient at one of four dosage levels: 2, 5, 7 or 10 grams. Then the time until noticeable relief was recorded in minutes. In addition, the sex and the blood pressure of each patient were recorded. The blood pressure groups were formed by comparing each patient's diastolic and systolic pressure readings with historical data. Based on this comparison the patients were assigned to one of three types: low (0.25), medium (0.5), high (0.75), according to the respective quantiles of the historical data.
Data set: headache

Salaries

We have the salaries of the men and the women in a company. We also have their ratings on evaluations for 2004 and 2005.
Data set: salaries

Polio cases before the Salk Vaccine

Annual number of polio cases from 1930-1955 in the US
Data set: polio

Age by Gender in US and PR (Census 2000)

Breakdown of the population of USA and Puerto Rico by age and gender, according to the 2000 Census
Data set: Puerto Rico: agesex, all of US: agesexUS

Rogaine

Rogaine is the first treatment for hair loss approved by the Food and Drug Administration. Here we have the results of one of the studies that were done to show that Rogaine works. A randomized clinical trial was carried out: 1431 bald men were randomly assigned to two groups. The men in the treatment group received Rogaine; the men in the control group received a placebo. After some time the men were examined and assigned to one of 5 groups:
No Growth = no difference
New Vellus = some hair follicles
Min Growth = minimal hair growth
Mod Growth = moderate hair growth
Den Growth = dense hair growth

Here is the original statistical analysis used by the Food and Drug Administration to approve Rogaine.
Data set: rogaine

Sex Ratios by State (American Community Survey 2004)

Sex ratios by state, without Puerto Rico. Data from the American Community Survey 2004. Includes upper and lower limits of 90% confidence interval.
Data set: sexratio

Deaths in Car Accidents (2002)

Number of fatalities in car crashes in the US, by age and gender. Source: National Center for Injury Prevention and Control
Data set: cardeaths

10 Leading Causes of Death

Table of the 10 leading causes of death, by age, in 2001. Source: National Center for Injury Prevention and Control.
Table:

Evaluating Instructors

A survey of business students at nine colleges in the United States was taken to determine the instructor behaviors that the students feel are more likely to contribute to their academic success. 215 undergraduate students enrolled in business classes were asked to list the instructor classroom behaviors they felt were important to their academic success. A questionnaire consisting of 51 instructor behaviors was studied. The data are for 735 students from nine business colleges randomly selected from the 1990/1991 AACSB Handbook.
Variable Names:
Behavior: The 51 instructor behaviors on the survey
IM: Number of students who responded "Important for academic success"
NU: Number of students who responded "Neither important nor unimportant for academic success"
NI: Number of students who responded "Not important for academic success"
Data set: instructor

Study Habits of College Students

The Survey of Study Habits and Attitudes (SSHA) is a psychological test that measures the motivation, attitude towards school, and the study habits of students. Scores range from 0 to 200. We have the scores of 20 male and 18 female first-year students at a private college. (taken from David Moore: The Active Practice of Statistics)
Data set: studyhabits

Drownings

Data is from O'Carroll PW, Alkon E, Weiss B. Drowning mortality in Los Angeles County, 1976 to 1984, JAMA, 1988 Jul 15;260(3):380-3.
Drowning is the fourth leading cause of unintentional injury death in Los Angeles County. We examined data collected by the Los Angeles County Coroner's Office on drownings that occurred in the county from 1976 through 1984. There were 1587 drownings (1130 males and 457 females) during this nine-year period, for an annual rate of 2.36 drownings per 100,000 persons (3.44 for males and 1.33 for females). The largest proportion of drownings (44.5%) for both sexes, and in almost every age group, occurred in private swimming pools. Children 2 to 3 years of age had the highest swimming-pool drowning rate (7.95). The elderly also experienced high drowning rates, primarily in swimming pools and bathtubs. Drowning-site profiles varied dramatically by age and sex. These findings indicate a need for Los Angeles County to address the problem of drownings among infants and toddlers in private swimming pools and to investigate the failure of regulations requiring fencing of swimming pools to prevent these deaths. These findings also suggest several potential opportunities for preventive intervention by physicians and demonstrate that health professionals cannot rely on national drowning-site profiles when developing local drowning prevention strategies.
Data set: drownings