WR Inc
WR Inc. is a (fictitious) company. It recently surveyed all of its employees, asking them to fill out a questionnaire with questions about their gender, income, etc.
Data set: wrinc
Marital Status
The marital status of all Americans aged 18 or older, according to the US Census.
Data set: maritalstatus
Treatment of Drug Addiction
Cocaine addiction is hard to break. Addicts need cocaine to feel any pleasure, so perhaps giving them an antidepressant drug will help. A 3-year study with 72 chronic cocaine users compared an antidepressant called desipramine with standard treatment for cocaine addiction (lithium) and a placebo. One third of the subjects, chosen at random, received each treatment. After 3 years it was determined for each addict whether he/she was drug free or had relapsed. The data are from D.M. Barnes, "Breaking the Cycle of Addiction", Science, 241 (1988).
Data set: drugaddiction
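The random assignment described above (72 subjects, one third to each treatment) can be sketched as follows. This is only an illustration of the design, not the procedure the study actually used; the function name assign_treatments, the subject IDs 1 to 72, and the seed are assumptions made for the example.

```python
import random

def assign_treatments(subject_ids, treatments, seed=None):
    """Randomly split the subjects into equally sized treatment groups.

    Hypothetical helper for illustration; not taken from the study.
    """
    rng = random.Random(seed)
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)  # random order, so each subject is equally likely to land in any group
    group_size = len(shuffled) // len(treatments)
    return {
        treatment: shuffled[i * group_size:(i + 1) * group_size]
        for i, treatment in enumerate(treatments)
    }

# Subject IDs 1..72 are placeholders for the 72 chronic cocaine users.
groups = assign_treatments(range(1, 73), ["desipramine", "lithium", "placebo"], seed=1)
for name, members in groups.items():
    print(name, len(members))
```

Each group ends up with 24 subjects, and every subject appears in exactly one group.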
Undergraduates and Race
Breakdown by race of 12,263,000 undergraduate students in US colleges in 1994, according to the US Department of Education.
Data set: race
Newcomb's Measurements of the Speed of Light
Simon Newcomb made a series of measurements of the speed of light between July and September 1882. He measured the time in seconds that a light signal took to pass from his laboratory on the Potomac River to a mirror at the base of the Washington Monument and back, a total distance of 7400 m. His first measurement was 0.000024828 seconds, or 24,828 nanoseconds (10^9 nanoseconds = 1 second). The data are the deviations (differences) from 24,800 nanoseconds.
Data set: newcomb
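The coding of the measurements can be undone with a little arithmetic. The sketch below converts a recorded deviation back to a travel time and to the speed of light it implies over the 7400 m round trip. The constant and function names are assumptions made for the example, and only the first measurement quoted above is used, not the data set itself.

```python
BASELINE_NS = 24_800   # the deviations are measured from 24,800 nanoseconds
DISTANCE_M = 7400.0    # round-trip distance given in the description

def deviation_to_seconds(deviation_ns):
    """Convert a recorded deviation (in ns) back to a travel time in seconds."""
    return (BASELINE_NS + deviation_ns) * 1e-9

def implied_speed(deviation_ns):
    """Speed of light (m/s) implied by a single measurement."""
    return DISTANCE_M / deviation_to_seconds(deviation_ns)

# Newcomb's first measurement, 24,828 ns, is stored as the deviation 28.
first_deviation = 24_828 - BASELINE_NS
print(first_deviation)
print(deviation_to_seconds(first_deviation))
print(round(implied_speed(first_deviation)))
```

For the first measurement this gives a speed of roughly 2.98 x 10^8 m/s.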
Babe Ruth's Homeruns
Number of home runs hit by Babe Ruth while he was with the New York Yankees (1920 - 1934)
Data set: babe
Drug Use of Mothers and the Health of the Newborn
Chasnoff and others obtained several measures and responses
for newborn babies whose mothers were classified by degree of cocaine use.
The study was conducted in the Perinatal Center for Chemical Dependence
at Northwestern University Medical School. The measurement given here is
the length of the newborn.
Source: Cocaine abuse during pregnancy: correlation between prenatal care and perinatal outcome
Authors: SN MacGregor, LG Keith, JA Bachicha, and IJ Chasnoff
Obstetrics & Gynecology 1989;74:882-885
Data set: cocain
Olympic Men's Long Jump
Data on the gold medal winning performances in the men's long jump for the modern Olympic games.
Data set: longjump
1970's Military Draft
In 1970, Congress instituted a random selection process for the military draft. All 366 possible birth dates were placed in plastic capsules in a rotating drum and were selected one by one. The first date drawn from the drum received draft number one and eligible men born on that date were drafted first. In a truly random lottery there should be no relationship between the date and the draft number.
Data set: draft
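The claim that a truly random lottery shows no relationship between date and draft number can be explored by simulation. The sketch below shuffles the draft numbers 1 to 366 and computes the Pearson correlation with the day of the year. It does not use the actual draft data; the function names and the seed are assumptions made for the example.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def simulate_lottery(seed=None):
    """Return draft numbers 1..366 in random order, one per day of the year."""
    rng = random.Random(seed)
    draft_numbers = list(range(1, 367))
    rng.shuffle(draft_numbers)
    return draft_numbers

days = list(range(1, 367))  # day of the year, 1 = January 1
r = pearson(days, simulate_lottery(seed=0))
print(f"correlation in a simulated fair lottery: {r:.3f}")
```

In a fair lottery the correlation should be close to 0. Repeating the simulation with different seeds shows how much it varies by chance alone, which gives a yardstick for the correlation found in the draft data set.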
Supermarket Price Comparison
In order to see whether there is a difference in the prices of foods at two local supermarkets we bought a basket of the same items at each of the two stores.
Data set: foods
Sleep Deprivation
Kelby Childers asked subjects to perform several tasks before and after 24 hours of sleep deprivation. One task involved the subjects lifting weights until muscle failure. The data is a count of the number of bench presses done before and after the 24 hours. (The Effect of Sleep Deprivation on Performance of a Gross and a Fine Motor Skill)
Data set: sleep
Calcium and Blood Pressure
In a randomized comparative experiment researchers gave 10 black men a calcium supplement for 12 weeks. The control group of 11 black men received a placebo (there were also white men in this study, but we will only consider the black men). The experiment was double-blind. The data are the seated systolic (heart contracted) blood pressures for all subjects before and after the 12 weeks.
Data set: calcium
Healthy and Failed Companies
Why do some companies succeed and others fail? A study compared various characteristics of 68 healthy and 33 failed companies. One of the variables was the ratio of current assets to current liabilities, the amount a company is worth divided by what it owes. (C. Papoulias and P. Theodossiou, "Analysis and modeling of recent business failures in Greece", Managerial and Decision Economics, 1992)
Data set: company
Friday the 13th
Is Friday the 13th an unusually unlucky day, or is this just superstition? How do superstitions affect people's behavior? These questions were addressed by researchers Scanlon, et al. (1993) in a study that examined the relationship between behavior and superstition in the United Kingdom. They analyzed shopping and traffic patterns, as well as the numbers and types of accidents that occurred on past Friday the 13th's. The study, conducted in England, focused on two questions: 1) How do superstitions regarding Friday the 13th affect human behavior?, and 2) Is Friday the 13th more unlucky than other Fridays?
Scanlon, T.J., Luben, R.N., Scanlon, F.L., Singleton, N. (1993), "Is Friday the 13th Bad For Your Health?," BMJ, 307, 1584-1586.
Variable Names:
1. Dataset: Identifies source dataset (traffic, shopping, or accident)
2. Dates: year and month in which the Friday the 13th occurred
3. 6th: Number of cars passing through junction (traffic dataset), shoppers for each supermarket (shopping dataset), or admissions due to transport accidents (accident dataset) on Friday the 6th
4. 13th: Number of cars passing through junction (traffic dataset), shoppers for each supermarket (shopping dataset), or admissions due to transport accidents (accident dataset) on Friday the 13th
5. Location: Motorway junction (traffic dataset), supermarket location (shopping dataset) or hospital (accident dataset) to which the data correspond
Data set: friday
Rates of AIDS in Americas (1995)
Data on the incidence of AIDS per 100,000 population in 1995 for the Americas, as reported by the World Health Organization (WHO).
Data set: aids
Cost and Income of Movies
Cost and income of movies produced by Castle Rock Entertainment, in millions of dollars.
Data set: movies
Headaches and Pain Reliever
A pharmaceutical company set up an experiment in which patients
with a common type of headache were treated with a new analgesic or
pain reliever. The analgesic was given to each patient in one of four
dosage levels: 2, 5, 7 or 10 grams. Then the time until noticeable relief
was recorded in minutes. In addition the sex and the blood pressure of
each patient were recorded. The blood pressure groups were formed by
comparing each patient's diastolic and systolic pressure reading with
historical data. Based on this comparison the patients are assigned to
one of three types: low (0.25), medium (0.5), high (0.75) according to
the respective quantiles of the historic data.
Data set: headache
Salaries
We have the salaries of the men and the women in a company. We also have their ratings on evaluations for 2004 and 2005.
Data set: salaries
Polio cases before the Salk Vaccine
Annual number of polio cases from 1930-1955 in the US
Data set: polio
Age by Gender in US and PR (Census 2000)
Breakdown of the population of USA and Puerto Rico by age and gender, according to the 2000 Census
Data set: Puerto Rico: agesex, all of US: agesexUS
Rogaine
Rogaine is the first treatment for hair loss approved by the Food and Drug Administration. Here we have the results of one of the studies that were done to show that Rogaine works. A randomized clinical trial was carried out. 1431 bald men were randomly assigned to two groups. The men in the treatment group received Rogaine; the men in the control group received a placebo. After some time the men were examined and assigned to one of 5 groups:
No Growth = no difference
New Vellus = some hair follicles
Min Growth = minimal hair growth
Mod Growth = moderate hair growth
Den Growth = dense hair growth
Here is the original statistical analysis by the Food and Drug Administration used to approve Rogaine.
Data set: rogaine
Sex Ratios by State (American Community Survey 2004)
Sex ratios by state, without Puerto Rico. Data from the American Community Survey 2004. Includes upper and lower limits of 90% confidence interval.
Data set: sexratio
Deaths in Car Accidents (2002)
Number of fatalities in car crashes in the US, by age and gender. Source: National Center for Injury Prevention and Control
Data set: cardeaths
10 Leading Causes of Death
Table with the 10 leading causes of death, by age, in 2001. Source: National Center for Injury Prevention and Control.
Table:
Evaluating Instructors
A survey of business students at nine colleges in the United States was taken to determine the instructor behaviors that the students feel are more likely to contribute to their academic success. 215 undergraduate students enrolled in business classes were asked to list the instructor classroom behaviors they felt were important to their academic success. A questionnaire consisting of 51 instructor behaviors was studied. The data are for 735 students from nine business colleges randomly selected from the 1990/1991 AACSB Handbook.
Variable Names:
Behavior: The 51 instructor behaviors on the survey
IM: Number of students who responded "Important for academic success"
NU: Number of students who responded "Neither important nor unimportant for academic success"
NI: Number of students who responded "Not important for academic success"
Data set: instructor
Study Habits of College Students
The Survey of Study Habits and Attitudes (SSHA) is a psychological test that measures the motivation, attitude towards school, and the study habits of students. Scores range from 0 to 200. We have the scores of 20 male and 18 female first-year students at a private college. (taken from David Moore: The Active Practice of Statistics)
Data set: studyhabits
Drownings
Data is from O'Carroll PW, Alkon E, Weiss B. Drowning mortality in Los Angeles County, 1976 to 1984, JAMA, 1988 Jul 15;260(3):380-3.
Drowning is the fourth leading cause of unintentional injury death in Los Angeles County. We examined data collected by the Los Angeles County Coroner's Office on drownings that occurred in the county from 1976 through 1984. There were 1587 drownings (1130 males and 457 females) during this nine-year period, for an annual rate of 2.36 drownings per 100,000 persons (3.44 for males and 1.33 for females). The largest proportion of drownings (44.5%) for both sexes, and in almost every age group, occurred in private swimming pools. Children 2 to 3 years of age had the highest swimming-pool drowning rate (7.95). The elderly also experienced high drowning rates, primarily in swimming pools and bathtubs. Drowning-site profiles varied dramatically by age and sex. These findings indicate a need for Los Angeles County to address the problem of drownings among infants and toddlers in private swimming pools and to investigate the failure of regulations requiring fencing of swimming pools to prevent these deaths. These findings also suggest several potential opportunities for preventive intervention by physicians and demonstrate that health professionals cannot rely on national drowning-site profiles when developing local drowning prevention strategies.
Data set: drownings