Introduction to ESMA 3102

This page discusses some general concepts of ESMA 3102.

3101 vs. 3102

In ESMA 3101 (3015) we were mainly concerned with answering questions about one variable at a time. We considered problems like these:
What is the average height of men in Puerto Rico? (Find the mean or median, or draw a histogram or boxplot, or find a confidence interval)
Are men in Puerto Rico on average taller than 5'10''? (do a hypothesis test)
Has the average income in Puerto Rico gone up in the last ten years? (hypothesis test)

In ESMA 3102 we are going to study two (or more) variables simultaneously, and we are really interested in their relationships:
Is the average height of men in Puerto Rico different from men in the USA and from men in Europe?
How does the average height of men relate to things like their economic status (income), their race, their diet, etc.?
How does the average income in Puerto Rico depend on the economic policies of the Government?

Discrete vs. Continuous Variables

We categorize variables as follows:
Type: Discrete
Description: May or may not be numeric (words, dates, etc.). If the values are numbers, then relatively few different values are repeated many times.
Examples:
1) Day of the week on which a person was born
2) Age at which a student graduated from High School
3) Number of times a student took precalculus until they passed

Type: Continuous
Description: Always numbers. Almost all values are different, with few if any repetitions.
Examples:
1) Yearly Income of a family in Puerto Rico
2) Weight of a person entering a weight loss program
3) Gasoline consumption of a car using a special brand of gasoline

For more on data types see page 32 of the textbook.
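
The descriptions above can be turned into a rough rule of thumb. The following is only a sketch: the function name classify_variable and the cutoff max_distinct_share are assumptions made for illustration, not something from the course or the textbook, and the sample values are made up.

```python
def classify_variable(values, max_distinct_share=0.5):
    """Guess whether a variable is discrete or continuous.

    max_distinct_share is an illustrative cutoff (an assumption),
    not a rule from the course.
    """
    values = list(values)
    if not values:
        raise ValueError("need at least one value")
    # Non-numeric data (words, dates, etc.) is always discrete.
    numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )
    if not numeric:
        return "discrete"
    # Numeric data: few different values repeated many times -> discrete,
    # almost all values different -> continuous.
    distinct_share = len(set(values)) / len(values)
    return "discrete" if distinct_share <= max_distinct_share else "continuous"


# Made-up sample values, for illustration only.
birth_days = ["Mon", "Tue", "Mon", "Fri"]
graduation_ages = [17, 18, 18, 17, 18, 19, 18, 17, 18, 18]
yearly_incomes = [23150.25, 41200.0, 18975.5, 30410.75]

print(classify_variable(birth_days))
print(classify_variable(graduation_ages))
print(classify_variable(yearly_incomes))
```

With very small data sets a rule like this is unreliable, so in practice the decision should rest on what the variable measures, not only on the values observed.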

Predictor - Response Paradigm

It is often useful to think of the problems we discuss in this class as trying to use one (or more) variables to predict another:
Predictor(s) | Response
Gender | Income
GPA in high school, points on college boards | GPA after the freshman year in college
Whether fertilizer was used or not | Yield of crop
Size of lot, size of house, number of bedrooms, quality of neighborhood | Price of house

Types of Problems in 3102

Depending on the type of data we need to use different methods of analysis. Here is a table to help with this:
Response | Predictor(s) | Method
Continuous | At least one continuous | Regression
Discrete | Discrete | Categorical data analysis
Continuous | All discrete | Analysis of Variance (ANOVA)

Warning: This table may be the most important item for you to learn - understand - memorize - use. Without it you cannot pass this class, or do Statistics in real life!
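
One way to practice using the table is to write it as a small lookup. This is a minimal sketch; the function name choose_method and the string labels are assumptions made for illustration. The table does not cover a discrete response with a continuous predictor, so the sketch returns None in that case.

```python
def choose_method(response, predictors):
    """Return the method from the table for the given variable types.

    response:   "continuous" or "discrete"
    predictors: list with one "continuous"/"discrete" entry per predictor
    """
    valid = {"continuous", "discrete"}
    if response not in valid or not predictors:
        raise ValueError("response must be valid and predictors non-empty")
    if any(p not in valid for p in predictors):
        raise ValueError("each predictor must be 'continuous' or 'discrete'")

    if response == "continuous":
        # At least one continuous predictor -> Regression,
        # all predictors discrete -> ANOVA.
        if "continuous" in predictors:
            return "Regression"
        return "Analysis of Variance (ANOVA)"

    # Discrete response.
    if all(p == "discrete" for p in predictors):
        return "Categorical data analysis"
    return None  # combination not listed in the table


# Gender (discrete) -> Income (continuous)
print(choose_method("continuous", ["discrete"]))
# Lot size, house size, bedrooms, neighborhood quality -> Price of house
print(choose_method("continuous", ["continuous", "continuous", "discrete", "discrete"]))
```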

Statistical Models

Many of the problems in ESMA 3102 will require us to find a model for the data. A model is just another word for an equation giving a relationship between the predictors and the response.

Model | Type
Income = 15312 + 311·Years of Service | Simple linear model
Amount of Radioactive Material = 1.23·10^(-0.78·Time) | Exponential model
Price of House = 15113 + 2450·Size of Lot + 3425·Size of House + 945·Number of Bedrooms | Multiple linear model
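
Since a model is just an equation, using it for prediction means plugging in values for the predictors. The sketch below evaluates the three example models. The function names are made up for illustration, the units of the predictors are not given on this page, and the exponential model is read as 1.23·10^(-0.78·Time), which is an assumption about how the exponent was originally typeset.

```python
def income(years_of_service):
    """Simple linear model: one predictor."""
    return 15312 + 311 * years_of_service


def radioactive_amount(time):
    """Exponential model, assuming the exponent is -0.78 * time."""
    return 1.23 * 10 ** (-0.78 * time)


def house_price(size_of_lot, size_of_house, number_of_bedrooms):
    """Multiple linear model: several predictors."""
    return (15113
            + 2450 * size_of_lot
            + 3425 * size_of_house
            + 945 * number_of_bedrooms)


print(income(10))                # 15312 + 311*10 = 18422
print(radioactive_amount(0))     # at time 0 the amount is 1.23
print(house_price(2, 1, 3))      # 15113 + 4900 + 3425 + 2835 = 26273
```

In the linear models each coefficient is the change in the predicted response when that predictor goes up by one unit and the others stay fixed. For example, one more year of service adds 311 to the predicted income.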