Nayland College Mathematics; More than just a school  
 
 

 

 
 
 

AS 3.5 Bivariate Data


Achievement Standard 3.5 Bivariate Data is an internal assessment worth 3 credits.

Overview | Achievement Standard

1

 

What is Bivariate Data?

Height and Arm span measurement. Data table
Data recorded.
Is there a relationship between the variables?
Graphing the data appropriately.
Variables need to be specified, with units and causality considered.

MR Notes | Blank

You will be provided with data that has several columns of values: McDonalds data set | Excel version

You will need to select appropriate pairs of data sets to establish if a relationship exists. McDonalds Aims

What are continuous variables?

Learn more about the aim of your investigation and variables | McDonalds Variables

AIS p103

 Scatterplots

What is a scatterplot?
When investigating bivariate data, making a scatterplot is the first step.
What types of scatterplot associations are possible?

If the scatterplot indicates a linear model is appropriate then we can proceed.

The relationship can be positive or negative.

What are 'Regression' and 'Correlation'?

MR Scatterplot Notes | Blank

Ex 13.01 p???

How to make a scatterplot using Excel
Learn more about describing scatterplots

AIS  p99

 

3

 

What is the Correlation Coefficient 'r'?

What is Covariance?
The correlation coefficient 'r' gives two pieces of information:
The DIRECTION of the relationship: positive or negative.
The STRENGTH of the relationship: from r = ±1 (a perfect linear relationship) to r = 0 (no linear relationship).

The correlation coefficient measures the degree of linear ASSOCIATION between two variables.
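As a rough illustration of how 'r' is built from the covariance and the two standard deviations, here is a minimal Python sketch. Python is not part of this course (Excel's CORREL function does the same job), the height and arm span values are made up for illustration, and the function names are hypothetical.

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    """Sample covariance: sum of products of deviations, divided by n - 1."""
    mean_x, mean_y = mean(xs), mean(ys)
    total = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return total / (len(xs) - 1)

def std_dev(values):
    """Sample standard deviation (n - 1 divisor)."""
    m = mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

def correlation(xs, ys):
    """r = covariance divided by the product of the standard deviations."""
    return covariance(xs, ys) / (std_dev(xs) * std_dev(ys))

# Made-up illustrative data, NOT a real class data set
height_cm = [150, 160, 165, 170, 180]
arm_span_cm = [148, 158, 167, 171, 183]

r = correlation(height_cm, arm_span_cm)
print(f"r = {r:.3f}")
```

The sign of the result gives the direction and its size (how close it is to 1 or -1) gives the strength.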

Regression is the process of fitting a line to bivariate data to predict 'y' values from the value of an 'x'.

The differences between the actual and predicted values of 'y' are the 'residuals'. The aim of least squares regression is to make the residuals as small as possible overall, by minimising the sum of their squares.
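The fitting step can be sketched in Python as below. This is only an illustration under stated assumptions: it uses the standard least squares formulas for the gradient and intercept, the function name fit_line is hypothetical, and the data are made up. In class the same line comes from Excel's linear trendline.

```python
def fit_line(xs, ys):
    """Return (m, c) for the least squares line y = m*x + c."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sum of squared x deviations
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx            # gradient
    c = mean_y - m * mean_x  # the line passes through (mean_x, mean_y)
    return m, c

# Made-up illustrative data, NOT a real class data set
height_cm = [150, 160, 165, 170, 180]
arm_span_cm = [148, 158, 167, 171, 183]

m, c = fit_line(height_cm, arm_span_cm)
print(f"arm span = {m:.2f} x height + ({c:.1f})")
```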

MR Regression Notes | Blank Notes

Reminder of the Standard Deviation

Learn more about the correlation coefficient 'r'
Using CORREL Excel function
Learn more about Regression

McDonalds Scatterplots

 

What is the Coefficient of Determination 'R²'?

The Coefficient of Determination R² is the proportion of the variation in the dependent variable that can be explained by the model.
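One way to see what "proportion of the variation explained" means is to compute it directly: compare the variation left over after fitting the line with the total variation in 'y'. The Python sketch below does this for a least squares line. It is an illustration only, with made-up data and a hypothetical function name.

```python
def r_squared(xs, ys):
    """Proportion of the variation in y explained by the least squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    c = mean_y - m * mean_x
    # Variation NOT explained by the line (sum of squared residuals)
    ss_res = sum((y - (m * x + c)) ** 2 for x, y in zip(xs, ys))
    # Total variation in y about its mean
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Made-up illustrative data, NOT a real class data set
height_cm = [150, 160, 165, 170, 180]
arm_span_cm = [148, 158, 167, 171, 183]

r2 = r_squared(height_cm, arm_span_cm)
print(f"About {r2:.1%} of the variation in arm span is explained by height.")
```

For a linear model this value equals the square of the correlation coefficient 'r'.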

MR notes, Blank Notes

Ex 15.01 p317

Learn more about R²

McDonalds Coefficient of Determination and discussion of the Linear Model

AIS  p100, 110

Explaining the Linear Model

y = mx + c

The equation modelling the relationship indicates that for every one-unit increase in 'x', 'y' increases (or decreases) by 'm' units on average (with 'm' rounded appropriately).
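A small Python sketch of turning a gradient into a sentence in context is given below. The function name, the wording of the sentence and the example gradient of 1.18 are all hypothetical; adapt the wording to your own variables and units.

```python
def describe_gradient(m, x_name, x_unit, y_name, y_unit, decimals=2):
    """Build a sentence interpreting the gradient m of y = m*x + c in context."""
    # A gradient of exactly zero is reported as an increase of 0
    direction = "increases" if m >= 0 else "decreases"
    size = f"{abs(m):.{decimals}f}"
    return (f"For every 1 {x_unit} increase in {x_name}, "
            f"{y_name} {direction} by about {size} {y_unit} on average.")

# Hypothetical gradient from a height / arm span trend line
sentence = describe_gradient(1.18, "height", "cm", "arm span", "cm")
print(sentence)
```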

Progress check: Longjump data set

Save in: \\MANAGEMENT\Maths$\Student Data\Mr RILEY\MRY 3A Stats

5

 

Correlation, Causality & Outliers

Does Correlation imply Causality?

Two variables may have a high correlation, but this does not always mean that one variable influences the other. There may be other lurking variables causing both the variables to change.
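The effect of a lurking variable can be sketched with simulated data. In the Python example below, temperature drives both of the other variables, which never influence each other directly, yet they still end up strongly correlated. Everything here is an assumption made for illustration: the variable names, the multipliers and the random noise are invented, not real data.

```python
import random
from math import sqrt

def correlation(xs, ys):
    """Correlation coefficient r from sums of deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

random.seed(1)  # fixed seed so the simulation is repeatable

# Simulated lurking variable: daily temperature
temperature = [random.uniform(10, 30) for _ in range(200)]

# Both variables depend on temperature plus random noise,
# but neither depends on the other
ice_cream_sales = [2 * t + random.gauss(0, 5) for t in temperature]
swimmers_at_beach = [3 * t + random.gauss(0, 5) for t in temperature]

r = correlation(ice_cream_sales, swimmers_at_beach)
print(f"r = {r:.2f}")
```

A high 'r' here does not mean ice cream sales cause people to swim; both change because the temperature changes.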

How do you deal with Outliers?

Is the outlier an error or just different?
Do NOT remove outliers unless they are errors.
Is it appropriate to analyse the data twice, once with and once without the outlier?
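Analysing the data twice can be sketched in Python as follows. The data are made up so that the last point is an obvious outlier, and the function name is hypothetical. The point of the sketch is the comparison, not a rule that the outlier should be dropped.

```python
from math import sqrt

def correlation(xs, ys):
    """Correlation coefficient r from sums of deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up data: the last point (6, 0) does not follow the pattern
xs = [1, 2, 3, 4, 5, 6]
ys = [2, 4, 6, 8, 10, 0]

r_with = correlation(xs, ys)
r_without = correlation(xs[:-1], ys[:-1])  # same analysis, outlier left out

print(f"With the outlier:    r = {r_with:.2f}")
print(f"Without the outlier: r = {r_without:.2f}")
```

Report both results and explain the difference; only leave the point out of the final model if it is an error.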

MR notes on Causality & Outliers, Blank Notes

Ex 13.04 pg 269, Ex 13.05 pg 273

Learn more about causality | Learn more about removing outliers | McDonalds outlier

AIS  p104, 109

What are Residuals?

Residuals are the prediction errors: the vertical distances by which data values lie above or below the line of best fit.
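A minimal Python sketch of calculating residuals is given below. It assumes a trend line has already been found; the gradient 1.18 and intercept -29.3 are the least squares values for the made-up data shown, and all names are hypothetical.

```python
# Made-up illustrative data, NOT a real class data set
height_cm = [150, 160, 165, 170, 180]
arm_span_cm = [148, 158, 167, 171, 183]

# Assumed trend line for these data: arm span = 1.18 x height - 29.3
m, c = 1.18, -29.3

def residuals(xs, ys, m, c):
    """Residual = actual y minus the y predicted by the line."""
    return [y - (m * x + c) for x, y in zip(xs, ys)]

res = [round(e, 1) for e in residuals(height_cm, arm_span_cm, m, c)]
print(res)

# Positive residual: the point lies above the line.
# Negative residual: the point lies below the line.
```

Plotting these residuals against 'x' gives the residual graph; a random scatter about zero suggests the linear model is appropriate.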

MR notes on Residuals, Blank Notes

Learn more about Residuals | McDonalds Residuals

AIS  p108

Progress check: Internet/GDP data set (Achieve & Merit - includes interpolation & extrapolation)

Save in: \\MANAGEMENT\Maths$\Student Data\Mr RILEY\MRY 3A Stats

7

 

What is interpolation and extrapolation?

Interpolation is the prediction of values WITHIN the data range using the model. Extrapolation is the prediction of values OUTSIDE the data range using the model.
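The distinction can be sketched in Python by checking whether the 'x' value used for a prediction lies inside the range of the observed 'x' values. The function name, the made-up data and the assumed trend line (gradient 1.18, intercept -29.3) are for illustration only.

```python
def predict(x, m, c, observed_xs):
    """Predict y from the line y = m*x + c and label the kind of prediction."""
    y_hat = m * x + c
    if min(observed_xs) <= x <= max(observed_xs):
        kind = "interpolation"   # x is within the data range
    else:
        kind = "extrapolation"   # x is outside the data range
    return y_hat, kind

# Made-up illustrative data and an assumed trend line for it
height_cm = [150, 160, 165, 170, 180]
m, c = 1.18, -29.3

for x in (165, 200):
    y_hat, kind = predict(x, m, c, height_cm)
    print(f"height {x} cm -> predicted arm span {y_hat:.1f} cm ({kind})")
```

An extrapolated value should always be discussed with caution, because there is no data to show that the relationship continues outside the observed range.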

MR Interpolation Notes | Blank

Learn more about interpolation and extrapolation | McDonalds interpolation and extrapolation

AIS  p101

Investigate the Cars data set
Pick an appropriate pair of variables.
1) Produce an appropriate scatter plot. Define variables used.
2) Add the Linear Trend line.
3) Use the CORREL Excel function to calculate 'r'. How does this compare with R²?
4) Discuss the scatterplot in general terms.
5) Discuss the meaning of 'r' in context.
6) Discuss the meaning of R² in context.
7) Discuss the gradient of the linear trend line in context.

8) Discuss correlation and causation in context.
9) Produce and discuss a graph of residuals.
10) Interpolate a value and extrapolate a value (with discussion).

11) Rename the spreadsheet with your name and save into the postoffice MR stats folder.
\\Applications\vpo\3A Stats MR

 

 

 

Writing a report

Regression analysis involves these steps...

1) Construct a scatterplot
2) Identify whether the variables are simply 'associated', or are 'explanatory' and 'dependent'
3) Investigate the effects of groups and outliers
4) Calculate and discuss the correlation coefficient 'r'
5) Form a regression model and interpolate and extrapolate
6) What possible assumptions, limitations, improvements, bias, other models are possible?

Sigma Example pg 299, answer p300 Excel data

Assessment Criteria , Report Suggestions

Ex 14.03

Q2 Data
Q3 Data
Q4 Data

Q2 Ans
Q3 Ans
Q4 Ans

 

 

AIS  p104, 105, 106, 111, 112, 113, 114, 115

Analyse the Life Expectancy data

Rename the spreadsheet with your name and save into the postoffice MR stats folder.
\\MANAGEMENT\Maths$\Student Data\Mr RILEY\MRY 3A Stats

Ms Barks Class
\\MANAGEMENT\Maths$\Student Data\KBS\3A_Statistics_2010

Cut & Paste

9

 

Non-linear regression

Using a curve to fit a set of bivariate data.
Be careful when using R² (p307).
Never use the correlation coefficient 'r' when discussing a non-linear model.
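A short Python sketch shows why 'r' can mislead for a curved relationship. The made-up data below follow y = x² exactly, so 'y' is completely determined by 'x', yet the correlation coefficient comes out as zero because 'r' only measures linear association. The function name is hypothetical.

```python
from math import sqrt

def correlation(xs, ys):
    """Correlation coefficient r from sums of deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up data lying exactly on the curve y = x squared
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r = correlation(xs, ys)
print(f"r = {r:.2f} even though y is completely determined by x")
```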

Ex 14.04

Q1 Data
Q2 Data
Q3 Data
Q4 Data
Q5 Data

Q1 Ans
Q2 Ans
Q3 Ans
Q4 Ans
Q5 Ans

McDonalds non-linear

AIS  p107

 

Revision

Complete an investigation on volcanic islands Excel Data set

Good luck in the assessment.

AIS  p116

"Achieving in Statistics" by W. Geldof data sets. Practice Assessments, gradually becoming more self-directed.
If you have purchased the book from the student office for $16 you can have your very own answers and check your progress.

pg103 Flight Information | pg104 Plane Facts | pg108 Used Cars | pg111 United Nations | pg 114 Boat Data

Bivariate Data Powerpoint

Applets

  • Regression by eye A scatter plot is displayed and you can draw in regression lines by hand. You can then compare your lines to the best least squares fit. You can also try to guess the correlation coefficient, r.
  • Guess the correlation coefficient competition Four scatter plots and 4 correlation coefficients and your task is to match the coefficients to the plots. New plots can be generated and a running score is kept.
  • Build your own scatter plot You can add points to the plot and move points around. See the effects on the least squares fitted line (the plotted line and its equation) and the correlation coefficient.
  • Components of r The slope, standard error of the estimate, and the standard deviation of X can all be manipulated independently to see the effect on the scatter plot, r and a visual representation of R-squared.
  • Scatter plot and correlation coefficient The sample size, the points on the scatter plot and the correlation coefficient can all be manipulated independently for you to see their effect on each other.
  • Nonlinear examples and the correlation coefficient

If you go to a page with an applet and you do not have the proper plug-in, you should be prompted automatically to download the plug-in. If this does not happen, download and execute the Java plug-in given here Java 2.

weblinks
Auckland Uni Stats Dept - link to all sorts of good stuff
   

 
