Statistics Research and Report Assignment:

In this assignment you will examine data used by a real estate investment advisor. She wants you to answer some specific questions put by clients about house prices in the neighbourhood encompassed by 4 suburbs around the city of Melbourne. The data is contained in the file *Real_Estate.xls* and contains the following columns (variables):

| Variable Name | Description |
| --- | --- |
| ID | House identity number |
| Price | Selling price of the house (in 000s) |
| Bedrooms | Number of bedrooms |
| Size | House size (m²) |
| Pool | 0 = house without a pool, 1 = house with a pool |
| Distance | Distance from city centre (km) |
| Suburb | Suburb number |
| Garage | 0 = house without a garage, 1 = house with a garage |

#### Random Sample:

Before you begin your analysis, you are required to take a random sample of size 150 from the 170 cases in the file. Use the file *Random_Sample_Generator-2.xls* to do this. Answers to the questions below are to be based on your sample of 110 cases. Make sure to keep a safe copy of the sample you use, since you cannot use *Random_Sample_Generator-2.xls* to reproduce it. Provide a printout of the data in your sample, with **ID** numbers in ascending order.
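The spreadsheet generator must be used for the actual sample. Purely as an illustration of what drawing a simple random sample without replacement involves, a minimal Python sketch follows. It assumes that house IDs run from 1 to 170 and uses a sample size of 150 as a parameter; the function name `draw_sample` and the seed value are hypothetical choices, not part of the assignment files.

```python
import random


def draw_sample(ids, sample_size, seed):
    """Draw a simple random sample without replacement, sorted by ID."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return sorted(rng.sample(list(ids), sample_size))


# Assumption: IDs are the integers 1..170; adjust to match the real file.
sample_ids = draw_sample(range(1, 171), sample_size=150, seed=2024)

print(len(sample_ids))   # number of cases selected
print(sample_ids[:10])   # first few IDs, already in ascending order
```

Unlike the spreadsheet generator, a seeded draw of this kind can be regenerated, but keeping a saved copy of the sample is still the safer habit.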

#### Part 1: Initial Data Analysis

- Variable List
  - Using the variables listed in the table above, state for each variable whether it is qualitative or quantitative.

  - If it is qualitative, state whether it is nominal or ordinal, and if it is quantitative, state whether it is discrete or continuous.

- Histogram
  - Create a histogram showing the distribution of the selling **Price** of the houses.

  - Comment upon the shape of the distribution: is it symmetric? If it is not, is it positively or negatively skewed?

  - Are there any outliers present? If so, are they of particular interest?

  - State which central measure would be best to use to describe the centre of this distribution, and the reason(s) why.

- Descriptive statistics
  - Prepare a table that shows the 5-number summary of **Price** for houses in the 4 **Suburbs**.

  - Construct side-by-side boxplots for the price of the houses in the 4 suburbs. Briefly comment upon any differences you observe in house price across the suburbs.

  - Are there any outliers present? If so, are they of particular interest?

  - State which central measure would be best to use to describe the centre of this distribution, and the reason(s) why.

  - Prepare a summary table that shows the mean and standard deviation of **Price** for houses in the **4 Suburbs** according (subject) to the variable **Bedrooms**. Think carefully about the layout of the rows and columns of your table. As well as means and standard deviations, you should also include the number of houses in each group, so each cell in your final table should contain the *mean*, the *standard deviation* and *n*, the number of houses in that group.

  - Refer to part (e). Comment, in bullet point form, on the **Price** of any combinations of the **Suburb** and **Bedrooms** variables (i.e. cells in the table).
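As a rough guide to what the two tables in this section contain, the sketch below computes a 5-number summary per suburb and the mean, standard deviation and *n* for each Suburb by Bedrooms cell. The records are hypothetical toy values, not taken from *Real_Estate.xls*, and the helper name `five_number_summary` is an assumption. Quartile conventions differ between software packages, so the quartiles may not match those produced by your spreadsheet.

```python
from collections import defaultdict
from statistics import mean, quantiles, stdev

# Hypothetical toy records (NOT real data): (Suburb, Bedrooms, Price in 000s)
houses = [
    (1, 3, 520.0), (1, 3, 560.0), (1, 4, 640.0), (1, 4, 700.0),
    (2, 3, 430.0), (2, 3, 450.0), (2, 4, 510.0), (2, 4, 530.0),
]


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the 'inclusive' quartile rule."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return min(values), q1, q2, q3, max(values)


prices_by_suburb = defaultdict(list)
prices_by_cell = defaultdict(list)
for suburb, bedrooms, price in houses:
    prices_by_suburb[suburb].append(price)
    prices_by_cell[(suburb, bedrooms)].append(price)

# One row per suburb: the 5-number summary of Price.
for suburb in sorted(prices_by_suburb):
    print(suburb, five_number_summary(prices_by_suburb[suburb]))

# One cell per (Suburb, Bedrooms) pair: mean, standard deviation and n.
for cell in sorted(prices_by_cell):
    p = prices_by_cell[cell]
    sd = stdev(p) if len(p) > 1 else float("nan")  # sd needs at least 2 houses
    print(cell, round(mean(p), 1), round(sd, 1), len(p))
```

Cells with very few houses give unstable means and standard deviations, which is one reason *n* is reported alongside them.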

- Statistical inferences

  One of the clients wants information on the size of houses as it relates to price.

  - Produce a scatter plot of **Price** vs **Size** (**Size** should be on the horizontal axis). Make sure you label your axes properly and that your graph has an appropriate title.

  - Refer to part (a). Briefly describe the nature of the relationship between these 2 variables.

  - Now, create a new variable (column) labelled **Size Group** which divides **Size** up into two size groups as follows:

    | Size | Size Group |
    | --- | --- |
    | Under 200 square metres | Small |
    | 200 square metres and over | Large |

    - Produce suitable graphs or charts to help in providing the information requested on the **Size** of the house as it relates to **Price**.

    - Construct a 95% confidence interval for the **Price** of small houses and of large houses.

    - Refer to (ii). Is there any interaction (overlap) between the 2 confidence intervals? What does this tell you about the **Prices** for the two **Sizes**?
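The Size Group coding and the confidence interval comparison can be sketched as follows. This is a minimal illustration under stated assumptions: the records are hypothetical toy values, the helper names are invented for the example, and the interval uses a normal (z) critical value as a large-sample approximation. With small groups, a t critical value with n - 1 degrees of freedom would normally be used instead and would give a wider interval.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def size_group(size_m2):
    """Code Size into the two groups defined in the brief."""
    return "Small" if size_m2 < 200 else "Large"


def mean_ci(values, confidence=0.95):
    """Approximate CI for the mean using a normal critical value (assumption)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    centre = mean(values)
    half_width = z * stdev(values) / sqrt(len(values))
    return centre - half_width, centre + half_width


def intervals_overlap(a, b):
    """True if the two (lower, upper) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical toy records (NOT real data): (Size in m², Price in 000s)
houses = [
    (150, 420.0), (180, 455.0), (199, 470.0), (165, 440.0),
    (200, 560.0), (240, 610.0), (310, 690.0), (260, 640.0),
]

prices = {"Small": [], "Large": []}
for size, price in houses:
    prices[size_group(size)].append(price)

ci = {group: mean_ci(p) for group, p in prices.items()}
print(ci)
print(intervals_overlap(ci["Small"], ci["Large"]))
```

Intervals that do not overlap suggest that the mean prices of the two groups differ. Overlapping intervals are inconclusive on their own and do not show that the means are equal.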

#### Part 2: Research Questions

Based on your random sample, identify and investigate **TWO** research questions of your own using inferential statistics (estimation and hypothesis testing).
