30 April 2023 13:51 PM | UPDATED 12 months ago

Statistics Research And Report

Statistics Research and Report Assignment – Brief
Date Due: Week 11
Worth: 20% of your final grade

In this assignment you will examine data used by a real estate investment advisor. She wants you to answer some specific questions put by clients about house prices in the neighbourhood encompassed by 4 suburbs around the city of Melbourne. The data is contained in the file ‘Real_Estate.xls’, which has the following columns (variables):

Variable Name    Description
ID               House identity number
Price            Selling price of the house (in 000’s)
Bedrooms         Number of bedrooms
Size             House size (m²)
Pool             0 = House without a pool, 1 = House with a pool
Distance         Distance from city centre (km)
Suburb           Suburb number
Garage           0 = House without a garage, 1 = House with a garage

Random Sample:

Before you begin your analysis, you are required to take a random sample of size 150 from the 170 cases in the file. Use the file Random_Sample_Generator-2.xls to do this. Answers to the questions below are to be based on your sample of 110 cases. Make sure to keep a safe copy of the sample you use since you cannot use Random_Sample_Generator-2.xls to reproduce it. Provide a printout of the data in your sample, with ID numbers in ascending order.
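
The idea behind the sampling step can be sketched in a few lines of Python. This is only an illustration of simple random sampling without replacement; the assignment itself requires the spreadsheet generator named above. The ID range, the sample size `n` and the `seed` value here are hypothetical placeholders, and the helper name `draw_sample` is made up for this sketch.

```python
import random

def draw_sample(ids, n, seed):
    """Draw n distinct IDs without replacement, returned in ascending order."""
    rng = random.Random(seed)  # a fixed seed makes the same draw repeatable
    return sorted(rng.sample(list(ids), n))

# Hypothetical ID range and sample size, for illustration only.
population_ids = range(1, 21)
sample_ids = draw_sample(population_ids, n=5, seed=2023)
print(sample_ids)
```

Unlike the spreadsheet generator, a seeded draw like this one can be reproduced later, but keeping a safe copy of the sample is still the simplest safeguard.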

Part 1: Initial Data Analysis

  1. Variable List (8 points)
    a. Using the variables listed in the table above, state for each variable whether it is qualitative or quantitative.
    b. If it is qualitative, state whether it is nominal or ordinal; if it is quantitative, state whether it is discrete or continuous.
  2. Histogram (3 x 4 = 12 points)
    a. Create a histogram showing the distribution of the selling price of the houses.
    b. Comment on the shape of the distribution: is it symmetric? If it is not, is it positively or negatively skewed?
    c. Are there any outliers present? If so, are they of particular interest?
    d. State which central measure would be best to use to describe the centre of this distribution, and the reason(s) why.
  3. Descriptive statistics (30 points)
    a. Prepare a table that shows the 5-number summary of Price for houses in the 4 suburbs.
    b. Construct side-by-side boxplots for the Price of the houses in the 4 suburbs. Briefly comment on any differences you observe in house price between the suburbs.
    c. Are there any outliers present? If so, are they of particular interest?
    d. State which central measure would be best to use to describe the centre of this distribution, and the reason(s) why.
    e. Prepare a summary table that shows the mean and standard deviation of Price for houses in the 4 suburbs, broken down by the variable Bedrooms. Think carefully about the layout of the rows and columns of your table. As well as means and standard deviations, include the number of houses in each group, so that each cell in your final table contains the mean, the standard deviation and n, the number of houses in that group.
    f. Refer to part (e). Comment, in bullet point form, on the Price for any combinations of the Suburb and Bedrooms variables (i.e. cells in the table).
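
As a rough illustration of the quantities asked for in this section, the following Python sketch computes a 5-number summary, flags outliers, and builds the mean / standard deviation / n for each Suburb and Bedrooms cell. The rows are made-up values, not data from Real_Estate.xls, and the function names are hypothetical. Two assumptions are worth noting: quartiles use the "inclusive" interpolation method (other tools may use a slightly different convention), and outliers are flagged with the common 1.5 x IQR rule, which the brief does not prescribe.

```python
import statistics
from collections import defaultdict

# Hypothetical rows: (suburb, bedrooms, price in 000's). Illustration only.
rows = [
    (1, 3, 520.0), (1, 3, 560.0), (1, 4, 640.0), (1, 4, 700.0),
    (2, 3, 480.0), (2, 3, 450.0), (2, 4, 610.0), (2, 4, 590.0),
]

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using inclusive quartiles."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)

def iqr_outliers(values):
    """Values lying more than 1.5 * IQR beyond the quartiles (assumed rule)."""
    _, q1, _, q3, _ = five_number_summary(values)
    spread = 1.5 * (q3 - q1)
    return [v for v in values if v < q1 - spread or v > q3 + spread]

def cell_summary(rows):
    """Map (suburb, bedrooms) -> (mean, sample standard deviation, n)."""
    groups = defaultdict(list)
    for suburb, bedrooms, price in rows:
        groups[(suburb, bedrooms)].append(price)
    return {
        key: (
            statistics.mean(prices),
            # The sample standard deviation needs at least two values.
            statistics.stdev(prices) if len(prices) > 1 else float("nan"),
            len(prices),
        )
        for key, prices in sorted(groups.items())
    }

# 5-number summary of price per suburb
for suburb in sorted({r[0] for r in rows}):
    prices = [r[2] for r in rows if r[0] == suburb]
    print(suburb, five_number_summary(prices))

# mean, standard deviation and n per (suburb, bedrooms) cell
for cell, stats in cell_summary(rows).items():
    print(cell, stats)
```

A cell with very few houses gives an unstable mean and standard deviation, which is one reason the brief asks for n in every cell.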

  4. Statistical inferences (5 + 5 + 10 = 20 points)

One of the clients wants information on the size of houses as it relates to price.

    a. Produce a scatter plot of Price vs Size (Size should be on the horizontal axis). Make sure you label your axes properly and that your graph has an appropriate title.
    b. Refer to part (a). Briefly describe the nature of the relationship between these 2 variables.
    c. Now, create a new variable (column) labelled Size Group which divides Size into two size groups as follows:

         Under 200 square metres       Small
         200 square metres and over    Large

      i. Produce suitable graphs or charts to help in providing the information requested on the Size of the house as it relates to Price.
      ii. Construct a 95% confidence interval for the mean Price of small houses and of large houses.
      iii. Refer to (ii). Is there any overlap between the 2 confidence intervals? What does this tell you about the Prices for the two Size Groups?
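
The Size Group split and the confidence-interval comparison can be sketched as follows. This is a minimal illustration under stated assumptions, not a model answer: the house records are invented, the function names are hypothetical, and the interval uses the normal critical value (about 1.96 for 95%) as an approximation. With small groups, a t critical value would give a somewhat wider interval.

```python
import math
import statistics

def size_group(size_m2):
    """Under 200 square metres is Small; 200 and over is Large."""
    return "Small" if size_m2 < 200 else "Large"

def mean_ci(values, confidence=0.95):
    """Approximate confidence interval for the mean (normal critical value)."""
    n = len(values)
    mean = statistics.mean(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * standard_error, mean + z * standard_error

def intervals_overlap(a, b):
    """True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

# Hypothetical (size in m2, price in 000's) pairs. Illustration only.
houses = [(150, 400), (180, 450), (190, 430), (210, 600), (250, 650), (300, 700)]

prices = {"Small": [], "Large": []}
for size, price in houses:
    prices[size_group(size)].append(price)

ci_small = mean_ci(prices["Small"])
ci_large = mean_ci(prices["Large"])
print("Small:", ci_small)
print("Large:", ci_large)
print("Overlap:", intervals_overlap(ci_small, ci_large))
```

If the two intervals do not overlap, that suggests the mean prices of the two size groups differ; if they do overlap, the intervals alone do not settle the question.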

Part 2: Research Questions

Based on your random sample, identify and investigate TWO research questions of your own using inferential statistics (estimation and hypothesis testing). (30 points)


