Information Technology

CSC8002 Research Assessment

20 April 2023 15:55

Suppose you are working for the Australian Government as a “Data Scientist” to tackle COVID-19 or any other future pandemic. Google has released a dataset on people’s mobility during the pandemic. As a “Data Scientist”, can you find some critical information in that dataset that can help Australia tackle COVID-19?

In this assignment, you will augment the report from Assignment 1.

Tasks of the Assignment:

  • Explore the dataset based on your research question.
  • Identify the subset of the dataset that will help answer your research question.
  • Create a NoSQL database with the subset of the dataset.
  • Query the NoSQL database and analyse the data to answer your research question.

1.  Introduction (0):

  • Provide a brief discussion on the dataset details. (0)
    • From where did you download the dataset? (0)
    • How can this dataset help Australia to tackle COVID-19? Please justify. (0)

2.   Data Exploration (0):

  • Discuss the size of the dataset. (0)
  • Discuss the format of the dataset. (0)
  • Discuss the features (columns) of the dataset. (0)

3.  Literature Review (0):

  • Find at least two research works on Google Scholar (any preprint or published work) in which the researchers have used this dataset. (0)
  • Please provide a brief discussion on their research. How did the researchers use this dataset to answer their research question? (0)

4.  Research Question/Selection of the Problem (0):

  • Identify a research question that you can answer after analysing the dataset. (0)
    • Justify your research question. Why is your research question important? How can this help Australia tackle COVID-19 or any future pandemic? (0)

5.  Data Analysis to Select Subset Data (18):

  • Discuss how you analysed the dataset to select the useful subset (the subset that, once analysed, lets you answer your research question). (10)
    • List the steps you have taken to identify the useful subset of the dataset. (4)
    • Submit the code (Any programming language) as a separate file named “subset_dataselection”. (4) [Anything from software like Excel, PowerBI, Tableau is not accepted. In this case, you will get zero for this criterion.]
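
As an illustration of the subset-selection step, here is a minimal Python sketch. It assumes the research question concerns state-level workplace and residential mobility in Australia; that choice, the function name select_subset and the KEEP_COLUMNS list are hypothetical. The column names follow the published Community Mobility Reports CSV, but the data rows are invented sample values, and a real script would open the downloaded file instead of the inline string.

```python
import csv
import io

# Column names follow the published Community Mobility Reports CSV.
# The three data rows are invented sample values, not real measurements.
SAMPLE_CSV = """country_region_code,country_region,sub_region_1,sub_region_2,date,workplaces_percent_change_from_baseline,residential_percent_change_from_baseline
AU,Australia,,,2020-04-01,-30,15
AU,Australia,Queensland,,2020-04-01,-28,14
NZ,New Zealand,,,2020-04-01,-50,25
"""

# Hypothetical choice of columns for a workplace/residential research question.
KEEP_COLUMNS = [
    "sub_region_1",
    "date",
    "workplaces_percent_change_from_baseline",
    "residential_percent_change_from_baseline",
]


def select_subset(csv_file, country_code="AU"):
    """Return state-level rows for one country, keeping only KEEP_COLUMNS."""
    subset = []
    for row in csv.DictReader(csv_file):
        is_country = row["country_region_code"] == country_code
        # State-level rows have sub_region_1 set and sub_region_2 empty.
        is_state_level = row["sub_region_1"] != "" and row["sub_region_2"] == ""
        if is_country and is_state_level:
            subset.append({name: row[name] for name in KEEP_COLUMNS})
    return subset


rows = select_subset(io.StringIO(SAMPLE_CSV))
print(rows)
```

Each filter in the function corresponds to one step that the report would list and justify: restrict by country, restrict by geographic level, then keep only the columns the research question needs.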

6.  Database Design Based on the Subset Data (17):

  • Provide a discussion on the logical design of your NoSQL database. Include a figure showing the components and connections within the database. (5+5)
  • Justify your design. (2)
  • Submit the code (Any programming language) as a separate file named “subset_nosql_database_creation”. (5) [Results from software like Excel, PowerBI, Tableau will not be accepted. In this case, you will get zero for this criterion]
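
One possible logical design, sketched in Python under the assumption that a document-oriented NoSQL store is chosen (the brief does not prescribe one). Each flat row of the subset becomes a nested document with a region part and a mobility part. The function name to_document, the field names region, mobility, workplaces and residential, and the sample row are all hypothetical.

```python
import json


def to_document(row):
    """Convert one flat CSV-style row into a nested, JSON-ready document."""

    def as_int(value):
        # Empty cells (gaps in the data) become None instead of a number.
        return int(value) if value != "" else None

    return {
        "region": {"country_code": "AU", "state": row["sub_region_1"]},
        "date": row["date"],
        "mobility": {
            "workplaces": as_int(row["workplaces_percent_change_from_baseline"]),
            "residential": as_int(row["residential_percent_change_from_baseline"]),
        },
    }


# Invented sample row; the empty residential value models a gap in the data.
sample_row = {
    "sub_region_1": "Queensland",
    "date": "2020-04-01",
    "workplaces_percent_change_from_baseline": "-28",
    "residential_percent_change_from_baseline": "",
}

document = to_document(sample_row)
print(json.dumps(document, indent=2))
```

Grouping the mobility categories under one key keeps a whole day's observation for a state in a single document, which is one way to justify the design; other groupings (for example one document per state with an array of daily readings) are equally defensible.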

7.  Data Analysis (35):

  • Connect the database with any programming language of your choice. Write appropriate queries to fetch the required dataset and analyse it with any programming language of your choice. Submit the code (Any programming language) as a separate file named “code_for_analysis”. (2+3) [Anything from software like Excel, PowerBI, Tableau is not accepted. In this case, you will get zero for this criterion]
  • Provide a detailed analysis with appropriate visualisations to answer the research question. (15 {Analysis and Visualisations} + 15 {Relevant Discussions according to the Analysis and Visualisations})
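
A sketch of the analysis step in Python, assuming the query has already returned documents shaped as region, date and mobility fields (a hypothetical layout, not one required by the brief). It averages one mobility category per state and month. The documents below are invented sample values, and monthly_mean is a hypothetical helper name.

```python
from collections import defaultdict
from statistics import mean

# Invented documents standing in for the result of a database query.
documents = [
    {"region": {"state": "Queensland"}, "date": "2020-04-01", "mobility": {"workplaces": -28}},
    {"region": {"state": "Queensland"}, "date": "2020-04-02", "mobility": {"workplaces": -32}},
    {"region": {"state": "Queensland"}, "date": "2020-05-01", "mobility": {"workplaces": -20}},
    {"region": {"state": "Victoria"}, "date": "2020-04-01", "mobility": {"workplaces": -40}},
    {"region": {"state": "Victoria"}, "date": "2020-04-02", "mobility": {"workplaces": None}},
]


def monthly_mean(docs, category):
    """Average one mobility category per (state, 'YYYY-MM') pair."""
    grouped = defaultdict(list)
    for doc in docs:
        value = doc["mobility"].get(category)
        if value is None:
            continue  # skip gaps instead of treating them as zero
        month = doc["date"][:7]  # 'YYYY-MM-DD' -> 'YYYY-MM'
        grouped[(doc["region"]["state"], month)].append(value)
    return {key: mean(values) for key, values in grouped.items()}


result = monthly_mean(documents, "workplaces")
for (state, month), value in sorted(result.items()):
    print(state, month, value)
```

The resulting per-state monthly averages are the kind of table that can then be plotted and discussed against the research question.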

8.  Findings (5):

  • Provide the discussion to answer your research question based on the findings from the analysis. (5)

9.  Ethics and Privacy (3):

  • As this is a dataset of community mobility, how did Google protect the privacy of its users? (0)
  • Research Australian Law on collecting public data and show the validity of this community mobility dataset according to Australian Law. (3)

10.   Hosting on a remote server (15):

  • Please host a NoSQL database in the Azure cloud for public access, containing your subset data structured according to your design. Then record a video of yourself accessing this database from your personal computer using its public IP and performing at least three queries. Upload this video to Google Drive and share the link at the end of the report or in a separate file.
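
For the remote-access step, the sketch below assumes the hosted database is MongoDB listening on its default port 27017 on a virtual machine with a public IP; the brief does not mandate MongoDB. The helper build_mongodb_uri, the user name, the password and the database name are hypothetical, and 203.0.113.10 is a reserved documentation address rather than a real server. Credentials are percent-encoded so that characters such as "@" or ":" do not break the connection string.

```python
from urllib.parse import quote_plus


def build_mongodb_uri(user, password, host, port=27017, database="mobility"):
    """Build a MongoDB connection string with percent-encoded credentials."""
    return (
        f"mongodb://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


# All values below are placeholders for illustration only.
uri = build_mongodb_uri("report_reader", "p@ss:word/1", "203.0.113.10")
print(uri)
```

A client library such as pymongo could then receive this string, for example as MongoClient(uri), to run the queries shown in the video.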

11.  Writing Style and Report Format (7):

  • The report is written clearly, and the sections are connected with Assignment 1. (4)
    • The report follows the given structure. (1)
    • Proper and correct in-text citation is presented in the report. (1)
    • The report cannot exceed fifteen pages (the page count includes everything: the table of contents, references and appendix). Any font of size 12pt is accepted. (1)

12.   Submission Format:

  • Please submit the report in PDF format. Any other submission format is not accepted. Submit the code files in their native format; the results should be reproducible.

13.  Late Submission and Extension Request:

Please refer to USQ Policy Library – Assessment Procedure for information on the late submission policy and the USQ Policy Library – Assessment of Compassionate and Compelling Circumstances Procedure for the special circumstances considered in extension requests.

If you have any difficulty meeting the deadline, please apply for an extension.

USQ has zero tolerance for academic misconduct, including plagiarism and collusion. Plagiarism is presenting someone else’s work as if you wrote it yourself. Collusion is a specific type of cheating that occurs when two or more students exceed a permitted level of collaboration on a piece of assessment. Identical layout, identical mistakes, identical arguments, and identical presentations in students’ assignments are evidence of plagiarism and collusion. Such academic misconduct may lead to serious consequences, such as being:

  • Required to undertake an additional assessment in the course.
  • Failed in the piece of assessment.
  • Awarded a grade of Fail for the course.
  • Withdrawn from the course with an academic penalty.
  • Excluded from the course or the program for a period of time.


