Course profile

Data Analytics at Scale (DATA7201)

Study period
Sem 1 2025
Location
St Lucia
Attendance mode
In Person

Course overview

Study period
Semester 1, 2025 (24/02/2025 - 21/06/2025)
Study level
Postgraduate Coursework
Location
St Lucia
Attendance mode
In Person
Units
2
Administrative campus
St Lucia
Coordinating unit
Elec Engineering & Comp Science School

Data Science techniques often need to be applied to large amounts of data to generate insights. To deal with the volume, velocity, and variety of data, we need to rely on novel computational architectures that scale out data processing, in contrast to the classic scale-up approach. Such systems allow computational resources to be added to a distributed system as requirements and load change over time. This course gives students knowledge of modern scale-out system architectures for running data analytics queries over very large structured and unstructured datasets, and for running data mining algorithms at scale.

Course requirements

Assumed background

Students starting this course should have a basic understanding of the following:

  • Bash scripts
  • Python
  • SQL

Prerequisites

You'll need to complete the following courses before enrolling in this one:

DATA7001 and (INFS7901 or INFS2200)

Restrictions

Restricted to MDataSc students only.

Course contact

Lecturer

Professor Gianluca Demartini

Timetable

The timetable for this course is available on the UQ Public Timetable.

Aims and outcomes

The aim of this course is to give you knowledge about big data analytics architectures and help you to understand when and how to use such scalable data processing solutions appropriately. The course will help you understand the challenges and opportunities of big data analytics infrastructures. This course aims to:

  1. Provide an introduction to different big data computational architectures and algorithms (e.g., Map/Reduce);
  2. Provide an overview of existing big data analytics products for volume, velocity, and variety of data;
  3. Show how big data analytics is used in industry by means of use cases;
  4. Provide practical hands-on experience through use of cloud-based software for big data processing.

Learning outcomes

After successfully completing this course you should be able to:

LO1. Solve challenges and leverage opportunities in dealing with Big Data.

LO2. Use Big Data infrastructure solutions for Volume, Variety, and Velocity, including industry-driven and open-source solutions.

LO3. Apply data analytics infrastructures to best support data science practices for non-technical stakeholders (e.g., executives).

LO4. Compare alternative data analytics infrastructure solutions and select the most appropriate one for a certain use case.

LO5. Judge in which situations Big Data analytics solutions are more or less appropriate.

LO6. Design the most appropriate Big Data infrastructure solution for a given use case in which Big Data solutions are to be deployed.

Assessment

Assessment summary

Category | Assessment task | Weight | Due date
Quiz | Module quizzes (Series of 3); Online | 10% | Week 7, Tue 3:00 pm; Week 10, Tue 3:00 pm; Week 13, Tue 3:00 pm. Quizzes open on Tuesdays in Weeks 6, 9 and 12, and students have one week to complete each quiz.
Paper/ Report/ Annotation, Project | Report on Dataset Analytics; Online | 45% | 30/05/2025 3:00 pm
Examination | Final Exam; Hurdle, Identity Verified, In-person | 45% | End of Semester Exam Period, 7/06/2025 - 21/06/2025

A hurdle is an assessment requirement that must be satisfied in order to receive a specific grade for the course. Check the assessment details for more information about hurdle requirements.

Assessment details

Module quizzes (Series of 3)

  • Online

Mode: Activity/ Performance
Category: Quiz
Weight: 10%
Due date: Week 7, Tue 3:00 pm; Week 10, Tue 3:00 pm; Week 13, Tue 3:00 pm

Quizzes open on Tuesdays in Weeks 6, 9 and 12. Students have one week to complete each quiz.

Learning outcomes: LO1, LO4, LO5

Task description

At the end of each course module (Weeks 6, 9 and 12), students will be asked to complete an online quiz. Each quiz is worth 5%, with the total score for the series capped at 10%, so that students can skip one quiz if they need to.

The quizzes open on the Tuesdays of Weeks 6, 9 and 12 and are due on the Tuesdays of Weeks 7, 10 and 13. Students have one week to complete each quiz and can do so at any time during that week.

This assessment task evaluates students' abilities, skills and knowledge without the aid of generative Artificial Intelligence (AI) or Machine Translation (MT). Students are advised that the use of AI or MT technologies to develop responses is strictly prohibited and may constitute student misconduct under the Student Code of Conduct. 

Submission guidelines

The quizzes will be available on Blackboard.



Deferral or extension

You cannot defer or apply for an extension for this assessment.

No extensions are available, and a 100% late penalty applies to the online quizzes, because results and feedback are released immediately after the due date.

To accommodate unforeseen circumstances such as illness, your mark will be based on the best 2 out of 3 submissions. 
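
The quiz weighting is stated in two ways in this profile: the task description caps the sum of three 5% quizzes at 10%, while the note above bases the mark on the best 2 out of 3 submissions. The two rules agree in many cases (for example, when one quiz is skipped) but can differ when all three quizzes are attempted. The following is a minimal Python sketch of both readings; the function names are hypothetical, each quiz mark is assumed to be on a 0 to 5 scale, and which rule is actually applied should be confirmed with the course staff.

```python
def quiz_mark_capped(marks, cap=10.0):
    """Reading 1 (task description): sum all quiz marks, capped at 10."""
    return min(sum(marks), cap)


def quiz_mark_best_two(marks):
    """Reading 2 (deferral note): sum only the two highest quiz marks."""
    return sum(sorted(marks, reverse=True)[:2])


if __name__ == "__main__":
    # Marks are out of 5 for each of the three quizzes (assumed scale).
    for marks in ([5, 5, 5], [4, 0, 5], [3, 3, 3]):
        print(marks, quiz_mark_capped(marks), quiz_mark_best_two(marks))
```

With one skipped quiz (marks of 4, 0 and 5) both readings give 9 out of 10. With three marks of 3, the capped sum gives 9 while the best-two rule gives 6.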

Late submission

You will receive a mark of 0 if this assessment is submitted late.

Report on Dataset Analytics

  • Online

Mode: Written
Category: Paper/ Report/ Annotation, Project
Weight: 45%
Due date: 30/05/2025 3:00 pm

Learning outcomes: LO1, LO3, LO5, LO6

Task description

Given a dataset, you should use scalable data analytics techniques to explore the data and to draw some conclusions that inform decision makers. 

You should write a 1,500-word structured report that describes the approach you have taken to analyse the given dataset using scalable data analytics techniques. The report should summarise your approach and present your main findings.

You should focus on communicating the results of your analysis clearly and on helping the reader interpret your findings.

Artificial Intelligence (AI) and Machine Translation (MT) are emerging tools that may support students in completing this assessment task. Students may appropriately use AI and/or MT in completing this assessment task. Students must clearly reference any use of AI or MT in each instance. A failure to reference generative AI or MT use may constitute student misconduct under the Student Code of Conduct. 

Submission guidelines

The report must be submitted via Turnitin, which is available on Blackboard.


 

Deferral or extension

You may be able to apply for an extension.

The maximum extension allowed is 21 days. Extensions are given in multiples of 24 hours.

This course uses a progressive assessment approach where feedback and/or detailed solutions will be released to students within 28 days. 

Late submission

A penalty of 10% of the maximum possible mark will be deducted per 24 hours from the time the submission is due, for up to 7 days. After 7 days, you will receive a mark of 0.
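
The late penalty can be illustrated with a short Python sketch. It assumes that every started 24-hour period counts as a full period (the profile does not say how part-days are treated) and that a penalised mark cannot fall below 0; the function name and the sample marks are hypothetical.

```python
import math
from datetime import datetime


def late_adjusted_mark(raw_mark, max_mark, due, submitted):
    """Apply 10% of max_mark per 24 hours late; 0 after 7 days."""
    if submitted <= due:
        return raw_mark
    hours_late = (submitted - due).total_seconds() / 3600
    # Assumption: any started 24-hour period counts as a whole period.
    periods = math.ceil(hours_late / 24)
    if periods > 7:
        return 0.0
    penalty = max_mark * periods / 10  # 10% of the maximum mark per period
    # Assumption: the mark is floored at 0 rather than going negative.
    return max(0.0, raw_mark - penalty)


if __name__ == "__main__":
    due = datetime(2025, 5, 30, 15, 0)  # report due date from this profile
    # 25 hours late: two periods, so 20% of 45 marks (9 marks) is deducted.
    print(late_adjusted_mark(36, 45, due, datetime(2025, 5, 31, 16, 0)))
```

Under these assumptions, a report marked 36 out of 45 and submitted 25 hours late would receive 27.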

Final Exam

  • Hurdle
  • Identity Verified
  • In-person

Mode: Written
Category: Examination
Weight: 45%
Due date: End of Semester Exam Period, 7/06/2025 - 21/06/2025
Other conditions: Time limited.

See the conditions definitions

Learning outcomes: LO1, LO2, LO4, LO5, LO6

Task description

Delivery Mode: On campus invigilated exam.

Timing: The final exam will be scheduled at a fixed time for all students – i.e. students will complete the exam simultaneously. 

Permitted materials: None.

Other Information: In this assessment the student will be presented with technical questions about the scalable data analytics infrastructure discussed in the course as well as big data scenarios to be discussed.

This assessment task evaluates students' abilities, skills and knowledge without the aid of generative Artificial Intelligence (AI) or Machine Translation (MT). Students are advised that the use of AI or MT technologies to develop responses is strictly prohibited and may constitute student misconduct under the Student Code of Conduct. 

Hurdle requirements

A final exam mark of less than 40% means the course grade is capped at 3.

Exam details

Planning time: 10 minutes
Duration: 120 minutes
Calculator options: No calculators permitted
Open/closed book: Closed Book examination - no written materials permitted
Exam platform: Paper based
Invigilation: Invigilated in person

Submission guidelines

Deferral or extension

You may be able to defer this exam.

Course grading

Full criteria for each grade are available in the Assessment Procedure.

Grade Description
1 (Low Fail)

Absence of evidence of achievement of course learning outcomes.

Course grade description: 0-19%

2 (Fail)

Minimal evidence of achievement of course learning outcomes.

Course grade description: 20-46%

3 (Marginal Fail)

Demonstrated evidence of developing achievement of course learning outcomes.

Course grade description: 47-49%

4 (Pass)

Demonstrated evidence of functional achievement of course learning outcomes.

Course grade description: 50-64%

5 (Credit)

Demonstrated evidence of proficient achievement of course learning outcomes.

Course grade description: 65-74%

6 (Distinction)

Demonstrated evidence of advanced achievement of course learning outcomes.

Course grade description: 75-84%

7 (High Distinction)

Demonstrated evidence of exceptional achievement of course learning outcomes.

Course grade description: 85-100%

Additional course grading information

Assessment item results will be combined as per the weightings above and your final percentage will be rounded to the nearest whole number before your final grade is determined as per the cutoffs above.

You must achieve at least 40% on the final exam to pass the course. If you do not achieve at least 40% on the final exam then your overall mark will be capped at 49% and your final grade will be capped at 3.

Because percentage cutoffs are used, final percentages are rounded to the nearest whole number, with halves rounded up, before grades are determined (e.g., 84.5% rounds to 85%, which is a 7).
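
The grading rules above (weighted combination, rounding, the exam hurdle, and the grade cutoffs) can be sketched in Python as follows. This is an illustrative reading of the profile, not the official calculation: the function names are hypothetical, each component is assumed to be given as a percentage from 0 to 100, and halves are assumed to round up, as the 84.5% example suggests. Python's built-in `round` rounds halves to the nearest even number (`round(84.5)` gives 84), so the sketch uses an explicit half-up helper instead.

```python
import math

# (minimum rounded percentage, grade), taken from the grade table above.
CUTOFFS = [(85, 7), (75, 6), (65, 5), (50, 4), (47, 3), (20, 2), (0, 1)]


def round_half_up(value):
    """Round to the nearest whole number, with .5 rounding up."""
    return math.floor(value + 0.5)


def grade_for_percentage(percentage):
    """Map a rounded final percentage to a grade from 1 to 7."""
    return next(grade for cutoff, grade in CUTOFFS if percentage >= cutoff)


def final_result(quiz_pct, report_pct, exam_pct):
    """Combine component percentages using the 10/45/45 weights."""
    weighted = (10 * quiz_pct + 45 * report_pct + 45 * exam_pct) / 100
    total = round_half_up(weighted)
    # Hurdle: an exam mark below 40% caps the overall mark at 49 (grade 3).
    if exam_pct < 40:
        total = min(total, 49)
    return total, grade_for_percentage(total)


if __name__ == "__main__":
    print(final_result(100, 80, 40))  # hurdle met
    print(final_result(100, 80, 39))  # hurdle missed, mark capped
```

Under these assumptions, a student with 100% on the quizzes, 80% on the report and 40% on the exam finishes on 64% (grade 4), while the same student with 39% on the exam is capped at 49% (grade 3).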

The course coordinator reserves the right to moderate marks.

Supplementary assessment

Supplementary assessment is available for this course.

Additional assessment information

Having Troubles? If you are having difficulties with any aspect of the course material, you should seek help and speak to the course teaching staff. If external circumstances are affecting your ability to work on the course, you should seek help as soon as possible. The University and UQ Union have organisations and staff who are able to help; for example, UQ Student Services are able to help with study and exam skills, tertiary learning skills, writing skills, financial assistance, personal issues, and disability services (among other things). Complaints and criticisms should be directed in the first instance to the course coordinator. If you are not satisfied with the outcome, you may bring the matter to the attention of the School of EECS Director of Teaching and Learning.


Learning resources

You'll need the following resources to successfully complete the course. We've indicated below if you need a personal copy of the reading materials or your own item.

Library resources

Find the required and recommended resources for this course on the UQ Library website.

Additional learning resources information

The sections to be read from the required texts will be indicated on Blackboard.

Learning activities

The learning activities for this course are outlined below. Learn more about the learning outcomes that apply to this course.

Learning period Activity type Topic
Multiple weeks

From Week 1 To Week 13

Lecture

Lectures

Lectures will cover 1) use cases for data analytics at scale, 2) fundamental data infrastructure architectures, and 3) applications of data analytics at scale to problems such as recommender systems, log mining, and opinion mining. These sessions will also be used for exam preparation and student discussions.

Learning outcomes: LO1, LO2, LO3, LO4, LO5, LO6

Multiple weeks

From Week 3 To Week 12

Practical

Practicals

During these sessions, students will be exposed to the systems discussed in the lectures and will be able to develop their own data analytics solutions on these scalable systems.

Learning outcomes: LO1, LO2, LO3, LO6

Policies and procedures

University policies and procedures apply to all aspects of student life. As a UQ student, you must comply with University-wide and program-specific requirements, including the:

Learn more about UQ policies on my.UQ and the Policy and Procedure Library.

School guidelines

Your school has additional guidelines you'll need to follow for this course: