Course overview
- Study period
- Semester 1, 2025 (24/02/2025 - 21/06/2025)
- Study level
- Undergraduate
- Location
- St Lucia
- Attendance mode
- In Person
- Units
- 2
- Administrative campus
- St Lucia
- Coordinating unit
- Elec Engineering & Comp Science School
Compiler modules; programming language specifications; lexical analysis, parsing - recursive descent and table driven; static semantics - symbol tables and type checking; error handling; introduction to code generation and optimisation; compiler generators; interpreters.
The invention of (high-level) programming languages in the 1950s was one of the most significant advances for making software development more accessible and time efficient --- compare writing programs in machine or assembly language with, say, Java. Implementing programming languages was a major challenge taken on in the 1960s and 1970s that led to the development of modern day compilers and the theory and practice that we study in this course. A significant advantage of using a programming language over machine/assembly language is machine independence.
To ensure that the implementations of a programming language on different machines are consistent one needs a machine-independent definition of the language. To achieve this we first define the syntax of the language (i.e., what are the legal strings in the language) and then its semantics (or meaning). The syntax of the language is defined via a grammar written in terms of the basic symbols (such as keywords, identifiers, and operators), which are called lexical tokens, and the tokens are defined in terms of regular expressions over character strings. Not all strings allowed by the grammar are valid programs in the language because, for example, identifiers cannot be multiply defined within a single scope and expressions must be of the correct type for their context. We use a set of type checking rules to define the programs that are well formed. Finally we need to define what the result of an expression evaluation should be and what the effect of executing a statement should be.
A programming language is implemented by either a compiler or interpreter. These are themselves both just (rather sophisticated) programs. A compiler implements a programming language by translating a high-level program (e.g., a Java program represented simply as a text file (which is itself represented as a stream of characters (which are represented by bits))) to a machine-level program (represented as a sequence of machine instructions (e.g., x86 machine instructions (that are themselves represented as bits))). [Aside: you have to be able to handle lots of nesting in this course.]
The tasks undertaken by a compiler can be divided into the following.
- Recognising the input program and detecting syntax errors in the program via
- lexical analysis: recognising the basic symbols (or lexical tokens) of the language, and
- parsing: recognising the structure of programs composed of these basic symbols.
- Constructing an abstract internal representation of the program in terms of
- an abstract syntax tree, and
- a symbol table.
- Checking the well formedness of a program according to its static semantics, e.g., checking that assignment statements are type correct.
- Generating code, which can include changing the representation of the program to forms more suitable for code generation and code optimisation (both machine independent at the source code level and machine dependent at the target machine code level).ᅠThe generated machine code can then be executed on the target machine. Due to time constraints within the course we don't spend a lot of time on code optimisation.
An interpreter is a simpler (and usually less efficient) way of implementing a programming language. It acts in the same way as a compiler in the early phases, but does not have a code generation phase. Instead it directly interprets its internal representation of the program. The main advantage of interpreters over compilers is that they do not depend on the target machine, and hence are more portable than compilers. Interpreters are used for simple languages, like operating system command shells, where the overheads of generating machine code outweigh the slower execution time of the interpreted program, or for very high-level languages (e.g., functional and logic programming languages) where the individual operations are so powerful that the execution-time cost of interpretation over compilation is not all that significant.
The language recognition phases of compilers are common to many applications of computers that involve a language of some form. For example, a web browser recognises a document written in the Hyper Text Markup Language (HTML) and interprets it by generating a display of the document on the screen.
Early compilers were hand-crafted programs designed to implement a particular language for a particular machine. However, in theᅠ1960s it was recognised that many of the aspects of writing a compiler were not dependent on the particular programming language being compiled. In particular, the early phases of recognising a program and creating an abstract representation of it were dependent mainly on the grammar of the language. The result was the creation of compiler generators (lexical analyser and parser generators really). A parser generator is yet another program that accepts a description of a programming language (in the form of a grammar for the language) and generates a program, that is a parser for that language. Modern compiler generators are often implemented as suites of cooperating tools that generate compiler components from a range of notations, each of which specifies a different aspect of a language (and its implementation). These activities include not only lexical analysis and parsing, but also error checking and code generation.
Course Changes in Response to Previous Student Feedback
Updates have been made to the information provided in the week-by-week course outline.
Course requirements
Assumed background
Prerequisite: COMP2502 or COMP3506
A compiler is a substantial program and the software engineering techniques taught in COMP3506 and its prerequisites (especiallyᅠCSSE2002) are needed in developing such a tool. The assignments in this course involve modifying reasonably large, complex software and require sound programming skills. Extensive programming experience, in particular in Java, is assumed knowledge.
A sound knowledge of data structures is required for an understanding of the appropriate data structures to use for internal structures in a language processor such as the symbol table and abstract syntax tree. Some knowledge of how basic data structures such as arrays and records are implemented at the machine level is expected from the COMP3506 prerequisite.
Recommended: CSSE1000 or CSSE2010
A basic knowledge of machine and assembly language programming is required for an understanding of the code generation phase of the compiler.
Prerequisites
You'll need to complete the following courses before enrolling in this one:
COMP3506
Recommended prerequisites
We recommend completing the following courses before enrolling in this one:
CSSE2010
Incompatible
You can't enrol in this course if you've already completed the following:
COMP7402
Course contact
Lecturer
Timetable
The timetable for this course is available on the UQ Public Timetable.
Aims and outcomes
This course studies how compilers and interpreters for programming languages are implemented. Along the way we learn a lot about programming languages, their grammars, what they really mean, how they are implemented, and how to structure large programs.
Learning outcomes
After successfully completing this course you should be able to:
LO1.
Knowledge Outcomes - Explain how programming languages are defined in terms of grammars, type checking rules, and evaluation/execution rules and be able to write/modify grammars and type checking rules.
LO2.
Knowledge Outcomes - Understand the structure of a compiler in terms of lexical analysis, parsing, static semantic analysis and code generation.
LO3.
Knowledge Outcomes - Apply and explain the theory of lexical analysis of programming languages (i.e., recognising keywords, numbers, variable names, operators, etc.) using regular expressions and finite state machines.
LO4.
Knowledge Outcomes - Apply and explain the theory of parsing of (programming language) grammars (i.e., recognising the structure of programs in terms of procedures, statements and expressions, etc) using (top-down), recursive-descent parsing and (bottom-up) shift/reduce parsing.
LO5.
Knowledge Outcomes - Comprehend the static semantics of programming languages (i.e., handling variable, type and procedure declarations and analysing the type correctness of programs) and apply the theory to develop static semantic analysers.
LO6.
Knowledge Outcomes - Understand the run-time organisation of memory using stacks for expression evaluation, procedure calls, and local variable allocation; and heaps for dynamic memory allocation ("new" in Java) and apply that understanding to handle code generation.
LO7.
Knowledge Outcomes - Understand and apply strategies for generating machine code for programming language expressions, control structures, and procedure/method calls.
LO8.
Knowledge Outcomes - Understand modern programming language constructs and be able to implement them.
LO9.
Knowledge Outcomes - Understand the use of interpreters to implement programming languages and be able to implement them.
LO10.
Skills Outcomes - An improved ability to structure and program large systems.
LO11.
Skills Outcomes - Use compiler generation tools to generate lexical analysers (from regular expressions), parsers (from grammars), and static semantic analysis components.
Assessment
Assessment summary
Category | Assessment task | Weight | Due date |
---|---|---|---|
Quiz |
Online quizzes
|
10% |
21/03/2025 3:00 pm 28/03/2025 3:00 pm 4/04/2025 3:00 pm 2/05/2025 3:00 pm 16/05/2025 3:00 pm 30/05/2025 3:00 pm |
Computer Code | Assignment 1 | 15% |
17/04/2025 3:00 pm |
Computer Code | Assignment 2 | 20% |
23/05/2025 3:00 pm |
Examination |
Final Examination
|
55% |
End of Semester Exam Period 7/06/2025 - 21/06/2025 |
A hurdle is an assessment requirement that must be satisfied in order to receive a specific grade for the course. Check the assessment details for more information about hurdle requirements.
Assessment details
Online quizzes
- Online
- Mode
- Written
- Category
- Quiz
- Weight
- 10%
- Due date
21/03/2025 3:00 pm
28/03/2025 3:00 pm
4/04/2025 3:00 pm
2/05/2025 3:00 pm
16/05/2025 3:00 pm
30/05/2025 3:00 pm
Task description
There will be 6 computer-based quizzes worth 2% each. Only the marks for your best 5 quizzes will contribute towards your final grade. They will be due at 3pm on Friday in weeks 4, 5, 6, 9, 11 and 13.
This assessment task evaluates students' abilities, skills and knowledge without the aid of generative Artificial Intelligence (AI) or Machine Translation (MT). Students are advised that the use of AI or MT technologies to develop responses is strictly prohibited and may constitute student misconduct under the Student Code of Conduct.
Submission guidelines
The quizzes are accessed via the course Blackboard site.
Deferral or extension
You cannot defer or apply for an extension for this assessment.
To accommodate unforeseen circumstances such as illness, your quiz score will be based on the best 5 out of 6 submissions.
Late submission
You will receive a mark of 0 if this assessment is submitted late.
Assignment 1
- Mode
- Product/ Artefact/ Multimedia
- Category
- Computer Code
- Weight
- 15%
- Due date
17/04/2025 3:00 pm
Task description
Extending a recursive descent compiler. This assignment involves extending a compiler written in Java using recursive descent parsing to handle additional language features. Full details of the language extensions and files for the compiler will be provided via Blackboard.
This assessment task evaluates students' abilities, skills and knowledge without the aid of generative Artificial Intelligence (AI) or Machine Translation (MT). Students are advised that the use of AI or MT technologies to develop responses is strictly prohibited and may constitute student misconduct under the Student Code of Conduct.
Submission guidelines
Assignments are to be submitted online via Blackboard unless otherwise specified for a particular assessment item.
Deferral or extension
You may be able to apply for an extension.
The maximum extension allowed is 7 days. Extensions are given in multiples of 24 hours.
This course uses a progressive assessment approach where feedback and/or detailed solutions will be released to students within 14 days.
Late submission
A penalty of 10% of the maximum possible mark will be deducted per 24 hours from time submission is due for up to 7 days. After 7 days, you will receive a mark of 0.
Assignment 2
- Mode
- Product/ Artefact/ Multimedia
- Category
- Computer Code
- Weight
- 20%
- Due date
23/05/2025 3:00 pm
Task description
Extending a compiler using a parser generator. This assignment involves extending a compiler written using a parser generator and the lexical analyzer generator to handle additional language features. Full details of the language extensions and files for the compiler will be provided via Blackboard.
This assessment task evaluates students' abilities, skills and knowledge without the aid of generative Artificial Intelligence (AI) or Machine Translation (MT). Students are advised that the use of AI or MT technologies to develop responses is strictly prohibited and may constitute student misconduct under the Student Code of Conduct.
Submission guidelines
Assignments are to be submitted online via Blackboard unless otherwise specified for a particular assessment item.
Deferral or extension
You may be able to apply for an extension.
The maximum extension allowed is 7 days. Extensions are given in multiples of 24 hours.
This course uses a progressive assessment approach where feedback and/or detailed solutions will be released to students within 14 days.
Late submission
A penalty of 10% of the maximum possible mark will be deducted per 24 hours from time submission is due for up to 7 days. After 7 days, you will receive a mark of 0.
Final Examination
- Hurdle
- Identity Verified
- In-person
- Mode
- Written
- Category
- Examination
- Weight
- 55%
- Due date
End of Semester Exam Period
7/06/2025 - 21/06/2025
Task description
The final examination is compulsory and covers the entire course. The final examination is closed book but you may bring in one A4-size page crib sheet:
- it may be written on both sides
- it may be hand written
- if machine produced it must be a minimum of 10 point font size
- it is to be handed in with your examination script
No calculators permitted.
Further details on the examination content and past examination papers will be provided at the end of semester and will be available from Blackboard.
Method of delivery: This exam will be conducted as an invigilated on-campus exam.
This assessment task evaluates students' abilities, skills and knowledge without the aid of generative Artificial Intelligence (AI) or Machine Translation (MT). Students are advised that the use of AI or MT technologies to develop responses is strictly prohibited and may constitute student misconduct under the Student Code of Conduct.
Hurdle requirements
Refer to the course grading for details.Exam details
Planning time | 10 minutes |
---|---|
Duration | 120 minutes |
Calculator options | No calculators permitted |
Open/closed book | Closed Book examination - specified written materials permitted |
Materials | One A4 sheet of handwritten or typed notes, double sided, is permitted |
Exam platform | Paper based |
Invigilation | Invigilated in person |
Submission guidelines
Deferral or extension
You may be able to defer this exam.
Course grading
Full criteria for each grade is available in the Assessment Procedure.
Grade | Description |
---|---|
1 (Low Fail) |
Absence of evidence of achievement of course learning outcomes. Course grade description: MARK < 20% (see the end of this section for a definition of MARK) |
2 (Fail) |
Minimal evidence of achievement of course learning outcomes. Course grade description: MARK >= 20%. Your final percentage mark is the minimum of MARK and 46%. |
3 (Marginal Fail) |
Demonstrated evidence of developing achievement of course learning outcomes Course grade description: MARK >= 47% and Exam >= 37%. Your final percentage mark is the minimum of MARK and 49%. |
4 (Pass) |
Demonstrated evidence of functional achievement of course learning outcomes. Course grade description: MARK >= 50% and Exam >= 40%. Your final percentage mark is the minimum of MARK and 64%. |
5 (Credit) |
Demonstrated evidence of proficient achievement of course learning outcomes. Course grade description: MARK >= 65% and Exam >= 55%. Your final percentage mark is the minimum of MARK and 74%. |
6 (Distinction) |
Demonstrated evidence of advanced achievement of course learning outcomes. Course grade description: MARK >= 75% and Exam >= 65%. Your final percentage mark is the minimum of MARK and 84%. |
7 (High Distinction) |
Demonstrated evidence of exceptional achievement of course learning outcomes. Course grade description: MARK >= 85% and Exam >= 75%. Your final percentage mark is MARK. |
Additional course grading information
The assessment consists of online quizzes, two assignments and a final examination. You should complete the online quizzes, both assignments, and the final examination. MARK is defined as follows
MARK = 0.10*Quiz + 0.15*A1 + 0.20*A2 + 0.55*Exam
where Quiz stands for the quiz mark (out of 100), A1 and A2 stand for the assignment marks (each out of 100), and Exam stands for your examination mark (out of 100).
The final mark (MARK) will be rounded to the nearest whole number. At the discretion of the course coordinator, final grades may be moderated.
Supplementary assessment
Supplementary assessment is available for this course.
Additional assessment information
Theᅠassessment tasks evaluateᅠstudents abilities, skills and knowledge without the aid of Artificial Intelligence (AI). Students are advised that the use of AI technologies to develop responses is strictly prohibited and may constitute student misconduct under the Student Code of Conduct.
Having Troubles?
If you are having difficulties with any aspect of the course material, you should seek help and speak to the course teaching staff.
If external circumstances are affecting your ability to work on the course, you should seek help as soon as possible. The University and UQ Union have organisations and staff who are able to help; for example, UQ Student Services are able to help with study and exam skills, tertiary learning skills, writing skills, financial assistance, personal issues, and disability services (among other things). Complaints and criticisms should be directed in the first instance to the course coordinator. If you are not satisfied with the outcome, you may bring the matter to the attention of the School of EECS Director of Teaching and Learning.
Learning resources
You'll need the following resources to successfully complete the course. We've indicated below if you need a personal copy of the reading materials or your own item.
Library resources
Find the required and recommended resources for this course on the UQ Library website.
Additional learning resources information
The course will involve the use of the Java compiler and runtime environment using IntelliJᅠ(or Eclipse). It is assumed that you are already familiar with these resources. In addition, the course will use the Java-CUP parser generator and JFlex lexical analyzer generator, which are written in Java. Instruction on how to use these compiler generation tools will be provided in the course.
Documentation for Java-CUP and JFlex is available via the links in the Library's reading resources for the course.
Documentation on the use of the School's computing facilities can be found at http://student.eait.uq.edu.au/.
Learning activities
The learning activities for this course are outlined below. Learn more about the learning outcomes that apply to this course.
Filter activity type by
Please select
Learning period | Activity type | Topic |
---|---|---|
Multiple weeks From Week 1 To Week 13 |
Problem-based learning |
Weekly problem-based learning The problem-based learning exercises form an extremely important part of the course. The problem-based learning exercises provide the basis from which you can attempt the assignments and cover questions similar to those in the final examination. They are an essential part of the learning process in this course. |
Multiple weeks From Week 1 To Week 2 |
Lecture |
Week 1-2: Language definition Definition of the syntax of programming languages using context-free grammars and regular expressions. Learning outcomes: L01, L03, L04, L08 |
Week 1 (24 Feb - 02 Mar) |
Lecture |
Week 1: Course overview Phases of a compiler and introduction to interpreters. Learning outcomes: L02, L08, L10 |
Multiple weeks From Week 2 To Week 3 |
Lecture |
Week 2-3: Recursive descent parsing Implementing a parser for an Extended Backus-Naur Form (EBNF) grammar using recursive methods, one for each non-terminal symbol in the grammar. Handling syntax error recovery and abstract syntax tree building. Learning outcomes: L04, L08, L10 |
Multiple weeks From Week 3 To Week 4 |
Lecture |
Week 3-4: Assignment 1 compiler Detailed discussion of the recursive descent parser used in assignment 1, including parsing, syntax error recovery, static semantics (checking the types of variables, etc.), abstract syntax tree construction and run time interpreter. |
Lecture |
Weeks 3-4: Static semantic analysis The structure of the identifier tables. Representing programming languages types. Representing nested scopes and handling declarations. Static Semantics of Expressions and Statements. Learning outcomes: L01, L05, L08 |
|
Week 4 (17 Mar - 23 Mar) |
Lecture |
Week 4: Interpreters Using interpreters to implement programming languages. Learning outcomes: L08, L09 |
Week 5 (24 Mar - 30 Mar) |
Lecture |
Weeks 5: Grammars and LL(1) parsing Rewriting grammars to avoid left factors and left recursion (so that they are suitable for recursive descent parsing). First and follow sets. LL(1) grammars and parsing. Learning outcomes: L04 |
Multiple weeks From Week 6 To Week 9 |
Lecture |
Weeks 6-9: Bottom-up parsing Shift/reduce parsing. LR(0) parsing. Assignment 2 compiler using parser generator tool. Parsing conflicts. Operator precedence and associativity. LR(1) parsing. LALR(1) parsing. Learning outcomes: L04, L11 |
Multiple weeks From Week 6 To Week 12 |
Lecture |
Weeks 6-12: Code generation Code generation for a stack machine. Code generation for expressions, control structures, and procedure calls. The material on code generation is treated incrementally throughout the course as necessary. |
Multiple weeks From Week 10 To Week 12 |
Lecture |
Week 10-12: Runtime organisation Stack-based runtime organisation for handling procedure calls and local variable allocation. Parameter passing mechanisms. Heap organisation for dynamically allocated variables; garbage collection. Representing objects and classes. |
Week 11 (12 May - 18 May) |
Lecture |
Week 11: Scanning Regular expressions. Deterministic finite automata (DFA). Nondeterministic finite automata (NFA). Converting regular expressions into NFAs and NFAs into DFAs. Scanner for PL0. Learning outcomes: L03, L11 |
Week 13 (26 May - 01 Jun) |
Lecture |
Week 13: Course Summary An overview of what has been covered in the course, and how it relates to the final examination. |
Policies and procedures
University policies and procedures apply to all aspects of student life. As a UQ student, you must comply with University-wide and program-specific requirements, including the:
- Student Code of Conduct Policy
- Student Integrity and Misconduct Policy and Procedure
- Assessment Procedure
- Examinations Procedure
- Reasonable Adjustments - Students Policy and Procedure
Learn more about UQ policies on my.UQ and the Policy and Procedure Library.
School guidelines
Your school has additional guidelines you'll need to follow for this course: