DATA 202 (2024) - Postgraduate students

400-level students should enrol in DATA 472.

All postgraduate students will be required to complete a project worth 40% of their final mark, instead of the final examination that the DATA 202 students sit.

Full details of the project will be discussed in class. It will involve writing a 6-8 page report and giving a 10 minute software demonstration. The demonstration will be held on Thursday 13 June 2024 in Room TTR104 (note that this is not the room originally listed here). An online Zoom option will be available for students unable to attend in person.

A tutorial on Shiny will be held if students would like us to run one. Here are some tutorial slides and R code. This online tutorial is also really helpful.

Postgraduate Programming Project

PROJECT PRESENTATION and REPORT

It is a requirement (and a courtesy) that all students go to as many talks as possible on Thursday 13 June (9am-3pm, Room TTR104; an online option will be available). Each presentation is 10 minutes long, which includes 2-3 minutes for questions and answers. While presenting your report and application you can use your own laptop or the room computer system. Each student should submit the report and app files through the submission system.

By 6pm on Tuesday 11 June submit:
  • the project report (named project-report.pdf)

By 6pm on Thursday 13 June submit a zipfile (shinyapp.zip) containing:
  • all your programming code files, with the purpose of each clearly indicated;
  • the data file(s) used in the project;
  • presentation slides (if any).

Postgraduate students must carry out a software project using R and will:
  • write a 1 page project proposal (due 6pm, Thursday 18 April: submit via the ECS Assessment System)
  • write appropriately documented software
  • write a 6-8 page report (due 6pm, Tuesday 11 June)
  • present the software and report in a 10 minute demonstration (Thursday 13 June - online presentation option available).

This will be worth 40% of the final grade. The other 60% will come from the in-term marks for the rest of the course, from the same assessments sat by students enrolled in DATA 202.

The software should
  • Use a specific single data set, or accept data (from the user), or webscrape data, or generate data
  • Perform some kind of analysis/transformation/processing/subsetting of the data
  • Present the results of that analysis - graphically and/or numerically

The software should be written in R using an R Shiny web interface both to accept user inputs and to display outputs. Some level of user input/choice is required.
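To make the required structure concrete, here is a minimal sketch of a Shiny app that accepts user input and displays graphical and numerical output. It is only an illustration, not a template for an acceptable project: the built-in mtcars data set, the helper subset_values(), and the input names column and min_hp are all assumptions chosen for the example.

```r
library(shiny)

# Built-in example data (an assumption); a real project would load its own data.
dat <- mtcars
numeric_cols <- names(dat)[sapply(dat, is.numeric)]

# Hypothetical helper: keeps the analysis in a plain function so it can be
# checked without starting the web interface.
subset_values <- function(data, column, min_hp) {
  data[data$hp >= min_hp, column]
}

# User interface: two inputs in the sidebar, two outputs in the main panel.
ui <- fluidPage(
  titlePanel("Column explorer (sketch)"),
  sidebarLayout(
    sidebarPanel(
      selectInput("column", "Variable to summarise:", choices = numeric_cols),
      sliderInput("min_hp", "Minimum horsepower:",
                  min = min(dat$hp), max = max(dat$hp), value = min(dat$hp))
    ),
    mainPanel(
      plotOutput("hist"),
      verbatimTextOutput("stats")
    )
  )
)

# Server logic: recompute the subset whenever an input changes,
# then show it graphically and numerically.
server <- function(input, output, session) {
  values <- reactive(subset_values(dat, input$column, input$min_hp))
  output$hist <- renderPlot(
    hist(values(), main = input$column, xlab = input$column)
  )
  output$stats <- renderPrint(summary(values()))
}

app <- shinyApp(ui = ui, server = server)

# Launch the app only in an interactive R session.
if (interactive()) runApp(app)
```

Keeping the data processing in an ordinary function, separate from the ui and server objects, also makes it easier to document and describe the software in the report.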

The nature of the analysis will depend on the context – the nature of the data, and the outputs required.

All projects must be approved by the Course Coordinator – the process is as follows:
  • Develop your ideas over the mid-term break
  • Get familiar with the R Shiny software – doing an online tutorial is easiest: https://shiny.rstudio.com/tutorial/
  • Here's another tutorial: https://deanattali.com/blog/building-shiny-apps-tutorial/
  • And a video too: https://vimeo.com/131218530
  • Check your ideas with one of the course lecturers
  • Write a short description (<1 page) of what you want to do, and submit this by 6pm, Thursday 18 April
  • Approvals for projects will be given in the week starting Monday 22 Apr
  • Talk to Louise after class, or make an appointment to see her, to discuss your proposal. Please do not email about it (it's hopeless to discuss these things over email).
  • Louise will be holding a Zoom briefing session for DATA 472 students on Wed 20 March at 2.10pm at the usual class link: https://vuw.zoom.us/my/data202

You are welcome to work with your classmates as you learn Shiny, and to discuss your ideas. However, your final software, report and presentation are your own individual responsibility, and must be your own work.

Please talk to one of the course lecturers if you are uncertain about your project.

Note that your project should not be just a single analysis of a single data set. It may offer several types of analysis, in which case the user should be able to choose subsets of the data, and/or choose the nature of the analysis and the displayed output. The focus should be on the data and on the display of the results.

Here are some examples to guide your thinking about your project:
  • Develop an application that retrieves flight information from an airport website and plots a map showing inbound aircraft locations;
  • Simulate the activity in a single server shop – showing the queue occupancy and table occupancy, and service time distributions – with the ability of the user to set the arrival rate and service time parameters;
  • Develop a website that allows the user to carry out an interactive data analysis of the SURF data file. This application should allow the user to generate summary statistics on any chosen column, and draw appropriate graphs of any pair of variables;
  • Develop a Shiny front end to a SQLite database – allowing the user to extract subsets of individuals according to specific criteria.
  • Have a look at the Gallery of suggestions in the R Shiny tutorial site
  • And here is a Shiny app from the NZ Census Mortality study: https://nzcms-ct-data-explorer.shinyapps.io/version8/
  • Here is a Stats NZ page with data on river water quality, which contains an embedded Shiny app: https://stats.govt.nz/indicators/river-water-quality-clarity-and-turbidity (if the app stops working after a while on that page, refresh the page which should bring it back)
  • And here are some examples from the Department of Conservation: an albatross tracker, a seabird modelling tool, and a display of data on catch of protected species.

These are only examples - you are encouraged to think broadly about what you might do, and to discuss options with the course lecturers.
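As an illustration of the "generate data" route, the single server shop example above could start from a simulation function like the sketch below. The function name simulate_queue() and the choice of exponential inter-arrival and service times are assumptions made for the example; the project brief does not prescribe any particular distributions. In a Shiny app, the two rate parameters would typically come from user inputs and the returned data frame would feed the plots.

```r
# Simulate a first-come, first-served queue with one server.
# arrival_rate and service_rate are rates per unit time (assumed exponential).
simulate_queue <- function(n_customers, arrival_rate, service_rate) {
  arrivals <- cumsum(rexp(n_customers, rate = arrival_rate))
  service  <- rexp(n_customers, rate = service_rate)

  start  <- numeric(n_customers)
  depart <- numeric(n_customers)
  for (i in seq_len(n_customers)) {
    # A customer starts service on arrival, or when the previous
    # customer leaves, whichever is later.
    prev_depart <- if (i == 1) 0 else depart[i - 1]
    start[i]  <- max(arrivals[i], prev_depart)
    depart[i] <- start[i] + service[i]
  }

  data.frame(
    arrival = arrivals,
    start   = start,
    depart  = depart,
    wait    = start - arrivals,  # time spent queueing before service
    service = service
  )
}

set.seed(1)  # fixed seed so the example is reproducible
sim <- simulate_queue(200, arrival_rate = 0.8, service_rate = 1)

mean(sim$wait)     # average time spent waiting in the queue
mean(sim$service)  # average service time
```

From a data frame like this, queue occupancy over time can be derived by counting, at any instant, the customers who have arrived but not yet started service.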

Project Proposal Outline

Your 1 page project proposal should contain:

  • The background to the problem/context of the data
  • The goals of your application
  • The data used
  • What the application will do

Upload your proposal through the ECS submission system (on the Assignments page).

The aim of the project proposal is to give us an idea of what you plan to do, so that we can be sure that it meets the requirements of the course.

The course lecturers will consider each proposal, and may approve it subject to suggested modifications. You can discuss these with the course lecturers. (If your first proposal is not approved, you will need to develop another one for approval.)

Project Report Outline

You should follow this outline when writing your 6-8 page project report.

1. Introduction and Motivation

Briefly describe the scope of your application and who you would expect to use it. Explain why you decided on this project and why it was a worthwhile undertaking.

2. Data Overview

Describe the data used by your application. Include the source of the data and any steps you undertook to prepare the data.

3. App Overview

Describe the main functionality of your application. Explain what the user can do and what information the application will display in response. Explain any main design decisions you made, e.g. why did you choose particular inputs and outputs? You should illustrate this section with screenshots of your application.

4. Insights

Explain what you have learned while writing this application. Describe the main challenges you faced and how you overcame them, e.g. challenges with obtaining the data or with writing the application. Did your application turn out as you expected? Is there anything you would change if you were to start anew? Are there any extensions to the application that you would like to have made if you had more time?

5. Conclusion

Give a brief summary of your project and the insights gained from it.

Feedback on report drafts

If you would like feedback on a draft of your report, please arrange a meeting time with one of the course lecturers. Bring a printed copy of your draft to the meeting.