__STYLES__

National Rail Exploratory Dashboard (Finalist)

Tools used in this project
National Rail Exploratory Dashboard (Finalist)

National Rail Power BI Dashboard

About this project

Overview

The goal of the dashboard was to identify key business drivers and opportunities for improvement, including:

  • The most popular routes
  • Peak travel times
  • Revenue from different ticket types & classes
  • On-time performance and contributing factors

Key Insights

One quick note on how the dataset impacts this analysis: it seems probable that it's a (hopefully representative) sample of the total data during the period covered, because it shows less than 2 passengers per scheduled trip. If it did include all of the ticket sales, the system would be losing tremendous amounts of money. I'll ignore this issue for the rest of this analysis.

The most popular route was Manchester Piccadilly to Liverpool Lime Street, with over 4600 tickets sold. However, because of the low average ticket price (£4) it only generated 2.3% of total revenue.

The peak travel times coincided with typical commute times--6 to 8 a.m. and 4-6 p.m.-- and those peaks were consistent across every day of the week, even weekends.

Standard tickets generated almost 4 times as much revenue as first class tickets. The "advance purchase" ticket type generated about 42% of overall revenue, while "off peak" generated approximately 30% and "anytime" generated the remainder.

The overall on-time performance was 90.7%, with weather as the most common delay/cancellation cause at over 25%. The other factors, in descending order, were staffing, signal failure, technical issue (all a bit more than 20%), with only about 8% of the delays and cancellations due to traffic.

5.3% of all trips were delayed, and 4% were cancelled. Almost 2300 travelers were impacted by delays, and just under 1900 by cancellations. Assuming all travelers who requested refunds received them, refunds cost the train system £38,700 in lost revenue. Travelers with "off-peak" tickets were much more likely to request refunds compared to holders of other ticket types.

Recommendations/Next Steps

  1. Obtain a full dataset so that this analysis can be verified, and we can get complete details on actual revenue.
  2. Once we have that dataset, look at train capacity and figure out which routes and times are the most profitable considering that factor.
  3. Determine what types of weather problems cause the most issues, and whether system changes can reduce how much weather impacts the system. For example, does more work need to be done to make sure routes are clear of hanging trees, or is it possible to have a special snow fleet with ploughs if snow is a big issue?
  4. Obtain detailed information about staffing, especially at stations like Edinburgh Waverly, which had a 0% on-time percentage, all caused by staffing issues. Do additional people need to be hired to prevent delays due to staffing problems, or can schedules be adjusted?
  5. Collect additional information about where signal failures are happening, and determine if the power supply can be made more reliable or additional equipment is needed to detect problems before they result in delays.
  6. Determine specifically what "technical issues" are causing delays and cancellations, and prioritize based upon the impact of each.

The Dataset

The dataset consisted of a single file containing mock train ticket sales for National Rail in the UK, from January to April 2024, including details on the type of ticket, the date & time for each journey, the departure & arrival stations, the ticket price, and more. Each observation was for a single ticket sale.

The only data cleaning required was to correct capitalization and naming conventions in the "Reason for Delay" column, which reduced the number of categories.

Transformations

I created several calculated columns using Power Query and DAX to better analyze the data, including:

  • Route (combining Departure and Arrival Station)
  • Arrival City/Departure City
  • Departure/Arrival DateTimes (to more easily calculate durations)
  • Expected/Actual Trip Duration
  • Minutes Delayed

I also built a calendar table and linked it with the railway table to create the data model, and created a number of DAX measures to support the visualizations I planned to create. Among the more critical measures were ones that combined the single ticket sale observations into groups of train trips, so on-time rates could be determined.

The Dashboard

The dashboard I built contains 4 pages plus 1 drill-down page:

  • Overview
  • Peak Times & Routes
  • Performance
  • Revenue
  • Destination Details (the drill-down page)

The Overview page shows big picture details, including KPIs for total revenue, on-time trip percentage, and total passenger (ticket count). Also included on this page are bar chars showing the routes that generated the most revenue and the most sales, and which ticket categories and types generated the most tickets and sales.

undefined

The Peak Times and Routes page shows peak times by departure hour and day, as well as providing bar charts to show numbers of passengers by departure and destination locations. A day of week filter is included so users can explore differences by day.

undefined

The Performance page shows KPIs for on-time trip performance, the numbers of trips delayed, cancelled and scheduled, and the number of passengers/tickets effected by the delays and cancellations. I've also included a chart to view these details for individual departure stations (or arrival stations), and users can drill-down into details for individual departure stations from there. There's also a donut chart displaying the delay/cancellation causes and a area chart showing on-time trip performance by day.

undefined

The Destination Details drilldown page shows KPIs for on-time trip performance, percentage delayed, and percentage cancelled, and compares those values to the overall values for all locations. There's also donut chart showing cancellation/delay causes for the station, and a table that includes duration/delays for routes served by the destination.

undefined

The Revenue page allows uses to explore details about revenue and refund information. The revenue route column shows total revenue and net revenue (total-refunds) for the highest revenue routes. A slider allows users to estimate the revenue gains that might result from a ticket price increase.

There are 2 donut charts: one to break down the revenue generated by railcard holders vs. non-card holders, and another that shows the breakdown of revenue by ticket type.

There's also a KPI card showing detailed refund information, including the total amount refunded, and the number of delays and cancellations, refund requests, and percentage of delayed/cancelled ticket holders who requested refunds. Finally, there's a 100% bar chart showing the percentage of refund requests by ticket type and class.

undefined

Additional project images

Discussion and feedback(0 comments)
2000 characters remaining