How to Use Open Data to Develop an Application
Our portal is where we provide all the data and APIs you need to create the next awesome public transport application! As a blanket rule, we do not offer programming guidance or assistance: our focus is on finding as much data as possible and publishing it on the Open Data Hub, so we leave the coding up to you. However, we are often asked how to use our data to develop applications. Without getting too technical, this blog post describes what is in each of our main APIs, how they fit together, and some tips to help you build an awesome transport app. We also encourage you to join our Open Data Forum, where you can connect with other developers for more technical assistance, request data and even brag about your latest creation!
1.0 Our APIs and Data Feeds
You can find all our data, including APIs, in the data catalogue on the Open Data Hub. Some of the main APIs cover public transport data, live traffic, trip planning and boating. To build a public transport app, there are a few key APIs you will need to learn. The most commonly used ones are described below:
The Timetables Complete GTFS dataset contains the main static (non-real-time) public transport information in General Transit Feed Specification (GTFS) format for all operators. Most of the time we refer to it as the "GTFS bundle". It can be accessed via the API or downloaded as a zip file. It includes static timetables, stop locations and route shape information, as well as regional routes, trackwork and transport routes not available in the real-time feeds. The dataset contains the following files (explained further in Section 2.0 below):
- agency.txt - transit agencies that provide the data in this feed
- calendar.txt - dates for service IDs used by each trip
- calendar_dates.txt - exceptions for the service IDs defined in the calendar.txt file
- routes.txt - transit routes. A route is a group of trips that are displayed to riders as a single service
- shapes.txt - rules for drawing lines on a map to represent a transit organization's routes
- stop_times.txt - times that a vehicle arrives at and departs from individual stops for each trip
- stops.txt - individual locations where vehicles pick up or drop off passengers
- trips.txt - trips for each route. A trip is a sequence of two or more stops that occurs at a specific time
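Each of the files listed above is plain comma-separated text inside the zip, so the bundle can be read without unpacking it to disk. The following is a minimal sketch using only the Python standard library. The tiny in-memory zip built by `build_sample_bundle` is a hypothetical stand-in for a downloaded bundle, and its rows are made up for illustration; they are not real TfNSW data.

```python
import csv
import io
import zipfile


def read_gtfs_table(bundle, filename):
    """Return the rows of one GTFS text file as a list of dicts."""
    with bundle.open(filename) as raw:
        # utf-8-sig decodes UTF-8 and also strips a byte-order mark if present.
        text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
        return list(csv.DictReader(text))


def build_sample_bundle():
    """Build a tiny in-memory zip standing in for a downloaded GTFS bundle."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as bundle:
        # Made-up sample rows, for illustration only.
        bundle.writestr("agency.txt", "agency_id,agency_name\nA1,Example Buses\n")
        bundle.writestr(
            "stops.txt",
            "stop_id,stop_name,stop_lat,stop_lon\nS1,Example Stop,-33.86,151.21\n",
        )
    buffer.seek(0)
    return buffer


with zipfile.ZipFile(build_sample_bundle()) as bundle:
    file_names = sorted(bundle.namelist())
    stops = read_gtfs_table(bundle, "stops.txt")

print(file_names)
print(stops[0]["stop_name"])
```

With a real bundle you would pass the path of the downloaded zip file to `zipfile.ZipFile` instead of the sample buffer.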
This dataset contains the same data as the Timetables Complete GTFS dataset above but is delivered in TransXChange (TXC) format. It also includes all school bus routes, which the Timetables Complete GTFS dataset does not have at the time of writing (but they are coming soon!). The TXC format is based on XML and is an implementation of the Transmodel open standard for public transport information. You can find more information about it at www.transxchange.org.uk. The dataset can be accessed via the API or downloaded as a zip file.
This dataset also contains static timetables, stop locations and route shape information in GTFS format, but only for operators that support real-time data. We sometimes refer to it as the "real-time GTFS bundle". You can match this dataset with our real-time APIs to build a complete view of both static and real-time data; this is explained further in Section 2.2 below. The dataset is accessed via the API, which is split into endpoints for each mode of transport: buses (one bundle with all operators, plus individual bundles for each), ferries, light rail, NSW TrainLink and Sydney Trains.
This dataset contains operator contact details and location facilities for train stations, ferry wharves and bus interchanges. Location facilities cover details such as address, location, access type, transport modes available, car parking, bike racks and more. This dataset is best accessed by programmatically downloading the csv files from the Open Data Hub; see the dataset description for more details.
This dataset contains the current vehicle positions of buses, ferries, light rail and trains in GTFS-Realtime (GTFS-R) format. The GTFS-R data exchange format is based on protocol buffers. You will need to understand how to work with protocol buffers in order to view the data in a readable format. This dataset can be accessed via its API. More about protocol buffers can be found in Section 3.0 below.
This dataset contains the stop time updates for active trips, replacement vehicles and changed stopping patterns in GTFS-Realtime format for buses, ferries, light rail and trains. As with the dataset above, the real-time trip updates also require protocol buffers to convert the data into a readable format. This dataset can be accessed via its API. More about protocol buffers can be found in Section 3.0 below.
This dataset contains real-time alerts at either the stop, trip or service line level in GTFS-Realtime (GTFS-R) format for buses, ferries, light rail and trains. This dataset also requires protocol buffers to convert the data into a readable format. This dataset can be accessed via its API. More about protocol buffers can be found in Section 3.0 below.
This dataset allows you to create your own public transport trip planner. The APIs interact with the transportnsw.info trip planner and provide the ability for NSW public transport trip planning, departure board, travel alerts, real-time transport services, walk and drive legs and Opal fares. The data is accessed by the relevant APIs and is split into the following five end-points that serve various data:
- Stop Finder API: Returns NSW public transport stops, stations, wharves, points of interest and known addresses for auto-suggest/autocomplete (to be used with the Trip Planner and Departure board APIs).
- Trip Planner API: Returns NSW public transport trip plan options, including walking and driving legs, real-time and Opal fare information.
- Departure API: Returns NSW public transport departure information for a stop, station or wharf, including real-time information.
- Service Alert API: Returns all public transport service status and incident information (as published from the Incident Capture System).
- Coordinate Request API: Given a specific geographical location, finds public transport stops, stations, wharves and points of interest around that location.
2.0 How it All Fits Together
You will need to learn about all the APIs above and use a combination of them whether you are developing an app that focuses on one function or building a complete public transport app.
2.1 Static Data
The first thing to wrap your head around should be the GTFS format, how our GTFS bundle is structured and what data each of the files contains. To learn all about the GTFS format you should visit https://developers.google.com/transit/gtfs/reference. However, Transport for NSW have made slight modifications to the files in order to suit specific data items from our modes of transport. This is why you should reference our technical documentation for each mode of transport as outlined in Section 4.0 below.
The GTFS bundle contains the 8 files listed in Section 1.1. These files contain all the static public transport data you need to develop an application. The files are related or linked to each other using common keys/IDs. The diagram below shows the relationship between the files and fields in the GTFS bundle for buses:
All our static data is provided in common file formats that should be straightforward to open and view. This includes csv, xml or text files, which should be easy to work with even if you have limited programming experience.
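To illustrate how the common keys link the files, here is a minimal sketch that follows route_id from routes.txt to trips.txt, trip_id from trips.txt to stop_times.txt, and stop_id from stop_times.txt to stops.txt. It uses the field names from the standard GTFS reference; since our files may use some fields differently, check the technical documentation before relying on them. The sample rows and the helper name `stops_for_route` are made up for illustration.

```python
import csv
import io

# Made-up sample rows standing in for four files of the GTFS bundle.
ROUTES = "route_id,route_short_name\nR1,333\n"
TRIPS = "route_id,service_id,trip_id\nR1,WD,T1\n"
STOP_TIMES = (
    "trip_id,arrival_time,departure_time,stop_id,stop_sequence\n"
    "T1,08:05:00,08:05:00,S2,2\n"
    "T1,08:00:00,08:00:00,S1,1\n"
)
STOPS = "stop_id,stop_name\nS1,First Street\nS2,Second Street\n"


def rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def stops_for_route(route_short_name):
    """List (departure_time, stop_name) for the first trip of a route."""
    route_ids = {
        r["route_id"] for r in rows(ROUTES)
        if r["route_short_name"] == route_short_name
    }
    trip_ids = [t["trip_id"] for t in rows(TRIPS) if t["route_id"] in route_ids]
    if not trip_ids:
        return []
    stop_names = {s["stop_id"]: s["stop_name"] for s in rows(STOPS)}
    calls = [st for st in rows(STOP_TIMES) if st["trip_id"] == trip_ids[0]]
    # stop_sequence gives the order of stops within a trip.
    calls.sort(key=lambda st: int(st["stop_sequence"]))
    return [(st["departure_time"], stop_names[st["stop_id"]]) for st in calls]


timetable = stops_for_route("333")
print(timetable)
```

For a full bundle, loading the files into a database or building dictionaries keyed by ID once would be more practical than re-parsing each file per lookup.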
2.2 Using Real-time Data
Things can get a bit more complex if you want your application to use real-time data. To get started, you should familiarise yourself with the GTFS-Realtime (GTFS-R) format by reading https://developers.google.com/transit/gtfs-realtime/reference. The real-time APIs and data feeds specified in Section 1.0 provide real-time data directly from the operators through our API gateway. The data exchange format for GTFS-R feeds is based on protocol buffers, a language-neutral and platform-neutral mechanism for serializing structured data. You will need the TfNSW GTFS-R proto file to convert the data into a readable form.
To use our real-time APIs you will need to use the Public Transport - Timetables - For Realtime dataset to match up the services from the static data to the real-time data. You can use our Reference Tables for GTFS Feeds to see a complete list of agencies and how they are defined in each data feed. The Reference Tables can be used to identify those agencies that provide real-time data. If you decide to use the complete GTFS bundle in conjunction with the real-time APIs for each mode of transport, then you will need to filter out the agencies in the bundle and use the corresponding real-time agencies.
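The filtering step described above could be sketched as follows. This assumes, purely for illustration, that an agency carries the same agency_id in the complete bundle and in the real-time bundle; the Reference Tables are the place to confirm how agencies are actually defined in each feed. The sample rows and the helper name `split_routes_by_realtime` are hypothetical.

```python
import csv
import io

# Hypothetical extracts: routes.txt from the complete bundle and agency.txt
# from the real-time bundle. All IDs and names below are made up.
COMPLETE_ROUTES = (
    "route_id,agency_id,route_short_name\n"
    "R1,AG1,100\n"
    "R2,AG2,200\n"
    "R3,AG1,101\n"
)
REALTIME_AGENCIES = "agency_id,agency_name\nAG1,Example Real-time Operator\n"


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def split_routes_by_realtime(complete_routes, realtime_agencies):
    """Split routes into those run by real-time agencies and the rest.

    Assumption: agency_id values match between the two bundles.
    """
    realtime_ids = {agency["agency_id"] for agency in realtime_agencies}
    covered = [r for r in complete_routes if r["agency_id"] in realtime_ids]
    static_only = [r for r in complete_routes if r["agency_id"] not in realtime_ids]
    return covered, static_only


covered, static_only = split_routes_by_realtime(
    read_rows(COMPLETE_ROUTES), read_rows(REALTIME_AGENCIES)
)
print([r["route_id"] for r in covered])
print([r["route_id"] for r in static_only])
```

Routes in the first list would take their static data from the real-time GTFS bundle, while routes in the second list stay on the complete bundle.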
You should now have a better idea of how our APIs and data feeds relate to one another, what they're used for and how to combine them to get the output you're looking for to develop your application. We strongly recommend you read our technical documentation outlined in Section 4.0 to get a good understanding of all our data at a more technical level.
3.0 Tips, Tricks and Troubleshooting
3.1 Acquiring an API Key
To get your own API key to use for your app you will have to register on the Open Data Hub and set up an application. You can follow our User Guide to get started.
3.2 API Authentication
You should be using the API Key authentication method to call our APIs. When calling an API endpoint, you specify your API key in an HTTP header called 'Authorization'. For example, if your API key was ak123, your request will include a header like the one below:
Authorization: apikey ak123
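As a minimal sketch, the header above can be attached to a request with the Python standard library as shown here. The URL is a placeholder rather than a real endpoint, and the helper name `build_request` is made up for illustration. The request is only constructed, not sent.

```python
import urllib.request


def build_request(url, api_key):
    """Create a request carrying the API key in the Authorization header."""
    return urllib.request.Request(
        url,
        headers={"Authorization": f"apikey {api_key}"},
    )


# Placeholder URL: substitute the endpoint listed for the dataset you need.
request = build_request("https://example.invalid/hypothetical/endpoint", "ak123")
print(request.get_header("Authorization"))
```

Passing the request to `urllib.request.urlopen` would send it. Keep your real key out of source control, for example by reading it from an environment variable.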
3.3 Account and Throttle Limits
The default account plan when you register on the Open Data Hub is the "Bronze Plan", which gives you a quota of 60,000 API calls per day and a rate limit of 5 calls per second. We may upgrade your account in special cases, but generally there is no reason for you to hit those limits. This is explained further in Section 3.4 below.
3.4 GTFS and Real-time Data Updates
How often the data is updated depends on the mode of transport and on whether it's GTFS or real-time data.
- Buses: GTFS bundles update nightly between 20:00 and 04:30 and real-time data updates every 10 seconds
- Sydney Trains: GTFS updates daily at approx. 01:30 and real-time data updates every 10 seconds
- NSW Trains: GTFS updates daily at around 01:00 and real-time data updates every 30 seconds
- Ferries: GTFS updates daily at approx. 05:15 and real-time data updates every 30 seconds
- Lightrail: GTFS updates infrequently as it's a manual process. Real-time data updates approx. every 10 seconds
Given how often the GTFS and GTFS-R data are updated, you shouldn't need to call the APIs more than once per day per mode for the GTFS data, or more than once every 10-15 seconds for the real-time feeds. There is generally no need to call an API more often than its data is updated, since the data won't have changed, so there is no reason to hit your account or throttle limits.
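As a rough worked example, the sketch below counts how many calls per day it takes to poll one real-time feed for every mode at the update intervals listed above, and compares the total with the Bronze Plan quota of 60,000 calls per day. The dictionary keys and the helper name `calls_per_day` are made up for illustration, and light rail is taken at its approximate 10-second interval.

```python
SECONDS_PER_DAY = 24 * 60 * 60
DAILY_QUOTA = 60_000  # Bronze Plan quota of API calls per day

# Real-time update intervals in seconds, taken from the list above.
REALTIME_INTERVAL_SECONDS = {
    "buses": 10,
    "sydney_trains": 10,
    "nsw_trains": 30,
    "ferries": 30,
    "light_rail": 10,  # approximate
}


def calls_per_day(interval_seconds):
    """Number of calls made in a day when polling once per interval."""
    return SECONDS_PER_DAY // interval_seconds


per_mode = {
    mode: calls_per_day(interval)
    for mode, interval in REALTIME_INTERVAL_SECONDS.items()
}
total_calls = sum(per_mode.values())

print(per_mode)
print(total_calls, total_calls <= DAILY_QUOTA)
```

Polling one feed for all five modes comes to 31,680 calls per day. Polling more than one real-time feed type multiplies this total, so it is worth re-running the sum for your own mix of feeds.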
For more tips, tricks and general information you should read the following pages before diving in to code:
- Our User Guide runs through how to register and set up an application to acquire an API key
- Visit the API Basics page to learn more about how to integrate with our API endpoints
- Our Troubleshooting page is a great reference guide for tips, tricks and further information about the various modes of transport
4.0 Technical Documentation
As stated in Section 2.0, our GTFS files might differ slightly from the standard to suit our modes of transport better. The structure of the files and overall bundle should remain the same but we might use some fields differently according to what data we can offer. These changes are outlined in our technical documentation listed below. These documents also describe how each data value or field is derived and what information each file provides. We strongly recommend you read all of these before attempting to use our APIs and build an application.
4.1 General TfNSW GTFS Release Notes
These are the GTFS release notes for all Greater Sydney transport operators on our API gateway.
4.2 Sydney Trains Technical Documentation
Technical documentation for the Sydney Trains real-time data feeds.
4.3 Buses Technical Documentation
Technical documentation for buses, includes both GTFS and real-time information.
4.4 Ferries Technical Documentation
Technical documentation for the Ferries real-time data feeds.
4.5 Lightrail Technical Documentation
Technical documentation for the Lightrail real-time data feeds.
4.6 NSW Trains Technical Documentation
Technical documentation for the NSW Trains data feeds, includes both GTFS and GTFS-R feeds.
4.7 TfNSW Trip Planner Documentation
Technical documentation for the TfNSW Trip Planner API end-points.
For more technical documentation and other information visit our Documentation page on the Open Data Hub. Most of the questions we receive from app developers can be answered by reading all the documentation that's available.
5.0 Other Guides and Technical Assistance
Although we don't offer programming assistance or guidance, there are many resources out there to help you. Your first port of call should be our Open Data Forum, where you can connect with other developers, ask for help or request data for your new project. If you're new to working with APIs, you can find tutorials and guides on the Stack Exchange Open Data community, or use our API Explorer to see a live demo of how they work.
Below is a list of guides that were written by others outside of Transport for NSW. These include coding and are more technical or advanced. Please note that some of these might not be up to date. Always make sure to read all our documentation and latest reference guides to make sure you're using the latest data and using the recommended methods.
- Tracking Sydney Ferries in Real-Time - http://themagiscian.com/2017/07/23/tracking-sydney-ferries-in-real-time-with-opensource-gis-tools
- How to Access Real-Time Bus Positions - http://nbviewer.jupyter.org/gist/timbennett/7ec739fc619459316859d3875b76a76b/notebook.ipynb
- Language bindings generated from the GTFS-realtime protocol buffer spec - https://github.com/google/gtfs-realtime-bindings
- How To Make Calls to a REST API Using C# - http://stackoverflow.com/questions/9620278/how-do-i-make-calls-to-a-rest-api-using-c
Hopefully you now have a good understanding of how our APIs work and how you can integrate with them to make awesome apps! Don't forget to let us know about your latest creation by either emailing us at OpenDataProgram@transport.nsw.gov.au, posting on the Open Data Forum using the 'Brag' category or tweeting at us @DataTfNSW.
- The TfNSW Open Data Team