
Making Data Great With Our Innovation Challenge Winners

In this post, we take a detailed look at the three solutions to come out of the Make Data Great Innovation Challenge.

The challenge, launched in November 2020, presented 5 challenge statements varying in focus from data discovery to data update notifications and personalisation. 

From many exciting submissions, 3 solutions were chosen to progress and are now live on the Open Data Hub - and ready to make our data great!  

Data Update Notification Service by Arid Systems

One of the solutions comes from Arid Systems, a start-up led by Director Bill Wood that works on solutions for making real-time data easy to access and share.

“Our bread and butter is the so-called “Internet of Things”. It’s a tremendously exciting field and is leading to an explosion of real-time data which will become increasingly important to transport, along with almost every other aspect of our modern world”, says Bill. 

The solution provides Open Data Hub users with immediate notifications of updates to datasets. This will be of particular use to our regular users, as well as anyone who’s waiting for the latest release of an ongoing dataset such as Opal Tap-On/Tap-Off. 

“The service provides alerts in real-time. For example, an alert saying “the transport timetables (which may underpin your application) have just been updated, so come get them!” This unburdens the users of the data of that discovery process, and ensures they won’t be caught out with outdated data – allowing them to continue to focus their energies upon innovating with the data itself.”

“The core challenge we faced was knowing what had changed and how. Think about it this way: you are a keen follower of supermarket pricing, perhaps you buy for a large family. But how to stretch the budget? You can periodically check the weekly circulars, wander the aisles of your local stores, noting prices and comparing to your notes from last week. But the core challenge remains – when and where to check, and how often to do so?”

“That’s how things are with most open data sources. Take public transport timetables for example – or any open data that is subject to change. You incorporate this information into your application. Prior to the implementation of our Real-Time Service, like the shopping challenge, the only way to know data had been updated was to keep downloading it and comparing it to last time. But again, when and how often?”

So, how does the solution work? 

“We use cloud-based computing which is tightly coupled to the Open Data Hub’s data repositories. The current Open Data Hub infrastructure is Drupal-based and comes with various APIs for accessing the database which underpins it. We continuously scan the database and perform some fairly sophisticated analysis around the metadata to ascertain that something has changed, and how. We then translate that into understandable and meaningful indications for the users of that data.”

This is the first time Arid Systems have participated in a TfNSW Innovation Challenge, and they saw Make Data Great as a win-win opportunity.

“We were using the Open Data Hub as a source of real-time data to assist in the development of our solution, and came across some shortcomings. It seemed fairly obvious that if we were challenged, other app developers might be experiencing the same.” 

“Innovation is a messy business! If the process were straightforward, it would hardly be innovation. The biggest challenge for us was that while we understood the problem and had a sense for the solution, until we launched into it we really didn’t know what it would take, nor could we see the hurdles we would face. The effort turned out to be considerably greater than we envisaged, and the solution somewhat adapted, however looking back we feel that we have solved the fundamental challenge which we set out to address.”

“The team at Transport for NSW have been tremendously supportive of our work, and truly understand the challenges around innovation. For that we are thankful. We remain enormously impressed that they continue to push on the existing open data boundaries, and are so proactive in leading with a continuous, innovation-centred process of improvement.”

To set up your Data Update Notification Service, log in to the Open Data Hub, where you will find the Subscriptions option under My Account. By creating a subscription, you can be notified whenever the dataset you are interested in is updated.

We have also added a new Data Updates page, which can be found under the Developers menu, that provides an overview of all updated data over the past month. A number of filters are available to narrow down what is shown. 

Connect with Arid Systems via their LinkedIn page

Odie Bot by Data Driven

The next Make Data Great solution is Odie - a chat bot created by Data Driven, a Microsoft Gold Partner specialising in advanced analytics consulting and building modern data platforms for its customers.

“We deliver innovative data and AI solutions to help organisations build a data-driven culture and empower their business decisions with insights”, says Sofia Oropreza, Sales Director. 

“Entering the NSW Transport Innovation Challenge made sense for us as we have deep experience in the Government Transport domain and understand the challenges around finding useful data.”

“We enjoy solving business problems with technical solutions in the simplest way possible and thought that Odie Bot was an elegant solution to the problem and would be useful for Open Data Hub users.”

Odie is an embedded website chat bot that helps users easily find datasets, answer questions, and more. 

“Odie helps users find the data they are looking for in a quicker, friendlier way, whilst reducing support costs and effort. Odie is an interactive, conversational bot embedded in the Open Data Hub website acting as the first line of interaction between users and the business.”

“When a website user first opens the page, Odie welcomes them and helps them answer common support queries as well as surface lesser known datasets. More satisfied users = more engagement and more innovation.”

The Open Data and Innovation team will be able to continually train Odie to serve our customers more effectively. Odie understands misspelt words and similar phrases, improving the user experience for our customers. 

The chat bot is available 24/7 and improves the discoverability of lesser-known datasets, which encourages innovation as less popular data is surfaced.

“Odie uses Natural Language Understanding combined with fuzzy search logic to try and understand the user's request as closely as possible. This allows users to converse with Odie in a non-technical, conversational way to find the data solutions they want.”
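
As a rough illustration of how fuzzy matching can tolerate misspelt or loosely worded requests, here is a minimal Python sketch. It is not how Odie is built: it swaps in the standard library's difflib module for the natural language and search services described in this post, and the dataset titles and the find_datasets function are hypothetical.

```python
import difflib

# Hypothetical dataset titles, used only for illustration.
DATASET_TITLES = [
    "Opal Tap-On/Tap-Off",
    "Timetables Complete GTFS",
    "Public Transport Realtime Vehicle Positions",
]


def find_datasets(query, titles=DATASET_TITLES, limit=3, cutoff=0.5):
    """Return the titles most similar to the query, best match first."""
    # Compare in lower case so capitalisation does not affect the score.
    lowered = {title.lower(): title for title in titles}
    matches = difflib.get_close_matches(
        query.lower(), list(lowered), n=limit, cutoff=cutoff
    )
    return [lowered[match] for match in matches]
```

A query such as find_datasets("timetabels gtfs") still points to the timetable dataset despite the typo. The cutoff value is a judgment call: lower values forgive more spelling mistakes but return more unrelated results.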

“The bot leverages the Microsoft Bot Framework and Azure serverless PaaS services, with its "brain" built using Azure Cognitive services such as LUIS for natural language understanding to help better understand the user's intent, Azure QnA maker to create a conversational question-and-answer layer over existing data, and Azure Search for full text and filtered queries.”

“The Innovation Challenge process was easy to follow, and the Open Data Hub provided a good overview of the challenges they were trying to solve and the key areas to be addressed. In addition, the Open Data Hub team was always available during the project.”

Odie is now live on our website (desktop only), ready to answer your questions and help you find datasets. You can start getting help from the Odie chat bot by clicking on the speech bubble in the bottom left-hand corner of the Open Data Hub.

Head over to the Data Driven website to learn more about what they do and read more customer success stories. 

GTFS Studio by Lynxx

Last but definitely not least is the GTFS Studio, built by Lynxx, which provides a ‘human’ readable overview of GTFS bundles, allowing non-technical users to access and understand GTFS data. Through the intuitive interface, users are able to explore GTFS data that they have defined using specific endpoints.

“Lynxx is an advanced analytics and high tech systems consultancy, with a lot of our work in the public transport sector. We are headquartered in the Netherlands, but decided to open our Asia-Pacific hub in Sydney (a little over 4 years ago) because of the advent of the open data philosophy in NSW public transport systems”, says Matt McInnes, Lynxx Managing Director for Asia Pacific.

“Our client work involves a lot of reading and interpreting GTFS & GTFS-R data, and often we need to interrogate timetables, produce GTFS-R feeds or process GTFS data. We built a tool for our own use to make GTFS and GTFS-R more humanly accessible, so we figured this sort of tool could be useful for a wider group of users.”

How will this improve the experience for Open Data Hub Users? “They’ll no longer need to download GTFS bundles, unpack them, load them into spreadsheets and then build complex filters to be able to read, understand and interpret timetables.” 
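
To give a sense of the manual processing this replaces, here is a minimal Python sketch that counts how many services run from one stop to another on a given date, working directly from the text of a GTFS bundle's trips.txt, stop_times.txt and calendar.txt files. It is not Lynxx's code. The function names are hypothetical, and the sketch assumes the bundle uses calendar.txt and ignores the exceptions that can be listed in calendar_dates.txt.

```python
import csv
import io
from datetime import date

# calendar.txt has one 0/1 column per weekday, in this order.
WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]


def read_rows(text):
    """Parse the text of one GTFS file into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def active_services(calendar_rows, day):
    """Service IDs that calendar.txt marks as running on the given date."""
    stamp = day.strftime("%Y%m%d")      # GTFS dates are YYYYMMDD strings
    column = WEEKDAYS[day.weekday()]    # Monday is 0 in Python
    return {
        row["service_id"]
        for row in calendar_rows
        if row[column] == "1" and row["start_date"] <= stamp <= row["end_date"]
    }


def count_trips(trips, stop_times, calendar, origin, destination, day):
    """Count trips on `day` that call at `origin` before `destination`."""
    running = active_services(calendar, day)
    sequences = {}  # trip_id -> {stop_id: stop_sequence}
    for row in stop_times:
        stops = sequences.setdefault(row["trip_id"], {})
        stops[row["stop_id"]] = int(row["stop_sequence"])
    count = 0
    for trip in trips:
        stops = sequences.get(trip["trip_id"], {})
        if (trip["service_id"] in running
                and origin in stops and destination in stops
                and stops[origin] < stops[destination]):
            count += 1
    return count
```

Even this simplified version needs three files joined together and a date check, which is the kind of effort a visual tool removes for non-technical users.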

"The NSW GTFS Studio will allow users to see, filter and interact with timetables, or partial timetables, in a very flexible way. Users will be able to search by route, operator, stop, and see route maps and patterns.” 

"A really practical example would be that previously, if you wanted to know how many trains there were between Orange and Dubbo in a given day in a month from now, you would need to either read a PDF timetable from the operator, complete a multi-day search (and count services) via the trip planners, or download a GTFS bundle and convert formats. Now, a user will simply be able to select the route and time and the data will be displayed on the screen.”

The Lynxx GTFS Studio was built using Python in the backend with a Django web framework and a React frontend.

“Although the solution is relatively simple, we have found that presenting timetables in a human-readable format makes it easier to pick up errors. Because of the size and complexity of GTFS bundles, sometimes errors (e.g. wrong shape files, route ID conflicts) only become apparent when they are visible on a screen, as the data exists (so passes automatic checking) but it fails a human logic test. Similarly, GTFS-R data usually disappears immediately, but we're collecting it and allowing recent-past interrogation.”

Having been involved in previous innovation challenges, Lynxx were well positioned to participate in Make Data Great.

“We're old hands with the innovation challenge processes so know what they're all about and are comfortable with the ways of working. We love the flexibility and collaborative approach that the team uses.”

"We need to be precise with the scope - because all the participants are bringing something new to the table, it is always fun to keep creating, so we do need to make sure we close down the imagination and focus on delivery too.”

Lynxx have also built a commercial version of the GTFS Studio, to be launched shortly as the Intelligent Transport Timetable Studio, which contains additional features such as custom analytics views and flexible input data sources. 

You can stay up to date with all things Lynxx via their website and LinkedIn page.