Visualise 50,000 Dublin Bus GPS Traces

Visualise 50,000 Dublin Bus GPS traces – Visualisation – http://cdb.io/1g87nwy

Post by Colin Broderick

This is a short post to detail how you can produce a time series visualisation like the one above using data from Dublinked and CartoDB.

The Dataset

Dublin City Council and Dublin Bus have made available a dataset through Dublinked which contains a single month’s worth of Dublin Bus GPS traces. 

To download the dataset please visit the following link – http://dublinked.com/datastore/datasets/dataset-304.php

It is a single .zip file which, when unzipped, contains 31 gzipped CSV files, one for each day of January, named like siri.20130131.csv.gz (siri.YYYYMMDD.csv.gz).

So today you will take one of the files, the one for the 16th of January (siri.20130116.csv.gz), and extract it. You should now have a file named siri.20130116.csv.
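On a Unix machine the extraction can be done from the command line. The following is a minimal sketch, assuming gunzip is installed and the .gz file sits in the current directory; extract_day is a hypothetical helper name used only for illustration.

```shell
# extract_day is a hypothetical helper: it decompresses siri.<day>.csv.gz
# into siri.<day>.csv and leaves the original .gz file in place.
extract_day() {
  day="$1"   # date in YYYYMMDD form, e.g. 20130116
  gunzip -c "siri.$day.csv.gz" > "siri.$day.csv"
}
```

Calling extract_day 20130116 should then leave siri.20130116.csv next to the compressed file.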

The first thing to notice about this file is that it is 162.7MB in size. To use it with the CartoDB Free Plan you must reduce the file size to 10MB or less.

Now some information on the structure of that file. We can see that the file has 776,641 lines by running the following command:

wc -l siri.20130116.csv

OK, so now to look at the file itself. It is made up of lines of comma-separated values (a sample line is shown further down). The file does not include the field names on the first line, so we will have to add those in a minute. But first you need to split the file so you can later upload it to CartoDB. Since I’m on a Unix machine this is very simple: running the following command splits the file into chunks of 50,000 rows each:

split -l 50000 siri.20130116.csv

When you go back to this folder you will see that the CSV file has been split into 21 files named xaa, xab and so on (split does not add a file extension itself, so the .csv ending has to be added afterwards). You will see that each of these files is now 4.7MB in size, perfect for uploading to CartoDB.

Right, so what do those files look like then? 50,000 lines, each just like the one below.

1358333631000000,4,0,00041001,2013-01-16,4938,HN,0,-6.251646,53.341492,349,4006,43038,494,1

But what do those fields mean? Go back to the metadata on the download link page and you will see that the fields of the file are as follows:

Timestamp,Line ID,Direction,Journey Pattern ID,Time Frame,Vehicle Journey ID,Operator,Congestion,Lon,Lat,Delay,Block ID,Vehicle ID,Stop ID,At Stop

You will need to open one of the new .csv files in your text editor of choice and paste the line above in as the first line, so that the first two lines look like this:

Timestamp,Line ID,Direction,Journey Pattern ID,Time Frame,Vehicle Journey ID,Operator,Congestion,Lon,Lat,Delay,Block ID,Vehicle ID,Stop ID,At Stop

1358333631000000,4,0,00041001,2013-01-16,4938,HN,0,-6.251646,53.341492,349,4006,43038,494,1
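Pasting the header in by hand gets tedious if you want to prepare more than one chunk. The following is a minimal sketch of doing it in the shell, assuming the chunks still have split's default names (xaa, xab, ...) and sit in the current directory; add_header is a hypothetical helper name, and the header text is the field list quoted from the metadata.

```shell
# Header line taken from the field list in the dataset metadata.
HEADER='Timestamp,Line ID,Direction,Journey Pattern ID,Time Frame,Vehicle Journey ID,Operator,Congestion,Lon,Lat,Delay,Block ID,Vehicle ID,Stop ID,At Stop'

# add_header is a hypothetical helper: it writes <chunk>.csv containing the
# header line followed by the contents of <chunk>, leaving <chunk> untouched.
add_header() {
  chunk="$1"
  { printf '%s\n' "$HEADER"; cat "$chunk"; } > "$chunk.csv"
}
```

Running add_header xaa should produce xaa.csv with the field names on the first line; a loop such as for f in x??; do add_header "$f"; done would handle every chunk.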

OK, so now you are ready to add the data to CartoDB and get visualising.

First you must go to CartoDB.com and either log in to your existing account or sign up for a new, free account.

You will then be presented with the view below once you have logged in. You will need to click the link at the bottom of the page to upload the GPS Trace Data.

Next click Select a file, navigate to the CSV file you want to upload, and click Create Table.

Once the file is uploaded you will be presented with a table view of the data.

There are two things you need to do before you jump to the map view. First you need to click on the timestamp column where it says string and change this from string to number.

You need to do the same for the line_id column, changing it from string to number. OK, we’re finally set to give visualisation a go. Next, click the Map View tab at the top of the page. You will see all 50,000 points displayed on the map. As you can see, that creates a rather busy-looking map.

You should probably use a basemap that looks a little less busy, so click Basemap in the upper left corner and select GMaps Dark. You should get something like this:

Right, let’s animate this map. First click the Wizards tab on the right of the screen and select the Torque Map option.

To animate the map, the column you will use is the timestamp column. This column contains an integer Unix timestamp: the time elapsed since Thursday, 1 January 1970 (http://en.wikipedia.org/wiki/Unix_time). Note that the values here are in microseconds rather than seconds: the sample value 1358333631000000 corresponds to 1358333631 seconds, which is 16 January 2013, 10:53:51 UTC. This gives us a sequential display of each of our bus traces. If you don’t believe me, you will in a few minutes.
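If you want to sanity-check a timestamp before animating, you can convert it on the command line. The sample value has 16 digits, which is what you would expect from microseconds since the epoch rather than seconds. The following is a minimal sketch using integer shell arithmetic; to_seconds is a hypothetical helper name.

```shell
# to_seconds is a hypothetical helper: it converts a microsecond Unix
# timestamp to whole seconds using integer shell arithmetic.
to_seconds() {
  echo $(( $1 / 1000000 ))
}

to_seconds 1358333631000000   # prints 1358333631
```

With GNU date, date -u -d @1358333631 then shows the corresponding UTC time; the -d @ form is GNU-specific, so other systems may need a different flag.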

You are now going to change some other parameters in the wizard to make the visualisation look a bit nicer. Set the properties as follows:

Marker Fill: Size: 1, Color: #F11810 (red), Opacity: 0.9

Marker Stroke: Size: 1, Color: #FFF, Opacity: 0.1

Duration:  60 seconds

Steps: 128

Blend Mode: src-over

Resolution: 1

As you may know, I created a Frequent Transport Map for Dublin, so naturally I am interested to see the buses that serve those more frequent routes. To do this you will need to add an SQL filter that shows only the buses with a line_id corresponding to those routes.

Click the SQL tab and use the query below, which filters the table and visualisation to show only these routes.

SELECT * FROM db_2013_01_16_xal WHERE (line_id = 4 OR line_id = 7 OR line_id = 13 OR line_id = 15 OR line_id = 16 OR line_id = 27 OR line_id = 39 OR line_id = 40 OR line_id = 46 OR line_id = 83 OR line_id = 123 OR line_id = 145 OR line_id = 150)

As you can see, there are far fewer points cluttering up the map.

That’s it, you’ve finished the map. You could use different colours if you wish.
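If you would rather thin the data out before uploading, the same route filter can be approximated on the command line. The following is a minimal sketch with awk, assuming a chunk without a header line in which the second field is the Line ID; keep_lines is a hypothetical helper name and not part of CartoDB.

```shell
# keep_lines is a hypothetical helper: it prints only the rows of a
# headerless CSV chunk whose second field (Line ID) is in the given list.
keep_lines() {
  ids="$1"    # comma-separated Line IDs, e.g. 4,7,13
  file="$2"   # chunk produced by split, without a header line
  awk -F, -v ids="$ids" '
    BEGIN { n = split(ids, a, ","); for (i = 1; i <= n; i++) keep[a[i]] = 1 }
    ($2 in keep)
  ' "$file"
}
```

For example, keep_lines 4,7,13,15,16,27,39,40,46,83,123,145,150 xaa > xaa_frequent would write the matching rows to a new file. The match is an exact string comparison, so 14 is not matched by 4 or 145.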

The last thing to do is to publish your visualisation: click Visualize in the top right-hand corner and give your visualisation a name like Dublin Bus 50,000 GPS Traces 16/1/13.

Finally you will need to click Publish to get a link to share with the world.

You will get a big long link like this:

http://colinb.cartodb.com/viz/09afd4ba-5d25-11e3-91f6-5dec37ad60a3/embed_map?title=true&description=true&search=false&shareable=true&cartodb_logo=true&layer_selector=false&legends=true&scrollwheel=true&sublayer_options=1&sql=SELECT%20*%20FROM%20db_2013_01_16_xal%20WHERE%20(line_id%20%3D%204%20OR%20line_id%20%3D%207%20OR%20line_id%20%3D%2013%20OR%20line_id%20%3D%2015%20OR%20line_id%20%3D%2016%20OR%20line_id%20%3D%2027%20OR%20line_id%20%3D%2039%20OR%20line_id%20%3D%2040%20OR%20line_id%20%3D%2046%20OR%20line_id%20%3D%2083%20OR%20line_id%20%3D%20123%20OR%20line_id%20%3D%20145%20OR%20line_id%20%3D%20150)&sw_lat=53.28615303292634&sw_lon=-6.458244323730469&ne_lat=53.38250924866269&ne_lon=-6.020164489746094 which you will probably shrink using your favourite provider to something more useable like this – http://ow.ly/rsnQv

That’s it. Congratulations, you’ve created a very nice animated visualisation of some of Dublin Bus’s GPS traces.

About Author

Colin Broderick is the creator of the Dublin Frequent Transport Map, which received widespread media coverage. He obtained a BSc in Spatial Planning with First Class Honours from Dublin Institute of Technology. He is currently pursuing a Master of Science in Geospatial Technologies at ISEGI Nova Lisboa, WWU Muenster & UJI Castellón. Formerly he worked as a Planner with EirGrid, the Irish Electricity Transmission System Operator. Colin also produces visualisations of national geospatial datasets, with a focus on those which aid spatial decision making.


3 thoughts on “Visualise 50,000 Dublin Bus GPS Traces”

  1. Thanks, I wasn’t aware of CARTODB, that looks like a very impressive tool. The visualisation of the data looks great. Thanks for the guide.

  2. Hello Colin, I am trying to work with your data, but I noticed that when I select all the GPS point for one single vehicleID and sort them in time order, they are NOT forming a path. What am I doing wrong ? The file i used was siri.20130116.csv
