Friday, 7 January 2011

Finding a route on the Underground using pure Java code

I started my project by being impressed by the interesting possibilities involved with developing my own software that could traverse the entire London Underground network. It read as follows:

Q3. A classic but interesting to write program: Write a Java program to find the route between any
two stations on the London Underground network (and now London Overground). The user should
be able to select a start and destination station and the program then finds the route between
them, printing out travel instructions such as which line to take, where to change lines, and so on.
The data describing the network should be stored in one or more files, and read when the program
is run.


Of course I hadn’t reckoned on how hard it would be.

First off, I hadn’t previously considered the nature of a graph data structure.

A graph (in computer science) is a collection of nodes and edges. Nodes in my case represent stations. Each node has a set of items called edges. Edges represent the links between the nodes, in a way that reflects the true inter-connectedness of the tube network.

Each edge comes with three parts to it. There’s the name of the starting node, the name of the endpoint node (on the other end), and the name of the line that the edge is describing.

So you see, a working route finder is very dependent on having the right data available.

First off I had to get my data in manageable format. I extracted the data, and organized it in such a way as to make it useable. I used this data to create what is know as an ‘adjacency list’. This special type of list contains information about the entire network using a Hash Map. The use of a Hash Map helped in improving the efficiency of the adjacency list as it meant avoiding a full traversal to locate items within it. This aspect was very important.

One key reason for creating the Breadth First Search.

The Breadth First Search algorithm (BFS) is, unlike its counter part Depth First, useful in traversing adjacency lists that do not include any weighted edge data. My data only represented the nodes and not the distance between them.

If I’d had distance data then I would have used Dijkstra’s algorithm or possibly A* to do the traversing.

So I spent some time mulling over just how I was going to get back the shortest path for any two stops. It was hard work and involved several re-writes to get it.

The real breakthrough was in understanding the nature of tree traversal. Although the breadth-first algorithm is designed to traverse a graph of any shape, the resulting path it makes is one involving all sibling nodes on any given level. Therefore I had to track which level the search was at in the tree.




















Knowing this meant I could then retrace the steps from which I had come, because the parent of any node was any node that was exactly one level less than the current. I was then able to create a list of all stations visited (in reverse order). So, finished by flipping those round and the output is there including all stations and lines involved. ☺

Please take a look over the code I wrote to see what you think.


Here's the whole class file with everything including data and simple UI for compilation. Please note the package name as it might cause problems.

13 comments:

  1. Hi, I'm learning java as a hobby myself, and this program seemed like a pretty interesting program to write. I was wondering if you could share the source code with me?

    ReplyDelete
  2. http://dl.dropbox.com/u/12814074/BreadthFirstShortestPath.java

    ReplyDelete
  3. Also,

    http://research.cs.queensu.ca/~daver/235/C1352963146/E20070302133910/index.html

    please check here for the source data used in this example.. : )

    ReplyDelete
  4. Going by the code, there seems to be a few other classes (stationGraph, NodeQueue, etc.). Would you be willing to share those as well? I can't really test/run the program with just this one class.

    ReplyDelete
  5. That's true.. Ok here it is... the filenames for station_names.txt and tub_edges.txt mentioned in BreadthFirstShortestPath.java has to point to the file in your directory system.

    http://dl.dropbox.com/u/12814074/TubeRoute.zip

    It should open with a simple UI with two demo stations filled in. Unfortunately you have to restart the application every time you want a new route. Don't forget to use underscore for stations!

    :)

    ReplyDelete
  6. Don't forget to check the map that the route is the shortest one!

    ReplyDelete
  7. It doesnt work when you select Uxbridge and a station you get a null pointer exception

    ReplyDelete
  8. Check the data - some of the stations are stored with weird names... I never said it was perfect

    ReplyDelete
  9. That sounds interesting and could you post all the steps required for running a code. This will avoid the confusions...

    ReplyDelete
  10. You need to import the directory into your Java IDE of choice. I use Eclipse, but you could use Netbeans. Once you have imported it, you just need to run it.

    ReplyDelete
  11. This comment has been removed by the author.

    ReplyDelete
  12. This comment has been removed by the author.

    ReplyDelete
  13. please provide a source code of that program

    ReplyDelete