I’ve spent the last week crunching timetables: building a pipeline that takes the raw timetable data you get from the transit companies, processes and reorganizes it, and outputs a compressed version that contains essentially the same information but is much smaller and easier to use. I’ve been making a lot of progress on that. It’s been a lot of running Python programs against a database, looking at the database, seeing something that doesn’t look quite right, tweaking the Python code, running it again, and so on.
At this point I can take the 16 MB of raw input I get (and that’s a zip file; uncompressed it’s 150 MB) and compress it down to a file that contains almost all the data but is only 6.4 MB uncompressed, 1.3 MB compressed, and is much easier for the app to use. I’m pretty happy with that for now; it’s small enough that it’s reasonable to bundle it with an app. I can make it even smaller, and I will later on, but this week I’ll leave it and switch to working on the Android app. By the end of the week I should have an app that’s starting to be useful, except that it’ll show data that has bits missing and is slightly wrong. Then, possibly next week, I’ll go back and fix the data.
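
To put the size figures above in perspective, here is a quick back-of-the-envelope calculation in Python. It uses only the four sizes quoted in this post; the variable names and the `reduction` helper are made up for illustration and are not part of the actual pipeline.

```python
# Sizes in MB, as quoted in the post.
raw_zipped = 16.0      # raw input as delivered (zip file)
raw_unzipped = 150.0   # raw input after unzipping
out_unzipped = 6.4     # processed output, uncompressed
out_zipped = 1.3       # processed output, compressed


def reduction(before, after):
    """Return how many times smaller `after` is than `before`."""
    return before / after


# Compare like with like: uncompressed vs uncompressed, zipped vs zipped.
uncompressed_factor = reduction(raw_unzipped, out_unzipped)
compressed_factor = reduction(raw_zipped, out_zipped)

print(f"uncompressed: about {uncompressed_factor:.1f}x smaller")
print(f"compressed:   about {compressed_factor:.1f}x smaller")
```

So the processed file is roughly 23 times smaller than the raw data when both are uncompressed, and roughly 12 times smaller when both are compressed.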