Making schedules even smaller

This week I’ve been working on how schedules are stored.

What I’m really working towards is showing other stops within walking distance when showing departures from a particular stop. But that requires some nontrivial changes to how I compute departures in general, and at this point I’m reluctant to keep working on the existing implementation: it only works on Android, so every change takes me further away from the iPhone version.

So I first want to get the C++ version, the one that will work on both Android and iPhone, up to the point where it works as well as the Android-only version. From then on, the improvements I make will benefit both platforms. However, now that I’m back to working on the basic algorithms, I realize that some of the decisions I made about how schedules are stored weren’t optimal, and I’d rather not implement them again in C++. Which brings me to what I’ve been working on this week: changing the format to make it simpler, smaller, and easier to access.

It’s all about data compression, which may sound boring but in my experience always turns out to be fun. It tends to be a cycle: come up with an idea for how to make things smaller, implement it, try it on real data, and see if it actually works. If not, start over.

The screenshot above is what it looks like when it actually works. It shows the latest approach I’ve tried, which is probably what I’ll end up going with. It reduces the list of stops visited by a trip to 3.4% of its original size, from 11.1 MB down to 0.4 MB, and the timing information about when stops are visited to 7.1% of its original size, from 21.6 MB down to 1.5 MB. On top of that there’s a separate compression step that will squeeze it even more.
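
To give a flavor of the kind of idea that goes into that cycle, here is a minimal C++ sketch of one common trick for timing data: store each time as the difference from the previous one, and write each difference as a variable-length integer. This is an illustration only and not necessarily the format described above. The names appendVarint, encodeTimes and decodeTimes are hypothetical, and the sketch assumes times are seconds since midnight and never decrease within a trip.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: append an unsigned value as a base-128 varint
// (7 payload bits per byte, high bit set on every byte except the last).
void appendVarint(std::vector<std::uint8_t>& out, std::uint32_t value) {
    while (value >= 0x80) {
        out.push_back(static_cast<std::uint8_t>((value & 0x7F) | 0x80));
        value >>= 7;
    }
    out.push_back(static_cast<std::uint8_t>(value));
}

// Hypothetical encoder: write each time as the delta from the previous time.
// Assumption: times are seconds since midnight, non-decreasing within a trip.
std::vector<std::uint8_t> encodeTimes(const std::vector<std::uint32_t>& times) {
    std::vector<std::uint8_t> out;
    std::uint32_t previous = 0;
    for (std::uint32_t t : times) {
        appendVarint(out, t - previous);
        previous = t;
    }
    return out;
}

// Inverse of encodeTimes: read varints and accumulate the deltas.
std::vector<std::uint32_t> decodeTimes(const std::vector<std::uint8_t>& bytes) {
    std::vector<std::uint32_t> times;
    std::uint32_t previous = 0;
    std::size_t i = 0;
    while (i < bytes.size()) {
        std::uint32_t delta = 0;
        int shift = 0;
        std::uint8_t b = 0;
        do {
            b = bytes[i++];
            delta |= static_cast<std::uint32_t>(b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0 && i < bytes.size());
        previous += delta;
        times.push_back(previous);
    }
    return times;
}
```

The reason this kind of scheme tends to help is that consecutive stops on a trip are usually only a minute or two apart, so most deltas fit in one or two bytes instead of the four a raw 32-bit time would take.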

I’d like these improvements to be part of the next update to the app. I’m not quite there yet, but I’ve just about nailed down how I want things to work, and now I just have to implement it.