Sunday, July 20, 2008

GSoC Hoogle: Week 8

This week I've been travelling quite a bit, and rather busy with other things. Hopefully next week I'll be able to focus more time on Hoogle!

This week I fleshed out the final part of type search, including support for instances and alpha renaming of variables. After having implemented all the bits in the type search, I tried to convert the base libraries - and it failed, taking up too much time/memory to feasibly finish.

The type search is based around the idea of having nodes in a graph representing types, and then moving between these nodes, at a cost. In order to avoid a blow-up in the number of nodes in the graph, types are alpha-normalised and then alpha-renaming is performed afterwards. Instead of having 3 type nodes for (a,b), (c,d) and (a,a) there is just one named (a,b) and a 3 sets of alpha-renamings. All is good.

However, once you introduce instance restrictions, the types blow up. For example, from the type node a, you can move to Eq a => a, Ord a => a, Show a => a etc. The large (but feasible) number of type nodes, combined with even a small number of class names, gives a huge number of nodes. In fact, for every type variable in a node there are 2^n possible instance contexts it could take. All is bad.

Fortunately there is a solution - move instance checking outside the type graph. This makes the number of nodes feasible, and should work fairly well. It also has a few other benefits, including slightly better scoring and a simpler implementation in a few places. I also came up with a strategy for moving the cost associated with alpha-renaming into the graph search, which further simplifies things.

Of course, all this work takes time, so overall progress is slower than I would have liked. However, the results so far are promising, and the problems of scale seem to have been successfully addressed. The problem of fast and accurate type searching is hard, but hopefully Hoogle 4 will have a scalable solution that should be useful.

Next week: I want to finish the implementation of type searching, and check it works on the full base libraries. A release would be good, although may take place early in the following week.

User visible changes: Creating a database for the base library will now fail with a stack overflow. Hopefully next weeks changes will fix this!

No comments: