We know there's low-hanging fruit for optimizing PyStack, but we currently have no good way to benchmark its performance or quantify improvements. Design a test harness that can be used to measure the performance impact of our changes, possibly using asv: https://asv.readthedocs.io/en/stable/
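
As a starting point for discussion, here is a minimal sketch of what an asv benchmark could look like. It relies on asv's documented conventions: benchmark modules live in a benchmark directory, methods prefixed with `time_` are timed, and `setup`/`teardown` run outside the timed region. Everything else is an assumption made for illustration: the `RemoteSuite` and `build_command` names, the `stack_depth` parameter, the synthetic target process, and the idea that timing a `pystack remote <pid>` invocation end to end is the measurement we want.

```python
import subprocess
import sys


def build_command(pid):
    # Assumption: PyStack is installed in the benchmark environment and
    # is invoked as `pystack remote <pid>`.
    return ["pystack", "remote", str(pid)]


class RemoteSuite:
    # asv runs each `time_*` method once per parameter value.
    params = [1, 50]
    param_names = ["stack_depth"]

    def setup(self, stack_depth):
        # Hypothetical target: a child interpreter that recurses to the
        # requested depth, announces it is ready, and then sleeps.
        script = (
            "import time\n"
            "def f(n):\n"
            "    if n > 1:\n"
            "        return f(n - 1)\n"
            "    print('ready', flush=True)\n"
            "    time.sleep(3600)\n"
            f"f({stack_depth})\n"
        )
        self.target = subprocess.Popen(
            [sys.executable, "-c", script],
            stdout=subprocess.PIPE,
            text=True,
        )
        # Block until the child has built its stack, so the timed region
        # never races against target startup.
        self.target.stdout.readline()

    def teardown(self, stack_depth):
        # Clean up the target process after each benchmark run.
        self.target.kill()
        self.target.wait()
        self.target.stdout.close()

    def time_remote(self, stack_depth):
        # Timed region: one full PyStack invocation against the target.
        subprocess.run(
            build_command(self.target.pid),
            check=True,
            stdout=subprocess.DEVNULL,
        )
```

Timing the whole command includes interpreter startup for PyStack itself, which may or may not be what we want to track. A finer-grained suite could call into PyStack's Python API directly, but which entry points to benchmark is left open here.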