What should we add?
In issue #14996 we added a pass that optimizes chains of single-qubit Clifford+T/Tdg.
A more general form of this algorithm is Zhang's algorithm (arXiv:1903.12456), where we no longer restrict to pi/4 rotations, nor to single-qubit gates. I think this would be nice to have because it can potentially reduce many more non-Clifford gates.
This more general implementation of Zhang's algorithm will have runtime O(n.m^2), because we can no longer just merge/cancel with the rotation ahead of the queue --- we must potentially consider repeated commutations until we find a rotation to merge/cancel with. The complexity was recently improved in arXiv:2407.07846 to O(n.m + n.m.h) where h is the number of internal Hadamards. Optimality is no longer guaranteed here (unless all rotations have non-specific and non-repeated parameters)
It will no longer be feasible to pre-compute all update rules in a big table. Instead one should work with a representation of Paulis using 2n+1 bits (n for the Z part, n for the X part, 1 for the angle). How each Clifford gate updates this Pauli is (for example) described in this paper: arXiv:2212.06928 (sections 2.2 & 2.4.1).
It can be difficult to track global phase in this situation, but I think if we just document that in the pass then it's fine.
What should we add?
In issue #14996 we added a pass that optimizes chains of single-qubit Clifford+T/Tdg.
A more general form of this algorithm is Zhang's algorithm (arXiv:1903.12456), where we no longer restrict to pi/4 rotations, nor to single-qubit gates. I think this would be nice to have because it can potentially reduce many more non-Clifford gates.
This more general implementation of Zhang's algorithm will have runtime
O(n.m^2), because we can no longer just merge/cancel with the rotation ahead of the queue --- we must potentially consider repeated commutations until we find a rotation to merge/cancel with. The complexity was recently improved in arXiv:2407.07846 toO(n.m + n.m.h)where h is the number of internal Hadamards. Optimality is no longer guaranteed here (unless all rotations have non-specific and non-repeated parameters)It will no longer be feasible to pre-compute all update rules in a big table. Instead one should work with a representation of Paulis using 2n+1 bits (n for the Z part, n for the X part, 1 for the angle). How each Clifford gate updates this Pauli is (for example) described in this paper: arXiv:2212.06928 (sections 2.2 & 2.4.1).
It can be difficult to track global phase in this situation, but I think if we just document that in the pass then it's fine.