Evaluation on Objective Benchmarks

I think this work is meaningful and provide remarkable results. However, I find all the test benchs are subjective benchs which outputs are judged by LLMs. Have you tried using MoA for objective tasks such as MMLU or MATH? I think this could make MoA even more valuable. Thanks!