Hi neovateai, this is excellent work on UI-related tasks. I found that the 7B-parameter UI-UG outperforms many models on referring and grounding tasks, but there still seems to be room for improvement on generation tasks. Have you tried applying this method to larger models (e.g., Qwen-2.5-VL-14B, 32B, or InternVL)?
Best wishes.