- A-Eval: A benchmark designed to evaluate chat LLMs of various scales from a practical application perspective.
- CHiSafetyBench: A benchmark for LLM safety, designed around the standard "Basic security requirements for generative artificial intelligence service" issued by the Chinese government on February 29, 2024.
- CDDMBench: A multimodal benchmark dataset and model for crop disease diagnosis.
- RAODBench: A Benchmark for Road Abandoned Object Detection from Video Surveillance.
- TADBench: A Large-scale Benchmark for Traffic Accidents Detection from Video Surveillance.
2024.6 We released an application-driven benchmark, A-Eval.
2024.6 We released a Chinese safety benchmark, CHiSafetyBench.
2024.7 We released a multimodal benchmark dataset for crop disease diagnosis, CDDMBench.
2024.12 We released a benchmark for road abandoned object detection from video surveillance, RAODBench.
2025.1 We released a large-scale benchmark for traffic accident detection from video surveillance, TADBench.
China Unicom AI Innovation Center, China United Network Communication Group Co., Ltd.