GitHub - betagouv/ComparIA: Blindly query two conversational language models on tasks expressed in French and compare the results.

Conversational AI comparator

Compar:IA is a tool for blindly comparing conversational AI models. It aims to raise awareness of the challenges of generative AI (bias, environmental impact) and to build French-language preference datasets.

🌐 comparia.beta.gouv.fr · 📚 About · 🚀 Description of the State Startup

The comparator is based on Gradio and FastChat, the code behind the Chatbot Arena by LMSYS (see Project architecture and rationale below).

Run the arena

Before starting, rename register-api-endpoint-file.json.dist to register-api-endpoint-file.json and add valid API keys.
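The configuration step above can also be scripted. The following is a minimal sketch, not part of the repository: the helper name `prepare_endpoint_config` is hypothetical, and it copies the template instead of renaming it so that the `.dist` file stays in place. It only checks that the file parses as JSON; it makes no assumption about the keys the arena expects, and the API keys still have to be added by hand.

```python
import json
import shutil
from pathlib import Path


def prepare_endpoint_config(repo_root: Path) -> Path:
    """Create the endpoint config from its .dist template if it is missing."""
    template = repo_root / "register-api-endpoint-file.json.dist"
    target = repo_root / "register-api-endpoint-file.json"
    if not target.exists():
        # Copy rather than rename (an assumption of this sketch),
        # so the template stays available in the working tree.
        shutil.copyfile(template, target)
    # Fail early on a JSON syntax error; the content itself is not validated.
    json.loads(target.read_text(encoding="utf-8"))
    return target
```

Calling `prepare_endpoint_config(Path("."))` from the repository root leaves an existing register-api-endpoint-file.json untouched, so keys that were already added are not overwritten.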

With Docker

cd docker/; docker compose up -d

Without Docker

Because of how Gradio's Custom Components work, and because ours haven't been published as Python packages, they have to be built manually, which is a bit tedious. At the moment we use 4 custom components:

pip install -r requirements.txt
cd custom_components/frinput
gradio cc install;gradio cc build --no-generate-docs
cd ../../custom_components/customradiocard
gradio cc install;gradio cc build --no-generate-docs
cd ../../custom_components/customdropdown
gradio cc install;gradio cc build --no-generate-docs
cd ../../custom_components/customchatbot
gradio cc install; npm install @gouvfr/dsfr;gradio cc build --no-generate-docs
cd ../..

Then run `export LANGUIA_DEBUG=True; uvicorn main:app --reload --timeout-graceful-shutdown 1`, or simply `uvicorn main:app`.
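The four near-identical build steps above could be driven from a small script. This is a sketch under assumptions: `build_commands` and `build_all` are hypothetical helpers, not files in the repository, and `pip install -r requirements.txt` is still expected to have been run first. Unlike the `;`-separated shell lines, it stops at the first failing command.

```python
import subprocess
from pathlib import Path

# Component folders, in the order used by the manual steps.
COMPONENTS = ["frinput", "customradiocard", "customdropdown", "customchatbot"]


def build_commands(component: str) -> list[list[str]]:
    """Return the commands to run inside one component folder, in order."""
    commands = [["gradio", "cc", "install"]]
    if component == "customchatbot":
        # Only this component installs the DSFR npm package in the manual steps.
        commands.append(["npm", "install", "@gouvfr/dsfr"])
    commands.append(["gradio", "cc", "build", "--no-generate-docs"])
    return commands


def build_all(repo_root: Path) -> None:
    """Build every custom component, stopping at the first failing command."""
    for component in COMPONENTS:
        workdir = repo_root / "custom_components" / component
        for command in build_commands(component):
            subprocess.run(command, cwd=workdir, check=True)
```

Running `build_all(Path("."))` from the repository root would replace the `cd` and `gradio cc` lines; the `uvicorn` command is then started as usual.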

Project architecture and rationale

LMSys fork

We initially forked LMSYS' FastChat codebase, used at https://lmarena.ai, to get an arena running immediately. Its architecture was composed of:

the arena (a Gradio project with 2-3 Python files)
a controller to register model workers

However, since it was easier to run models in vLLM Docker containers or through external APIs, the controller / model workers architecture ended up as unused code. Furthermore, we needed a dashboard for the controller, so the controller was rewritten.

Custom Components

Our main focus with compar:IA is to invest heavily in overall design and UX/UI. Thanks to Gradio's Custom Components, we can customize any Gradio component as a Svelte app and control the look and feel of the user interface.

We currently use 4 distinct (and sometimes poorly named) Custom Components:

FrInput: the DSFR input component
CustomDropdown: encompasses most of the first screen, with mode selection, model selection, and the initial textarea
CustomRadioCard: used in the first screen for suggestions and later for voting
CustomChatbot: a component crafted for the specific compar:IA experience, allowing you to compare two chatbots' responses to one user message and to collect the user's feedback

Mounted `gradio.Blocks` within a FastAPI app

Because we needed a static website as well, we used Gradio's mount_gradio_app feature. It lets you customize how FastAPI serves the Gradio app (Gradio is based on FastAPI) while using the underlying FastAPI app to serve other pages. This lives in main.py, while most of the Gradio code is split between languia/block_arena.py and languia/listeners.py. The static site's pages are in the templates/ folder, which also hosts the complex Jinja2 template files needed in the arena (especially after the "reveal" step).

Future evolutions

After 8 months of intensive development, the Gradio framework may be reaching some of its limits, especially when it comes to fully custom CSS. Ugly CSS overrides are used heavily throughout this repo (especially in the infamous assets/custom-arena.css), and integrating the French design system (DSFR) is made difficult by the Svelte-generated CSS that Gradio adds everywhere. Furthermore, now that the app is more stable, we no longer need the quick iteration that Gradio allowed, and we could gain some snappiness by moving to a Svelte SPA (Single Page App) with lighter frontend-backend communication.

I feel there is a gradual path: extract the Custom Components one by one into a basic Svelte app and replace Gradio with a basic FastAPI endpoint, screen by screen, iterating as we go. If you have opinions on this, I warmly welcome you to open an issue on the matter 🙃

Latest commit: ketsapiwiq, "fix: margins in vs. badge", cc071d5 · Mar 19, 2025 · 3,050 commits

Name	Last commit message	Last commit date
.github	chore: disable CI	Jul 11, 2024
assets	feat: add peren	Mar 12, 2025
custom_components	fix: margins in vs. badge	Mar 19, 2025
data	Update train code to support the new tokenizer (#2498)	Sep 30, 2023
docker	refacto: use dsn string for db	Mar 14, 2025
languia	fix: don't fail in logs	Mar 19, 2025
schemas	reorg utils folder	Mar 10, 2025
templates	feat: natural sort	Mar 17, 2025
tests	fix: locust	Dec 30, 2024
utils	drop pii_analyzed col	Mar 17, 2025
.dockerignore	rework vars env + gitignore	Jun 24, 2024
.gitignore	round also co2	Mar 5, 2025
.pylintrc	[lint] Add code formatter and linter (#511)	Apr 21, 2023
LICENSE	Initial commit	Mar 19, 2023
README.md	doc: README	Feb 12, 2025
controller.py	fix: bypass for o3-mini	Mar 19, 2025
demo.py	feat: add gradio only file	Mar 12, 2025
main.py	raise objective	Mar 14, 2025
models-extra-info.toml	fix: reasoning in qwq	Mar 19, 2025
register-api-endpoint-file.json.dist	fix example config	Mar 12, 2025
requirements.txt	cleanup reqs	Mar 10, 2025
sonar-project.properties	fix: sonar	Nov 27, 2024
