Evals section update #515

I've reviewed:

* [Create an Evaluation Suite/Evaluate Tools](https://docs.arcade.dev/en/home/evaluate-tools/create-an-evaluation-suite)
* [Run evaluations with the Arcade CLI](https://docs.arcade.dev/en/home/evaluate-tools/run-evaluations)

## Suggested edits for both

* Use the "Outcomes/YWL/Prereqs" header from the [Build Tools content](https://docs.arcade.dev/en/home/build-tools/create-a-mcp-server)
* Match the page title to the navigation item. They currently differ.

## Edits for [Evaluate Tools](https://docs.arcade.dev/en/home/evaluate-tools/create-an-evaluation-suite)

* Step 2 says to navigate to `my_server`, but if you created an MCP server per the Prerequisites, you're already in that folder. Rephrase as: "In your server's root folder, create a new Python file..."
* Step 4 "Run the evaluation" has: 

> 
> ```
> export OPENAI_API_KEY=<YOUR_OPENAI_API_KEY>
> arcade evals .
> ```
> 
> Split these into two. The evals call should stand alone.
> 
> Reference your recent quickstart for how to handle environment variables. Possibly call this out in a warning (hey, don't forget to set your env variables!).

* Remove the bit about the different providers. It currently works ONLY with OpenAI.
* Move "How it works" and "Next Steps" outside the steps.
* "Critic Classes" should be moved to a reference page. Consider consolidating with their explanations in [Why Evaluations?](https://docs.arcade.dev/en/home/evaluate-tools/why-evaluate-tools). 
* [Advanced evaluation cases](https://docs.arcade.dev/en/home/evaluate-tools/create-an-evaluation-suite#advanced-evaluation-cases) could also be moved to its own page. Remember, the outcome of this page was to evaluate. 
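For the Step 4 suggestion above, here is a rough sketch of what the "don't forget your env variables" warning could look like if the page chose to show one. The `check_openai_key` helper is hypothetical (it is not part of the Arcade CLI); only `OPENAI_API_KEY` and `arcade evals .` come from the existing page.

```shell
# Sketch only: check_openai_key is a hypothetical helper, not an Arcade CLI feature.
# It warns and fails when OPENAI_API_KEY is missing or empty.
check_openai_key() {
  if [ -z "${OPENAI_API_KEY:-}" ]; then
    echo "Warning: OPENAI_API_KEY is not set. Export it before running evals." >&2
    return 1
  fi
}

# First block on the page: set the key (placeholder, replace with your own).
#   export OPENAI_API_KEY=<YOUR_OPENAI_API_KEY>
#
# Second block on the page: the evals call, standing alone.
#   check_openai_key && arcade evals .
```

With the key handled in its own step, the second block holds just the evals command, which is the point of the split.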

## [Run Evaluations/Run evaluations with the Arcade CLI](https://docs.arcade.dev/en/home/evaluate-tools/run-evaluations)

* Overall, this page is both a guide (how evals work), where the previous page is a tutorial, and a reference (all the options). It reads like a non-tutorial version of the previous page, which makes it a little repetitive. I would lean into making this a comprehensive guide for `arcade evals` and move the advanced content from Evaluate Tools into it. You might also split it into a guide and a reference, to DRY up and shorten the pages (folks looking for a command reference are not looking for a tutorial).

* The section on [Handling multiple models](https://docs.arcade.dev/en/home/evaluate-tools/run-evaluations#handling-multiple-models) should be removed, since evals currently support only OpenAI. Alternatively, keep a short note that states this limitation and says "more coming soon!"

