Description
I've reviewed both the Evaluate Tools page and the Run evaluations with the Arcade CLI page.
Suggested edits for both
- Use the "Outcomes/YWL/Prereqs" header from the Build Tools content
- Match the page title to the navigation item (currently they don't match!)
Edits for Evaluate Tools
- Step 2 has navigating to `my_server`, but if you created an MCP server as per Prerequisites, you'll already be in that folder. Rephrase as, "in your server's root folder, create a new Python file..."
- Step 4 "Run the evaluation" has `export OPENAI_API_KEY=<YOUR_OPENAI_API_KEY> arcade evals .` on one line. Split these into two commands; the `arcade evals` call should stand alone.
Reference your recent quickstart for how to handle environment variables. Possibly call this out in a warning (hey, don't forget to set your env variables!).
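For example, the corrected step could show the two commands separately (keeping the placeholder key as in the current docs):

```shell
# Set your OpenAI key first (see the quickstart for env-var handling)
export OPENAI_API_KEY=<YOUR_OPENAI_API_KEY>

# Then run the evaluation from your server's root folder
arcade evals .
```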
- Remove the bit about the different providers. Currently it ONLY works with OpenAI.
- Move "How it works" and "Next Steps" outside the steps.
- "Critic Classes" should be moved to a reference page. Consider consolidating with their explanations in Why Evaluations?.
- Advanced evaluation cases could also be moved to its own page. Remember, the outcome of this page was to evaluate.
Run Evaluations/Run evaluations with the Arcade CLI
- Overall this page is both a guide (how evals work), where the former page is a tutorial, and also a reference (all the options). It reads like a non-tutorial version of the last page, which makes it a little repetitive. I would lean into making this the comprehensive guide for `arcade evals` and move the advanced content from Evaluate Tools into it. You might also split this into a guide plus a reference, to DRY it out and shorten the pages (folks looking for a command reference are not looking for a tutorial).
- The section on Handling multiple models needs to be removed. Currently it only supports OpenAI, though you could just point this out and say "more coming soon!"