diff --git a/docs/docs/annotation-based-tools.md b/docs/docs/annotation-based-tools.md index c5634bc9d4..056458a09f 100644 --- a/docs/docs/annotation-based-tools.md +++ b/docs/docs/annotation-based-tools.md @@ -21,7 +21,7 @@ To start using annotation-based tools in your project, you need to understand th ## @Tool annotation The `@Tool` annotation is used to mark functions that should be exposed as tools to LLMs. -The functions annotated with `@Tool` are collected by reflection from objects that implement the `ToolSet` interface. For details, see [Implement the ToolSet interface](#implement-the-toolset-interface). +The functions annotated with `@Tool` are collected by reflection from objects that implement the `ToolSet` interface. For details, see [Implement the ToolSet interface](#1-implement-the-toolset-interface). ### Definition diff --git a/docs/docs/basic-agents.md b/docs/docs/basic-agents.md deleted file mode 100644 index b68a594c22..0000000000 --- a/docs/docs/basic-agents.md +++ /dev/null @@ -1,189 +0,0 @@ -# Basic agents - -The `AIAgent` class is the core component that lets you create AI agents in your Kotlin applications. - -You can build simple agents with minimal configuration or create sophisticated agents with advanced capabilities by -defining custom strategies, tools, configurations, and custom input/output types. - -This page guides you through the steps necessary to create a basic agent with customizable tools and configurations. - -A basic agent processes a single input and provides a response. -It operates within a single cycle of tool-calling to complete its task and provide a response. -This agent can return either a message or a tool result. -The tool result is returned if the tool registry is provided to the agent. - -If your goal is to build a simple agent to experiment with, you can provide only a prompt executor and LLM when creating it. 
-But if you want more flexibility and customization, you can pass optional parameters to configure the agent. -To learn more about configuration options, see [API reference](https://api.koog.ai/agents/agents-core/ai.koog.agents.core.agent/-a-i-agent/-a-i-agent.html). - -## Prerequisites - -- You have a valid API key from the LLM provider used to implement an AI agent. For a list of all available providers, see [LLM providers](llm-providers.md). - -!!! tip - Use environment variables or a secure configuration management system to store your API keys. - Avoid hardcoding API keys directly in your source code. - -## Creating a basic agent - -### 1. Add dependencies - -To use the `AIAgent` functionality, include all necessary dependencies in your build configuration: - -``` -dependencies { - implementation("ai.koog:koog-agents:VERSION") -} -``` - -For all available installation methods, see [Install Koog](getting-started.md#install-koog). - -### 2. Create an agent - -To create an agent, create an instance of the `AIAgent` class and provide the `promptExecutor` and `llmModel` parameters: - - -```kotlin -val agent = AIAgent( - promptExecutor = simpleOpenAIExecutor(System.getenv("OPENAI_API_KEY")), - llmModel = OpenAIModels.Chat.GPT4o -) -``` - - -### 3. Add a system prompt - -A system prompt is used to define agent behavior. To provide the prompt, use the `systemPrompt` parameter: - - -```kotlin -val agent = AIAgent( - promptExecutor = simpleOpenAIExecutor(System.getenv("YOUR_API_KEY")), - systemPrompt = "You are a helpful assistant. Answer user questions concisely.", - llmModel = OpenAIModels.Chat.GPT4o -) -``` - - -### 4. Configure LLM output - -Provide a temperature of LLM output generation using the `temperature` parameter: - - -```kotlin -val agent = AIAgent( - promptExecutor = simpleOpenAIExecutor(System.getenv("YOUR_API_KEY")), - systemPrompt = "You are a helpful assistant. 
Answer user questions concisely.", - llmModel = OpenAIModels.Chat.GPT4o, - temperature = 0.7 -) -``` - - -### 5. Add tools - -Agents use tools to complete specific tasks. -You can use the built-in tools or implement your own custom tools if needed. - -To configure tools, use the `toolRegistry` parameter that defines the tools available to the agent: - - -```kotlin -val agent = AIAgent( - promptExecutor = simpleOpenAIExecutor(System.getenv("YOUR_API_KEY")), - systemPrompt = "You are a helpful assistant. Answer user questions concisely.", - llmModel = OpenAIModels.Chat.GPT4o, - temperature = 0.7, - toolRegistry = ToolRegistry { - tool(SayToUser) - } -) -``` - -In the example, `SayToUser` is the built-in tool. To learn how to create a custom tool, see [Tools](tools-overview.md). - -### 6. Adjust agent iterations - -Provide the maximum number of steps the agent can take before it is forced to stop using the `maxIterations` parameter: - - -```kotlin -val agent = AIAgent( - promptExecutor = simpleOpenAIExecutor(System.getenv("YOUR_API_KEY")), - systemPrompt = "You are a helpful assistant. Answer user questions concisely.", - llmModel = OpenAIModels.Chat.GPT4o, - temperature = 0.7, - toolRegistry = ToolRegistry { - tool(SayToUser) - }, - maxIterations = 30 -) -``` - - -### 7. Handle events during agent runtime - -Basic agents support custom event handlers. -While having an event handler is not required for creating an agent, it might be helpful for testing, debugging, or making hooks for chained agent interactions. - -For more information on how to use the `EventHandler` feature for monitoring your agent interactions, see [Event Handlers](agent-event-handlers.md). - -### 8. Run the agent - -To run the agent, use the `run()` function: - - -```kotlin -val agent = AIAgent( - promptExecutor = simpleOpenAIExecutor(System.getenv("OPENAI_API_KEY")), - systemPrompt = "You are a helpful assistant. 
Answer user questions concisely.", - llmModel = OpenAIModels.Chat.GPT4o, - temperature = 0.7, - toolRegistry = ToolRegistry { - tool(SayToUser) - }, - maxIterations = 100 -) - -fun main() = runBlocking { - val result = agent.run("Hello! How can you help me?") -} -``` - - -The agent produces the following output: - -``` -Agent says: Hello! I'm here to assist you with a variety of tasks. Whether you have questions, need information, or require help with specific tasks, feel free to ask. How can I assist you today? -``` diff --git a/docs/docs/complex-workflow-agents.md b/docs/docs/complex-workflow-agents.md deleted file mode 100644 index b38535cb58..0000000000 --- a/docs/docs/complex-workflow-agents.md +++ /dev/null @@ -1,538 +0,0 @@ -# Complex workflow agents - -In addition to basic agents, the `AIAgent` class lets you build agents that handle complex workflows by defining -custom strategies, tools, configurations, and custom input/output types. - -!!! tip - If you are new to Koog and want to create the simplest agent, start with [Basic agents](basic-agents.md). - -The process of creating and configuring such an agent typically includes the following steps: - -1. Provide a prompt executor to communicate with the LLM. -2. Define a strategy that controls the agent workflow. -3. Configure agent behavior. -4. Implement tools for the agent to use. -5. Add optional features like event handling, memory, or tracing. -6. Run the agent with user input. - -## Prerequisites - -- You have a valid API key from the LLM provider used to implement an AI agent. For a list of all available providers, see [LLM providers](llm-providers.md). - -!!! tip - Use environment variables or a secure configuration management system to store your API keys. - Avoid hardcoding API keys directly in your source code. - -## Creating a complex workflow agent - -### 1. 
Add dependencies - -To use the `AIAgent` functionality, include all necessary dependencies in your build configuration: - -``` -dependencies { - implementation("ai.koog:koog-agents:VERSION") -} -``` - -For all available installation methods, see [Install Koog](getting-started.md#install-koog). - -### 2. Provide a prompt executor - -Prompt executors manage and run prompts. -You can choose a prompt executor based on the LLM provider you plan to use. -Also, you can create a custom prompt executor using one of the available LLM clients. -To learn more, see [Prompt executors](prompts/prompt-executors.md). - -For example, to provide the OpenAI prompt executor, you need to call the `simpleOpenAIExecutor` function and provide it with the API key required for authentication with the OpenAI service: - - -```kotlin -val promptExecutor = simpleOpenAIExecutor(token) -``` - - -To create a prompt executor that works with multiple LLM providers, do the following: - -1) Configure clients for the required LLM providers with the corresponding API keys. For example: - -```kotlin -val openAIClient = OpenAILLMClient(System.getenv("OPENAI_KEY")) -val anthropicClient = AnthropicLLMClient(System.getenv("ANTHROPIC_KEY")) -val googleClient = GoogleLLMClient(System.getenv("GOOGLE_KEY")) -``` - -2) Pass the configured clients to the `MultiLLMPromptExecutor` class constructor to create a prompt executor with multiple LLM providers: - -```kotlin -val multiExecutor = MultiLLMPromptExecutor(openAIClient, anthropicClient, googleClient) -``` - - -### 3. Define a strategy - -A strategy defines the workflow of your agent by using nodes and edges. It can have arbitrary input and output types, -which can be specified in `strategy` function generic parameters. These will be input/output types of the `AIAgent` as well. -Default type for both input and output is `String`. - -!!! tip - To learn more about strategies, see [Custom strategy graphs](custom-strategy-graphs.md) - -#### 3.1. 
Understand nodes and edges - -Nodes and edges are the building blocks of the strategy. - -Nodes represent processing steps in your agent strategy. - - - -```kotlin -val processNode by node { input -> - // Process the input and return an output - // You can use llm.writeSession to interact with the LLM - // You can call tools using callTool, callToolRaw, etc. - transformedOutput -} -``` - - -!!! tip - There are also pre-defined nodes that you can use in your agent strategy. To learn more, see [Predefined nodes and components](nodes-and-components.md). - -Edges define the connections between nodes. - - - -```kotlin -// Basic edge -edge(sourceNode forwardTo targetNode) - -// Edge with condition -edge(sourceNode forwardTo targetNode onCondition { output -> - // Return true to follow this edge, false to skip it - output.contains("specific text") -}) - -// Edge with transformation -edge(sourceNode forwardTo targetNode transformed { output -> - // Transform the output before passing it to the target node - "Modified: $output" -}) - -// Combined condition and transformation -edge(sourceNode forwardTo targetNode onCondition { it.isNotEmpty() } transformed { it.uppercase() }) -``` - - -#### 3.2. Implement the strategy - -To implement the agent strategy, call the `strategy` function and define nodes and edges. 
For example: - - -```kotlin -val agentStrategy = strategy("Simple calculator") { - // Define nodes for the strategy - val nodeSendInput by nodeLLMRequest() - val nodeExecuteTool by nodeExecuteTool() - val nodeSendToolResult by nodeLLMSendToolResult() - - // Define edges between nodes - // Start -> Send input - edge(nodeStart forwardTo nodeSendInput) - - // Send input -> Finish - edge( - (nodeSendInput forwardTo nodeFinish) - transformed { it } - onAssistantMessage { true } - ) - - // Send input -> Execute tool - edge( - (nodeSendInput forwardTo nodeExecuteTool) - onToolCall { true } - ) - - // Execute tool -> Send the tool result - edge(nodeExecuteTool forwardTo nodeSendToolResult) - - // Send the tool result -> finish - edge( - (nodeSendToolResult forwardTo nodeFinish) - transformed { it } - onAssistantMessage { true } - ) -} -``` - -!!! tip - The `strategy` function lets you define multiple subgraphs, each containing its own set of nodes and edges. - This approach offers more flexibility and functionality compared to using simplified strategy builders. - To learn more about subgraphs, see [Subgraphs](subgraphs-overview.md). - -### 4. Configure the agent - -Define agent behavior with a configuration: - -```kotlin -val agentConfig = AIAgentConfig.withSystemPrompt( - prompt = """ - You are a simple calculator assistant. - You can add two numbers together using the calculator tool. - When the user provides input, extract the numbers they want to add. - The input might be in various formats like "add 5 and 7", "5 + 7", or just "5 7". - Extract the two numbers and use the calculator tool to add them. - Always respond with a clear, friendly message showing the calculation and result. 
- """.trimIndent() -) -``` - - -For more advanced configuration, you can specify which LLM the agent will use and set the maximum number of iterations the agent can perform to respond: - -```kotlin -val agentConfig = AIAgentConfig( - prompt = Prompt.build("simple-calculator") { - system( - """ - You are a simple calculator assistant. - You can add two numbers together using the calculator tool. - When the user provides input, extract the numbers they want to add. - The input might be in various formats like "add 5 and 7", "5 + 7", or just "5 7". - Extract the two numbers and use the calculator tool to add them. - Always respond with a clear, friendly message showing the calculation and result. - """.trimIndent() - ) - }, - model = OpenAIModels.Chat.GPT4o, - maxAgentIterations = 10 -) -``` - - -### 5. Implement tools and set up a tool registry - -Tools let your agent perform specific tasks. -To make a tool available for the agent, add it to a tool registry. -For example: - -```kotlin -// Implement a simple calculator tool that can add two numbers -@LLMDescription("Tools for performing basic arithmetic operations") -class CalculatorTools : ToolSet { - @Tool - @LLMDescription("Add two numbers together and return their sum") - fun add( - @LLMDescription("First number to add (integer value)") - num1: Int, - - @LLMDescription("Second number to add (integer value)") - num2: Int - ): String { - val sum = num1 + num2 - return "The sum of $num1 and $num2 is: $sum" - } -} - -// Add the tool to the tool registry -val toolRegistry = ToolRegistry { - tools(CalculatorTools()) -} -``` - - -To learn more about tools, see [Tools](tools-overview.md). - -### 6. Install features - -Features let you add new capabilities to the agent, modify its behavior, provide access to external systems and resources, -and log and monitor events while the agent is running. 
-The following features are available: - -- EventHandler -- AgentMemory -- Tracing - -To install the feature, call the `install` function and provide the feature as an argument. -For example, to install the event handler feature, you need to do the following: - - -```kotlin -// install the EventHandler feature -installFeatures = { - install(EventHandler) { - onAgentStarting { eventContext: AgentStartingContext -> - println("Starting agent: ${eventContext.agent.id}") - } - onAgentCompleted { eventContext: AgentCompletedContext -> - println("Result: ${eventContext.result}") - } - } -} -``` - - -To learn more about feature configuration, see the dedicated page. - -### 7. Run the agent - -Create the agent with the configuration option created in the previous stages and run it with the provided input: - -```kotlin -val agent = AIAgent( - promptExecutor = promptExecutor, - toolRegistry = toolRegistry, - strategy = agentStrategy, - agentConfig = agentConfig, - installFeatures = { - install(EventHandler) { - onAgentStarting { eventContext: AgentStartingContext -> - println("Starting agent: ${eventContext.agent.id}") - } - onAgentCompleted { eventContext: AgentCompletedContext -> - println("Result: ${eventContext.result}") - } - } - } -) - -fun main() { - runBlocking { - println("Enter two numbers to add (e.g., 'add 5 and 7' or '5 + 7'):") - - // Read the user input and send it to the agent - val userInput = readlnOrNull() ?: "" - val agentResult = agent.run(userInput) - println("The agent returned: $agentResult") - } -} -``` - - -## Working with structured data - -The `AIAgent` can process structured data from LLM outputs. For more details, see [Structured data processing](structured-output.md). - -## Using parallel tool calls - -The `AIAgent` supports parallel tool calls. This feature lets you process multiple tools concurrently, improving performance for independent operations. - -For more details, see [Parallel tool calls](tools-overview.md#parallel-tool-calls). 
- -## Full code sample - -Here is the complete implementation of the agent: - -```kotlin -// Use the OpenAI executor with an API key from an environment variable -val promptExecutor = simpleOpenAIExecutor(System.getenv("OPENAI_API_KEY")) - -// Create a simple strategy -val agentStrategy = strategy("Simple calculator") { - // Define nodes for the strategy - val nodeSendInput by nodeLLMRequest() - val nodeExecuteTool by nodeExecuteTool() - val nodeSendToolResult by nodeLLMSendToolResult() - - // Define edges between nodes - // Start -> Send input - edge(nodeStart forwardTo nodeSendInput) - - // Send input -> Finish - edge( - (nodeSendInput forwardTo nodeFinish) - transformed { it } - onAssistantMessage { true } - ) - - // Send input -> Execute tool - edge( - (nodeSendInput forwardTo nodeExecuteTool) - onToolCall { true } - ) - - // Execute tool -> Send the tool result - edge(nodeExecuteTool forwardTo nodeSendToolResult) - - // Send the tool result -> finish - edge( - (nodeSendToolResult forwardTo nodeFinish) - transformed { it } - onAssistantMessage { true } - ) -} - -// Configure the agent -val agentConfig = AIAgentConfig( - prompt = Prompt.build("simple-calculator") { - system( - """ - You are a simple calculator assistant. - You can add two numbers together using the calculator tool. - When the user provides input, extract the numbers they want to add. - The input might be in various formats like "add 5 and 7", "5 + 7", or just "5 7". - Extract the two numbers and use the calculator tool to add them. - Always respond with a clear, friendly message showing the calculation and result. 
- """.trimIndent() - ) - }, - model = OpenAIModels.Chat.GPT4o, - maxAgentIterations = 10 -) - -// Implement a simple calculator tool that can add two numbers -@LLMDescription("Tools for performing basic arithmetic operations") -class CalculatorTools : ToolSet { - @Tool - @LLMDescription("Add two numbers together and return their sum") - fun add( - @LLMDescription("First number to add (integer value)") - num1: Int, - - @LLMDescription("Second number to add (integer value)") - num2: Int - ): String { - val sum = num1 + num2 - return "The sum of $num1 and $num2 is: $sum" - } -} - -// Add the tool to the tool registry -val toolRegistry = ToolRegistry { - tools(CalculatorTools()) -} - -// Create the agent -val agent = AIAgent( - promptExecutor = promptExecutor, - toolRegistry = toolRegistry, - strategy = agentStrategy, - agentConfig = agentConfig, - installFeatures = { - install(EventHandler) { - onAgentStarting { eventContext: AgentStartingContext -> - println("Starting agent: ${eventContext.agent.id}") - } - onAgentCompleted { eventContext: AgentCompletedContext -> - println("Result: ${eventContext.result}") - } - } - } -) - -fun main() { - runBlocking { - println("Enter two numbers to add (e.g., 'add 5 and 7' or '5 + 7'):") - - // Read the user input and send it to the agent - val userInput = readlnOrNull() ?: "" - val agentResult = agent.run(userInput) - println("The agent returned: $agentResult") - } -} -``` - diff --git a/docs/docs/functional-agents.md b/docs/docs/functional-agents.md deleted file mode 100644 index 65a283460d..0000000000 --- a/docs/docs/functional-agents.md +++ /dev/null @@ -1,270 +0,0 @@ -# Functional agents - -Functional agents are lightweight AI agents that operate without building complex strategy graphs. -Instead, the agent logic is implemented as a lambda function that handles user input, interacts with an LLM, -optionally calls tools, and produces a final output. 
It can perform a single LLM call, process multiple LLM calls in sequence, or loop based on user input, as well as LLM and tool outputs. - -!!! tip - - If you already have a [basic agent](basic-agents.md) as your first MVP, but run into task-specific limitations, use a functional agent to prototype custom logic. You can implement custom control flows in plain Kotlin while still using most Koog features, including history compression and automatic state management. - - For production-grade needs, refactor your functional agent into a [complex workflow agent](complex-workflow-agents.md) with strategy graphs. This provides persistence with controllable rollbacks for fault-tolerance and advanced OpenTelemetry tracing with nested graph events. - -This page guides you through the steps necessary to create a minimal functional agent and extend it with tools. - -## Prerequisites - -Before you start, make sure that you have the following: - -- A working Kotlin/JVM project. -- Java 17+ installed. -- A valid API key from the LLM provider used to implement an AI agent. For a list of all available providers, refer to [LLM providers](llm-providers.md). -- (Optional) Ollama installed and running locally if you use this provider. - -!!! tip - Use environment variables or a secure configuration management system to store your API keys. - Avoid hardcoding API keys directly in your source code. - -## Add dependencies - -The `AIAgent` class is the main class for creating agents in Koog. -Include the following dependency in your build configuration to use the class functionality: - -``` -dependencies { - implementation("ai.koog:koog-agents:VERSION") -} -``` -For all available installation methods, see [Install Koog](getting-started.md#install-koog). - -## Create a minimal functional agent - -To create a minimal functional agent, do the following: - -1. Choose the input and output types that the agent handles and create a corresponding `AIAgent` instance. 
- In this guide, we use `AIAgent`, which means the agent receives and returns `String`. -2. Provide the required parameters, including a system prompt, prompt executor, and LLM. -3. Define the agent logic with a lambda function wrapped into the `functionalStrategy {...}` DSL method. - -Here is an example of a minimal functional agent that sends user text to a specified LLM and returns a single assistant message. - - - -```kotlin -// Create an AIAgent instance and provide a system prompt, prompt executor, and LLM -val mathAgent = AIAgent( - systemPrompt = "You are a precise math assistant.", - promptExecutor = simpleOllamaAIExecutor(), - llmModel = OllamaModels.Meta.LLAMA_3_2, - strategy = functionalStrategy { input -> // Define the agent logic - // Make one LLM call - val response = requestLLM(input) - // Extract and return the assistant message content from the response - response.asAssistantMessage().content - } -) - -// Run the agent with a user input and print the result -val result = mathAgent.run("What is 12 × 9?") -println(result) -``` - - -The agent can produce the following output: - -``` -The answer to 12 × 9 is 108. -``` - -This agent makes a single LLM call and returns the assistant message content. -You can extend the agent logic to handle multiple sequential LLM calls. 
For example: - - - -```kotlin -// Create an AIAgent instance and provide a system prompt, prompt executor, and LLM -val mathAgent = AIAgent( - systemPrompt = "You are a precise math assistant.", - promptExecutor = simpleOllamaAIExecutor(), - llmModel = OllamaModels.Meta.LLAMA_3_2, - strategy = functionalStrategy { input -> // Define the agent logic - // The first LLM call to produce an initial draft based on the user input - val draft = requestLLM("Draft: $input").asAssistantMessage().content - // The second LLM call to improve the draft by prompting the LLM again with the draft content - val improved = requestLLM("Improve and clarify.").asAssistantMessage().content - // The final LLM call to format the improved text and return the final formatted result - requestLLM("Format the result as bold.").asAssistantMessage().content - } -) - -// Run the agent with a user input and print the result -val result = mathAgent.run("What is 12 × 9?") -println(result) -``` - - -The agent can produce the following output: - -``` -When multiplying 12 by 9, we can break it down as follows: - -**12 (tens) × 9 = 108** - -Alternatively, we can also use the distributive property to calculate this: - -**(10 + 2) × 9** -= **10 × 9 + 2 × 9** -= **90 + 18** -= **108** -``` - -## Add tools - -In many cases, a functional agent needs to complete specific tasks, such as reading and writing data or calling APIs. -In Koog, you expose such capabilities as tools and let the LLM call them in the agent logic. - -This chapter takes the minimal functional agent created above and demonstrates how to extend the agent logic using tools. - - -1) Create an annotation-based tool. For more details, see [Annotation-based tools](annotation-based-tools.md). 
- - -```kotlin -@LLMDescription("Simple multiplier") -class MathTools : ToolSet { - @Tool - @LLMDescription("Multiplies two numbers and returns the result") - fun multiply(a: Int, b: Int): Int { - val result = a * b - return result - } -} -``` - - -To learn more about available tools, refer to the [Tool overview](tools-overview.md). - -2) Register the tool to make it available to the agent. - - - -```kotlin -val toolRegistry = ToolRegistry { - tools(MathTools()) -} -``` - - -3) Pass the tool registry to the agent to enable the LLM to request and use the available tools. - -4) Extend the agent logic to identify tool calls, execute the requested tools, send their results back to the LLM, and repeat the process until no tool calls remain. - -!!! note - Use a loop only if the LLM continues to issue tool calls. - - - -```kotlin -val mathWithTools = AIAgent( - systemPrompt = "You are a precise math assistant. When multiplication is needed, use the multiplication tool.", - promptExecutor = simpleOllamaAIExecutor(), - llmModel = OllamaModels.Meta.LLAMA_3_2, - toolRegistry = toolRegistry, - strategy = functionalStrategy { input -> // Define the agent logic extended with tool calls - // Send the user input to the LLM - var responses = requestLLMMultiple(input) - - // Only loop while the LLM requests tools - while (responses.containsToolCalls()) { - // Extract tool calls from the response - val pendingCalls = extractToolCalls(responses) - // Execute the tools and return the results - val results = executeMultipleTools(pendingCalls) - // Send the tool results back to the LLM. 
The LLM may call more tools or return a final output - responses = sendMultipleToolResults(results) - } - - // When no tool calls remain, extract and return the assistant message content from the response - responses.single().asAssistantMessage().content - } -) - -// Run the agent with a user input and print the result -val reply = mathWithTools.run("Please multiply 12.5 and 4, then add 10 to the result.") -println(reply) -``` - - -The agent can produce the following output: - -``` -Here is the step-by-step solution: - -1. Multiply 12.5 and 4: - 12.5 × 4 = 50 - -2. Add 10 to the result: - 50 + 10 = 60 -``` - -## What's next - -- Learn how to return structured data using the [Structured output API](structured-output.md). -- Experiment with adding more [tools](tools-overview.md) to the agent. -- Improve observability with the [EventHandler](agent-events.md) feature. -- Learn how to handle long-running conversations with [History compression](history-compression.md). diff --git a/docs/docs/getting-started/basic-agents.md b/docs/docs/getting-started/basic-agents.md new file mode 100644 index 0000000000..f1b0b6f805 --- /dev/null +++ b/docs/docs/getting-started/basic-agents.md @@ -0,0 +1,286 @@ +# Basic agents + +A basic agent processes a single input and provides a response. +It operates within a single cycle of tool-calling to complete its task and provide a response. +This agent can return either a message or a tool result. +It will return a tool result if you provide a tool registry to the agent. + +??? note "Prerequisites" + + --8<-- "getting-started-snippets.md:prerequisites" + + --8<-- "getting-started-snippets.md:dependencies" + + --8<-- "getting-started-snippets.md:api-key" + + Examples on this page assume that you have set the `OPENAI_API_KEY` environment variable. 
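+Since every example reads the key with `System.getenv`, you can optionally fail fast with a clear message when the variable is missing. This is a minimal sketch, not part of the Koog API; the error text is illustrative:
+
+```kotlin
+// Read the API key once and stop early if it is not configured
+val apiKey = System.getenv("OPENAI_API_KEY")
+    ?: error("OPENAI_API_KEY is not set; export it before running the examples.")
+```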
+ +## Create a minimal agent + +The [`AIAgent`](https://api.koog.ai/agents/agents-core/ai.koog.agents.core.agent/-a-i-agent/index.html) interface +is the primary starting point for creating Koog agents. +The overloaded `invoke()` operator functions on its companion object +enable you to instantiate this interface with a constructor-like syntax. + +For example, to create the most basic agent, you can provide only a [prompt executor](../prompts/prompt-executors.md) +and a [model configuration](../model-capabilities.md#creating-a-model-llmodel-configuration): + + +```kotlin +val agent = AIAgent( + promptExecutor = simpleOpenAIExecutor(System.getenv("OPENAI_API_KEY")), + llmModel = OpenAIModels.Chat.GPT4o +) +``` + + +This agent will expect a string as input (user's request) and return a string as output (LLM response). +To run the agent, use the `run()` function with some user input: + + +```kotlin +fun main() = runBlocking { + val result = agent.run("Hello! How can you help me?") + println(result) +} +``` + + +The agent will return a generic answer, such as: + +```text +I can assist with a wide range of topics and tasks. Here are some examples: + +1. **Answering questions**: I can provide information on various subjects, from science and history to entertainment and culture. +2. **Generating text**: I can help with writing tasks, such as suggesting alternative phrases, providing definitions, or even creating entire articles or stories. +3. **Translation**: I can translate text from one language to another, including popular languages such as Spanish, French, German, Chinese, and many more. +4. **Conversation**: I can engage in natural-sounding conversations, using context and understanding to respond to questions and statements. +5. **Brainstorming**: I can help generate ideas for creative projects, such as writing stories, composing music, or coming up with business ideas. +6. 
**Learning**: I can help with language learning, explaining grammar rules, vocabulary, and pronunciation. +7. **Calculations**: I can perform mathematical calculations, including basic arithmetic, algebra, and more advanced math concepts. + +What's on your mind? Do you have a specific question, topic, or task you'd like to tackle? +``` + +## Add a system prompt + +Provide a [system message](../prompts/prompt-creation/index.md#system-message) to define the agent's role and behavior, +as well as the purpose, context, and instructions related to the task. + + +```kotlin +val agent = AIAgent( + promptExecutor = simpleOpenAIExecutor(System.getenv("YOUR_API_KEY")), + systemPrompt = "You are an expert in internet memes. Be helpful, friendly, and answer user questions concisely, showing your knowledge of memes.", + llmModel = OpenAIModels.Chat.GPT4o +) +``` + + +The instructions in the system prompt will guide the agent's response: + +```text +I'm here to help you navigate the wild world of internet memes! + +What's on your mind? Are you trying to understand a specific meme, need help finding a popular joke, or perhaps want some recommendations for trending memes? Let me know, and I'll do my best to provide you with some LOLs! +``` + +## Configure LLM output + +You can provide some [LLM parameters](../llm-parameters.md#llm-parameter-reference) directly to the agent constructor +to customize the behavior of the LLM. +For example, use the `temperature` parameter to adjust the randomness of the generated responses: + + +```kotlin +val agent = AIAgent( + promptExecutor = simpleOpenAIExecutor(System.getenv("YOUR_API_KEY")), + systemPrompt = "You are an expert in internet memes. 
Be helpful, friendly, and answer user questions concisely, showing your knowledge of memes.", + llmModel = OpenAIModels.Chat.GPT4o, + temperature = 0.7 +) +``` + + +Here are some response examples with different temperature values: + +=== "0.4" + + ```text + I'm here to help you navigate the wild world of internet memes! Whether you're looking for explanations, examples, or just want to share a meme with someone, I'm your go-to expert. What's on your mind? Got a specific meme in mind that's got you curious? Or maybe you need some meme-related advice? Fire away! + ``` + +=== "0.7" + + ```text + I'm here to help you navigate the wild world of internet memes! + + What's on your mind? Need help understanding a specific meme, finding a popular joke or trend, or maybe even creating your own meme? Let's get this meme party started! + ``` + +=== "1.0" + + ```text + I'd be happy to help you navigate the wild world of internet memes! + + Whether you're looking for explanations of classic memes, suggestions for new ones to try out, or just want to discuss your favorite meme culture trends, I'm here to assist. What's on your mind? + + Do you have a specific question about memes (e.g., "What does this meme mean?"), or are you looking for some meme-related recommendations (e.g., "Can you recommend a funny meme to share with friends?"). Let me know how I can help! + ``` + +## Add tools + +Agents can use [tools](../tools-overview.md) to complete specific tasks. +Koog provides some built-in tools, or you can implement your own custom tools. + +To configure tools, use the `toolRegistry` parameter that defines the tools available to the agent: + + +```kotlin +val agent = AIAgent( + promptExecutor = simpleOpenAIExecutor(System.getenv("YOUR_API_KEY")), + systemPrompt = "You are an expert in internet memes. 
Be helpful, friendly, and answer user questions concisely, showing your knowledge of memes.", + llmModel = OpenAIModels.Chat.GPT4o, + temperature = 0.7, + toolRegistry = ToolRegistry { + tool(AskUser) + } +) +``` + + +In the example, [`AskUser`](https://api.koog.ai/agents/agents-ext/ai.koog.agents.ext.tool/-ask-user/index.html) +is a built-in tool that helps the agent maintain a conversation with the user. +If the agent decides to ask the user a question, +it can call this tool that writes to `stdout` via `println()` and reads from `stdin` via `readln()`. + +Here is an example interaction with the agent: + +```text +Agent: Which meme would you like me to explain? Please choose from: Grumpy Cat, Success Kid, or Doge. + +User: Explain Doge + +Agent: + +**Doge** + +Doge is a popular internet meme that originated on the imageboard website 4chan in 2013. The meme features a picture of a Shiba Inu dog with a caption written in a playful, exaggerated style. + +The typical format of a Doge meme includes: + +* A photo of a Shiba Inu dog +* A caption written in all capital letters, using a intentionally simplistic and childlike tone +* Exaggerated or made-up words or phrases, often used to convey a humorous or nonsensical idea + +Examples of Doge memes might include: + +* "Such wow. Such happy." +* "I had fun today!" +* "Wow, I am good at napping." + +The meme is known for its lighthearted and playful tone, and is often used to express excitement, happiness, or silliness. The meme has since become a cultural phenomenon, with countless variations and parodies emerging online. +``` + +## Adjust agent iterations + +To avoid infinite loops, Koog allows any agent to take a limited number of steps (50 by default). +Use the `maxIterations` parameter to either increase this limit if you expect the agent to require more steps +(tool calls and LLM requests) or decrease it for agents that require only a few steps. 
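Conceptually, `maxIterations` acts as a bounded loop around the agent's steps. Here is a plain-Kotlin sketch of the idea (illustrative only, not Koog's actual implementation; `runWithIterationLimit` is a hypothetical helper):

```kotlin
// Illustrative sketch, not Koog internals. Each iteration stands for one
// agent step (an LLM request or a tool call); a non-null result from the
// step means the agent has finished.
fun <T : Any> runWithIterationLimit(maxIterations: Int, step: (Int) -> T?): T {
    repeat(maxIterations) { iteration ->
        step(iteration)?.let { return it }
    }
    error("Agent did not finish within $maxIterations iterations")
}
```

With this guard, an agent that finishes on its third step completes normally, while one that never produces a result fails fast instead of looping forever.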
+For example, a simple agent described here is not likely to require more than 10 steps: + + +```kotlin +val agent = AIAgent( + promptExecutor = simpleOpenAIExecutor(System.getenv("YOUR_API_KEY")), + systemPrompt = "You are an expert in internet memes. Be helpful, friendly, and answer user questions concisely, showing your knowledge of memes.", + llmModel = OpenAIModels.Chat.GPT4o, + temperature = 0.7, + toolRegistry = ToolRegistry { + tool(AskUser) + }, + maxIterations = 10 +) +``` + + +## Handle events during agent runtime + +To assist with testing and debugging, as well as making hooks for chained agent interactions, +Koog provides the [EventHandler](https://api.koog.ai/agents/agents-features/agents-features-event-handler/ai.koog.agents.features.eventHandler.feature/-event-handler/index.html) feature. +Call the `handleEvents()` function inside the agent constructor lambda to install the feature and register event handlers: + + +```kotlin +val agent = AIAgent( + promptExecutor = simpleOpenAIExecutor(System.getenv("YOUR_API_KEY")), + systemPrompt = "You are an expert in internet memes. Be helpful, friendly, and answer user questions concisely, showing your knowledge of memes.", + llmModel = OpenAIModels.Chat.GPT4o, + temperature = 0.7, + toolRegistry = ToolRegistry { + tool(AskUser) + }, + maxIterations = 10 +){ + handleEvents { + // Handle tool calls + onToolCallStarting { eventContext -> + println("Tool called: ${eventContext.toolName} with args ${eventContext.toolArgs}") + } + } +} +``` + + +The agent will now output something similar to the following when it calls the `AskUser` tool: + +```text +Tool called: __ask_user__ with args {"message":"Which meme would you like me to explain?"} +``` + +For more information about Koog agent features, see [Features overview](../features-overview.md). 
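The register-then-notify pattern that `handleEvents` uses can be pictured in plain Kotlin. The following is an illustrative sketch only; `ToolCallEvents` is a made-up class, not a Koog type:

```kotlin
// Illustrative sketch; ToolCallEvents is hypothetical and not part of Koog.
// It mirrors how handlers are registered and later invoked by the runtime.
class ToolCallEvents {
    private val handlers = mutableListOf<(toolName: String, toolArgs: String) -> Unit>()

    // Register a handler, similar to onToolCallStarting in the example above
    fun onToolCallStarting(handler: (String, String) -> Unit) {
        handlers += handler
    }

    // Called by the runtime when a tool call begins
    fun fire(toolName: String, toolArgs: String) {
        handlers.forEach { it(toolName, toolArgs) }
    }
}

fun main() {
    val events = ToolCallEvents()
    events.onToolCallStarting { name, args -> println("Tool called: $name with args $args") }
    events.fire("__ask_user__", """{"message":"Which meme would you like me to explain?"}""")
}
```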
+ +## Next steps + +- Learn more about building [functional agents](functional-agents.md) and [standard agents](standard-agents.md) diff --git a/docs/docs/getting-started/functional-agents.md b/docs/docs/getting-started/functional-agents.md new file mode 100644 index 0000000000..220959eadc --- /dev/null +++ b/docs/docs/getting-started/functional-agents.md @@ -0,0 +1,174 @@ +# Functional agents + +Functional agents are lightweight AI agents that operate without building complex strategy graphs. +Instead, you implement the agent logic as a lambda function that handles user input, +interacts with an LLM, calls tools if necessary, and produces the final output. + +??? note "Prerequisites" + + --8<-- "getting-started-snippets.md:prerequisites" + + --8<-- "getting-started-snippets.md:dependencies" + + --8<-- "getting-started-snippets.md:api-key" + + Examples on this page assume that you are running Llama 3.2 locally via Ollama. + +This page describes how to implement a functional strategy to prototype some custom logic for your agent. +For production needs, refactor your functional agent into a [standard agent](standard-agents.md) +by implementing a proper strategy graph. + +## Create a minimal functional agent + +To create a minimal functional agent, +use the same [`AIAgent`](https://api.koog.ai/agents/agents-core/ai.koog.agents.core.agent/-a-i-agent/index.html) interface +as for a [basic agent](basic-agents.md) +and pass an instance of [`AIAgentFunctionalStrategy`](https://api.koog.ai/agents/agents-core/ai.koog.agents.core.agent/-a-i-agent-functional-strategy/index.html) to it. +The most convenient way is to use the `functionalStrategy {...}` DSL method. + +For example, here is how to define a functional strategy that expects a string input and returns a string output, +makes one LLM call, then returns the content of the assistant message from the response. 
+ + +```kotlin +val strategy = functionalStrategy { input -> + val response = requestLLM(input) + response.asAssistantMessage().content +} + +val mathAgent = AIAgent( + promptExecutor = simpleOllamaAIExecutor(), + llmModel = OllamaModels.Meta.LLAMA_3_2, + strategy = strategy +) + +fun main() = runBlocking { + val result = mathAgent.run("What is 12 × 9?") + println(result) +} +``` + + +The agent can produce the following output: + +```text +The answer to 12 × 9 is 108. +``` + +## Make sequential LLM calls + +You can extend the previous strategy to make multiple sequential LLM calls: + + +```kotlin +val strategy = functionalStrategy { input -> + // The first LLM call produces an initial draft based on the user input + val draft = requestLLM("Draft: $input").asAssistantMessage().content + // The second LLM call improves the initial draft + val improved = requestLLM("Improve and clarify.").asAssistantMessage().content + // The final LLM call formats the improved text and returns the result + requestLLM("Format the result as bold.").asAssistantMessage().content +} +``` + + +The agent can produce the following output: + +```text +To calculate the product of 12 and 9, we multiply these two numbers together. + +12 × 9 = **108** +``` + +## Add tools + +In many cases, a functional agent needs to complete specific tasks, +such as reading and writing data, calling APIs, or performing other deterministic operations. +In Koog, you expose such capabilities as [tools](../tools-overview.md) and let the LLM decide when to call them. + +Here is what you need to do: + +1. Create an [annotation-based tool](../annotation-based-tools.md). +2. Add it to a tool registry and pass the registry to the agent. +3. Make sure the agent strategy can identify tool calls in LLM responses, execute the requested tools, + send their results back to the LLM, and repeat the process until there are no tool calls remaining. 
+ + +```kotlin +@LLMDescription("Tools for performing math operations") +class MathTools : ToolSet { + @Tool + @LLMDescription("Multiplies two numbers and returns the result") + fun multiply(a: Int, b: Int): Int { + // This is not necessary, but it helps to see the tool call in the console output + println("Multiplying $a and $b...") + return a * b + } +} + +val toolRegistry = ToolRegistry { + tool(MathTools()::multiply) +} + +val strategy = functionalStrategy { input -> + // Send the user input to the LLM + var responses = requestLLMMultiple(input) + + // Only loop while the LLM requests tools + while (responses.containsToolCalls()) { + // Extract tool calls from the response + val pendingCalls = extractToolCalls(responses) + // Execute the tools and return the results + val results = executeMultipleTools(pendingCalls) + // Send the tool results back to the LLM. The LLM may call more tools or return a final output + responses = sendMultipleToolResults(results) + } + + // When no tool calls remain, extract and return the assistant message content from the response + responses.single().asAssistantMessage().content +} + +val mathAgentWithTools = AIAgent( + promptExecutor = simpleOllamaAIExecutor(), + llmModel = OllamaModels.Meta.LLAMA_3_2, + toolRegistry = toolRegistry, + strategy = strategy +) + +fun main() = runBlocking { + val result = mathAgentWithTools.run("Multiply 3 by 4, then multiply the result by 5.") + println(result) +} +``` + + +The agent can produce the following output: + +```text +Multiplying 3 and 4... +Multiplying 12 and 5... +The result of multiplying 3 by 4 is 12. Multiplying 12 by 5 gives us a final answer of 60. 
+``` + +## Next steps + +- Learn how to [implement a proper strategy graph](standard-agents.md) diff --git a/docs/docs/getting-started.md b/docs/docs/getting-started/index.md similarity index 61% rename from docs/docs/getting-started.md rename to docs/docs/getting-started/index.md index 8a8c02999f..cc30ccd3ab 100644 --- a/docs/docs/getting-started.md +++ b/docs/docs/getting-started/index.md @@ -1,210 +1,144 @@ # Getting started -This guide will help you install Koog and create your first AI agent. +This guide will help you start using Koog in your project. ## Prerequisites -Before you start, make sure you have the following: - -- A working Kotlin/JVM project with Gradle or Maven. -- Java 17+ installed. -- A valid API key for your preferred [LLM provider](llm-providers.md) (not required for Ollama, which runs locally). +--8<-- "getting-started-snippets.md:prerequisites" ## Install Koog -To use Koog, you need to include all necessary dependencies in your build configuration. - -!!! note - Replace `LATEST_VERSION` with the latest version of Koog published on Maven Central. +--8<-- "getting-started-snippets.md:dependencies" -=== "Gradle (Kotlin DSL)" +## Set up an API key - 1. Add the dependency to the `build.gradle.kts` file. - - ```kotlin - dependencies { - implementation("ai.koog:koog-agents:LATEST_VERSION") - } - ``` - 2. Make sure that you have `mavenCentral()` in the list of repositories. - - ```kotlin - repositories { - mavenCentral() - } - ``` - -=== "Gradle (Groovy)" - - 1. Add the dependency to the `build.gradle` file. - - ```groovy - dependencies { - implementation 'ai.koog:koog-agents:LATEST_VERSION' - } - ``` - 2. Make sure that you have `mavenCentral()` in the list of repositories. - ```groovy - repositories { - mavenCentral() - } - ``` +Koog requires either an API key from a [supported LLM provider](../llm-providers.md) or a locally running LLM. -=== "Maven" - - 1. Add the dependency to the `pom.xml` file. 
- - ```xml - - ai.koog - koog-agents-jvm - LATEST_VERSION - - ``` - 2. Make sure that you have `mavenCentral()` in the list of repositories. - - ```xml - - - mavenCentral - https://repo1.maven.org/maven2/ - - - ``` - -!!! note - When integrating Koog with [Ktor servers](ktor-plugin.md), [Spring applications](spring-boot.md), or [MCP tools](model-context-protocol.md), - you need to include the additional dependencies in your build configuration. - For the exact dependencies, refer to the relevant pages in the Koog documentation. - - -## Set an API key - -!!! tip - Use environment variables or a secure configuration management system to store your API keys. - Avoid hardcoding API keys directly in your source code. +!!! warning + Avoid hardcoding API keys in the source code. + Use environment variables to store API keys. === "OpenAI" - Get your [API key](https://platform.openai.com/api-keys) and assign it as an environment variable. + Get your [OpenAI API key](https://platform.openai.com/api-keys) and assign it to the `OPENAI_API_KEY` environment variable. === "Linux/macOS" - ```bash + ```shell export OPENAI_API_KEY=your-api-key ``` === "Windows" - ```shell + ```cmd setx OPENAI_API_KEY "your-api-key" ``` - - Restart your terminal to apply the changes. You can now retrieve and use the API key to create an agent. === "Anthropic" - Get your [API key](https://console.anthropic.com/settings/keys) and assign it as an environment variable. + Get your [Anthropic API key](https://console.anthropic.com/settings/keys) and assign it to the `ANTHROPIC_API_KEY` environment variable. === "Linux/macOS" - ```bash + ```shell export ANTHROPIC_API_KEY=your-api-key ``` === "Windows" - ```shell + ```cmd setx ANTHROPIC_API_KEY "your-api-key" ``` - - Restart your terminal to apply the changes. You can now retrieve and use the API key to create an agent. === "Google" - Get your [API key](https://aistudio.google.com/app/api-keys) and assign it as an environment variable. 
+ Get your [Gemini API key](https://aistudio.google.com/app/api-keys) and assign it to the `GOOGLE_API_KEY` environment variable. === "Linux/macOS" - ```bash + ```shell export GOOGLE_API_KEY=your-api-key ``` === "Windows" - ```shell + ```cmd setx GOOGLE_API_KEY "your-api-key" - ``` - - Restart your terminal to apply the changes. You can now retrieve and use the API key to create an agent. + ``` === "DeepSeek" - - Get your [API key](https://platform.deepseek.com/api_keys) and assign it as an environment variable. + + Get your [DeepSeek API key](https://platform.deepseek.com/api_keys) and assign it to the `DEEPSEEK_API_KEY` environment variable. === "Linux/macOS" - ```bash + ```shell export DEEPSEEK_API_KEY=your-api-key ``` === "Windows" - ```shell + ```cmd setx DEEPSEEK_API_KEY "your-api-key" - ``` - - Restart your terminal to apply the changes. You can now retrieve and use the API key to create an agent. + ``` === "OpenRouter" - Get your [API key](https://openrouter.ai/keys) and assign it as an environment variable. + Get your [OpenRouter API key](https://openrouter.ai/keys) and assign it to the `OPENROUTER_API_KEY` environment variable. === "Linux/macOS" - ```bash + ```shell export OPENROUTER_API_KEY=your-api-key ``` === "Windows" - ```shell + ```cmd setx OPENROUTER_API_KEY "your-api-key" - ``` - - Restart your terminal to apply the changes. You can now retrieve and use the API key to create an agent. + ``` === "Bedrock" - Get valid [AWS credentials](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_bedrock.html) (an access key and a secret key) and assign them as environment variables. + [Generate an Amazon Bedrock API key](https://docs.aws.amazon.com/bedrock/latest/userguide/api-keys.html) and assign it to the `BEDROCK_API_KEY` environment variable. 
=== "Linux/macOS" - ```bash - export AWS_BEDROCK_ACCESS_KEY=your-access-key - export AWS_BEDROCK_SECRET_ACCESS_KEY=your-secret-access-key + ```shell + export BEDROCK_API_KEY=your-api-key ``` === "Windows" + ```cmd + setx BEDROCK_API_KEY "your-api-key" + ``` + +=== "Mistral" + + Get your [Mistral API key](https://console.mistral.ai/api-keys) and assign it to the `MISTRAL_API_KEY` environment variable. + + === "Linux/macOS" + ```shell - setx AWS_BEDROCK_ACCESS_KEY "your-access-key" - setx AWS_BEDROCK_SECRET_ACCESS_KEY "your-secret-access-key" - ``` + export MISTRAL_API_KEY=your-api-key + ``` - Restart your terminal to apply the changes. You can now retrieve and use the API key to create an agent. + === "Windows" -=== "Ollama" + ```cmd + setx MISTRAL_API_KEY "your-api-key" + ``` - Install Ollama and run a model locally without an API key. +=== "Ollama" - For more information, see [Ollama documentation](https://docs.ollama.com/quickstart). + Run a local LLM in Ollama as described in the [Ollama documentation](https://docs.ollama.com/quickstart). -## Create and run an agent +## Create your first Koog agent === "OpenAI" - The example below creates and runs a simple AI agent using the [`GPT-4o`](https://platform.openai.com/docs/models/gpt-4o) model. + The following example creates and runs a simple Koog agent using the [`GPT-4o`](https://platform.openai.com/docs/models/gpt-4o) model via the OpenAI API. ```kotlin fun main() = runBlocking { - // Get an API key from the OPENAI_API_KEY environment variable + // Get the OpenAI API key from the OPENAI_API_KEY environment variable val apiKey = System.getenv("OPENAI_API_KEY") ?: error("The API key is not set.") @@ -250,7 +184,7 @@ To use Koog, you need to include all necessary dependencies in your build config === "Anthropic" - The example below creates and runs a simple AI agent using the [`Claude Opus 4.1`](https://www.anthropic.com/news/claude-opus-4-1) model. 
+ The following example creates and runs a simple Koog agent using the [`Claude Opus 4.1`](https://www.anthropic.com/news/claude-opus-4-1) model via the Anthropic API. ```kotlin fun main() = runBlocking { - // Get an API key from the ANTHROPIC_API_KEY environment variable + // Get the Anthropic API key from the ANTHROPIC_API_KEY environment variable val apiKey = System.getenv("ANTHROPIC_API_KEY") ?: error("The API key is not set.") @@ -294,7 +228,7 @@ To use Koog, you need to include all necessary dependencies in your build config === "Google" - The example below creates and runs a simple AI agent using the [`Gemini 2.5 Pro`](https://cloud.google.com/vertex-ai/generative-ai/docs/models/gemini/2-5-pro) model. + The following example creates and runs a simple Koog agent using the [`Gemini 2.5 Pro`](https://cloud.google.com/vertex-ai/generative-ai/docs/models/gemini/2-5-pro) model via the Gemini API. ```kotlin fun main() = runBlocking { - // Get an API key from the GOOGLE_API_KEY environment variable + // Get the Gemini API key from the GOOGLE_API_KEY environment variable val apiKey = System.getenv("GOOGLE_API_KEY") ?: error("The API key is not set.") @@ -338,7 +272,7 @@ To use Koog, you need to include all necessary dependencies in your build config === "DeepSeek" - The example below creates and runs a simple AI agent using the `deepseek-chat` model. + The following example creates and runs a simple Koog agent using the `deepseek-chat` model via the DeepSeek API. ```kotlin fun main() = runBlocking { - // Get an API key from the DEEPSEEK_API_KEY environment variable + // Get the DeepSeek API key from the DEEPSEEK_API_KEY environment variable val apiKey = System.getenv("DEEPSEEK_API_KEY") ?: error("The API key is not set.") @@ -379,7 +313,7 @@ To use Koog, you need to include all necessary dependencies in your build config === "OpenRouter" - The example below creates and runs a simple AI agent using the [`GPT-4o`](https://openrouter.ai/openai/gpt-4o) model. 
+ The following example creates and runs a simple Koog agent using the [`GPT-4o`](https://openrouter.ai/openai/gpt-4o) model via the OpenRouter API. ```kotlin fun main() = runBlocking { - // Get an API key from the OPENROUTER_API_KEY environment variable + // Get the OpenRouter API key from the OPENROUTER_API_KEY environment variable val apiKey = System.getenv("OPENROUTER_API_KEY") ?: error("The API key is not set.") @@ -414,7 +348,7 @@ To use Koog, you need to include all necessary dependencies in your build config === "Bedrock" - The example below creates and runs a simple AI agent using the [`Claude Sonnet 4.5`](https://www.anthropic.com/news/claude-sonnet-4-5) model. + The following example creates and runs a simple Koog agent using the [`Claude Sonnet 4.5`](https://www.anthropic.com/news/claude-sonnet-4-5) model via the Bedrock API. ```kotlin fun main() = runBlocking { - // Get access keys from the AWS_BEDROCK_ACCESS_KEY and AWS_BEDROCK_SECRET_ACCESS_KEY environment variables - val awsAccessKeyId = System.getenv("AWS_BEDROCK_ACCESS_KEY") - ?: error("The access key is not set.") - - val awsSecretAccessKey = System.getenv("AWS_BEDROCK_SECRET_ACCESS_KEY") - ?: error("The secret access key is not set.") + // Get the Bedrock API key from the BEDROCK_API_KEY environment variable + val apiKey = System.getenv("BEDROCK_API_KEY") + ?: error("The API key is not set.") // Create an agent val agent = AIAgent( - promptExecutor = simpleBedrockExecutor(awsAccessKeyId, awsSecretAccessKey), + promptExecutor = simpleBedrockExecutorWithBearerToken(apiKey), llmModel = BedrockModels.AnthropicClaude4_5Sonnet ) @@ -461,9 +392,55 @@ To use Koog, you need to include all necessary dependencies in your build config What would you like help with today? ``` +=== "Mistral" + + The following example creates and runs a simple Koog agent using the [`Mistral Medium 3.1`](https://docs.mistral.ai/models/mistral-medium-3-1-25-08) model via the Mistral AI API. 
+ + + ```kotlin + fun main() = runBlocking { + // Get the Mistral AI API key from the MISTRAL_API_KEY environment variable + val apiKey = System.getenv("MISTRAL_API_KEY") + ?: error("The API key is not set.") + + // Create an agent + val agent = AIAgent( + promptExecutor = simpleMistralAIExecutor(apiKey), + llmModel = MistralAIModels.Chat.MistralMedium31 + ) + + // Run the agent + val result = agent.run("Hello! How can you help me?") + println(result) + } + ``` + + + The example can produce the following output: + + ``` + I can assist you with a wide range of topics and tasks. Here are some examples: + + 1. **Answering questions**: I can provide information on various subjects, including history, science, technology, literature, and more. + 2. **Providing definitions**: If you're unsure about the meaning of a word or phrase, I can help define it for you. + 3. **Generating text**: Whether it's writing an email, creating content for social media, or composing a story, I can help with text generation. + 4. **Translation**: I can translate text from one language to another. + 5. **Conversation**: We can have a chat about any topic that interests you, and I'll respond accordingly. + 6. **Language practice**: If you're learning a new language, I can help with pronunciation, grammar, and vocabulary practice. + 7. **Brainstorming**: If you're stuck on a problem or need ideas for a project, I can help brainstorm solutions. + 8. **Summarization**: If you have a long piece of text and want a summary, I can condense it for you. + + What's on your mind? Is there something specific you'd like help with? + ``` + === "Ollama" - The example below creates and runs a simple AI agent using the [`llama3.2`](https://ollama.com/library/llama3.2) model. + The following example creates and runs a simple Koog agent using the [`llama3.2`](https://ollama.com/library/llama3.2) model running locally via Ollama. 
llmRequest + llmRequest --Message.Response--> onToolCall{{onToolCall}} + llmRequest --Message.Response--> onAssistantMessage{{onAssistantMessage}} + onAssistantMessage --String--> Output + onToolCall --Message.Tool.Call--> executeTool --ReceivedToolResult--> sendToolResult + sendToolResult --Message.Response--> onToolCall + sendToolResult --Message.Response--> onAssistantMessage +``` + +## Build a strategy graph + +In Koog, you implement a strategy using [AIAgentGraphStrategyBuilder](https://api.koog.ai/agents/agents-core/ai.koog.agents.core.dsl.builder/-a-i-agent-graph-strategy-builder/index.html). +Just like every node has an input and output type, +the strategy as a whole also defines some input and output type. +This example assumes that the input and output types are strings, +which means the agent implementing this strategy will expect a string and return a string. + +To create a strategy, use the `strategy()` function with two generics as the input and output types, +provide a unique identifier for the strategy, and define the nodes and edges. + + +```kotlin +val calculatorAgentStrategy = strategy("Simple calculator") { + val nodeSendInput by nodeLLMRequest() + val nodeExecuteTool by nodeExecuteTool() + val nodeSendToolResult by nodeLLMSendToolResult() + + edge(nodeStart forwardTo nodeSendInput) + edge(nodeSendInput forwardTo nodeFinish onAssistantMessage { true }) + edge(nodeSendInput forwardTo nodeExecuteTool onToolCall { true }) + edge(nodeExecuteTool forwardTo nodeSendToolResult) + edge(nodeSendToolResult forwardTo nodeFinish onAssistantMessage { true }) + edge(nodeSendToolResult forwardTo nodeExecuteTool onToolCall { true }) +} +``` + + +This example uses only [predefined nodes](../nodes-and-components.md), +but you can also create [custom nodes](../custom-nodes.md). + +Every strategy graph must have a path from `nodeStart` to `nodeFinish` connected by [edges](../custom-strategy-graphs.md#edges). 
+Edges can have conditions to determine when to follow a particular edge. +Edges can also transform the output of the previous node before passing it to the next one. +This is necessary to connect nodes that have non-matching output and input types. + +In the previous example, `onToolCall { true }` means that the edge will follow only if the previous node returned a tool call (`Message.Tool.Call`). + +When using `onAssistantMessage { true }`, the edge will follow only if the previous node returned an assistant message (`Message.Assistant`). +This function also extracts the content of the assistant message, effectively transforming `Message.Assistant` to `String`, +because `nodeFinish` expects a string. + +!!! tip + + Instead of `onAssistantMessage {true}`, you can do the following: + + ```kotlin + onIsInstance(Message.Assistant::class) transformed { it.content } + ``` + + Or: + + ```kotlin + onCondition { it is Message.Assistant } transformed { it.asAssistantMessage().content } + ``` + +## Create and run the agent + +Let's create an agent instance with this strategy and run it: + + +```kotlin +val calculatorAgentStrategy = strategy("Simple calculator") { + val nodeSendInput by nodeLLMRequest() + val nodeExecuteTool by nodeExecuteTool() + val nodeSendToolResult by nodeLLMSendToolResult() + + edge(nodeStart forwardTo nodeSendInput) + edge(nodeSendInput forwardTo nodeFinish onAssistantMessage { true }) + edge(nodeSendInput forwardTo nodeExecuteTool onToolCall { true }) + edge(nodeExecuteTool forwardTo nodeSendToolResult) + edge(nodeSendToolResult forwardTo nodeFinish onAssistantMessage { true }) + edge(nodeSendToolResult forwardTo nodeExecuteTool onToolCall { true }) +} + +val mathAgent = AIAgent( + promptExecutor = simpleOllamaAIExecutor(), + llmModel = OllamaModels.Meta.LLAMA_3_2, + strategy = calculatorAgentStrategy +) + +fun main() = runBlocking { + val result = mathAgent.run("Multiply 3 by 4, then multiply the result by 5, then add 10, then add 123.") + 
println(result)
+}
+```
+
+
+When you run this agent, it will respond with something like this:
+
+```text
+To calculate this, I'll follow the order of operations:
+
+1. Multiply 3 by 4: 3 * 4 = 12
+2. Multiply the result by 5: 12 * 5 = 60
+3. Add 10: 60 + 10 = 70
+4. Add 123: 70 + 123 = 193
+
+The final answer is 193.
+```
+
+However, since this agent does not have any tools, it generates the whole answer.
+Since the LLM never returns a tool call, this is what happens:
+
+```mermaid
+---
+config:
+  flowchart:
+    defaultRenderer: "elk"
+---
+graph LR
+    subgraph nodeStart
+        Input
+    end
+
+    subgraph nodeFinish
+        Output
+    end
+
+    subgraph nodeSendInput
+        llmRequest(Request LLM)
+    end
+
+    Input --String--> llmRequest --Message.Response--> onAssistantMessage{{onAssistantMessage}} --String--> Output
+
+```
+
+Even though it is correct in this case, the answer will depend on the arithmetic abilities of the underlying LLM.
+To make sure the calculations are correct, we should provide the agent with math tools.
+Then the LLM will be able to decide to call tools that perform the calculations deterministically.
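The requested calculation itself is fully deterministic. In plain Kotlin it reduces to a few lines, which is exactly the kind of work a tool should own rather than the LLM:

```kotlin
// The calculation from the example request, performed deterministically in code.
// Tools let the LLM delegate steps like these instead of doing arithmetic itself.
fun chainedCalculation(): Int {
    val product = 3 * 4          // 12
    val scaled = product * 5     // 60
    val plusTen = scaled + 10    // 70
    return plusTen + 123         // 193
}
```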
+ +## Add tools + +Define [tools](../tools-overview.md) for performing math operations and add them to a [ToolRegistry](https://api.koog.ai/agents/agents-tools/ai.koog.agents.core.tools/-tool-registry/index.html): + + +```kotlin +@LLMDescription("Tools for performing math operations") +class MathTools : ToolSet { + @Tool + @LLMDescription("Adds two numbers and returns the result") + fun add(a: Int, b: Int): Int { + // This is not necessary, but it helps to see the tool call in the console output + println("Adding $a and $b...") + return a + b + } + @Tool + @LLMDescription("Multiplies two numbers and returns the result") + fun multiply(a: Int, b: Int): Int { + // This is not necessary, but it helps to see the tool call in the console output + println("Multiplying $a and $b...") + return a * b + } +} + +val toolRegistry = ToolRegistry { + tools(MathTools()) +} +``` + + +Add the tool registry to the agent configuration: + + +```kotlin +val mathAgent = AIAgent( + promptExecutor = simpleOllamaAIExecutor(), + llmModel = OllamaModels.Meta.LLAMA_3_2, + strategy = calculatorAgentStrategy, + toolRegistry = toolRegistry +) + +fun main() = runBlocking { + val result = mathAgent.run("Multiply 3 by 4, then multiply the result by 5, then add 10, then add 123.") + println(result) +} +``` + + +When you run the agent now, it will respond with something like this: + +```text +Multiplying 3 and 4... +The output from the first operation was multiplied by 5: +5 * 12 = 60 + +Then, 10 was added to the result: +60 + 10 = 70 + +Finally, 123 was added to the result: +70 + 123 = 193 +``` + +According to this output, the agent correctly performed the calculations, but it only called the `multiply` tool once +instead of calling the corresponding tool for every operation. +We can help the agent by describing its role and providing instructions for using appropriate tools in the system prompt. 
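Because tool implementations are ordinary Kotlin functions, you can also verify them directly, with no LLM involved. A minimal sketch (Koog annotations omitted; `PlainMathTools` is just an illustration):

```kotlin
// Illustration only: a stripped-down copy of MathTools without Koog annotations,
// showing that tool logic can be unit-tested on its own.
class PlainMathTools {
    fun add(a: Int, b: Int): Int = a + b
    fun multiply(a: Int, b: Int): Int = a * b
}

fun main() {
    val tools = PlainMathTools()
    // The exact steps from the example request
    check(tools.multiply(3, 4) == 12)
    check(tools.multiply(12, 5) == 60)
    check(tools.add(60, 10) == 70)
    check(tools.add(70, 123) == 193)
    println("All tool checks passed")
}
```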
+ +## Provide a system prompt + +A [system prompt](../prompts/prompt-creation/index.md#system-message) defines the agent's role and instructions for performing tasks. +In our example, it is important to describe how the agent should process complex multistep calculations: + + +```kotlin +val mathAgent = AIAgent( + promptExecutor = simpleOllamaAIExecutor(), + llmModel = OllamaModels.Meta.LLAMA_3_2, + systemPrompt = """ + You are a simple calculator assistant. + You can add and multiply two numbers using the 'add' and 'multiply' tools. + When the user provides input, extract the numbers and operations they requested. + Use the appropriate tool for the first operation, then the next one, and so on, until you calculate the result. + Always respond with a clear, friendly message showing the calculation and result. + """.trimIndent(), + toolRegistry = toolRegistry, + strategy = calculatorAgentStrategy +) + +fun main() = runBlocking { + val result = mathAgent.run("Multiply 3 by 4, then multiply the result by 5, then add 10, then add 123.") + println(result) +} +``` + + +When you run the agent now, it will respond with something like this: + +```text +Multiplying 3 and 4... +Multiplying 12 and 5... +Adding 60 and 10... +Adding 70 and 123... +The final result is: 193 +``` + +As you can see, the agent now correctly calls the appropriate tool for each operation, +ensuring that it performs the calculations deterministically instead of risking a hallucinated result. + +## Next steps + +- Learn about [planner agents](planner-agents.md) +- Enhance your agent by [installing additional features](../features-overview.md) +- Improve the predictability and reliability with [structured output](../structured-output.md) diff --git a/docs/docs/index.md b/docs/docs/index.md index 47a60005b6..ec2fe83a5a 100644 --- a/docs/docs/index.md +++ b/docs/docs/index.md @@ -7,7 +7,7 @@ You can customize agent capabilities with a modular feature system and deploy yo
-- :material-rocket-launch:{ .lg .middle } [**Getting started**](getting-started.md) +- :material-rocket-launch:{ .lg .middle } [**Getting started**](getting-started/index.md) --- @@ -25,25 +25,25 @@ You can customize agent capabilities with a modular feature system and deploy yo
-- :material-robot-outline:{ .lg .middle } [**Basic agents**](basic-agents.md) +- :material-robot-outline:{ .lg .middle } [**Basic agents**](getting-started/basic-agents.md) --- Create and run agents that process a single input and provide a response -- :material-script-text-outline:{ .lg .middle } [**Functional agents**](functional-agents.md) +- :material-script-text-outline:{ .lg .middle } [**Functional agents**](getting-started/functional-agents.md) --- Create and run lightweight agents with custom logic in plain Kotlin -- :material-graph-outline:{ .lg .middle } [**Complex workflow agents**](complex-workflow-agents.md) +- :material-graph-outline:{ .lg .middle } [**Standard agents**](getting-started/standard-agents.md) --- Create and run agents that handle complex workflows with custom strategies -- :material-state-machine:{ .lg .middle } [**Planner agents**](planner-agents.md) +- :material-state-machine:{ .lg .middle } [**Planner agents**](getting-started/planner-agents.md) --- diff --git a/docs/docs/javascripts/mermaid.mjs b/docs/docs/javascripts/mermaid.mjs new file mode 100644 index 0000000000..ac4c3df6d3 --- /dev/null +++ b/docs/docs/javascripts/mermaid.mjs @@ -0,0 +1,12 @@ +import mermaid from 'https://cdn.jsdelivr.net/npm/mermaid@11/dist/mermaid.esm.min.mjs'; +import elkLayouts from 'https://cdn.jsdelivr.net/npm/@mermaid-js/layout-elk@0/dist/mermaid-layout-elk.esm.min.mjs'; + +mermaid.registerLayoutLoaders(elkLayouts); +mermaid.initialize({ + startOnLoad: false, + securityLevel: "loose", + layout: "elk", +}); + +// Important: necessary to make it visible to Material for MkDocs +window.mermaid = mermaid; diff --git a/docs/docs/llm-providers.md b/docs/docs/llm-providers.md index b988dc003d..d52b454e9b 100644 --- a/docs/docs/llm-providers.md +++ b/docs/docs/llm-providers.md @@ -63,7 +63,7 @@ and multi‑provider options are available. ## Next steps -- [Create and run an agent](getting-started.md) with a specific LLM provider. 
+- [Create and run an agent](getting-started/index.md) with a specific LLM provider. - Learn more about [prompts](prompts/index.md). [^1]: Capability is supported only by some models of the provider. diff --git a/docs/docs/prompts/index.md b/docs/docs/prompts/index.md index 78266593c4..124f48b491 100644 --- a/docs/docs/prompts/index.md +++ b/docs/docs/prompts/index.md @@ -31,7 +31,7 @@ val myPrompt = prompt("hello-koog") { !!! note AI agents can take a simple text prompt as input. They automatically convert the text prompt to the Prompt object and send it to the LLM for execution. - This is useful for a [basic agent](../basic-agents.md) + This is useful for a [basic agent](../getting-started/basic-agents.md) that only needs to run a single request and does not require complex conversation logic. ## Running prompts @@ -107,10 +107,11 @@ The prompt lifecycle in an agent usually includes several stages: ### Initial prompt setup -When you [initialize an agent](../getting-started/#create-and-run-an-agent), you define -a [system message](prompt-creation/index.md#system-message) that sets the agent's behavior. -Then, when you call the agent's `run()` method, you typically provide an initial [user message](prompt-creation/index.md#user-messages) -as input. Together, these messages form the agent's initial prompt. For example: +When you [initialize an agent](../getting-started/index.md#create-your-first-koog-agent), +you can define a [system message](prompt-creation/index.md#system-message) that sets the agent's behavior. +Then, when you call the agent's `run()` method, +you typically provide an initial [user message](prompt-creation/index.md#user-messages) as input. +Together, these messages form the agent's initial prompt. 
For example: |"result to"| A ``` -For more [advanced configurations](../complex-workflow-agents.md#4-configure-the-agent), you can also use -[AIAgentConfig](https://api.koog.ai/agents/agents-core/ai.koog.agents.core.agent.config/-a-i-agent-config/index.html) +For more advanced configurations, you can also use [AIAgentConfig](https://api.koog.ai/agents/agents-core/ai.koog.agents.core.agent.config/-a-i-agent-config/index.html) to define the agent's initial prompt. ### Automatic prompt updates @@ -168,9 +168,9 @@ to define the agent's initial prompt. As the agent runs its strategy, [predefined nodes](../nodes-and-components.md) automatically update the prompt. For example: -- [`nodeLLMRequest`](../nodes-and-components/#nodellmrequest): Appends a user message to the prompt and captures the LLM response. -- [`nodeLLMSendToolResult`](../nodes-and-components/#nodellmsendtoolresult): Appends tool execution results to the conversation. -- [`nodeAppendPrompt`](../nodes-and-components/#nodeappendprompt): Inserts specific messages into the prompt at any point in the workflow. +- [`nodeLLMRequest`](../nodes-and-components.md#nodellmrequest): Appends a user message to the prompt and captures the LLM response. +- [`nodeLLMSendToolResult`](../nodes-and-components.md#nodellmsendtoolresult): Appends tool execution results to the conversation. +- [`nodeAppendPrompt`](../nodes-and-components.md#nodeappendprompt): Inserts specific messages into the prompt at any point in the workflow. 
### Context window management
diff --git a/docs/docs/prompts/prompt-creation/multimodal-content.md b/docs/docs/prompts/prompt-creation/multimodal-content.md
index 3a060afdc7..684c7deb58 100644
--- a/docs/docs/prompts/prompt-creation/multimodal-content.md
+++ b/docs/docs/prompts/prompt-creation/multimodal-content.md
@@ -15,7 +15,7 @@ Each function supports two ways of configuring attachment parameters, so you can
 - Create and pass a `ContentPart` object to the function for custom control over attachment parameters.
 
 !!! note
-    Multimodal content support varies by [LLM provider](../llm-providers.md).
+    Multimodal content support varies by [LLM provider](../../llm-providers.md).
     Check the provider documentation for supported content types.
 
 ### Auto-configured attachments
diff --git a/docs/includes/abbreviations.md b/docs/docs/snippets/abbreviations.md
similarity index 100%
rename from docs/includes/abbreviations.md
rename to docs/docs/snippets/abbreviations.md
diff --git a/docs/docs/snippets/getting-started-snippets.md b/docs/docs/snippets/getting-started-snippets.md
new file mode 100644
index 0000000000..3746409675
--- /dev/null
+++ b/docs/docs/snippets/getting-started-snippets.md
@@ -0,0 +1,48 @@
+---
+search:
+  exclude: true
+---
+
+# --8<-- [start:prerequisites]
+Ensure your environment and project meet the following requirements:
+
+- JDK 17+
+- Kotlin 2.2.0+
+- Gradle 8.0+ or Maven 3.8+
+# --8<-- [end:prerequisites]
+
+# --8<-- [start:dependencies]
+Add the [Koog package](https://central.sonatype.com/artifact/ai.koog/koog-agents/) as a dependency:
+
+=== "Gradle (Kotlin)"
+
+    ``` kotlin title="build.gradle.kts"
+    dependencies {
+        implementation("ai.koog:koog-agents:0.6.0")
+    }
+    ```
+
+=== "Gradle (Groovy)"
+
+    ``` groovy title="build.gradle"
+    dependencies {
+        implementation 'ai.koog:koog-agents:0.6.0'
+    }
+    ```
+
+=== "Maven"
+
+    ```xml title="pom.xml"
+    <dependency>
+        <groupId>ai.koog</groupId>
+        <artifactId>koog-agents-jvm</artifactId>
+        <version>0.6.0</version>
+    </dependency>
+    ```
+# --8<-- [end:dependencies]
+
+# --8<-- [start:api-key]
+Get an
API key from an LLM provider or run a local LLM via Ollama.
+For more information, see [Getting started](../getting-started/index.md).
+# --8<-- [end:api-key]
+
diff --git a/docs/docs/spring-boot.md b/docs/docs/spring-boot.md
index a5238b0d42..bc9c7cb13e 100644
--- a/docs/docs/spring-boot.md
+++ b/docs/docs/spring-boot.md
@@ -326,8 +326,8 @@ API key is required but not provided
 
 ## Next Steps
 
-- Learn about the [basic agents](basic-agents.md) to build minimal AI workflows
-- Explore [complex workflow agents](complex-workflow-agents.md) for advanced use cases
+- Learn about the [basic agents](getting-started/basic-agents.md) to build minimal AI workflows
+- Explore [standard agents](getting-started/standard-agents.md) for advanced use cases
 - See the [tools overview](tools-overview.md) to extend your agents' capabilities
 - Check out [examples](examples.md) for real-world implementations
 - Read the [glossary](glossary.md) to understand the framework better
diff --git a/docs/docs/streaming-api.md b/docs/docs/streaming-api.md
index e5efa76921..44c0d63d28 100644
--- a/docs/docs/streaming-api.md
+++ b/docs/docs/streaming-api.md
@@ -451,23 +451,20 @@ val agentStrategy = strategy("library-assistant") {
 
 ```kotlin
 val toolRegistry = ToolRegistry {
-  tool(BookTool())
+    tool(BookTool())
 }
 
 val runner = AIAgent(
-    promptExecutor = simpleOpenAIExecutor(token),
-    toolRegistry = toolRegistry,
-    strategy = agentStrategy,
-    agentConfig = agentConfig
+    promptExecutor = simpleOpenAIExecutor(System.getenv("OPENAI_API_KEY")),
+    llmModel = OpenAIModels.Chat.GPT4o,
+    toolRegistry = toolRegistry
 )
 ```
diff --git a/docs/mkdocs.yml b/docs/mkdocs.yml
index 251d5f49c3..59f3fc4646 100644
--- a/docs/mkdocs.yml
+++ b/docs/mkdocs.yml
@@ -7,12 +7,12 @@ nav:
   - Key features: key-features.md
   - LLM providers: llm-providers.md
   - Glossary: glossary.md
-  - Getting started: getting-started.md
-  - Agent types:
-    - Basic agents: basic-agents.md
-    - Functional agents: functional-agents.md
-    - Complex workflow agents:
complex-workflow-agents.md - - Planner agents: planner-agents.md + - Getting started: + - getting-started/index.md + - Basic agents: getting-started/basic-agents.md + - Functional agents: getting-started/functional-agents.md + - Standard agents: getting-started/standard-agents.md + - Planner agents: getting-started/planner-agents.md - Prompts: - prompts/index.md - Creating prompts: @@ -93,11 +93,11 @@ markdown_extensions: - pymdownx.highlight: anchor_linenums: true line_spans: __span - pygments_lang_class: true - pymdownx.inlinehilite - pymdownx.snippets: + base_path: docs/snippets auto_append: - - includes/abbreviations.md + - abbreviations.md - pymdownx.details - pymdownx.superfences: custom_fences: @@ -125,7 +125,11 @@ plugins: 'simple-api-getting-started.md': 'index.md' 'advanced-tool-implementation.md': 'class-based-tools.md' 'key-concepts.md': 'glossary.md' - 'single-run-agents.md': 'basic-agents.md' + 'single-run-agents.md': 'getting-started/basic-agents.md' + 'basic-agents.md': 'getting-started/basic-agents.md' + 'functional-agents.md': 'getting-started/functional-agents.md' + 'complex-workflow-agents.md': 'getting-started/standard-agents.md' + 'planner-agents.md': 'getting-started/planner-agents.md' - llmstxt: markdown_description: Koog is a Kotlin-based framework designed to build and run AI agents entirely in idiomatic Kotlin. full_output: llms-full.txt @@ -137,11 +141,10 @@ plugins: - llm-providers.md: This page provides a list of LLM providers supported by Koog. - glossary.md: This page provides explanations of key terms and concepts related to Koog and agentic development. Getting started: - - getting-started.md: This page provides a guide on installing Koog and creating a minimal AI agent. - Agent types: - - basic-agents.md: This guide lets you build a basic agent with minimum required configuration. - - functional-agents.md: This guide shows you how to build a lightweight, non‑graph agent that you control with a simple loop. 
-      - complex-workflow-agents.md: This page explains how you can create agents that handle complex workflows by defining custom strategies, tools, configurations, and custom input and output types.
+      - getting-started/index.md: This page provides a guide on installing Koog and creating a minimal AI agent.
+      - getting-started/basic-agents.md: This guide lets you build a basic agent with minimum required configuration.
+      - getting-started/functional-agents.md: This guide shows you how to build a lightweight, non‑graph agent that you control with a simple loop.
+      - getting-started/standard-agents.md: This page explains how you can create agents that handle complex workflows by defining custom strategies, tools, configurations, and custom input and output types.
       Prompts:
       - prompts/index.md: This page includes an overview of how to create and run prompts with Koog.
       - prompts/prompt-creation/index.md: This page provides details about structured prompts created using the Kotlin DSL.
@@ -265,5 +268,5 @@ extra:
     link: https://docs.koog.ai/koog-slack-channel/
     name: Koog on Slack
 
-watch:
-  - includes
+extra_javascript:
+  - javascripts/mermaid.mjs