apify
diff --git a/.actor/input_schema.json b/.actor/input_schema.json
Lines changed: 1 addition & 1 deletion
diff --git a/CHANGELOG.md b/CHANGELOG.md
Lines changed: 5 additions & 0 deletions
diff --git a/README.md b/README.md
Lines changed: 18 additions & 17 deletions
@@ -77,7 +77,7 @@
             "type": "string",
             "description": "Select a scraping tool for extracting the target web pages. The Browser tool is more powerful and can handle JavaScript heavy websites, while the Plain HTML tool can't handle JavaScript but is about two times faster.",
             "editor": "select",
-            "default": "browser-playwright",
+            "default": "raw-http",
             "enum": ["browser-playwright", "raw-http"],
             "enumTitles": ["Browser (uses Playwright)", "Raw HTTP"]
         },
 
@@ -1,5 +1,10 @@
 This changelog summarizes all changes of the RAG Web Browser
 
+### 1.0.9 (2025-03-14)
+
+🚀 Features
+- Change default value for `scrapingTool` from 'browser-playwright' to 'raw-http' to improve latency.
+
 ### 1.0.8 (2025-03-07)
 
 🚀 Features
 
@@ -2,7 +2,7 @@
 
 [![RAG Web Browser](https://apify.com/actor-badge?actor=apify/rag-web-browser)](https://apify.com/apify/rag-web-browser)
 
-This Actor provides web browsing functionality for AI and LLM applications,
+This Actor provides web browsing functionality for AI agents and LLM applications,
 similar to the [web browsing](https://openai.com/index/introducing-chatgpt-search/) feature in ChatGPT.
 It accepts a search phrase or a URL, queries Google Search, then crawls web pages from the top search results, cleans the HTML, converts it to text or Markdown,
 and returns it back for processing by the LLM application.
@@ -107,20 +107,20 @@ The response is a JSON array with objects containing the web content from the fo
 
 The `/search` GET HTTP endpoint accepts the following query parameters:
 
-| Parameter                    | Type    | Default               | Description                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
-|------------------------------|---------|-----------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| `query`                      | string  | N/A                   | Enter Google Search keywords or a URL to a specific web page. The keywords might include the [advanced search operators](https://blog.apify.com/how-to-scrape-google-like-a-pro/). You need to percent-encode the value if it contains some special characters.                                                                                                                                                                                                                                                            |
-| `maxResults`                 | number  | `3`                   | The maximum number of top organic Google Search results whose web pages will be extracted. If `query` is a URL, then this parameter is ignored and the Actor only fetches the specific web page.                                                                                                                                                                                                                                                                                                                           |
-| `outputFormats`              | string  | `markdown`            | Select one or more formats to which the target web pages will be extracted. Use comma to separate multiple values (e.g. `text,markdown`)                                                                                                                                                                                                                                                                                                                                                                                   |
-| `requestTimeoutSecs`         | number  | `30`                  | The maximum time in seconds available for the request, including querying Google Search and scraping the target web pages. For example, OpenAI allows only [45 seconds](https://platform.openai.com/docs/actions/production#timeouts) for custom actions. If a target page loading and extraction exceeds this timeout, the corresponding page will be skipped in results to ensure at least some results are returned within the timeout. If no page is extracted within the timeout, the whole request fails.            |
-| `serpProxyGroup`             | string  | `GOOGLE_SERP`         | Enables overriding the default Apify Proxy group used for fetching Google Search results.                                                                                                                                                                                                                                                                                                                                                                                                                                  |
-| `serpMaxRetries`             | number  | `1`                   | The maximum number of times the Actor will retry fetching the Google Search results on error. If the last attempt fails, the entire request fails.                                                                                                                                                                                                                                                                                                                                                                         |
-| `scrapingTool`               | string  | `browser-playwright`  | Selects which scraping tool is used to extract the target websites. `browser-playwright` uses browser and can handle complex Javascript heavy website. Meanwhile `raw-http` uses simple HTTP request to fetch the HTML provided by the URL, it can't handle websites that rely on Javascript but it's about two times faster.                                                                                                                                                                                              |
-| `maxRequestRetries`          | number  | `1`                   | The maximum number of times the Actor will retry loading the target web page on error. If the last attempt fails, the page will be skipped in the results.                                                                                                                                                                                                                                                                                                                                                                 |
-| `dynamicContentWaitSecs`     | number  | `10`                  | The maximum time in seconds to wait for dynamic page content to load. The Actor considers the web page as fully loaded once this time elapses or when the network becomes idle.                                                                                                                                                                                                                                                                                                                                            |
-| `removeCookieWarnings`       | boolean | `true`                | If enabled, removes cookie consent dialogs to improve text extraction accuracy. This might increase latency.                                                                                                                                                                                                                                                                                                                                                                                                               |
-| `removeElementsCssSelector`  | string  | `see input`           | A CSS selector matching HTML elements that will be removed from the DOM, before converting it to text, Markdown, or saving as HTML. This is useful to skip irrelevant page content. The value must be a valid CSS selector as accepted by the `document.querySelectorAll()` function. \n\nBy default, the Actor removes common navigation elements, headers, footers, modals, scripts, and inline image. You can disable the removal by setting this value to some non-existent CSS selector like `dummy_keep_everything`. |
-| `debugMode`                  | boolean | `false`               | If enabled, the Actor will store debugging information in the dataset's debug field.                                                                                                                                                                                                                                                                                                                                                                                                                                       |
+| Parameter                    | Type    | Default       | Description                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
+|------------------------------|---------|---------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| `query`                      | string  | N/A           | Enter Google Search keywords or a URL to a specific web page. The keywords might include the [advanced search operators](https://blog.apify.com/how-to-scrape-google-like-a-pro/). You need to percent-encode the value if it contains some special characters.                                                                                                                                                                                                                                                            |
+| `maxResults`                 | number  | `3`           | The maximum number of top organic Google Search results whose web pages will be extracted. If `query` is a URL, then this parameter is ignored and the Actor only fetches the specific web page.                                                                                                                                                                                                                                                                                                                           |
+| `outputFormats`              | string  | `markdown`    | Select one or more formats to which the target web pages will be extracted. Use comma to separate multiple values (e.g. `text,markdown`)                                                                                                                                                                                                                                                                                                                                                                                   |
+| `requestTimeoutSecs`         | number  | `30`          | The maximum time in seconds available for the request, including querying Google Search and scraping the target web pages. For example, OpenAI allows only [45 seconds](https://platform.openai.com/docs/actions/production#timeouts) for custom actions. If a target page loading and extraction exceeds this timeout, the corresponding page will be skipped in results to ensure at least some results are returned within the timeout. If no page is extracted within the timeout, the whole request fails.            |
+| `serpProxyGroup`             | string  | `GOOGLE_SERP` | Enables overriding the default Apify Proxy group used for fetching Google Search results.                                                                                                                                                                                                                                                                                                                                                                                                                                  |
+| `serpMaxRetries`             | number  | `1`           | The maximum number of times the Actor will retry fetching the Google Search results on error. If the last attempt fails, the entire request fails.                                                                                                                                                                                                                                                                                                                                                                         |
+| `scrapingTool`               | string  | `raw-http`    | Selects which scraping tool is used to extract the target websites. `browser-playwright` uses browser and can handle complex Javascript heavy website. Meanwhile `raw-http` uses simple HTTP request to fetch the HTML provided by the URL, it can't handle websites that rely on Javascript but it's about two times faster.                                                                                                                                                                                              |
+| `maxRequestRetries`          | number  | `1`           | The maximum number of times the Actor will retry loading the target web page on error. If the last attempt fails, the page will be skipped in the results.                                                                                                                                                                                                                                                                                                                                                                 |
+| `dynamicContentWaitSecs`     | number  | `10`          | The maximum time in seconds to wait for dynamic page content to load. The Actor considers the web page as fully loaded once this time elapses or when the network becomes idle.                                                                                                                                                                                                                                                                                                                                            |
+| `removeCookieWarnings`       | boolean | `true`        | If enabled, removes cookie consent dialogs to improve text extraction accuracy. This might increase latency.                                                                                                                                                                                                                                                                                                                                                                                                               |
+| `removeElementsCssSelector`  | string  | `see input`   | A CSS selector matching HTML elements that will be removed from the DOM, before converting it to text, Markdown, or saving as HTML. This is useful to skip irrelevant page content. The value must be a valid CSS selector as accepted by the `document.querySelectorAll()` function. \n\nBy default, the Actor removes common navigation elements, headers, footers, modals, scripts, and inline image. You can disable the removal by setting this value to some non-existent CSS selector like `dummy_keep_everything`. |
+| `debugMode`                  | boolean | `false`       | If enabled, the Actor will store debugging information in the dataset's debug field.                                                                                                                                                                                                                                                                                                                                                                                                                                       |
 
 <!-- TODO: we should probably add proxyConfiguration -->
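
Because this change switches the default of `scrapingTool` to `raw-http`, callers of the `/search` endpoint that depend on browser rendering may want to set the parameter explicitly rather than rely on the default. The following is a minimal sketch of building such a request URL with the Python standard library. The base URL is an assumption inferred from the `/sse` URL mentioned in the README, the helper name `build_search_url` is hypothetical, and authentication (for example an Apify API token) is deliberately left out.

```python
from urllib.parse import quote, urlencode

# Assumption: base URL inferred from the `/sse` URL quoted in the README.
BASE_URL = "https://rag-web-browser.apify.actor"

# Allowed values taken from the `enum` in .actor/input_schema.json.
SCRAPING_TOOLS = ("browser-playwright", "raw-http")


def build_search_url(query, scraping_tool="raw-http", max_results=3,
                     output_formats=("markdown",)):
    """Build a /search URL (hypothetical helper, not part of the Actor)."""
    if scraping_tool not in SCRAPING_TOOLS:
        raise ValueError(f"unknown scrapingTool: {scraping_tool!r}")
    params = {
        "query": query,
        "maxResults": max_results,
        # Multiple formats are passed as one comma-separated value.
        "outputFormats": ",".join(output_formats),
        "scrapingTool": scraping_tool,
    }
    # quote_via=quote percent-encodes spaces and special characters.
    return f"{BASE_URL}/search?{urlencode(params, quote_via=quote)}"


# Pin the previous default for a JavaScript-heavy target.
url = build_search_url("site:apify.com rag & llm",
                       scraping_tool="browser-playwright")
print(url)
```

Passing `scrapingTool` explicitly keeps client behavior stable if the default changes again, at the cost of the higher latency the README attributes to the browser tool.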
 
@@ -131,7 +131,7 @@ RAG Web Browser has been designed for easy integration with LLM applications, GP
 
 ### OpenAPI schema
 
-Here you can find the [OpenAPI 3.1.0 schema](https://raw.githubusercontent.com/apify/rag-web-browser/refs/heads/master/docs/standby-openapi-3.1.0.json)
+Here you can find the [OpenAPI 3.1.0 schema](https://apify.com/apify/rag-web-browser/api/openapi)
 or [OpenAPI 3.0.0 schema](https://raw.githubusercontent.com/apify/rag-web-browser/refs/heads/master/docs/standby-openapi-3.0.0.json)
 for the Standby web server. Note that the OpenAPI definition contains
 all available query parameters, but only `query` is required.
@@ -203,7 +203,8 @@ In the Standby mode, the Actor runs an HTTP server that supports the MCP protoco
     data: {"result":{"content":[{"type":"text","text":"[{\"searchResult\":{\"title\":\"Language models recent news\",\"description\":\"Amazon Launches New Generation of LLM Foundation Model...\"}}
     ```
 
-To learn more about MCP server integration, check out the [RAG Web Browser MCP server documentation](https://github.com/apify/mcp-server-rag-web-browser).
+You can try the MCP server using the [MCP Tester Client](https://apify.com/jiri.spilka/tester-mcp-client) available on Apify. In the MCP client, enter the URL `https://rag-web-browser.apify.actor/sse` in the Actor input field, click **Run**, and interact with the server in the UI.
+To learn more about MCP servers, check out the blog post [What is Anthropic's Model Context Protocol](https://blog.apify.com/what-is-model-context-protocol/).
 
 ## ⏳ Performance optimization