* Send results in response by rank
* Add CSS selector to remove navigation/footer elements etc. Return simple HTML by default (add option to enable/disable the readability plugin).
* Fix outputFormat in Standby mode
* Fix CSS selector for removing attributes and tags. Remove search results when scraping a single URL
* Update lambda function
.actor/actor.json
+1 -1
@@ -3,7 +3,7 @@
"name": "rag-web-browser",
"title": "RAG Web browser",
"description": "Web browser for OpenAI Assistants API and RAG pipelines, similar to a web browser in ChatGPT. It queries Google Search, scrapes the top N pages from the results, and returns their cleaned content as Markdown for further processing by an LLM.",
"description": "A CSS selector matching HTML elements that will be removed from the DOM, before converting it to text, Markdown, or saving as HTML. This is useful to skip irrelevant page content. The value must be a valid CSS selector as accepted by the `document.querySelectorAll()` function. \n\nBy default, the Actor removes common navigation elements, headers, footers, modals, scripts, and inline image. You can disable the removal by setting this value to some non-existent CSS selector like `dummy_keep_everything`.",
"description": "Specify how to transform the HTML to extract meaningful content without any extra fluff, like navigation or modals. The HTML transformation happens after removing and clicking the DOM elements.\n\n- **None** (default) - Only removes the HTML elements specified via 'Remove HTML elements' option.\n\n- **Readable text** - Extracts the main contents of the webpage, without navigation and other fluff.",