[feat] added a versatile and efficient Web Browsing Tool with Asynchronous Surfing #761

Closed
wants to merge 2 commits into from

harshalmore31
Collaborator

@harshalmore31 harshalmore31 commented Jan 25, 2025

This PR introduces a web browsing tool that supports asynchronous surfing, dynamic content rendering, and real-time progress tracking. It is designed to return browsed content in a clean, readable format.

Features

  • Asynchronous Search and Browsing: Utilizes aiohttp for non-blocking web requests.
  • Dynamic Webpage Handling: Leverages Playwright to handle modern, JavaScript-heavy websites.
  • Content Extraction: Employs BeautifulSoup for clean content extraction.
  • Retry Mechanism: Implements tenacity for resilience against transient failures.
  • Real-Time Progress Updates: Provides rich progress bars for real-time tracking.
  • Secure and Configurable: Supports configuration through .env files.

Advanced Capabilities

  • Handles dynamic, JavaScript-heavy websites for a smooth browsing experience.
  • Optimized for simultaneous surfing with concurrent processing and thread pooling.
  • Outputs browsed content in a clean, readable, and well-organized format.
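
The thread pooling mentioned above could look roughly like the following. The PR does import `ThreadPoolExecutor` and `as_completed`, but how it uses them is not shown here, so this is only a sketch under assumptions: `browse_concurrently` and `fake_render` are hypothetical names, and `fake_render` stands in for a blocking page render.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, List


def browse_concurrently(
    urls: List[str],
    worker: Callable[[str], str],
    max_workers: int = 4,
) -> Dict[str, str]:
    """Run worker(url) in a thread pool and collect results keyed by URL."""
    results: Dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        future_to_url = {pool.submit(worker, url): url for url in urls}
        # as_completed yields futures in completion order, not input order.
        for future in as_completed(future_to_url):
            url = future_to_url[future]
            try:
                results[url] = future.result()
            except Exception as exc:
                # Record the failure instead of aborting the whole batch.
                results[url] = f"error: {exc}"
    return results


def fake_render(url: str) -> str:
    # Hypothetical placeholder for a blocking browser render.
    if "bad" in url:
        raise ValueError("render failed")
    return url.upper()
```

Keying results by URL keeps the output stable even though the futures finish in an unpredictable order.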

Impact

  • Provides a user-friendly and efficient web surfing experience.
  • Ensures compatibility with modern web technologies and dynamic content.
  • Enhances productivity with structured and easily accessible browsing results.

📚 Documentation preview 📚: https://swarms--761.org.readthedocs.build/en/761/

@github-actions github-actions bot added the tools label Jan 25, 2025
"summary": summary
}

with open(os.path.join(self.outputs_dir, "search_results.json"), "w", encoding="utf-8") as f:

Check failure

Code scanning / Bearer

Unsanitized dynamic input in file path Error

Unsanitized dynamic input in file path
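
One common way to address this kind of finding is to resolve the target path and confirm it stays inside a trusted base directory before opening it. The sketch below is an assumption about a possible fix, not the change the scanner or the maintainers prescribe; `safe_output_path` and the `base_dir` parameter are hypothetical names.

```python
import os


def safe_output_path(base_dir: str, outputs_dir: str, filename: str) -> str:
    """Return an absolute path inside base_dir, or raise ValueError."""
    base = os.path.realpath(base_dir)
    # realpath collapses ".." segments and resolves symlinks.
    candidate = os.path.realpath(os.path.join(base, outputs_dir, filename))
    # commonpath equals base only when candidate is base or lies beneath it.
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError(f"path escapes base directory: {candidate}")
    return candidate
```

A caller would then pass the returned path to `open(...)` in place of the raw `os.path.join(...)` result.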
@@ -0,0 +1,258 @@
import asyncio
import aiohttp

Check failure

Code scanning / Pyre

Undefined import Error

Undefined import [21]: Could not find a module corresponding to import aiohttp.
@@ -0,0 +1,258 @@
import asyncio
import aiohttp
from bs4 import BeautifulSoup

Check failure

Code scanning / Pyre

Undefined import Error

Undefined import [21]: Could not find a module corresponding to import bs4.
import json
import os
from typing import List, Dict, Optional
from dotenv import load_dotenv

Check failure

Code scanning / Pyre

Undefined import Error

Undefined import [21]: Could not find a module corresponding to import dotenv.
import os
from typing import List, Dict, Optional
from dotenv import load_dotenv
import google.generativeai as genai

Check failure

Code scanning / Pyre

Undefined import Error

Undefined import [21]: Could not find a module corresponding to import google.generativeai.
from typing import List, Dict, Optional
from dotenv import load_dotenv
import google.generativeai as genai
from rich.console import Console

Check failure

Code scanning / Pyre

Undefined import Error

Undefined import [21]: Could not find a module corresponding to import rich.console.
from dotenv import load_dotenv
import google.generativeai as genai
from rich.console import Console
from rich.progress import Progress, SpinnerColumn, TextColumn, BarColumn, TimeRemainingColumn

Check failure

Code scanning / Pyre

Undefined import Error

Undefined import [21]: Could not find a module corresponding to import rich.progress.
import google.generativeai as genai
from rich.console import Console
from rich.progress import Progress, SpinnerColumn, TextColumn, BarColumn, TimeRemainingColumn
import html2text

Check failure

Code scanning / Pyre

Undefined import Error

Undefined import [21]: Could not find a module corresponding to import html2text.
from rich.progress import Progress, SpinnerColumn, TextColumn, BarColumn, TimeRemainingColumn
import html2text
from concurrent.futures import ThreadPoolExecutor, as_completed
from playwright.sync_api import sync_playwright

Check failure

Code scanning / Pyre

Undefined import Error

Undefined import [21]: Could not find a module corresponding to import playwright.sync_api.
from concurrent.futures import ThreadPoolExecutor, as_completed
from playwright.sync_api import sync_playwright
import time
from tenacity import retry, stop_after_attempt, wait_exponential

Check failure

Code scanning / Pyre

Undefined import Error

Undefined import [21]: Could not find a module corresponding to import tenacity.
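
All of the Pyre failures above name third-party packages. One plausible cause is that those packages are not installed in the environment where the check runs. The sketch below lists which modules cannot be located by the current interpreter. It is only a rough diagnostic: `missing_modules` is a hypothetical helper, and runtime importability is not the same thing as Pyre's own search path.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the module names that the current interpreter cannot locate."""
    missing: List[str] = []
    for name in names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # Raised when a parent package of a dotted name is absent
            # or a loaded module has no usable spec.
            spec = None
        if spec is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    flagged = [
        "aiohttp", "bs4", "dotenv", "google.generativeai", "rich.console",
        "rich.progress", "html2text", "playwright.sync_api", "tenacity",
    ]
    print(missing_modules(flagged))
```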
@kyegomez
Owner

@harshalmore31 this goes in swarms-tools, not this library


2 participants