scraper

Here are 9,695 public repositories matching this topic...

huginn / huginn

Create agents that monitor and act on your behalf. Your agents are standing by!

notifications agent rss scraper automation twitter monitoring huginn feed feedgenerator webscraping twitter-streaming

Updated Jul 19, 2025
Ruby

mendableai / firecrawl

🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.

markdown crawler data scraper ai html-to-markdown web-crawler scraping webscraping rag llm ai-scraping

Updated Jul 20, 2025
TypeScript

NaiboWang / EasySpider

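A visual no-code web crawler/spider. EasySpider (易采集) is a visual tool for browser automation testing, data collection, and crawling that lets you design and run crawler tasks graphically without writing code. Also known as ServiceWrapper, an intelligent service-wrapping system for web applications.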

Updated Jul 20, 2025
JavaScript

iawia002 / lux

👾 Fast and simple video download library and CLI tool written in Go

go golang crawler scraper downloader youtube video download tumblr bilibili qq youku iqiyi

Updated Jul 2, 2025
Go

cheeriojs / cheerio

The fast, flexible, and elegant library for parsing and manipulating HTML and XML.

html jquery parser scraper dom cheerio selector hacktoberfest htmlparser2 htmlparser

Updated Jul 21, 2025
TypeScript

feder-cr / Jobs_Applier_AI_Agent_AIHawk

AIHawk aims to ease the job hunt by automating the job application process. Using artificial intelligence, it lets users apply for multiple jobs in a tailored way.

Updated May 28, 2025
Python

gocolly / colly

Elegant Scraper and Crawler Framework for Golang

go golang crawler scraper framework spider scraping crawling

Updated Jun 18, 2025
Go

apify / crawlee

Crawlee: a web scraping and browser automation library for Node.js for building reliable crawlers in JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Supports both headful and headless modes, with proxy rotation.

nodejs javascript npm crawler scraper automation typescript web-crawler headless scraping crawling web-scraping web-crawling headless-chrome apify puppeteer playwright

Updated Jul 18, 2025
TypeScript

codelucas / newspaper

newspaper3k is a library for news, full-text, and article metadata extraction in Python 3. Advanced docs:

python crawler scraper news crawling news-aggregator

Updated Jul 11, 2025
HTML

Evil0ctal / Douyin_TikTok_Download_API

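🚀 "Douyin_TikTok_Download_API" is an out-of-the-box, high-performance asynchronous data scraping tool for Douyin, Kuaishou, TikTok, and Bilibili. It supports API calls as well as online batch parsing and downloading.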

Updated Mar 23, 2025
Python

getmaxun / maxun

🔥 Open-source no code web data extraction platform. Instantly turn any website into API or spreadsheet 🔥

api scraper automation browser web-scraper self-hosted web-scraping data-extraction webscraping agents browser-automation no-code web-automation rpa robotic-process-automation playwright website-to-api no-code-web-scraper

Updated Jul 18, 2025
TypeScript

pwxcoo / chinese-xinhua

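📙 Chinese Xinhua dictionary database, including xiehouyu (two-part allegorical sayings), chengyu (idioms), words, and Chinese characters.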

json data scraper json-data python3 chinese chinese-nlp chinese-characters chinese-simplified chinese-traditional json-dataset chinese-language

Updated Dec 26, 2023
Python

guyueyingmu / avbook

AV movie management system with crawlers for avmoo, javbus, and javlibrary; an online AV video library and AV magnet link database. Japanese Adult Video Library, Adult Video Magnet Links - Japanese Adult Video Database

crawler scraper laravel database spider magnet-link guzzlehttp magnet adult javbus javlibrary avmoo adult-video

Updated Jun 1, 2024
PHP

TeamWiseFlow / wiseflow

Use LLMs to dig out what you care about from massive amounts of information and a variety of sources daily.

crawler scraper information-gathering focus-stacking llm

Updated Jul 4, 2025
Python

BruceDone / awesome-crawler

A collection of awesome web crawlers and spiders in different languages

crawler scraper awesome spider web-crawler web-scraper node-crawler

Updated Jun 16, 2024

alirezamika / autoscraper

A Smart, Automatic, Fast and Lightweight Web Scraper for Python

python crawler machine-learning scraper automation ai scraping artificial-intelligence web-scraping scrape webscraping webautomation

Updated Jun 9, 2025
Python

go-rod / rod

A Chrome DevTools Protocol driver for web automation and scraping.

testing go golang scraper automation web chrome-devtools headless devtools crawling web-scraping cdp chrome-headless rod chrome-devtools-protocol devtools-protocol gorod

Updated Dec 7, 2024
Go

apify / crawlee-python

Crawlee: a web scraping and browser automation library for Python for building reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with BeautifulSoup, Playwright, and raw HTTP. Supports both headful and headless modes, with proxy rotation.

python crawler scraper automation web-crawler headless scraping crawling pip web-scraping beautifulsoup web-crawling hacktoberfest headless-chrome apify playwright

Updated Jul 21, 2025
Python

MontFerret / ferret

Declarative web scraping

go cli golang crawler chrome data-mining scraper library tool dsl scraping crawling query-language scraping-websites cdp

Updated Jul 17, 2025
Go

yujiosaka / headless-chrome-crawler

Distributed crawler powered by Headless Chrome

jquery crawler chrome scraper promise scraping crawling chromium headless-chrome puppeteer

Updated Apr 29, 2023
JavaScript
