Context
Our project and community are non-profit. For our job board, we don't use a pay-to-post model; instead, we use a web scraper to collect jobs from various sites.
Criteria:
Svelte-related jobs
Location-specific: Vietnam-centric, expanding to ASEAN
Sites to scrape:
Implementation
We should probably use Python for this. A couple of questions to consider and plan for:
Where do we host the Python source code? In another repo, or in this repo as a monorepo?
Where do we store the data? Plain JSON files, or cloud-based (preferred) like Supabase or Upstash? Or maybe something more lightweight, such as an alternative cloud service that provides SQLite?
-> Let's stick with Supabase because it already comes with lots of batteries-included support, like realtime subscriptions (maybe we can expand to do cool stuff like realtime toasts for newcomers from Discord, new sponsors, new mailing-list registrations?).
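To make the Supabase option concrete, here is a minimal sketch of writing a scraped posting into a hypothetical `jobs` table via supabase-py. The table name, column names, the `url` unique key, and the env-var names are all assumptions, not a settled design:

```python
import os

def normalize_job(raw: dict) -> dict:
    """Map a raw scraped posting onto the hypothetical `jobs` table columns."""
    return {
        "url": raw["url"].strip(),          # assumed unique key for deduplication
        "title": raw["title"].strip(),
        "company": raw.get("company", "").strip(),
        "location": raw.get("location", "Vietnam"),  # assumed default, per our criteria
    }

def main() -> None:
    # Requires `pip install supabase` and SUPABASE_URL / SUPABASE_KEY env vars.
    from supabase import create_client

    client = create_client(os.environ["SUPABASE_URL"], os.environ["SUPABASE_KEY"])
    job = normalize_job({"url": "https://example.com/job/1", "title": "Svelte Developer"})
    # Upsert on the unique URL so re-running the scraper doesn't duplicate rows.
    client.table("jobs").upsert(job, on_conflict="url").execute()

if __name__ == "__main__":
    main()
```

Upserting on the posting URL (rather than plain inserts) would let the cron job re-run safely without deduplication logic elsewhere.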
Which format (schema) should we store this data in?
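As a starting point for that discussion, here is one possible shape for a job record, sketched as a Python dataclass. Every field name here is an assumption (including the `_en`/`_vi` suffixes for bilingual text), offered only to make the schema question concrete:

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone

@dataclass
class JobPosting:
    # Hypothetical schema sketch; names and defaults are assumptions, not a decision.
    url: str                  # unique key, used for deduplication
    title_en: str
    title_vi: str             # bilingual copies of display text
    company: str
    location: str             # e.g. "Hanoi, Vietnam" or "Remote (ASEAN)"
    tags: list[str] = field(default_factory=list)  # e.g. ["svelte", "sveltekit"]
    source: str = ""          # which site the posting was scraped from
    scraped_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

job = JobPosting(
    url="https://example.com/job/1",
    title_en="Svelte Developer",
    title_vi="Lập trình viên Svelte",
    company="Acme",
    location="Hanoi, Vietnam",
)
print(asdict(job)["title_vi"])
```

`asdict(job)` gives a dict that maps directly onto a Supabase/Postgres row or a plain JSON file, so this sketch stays neutral on the storage question.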
How do we run this scraper? As a Vercel cron job, perhaps?
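If we go the Vercel route, a cron job is declared in `vercel.json` and invokes a serverless endpoint on a schedule (UTC). The path here is a hypothetical endpoint name, and the schedule (daily at midnight) is just a placeholder:

```json
{
  "crons": [
    { "path": "/api/scrape-jobs", "schedule": "0 0 * * *" }
  ]
}
```

One caveat worth checking: serverless functions have execution-time limits, so if a full scrape of several sites runs long, a scheduled GitHub Actions workflow that writes to Supabase might be a simpler fit.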
How do we handle language-specific text (our site is bilingual)?
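Scraped postings will usually arrive in one language only, so whatever schema we pick probably needs a fallback when the requested locale is missing. A minimal sketch, assuming per-language columns named with hypothetical `_en`/`_vi` suffixes:

```python
def localized(job: dict, key: str, locale: str) -> str:
    """Return the `locale` copy of a field, falling back to whichever copy exists.

    Assumes per-language columns named `<key>_en` / `<key>_vi` (an assumption,
    not a settled schema).
    """
    primary = job.get(f"{key}_{locale}")
    fallback = job.get(f"{key}_en") or job.get(f"{key}_vi")
    return primary or fallback or ""

# A posting scraped from an English-only site has no Vietnamese copy yet:
job = {"title_en": "Svelte Developer", "title_vi": ""}
print(localized(job, "title", "vi"))  # falls back to the English copy
```

This keeps the scraper simple (store what the source site gives us) and leaves translation as an optional later step rather than a blocker for listing a job.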
How do we deal with rate limits and bot blockers?
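Rather than bypassing blockers outright, the usual first step is to scrape politely: identify the bot, respect robots.txt, and back off with jitter when a site returns HTTP 429. A sketch of that retry pattern, assuming `requests` is installed (the User-Agent string and limits are placeholders):

```python
import random
import time

def backoff_delays(retries: int, base: float = 1.0, cap: float = 60.0) -> list[float]:
    """Exponential backoff schedule: 1s, 2s, 4s, ... capped at `cap` seconds."""
    return [min(cap, base * (2 ** i)) for i in range(retries)]

def fetch_with_retry(url: str, retries: int = 4):
    # Sketch only: assumes `pip install requests`; values are placeholders.
    import requests

    headers = {"User-Agent": "svelte-vietnam-jobs-bot/0.1 (contact URL here)"}
    for delay in backoff_delays(retries):
        resp = requests.get(url, headers=headers, timeout=10)
        if resp.status_code == 429:              # rate limited: wait, then retry
            time.sleep(delay + random.random())  # jitter avoids synchronized retries
            continue
        resp.raise_for_status()
        return resp
    raise RuntimeError(f"still rate-limited after {retries} attempts: {url}")
```

Since this runs on a cron schedule anyway, slow-and-polite (one request every few seconds per site) costs us nothing and is far less likely to get the scraper's IP blocked.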