A powerful and user-friendly Streamlit web application that scrapes job listings from Dice.com with advanced filtering options and data export capabilities.
- Smart Job Search: Search for jobs by title with intelligent query processing
- Easy Apply Filter: Toggle between Easy Apply jobs only or all available positions
- Interactive Results Table: View job listings in a clean, sortable table format
- Excel Export: Download your search results as a professionally formatted Excel file
- Real-time Progress: Live progress tracking during the scraping process
- Summary Statistics: Get insights on total jobs, companies, remote positions, and Easy Apply jobs
- Clickable Links: Direct links to job postings for easy application
- Responsive Design: Works seamlessly on desktop and mobile devices
- Python 3.8 or higher (pandas >= 1.5.0, one of the dependencies, does not support Python 3.7)
- pip package manager
1. Clone the repository

       git clone https://github.com/No0Bitah/dice_job_search.git
       cd dice_job_search

2. Create a virtual environment (recommended)

       python -m venv venv
       source venv/bin/activate  # On Windows: venv\Scripts\activate

3. Install dependencies

       pip install -r requirements.txt

4. Run the application

       streamlit run app.py

5. Open your browser and go to http://localhost:8501
- streamlit >= 1.28.0 - Web framework for the user interface
- pandas >= 1.5.0 - Data manipulation and analysis
- requests >= 2.28.0 - HTTP library for web scraping
- beautifulsoup4 >= 4.11.0 - HTML parsing and web scraping
- python-dateutil >= 2.8.0 - Date parsing utilities
- openpyxl >= 3.0.0 - Excel file creation and manipulation
- lxml >= 4.9.0 - XML and HTML processing
1. Enter Job Title: Type the position you're looking for (e.g., "Python Developer", "Data Scientist", "Software Engineer")
2. Set Number of Jobs: Choose how many job listings you want to scrape (1-100)
3. Toggle Easy Apply Filter:
   - ON (default): Only shows jobs with the Easy Apply feature
   - OFF: Shows all available jobs from Dice.com
4. Click Search: The app starts scraping jobs and shows real-time progress updates
5. View Results: Browse jobs in an interactive table with sorting and filtering capabilities
6. Download Excel: Click the download button to save results as an Excel file
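
The inputs collected in these steps can be validated before any scraping starts. The following is a minimal sketch, not the project's actual code: `SearchRequest` and `build_search_request` are hypothetical names introduced for illustration, and only the 1-100 range and the Easy Apply default are taken from the steps above.

```python
from dataclasses import dataclass

# Bounds taken from the "Set Number of Jobs" step (1-100).
MIN_JOBS = 1
MAX_JOBS = 100


@dataclass(frozen=True)
class SearchRequest:
    """Validated search inputs (hypothetical structure, for illustration only)."""
    job_title: str
    max_jobs: int
    easy_apply_only: bool = True  # the Easy Apply filter is ON by default


def build_search_request(job_title: str, max_jobs: int,
                         easy_apply_only: bool = True) -> SearchRequest:
    """Normalize the title, clamp the job count, and reject empty searches."""
    title = " ".join(job_title.split())  # trim and collapse repeated whitespace
    if not title:
        raise ValueError("Job title must not be empty")
    clamped = max(MIN_JOBS, min(MAX_JOBS, int(max_jobs)))
    return SearchRequest(title, clamped, easy_apply_only)
```

Clamping out-of-range values instead of raising is a design choice made for this sketch; the real app may restrict the range in the input widget itself.
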
Each job listing includes the following information:
| Field | Description |
|---|---|
| Job Title | The position title |
| Company | Hiring company name |
| Location | Job location (including remote options) |
| Position Type | Full-time, Contract, Part-time, etc. |
| Compensation | Salary range or hourly rate (when available) |
| Date Posted | When the job was originally posted |
| Application | "Easy Apply" or "External Apply" |
| Job Link | Direct link to the job posting |
| Job Description | Full job description and requirements |
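
The fields in this table can be modeled as one record per job, from which the summary statistics (total jobs, companies, remote positions, Easy Apply jobs) follow directly. This is a sketch under assumptions: the `JobListing` class, its attribute names, and the `summarize` helper are hypothetical, and detecting remote jobs by looking for the word "remote" in the location is a guess at how such a count could work, not the project's confirmed logic.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class JobListing:
    """One scraped job; attribute names are illustrative, mirroring the table."""
    job_title: str
    company: str
    location: str
    position_type: str
    compensation: Optional[str]  # None when the posting lists no pay
    date_posted: str
    application: str  # "Easy Apply" or "External Apply"
    job_link: str
    job_description: str


def summarize(jobs: Iterable[JobListing]) -> dict:
    """Compute the four summary statistics over a collection of listings."""
    jobs = list(jobs)
    return {
        "total_jobs": len(jobs),
        "companies": len({job.company for job in jobs}),
        # Assumption: a job counts as remote if its location mentions "remote".
        "remote_positions": sum("remote" in job.location.lower() for job in jobs),
        "easy_apply_jobs": sum(job.application == "Easy Apply" for job in jobs),
    }
```
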
- Frontend: Streamlit for the user interface
- Backend: Custom web scraping engine with concurrent processing
- Data Processing: Pandas for data manipulation and Excel export
- Web Scraping: BeautifulSoup4 + Requests with user agent rotation
- Parallel Processing: Concurrent job detail fetching for faster results
- Caching: LRU cache for repeated URL requests
- Rate Limiting: Built-in delays to respect Dice.com's servers
- Error Handling: Robust error handling with user-friendly messages
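
Concurrent detail fetching, LRU caching, and built-in delays can be combined in a few lines of standard-library Python. The sketch below is one possible arrangement, not the project's implementation: `make_cached_fetcher` and `fetch_all` are hypothetical helpers, and the actual HTTP call is passed in as a plain function so the example runs without network access.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from functools import lru_cache
from typing import Callable, Dict, List


def make_cached_fetcher(fetch: Callable[[str], str],
                        delay_seconds: float = 0.5,
                        cache_size: int = 256) -> Callable[[str], str]:
    """Wrap `fetch` so repeated URLs are served from an LRU cache."""

    @lru_cache(maxsize=cache_size)
    def cached_fetch(url: str) -> str:
        # Delay before each uncached request. This is a per-call pause,
        # not a global rate limit across worker threads.
        time.sleep(delay_seconds)
        return fetch(url)

    return cached_fetch


def fetch_all(urls: List[str], fetcher: Callable[[str], str],
              max_workers: int = 4) -> Dict[str, str]:
    """Fetch each distinct URL on a small thread pool and map URL -> page."""
    unique_urls = list(dict.fromkeys(urls))  # de-duplicate, keep order
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        pages = list(pool.map(fetcher, unique_urls))
    return dict(zip(unique_urls, pages))
```

In real use, `fetch` could be a function that calls `requests.get(url, timeout=...)` and returns `response.text`. De-duplicating the URLs before submitting them matters because `lru_cache` does not stop two threads from computing the same uncached key at the same time.
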
- Randomized user agents to distribute requests
- Built-in delays between requests
- Respects robots.txt guidelines
- No excessive server load
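
The request-hygiene points in this list can be illustrated with the standard library alone. This is a hedged sketch: the user agent strings are placeholders, the function names (`pick_headers`, `polite_pause`, `is_allowed`) are hypothetical, and the 1-3 second delay range is an assumed value rather than the one the app uses.

```python
import random
import time
from urllib.robotparser import RobotFileParser

# Placeholder user agent strings; a real pool would list actual browser strings.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) ExampleBrowser/1.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 13_0) ExampleBrowser/2.0",
    "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/3.0",
]


def pick_headers(rng=random) -> dict:
    """Return request headers with a randomly chosen User-Agent."""
    return {"User-Agent": rng.choice(USER_AGENTS)}


def polite_pause(min_seconds: float = 1.0, max_seconds: float = 3.0,
                 rng=random, sleep=time.sleep) -> float:
    """Sleep for a random interval between requests and return its length."""
    pause = rng.uniform(min_seconds, max_seconds)
    sleep(pause)
    return pause


def is_allowed(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Check a URL against the already-downloaded text of a robots.txt file."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

A scraper built this way would call `is_allowed` before each request, send `pick_headers()` with it, and call `polite_pause()` afterwards. The `rng` and `sleep` parameters exist only so the helpers can be exercised without real waiting.
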
1. Fork or clone this repository to your GitHub account
2. Visit share.streamlit.io
3. Connect your GitHub account
4. Select your repository and set:
   - Main file path: `app.py`
   - Python version: 3.9+ (recommended)
5. Deploy and get your public URL!
- Heroku: Platform-as-a-Service deployment
- Railway: Modern deployment platform
- Render: Easy web service deployment
- Docker: Containerized deployment option
- Rate Limiting: The app includes built-in delays to avoid overwhelming Dice.com's servers
- Dynamic Content: Some job details may not be captured if they're loaded dynamically
- Site Changes: Dice.com may update their structure, which could affect scraping
- Legal Compliance: Always ensure your usage complies with Dice.com's Terms of Service
Contributions are welcome! Here's how you can help:
1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request
- Add more job boards (Indeed, LinkedIn, etc.)
- Implement job alerts and notifications
- Add advanced filtering options
- Create data visualization dashboards
- Improve mobile responsiveness
This project is licensed under the MIT License - see the LICENSE file for details.
This tool is for educational and personal use only. Users are responsible for ensuring their usage complies with Dice.com's Terms of Service and applicable laws. The developers are not responsible for any misuse of this application.
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Email: [email protected]
- Streamlit Team for the amazing framework
- Dice.com for providing job listings data
- Open Source Community for the excellent libraries used in this project
- Add support for more job boards
- Add date filter
- Implement job matching algorithms
- Create mobile app version
- Add data visualization features
- Implement user accounts and job tracking
- Add API endpoints for integration
Made with ❤️ by No0Bitah

If you find this project helpful, please give it a ⭐ on GitHub!

