This guide will help you set up and deploy Paper Pulse on GitHub Pages.
- GitHub account
- DashScope API key (free tier available at https://dashscope.console.aliyun.com/)
- Git installed locally
- Push this code to your GitHub repository:
git add .
git commit -m "Initial commit: Paper Pulse"
git push origin master
- Go to your repository on GitHub
- Click on Settings → Secrets and variables → Actions
- Click New repository secret
- Add the following secret:
- Name: DASHSCOPE_API_KEY (or MODELSCOPE_API_KEY)
- Value: Your DashScope/ModelScope API key
- Go to the Actions tab in your repository
- If prompted, click I understand my workflows, go ahead and enable them
- The workflow should now be enabled
- Go to Settings → Pages
- Under Source, select:
- Source: Deploy from a branch
- Branch: master (or main if that's your default branch)
- Folder: / (root)
- Click Save
- Wait a few minutes for the site to deploy
- Your site will be available at:
https://<username>.github.io/<repo-name>/
- Go to Actions tab
- Click on Fetch Papers workflow
- Click Run workflow → Run workflow
- Wait for the workflow to complete (5-10 minutes, depending on the number of papers)
The workflow runs automatically every day at 00:00 UTC.
- After the workflow completes, check that data/papers.json has been created
- Visit your GitHub Pages URL
- You should see the papers displayed in card format with bilingual summaries
- Make sure you've added the secret in Step 2
- The secret name must be DASHSCOPE_API_KEY or MODELSCOPE_API_KEY
- Check the workflow logs to see if papers were fetched
- Papers must match your keyword filters in keywords.txt
- Only papers from the configured time period are kept (default: 7 days)
- Make sure you selected / (root) as the folder in Pages settings
- Wait a few minutes after enabling Pages for DNS to propagate
- Check that index.html exists in your repository root
- The default delays (3s for arXiv, 1s for summarization) should prevent rate limiting
- If you still hit limits, increase the delays in config.toml
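To illustrate what a fixed delay between requests does, here is a minimal sketch. It is not the project's actual code: `call_with_delay` and its `sleep` parameter are hypothetical names chosen for this example, and the real fetcher may throttle differently.

```python
import time
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def call_with_delay(
    items: Iterable[T],
    func: Callable[[T], R],
    delay: float,
    sleep: Callable[[float], None] = time.sleep,
) -> List[R]:
    """Call func on each item, pausing `delay` seconds between calls."""
    results = []
    for index, item in enumerate(items):
        if index > 0:
            # No pause before the first call, only between calls
            sleep(delay)
        results.append(func(item))
    return results


# Record the pauses instead of really sleeping, so the demo runs instantly
waits = []
titles = call_with_delay(["a", "b", "c"], str.upper, delay=3.0, sleep=waits.append)
print(titles)  # ['A', 'B', 'C']
print(waits)   # [3.0, 3.0]
```

With a 3-second delay, three requests spend at least 6 seconds waiting, so a larger delay trades run time for fewer rate-limit errors.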
Edit keywords.txt in the repository root to customize which papers are included.
Format:
- Each line is an OR condition
- Multiple words on the same line use AND logic (all must match)
- Lines starting with # are comments
- Empty lines are ignored
Examples:
# Match papers with "transformer" OR "attention"
transformer
attention
# Match papers with BOTH "neural" AND "backdoor" (both words must appear)
neural backdoor
# Match papers with BOTH "federated" AND "learning" (AND logic, as described above)
federated learning
A paper will be included if it matches ANY line in the file.
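The rules above can be sketched in a few lines of Python. This is an illustration only, not the project's actual matcher: the function names are hypothetical, and case-insensitive substring matching is an assumption that the guide does not confirm.

```python
from typing import List


def parse_keywords(text: str) -> List[List[str]]:
    """Turn the contents of keywords.txt into a list of rules, one per line."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        # Skip empty lines and comment lines starting with '#'
        if not line or line.startswith("#"):
            continue
        # Assumption: matching is case-insensitive, so lowercase the words
        rules.append(line.lower().split())
    return rules


def matches(paper_text: str, rules: List[List[str]]) -> bool:
    """True if ANY rule matches; a rule matches when ALL of its words appear."""
    haystack = paper_text.lower()
    # Assumption: a word counts as present if it occurs as a substring
    return any(all(word in haystack for word in rule) for rule in rules)


keywords = """
# Security topics
neural backdoor
transformer
"""
rules = parse_keywords(keywords)
print(rules)  # [['neural', 'backdoor'], ['transformer']]
print(matches("Backdoor attacks on neural networks", rules))  # True
print(matches("A survey of neural architectures", rules))     # False
```

The second title contains "neural" but not "backdoor", so the AND line does not match, and no other line matches either.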
Edit config.toml to customize:
Retention period:
[general]
days_back = 7 # Keep papers from the last 7 days
arXiv categories:
[fetchers.arxiv]
categories = ["cs.CR", "cs.AI", "cs.LG", "cs.CL"] # Customize categories
AI model:
[summarizer]
model = "qwen-plus" # Options: qwen-turbo, qwen-plus, qwen-max
max_tokens = 1500 # For bilingual summaries
Rate limits:
[fetchers.arxiv]
delay = 3.0 # Delay between arXiv requests
[summarizer]
rate_limit_delay = 1.0 # Delay between summarization calls
See CONFIG_GUIDE.md for more detailed configuration options.
Edit .github/workflows/fetch-papers.yml, line 5:
- cron: '0 0 * * *' # Daily at 00:00 UTC
Use crontab.guru to generate different schedules.
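GitHub Actions evaluates cron schedules in UTC, so it can help to work out when a daily schedule will next fire. The sketch below handles only the simple `minute hour * * *` form used here; `next_daily_run` is a hypothetical helper, not part of the project, and scheduled runs on GitHub may start later than the nominal time.

```python
from datetime import datetime, timedelta, timezone


def next_daily_run(now: datetime, hour: int = 0, minute: int = 0) -> datetime:
    """Next time a 'minute hour * * *' schedule fires strictly after `now` (UTC).

    `now` must be timezone-aware.
    """
    now = now.astimezone(timezone.utc)
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= now:
        # Today's slot has already passed, so the next run is tomorrow
        candidate += timedelta(days=1)
    return candidate


now = datetime(2024, 3, 10, 18, 30, tzinfo=timezone.utc)
print(next_daily_run(now))                    # 2024-03-11 00:00:00+00:00
print(next_daily_run(now, hour=6, minute=30)) # 2024-03-11 06:30:00+00:00
```

The second call corresponds to the cron expression `30 6 * * *`.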
Test the fetcher locally before deploying:
# Install dependencies
pip install -r requirements.txt
# Set API key
export DASHSCOPE_API_KEY="your-key-here"
# Run the script
python scripts/main.py
This will create data/papers.json, which you can inspect. Open index.html in a browser to view the results.
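For a quick sanity check of the generated file, a short script can confirm that it is valid JSON and report its top-level shape. This guide does not document the schema of data/papers.json, so the sketch makes no assumption about field names; `describe_json_file` is a hypothetical helper.

```python
import json
from pathlib import Path


def describe_json_file(path: Path) -> str:
    """Return a one-line description of the top-level JSON value in a file."""
    data = json.loads(path.read_text(encoding="utf-8"))
    if isinstance(data, list):
        return f"list with {len(data)} entries"
    if isinstance(data, dict):
        return f"object with keys: {sorted(data)}"
    return f"{type(data).__name__} value"


papers_path = Path("data/papers.json")
if papers_path.exists():
    print(describe_json_file(papers_path))
else:
    print(f"{papers_path} not found; run scripts/main.py first")
```

If the file is malformed, `json.loads` raises `json.JSONDecodeError`, which points to the line and column of the problem.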
MIT