Commit 439b24d
v0.3.0: speed, progress clarity, image extraction, file skip
- Skip HEAD when no min/max image size (faster downloads)
- Lower default delay 1.0→0.5s; more parallel workers (5 for assets, 4 for HEAD)
- Clear progress: Found N, Downloading N assets, [i/N] per item
- Image extraction: preload links, background-image, path hints, lazy attrs
- File skip: canonical paths, skip already scraped resources
- Auto-retry crawl cross-domain if the same-domain crawl finds nothing
- GUI: Stop button, status parsing, last URL persisted
- Fix: tqdm unit spacing, multiprocessing warning
Co-authored-by: Cursor <cursoragent@cursor.com>
1 parent cacd8f0 commit 439b24d
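The "Skip HEAD when no min/max image size" item can be read as a simple guard: a HEAD probe is only worth a round trip when there is a size limit to check against. The sketch below illustrates that reading; it is not taken from the commit, and the names `needs_head_request`, `size_allowed`, `min_bytes`, and `max_bytes` are hypothetical.

```python
from typing import Optional


def needs_head_request(min_bytes: Optional[int], max_bytes: Optional[int]) -> bool:
    """Return True only when at least one size limit is configured.

    With no limits there is nothing to compare Content-Length against,
    so the HEAD request can be skipped and the asset fetched directly.
    """
    return min_bytes is not None or max_bytes is not None


def size_allowed(
    content_length: Optional[int],
    min_bytes: Optional[int],
    max_bytes: Optional[int],
) -> bool:
    """Check a reported size against the optional limits.

    Assumption: an unknown size (no Content-Length header) is allowed
    through rather than rejected.
    """
    if content_length is None:
        return True
    if min_bytes is not None and content_length < min_bytes:
        return False
    if max_bytes is not None and content_length > max_bytes:
        return False
    return True
```

A caller would check `needs_head_request(...)` first and go straight to the download when it returns `False`, which is where the speedup would come from under this reading.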
9 files changed
Lines changed: 1040 additions & 286 deletions
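As a rough illustration of the "Image extraction" item, the sketch below collects image URLs from three of the sources the commit message names: preload links, inline `background-image` styles, and lazy-loading attributes. It uses only the standard library and is not the project's code; `ImageURLCollector`, `extract_image_urls`, and the attribute names in `LAZY_ATTRS` are assumptions, since the commit does not list which lazy attributes it checks.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical set of lazy-loading attributes; the real list is not in the commit.
LAZY_ATTRS = ("data-src", "data-lazy-src", "data-original")

# Matches url(...) inside an inline background-image declaration.
BG_URL_RE = re.compile(
    r"background-image\s*:\s*url\(\s*['\"]?([^'\")]+)['\"]?\s*\)", re.IGNORECASE
)


class ImageURLCollector(HTMLParser):
    """Collect absolute image URLs in document order, without duplicates."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def _add(self, raw):
        # Ignore missing values and inline data: URIs.
        if not raw or raw.startswith("data:"):
            return
        url = urljoin(self.base_url, raw.strip())
        if url not in self.urls:
            self.urls.append(url)

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "img":
            self._add(a.get("src"))
            for name in LAZY_ATTRS:
                self._add(a.get(name))
        elif tag == "link":
            rel = (a.get("rel") or "").lower().split()
            if "preload" in rel and (a.get("as") or "").lower() == "image":
                self._add(a.get("href"))
        # Any element may carry an inline background-image.
        style = a.get("style")
        if style:
            for match in BG_URL_RE.finditer(style):
                self._add(match.group(1))


def extract_image_urls(html, base_url):
    parser = ImageURLCollector(base_url)
    parser.feed(html)
    return parser.urls
```

Background images declared in external stylesheets and the "path hints" heuristic from the commit message are outside the scope of this sketch.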