Need technical guidance on customizing scrapers #6520
Unanswered · Sichongzou asked this question in Q&A
Replies: 2 comments 1 reply
This is a screenshot of the log.

I moved from Jellyfin to Stash and went through every scraper Stash supports, but none of them satisfied me, so I plan to write my own: a Python aggregating scraper that fetches data from multiple sites at once and merges it into a single result.

Because there is no Chinese documentation, I could only work from a translation. I skimmed the examples in the community scrapers repository and, following the translated docs, put together a rough version.
This is my main .yml file:
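The file itself is not shown above; for reference, a script-based `sceneByFragment` scraper definition typically has roughly this shape (the scraper name and arguments here are guesses based on the description, not the actual file):

```yaml
name: Sichongzou Aggregator
sceneByFragment:
  action: script
  script:
    - python
    - sichongzou_Metatube.py
    # extra arguments are passed through to the script, so it can tell
    # which operation Stash is invoking
    - sceneByFragment
```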
Then I created sichongzou_Metatube.py in the same directory and imported py_common from the community repository. My approach is simple: my local directories are already well organised, so I take the local file's name directly as the lookup key to fetch data, similar to:
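The original snippet is elided above; a minimal sketch of what such an entry point could look like, assuming Stash passes the known scene fields as JSON on stdin (the filename pattern and the `code_from_title` helper are illustrative assumptions, not the actual script):

```python
import json
import re
import sys

def code_from_title(title: str) -> str:
    """Pull a site code like 'ABC-123' out of an organised filename.
    (Hypothetical pattern -- adapt to your own naming scheme.)"""
    m = re.search(r"[A-Za-z]+-\d+", title)
    return m.group(0).upper() if m else title

def scene_by_fragment(fragment: dict) -> dict:
    # For local files the 'title' field normally carries the file name.
    code = code_from_title(fragment.get("title", ""))
    # ...query each site with `code` here and merge the results...
    return {"title": code}

if __name__ == "__main__":
    # Stash writes the scene fragment to stdin and reads the scraped
    # scene back from stdout, both as JSON.
    fragment = json.loads(sys.stdin.read())
    print(json.dumps(scene_by_fragment(fragment)))
```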


The sceneByFragment method returns a ScrapedScene, i.e.:
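The structure itself is elided above; for reference, the scene printed on stdout is a plain JSON object whose keys mirror ScrapedScene (all values below are placeholders). One thing worth noting for the error described next: the `image` field must be something Stash can resolve, i.e. a fetchable http(s) URL or a well-formed `data:` URI, not a bare Base64 string. A sketch, with a hypothetical helper for building the data URI:

```python
import base64
import json

def to_data_uri(raw: bytes, mime: str = "image/jpeg") -> str:
    # Wrap raw image bytes as a data URI; a bare Base64 string is not a
    # URL, which is one plausible source of an
    # 'unsupported protocol scheme' error.
    return f"data:{mime};base64," + base64.b64encode(raw).decode()

scene = {
    "title": "ABC-123",
    "details": "Description merged from several sites",
    "date": "2024-01-01",
    "studio": {"name": "Example Studio"},
    "performers": [{"name": "Performer A"}],
    "tags": [{"name": "Tag A"}],
    "image": to_data_uri(b"\xff\xd8"),  # cover image as a data URI
}

print(json.dumps(scene))
```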
Then I put both files into Stash's scraper directory and ran them, and it works perfectly, except that my log keeps reporting "Could not set image using URL" and unsupported protocol scheme "data".
Also, each scrape prints the Base64 I pass in the image field twice, at warning level. The Base64 is so large that the log page freezes every time I open it. Scraping itself works fine, though; I don't know where the problem is.
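On the log flooding: Stash reads scraper log messages from the script's stderr, one line per message, with the level encoded in a `\x01` prefix (this is how py_common's log module emits them, as far as I can tell). If the whole returned scene, Base64 image included, is ever written there at warning level, the log page has to render the entire payload. A sketch of logging the scene with the image redacted first (the helper names are my own, not part of py_common):

```python
import sys

def log_warning(msg: str) -> None:
    # Stash scraper log protocol as used by py_common's log module:
    # each stderr line starts with \x01 + a level character + \x01
    # ('w' = warning).
    print(f"\x01w\x01{msg}", file=sys.stderr)

def redact_image(scene: dict) -> dict:
    """Copy of the scene that is safe to log: the huge Base64 image
    payload is replaced by a short placeholder."""
    out = dict(scene)
    if "image" in out:
        out["image"] = f"<image, {len(scene['image'])} chars omitted>"
    return out
```

It may also be worth checking whether anything in the script (or a debug hook) prints the full scene to stderr; redacting or redirecting that output should unfreeze the log page.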