Supporting `to_return` in web-poet rules #88
Conversation
Codecov Report

```diff
@@            Coverage Diff             @@
##           master      #88      +/-   ##
==========================================
- Coverage   91.91%   88.87%   -3.05%
==========================================
  Files          13       14       +1
  Lines         569      737     +168
==========================================
+ Hits          523      655     +132
- Misses         46       82      +36
==========================================
```
tests/test_web_poet_rules.py (Outdated)
```python
        has received a different type of item class from the page object.
        """
        item, deps = yield crawl_item_and_deps(ReplacedProduct)
        assert "raise UndeclaredProvidedTypeError" in caplog.text
```
This should raise a more apt exception, since `UndeclaredProvidedTypeError` primarily means that there's no provider available to supply the requested dependency (in this case, the item).
However, this scenario is mostly caused by the page object returning an incorrect class.
Perhaps a better result is raising `MalformedProvidedClassesError` or creating another exception like `IncorrectProvidedClassError`.
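To make the suggestion concrete, here's a minimal sketch of what a dedicated exception could look like. Both `IncorrectProvidedClassError` and the `expect_item_class` helper are hypothetical names for illustration, not scrapy-poet's actual API:

```python
# Hypothetical sketch, not scrapy-poet's actual API: a dedicated exception
# for the case where a page object returns an item of a class other than
# the one declared in the rule.

class IncorrectProvidedClassError(TypeError):
    """Raised when a page object returns an item whose class differs
    from the item class it was declared to return."""


def expect_item_class(item, declared_cls):
    """Return ``item`` unchanged if it matches ``declared_cls``; otherwise
    raise IncorrectProvidedClassError with a descriptive message."""
    if not isinstance(item, declared_cls):
        raise IncorrectProvidedClassError(
            f"expected an instance of {declared_cls.__name__}, "
            f"got {type(item).__name__}"
        )
    return item
```

Subclassing `TypeError` keeps the failure semantics obvious to callers that catch broad exception categories.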
Continuing today's discussion: my current implementation of a Scrapy command that takes a PO and returns its dependencies and result looks like this:

```python
po = load_object(po_name)
# here we could also support running a specific spider to apply its custom_settings etc
spider_cls = spider_for(po)
self.settings.setdict(additional_settings())
crawler = Crawler(spider_cls, self.settings)
self.crawler_process.crawl(crawler, url=url)
self.crawler_process.start()
items = crawler.spider.collected_items[0]
deps = crawler.spider.collected_response_deps[0]
```

So if we keep this implementation, it would need at least
Hi @wRAR , I moved most of the test utilities in. However, I haven't moved
kmike left a comment
👍 🎉 🚀
Any idea what's up with the CI @BurnzZ? I tried to restart jobs, but it didn't help. No logs, nothing :)
In line with the upcoming development in scrapinghub/web-poet#84.
This is built on top of #89 for now which moves away from the deprecated functionalities of web-poet as introduced in scrapinghub/web-poet#84.
This also addresses #90.
TODO:

- `to_return` parameter in web-poet's rules.
- `callback_for`
- `SCRAPY_POET_OVERRIDES` and the `Registry` with the advent of `to_return`.

NOTES:

- `scrapy_poet/utilities/` (reference), which did not have any tests to begin with. We can add tests to them in a separate PR.
- `OverridesRegistry` into `RulesRegistry`
- `SCRAPY_POET_OVERRIDES` into `SCRAPY_POET_RULES`
- `SCRAPY_POET_OVERRIDES_REGISTRY` into `SCRAPY_POET_RULES_REGISTRY`
- `web-poet` PR in "move some registry functionalities from scrapy-poet" (web-poet#112)
- `Annotated[Item, PickFields("x", "y")]` to decide which fields to populate in callback (#111)
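To make the `to_return` idea above concrete, here is a minimal stand-in sketch of how a `RulesRegistry` could map a requested item class to the page object declared to produce it. This is an illustration of the mechanism only: the `ApplyRule` dataclass and `page_cls_for_item` method here are simplified stand-ins, not web-poet's or scrapy-poet's actual implementations.

```python
# Simplified stand-ins: in web-poet, rules carry for_patterns / use /
# instead_of / to_return fields; this sketch keeps only what is needed
# to show the to_return -> page object lookup.
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class ApplyRule:
    for_patterns: str   # URL pattern the rule applies to
    use: type           # the page object class
    to_return: type     # the item class that page object produces


class RulesRegistry:
    def __init__(self, rules: Iterable[ApplyRule]):
        self._rules = list(rules)

    def page_cls_for_item(self, item_cls: type) -> Optional[type]:
        """Find the page object declared (via to_return) to produce item_cls."""
        for rule in self._rules:
            if rule.to_return is item_cls:
                return rule.use
        return None
```

With a registry like this, a callback can simply request an item class (e.g. `Product`) and the framework resolves which page object to instantiate behind the scenes.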