@@ -414,17 +414,17 @@ The spider won't work anymore after the change. The reason is that it
 is using the new base Page Objects and they are empty.
 Let's fix it by instructing ``scrapy-poet`` to use the Books To Scrape (BTS)
 Page Objects for URLs belonging to the domain ``toscrape.com``. This must
-be done by configuring ``SCRAPY_POET_OVERRIDES`` into ``settings.py``:
+be done by configuring ``SCRAPY_POET_RULES`` in ``settings.py``:

 .. code-block:: python

-    "SCRAPY_POET_OVERRIDES": [
+    "SCRAPY_POET_RULES": [
         ("toscrape.com", BTSBookListPage, BookListPage),
         ("toscrape.com", BTSBookPage, BookPage)
     ]

 The spider is back to life!
-``SCRAPY_POET_OVERRIDES`` contain rules that overrides the Page Objects
+``SCRAPY_POET_RULES`` contains rules that override the Page Objects
 used for a particular domain. In this case, the Page Objects
 ``BTSBookListPage`` and ``BTSBookPage`` will be used instead of
 ``BookListPage`` and ``BookPage`` for any request whose domain is
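The semantics of those ``(pattern, use, instead_of)`` tuples can be illustrated with a stdlib-only sketch. This is a hypothetical simplification, not scrapy-poet's actual implementation: real matching is delegated to the url-matcher library and supports far richer patterns than this domain-suffix check, and the class names are stand-ins for the tutorial's Page Objects.

```python
from urllib.parse import urlparse

# Hypothetical stand-ins for the tutorial's Page Object classes.
class BookListPage: ...
class BookPage: ...
class BTSBookListPage(BookListPage): ...
class BTSBookPage(BookPage): ...

# Same shape as the tuples in SCRAPY_POET_RULES: (pattern, use, instead_of).
RULES = [
    ("toscrape.com", BTSBookListPage, BookListPage),
    ("toscrape.com", BTSBookPage, BookPage),
]

def resolve(url, requested):
    """Return the Page Object class to use when `requested` is asked
    for on `url` (simplified suffix matching, for illustration only)."""
    netloc = urlparse(url).netloc
    for pattern, use, instead_of in RULES:
        if instead_of is requested and netloc.endswith(pattern):
            return use
    return requested  # no matching rule: fall back to the requested class
```

With these rules, asking for ``BookListPage`` on a ``toscrape.com`` URL yields ``BTSBookListPage``, while any other domain keeps the base class.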
@@ -465,16 +465,18 @@ to implement new ones:

 The last step is configuring the overrides so that these new Page Objects
 are used for the domain
-``bookpage.com``. This is how ``SCRAPY_POET_OVERRIDES`` should look like into
+``bookpage.com``. This is how ``SCRAPY_POET_RULES`` should look in
 ``settings.py``:

 .. code-block:: python

-    "SCRAPY_POET_OVERRIDES": [
-        ("toscrape.com", BTSBookListPage, BookListPage),
-        ("toscrape.com", BTSBookPage, BookPage),
-        ("bookpage.com", BPBookListPage, BookListPage),
-        ("bookpage.com", BPBookPage, BookPage)
+    from web_poet import ApplyRule
+
+    "SCRAPY_POET_RULES": [
+        ApplyRule("toscrape.com", use=BTSBookListPage, instead_of=BookListPage),
+        ApplyRule("toscrape.com", use=BTSBookPage, instead_of=BookPage),
+        ApplyRule("bookpage.com", use=BPBookListPage, instead_of=BookListPage),
+        ApplyRule("bookpage.com", use=BPBookPage, instead_of=BookPage)
     ]

 The spider is now ready to extract books from both sites 😀.
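An ``ApplyRule`` bundles the same three pieces of information as the earlier tuple form, just with explicit field names. A stdlib-only stand-in (hypothetical; the real class lives in ``web_poet`` and does more validation) makes the correspondence visible:

```python
from typing import NamedTuple, Optional, Type

class ApplyRule(NamedTuple):
    """Stdlib stand-in mirroring web_poet.ApplyRule's main fields."""
    for_patterns: str                  # URL pattern the rule applies to
    use: Type                          # Page Object to instantiate
    instead_of: Optional[Type] = None  # base Page Object being overridden

# Hypothetical stand-ins for the tutorial's Page Object classes.
class BookPage: ...
class BPBookPage(BookPage): ...

# The first positional argument is the pattern, as in the docs' examples.
rule = ApplyRule("bookpage.com", use=BPBookPage, instead_of=BookPage)
```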
@@ -490,27 +492,6 @@ for a particular domain, but more complex URL patterns are also possible.
 For example, the pattern ``books.toscrape.com/catalogue/category/``
 is accepted and it would restrict the override only to category pages.

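How a path-bearing pattern narrows a rule can be sketched with the standard library. This is a rough approximation for illustration; the real matching is delegated to url-matcher, whose pattern syntax is much richer than a prefix check.

```python
from urllib.parse import urlparse

def matches(pattern, url):
    """Rough sketch: a pattern of the form 'domain/path/prefix/'
    matches URLs on that domain whose path starts with the prefix."""
    pat_domain, _, pat_path = pattern.partition("/")
    parts = urlparse(url)
    return (parts.netloc.endswith(pat_domain)
            and parts.path.startswith("/" + pat_path))

# A category page falls under the narrower pattern...
category_url = "http://books.toscrape.com/catalogue/category/books/travel_2/"
# ...while a plain listing page does not.
listing_url = "http://books.toscrape.com/catalogue/page-1.html"
```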
-It is even possible to configure more complex patterns by using the
-:py:class:`web_poet.rules.ApplyRule` class instead of a triplet in
-the configuration. Another way of declaring the earlier config
-for ``SCRAPY_POET_OVERRIDES`` would be the following:
-
-.. code-block:: python
-
-    from url_matcher import Patterns
-    from web_poet import ApplyRule
-
-
-    SCRAPY_POET_OVERRIDES = [
-        ApplyRule(for_patterns=Patterns(["toscrape.com"]), use=BTSBookListPage, instead_of=BookListPage),
-        ApplyRule(for_patterns=Patterns(["toscrape.com"]), use=BTSBookPage, instead_of=BookPage),
-        ApplyRule(for_patterns=Patterns(["bookpage.com"]), use=BPBookListPage, instead_of=BookListPage),
-        ApplyRule(for_patterns=Patterns(["bookpage.com"]), use=BPBookPage, instead_of=BookPage),
-    ]
-
-As you can see, this could get verbose. The earlier tuple config simply offers
-a shortcut to be more concise.
-
 .. note::

     Also see the `url-matcher <https://url-matcher.readthedocs.io/en/stable/>`_
@@ -530,11 +511,11 @@ and store the :py:class:`web_poet.rules.ApplyRule` for you. All of the
     # rules from other packages. Otherwise, it can be omitted.
     # More info about this caveat on web-poet docs.
     consume_modules("external_package_A", "another_ext_package.lib")
-    SCRAPY_POET_OVERRIDES = default_registry.get_rules()
+    SCRAPY_POET_RULES = default_registry.get_rules()

 For more info on this, you can refer to these docs:

-* ``scrapy-poet``'s :ref:`overrides` Tutorial section.
+* ``scrapy-poet``'s :ref:`rules-from-web-poet` Tutorial section.
 * External `web-poet`_ docs.

 * Specifically, the :external:ref:`rules-intro` Tutorial section.
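The registry mechanics behind ``default_registry.get_rules()`` can be mimicked in a stdlib-only sketch. This is a hypothetical simplification of web-poet's ``@handle_urls`` decorator, not its actual implementation: the decorator records a rule for each Page Object it wraps, and ``get_rules()`` later hands the accumulated list to the setting.

```python
# Stdlib-only sketch of a decorator-populated rule registry
# (hypothetical simplification of web_poet.handle_urls / default_registry).
_registry = []

def handle_urls(pattern, instead_of=None):
    """Record a (pattern, use, instead_of) rule for the decorated class."""
    def decorator(cls):
        _registry.append((pattern, cls, instead_of))
        return cls
    return decorator

def get_rules():
    """Analogue of default_registry.get_rules()."""
    return list(_registry)

# Hypothetical stand-ins for the tutorial's Page Object classes.
class BookPage: ...

@handle_urls("toscrape.com", instead_of=BookPage)
class BTSBookPage(BookPage): ...

# In settings.py, the collected rules would then be assigned wholesale:
SCRAPY_POET_RULES = get_rules()
```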
@@ -545,7 +526,8 @@ Next steps
 Now that you know how ``scrapy-poet`` is supposed to work, what about trying to
 apply it to an existing or new Scrapy project?

-Also, please check the :ref:`overrides` and :ref:`providers` sections as well as
-refer to spiders in the "example" folder: https://github.com/scrapinghub/scrapy-poet/tree/master/example/example/spiders
+Also, please check the :ref:`rules-from-web-poet` and :ref:`providers` sections
+as well as refer to spiders in the "example" folder:
+https://github.com/scrapinghub/scrapy-poet/tree/master/example/example/spiders

 .. _Scrapy Tutorial: https://docs.scrapy.org/en/latest/intro/tutorial.html