"Continue where you left off" feature is not working robustly #5

@AriaShishegaran

Description

I believe there's a problem when crawling massive documentation portals. When the crawler hits a condition it cannot handle and exits suddenly, the continuation task starts again from the beginning rather than resuming intelligently, so all previously crawled pages are checked again before any new downloads begin. This is my constant experience with Stripe's documentation portal: every time, after around an hour, the crawler fails to move forward, and there is no option to resume exactly where you left off.
My suggestion is to switch the default behavior to continue where you left off, without re-checking previous pages. I assume the current implementation also detects whether existing pages have changed and applies those changes to the existing files, instead of sticking to the current state of progress and simply finishing the crawl job, which takes a massive amount of time.
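
To make the suggestion concrete, here is a minimal sketch of what "continue where you left off" could look like. It is not this project's actual code: `crawl`, `load_state`, `save_state`, `fetch_links`, and the JSON checkpoint layout are all hypothetical names I made up for illustration. The idea is to persist both the visited set and the pending queue after every page, so that a restart picks up at the page that failed instead of walking through everything again.

```python
import json
import os
import tempfile
from collections import deque


def load_state(path, start_url):
    """Return (visited, pending) from a checkpoint file, or a fresh state."""
    if os.path.exists(path):
        with open(path, encoding="utf-8") as f:
            data = json.load(f)
        return set(data["visited"]), deque(data["pending"])
    return set(), deque([start_url])


def save_state(path, visited, pending):
    """Write the checkpoint atomically so a crash cannot leave a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump({"visited": sorted(visited), "pending": list(pending)}, f)
    os.replace(tmp, path)


def crawl(start_url, fetch_links, state_path):
    """Crawl breadth-first, resuming from state_path if a checkpoint exists.

    fetch_links(url) is a hypothetical callable that downloads one page and
    returns the URLs it links to. It may raise; the checkpoint then still
    lists that URL as pending, so the next run retries it first.
    """
    visited, pending = load_state(state_path, start_url)
    while pending:
        url = pending[0]  # peek only: the URL stays queued until it is done
        if url not in visited:
            links = fetch_links(url)
            visited.add(url)
            # Linear membership test on the deque keeps the sketch short;
            # a real crawler would track queued URLs in a set as well.
            pending.extend(
                link for link in links
                if link not in visited and link not in pending
            )
        pending.popleft()
        save_state(state_path, visited, pending)
    return visited
```

With something like this, re-running `crawl(...)` with the same `state_path` after a crash skips every page already recorded as visited and starts at the first pending URL. Detecting changes on already-downloaded pages could then be a separate, opt-in mode rather than the default.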
