[Feat] Remove cookie banner/pop up from screenshots #647

Closed
ifx-querido opened this issue Sep 9, 2024 · 4 comments

@ifx-querido

Problem Description
When taking a full-page screenshot, the cookie banner or privacy-settings pop-up hides a significant portion of the image.

Proposed Feature
It would be very helpful if every irrelevant pop-up could be blocked so that none appears in the screenshot and hides important content from the user.

Alternatives Considered
The only alternatives I'm aware of are outside the scope of Firecrawl and involve driver automation, for example through Selenium.
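
As a rough illustration of the driver-automation route mentioned above, the sketch below hides elements matching a few banner selectors before taking a screenshot with Selenium. It is not part of Firecrawl; the function names and the selector list are assumptions made for this example, and real sites will need their own selectors. Selenium is imported lazily so the script-building helper can be used without a browser installed.

```python
import json

# Illustrative selectors only (an assumption); real sites vary widely.
DEFAULT_BANNER_SELECTORS = [
    "#onetrust-banner-sdk",
    "#CybotCookiebotDialog",
    "[id*='cookie-banner']",
    "[class*='cookie-consent']",
]


def build_hide_script(selectors=DEFAULT_BANNER_SELECTORS):
    """Return JavaScript that hides every element matching the selectors."""
    # json.dumps yields a safely quoted JavaScript string literal.
    selector_list = json.dumps(", ".join(selectors))
    return (
        f"document.querySelectorAll({selector_list})"
        ".forEach(el => el.style.setProperty('display', 'none', 'important'));"
    )


def screenshot_without_banners(url, output_path, selectors=DEFAULT_BANNER_SELECTORS):
    """Load url in headless Chrome, hide banner elements, save a screenshot."""
    # Imported here so the helper above works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.add_argument("--headless=new")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        driver.execute_script(build_hide_script(selectors))
        # Note: in Chrome this captures the current viewport, not the full page.
        driver.save_screenshot(output_path)
    finally:
        driver.quit()
```

Hiding elements with CSS is only one option; banners injected late or rendered inside iframes would not be caught by this approach.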

Implementation Suggestions
Using some kind of cookie-blocker add-on might work, assuming Firecrawl incorporates driver automation.

Use Case
This feature can be seen as an improvement to the existing full-page screenshot format. A clean, unobstructed screenshot could be used to visually validate the extracted text content, and could also be used for some kind of AI image classification.

Additional Context
There are browser add-ons for taking full-page screenshots, but not in an automated way.

@nickscamara nickscamara changed the title [Feat]Remove cookie banner/pop up from screenshots [Feat] Remove cookie banner/pop up from screenshots Sep 9, 2024
@tomkosm
Collaborator

tomkosm commented Sep 9, 2024

Hey @ifx-querido, we already support this in the cloud version and it will soon be open source. Are you trying self-hosted or the cloud? If it's the cloud, could you provide the link so we can have a look?

@ifx-querido
Author

Hello @tomkosm, I'm not self-hosting yet but soon will be. For now I'm just experimenting with the Python SDK, using an API key on the free plan.

This is the code I'm running:

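from firecrawl import FirecrawlApp

# Placeholders (not from the original post): use your own API key and target URL.
app = FirecrawlApp(api_key="fc-YOUR-API-KEY")
website = "https://example.com"
page_content = app.scrape_url(website, {
  "formats": ['markdown', 'links', 'screenshot@fullPage'],
  "onlyMainContent": True,
})
print(page_content['screenshot'])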

The output screenshot, which shows the cookie pop-up, can be found here

@ifx-querido
Author

ifx-querido commented Sep 12, 2024

Any help would be appreciated, @tomkosm @kevinswiber @nickscamara.
