3 files changed: +45 -0 lines changed
+ # DOM Extraction Script
+
+ Extract the DOM elements of a webpage efficiently.
+
+ ## Installation
+
+ Use the package manager [pip](https://pip.pypa.io/en/stable/) to install the required libraries.
+
+ ```bash
+ pip install requests beautifulsoup4
+
+ ```
+
+ ## Usage
+
+ ```python
+ url = 'https://example.com'
+ ```
+ Replace 'https://example.com' with the URL of the website you want to extract the DOM from.
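When swapping in a different site, the value assigned to `url` needs to be a full address including the scheme: `requests` rejects a bare `example.com` because no scheme is supplied. The following is a minimal sketch, using only the standard library, of a check that could be run before the request. `is_http_url` is a hypothetical helper name and is not part of the script in this change.

```python
from urllib.parse import urlparse


def is_http_url(url: str) -> bool:
    """Return True if url has an http/https scheme and a host part."""
    # Strip stray surrounding whitespace before parsing.
    parsed = urlparse(url.strip())
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


if __name__ == "__main__":
    # Illustrative candidates only; replace with the URL you intend to fetch.
    for candidate in ("https://example.com", "example.com", "ftp://example.com"):
        print(candidate, "->", is_http_url(candidate))
```

A check like this only validates the shape of the address; it does not confirm that the site exists or responds.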
+ import requests
+ from bs4 import BeautifulSoup
+
+ # Define the URL of the website you want to extract the DOM from
+ url = 'https://example.com'
+
+ response = requests.get(url)
+
+ if response.status_code == 200:
+     soup = BeautifulSoup(response.text, 'html.parser')
+
+
+     title = soup.title
+     if title:
+         print("Page Title:", title.text)
+     else:
+         print("No title tag found.")
+
+
+     links = soup.find_all('a')
+     print("Links in the page:")
+     for link in links:
+         print(link.get('href'))
+
+ else:
+     print("Failed to retrieve the page. Status code:", response.status_code)
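As written, the link loop prints each `href` value exactly as it appears in the page, so relative paths stay relative and anchors without an `href` attribute print `None`. A possible post-processing step is sketched below, assuming the raw values have first been collected into a list. `absolute_links` is a hypothetical helper, not part of this change, and it relies only on `urllib.parse.urljoin` from the standard library.

```python
from typing import Iterable, List, Optional
from urllib.parse import urljoin


def absolute_links(hrefs: Iterable[Optional[str]], base_url: str) -> List[str]:
    """Resolve raw href values against base_url, skipping missing or empty ones."""
    resolved = []
    for href in hrefs:
        if not href:  # None (no href attribute) or an empty string
            continue
        resolved.append(urljoin(base_url, href.strip()))
    return resolved


if __name__ == "__main__":
    # Hypothetical sample values standing in for
    # [link.get('href') for link in links] from the script.
    sample = ["/about", None, "https://www.iana.org/domains/example", "contact.html"]
    for link in absolute_links(sample, "https://example.com/"):
        print(link)
```

Keeping this step separate from the fetch means it can be exercised on plain lists of strings without any network access.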