Commit 45fb08f

Author: Neagu Marinel
Merge branch 'larymak:main' into main
2 parents 1d08798 + c643b6f commit 45fb08f

3 files changed: +45 -0 lines changed

Diff for: .DS_Store

-10 KB
Binary file not shown.

Diff for: DOM EXTRACTION/README.md

+19
@@ -0,0 +1,19 @@
# DOM Extraction Script

Extract the DOM elements of a webpage efficiently.

## Installation

Use the package manager [pip](https://pip.pypa.io/en/stable/) to install the required libraries.

```bash
pip install requests beautifulsoup4

```

## Usage

```python
url = 'https://example.com'
```
Replace 'https://example.com' with the URL of the website you want to extract the DOM from.
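
Editing the source to change the URL works, but the URL could also be supplied on the command line. The following is a minimal sketch of that idea, not part of this commit: `parse_url` is a hypothetical helper, and the default value simply mirrors the one in the README.

```python
import argparse


def parse_url(argv=None):
    """Return the target URL from command-line arguments (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        description="Extract the title and links of a webpage."
    )
    parser.add_argument(
        "url",
        nargs="?",  # the argument is optional
        default="https://example.com",  # same default as in the README
        help="URL of the page to extract the DOM from",
    )
    # argv=None makes argparse read sys.argv[1:]
    return parser.parse_args(argv).url


if __name__ == "__main__":
    url = parse_url()
    print("Target URL:", url)
```

With something like this in place, the script could be invoked as `python main.py https://example.org` without touching the source.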

Diff for: DOM EXTRACTION/main.py

+26
@@ -0,0 +1,26 @@
import requests
from bs4 import BeautifulSoup

# Define the URL of the website you want to extract the DOM from
url = 'https://example.com'

response = requests.get(url)

if response.status_code == 200:
    soup = BeautifulSoup(response.text, 'html.parser')

    title = soup.title
    if title:
        print("Page Title:", title.text)
    else:
        print("No title tag found.")

    links = soup.find_all('a')
    print("Links in the page:")
    for link in links:
        print(link.get('href'))

else:
    print("Failed to retrieve the page. Status code:", response.status_code)
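
As written, the script prints `None` for anchors without an `href` and leaves relative links unresolved. One possible refinement, sketched here under assumptions, separates parsing from fetching so it can be tried on an HTML string without a network call. The function name `extract_title_and_links` and the sample markup are hypothetical and do not appear in this commit.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def extract_title_and_links(html, base_url):
    """Return (title, links) for an HTML string; links are absolute URLs."""
    soup = BeautifulSoup(html, "html.parser")
    # soup.title is None when the page has no <title> tag
    title = soup.title.get_text(strip=True) if soup.title else None
    links = []
    # href=True skips anchors that have no href attribute
    for anchor in soup.find_all("a", href=True):
        # Resolve relative hrefs against the page URL
        links.append(urljoin(base_url, anchor["href"]))
    return title, links


# Hypothetical sample page, used only for illustration
sample = """
<html><head><title>Demo</title></head>
<body><a href="/about">About</a><a name="top">No href</a>
<a href="https://example.org/x">External</a></body></html>
"""
title, links = extract_title_and_links(sample, "https://example.com/index.html")
print("Page Title:", title)
print("Links in the page:", links)
```

In the script above, `response.text` and `url` could be passed as the two arguments. Passing a `timeout` to `requests.get` would also keep the script from hanging on an unresponsive server.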
