This Python tool extracts and downloads files from an XML-indexed bucket, a listing format commonly exposed by cloud storage services such as Amazon S3, DigitalOcean Spaces, and similar CDNs.
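
To illustrate what an XML-indexed bucket looks like, here is a minimal sketch that pulls object keys out of an S3-style ListBucketResult document. It is not taken from Xg.py: the sample listing, the function name list_keys, and the use of the standard-library xml.etree.ElementTree (instead of lxml) are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample of an S3-style bucket listing (not real data).
SAMPLE_LISTING = """<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
  <Name>example-bucket</Name>
  <Contents><Key>docs/report.pdf</Key><Size>1024</Size></Contents>
  <Contents><Key>media/clip.mp4</Key><Size>2048</Size></Contents>
</ListBucketResult>"""


def list_keys(xml_text):
    """Return every object key found in a bucket listing, in document order."""
    root = ET.fromstring(xml_text)
    keys = []
    for element in root.iter():
        # Tags look like "{namespace}Key"; compare only the local name so the
        # sketch also works for listings that use a different namespace.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "Key" and element.text:
            keys.append(element.text)
    return keys


if __name__ == "__main__":
    for key in list_keys(SAMPLE_LISTING):
        print(key)
```

Each key is a path relative to the bucket URL, so a downloader would typically join the two to build the file URL.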

Usage

Install the dependency, then run the script from your terminal:

pip install lxml

python3 Xg.py -u 'target url' -f /folder -b mov mp4 jpg jpeg png

Parameters:

-u or --url: The target URL to crawl. This should be the URL of the XML bucket you want to scrape.
-f or --folder: The folder path where you want to save the downloaded files.
-b or --blacklist: A space-separated list of file extensions to blacklist (exclude from downloading). For example, to exclude video and image files, use: mov mp4 jpg jpeg png.
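
The three options above could be declared with argparse roughly as follows. This is a sketch, not the parser used by Xg.py: only the flag names come from this README, while build_parser, the help strings, and the choice to make -u and -f required are assumptions.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the documented -u, -f and -b options."""
    parser = argparse.ArgumentParser(
        description="Download files listed in an XML bucket index."
    )
    # Assumption: the URL and folder are treated as mandatory here.
    parser.add_argument("-u", "--url", required=True,
                        help="URL of the XML bucket to crawl")
    parser.add_argument("-f", "--folder", required=True,
                        help="folder where downloaded files are saved")
    # nargs="*" collects the space-separated extensions into a list.
    parser.add_argument("-b", "--blacklist", nargs="*", default=[],
                        help="file extensions to skip, e.g. mov mp4 jpg")
    return parser


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(
    ["-u", "http://example.domain", "-f", "/path/to/save/folder", "-b", "mov", "mp4"]
)
print(args.url, args.folder, args.blacklist)
```

With nargs="*", omitting -b yields an empty list, which in this sketch means no extension is excluded.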

Example:

python3 Xg.py -u 'http://example.domain' -f /path/to/save/folder -b mov mp4 jpg jpeg png

This command will:

Crawl the XML at http://example.domain.
Save the downloaded files to /path/to/save/folder.
Exclude files with the extensions .mov, .mp4, .jpg, .jpeg, and .png.
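
The exclusion step can be pictured as a simple extension check on each key. The sketch below is an illustration under stated assumptions rather than the logic in Xg.py: the names is_blacklisted and filter_keys are hypothetical, and matching case-insensitively is a design choice made for the example.

```python
import posixpath


def is_blacklisted(key, blacklist):
    """Return True if the key's extension is in the blacklist."""
    # splitext("media/clip.MP4") gives ("media/clip", ".MP4"); normalize it
    # to "mp4" so it can be compared with entries such as "mp4" or ".mp4".
    extension = posixpath.splitext(key)[1].lstrip(".").lower()
    blocked = {ext.lstrip(".").lower() for ext in blacklist}
    return extension in blocked


def filter_keys(keys, blacklist):
    """Keep only the keys that should be downloaded."""
    return [key for key in keys if not is_blacklisted(key, blacklist)]


if __name__ == "__main__":
    sample_keys = ["a/report.pdf", "b/clip.MP4", "c/photo.jpeg", "notes.txt"]
    print(filter_keys(sample_keys, ["mov", "mp4", "jpg", "jpeg", "png"]))
```

Normalizing both sides to lowercase without the leading dot means mp4, .mp4, and MP4 are all treated as the same blacklist entry.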


RHYru9/BucketC

Folders and files

Latest commit

History

Repository files navigation

Usage

Parameters:

Example:

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages