The Iceberg Table Analysis CLI Tool evaluates Iceberg tables to identify how Upsolver optimizations can enhance efficiency.
It presents a side-by-side comparison of current metrics against potential improvements in scan durations, file counts, and file sizes, providing a straightforward assessment of optimization opportunities.
iceberg-diag can be installed using either Brew or pip, as detailed below:
- Python 3.8 or higher is required. Verify your Python installation:
python3 --version
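The version check above can also be scripted. The following is a minimal sketch, assuming a POSIX-compatible shell; `version_at_least_3_8` is a hypothetical helper name introduced here for illustration and is not part of iceberg-diag.

```shell
# Returns success (0) when the given "major.minor[.patch]" version is at least 3.8.
# version_at_least_3_8 is a hypothetical helper, not part of iceberg-diag.
version_at_least_3_8() {
    major=${1%%.*}      # text before the first dot
    rest=${1#*.}        # text after the first dot
    minor=${rest%%.*}   # text before the next dot, if any
    [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 8 ]; }
}

# "python3 --version" prints something like "Python 3.11.4"; keep only the number.
if command -v python3 >/dev/null 2>&1; then
    installed=$(python3 --version 2>&1 | awk '{print $2}')
    if version_at_least_3_8 "$installed"; then
        echo "Python $installed is new enough"
    else
        echo "Python $installed is too old; 3.8 or higher is required"
    fi
else
    echo "python3 not found on PATH"
fi
```

The helper compares only the major and minor components, which is enough for the 3.8 requirement stated above.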
To install iceberg-diag using pip, first make sure you have the latest version of pip:
pip install --upgrade pip
Then install the package with pip:
pip install iceberg-diag
Run the following commands to install iceberg-diag via Brew:
brew tap upsolver/iceberg-diag
brew install iceberg-diag
Usage:
iceberg-diag [options]
- -h, --help: Display the help message and exit.
- --profile PROFILE: Set the AWS credentials profile for the session. Defaults to the environment's current settings.
- --region REGION: Set the AWS region for operations. Defaults to the specified profile's default region.
- --database DATABASE: Set the database name. Lists all available Iceberg tables if no --table-name is provided.
- --table-name TABLE_NAME: Enter the table name or a glob pattern (e.g., '*', 'tbl_*').
- --remote: Enable remote diagnostics by sending data to the Upsolver API for processing. Provides more detailed analytics and includes information about file size reductions.
- -v, --verbose: Enable verbose logging.
- Displaying help information:
iceberg-diag --help
- Listing all available databases in a profile:
iceberg-diag --profile <profile>
- Listing all available Iceberg tables in a given database:
iceberg-diag --profile <profile> --database <database>
- Running diagnostics completely locally, in a specific AWS profile and region, on every table in a database (the '*' pattern matches all tables; pass a table name to target a single table):
iceberg-diag --profile <profile> --region <region> --database <database> --table-name '*'
- Running diagnostics on tables matching 'prod_*' using the --remote option:
iceberg-diag --profile <profile> --database <database> --table-name 'prod_*' --remote
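To run the same diagnostics against several databases, the commands above can be generated in a loop. This is a minimal sketch, assuming a POSIX-compatible shell. `diag_command` is a hypothetical helper, and the profile and database names are placeholders. It uses only the --profile, --database, and --table-name flags listed in the options, and it prints the commands rather than executing them.

```shell
# Print the iceberg-diag command line for one database.
# diag_command is a hypothetical helper; it only uses flags documented above.
diag_command() {
    profile=$1
    database=$2
    pattern=$3
    printf "iceberg-diag --profile %s --database %s --table-name '%s'\n" \
        "$profile" "$database" "$pattern"
}

# Placeholder values: replace with your own AWS profile and database names.
PROFILE="my-profile"
DATABASES="sales marketing"

for db in $DATABASES; do
    # Dry run: only prints the command for each database.
    diag_command "$PROFILE" "$db" 'prod_*'
done
```

Printing first makes it easy to review the generated commands before running them by hand or from a script.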