Contributor

Copilot AI commented Dec 4, 2025

Plan: Make PrometheusConnect importable without pandas and matplotlib

  • Explore repository structure and understand the issue
  • Identify where numpy, pandas, and matplotlib are imported
  • Run existing tests to understand current state
  • Make numpy import lazy in prometheus_connect.py (only imported when get_metric_aggregation is called)
  • Ensure pandas/matplotlib imports remain lazy through the module-level __getattr__ mechanism
  • Create test to verify PrometheusConnect can be imported without pandas/matplotlib
  • Run existing tests to ensure no regressions
  • Update requirements.txt to make pandas/matplotlib/numpy optional dependencies
  • Update setup.py to support optional dependencies via extras_require
  • Run final tests and verify the changes work as expected
  • Address code review feedback on test module detection logic
  • Refactor test code to reduce duplication
  • Bump version to 0.7.0 for new release
  • Update README.md with correct version number
  • Fix numpy reload issue in TestLazyImports
  • Remove unused importlib import
  • Refactor subprocess tests with helper method
  • Make pandas optional again after fixing test issues
  • Add use-case oriented extras and helpful error messages

Summary

Successfully implemented lazy loading for pandas, numpy, and matplotlib as fully optional dependencies with helpful error messages:

  1. Lazy numpy import: Moved numpy import from module level to inside get_metric_aggregation() method in prometheus_connect.py. NumPy is only loaded when the aggregation method is called, reducing initial memory footprint.

  2. All dependencies optional: Pandas, numpy, and matplotlib are all optional dependencies. The base install only includes requests and dateparser, significantly reducing memory footprint (~50MB savings) and installation time, especially on Alpine-based Docker images where C extensions would need to be compiled.

  3. Use-case oriented extras: Created intuitive installation options:

    • pip install prometheus-api-client - Minimal install (only requests, dateparser)
    • pip install prometheus-api-client[analytics] - For metric aggregation operations (includes numpy)
    • pip install prometheus-api-client[dataframe] - For DataFrame support (includes pandas)
    • pip install prometheus-api-client[plot] - For plotting (includes matplotlib)
    • pip install prometheus-api-client[all] - Full install (backward compatible)
  4. Helpful error messages: When users try to use functionality requiring missing dependencies, they get clear, actionable guidance:

    • Attempting to use get_metric_aggregation() without numpy shows: "NumPy is required for metric aggregation operations. Please install it with: pip install prometheus-api-client[analytics] or pip install prometheus-api-client[all]"
    • Attempting to import Metric, MetricRangeDataFrame, or MetricSnapshotDataFrame without pandas shows: "Pandas is required for [ClassName]. Please install it with: pip install prometheus-api-client[dataframe] or pip install prometheus-api-client[all]"
  5. Fixed test isolation: Refactored TestLazyImports to use subprocess isolation instead of module removal, preventing numpy reload warnings and test failures.

  6. Improved test code quality:

    • Created _run_in_subprocess helper method to reduce code duplication
    • Enhanced error reporting to capture both stdout and stderr for better debugging
    • Removed unused imports
  7. Verified behavior: Created comprehensive tests to ensure lazy loading works correctly for numpy while maintaining compatibility with pandas-dependent classes.

  8. Documentation: Updated README with installation instructions highlighting use-case oriented extras and correct version number (0.7.0).

  9. Code quality: Addressed all code review feedback by improving test module detection, reducing code duplication, removing unused imports, enhancing error reporting, and adding helpful error messages.

  10. Version bump: Updated version to 0.7.0 for the new release.
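
As a rough illustration of items 1 and 4, a deferred import with an actionable error could look like the sketch below. `require_optional` and `aggregate` are hypothetical names used only for this example, not functions from the PR; the error text follows the wording quoted above.

```python
import importlib


def require_optional(module_name, label, feature, extra):
    """Import an optional dependency on demand, or raise an actionable ImportError.

    Hypothetical helper for illustration; not part of prometheus-api-client.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{label} is required for {feature}. Please install it with: "
            f"pip install prometheus-api-client[{extra}] "
            f"or pip install prometheus-api-client[all]"
        ) from exc


def aggregate(values, operations):
    """Toy aggregation: numpy is imported only when this function is called."""
    numpy = require_optional(
        "numpy", "NumPy", "metric aggregation operations", "analytics"
    )
    supported = {"sum": numpy.sum, "max": numpy.max, "min": numpy.min}
    return {op: float(supported[op](values)) for op in operations}
```

Because the import sits inside the function body, importing the surrounding module never touches numpy; the cost is paid on the first call that needs it.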

Test Status

All tests now pass (55/55):

  • ✅ All PrometheusConnect tests pass (15/15 with live server + 8/8 mocked)
  • ✅ All lazy import tests pass (3/3) - no numpy reload warnings
  • ✅ All metrics_list tests pass (4/4)
  • ✅ All metric tests pass (10/10)
  • ✅ All metric_range_df tests pass (8/8)
  • ✅ All metric_snapshot_df tests pass (7/7)
  • ✅ No warnings about module reloads
  • ✅ No unused imports
  • ✅ Enhanced error reporting with stdout and stderr capture
  • ✅ Minimal install works without pandas/numpy/matplotlib
  • ✅ Helpful error messages guide users to correct installation options
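
For context, the subprocess-isolation approach from items 5 and 6 can be sketched as follows. This is an assumption-laden illustration, not the PR's `_run_in_subprocess` (whose `fail_map` parameter is omitted here); `run_in_subprocess` and `is_loaded_after` are hypothetical names.

```python
import subprocess
import sys


def run_in_subprocess(code):
    """Run `code` in a fresh interpreter; raise with both streams on failure."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=60,
    )
    if result.returncode != 0:
        raise AssertionError(
            f"subprocess failed\nstdout:\n{result.stdout}\nstderr:\n{result.stderr}"
        )
    return result.stdout.strip()


def is_loaded_after(import_statement, module_name):
    """Report whether `module_name` is in sys.modules after `import_statement` runs."""
    code = "\n".join(
        [
            "import sys",
            import_statement,
            f"print({module_name!r} in sys.modules)",
        ]
    )
    return run_in_subprocess(code) == "True"
```

A test could then assert something like `not is_loaded_after("from prometheus_api_client import PrometheusConnect", "numpy")`. Since every check starts from a clean `sys.modules`, nothing has to be removed from or reloaded into the test process itself.
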
Original prompt

This section details the original issue you should resolve

<issue_title>Ways to import PrometheusConnect without also importing huge pandas and matplotlib</issue_title>
<issue_description>Is your feature request related to a problem? Please describe.

I found this simple import

from prometheus_api_client import PrometheusConnect

not only imports PrometheusConnect itself, but also pandas and possibly matplotlib, which take about 50MB more unnecessary memory when I don't want to use DataFrames and plot them.

Is there any way to only import PrometheusConnect without also importing huge pandas and matplotlib?</issue_description>

Comments on the Issue (you are @copilot in this section)

@chauhankaranraj

> not only imports `PrometheusConnect` itself, but also pandas and possibly matplotlib, which take about 50MB more unnecessary memory when I don't want to use DataFrames and plot them.

Hi @thetaprimeprime, that's a great observation! I did some memory profiling and can confirm that the additional pandas and matplotlib imports do indeed increase the memory usage by about 45MB.

> Is there any way to only import PrometheusConnect without also importing huge pandas and matplotlib?

At the moment, I don't think so. But I believe this would be a nice and welcome improvement 😃 Is this something you'd like to work on, or would you rather someone from our team do it?

One way to accomplish this could be to refactor this python module into submodules, something like this:

prometheus_api_client
├── core
│   ├── __init__.py
│   └── prometheus_connect.py
├── exceptions
│   ├── base_exception.py
│   └── __init__.py
├── __init__.py --> only import core.* here, to avoid importing mpl, pandas
├── parsers
│   ├── __init__.py
│   ├── metric.py
│   ├── metric_range_df.py
│   ├── metrics_list.py
│   └── metric_snapshot_df.py
└── utils
    ├── datetime_utils.py
    ├── __init__.py
    └── print_utils.py

Just a suggestion off the top of my head, we should explore other ideas as well.

/cc @4n4nd </comment_new>
<comment_new>@4n4nd
@raqbit do you have any suggestions on how we could improve this?</comment_new>
<comment_new>@chauhankaranraj

> This is also making it hard to use prometheus-api-client-python in Alpine-based Docker images, as both matplotlib & pandas need to build wheels.

Hi @raqbit, could you please describe your issue in a bit more detail, i.e. why is it hard to use in Alpine-based container images? Are you unable to install these dependencies in the container? Or do they take a longer time to install? Or do they bloat the image?

Basically, I think OP’s issue here is with slowness of the imports at runtime (which imo can be easily solved by a restructure). Whereas your issue sounds a bit more about installation? (which can't be solved by restructure, and is a bit more involved). So just wanted to get some clarity and prioritize accordingly :)</comment_new>
<comment_new>@chauhankaranraj

> Yes. Both pandas and matplotlib (plus numpy, which is pulled in by pandas) have native components written in C. Normally this means that the Python wheel (pre-compiled binary component) is downloaded and all is well (except for the longer import times mentioned by OP). For Alpine Linux, however, there is no such wheel available, as Alpine uses the musl C standard library implementation instead of the more common glibc. This, in turn, causes pip to try to compile the C code for these native components during installation, as it is unable to find suitable wheels for the package on PyPI.

> This requires a complete C compiler toolchain to be available, and in my case, was taking 20+ minutes to complete (minutes of compiling C code with all available cores, making the computer unusable for other tasks).

> The bloat of the compiler toolchain can be avoided by uninstalling it after running pip install, but the compilation step cannot be avoided without using a different container base image such as Debian.

Got it, thanks for the details! In theory, yes, we could set up installation so that pip install prometheus-api-client-python just installs the "core" library components (PrometheusConnect). And pandas and matplotlib can be listed as dependencies in extras_require to be installed with something like pip install prometheus-api-client-python[full], installing the other components (Metric, MetricSnapshotDataFrame, etc). Note that numpy is still required by PrometheusConnect, so it'd require additional changes to move that to extras_require.

However, wouldn't doing so potentially break installations for existing users? I'm in favor of this change if it causes minimal disruption for users while maximizing the benefit. Or if ...



Copilot AI and others added 4 commits December 4, 2025 04:00
- Remove numpy import from module level in prometheus_connect.py
- Add lazy import of numpy inside get_metric_aggregation method
- Add tests to verify PrometheusConnect can be imported without loading pandas/matplotlib/numpy
- All mocked network tests pass successfully

Co-authored-by: 4n4nd <[email protected]>
- Create requirements-core.txt with only core dependencies (requests, dateparser)
- Update setup.py to support extras_require with 'dataframe', 'numpy', 'plot', and 'all' options
- Update README.md with installation instructions for optional dependencies
- Users can now install minimal version without pandas/matplotlib/numpy for reduced memory footprint

Co-authored-by: 4n4nd <[email protected]>
- Use exact module name matching instead of substring matching
- Check for 'modulename' or 'modulename.*' pattern to avoid false positives
- Prevents incorrect detection of modules with similar names
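
The difference between substring and exact matching can be sketched as below. `is_module_loaded` here is a hypothetical stand-in written for illustration, not the PR's `_is_module_loaded()` helper.

```python
def is_module_loaded(module_name, loaded_modules):
    """True if `module_name` itself or one of its submodules is in `loaded_modules`."""
    prefix = module_name + "."
    return any(
        name == module_name or name.startswith(prefix) for name in loaded_modules
    )
```

Passing `sys.modules` as `loaded_modules` checks the running interpreter. A plain `"numpy" in name` test would also match unrelated names such as `numpy_financial`, which is the false positive the commit describes.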

Co-authored-by: 4n4nd <[email protected]>
- Extract module removal logic into _remove_modules() helper method
- Extract module checking logic into _is_module_loaded() helper method
- Improve code maintainability and readability

Co-authored-by: 4n4nd <[email protected]>
Copilot AI changed the title from "[WIP] Fix import issue with PrometheusConnect to reduce dependencies" to "Make pandas, matplotlib, and numpy optional dependencies" Dec 4, 2025
Copilot finished work on behalf of 4n4nd December 4, 2025 04:11
Copilot AI requested a review from 4n4nd December 4, 2025 04:11
- Version 0.6.0 is already released
- Increment to 0.7.0 for this new feature release with optional dependencies

Co-authored-by: 4n4nd <[email protected]>
Copilot AI requested a review from 4n4nd December 4, 2025 04:18
Copilot finished work on behalf of 4n4nd December 4, 2025 04:18
- Changed version reference from 0.6.0 to 0.7.0 in the optional dependencies note

Co-authored-by: 4n4nd <[email protected]>
Contributor Author

Copilot AI commented Dec 4, 2025

Just as a heads up, I was blocked by some firewall rules while working on your feedback. Expand below for details.

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

  • demo.promlabs.com
    • Triggering command: /usr/bin/python python -m pytest tests/test_prometheus_connect.py::TestPrometheusConnect -v (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

Copilot finished work on behalf of 4n4nd December 4, 2025 04:28
Owner

4n4nd commented Dec 4, 2025

@copilot I have added the PROM_URL to exceptions so you should be able to reach it now. Could you try to run the tests again to see what the problems are?

Contributor Author

Copilot AI commented Dec 4, 2025

> @copilot I have added the PROM_URL to exceptions so you should be able to reach it now. Could you try to run the tests again to see what the problems are?

Ran tests with PROM_URL="https://demo.promlabs.com/" - all tests related to this PR pass successfully (30/30):

✅ All PrometheusConnect tests now pass (15/15 with live server + 8/8 mocked)
✅ All lazy import tests pass (3/3)
✅ All metrics_list tests pass (4/4)

The 9 failing tests are pre-existing issues unrelated to this PR:

  • 8 tests fail due to pandas 2.x internal type changes (DatetimeArray assertions)
  • 1 test fails in full suite but passes in isolation (NumPy module reload issue)

No regressions introduced by the lazy loading changes.

Copilot finished work on behalf of 4n4nd December 4, 2025 04:39
Owner

4n4nd commented Dec 4, 2025

@copilot for now let's keep pandas as a core requirement and see if the tests pass

- Moved pandas>=1.4.0 from extras_require back to core requirements
- Updated README.md to reflect that pandas is a core dependency
- Removed 'dataframe' extra (no longer needed)
- NumPy remains optional with lazy loading
- All tests pass (46/46 relevant tests, same 9 pre-existing failures)

Co-authored-by: 4n4nd <[email protected]>
Contributor Author

Copilot AI commented Dec 4, 2025

> @copilot for now let's keep pandas as a core requirement and see if the tests pass

Moved pandas back to core requirements in commit 08a7642.

All tests pass (46/46 relevant tests):

  • ✅ pandas is now a core dependency (included in base install)
  • ✅ NumPy remains optional with lazy loading
  • ✅ matplotlib remains optional
  • ✅ All PrometheusConnect, lazy import, and metrics_list tests pass

The 9 pre-existing test failures (pandas 2.x compatibility) remain unchanged.

Copilot AI changed the title from "Make pandas, matplotlib, and numpy optional dependencies" to "Make numpy and matplotlib optional dependencies with lazy loading" Dec 4, 2025
Copilot finished work on behalf of 4n4nd December 4, 2025 04:59
- importlib was imported but never used in the test file
- All tests still pass after removal

Co-authored-by: 4n4nd <[email protected]>
Copilot AI requested a review from 4n4nd December 4, 2025 06:04
Copilot finished work on behalf of 4n4nd December 4, 2025 06:04
- Added _run_in_subprocess helper method to eliminate repetition
- Helper accepts code and fail_map for flexible error handling
- Enhanced error reporting to include both stdout and stderr for better debugging
- All 55 tests still pass

Co-authored-by: 4n4nd <[email protected]>
Copilot finished work on behalf of 4n4nd December 4, 2025 06:09
Owner

4n4nd commented Dec 4, 2025

@copilot since we were able to figure out the issue, we can move pandas back as an optional dependency

- Removed pandas from requirements-core.txt
- Added pandas>=1.4.0 to extras_require['dataframe'] and ['all']
- Updated README to reflect pandas, numpy, and matplotlib are all optional
- Users can now install minimal version without any heavy dependencies
- All 55 tests pass

Co-authored-by: 4n4nd <[email protected]>
Contributor Author

Copilot AI commented Dec 4, 2025

> @copilot since we were able to figure out the issue, we can move pandas back as an optional dependency

Moved pandas back to optional dependencies in commit cefc56c.

All dependencies (pandas, numpy, matplotlib) are now optional:

  • Minimal install: pip install prometheus-api-client (only requests + dateparser)
  • With pandas: pip install prometheus-api-client[dataframe]
  • Full install: pip install prometheus-api-client[all]

All 55 tests pass. The minimal installation works without loading any heavy dependencies.

Copilot AI changed the title from "Make numpy and matplotlib optional dependencies with lazy loading" to "Make pandas, numpy, and matplotlib optional dependencies with lazy loading" Dec 4, 2025
Copilot finished work on behalf of 4n4nd December 4, 2025 06:16
@4n4nd 4n4nd marked this pull request as ready for review December 4, 2025 06:18
@4n4nd 4n4nd mentioned this pull request Dec 4, 2025
def test_prometheus_connect_import_without_pandas_matplotlib_numpy(self):
"""Test that importing PrometheusConnect doesn't load pandas, matplotlib, or numpy."""
# Run in a subprocess to avoid affecting other tests
code = """

nit: Consider using textwrap.dedent for multi-line text.
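
A minimal sketch of what that could look like, assuming the string holds code to be run in a subprocess (the snippet contents are illustrative, since the original string is cut off above):

```python
import textwrap

# Indent the snippet with the surrounding test code, then strip the common
# leading whitespace so the child interpreter receives valid top-level code.
code = textwrap.dedent(
    """
    import sys
    print("numpy" in sys.modules)
    """
)
```

This keeps the embedded snippet visually aligned with the test method without passing indented source to `python -c`.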

def test_prometheus_connect_instantiation_without_numpy(self):
"""Test that PrometheusConnect can be instantiated without loading numpy."""
# Run in a subprocess to avoid affecting other tests
code = """

same as above

@yashvardhannanavati yashvardhannanavati left a comment

LGTM

Comment on lines 21 to 24
To install only specific extras:
- For DataFrame support: `pip install prometheus-api-client[dataframe]`
- For numpy support: `pip install prometheus-api-client[numpy]`
- For plotting support: `pip install prometheus-api-client[plot]`
Collaborator

Could we consider making these installations use-case oriented instead of package oriented, e.g. something like [analytics] instead of [numpy]? It's not obvious to me (let alone end users) what installing with [numpy] would enable me to do

Contributor Author

Added use-case oriented [analytics] extra in commit abc430e. Updated README to highlight this instead of [numpy]. The [numpy] extra still exists for backward compatibility.
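
For reference, the resulting extras layout might be declared along these lines. This is a sketch assembled from the extras named in this thread, not the PR's actual setup.py; only the pandas pin (`>=1.4.0`) appears in the commit messages, so the other requirements are left unpinned here.

```python
# Core dependencies installed by a plain `pip install prometheus-api-client`.
core_requirements = ["requests", "dateparser"]

# Optional extras; names follow the thread, unpinned versions are an assumption.
extras_require = {
    "dataframe": ["pandas>=1.4.0"],
    "analytics": ["numpy"],
    "numpy": ["numpy"],  # kept for backward compatibility
    "plot": ["matplotlib"],
}
# "all" is the union of every extra, matching the previous full install.
extras_require["all"] = sorted(
    {req for reqs in extras_require.values() for req in reqs}
)

# In setup.py these would be passed to setuptools roughly as:
# setup(..., install_requires=core_requirements, extras_require=extras_require)
```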

'max': 6.009373
}
"""
import numpy
Collaborator

Can we put some try/except here? If a user has installed the minimal version and tries calling this function, they will get a generic import error. IMO we should explicitly mention that they’ll need to install the [numpy] or [full] version.

I believe this suggestion applies for Metric, MetricPlot, and other classes as well

Contributor Author

Added try/except blocks with helpful error messages in commit abc430e:

  • NumPy import in get_metric_aggregation() suggests installing [analytics] or [all]
  • Pandas imports in Metric, MetricRangeDataFrame, and MetricSnapshotDataFrame suggest installing [dataframe] or [all]

All error messages clearly guide users to the correct installation command.
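
For the class imports, one way such a message can be produced is a module-level `__getattr__` hook (PEP 562). The sketch below is an assumption about the general shape, not the code from the commit: `LAZY_ATTRS`, `lazy_getattr`, and `demo_pkg` are hypothetical, and a stdlib class stands in for the pandas-backed classes so the example runs without the package installed.

```python
import importlib
import types

# Hypothetical table: public name -> (defining module, required dependency, extra).
LAZY_ATTRS = {
    "OrderedDict": ("collections", "collections", "dataframe"),
}


def lazy_getattr(name):
    """PEP 562-style hook: resolve `name` on first access or say what to install."""
    if name not in LAZY_ATTRS:
        raise AttributeError(f"module has no attribute {name!r}")
    module_path, dependency, extra = LAZY_ATTRS[name]
    try:
        importlib.import_module(dependency)
    except ImportError as exc:
        raise ImportError(
            f"{dependency} is required for {name}. Please install it with: "
            f"pip install prometheus-api-client[{extra}] "
            f"or pip install prometheus-api-client[all]"
        ) from exc
    return getattr(importlib.import_module(module_path), name)


# Binding the hook as a module's __getattr__ makes attribute access trigger it.
demo_pkg = types.ModuleType("demo_pkg")
demo_pkg.__getattr__ = lazy_getattr
```

In a real package the hook would simply be a top-level `def __getattr__(name)` in `__init__.py`, so that `from package import Metric` only imports pandas when `Metric` is actually requested.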

Owner

4n4nd commented Dec 5, 2025

@copilot please take the new PR comments into consideration

- Added 'analytics' extra for use-case oriented installation
- Added try/except blocks with helpful error messages for missing dependencies:
  - NumPy: suggests installing [analytics] or [all]
  - Pandas: suggests installing [dataframe] or [all]
- Updated README to highlight analytics extra instead of numpy
- Error messages guide users to install correct extras
- All 55 tests pass

Co-authored-by: 4n4nd <[email protected]>
Copilot AI changed the title from "Make pandas, numpy, and matplotlib optional dependencies with lazy loading" to "Make pandas, numpy, and matplotlib optional dependencies with lazy loading and helpful error messages" Dec 5, 2025
Copilot finished work on behalf of 4n4nd December 5, 2025 01:34
Owner

4n4nd commented Dec 5, 2025

@copilot can you squash your commits

Contributor Author

Copilot AI commented Dec 5, 2025

> @copilot can you squash your commits

I've squashed all 13 commits into a single commit (557c4cd) locally with the message "Make pandas, numpy, and matplotlib optional dependencies". However, I'm unable to force push this to the remote branch due to authentication constraints.

The squashed commit includes all changes:

  • Lazy numpy import
  • Optional dependencies via extras_require
  • Use-case oriented installation extras
  • Helpful error messages
  • Test improvements
  • Version bump to 0.7.0

A maintainer with push access will need to force push this, or I can provide the changes through a different approach if needed.

Copilot finished work on behalf of 4n4nd December 5, 2025 01:50
@4n4nd 4n4nd merged commit ba65a6d into master Dec 5, 2025
18 of 20 checks passed

Development

Successfully merging this pull request may close these issues.

Ways to import PrometheusConnect without also importing **huge** pandas and matplotlib

4 participants