support proxy server #70

Open
tpoiii opened this issue Apr 23, 2023 · 2 comments

tpoiii commented Apr 23, 2023

I work behind a proxy server, and urllib3.PoolManager does not automatically check the proxy environment variables.

I modified urlopen in util.py as follows, and it seems to work:

  • added import os to get access to os.getenv
  • most programs find the proxy URL in an environment variable named after the protocol with a _PROXY suffix, e.g., HTTPS_PROXY or HTTP_PROXY
  • so I get the protocol by splitting the URL on ':' and build the environment variable name from the uppercase protocol plus _PROXY
  • if a matching environment variable exists, I use it to initialize a urllib3.ProxyManager(proxy_url) in place of the PoolManager
  • otherwise, fall back to the default non-proxy behavior using the PoolManager

There are two changes in urlopen, shown below: the added import os, and the added code block that checks for the proxy (look for the # TPO comments).

def urlopen(url):
    """Wrapper to request.get() in urllib3"""

    import os  # TPO added for proxy env check
    import sys
    import urllib3
    from json import load

    # https://stackoverflow.com/a/2020083
    def get_full_class_name(obj):
        module = obj.__class__.__module__
        if module is None or module == str.__class__.__module__:
            return obj.__class__.__name__
        return module + '.' + obj.__class__.__name__

    c = " If problem persists, a contact email for the server may be listed "
    c = c + "at http://hapi-server.org/servers/"
    msg = ''
    try:
        # code block added by TPO to manage proxy
        protocol = url.split(':')[0]
        proxy_url = os.getenv(protocol.upper() + '_PROXY')
        if proxy_url:
            http = urllib3.ProxyManager(proxy_url)
        else:
            http = urllib3.PoolManager()  # original single line
        # end of code block added by TPO to manage proxy
        res = http.request('GET', url, preload_content=False, retries=2)

--snip--

Collaborator

rweigel commented Apr 25, 2023

This should be added.

I am concerned about assuming that the environment variables have the expected names, and about attempting to use a proxy when it is unavailable or not wanted.

Would a keyword option work instead? For example, the call would be

hapi(SERVER,..., proxy=PROXY)

and the user would have to provide

PROXY = {'http': os.getenv('HTTP_PROXY'), 'https': os.getenv('HTTPS_PROXY')}
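
A minimal sketch of how such a keyword could be handled, assuming the dict is keyed by lowercase scheme as in the PROXY example. The names select_proxy and make_manager are hypothetical and are not part of the existing code; only urllib3.ProxyManager and urllib3.PoolManager come from the snippet in the first comment.

```python
import os
from urllib.parse import urlsplit


def select_proxy(url, proxy=None):
    """Return the proxy URL to use for `url`, or None for a direct connection.

    `proxy` is a dict keyed by lowercase scheme. Missing keys and None values
    (e.g. from an unset environment variable) both mean "no proxy".
    """
    if not proxy:
        return None
    scheme = urlsplit(url).scheme.lower()
    return proxy.get(scheme) or None


def make_manager(url, proxy=None):
    """Hypothetical helper: build the urllib3 manager that urlopen would use."""
    import urllib3  # imported lazily so select_proxy works without urllib3

    proxy_url = select_proxy(url, proxy)
    if proxy_url:
        return urllib3.ProxyManager(proxy_url)
    return urllib3.PoolManager()


# The caller opts in explicitly; nothing is read from the environment otherwise.
PROXY = {'http': os.getenv('HTTP_PROXY'), 'https': os.getenv('HTTPS_PROXY')}
```

With this shape, omitting proxy (or passing a dict whose values are None) keeps the current direct-connection behavior, so a proxy is only used when the caller asks for one.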

Author

tpoiii commented Apr 25, 2023

That would work. We do sometimes want a way to disable the default proxy settings taken from the environment variables.

I looked into this a bit more. The older urllib does inspect the proxy environment variables. (Oddly, I now realize that it looks for the lowercase versions, http_proxy and https_proxy, which is probably why I set both on my end.)

https://docs.python.org/3/library/urllib.request.html
"In addition, if proxy settings are detected (for example, when a *_proxy environment variable like http_proxy is set), ProxyHandler is default installed and makes sure the requests are handled through the proxy."
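
For anyone who wants to check what the standard library picks up, here is a small sketch using urllib.request.getproxies(). The helper name detected_proxies and the proxy.example.com address are made up for illustration. As far as I can tell from the getproxies documentation, the scan is case-insensitive and the lowercase name is preferred when both spellings are set and disagree.

```python
import os
import urllib.request


def detected_proxies(extra_env):
    """Return the proxies urllib detects when only `extra_env` sets *_proxy vars.

    Existing *_proxy variables (any case) are removed temporarily so the result
    depends only on `extra_env`; the environment is restored afterwards.
    """
    saved = dict(os.environ)
    try:
        for name in list(os.environ):
            if name.lower().endswith('_proxy'):
                del os.environ[name]
        os.environ.update(extra_env)
        return urllib.request.getproxies()
    finally:
        os.environ.clear()
        os.environ.update(saved)


# Keys of the returned dict are lowercase scheme names.
lower = detected_proxies({'http_proxy': 'http://proxy.example.com:3128'})
upper = detected_proxies({'HTTPS_PROXY': 'http://proxy.example.com:3128'})
print(lower.get('http'), upper.get('https'))
```

The returned dict has the same shape as the PROXY dict suggested above, so it could be passed along as-is by a caller who does want the environment settings.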

urllib3 seems to be a bit of an anomaly in not doing this for you:
https://urllib3.readthedocs.io/en/stable/advanced-usage.html#proxies

In any case, making it an argument is probably the best solution, as it gives the developer fine-grained control with minimal effort.
