NIFI-4199: Consistent proxy support across components #2704
@ijokarumawak I haven't had a chance to take a look at this, but have you tried it against Solr and Elastic yet? I think the latter's APIs do their own proxy management, so that might need a little finessing here.
It would be very nice if the initial proxy service also included a SOCKS proxy example. Other processors that implement the Proxy Service could then reuse the existing implementation even better. For example, we would probably implement that change for the SFTP processors then.
How will this work with the AWS components? They have proxy support as well (although there is a PR for full support), but with a different builder, I think.
@ottobackwards I assume you were talking about #2016. That one adds user/password for proxy authentication to the abstract AWS processor. This PR adds ProxyConfigurationService, which can be added on top of #2016 so that proxy configuration for the AWS processors is managed by the centralized Controller Service. Please look at the FTP and HTTP processors in this PR; the AWS ones can adopt the CS the same way.
@jugi92 FTPTransfer supports SOCKS proxies, specifically at these lines:
https://github.com/apache/nifi/pull/2704/files#diff-6e7e715d42f332cbe404edd9afbcaafaL533
For processors that don't support SOCKS proxies, validation code should be added to their customValidate method to confirm that ProxyConfigurationService is configured with a supported proxy type.
ProxyConfigurationService just holds the centralized proxy settings; each processor is responsible for applying them in whatever way its underlying SDK/API requires. I checked #2018, but that PR doesn't look active. I will take a closer look at the SFTP processors and #2018 to see if I can include the SFTP ones in this PR, too.
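As a rough illustration of the validation idea in the comment above, here is a minimal, self-contained Java sketch. It deliberately does not use the NiFi API: the class name ProxyTypeValidationSketch, the validate method, and the processor name are hypothetical, and java.net.Proxy.Type stands in for whatever proxy type the service exposes. The real check would live in a processor's customValidate and return NiFi validation results rather than strings.

```java
import java.net.Proxy;
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the check discussed above; not NiFi code.
final class ProxyTypeValidationSketch {

    /** Returns error messages; an empty list means the configured type is acceptable. */
    static List<String> validate(Proxy.Type configured, Set<Proxy.Type> supported, String processorName) {
        List<String> errors = new ArrayList<>();
        // DIRECT means "no proxy", which any processor can handle.
        if (configured != Proxy.Type.DIRECT && !supported.contains(configured)) {
            errors.add(processorName + " does not support proxy type " + configured
                    + "; supported types: " + supported);
        }
        return errors;
    }

    public static void main(String[] args) {
        // Assumption: a processor whose underlying library only handles HTTP proxies.
        Set<Proxy.Type> httpOnly = EnumSet.of(Proxy.Type.HTTP);
        System.out.println(validate(Proxy.Type.HTTP, httpOnly, "ExampleProcessor"));
        System.out.println(validate(Proxy.Type.SOCKS, httpOnly, "ExampleProcessor"));
    }
}
```

Each processor would declare its own supported set, so a centrally configured SOCKS proxy is rejected at validation time instead of failing at runtime.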
@MikeThomsen We can combine ProxyConfigurationService with ES or Solr; the CS just lets users manage proxy settings in a centralized place. I will take a look at #2094 to see how I can help review that one. Thanks.
This PR now includes the SFTP processors, with SOCKS proxy support for SFTP as well.
The Elasticsearch processors are also included in this PR now.
    break;
case SOCKS:
    final ProxySOCKS5 proxySOCKS5 = new ProxySOCKS5(proxyConfig.getProxyServerHost(), proxyConfig.getProxyServerPort());
    session.setProxy(proxySOCKS5);
Can we add:
if (proxyConfig.hasCredential()) {
    socksProxy.setUserPasswd(proxyConfig.getProxyUserName(), proxyConfig.getProxyUserPassword());
}
@jugi92 Thanks! I actually didn't know the SOCKS protocol supports user authentication. It seems only the SFTP processors (thanks to the underlying jsch) support that. I've added your suggestion and confirmed it against a Dante SOCKS server with authentication.
https://gist.github.com/ijokarumawak/b3a31378bdc0a6c6b9922a138e9ec9c1
I will update this PR shortly.
@ijokarumawak I'm talking about passing around an HttpClientBuilder when not everyone uses that.
@ottobackwards Are you talking about this code specifically?
Then yes, that util method accepts an HttpClientBuilder and is useful only for processors that use the HttpClient library. It's currently used only by GetHTTP and PostHTTP; it's just a convenience method for those two for now. Other processors that don't use HttpClient use ProxyConfiguration directly to get the proxy settings, as AbstractAWSProcessor does in this PR.
Does that answer your question?
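To make the "use the proxy settings directly" idea concrete, here is a small sketch of a settings holder that a processor could translate into whatever its own client library expects. It is not the ProxyConfiguration class from this PR: ProxySettingsSketch and toJavaProxy are hypothetical names, and java.net.Proxy is used only as one example of a target representation.

```java
import java.net.InetSocketAddress;
import java.net.Proxy;

// Hypothetical holder for centrally managed proxy settings; not NiFi code.
final class ProxySettingsSketch {
    private final Proxy.Type type;
    private final String host;
    private final int port;

    ProxySettingsSketch(Proxy.Type type, String host, int port) {
        this.type = type;
        this.host = host;
        this.port = port;
    }

    /** Converts the settings to a java.net.Proxy for libraries that accept one. */
    Proxy toJavaProxy() {
        if (type == Proxy.Type.DIRECT) {
            return Proxy.NO_PROXY;
        }
        // createUnresolved avoids a DNS lookup when the settings are read.
        return new Proxy(type, InetSocketAddress.createUnresolved(host, port));
    }

    public static void main(String[] args) {
        // Host and port below are placeholder values.
        ProxySettingsSketch settings = new ProxySettingsSketch(Proxy.Type.HTTP, "proxy.example.com", 3128);
        System.out.println(settings.toJavaProxy().type());
    }
}
```

A library with its own builder (HttpClient, the AWS SDK, jsch) would read the same host, port, and type and feed them to that builder instead, which is why a shared HttpClientBuilder helper only suits some processors.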
This PR now also includes the AWS-related processors. I've tested that the following processors can use HTTP forward proxies and support authentication:
I've summarized the current capabilities in this PR's description. Please check the table. We could keep expanding the list of processors, but I'd like to stop here and finish reviewing these processors as the first phase.
- Added ProxyConfigurationService to manage centralized proxy configurations
- Adopt ProxyConfigurationService at FTP and HTTP processors
This closes apache#2018.
Signed-off-by: Koji Kawamura <[email protected]>
- Fixed check style issue
- Use the same proxy related PropertyDescriptors from FTPTransfer and SFTPTransfer
- Dropped FlowFile EL evaluation support to make it align with other processors spec, Now it supports VARIABLE_REGISTRY
- Added ProxyConfigurationService to SFTP processors
- Added SOCKS proxy support to SFTP processors
…ssors
- ElasticsearchHttp processors now support SOCKS proxy, too
- Added proxy support to PutElasticsearchHttpRecord
- Moved more common property descriptors to AbstractElasticsearchHttpProcessor and just return static unmodifiable property descriptor list at each implementation processors
NIFI-4196 - Fix jUnit errors
This closes apache#2016.
Signed-off-by: Koji Kawamura <[email protected]>
- Applied ProxyConfigService to S3 processors
- Added proxy support to following processors:
  - PutKinesisFirehose, PutKinesisStream
  - PutDynamoDB, DeleteDynamoDB, GetDynamoDB
  - PutKinesisStream
- All AWS processors support HTTP proxy now
@ijokarumawak Thanks for this PR, this is a much needed approach to Proxy configuration.
I have reviewed your changes with respect to the AWS processors only. The code looks good, I recommend only a few minor tweaks. I tested a flow with some S3 processors using the StandardProxyConfigurationService, the separate AWS processor PROXY_HOST properties, and no proxy. Everything worked fine in my tests.
@@ -311,5 +312,10 @@ public void testGetPropertyDescriptors() throws Exception {
        assertTrue(pd.contains(ListS3.PREFIX));
        assertTrue(pd.contains(ListS3.USE_VERSIONS));
        assertTrue(pd.contains(ListS3.MIN_AGE));
        assertTrue(pd.contains(ProxyConfigurationService.PROXY_CONFIGURATION_SERVICE));
        assertTrue(pd.contains(ListS3.PROXY_HOST));
Minor tweak: The check for PROXY_HOST and PROXY_HOST_PORT duplicates checks above on lines 309-310. I believe this is why we add 5 lines of new assertions, but the count of property descriptors only goes up by 3 from 17 to 20. It doesn't make any difference, really, but the math was bothering me.
@jvwing Good catch, thanks. I removed the duplicated PROXY_HOST and PROXY_HOST_PORT. The missing one was LIST_TYPE. So, 17 + 3 = 20. 3 additions are PROXY_USER, PROXY_PASS and LIST_TYPE.
@@ -92,6 +94,23 @@
            .addValidator(StandardValidators.PORT_VALIDATOR)
            .build();

    public static final PropertyDescriptor PROXY_USERNAME = new PropertyDescriptor.Builder()
            .name("Proxy Username")
I recommend a separate name vs. displayName for PROXY_USERNAME.
I will update it.
            .build();

    public static final PropertyDescriptor PROXY_PASSWORD = new PropertyDescriptor.Builder()
            .name("Proxy Password")
I recommend a separate name vs. displayName for PROXY_PASSWORD.
I will update it.
- Each processor has different supporting Proxy specs
- Show supported spec to ProxyConfigurationService property doc
- Validate not only Proxy type, but also with Authentication
- Fixed TestListS3 property descriptor check
- Separate name and displayName
I'm really going to stop updating this PR now. No more additions from my side. Let's wrap this up. Thanks for reviewing!
Should the tests for InvokeHTTP be updated to test with the changes?
@ijokarumawak I'm going to start reviewing this. Once we get this done, I could use a hand with a review of a lookup service I wrote, which I'm partly holding back so I can do its proxy support via your changes here.
+1 LGTM. 2/3 builds pass so everything looks in order. As mentioned, this ticket may require ongoing refactoring of additional components, so since LGTM now we'll merge so we can keep scope creep down.
@trixpan @ijokarumawak There are two other tickets referenced, 4196 and 4175(?), in the commit list for this PR. Before I keep squashing, I want to confirm that you want me to keep going and put 3 "This closes #ABCD" statements in there to close this, 4196 and 4175.
@MikeThomsen Thanks for merging this. Although my original intent was to keep the commits made by @trixpan separate (not squashed) to retain his credit, it looks good to me because the original PRs 4196 and 4175 are closed as I expected. @trixpan Thanks again for originating these improvements!
I found a bug in the AWS implementation. I am not sure how you would see it in the other processors; I found it when bringing this code into my Gateway API PR. The issue is that customValidate checks that host and port are both set, but not that user and password are both set. Since I test for this (from the InvokeHttp testProxy), my test fails.
Once I prove out my fix and update my PR, I guess I'll do a PR against master with that fix?
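The missing check described in the comment above could look roughly like the following sketch. It is not the fix that was eventually submitted: ProxyCredentialValidationSketch, its validate method, and the error messages are hypothetical, and plain strings stand in for NiFi property values and validation results.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical paired-property check; not the actual AWS processor code.
final class ProxyCredentialValidationSketch {

    private static boolean isSet(String value) {
        return value != null && !value.trim().isEmpty();
    }

    /** Returns error messages; an empty list means the combination is consistent. */
    static List<String> validate(String host, Integer port, String user, String password) {
        List<String> errors = new ArrayList<>();
        boolean hostSet = isSet(host);
        boolean portSet = port != null;
        boolean userSet = isSet(user);
        boolean passwordSet = isSet(password);

        // The check the comment says already existed: host and port go together.
        if (hostSet != portSet) {
            errors.add("Proxy host and port must be set together");
        }
        // The check the comment says was missing: user and password go together.
        if (userSet != passwordSet) {
            errors.add("Proxy username and password must be set together");
        }
        // Assumption: credentials without a proxy host are also a configuration mistake.
        if ((userSet || passwordSet) && !hostSet) {
            errors.add("Proxy credentials require a proxy host");
        }
        return errors;
    }

    public static void main(String[] args) {
        // A username without a password should produce one error.
        System.out.println(validate("proxy.example.com", 3128, "alice", null));
    }
}
```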
This PR adds ProxyConfigService to the following processors. However, due to restrictions in the underlying libraries, SOCKS and SOCKS+Auth are only partially supported.