CSHARP-734: SOCKS5 Proxy Support #1731

Open
wants to merge 67 commits into
base: main

Contributor

@papafe papafe commented Jul 16, 2025

No description provided.

@papafe papafe added the feature label Jul 16, 2025
@papafe papafe requested a review from sanych-sun August 12, 2025 14:54
@papafe papafe marked this pull request as ready for review August 12, 2025 14:54
@papafe papafe requested a review from a team as a code owner August 12, 2025 14:54
}

_proxyHost = value;
if (_proxyHost.Length == 0)
Member

string.IsNullOrEmpty(_proxyHost)?

Contributor Author

@papafe papafe Aug 13, 2025

If we reach this point, we have already validated the options with a regex, so the strings cannot be null.
They could be empty, though, if the connection string looks something like this (with empty spaces):
....proxyHost= &proxyPort=2020.
So ideally we should only check that it is not empty. We could still use IsNullOrEmpty for readability, but it's not necessary here.

This also made me realise that we should just check for null a couple of lines earlier, so I removed IsNullOrEmpty and added a test that verifies it.

Member

Makes sense, I would probably prefer comparing to String.Empty then, but I'm not insisting on this.

Contributor

And I would prefer if (_proxy == "") LOL

}

var proxyPortValue = ParseInt32(name, value);
if (proxyPortValue is < 0 or > 65535)
Member

Is 0 a legit value here?

Contributor Author

It's a valid port number, but usually the port numbers less than 1024 are reserved. I tried to be more relaxed here with the validation, just excluding values that are definitely not valid.

Member

0 is probably the wrong value for the port, as it means "random available port" as far as I know.

Contributor Author

I think you're right :)

@@ -374,6 +374,10 @@ public void When_nothing_is_specified(string connectionString)
subject.MaxPoolSize.Should().Be(null);
subject.MinPoolSize.Should().Be(null);
subject.Password.Should().BeNull();
subject.ProxyHost.Should().Be(null);
Member

BeNull()?

Contributor Author

@papafe papafe Aug 13, 2025

So... BeNull can't be used with proxyPort because of the type constraints on the fluent assertions, but I'll change the others.

return new Hasher()
.Hash(Username)
.Hash(Password)
.GetHashCode();
Member

This class looks immutable, we can calculate the hashcode once.

Contributor Author

I agree we could, but I suppose this method is going to be called almost never. Do you think it's worth caching it versus keeping the code shorter?

Member

If this class is not used as a key (or part of the key) in some dictionaries, then probably it's OK to keep the code as is.

Contributor Author

Yes, it should not be used in that way.
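
For reference, the caching option discussed above could look roughly like the sketch below. It is only an illustration: UsernamePasswordSettings is a hypothetical stand-in for the class in the PR, and System.HashCode.Combine is swapped in for the driver's Hasher helper so that the snippet is self-contained.

```csharp
using System;

// Hypothetical stand-in for the immutable settings class under review.
public sealed class UsernamePasswordSettings
{
    // Computed once in the constructor; valid only because the class is immutable.
    private readonly int _hashCode;

    public UsernamePasswordSettings(string username, string password)
    {
        Username = username;
        Password = password;
        // System.HashCode is used here instead of the driver's Hasher (assumption).
        _hashCode = HashCode.Combine(username, password);
    }

    public string Username { get; }
    public string Password { get; }

    public override bool Equals(object obj) =>
        obj is UsernamePasswordSettings other &&
        Username == other.Username &&
        Password == other.Password;

    // Returns the cached value instead of rehashing on every call.
    public override int GetHashCode() => _hashCode;
}
```

The extra field only pays off if instances end up as dictionary or set keys, which matches the conclusion reached in this thread.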

@@ -48,19 +48,36 @@ public TcpStreamFactory(TcpStreamSettings settings)
// methods
public Stream CreateStream(EndPoint endPoint, CancellationToken cancellationToken)
{
var socks5ProxySettings = _settings.Socks5ProxySettings;
var useProxy = socks5ProxySettings != null;
var targetEndpoint = useProxy ? new DnsEndPoint(socks5ProxySettings.Host, socks5ProxySettings.Port) : endPoint;
Member

Am I right to think that the proxy and non-proxy scenarios differ in 2 points:

  1. Which endpoint to connect to
  2. Some additional logic after connecting the socket

If so, can we, instead of tweaking the TcpStreamFactory class, create a new wrapper that replaces the endpoint, calls TcpStreamFactory.CreateStream as usual, and then does the proxy negotiation afterwards?
We are doing something similar in SslStreamFactory.
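
A minimal sketch of that wrapper idea, under stated assumptions: ISimpleStreamFactory, DelegateStreamFactory and ProxyStreamFactory are hypothetical names introduced for illustration (the driver's real stream factory interface also has an async method), and the SOCKS5 negotiation is passed in as a delegate rather than implemented.

```csharp
using System;
using System.IO;
using System.Net;
using System.Threading;

// Hypothetical, simplified factory interface (sync only).
public interface ISimpleStreamFactory
{
    Stream CreateStream(EndPoint endPoint, CancellationToken cancellationToken);
}

// Tiny delegate-backed factory, included only so the sketch can be exercised.
public sealed class DelegateStreamFactory : ISimpleStreamFactory
{
    private readonly Func<EndPoint, CancellationToken, Stream> _create;

    public DelegateStreamFactory(Func<EndPoint, CancellationToken, Stream> create) => _create = create;

    public Stream CreateStream(EndPoint endPoint, CancellationToken cancellationToken) =>
        _create(endPoint, cancellationToken);
}

// Wrapper: swaps the endpoint, delegates the connection, then negotiates with the proxy.
public sealed class ProxyStreamFactory : ISimpleStreamFactory
{
    private readonly ISimpleStreamFactory _wrapped;
    private readonly EndPoint _proxyEndPoint;
    private readonly Action<Stream, EndPoint, CancellationToken> _negotiate;

    public ProxyStreamFactory(
        ISimpleStreamFactory wrapped,
        EndPoint proxyEndPoint,
        Action<Stream, EndPoint, CancellationToken> negotiate)
    {
        _wrapped = wrapped ?? throw new ArgumentNullException(nameof(wrapped));
        _proxyEndPoint = proxyEndPoint ?? throw new ArgumentNullException(nameof(proxyEndPoint));
        _negotiate = negotiate ?? throw new ArgumentNullException(nameof(negotiate));
    }

    public Stream CreateStream(EndPoint endPoint, CancellationToken cancellationToken)
    {
        // Point 1: connect to the proxy instead of the requested endpoint.
        var stream = _wrapped.CreateStream(_proxyEndPoint, cancellationToken);
        try
        {
            // Point 2: ask the proxy to tunnel to the originally requested endpoint.
            _negotiate(stream, endPoint, cancellationToken);
            return stream;
        }
        catch
        {
            // Do not leak the connection if the negotiation fails.
            stream.Dispose();
            throw;
        }
    }
}
```

With this shape the inner factory stays unaware of proxies, and the proxy logic can be layered in the same way as other wrapping factories.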

TARGET="TestSocks5Proxy" \
evergreen/run-tests.sh
OS=${OS} \
evergreen/cleanup-proxy.sh
Member

You should probably add this script to cleanup-test-resources.sh or call cleanup-proxy.sh from the post steps. Otherwise we could skip cleaning up the proxy if a previous step fails with an error.


echo "Attempt to kill proxy server process if present on ${OS}"
if [[ "$OS" =~ Windows|windows ]]; then
tasklist -FI "IMAGENAME eq python.exe"
Member

Killing all Python processes might be too much. Let's investigate if there is anything else we can do.

var proxyPortValue = ParseInt32(name, value);
if (proxyPortValue is < 1 or > 65535)
{
throw new MongoConfigurationException("proxyPort must be between 0 and 65535.");
Member

I would probably suggest to change the error message to:
Invalid proxy port: {proxyPortValue}: must be between 1 and 65535, inclusive

Contributor Author

Reasonable.
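
Putting the two port comments together (reject 0, report the offending value), the check could be sketched as follows. ProxyPortValidation and EnsureValidProxyPort are hypothetical names, and ArgumentOutOfRangeException is used only to keep the snippet self-contained; the PR itself throws MongoConfigurationException.

```csharp
using System;

public static class ProxyPortValidation
{
    // Hypothetical helper: accepts 1..65535 and returns the value unchanged.
    public static int EnsureValidProxyPort(int port)
    {
        // Port 0 is rejected as well, following the review comment about "random available port".
        if (port is < 1 or > 65535)
        {
            throw new ArgumentOutOfRangeException(
                nameof(port),
                $"Invalid proxy port: {port}: must be between 1 and 65535, inclusive.");
        }

        return port;
    }
}
```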

sb.Append(Authentication switch
{
Socks5AuthenticationSettings.UsernamePasswordAuthenticationSettings up =>
$"UsernamePassword (Username: {up.Username}, Password: {up.Password})",
Member

MongoUrlBuilder.ToString is used to build the URL from the provided parameters, so having passwords there is probably the expected behavior (I suppose we ought to have another method for that, something like BuildUrl). However, this class is a user-facing settings class, so there is a bigger chance it could be converted to a string and logged or, even worse, output as part of an exception. I've checked MongoClientSettings: it does not look like it can leak passwords in a similar way.
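
One way to address the concern above is to keep the password out of ToString entirely. The sketch below is an assumption about how that might look: ProxyCredentials is a hypothetical class, and the "<hidden>" placeholder is an arbitrary choice, not something taken from the PR.

```csharp
using System;

// Hypothetical user-facing settings class holding proxy credentials.
public sealed class ProxyCredentials
{
    public ProxyCredentials(string username, string password)
    {
        Username = username ?? throw new ArgumentNullException(nameof(username));
        Password = password ?? throw new ArgumentNullException(nameof(password));
    }

    public string Username { get; }
    public string Password { get; }

    // The password is never interpolated, so logging or exception messages cannot leak it.
    public override string ToString() =>
        $"UsernamePassword (Username: {Username}, Password: <hidden>)";
}
```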

get => _proxyPort;
set
{
if (value is < 0 or > 65535)
Member

We have Ensure.IsNullOrBetween if this is what you are looking for.

Contributor

@BorisDog BorisDog left a comment

Looks great overall!

public static void PerformSocks5Handshake(Stream stream, EndPoint endPoint, Socks5AuthenticationSettings authenticationSettings, CancellationToken cancellationToken)
{
var (targetHost, targetPort) = endPoint.GetHostAndPort();
var buffer = ArrayPool<byte>.Shared.Rent(BufferSize);
Contributor

We already have a similar ThreadStaticBuffer that is used for smaller buffers. Do you know how it compares to ArrayPool.Shared?

Contributor Author

So, I hadn't seen ThreadStaticBuffer before, so it's good that you pointed it out 😁
Regarding its use, I suppose the main disadvantage here is that ThreadStaticBuffer is not thread safe, so we can't use it in the async version. Another advantage of ArrayPool.Shared is that you can "rent" as many buffers as you need per thread.

Contributor

ThreadStaticBuffer is designed to be thread-safe, leveraging the [ThreadStatic] technique, which I believe is what ArrayPool<byte>.Shared does as well.
You are correct about the limitation of a single rent per thread, but it looks like that's what is needed here.

I am thinking that if we don't have a good reason to use one over the other, then we should fall back on the consistency principle. But going forward, if ArrayPool<byte>.Shared is as performant, there is no reason not to switch to it in all places in the future.
I do like the "using" option in RentedBuffer, which eliminates the need for finally, so we still might want to create a wrapper around ArrayPool<byte>.Shared.

Contributor Author

If we want to use ConfigureAwait(false) in the async version, then we can't use ThreadStaticBuffer as it's not thread safe.
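
The wrapper around ArrayPool<byte>.Shared mentioned earlier in this thread could be sketched like this. PooledBuffer is a hypothetical name (it is not the driver's existing RentedBuffer or ThreadStaticBuffer), and the sketch assumes only the documented ArrayPool<T>.Shared Rent and Return calls.

```csharp
using System;
using System.Buffers;
using System.Threading;

// Hypothetical disposable wrapper: rent in the constructor, return in Dispose.
public sealed class PooledBuffer : IDisposable
{
    private byte[] _bytes;

    public PooledBuffer(int minimumLength)
    {
        // The pool may hand back an array that is larger than requested.
        _bytes = ArrayPool<byte>.Shared.Rent(minimumLength);
    }

    public byte[] Bytes => _bytes ?? throw new ObjectDisposedException(nameof(PooledBuffer));

    public void Dispose()
    {
        // Exchange guards against returning the same array to the pool twice.
        var bytes = Interlocked.Exchange(ref _bytes, null);
        if (bytes != null)
        {
            ArrayPool<byte>.Shared.Return(bytes);
        }
    }
}
```

Usage would be along the lines of `using var buffer = new PooledBuffer(512);`, which removes the explicit try/finally. Because the array belongs to the wrapper instance rather than to a thread, it can stay rented across an await that resumes on a different thread.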

}
}

private static int CreateGreetingRequest(byte[] buffer, bool useAuth)
Contributor

Minor:
Looks like static data, would it be simpler to create static buffers once?

byte[]  __greetingNoAuth = [ProtocolVersion5, ..],
byte[]  __greetingAuth = [ProtocolVersion5, ..],

?

Contributor Author

I suppose we could create the static buffers once, but then we would be trading the very little time needed to fill the rented array for the memory needed to keep the static buffers.
I think it would make sense to keep it as it is.
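
For comparison, the static-buffer alternative could be sketched as below. The byte layout (version, number of methods, method list) follows RFC 1928; the class name, the Get method, and the assumption that the authenticated greeting offers both "no authentication" and "username/password" are illustrative and not taken from the PR.

```csharp
using System;

public static class Socks5Greetings
{
    private const byte ProtocolVersion5 = 0x05;
    private const byte MethodNoAuth = 0x00;
    private const byte MethodUsernamePassword = 0x02;

    // Layout: VER, NMETHODS, METHODS...
    private static readonly byte[] __greetingNoAuth =
        { ProtocolVersion5, 1, MethodNoAuth };

    // Assumption: with credentials configured, both methods are offered to the proxy.
    private static readonly byte[] __greetingAuth =
        { ProtocolVersion5, 2, MethodNoAuth, MethodUsernamePassword };

    // Exposed as a read-only span so callers cannot modify the shared arrays.
    public static ReadOnlySpan<byte> Get(bool useAuth) =>
        useAuth ? __greetingAuth : __greetingNoAuth;
}
```

The cost is seven bytes of permanently allocated data plus array headers, against a few byte writes per handshake, which is the trade-off weighed in the reply above.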

{
if (version != ProtocolVersion5)
{
throw new IOException("Invalid SOCKS version in response.");
Contributor

Add actual and expected versions?

Contributor Author

Sure.

await stream.ReadBytesAsync(buffer, 0, 2, cancellationToken).ConfigureAwait(false);
var acceptsUsernamePasswordAuth = ProcessGreetingResponse(buffer, useAuth);

// If we have username and password, but the proxy doesn't need them, we skip.
Contributor

minor: Should comment say "we skip the authentication step" ?

Contributor Author

Sure.


if (buffer[1] != Socks5Success)
{
throw new IOException($"SOCKS5 connect failed");
Contributor

Add buffer[1] value?

Contributor Author

Sure.
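
Including the reply byte in the message, as requested above, might look like the following sketch. The reply code descriptions come from RFC 1928; Socks5Replies, Describe and EnsureSuccess are hypothetical names, and the exact wording of the message is an assumption.

```csharp
using System;
using System.IO;

public static class Socks5Replies
{
    // Reply codes as listed in RFC 1928, section 6.
    public static string Describe(byte reply) => reply switch
    {
        0x00 => "succeeded",
        0x01 => "general SOCKS server failure",
        0x02 => "connection not allowed by ruleset",
        0x03 => "network unreachable",
        0x04 => "host unreachable",
        0x05 => "connection refused",
        0x06 => "TTL expired",
        0x07 => "command not supported",
        0x08 => "address type not supported",
        _ => "unassigned reply code"
    };

    // Throws with the raw reply value and its meaning when the connect request failed.
    public static void EnsureSuccess(byte reply)
    {
        if (reply != 0x00)
        {
            throw new IOException(
                $"SOCKS5 connect failed with reply 0x{reply:X2} ({Describe(reply)}).");
        }
    }
}
```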

{
if (buffer[0] != SubnegotiationVersion || buffer[1] != Socks5Success)
{
throw new IOException("SOCKS5 authentication failed.");
Contributor

Add buffer[1] value?

Contributor Author

Sure.

var acceptsUsernamePasswordAuth = ProcessGreetingResponse(buffer, useAuth);

// If we have username and password, but the proxy doesn't need them, we skip.
if (useAuth && acceptsUsernamePasswordAuth)
Contributor

Looks like in the case of acceptsUsernamePasswordAuth, useAuth is validated in ProcessGreetingResponse.
So technically useAuth doesn't need to be checked here.
Should acceptsUsernamePasswordAuth be named something like useAuthenticationStep ?

Contributor Author

Makes sense!
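
To illustrate the point about useAuth being validated inside the greeting handling, here is a sketch in which the method returns whether the authentication step is needed, so the caller can test a single flag. The method codes are from RFC 1928; the class name, the messages, and the exact behaviour of the PR's ProcessGreetingResponse are assumptions.

```csharp
using System;
using System.IO;

public static class Socks5GreetingResponse
{
    private const byte ProtocolVersion5 = 0x05;
    private const byte MethodNoAuth = 0x00;
    private const byte MethodUsernamePassword = 0x02;
    private const byte MethodNoAcceptable = 0xFF;

    // Returns true when the username/password subnegotiation must follow.
    public static bool ProcessGreetingResponse(byte[] buffer, bool useAuth)
    {
        if (buffer[0] != ProtocolVersion5)
        {
            throw new IOException(
                $"Invalid SOCKS version in response: expected {ProtocolVersion5}, got {buffer[0]}.");
        }

        var method = buffer[1];
        return method switch
        {
            // The proxy does not need credentials, so the authentication step is skipped.
            MethodNoAuth => false,
            // The proxy picked username/password, which is only valid if we offered it.
            MethodUsernamePassword when useAuth => true,
            MethodNoAcceptable => throw new IOException("SOCKS5 proxy accepted none of the offered methods."),
            _ => throw new IOException($"SOCKS5 proxy selected an unexpected method: 0x{method:X2}.")
        };
    }
}
```

The caller can then write `if (useAuthenticationStep)` without repeating the useAuth check.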

yield return [id, connectionString, expectedResult, useTls, isAsync];
}
}
}
Contributor

minor: another way to write the same, if you'd prefer:

return  from (connectionString, expectedResult) in testCases
      from useTls in new[] { true, false }
      from isAsync in new[] { true, false }
      select new object[] {..., useTls, isAsync};

Contributor Author

I've used something similar now, hopefully it's readable enough.
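
A compilable variant of the query-syntax idea might look like this. It is a sketch under assumptions: ProxyTestCases and Expand are hypothetical names, the id column from the original yield is left out, and the tuple is bound to a single range variable rather than deconstructed in the from clause.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ProxyTestCases
{
    // Expands every test case into all combinations of useTls and isAsync.
    public static IEnumerable<object[]> Expand(
        IEnumerable<(string ConnectionString, bool ExpectedResult)> testCases) =>
        from testCase in testCases
        from useTls in new[] { true, false }
        from isAsync in new[] { true, false }
        select new object[] { testCase.ConnectionString, testCase.ExpectedResult, useTls, isAsync };
}
```

Each input case produces four rows, ordered by useTls first and isAsync second.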

@BorisDog BorisDog requested a review from vbabanin August 13, 2025 22:08