-
Notifications
You must be signed in to change notification settings - Fork 9
Add support for SSL error handling #9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- Allow the registration of an optional callback using serf_ssl_error_cb_set(). - If the callback is registered, return a fixed string describing the error as created by the underlying crypto library. Example: [minfrin@rocky9 subversion]$ svn info https://svn.example.com/svn/example/core/ svn: E170013: Unable to connect to a repository at URL 'https://svn.example.com/svn/example/core' svn: E120171: TLS: error:0308010C:digital envelope routines::unsupported svn: E120171: Error running context: An error occurred during SSL communication
buckets/ssl_buckets.c
Outdated
| ssl_ctx->protocol_userdata = NULL; | ||
|
|
||
| ssl_ctx->error_callback = NULL; | ||
| ssl_ctx->error_userdata = NULL; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We call those batons, so: error_baton. Hmph, should do the same for protocol_userdata ⟶ protocol_baton.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It has been made so.
serf_bucket_types.h
Outdated
| * Callback type for detailed TLS error strings. | ||
| */ | ||
| typedef apr_status_t (*serf_ssl_error_cb_t)( | ||
| void *data, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again: void *baton.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
serf_bucket_types.h
Outdated
| void serf_ssl_error_cb_set( | ||
| serf_ssl_context_t *context, | ||
| serf_ssl_error_cb_t callback, | ||
| void *data); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And here, too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
|
Since this callback is on the context, it's serialised within the context run right? I assume that the idea is to store those messages at the caller and report them if If that's the case, it would be nice to document that in a docstring in |
All of this is pretty much the caller's problem. The callback just fires every time there is an error, it's up to the caller to pass a baton that makes sense, and a callback routine that understands the baton. In existing subversion, the context run keeps going until an SSL error happens, and if an SSL error happens the whole context is shut down and the client gives up. In this case the callback fires first, and subversion (shortly) makes a copy of the message for later. When later comes, any error message from a callback is added to svn_error_t. Other applications could work completely differently, serf shouldn't care. |
|
It's not about Serf's caring, it's about the user knowing the callback will get called (just) before |
Does the added description make sense? It's literally "here is the detail string for the failure you are about to experience, log it, add it to your error stack, do with it as you will". Unfortunately the best option was for there to not be a comment "Detect more specific errors?" in the code followed by no way to return a more specific error. We're doing the best we can with what we have available. |
Add the most detailed underlying crypto library error string to the error stack when the context fails due to an SSL failure. SSL errors are no longer reduced to "an error has occurred". This relies on the serf_ssl_error_cb_t callback as provided by serf in apache/serf#9. Example: [minfrin@rocky9 subversion]$ svn info https://svn.example.com/svn/example/core/ svn: E170013: Unable to connect to a repository at URL 'https://svn.example.com/svn/example/core' svn: E120171: TLS: error:0308010C:digital envelope routines::unsupported svn: E120171: Error running context: An error occurred during SSL communication
|
Maybe this is the same question already asked by Brane, and I'm just not clever enough to understand it. But the whole design seems a little bit convoluted to me. |
|
Serf is asynchronous, and one serf-context can juggle multiple connections. You need a callback if you want to have a hope in hell of keeping all this straight on the error-reporting side. That said, however: the callback doesn't know which connection a given error belongs to. That could be a problem. typical case would be a context with one SSL connection to a server and one connection to an OCSP responder that validates the server cert. It'd be strange if, e.g., OCSP connection errors were reported against a Subversion server. OK, OCSP connections shouldn't use SSL. Which brings up the point: why would this callback only be used to report SSL errors, when there can be other kinds of error descriptions available for other connection types, and they, too, can't know what actually happened. |
The problem with that approach is that it "keeps stock". What you want is to hand over the errors you encounter with the least effort possible. Right now in a trivial implementation you could just write the errors to stderr from inside the callback. No copying, no saving state anywhere, you're done. Subversion is more sophisticated, it has the ability to stack errors. This lines up nicely where the callback can add an additional error to the stack. |
Options are limited inside serf. The two places where errors are generated are https://github.com/apache/serf/blob/trunk/buckets/ssl_buckets.c#L1538 and https://github.com/apache/serf/blob/trunk/buckets/ssl_buckets.c#L1054, and the only context that is available at these locations are serf_ssl_context_t. This isn't a crisis though. At a future date, we could add an error callback tied to a connection, and let the caller decide what scope of errors they want. In other words, we don't need to delay providing error messages today in the hope of being perfect in the future. |
|
|
||
| if (err && ctx->error_callback) { | ||
| char ebuf[256]; | ||
| ERR_error_string_n(err, ebuf, sizeof(ebuf)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just above, you call ERR_error_string() to log the error, while here, you copy the string to the stack. Is this actually necessary? I'd just document to callback implementors that they have to copy the string inside the callback and call ERR_error_string() just once.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logging was left unchanged, but really should go.
The logging is the crutch that was supporting the lack of error handling. Libraries should definitely not be logging anything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I disagree with the "libraries should not be logging" sentiment. They shouldn't dump to stdout or stderr, but Serf doesn't do that -- it writes to whatever logging sink the application cares to provide, so basically integrates with the application's logging infrastructure. I've found this to be extremely useful for debugging; not everything can be done with error handling, no matter how sophisticated. Because logging by its nature leaves an audit trail for states that are not error conditions and can't even be captured when an error occurs.
In this case, I agree, we shouldn't be doing both.
buckets/ssl_buckets.c
Outdated
| } | ||
| else { | ||
| err = ERR_get_error(); | ||
| ERR_clear_error(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't pass err to log_ssl_error() and clear it here; doesn't that mean that nothing will happen? Why even call ERR_get_error() here, given that it has no effect?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was from before we used log_ssl_error(), I've simplified this.
|
If a context is running 4 connections to a server, and they are sharing an SSL context, then this callback attached to the shared context cannot identify the connection, can it? (I kinda recall reading that somewhere) I'm also thinking that this "error callback" being specialized only to SSL is missing a review of how to do this properly for serf in general. And yes, I DO want a general design, rather than tack on a "single solution today" and "generalize later". To generalize it (likely beyond scope of desire of this PR), I'd seek some kind of baton/identifier of "what request is this error attached to?" ... I believe the request is the "unit of work" and any error would be associated with that. Could be a response, but that is in response to a request. ... Hmm. I guess one could also associate an error with a connection, or a shared context. eg. timed out a connection setup before we even tried to deliver a request. In short, I do not think it appropriate to apply a hack to fix one case, when we don't have a conceptual design (even if not fully-implemented) for a serf-wide sophisticated error handling design. |
This is a general limitation of serf that exists already. When serf fails, it returns apr_status_t. What connection that apr_status_t belongs to, serf doesn't tell you.
Right now, I want to contact a vendor and say "I get this message, please help", and not receive from the vendor pages and pages of "it could be this, or it could be that, or maybe this, or maybe that", and then eventually find out that it was actually none of the above, and all of this occurred for the want of an actual error message. Our community is orders of magnitude more important than the code. I entirely get the desire to attach this error to "a unit of work", your choices for a unit of work in this code consist of just one structure, serf_ssl_context_t, and that is it. https://github.com/apache/serf/blob/trunk/buckets/ssl_buckets.c#L136C8-L136C26 The parent structure for serf_ssl_context_t is serf_config_t. There are no other "units of work" today, serf just doesn't work like that. Obviously one can go off on a tangent and rewrite serf to invent new "units of work" and then use this as a basis for a different error handling mechanism. This is however hugely disrespectful to the community, who need proper error messages today, and who cannot be held to ransom until someone is forced to do unrelated work. We must provide error handling for the serf in front of us today, not some future serf that might exist in the future but doesn't. |
|
Please do not lecture me about how serf works. "serf just doesn't work like that" ... I wrote it. |
Looking at this closer: https://github.com/apache/serf/blob/trunk/buckets/ssl_buckets.c#L148 The serf_ssl_context_t structure represents one connection by definition. By passing serf_ssl_context_t as the baton, like we do in apache/subversion#31, we tie the error to the connection, thus solving your problem. |
|
I'm not sold on this design, but the baton would be |
We use svn_ra_serf__connection_t (third parameter conn) as a baton in the svn patch: |
Thanks (to both Brane and Graham) for their explanations. It makes sense... |
send nothing instead and fail because the server said access is denied.
than just a low level openssl error.
dsahlberg-apache-org
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realise this was committed in r1926972, so the PR should really be closed...
However it needs some specific authz in GitHub that I don't seem to have.
| if (err && ctx->error_callback) { | ||
| char ebuf[256]; | ||
| ERR_error_string_n(err, ebuf, sizeof(ebuf)); | ||
| ctx->error_callback(ctx->error_baton, ctx->fatal_err, ebuf); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it really necessary to use an internal char array and calling ERR_error_string_n to copy the error message to this buffer. The error_callback must copy the message to an application internal buffer anyway. Wouldn't it be enough to:
char *ebuf = ERR_error_string(err, NULL);
ctx->error_callback(ctx->error_baton, ctx->fatal_err, ebuff);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with this. The documentation for the callback signature/compact should be that the life of the error string lasts only until the callback returns. Thus, the callback must use it immediately in some fashion, and not retain the pointer (eg. print it, or copy it).
As such, serf does not need to copy a portion of the error into a stack-based buffer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it really necessary to use an internal char array and calling ERR_error_string_n to copy the error message to this buffer. The error_callback must copy the message to an application internal buffer anyway. Wouldn't it be enough to:
char *ebuf = ERR_error_string(err, NULL); ctx->error_callback(ctx->error_baton, ctx->fatal_err, ebuff);
Alas not.
ERR_error_string() uses an openssl-internal buffer that is overwritten on each call, and is not thread safe. ERR_error_string_n() fixed this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, good point. Missed that one, thanks!
Reverted to the original code form the PR in r1927348.
| else { | ||
| int err = ERR_get_error(); | ||
| err = ERR_get_error(); | ||
| ERR_clear_error(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why moving the declaration from here to the top of the function? Better keep scope limited whenever possible, just to catch accidental errors (pun intended).
(Yes it obviously should be unsigned long instead of int, so a change would be needed anyhow).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also agree. All variables should be scoped as tightly as possible.
|
The basic patch has been committed to svn in r1926972 with some followups. There is still the ongoing discussion if we should change to a more generalised error callback structure but it can be handled in SVN since all participants are committers or with a new PR. |
Allow the registration of an optional callback using serf_ssl_error_cb_set().
If the callback is registered, return a fixed string describing the error as created by the underlying crypto library.
Example: