-
Notifications
You must be signed in to change notification settings - Fork 32
fix: considering header field if-unmodified-since for different patch requests #2138
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
Isn't |
In my opinion this approach is entirely server-side: the server tracks timestamps and controls concurrency. |
|
Looks good! I think its a good compromise to use modified-since only for now, as using Etags would require computing/storing them and be quite a lot more effort to implement as it is right now. @Junjiequan to your concerns:
Well the problem is that the time between these requests may be quite long. Doing locking in the backend is not a good Idea: For doing some locking directly on the database, I think that has the same issues. I don't think you want to hold a database lock while waiting for a client to make a follow-up post/patch request. Optimistic concurrency with the conditional requests is the standard way to do this, at least to my knowledge.
Yes, the client has to set it. For our case at MLZ at least, the clients we write internally for the automatic ingestion will use it. If its merged, I want to have a look at scitacean to implement support there and then most use-cases should be covered. As we probably won't require clients to send it with every request, there will of course still be the option around it, but well behaved clients can use it, which is a better situation than now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please switch around the if statements as suggested.
Also make sure to include some API test to test backward compatibility and the functionality itself.
|
As long as that the feature is backward compatible and it does not effect the updates if it is not used, I'm open to accept it. @joe-baudisch does this applies only to updates, correct? |
@nitrosx : this only applies to this updates. |
|
@cchndl |
|
Jumping in as a complete ignorant here. |
I mean you didn't misinterpreted it, I just wanted to clarify why this would be preferable to other options. I hope I was not overbearing!
Full agreement with you there. Its good to have both supported in the end. |
@nitrosx For the last-modified, if you read and afterwards in the same second an update comes, you wont see that. Its not likely but possible. The specification only says second-accuracy. For example, a last-modified-header that the etag libraty gives could be: In the end, i believe having both would be ideal. The last-modified is "free" in the sense that most of the objects already track it, so it can't hurt. Depending on how we do it, it may also be cheaper to first check the timestamp and then the etag if both are given, but that I can't say now, thats something one would have to check when we build it. As the etag library generates the Etag by hashing the response body (not the headers I think?), and not the object in the database, we would have to do some work there. This is a nice default behaviour for caching page reads and so on, but not for the objects themselves. We would have to hook Etag generation into there somewhere or save the Etag in the db in or next to the object. Hope this helps. |
|
@cchndl thank you for the explanation.
|
|
If you do not allow sub-second updates on a resource there is no concurrency issue using Example: Assuming we have two clients A and B requesting the same resource R concurrently
For our use case this simple implementation is good enough and clients can handle http 412 accordingly. I currently do not see a need for sub-second concurrent updates on our end. |
|
I agree with @cfelder |
Added a test that can be applied in a similar way to the other affected controllers. |
|
Hello @cchndl, @cfelder, dear maintainers, Do you need further testing? The master has evolved, but it hasn't taken into account that the API call this.datasetsController.findByIdAndUpdate in By the way: Now with de6c79a all tests are present and all tests were passed by running |
|
This one 03a1e8a |
|
From my testing, it looks good. Are there any other points, or do you think we can merge this @nitrosx? |
|
It is not my fault that some tests in 8384ba2 failed. Failing API tests with ElasticSearch enabled yes in 8384ba2 because of:
Failing API tests with ElasticSearch enabled no in 8384ba2 because of:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left few comments which might become change requests.
Also, we should think about submitting a following-up PR introducing all the API tests covering all the use cases.
I will be happy to help you defining the use cases.
| async findByIdAndUpdateInternal( | ||
| @Req() request: Request, | ||
| @Param("pid") pid: string, | ||
| @Headers() headers: Record<string, string>, | ||
| @Body() | ||
| updateDatasetDto: PartialUpdateDatasetDto, | ||
| ): Promise<OutputDatasetDto | null> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we define function findByIdAndUpdateInternal and call it only from findByIdAndUpdate?
Can we combine findByIdAndUpdate and findByIdAndUpdateInternal?
| }, | ||
| }) | ||
| export class Instrument { | ||
| export class Instrument extends QueryableClass { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have we updated the instrument output dto to provide the additional fields introduced by QueryableClass?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure that I would introduce this change here!!!
If you are creating a published data record the datasets needs to be public, no exception.
That said, I might not be knowledgeable about the background and context that you are operating in.
Can you explain better why you need the changes in this file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have some change requests here:
- We should cleanup this long string of commits by rebasing and possibly squashing some commits.
- I think the if-modifed check should be implemented as a Pipe or Interceptor, so we do not get so much code doubling. It also would make it easier to later implement a hash check
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I very much agree with @bpedersen2 comment, if this was a pipe or an interceptor a lot of duplication could be avoided
|
|
||
| if (headerDate && headerDate <= foundAttachment.updatedAt) { | ||
| throw new HttpException( | ||
| "Update error due to failed if-modified-since condition", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe we should raise something more talkative?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are alreay the ConflictException and the PreconditionFailed exceptions available from nest.
How about PreconditionFailed('Resource has been changed on server')
See also https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Status/412 for an ETAG example
|
As for using ETags, the hash should be done over the resource, so we could add that to all GET requests and also return it on POST request via an interceptor. Another question, why is done only for patch ( which should only send fields that you want to change anyway), but not for PUT, which would overwrite the complete entry? |
|
We have already discussed that ETag implementation can be handled separately. I Strongly suggest to concentrate on the if-unmodified-since header implementation here to not further delay reaching a state where we can merge this. |
|
I agree with #2138 (comment) |
We wanted to keep the scope small while still solving our problems. That's why it includes only the patch endpoints.
Of course if these fields are distinct sets it works already with PATCH. But as discussed in the mentioned issue, (linking it here again), the problem is with nested structures like the list of data files. There the patch request will override the changes made by other clients, since there is no merging of lists. And if the changes touch the same fields, the same problem applies anyway. |
integrating if-unmodified-since into dataset, sample if-unmodified-since into attachments,datablocks,proposals,datasets make headerDate from if-unmodified-since more robust resolve some conversations adding test fixing bug by defining controller method without decorator adding attachments.v4.controller test adding more tests adding final test with datasets.v4.controller_if-unmodified-since_.spec.ts merge two test-files into one lint fix lint fix
Description
"if-unmodified-since" header field is considered for patch requests for the following endpoints:
If the "if-unmodified-since" header field is missing, the patch request will still be excuted.
Fixes