fix: considering header field if-unmodified-since for different patch requests #2138

joe-baudisch · 2025-08-22T07:49:11Z

Description

"if-unmodified-since" header field is considered for patch requests for the following endpoints:

datasets (v3/v4)
attachments (v4)
instruments
origdatablocks (v3/v4)
samples
proposals

If the "if-unmodified-since" header field is missing, the patch request will still be executed.

Fixes

fixes main aspect of Support for conditional requests/Strategy for not missing updates? #1969

Junjiequan · 2025-08-28T08:00:30Z

Isn't if-unmodified-since an optional header field? It has to be set by the client to be actually useful.
I see the issues you are trying to resolve here, but it would be nice to do concurrency control on the server side instead of relying on the client.

joe-baudisch · 2025-08-29T09:52:49Z

> Isn't if-unmodified-since an optional header field? It has to be set by the client to be actually useful. I see the issues you are trying to resolve here, but it would be nice to do concurrency control on the server side instead of relying on the client.

In my opinion this approach is entirely server-side: the server tracks timestamps and controls concurrency.
This is a form of optimistic concurrency control, where the client assumes the resource hasn't changed and wants to avoid overwriting newer data. Clients still need to send If-Unmodified-Since, but they don’t manage timestamp-based versioning logic. If clients do not want to use the server-side management of this versioning logic, they omit if-unmodified-since.
Solving this issue is a special request from MLZ.

cchndl · 2025-09-02T11:39:27Z

Looks good! I think it's a good compromise to use if-unmodified-since only for now, as using ETags would require computing/storing them and would take considerably more effort to implement than the current approach.

@Junjiequan to your concerns:

> I see the issues you are trying to resolve here, but it would be nice to do concurrency control on the server side instead of relying on the client.

Well the problem is that the time between these requests may be quite long.

Doing locking in the backend is not a good idea:
In my opinion you don't want to give out locks over resources on a REST endpoint, since it's stateless and you are not guaranteed to have the lock released in a timely manner. Enforcing timeouts would be bad both for expensive operations on the client side and for performance. In the first case, the client might never be able to finish its computation and make the request before the lock is released again; in the second, waiting for these timeouts prevents other clients from doing anything else.

For doing some locking directly on the database, I think that has the same issues. I don't think you want to hold a database lock while waiting for a client to make a follow-up post/patch request.

Optimistic concurrency with the conditional requests is the standard way to do this, at least to my knowledge.

> Isn't if-unmodified-since an optional header field? It has to be set by the client to be actually useful.

Yes, the client has to set it. For our case at MLZ at least, the clients we write internally for the automatic ingestion will use it. If it's merged, I want to have a look at scitacean to implement support there, and then most use cases should be covered. As we probably won't require clients to send it with every request, there will of course still be a way around it, but well-behaved clients can use it, which is a better situation than now.

nitrosx

Please switch around the if statements as suggested.
Also make sure to include some API tests covering backward compatibility and the functionality itself.

src/attachments/attachments.v4.controller.ts

src/datasets/datasets.controller.ts

src/datasets/datasets.v4.controller.ts

src/instruments/instruments.controller.ts

src/instruments/schemas/instrument.schema.ts

src/origdatablocks/origdatablocks.controller.ts

src/origdatablocks/origdatablocks.v4.controller.ts

src/proposals/proposals.controller.ts

src/samples/samples.controller.ts

nitrosx · 2025-09-02T13:02:46Z

As long as the feature is backward compatible and does not affect updates when it is not used, I'm open to accepting it.

@joe-baudisch this applies only to updates, correct?

joe-baudisch · 2025-09-02T13:41:50Z

> As long as the feature is backward compatible and does not affect updates when it is not used, I'm open to accepting it.
>
> @joe-baudisch this applies only to updates, correct?

@nitrosx: this applies only to these updates.

Junjiequan · 2025-09-02T14:02:13Z

@cchndl
Apologies for my misinterpretation of optimistic concurrency control.
What I actually wanted to suggest is using ETags with versions to control concurrent requests instead of relying on the date to control patches. As a temporary solution it's fine, but it is not as clean as a versioned ETag. First, there might be timing issues (rare, but still). Second, it's a fix only for clients that are aware of concurrency issues. I could imagine concurrent requests causing data loss if not handled carefully. That being said, I agree that ETags, which are also best practice if I'm not mistaken, take more effort to implement.

nitrosx · 2025-09-02T15:14:43Z

Jumping in as a complete ignoramus here.
How is an ETag more or less strict than an if-unmodified-since?
It all depends on how the information is used by the client.

cchndl · 2025-09-02T15:17:26Z

> Apologies for my misinterpretation of optimistic concurrency control.

I don't think you misinterpreted it; I just wanted to clarify why this would be preferable to other options. I hope I was not overbearing!

> What I actually wanted to suggest is using ETags with versions to control concurrent requests instead of relying on the date to control patches. As a temporary solution it's fine, but it is not as clean as a versioned ETag. First, there might be timing issues (rare, but still). Second, it's a fix only for clients that are aware of concurrency issues. I could imagine concurrent requests causing data loss if not handled carefully. That being said, I agree that ETags, which are also best practice if I'm not mistaken, take more effort to implement.

Full agreement with you there. It's good to have both supported in the end.

cchndl · 2025-09-02T15:44:44Z

> Jumping in as a complete ignoramus here. How is an ETag more or less strict than an if-unmodified-since? It all depends on how the information is used by the client.

@nitrosx
Maybe "more strict" is a little bit too strong. The good thing about ETags is that they are really a version "distinguisher", if that makes sense. If it's not the same ETag, something changed.

With Last-Modified, if you read the resource and an update arrives afterwards within the same second, you won't see it. It's not likely, but it is possible, because the specification only provides one-second resolution. For example, a Last-Modified header that the etag library gives could be:
Last-Modified: Wed, 30 Apr 2025 08:37:54 GMT
This should be fine for most resources, but if you want to be sure for writes, the ETag is in that sense "better" to check.
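To make the one-second resolution concrete, here is a minimal sketch. The two timestamps are made up for illustration; the point is only that both render to the same HTTP date string, so a client comparing dates cannot tell the two states apart.

```typescript
// Two states of the same resource, 600 ms apart within one second
// (both timestamps are invented for this example).
const readAt = new Date("2025-04-30T08:37:54.200Z");
const changedAt = new Date("2025-04-30T08:37:54.800Z");

// HTTP dates, as used by Last-Modified, carry no sub-second part.
const readHeader = readAt.toUTCString();
const changedHeader = changedAt.toUTCString();

console.log(readHeader); // Wed, 30 Apr 2025 08:37:54 GMT
console.log(readHeader === changedHeader); // true: the later write is invisible
console.log(readAt.getTime() === changedAt.getTime()); // false: the states differ
```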

In the end, I believe having both would be ideal. Last-Modified is "free" in the sense that most of the objects already track it, so it can't hurt. Depending on how we do it, it may also be cheaper to check the timestamp first and then the ETag if both are given, but I can't say that now; that's something we would have to check when we build it.

As the etag library generates the ETag by hashing the response body (not the headers, I think) rather than the object in the database, we would have to do some work there. This is a nice default behaviour for caching page reads and so on, but not for the objects themselves. We would have to hook ETag generation in there somewhere or save the ETag in the db in or next to the object.
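As a rough illustration of hashing the object instead of the response body, the sketch below derives an ETag from a JSON-like resource. The helper names `canonicalJson` and `etagForResource` are hypothetical, the choice of SHA-256 is an assumption, and none of this is what the etag library or SciCat does today.

```typescript
import { createHash } from "node:crypto";

// Serialize JSON-like data with sorted object keys so that property order
// does not influence the hash. Assumes plain objects, arrays and primitives.
function canonicalJson(value: unknown): string {
  if (value === undefined) return "null";
  if (Array.isArray(value)) {
    return "[" + value.map((item) => canonicalJson(item)).join(",") + "]";
  }
  if (value !== null && typeof value === "object") {
    const record = value as Record<string, unknown>;
    const members = Object.keys(record)
      .sort()
      .map((key) => JSON.stringify(key) + ":" + canonicalJson(record[key]));
    return "{" + members.join(",") + "}";
  }
  return JSON.stringify(value);
}

// Hypothetical helper: a strong ETag is a quoted opaque string.
function etagForResource(resource: unknown): string {
  const digest = createHash("sha256").update(canonicalJson(resource)).digest("hex");
  return `"${digest}"`;
}

const before = etagForResource({ pid: "p1", files: ["a.h5"] });
const reordered = etagForResource({ files: ["a.h5"], pid: "p1" });
const after = etagForResource({ pid: "p1", files: ["a.h5", "b.h5"] });

console.log(before === reordered); // true: same content, same tag
console.log(before === after); // false: content changed
```

Whether such a tag is computed on every read or stored next to the object is exactly the open question raised above.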

Hope this helps.

nitrosx · 2025-09-03T07:12:43Z

@cchndl thank you for the explanation.
I think there is already ETag functionality in nest.js.
I found the following post that mentions ETags in nest.js:

https://stackoverflow.com/questions/68080302/disabling-auto-generated-etags-in-nestjs-app
Although I'm not sure it is relevant to our use case.

cfelder · 2025-09-03T18:06:56Z

If you do not allow sub-second updates on a resource, there is no concurrency issue using If-Unmodified-Since.

Example: Assuming we have two clients A and B requesting the same resource R concurrently

1. Clients A and B get the same last modified time t0
2. Client A sends an update with If-Unmodified-Since: t0
3. Client B sends an update with If-Unmodified-Since: t0
4. One client will win; let's assume Client B wins because of a shorter round trip time
5. The application receives update B first and updates its last modified time to t1 (server-based timestamp)
6. The application receives the request from A, compares the last modified time t0 with t1 and responds with HTTP 412 Precondition Failed.

For our use case this simple implementation is good enough, and clients can handle HTTP 412 accordingly. I currently do not see a need for sub-second concurrent updates on our end.
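The sequence above can be replayed with a small in-memory sketch. Everything here (`TrackedResource`, `patchResource`, `PreconditionFailedError`, the timestamps) is hypothetical and only mirrors the described behaviour: the update is rejected when the resource was modified after the date the client sent, and a request without the header goes through as before.

```typescript
// Hypothetical in-memory resource; names are illustrative, not the SciCat model.
interface TrackedResource {
  payload: string;
  updatedAt: Date;
}

class PreconditionFailedError extends Error {
  readonly httpStatus = 412;
}

// HTTP dates have one-second resolution, so compare whole seconds.
const wholeSeconds = (ms: number): number => Math.floor(ms / 1000);

function patchResource(
  resource: TrackedResource,
  payload: string,
  serverNow: Date,
  ifUnmodifiedSince?: string,
): void {
  if (ifUnmodifiedSince !== undefined) {
    const headerMs = Date.parse(ifUnmodifiedSince);
    // An unparsable date is ignored; otherwise the update is rejected when
    // the resource was modified after the date the client sent.
    if (
      !Number.isNaN(headerMs) &&
      wholeSeconds(resource.updatedAt.getTime()) > wholeSeconds(headerMs)
    ) {
      throw new PreconditionFailedError("Resource has been changed on server");
    }
  }
  resource.payload = payload;
  resource.updatedAt = serverNow; // server-based timestamp
}

const t0 = new Date("2025-09-03T18:00:00Z");
const t1 = new Date("2025-09-03T18:00:05Z");
const resourceR: TrackedResource = { payload: "initial", updatedAt: t0 };

// Clients A and B both read R and remember its last modified time t0.
const seenByBoth = t0.toUTCString();

patchResource(resourceR, "update from B", t1, seenByBoth); // B wins

let statusForA = 200;
try {
  patchResource(resourceR, "update from A", new Date("2025-09-03T18:00:06Z"), seenByBoth);
} catch (err) {
  const status = (err as { httpStatus?: number }).httpStatus;
  if (status === undefined) throw err;
  statusForA = status;
}
console.log(statusForA, resourceR.payload); // 412 update from B
```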

nitrosx · 2025-09-04T06:50:57Z

I agree with @cfelder

joe-baudisch · 2025-09-05T09:15:00Z

> Please switch around the if statements as suggested. Also make sure to include some API tests covering backward compatibility and the functionality itself.

Added a test that can be applied in a similar way to the other affected controllers.

joe-baudisch · 2025-09-08T14:04:24Z

Hello @cchndl, @cfelder, dear maintainers,

Do you need further testing?
Otherwise, I suggest merging the pull request into the master so I don't run into issues like the one with fdfd2de

The master branch has evolved, but it hasn't taken into account that the API call this.datasetsController.findByIdAndUpdate in src/published-data/published-data.v4.controller.ts requires an additional parameter, the headers.
It could be solved with a non-decorated call.

By the way:
If you're calling a decorated method just for its logic, it's usually better to extract that logic into a separate function.
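A minimal sketch of that advice, with decorators left out so it runs on its own. All names (`updateDatasetRecord`, the two controller classes, the in-memory `datasetStore`) are hypothetical and do not reflect the actual SciCat code.

```typescript
// Hypothetical names throughout; this is not the SciCat controller code.
type HeaderMap = Record<string, string>;

interface DatasetRecord {
  pid: string;
  title: string;
}

const datasetStore = new Map<string, DatasetRecord>([
  ["pid-1", { pid: "pid-1", title: "old" }],
]);

// Plain function holding the update logic: no decorators, no HTTP types.
function updateDatasetRecord(
  pid: string,
  changes: Partial<DatasetRecord>,
): DatasetRecord | null {
  const found = datasetStore.get(pid);
  if (!found) return null;
  const updated: DatasetRecord = { ...found, ...changes, pid };
  datasetStore.set(pid, updated);
  return updated;
}

class DatasetsControllerSketch {
  // In NestJS this method would carry the route and parameter decorators.
  findByIdAndUpdate(
    pid: string,
    headers: HeaderMap,
    dto: Partial<DatasetRecord>,
  ): DatasetRecord | null {
    void headers; // header-specific checks would live here
    return updateDatasetRecord(pid, dto);
  }
}

class PublishedDataControllerSketch {
  // Internal caller: uses the shared logic directly, no headers to fabricate.
  relabel(pid: string, title: string): DatasetRecord | null {
    return updateDatasetRecord(pid, { title });
  }
}

const viaHttp = new DatasetsControllerSketch().findByIdAndUpdate("pid-1", {}, { title: "patched" });
const viaInternal = new PublishedDataControllerSketch().relabel("pid-1", "relabelled");
console.log(viaHttp?.title, viaInternal?.title); // patched relabelled
```

With this split, a change to the HTTP-facing signature (such as the added headers parameter) does not ripple into internal callers.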

Now with de6c79a all tests are present and all tests were passed by running npx jest,
especially taking into account 10a639e .
Therefore, I again suggest merging the pull request into the master.

joe-baudisch · 2025-09-09T10:01:03Z

This one, 03a1e8a, was hard.

cchndl · 2025-09-12T11:16:44Z

From my testing, it looks good. Are there any other points, or do you think we can merge this @nitrosx?

joe-baudisch · 2025-10-02T17:20:13Z

It is not my fault that some tests in 8384ba2 failed.

Failing API tests with ElasticSearch enabled yes in `8384ba2` because of:

InstrumentFilter Tests (e.g. test/InstrumentsFilter.js:265, 307, 350, 375, 384, 393, 402):
- Error: expected "Content-Type" header field
- Solution: Ensure that your API responses for instrument filtering endpoints always set the "Content-Type" header, especially for JSON responses. In your controllers/middleware, add the following before sending a response:
```
res.set('Content-Type', 'application/json');
```

OrigDatablockForRawDataset & OrigDatablockV4 Tests (e.g. test/OrigDatablockForRawDataset.js:499, 558, test/OrigDatablockV4.js:866, 886):
- Error: expected 200 "OK", got 403 "Forbidden"
- Solution: These tests are being denied permission. Check your authorization logic for the endpoints handling origdatablock updates. Ensure the test users have the required roles/permissions in your test setup or mock data.

Proposal and RawDataset Tests (e.g. test/Proposal.js:359, test/RawDataset.js:310, 326, 342, 358, 371, 384, 397, test/RawDatasetDatablock.js:71):
- Errors: 404 "Not Found", 500 "Internal Server Error"
- Solution: For 404s, make sure the test data is seeded correctly and the IDs used exist. For 500s, check your update logic for null references or missing required fields, and add error handling/logging to clarify root causes.

Sample Authorization Tests (e.g. test/SampleAuthorization.js:3093, 3117, 3141, 3165, 3206, 3247, 3288, 3329, 3353):
- Error: expected 200 "OK", got 404 "Not Found"
- Solution: The sample being updated does not exist. Make sure your test database is seeded with all required samples before running these tests.

Failing API tests with ElasticSearch enabled no in `8384ba2` because of:

"expected 'Content-Type' header field" errors

Files & Lines:
- test/InstrumentsFilter.js:265, 307, 350, 375, 384, 393, 402 (examples)
Root Cause: The API responses are missing the Content-Type header, likely for JSON payloads.
Solution: In your API controller (where these endpoints are defined), ensure you set this header, for example:
```
res.set('Content-Type', 'application/json');
```
Add this line before sending responses related to instrument filtering.

"expected 200 'OK', got 403 'Forbidden'" errors

Files & Lines:
- test/OrigDatablockForRawDataset.js:499, 558
- test/OrigDatablockV4.js:866, 886
Root Cause: The tests are trying to update resources but lack the required permissions, or the authentication/authorization logic is incorrect.
Solution:
- Ensure test users have the right roles/permissions.
- Check your authorization middleware to confirm that users (admin or ownerGroup) are correctly allowed to perform updates.
- Review the related endpoint logic to ensure it respects the test setup.

"expected 200 'OK', got 404 'Not Found'" errors

Files & Lines:
- test/Proposal.js:359
- test/SampleAuthorization.js:3093, 3117, 3141, 3165, 3206, 3247, 3288, 3329, 3353
Root Cause: The test is attempting to update or reference entities that do not exist, possibly due to missing setup steps or cleanup from previous tests.
Solution:
- Make sure the test data (proposals, samples) are created before being updated.
- In your test files, verify that setup hooks (beforeEach, etc.) are actually creating the required entities and not failing silently.
- Example:
```
beforeEach(async () => {
  // createSample stands in for the test suite's own setup helper
  await createSample({ id: 1 /* plus the other required fields */ });
});
```
If the entity creation fails, the update will return 404.

"expected 200 'OK', got 500 'Internal Server Error'" and "expected 400 'Bad Request', got 500 'Internal Server Error'"

Files & Lines:
- test/RawDataset.js:310, 326, 342, 358, 371, 384, 397
- test/RawDatasetDatablock.js:71
Root Cause: The server is crashing during update attempts, likely due to missing properties, null references, or bad data.
Solution:
- Add error handling and input validation in the update endpoints for RawDataset and Datablock.
- In the failing tests, log the request data and the error messages to pinpoint what is going wrong.
- Example: Check for null or undefined values before accessing properties or saving to the database.

joe-baudisch · 2025-10-03T09:45:47Z

It's not my fault that some tests failed in 8384ba2 and 1990822

nitrosx

I left a few comments which might become change requests.

Also, we should think about submitting a follow-up PR introducing all the API tests covering all the use cases.
I will be happy to help you define the use cases.

nitrosx · 2025-10-03T11:42:08Z

src/datasets/datasets.v4.controller.ts

+  async findByIdAndUpdateInternal(
+    @Req() request: Request,
+    @Param("pid") pid: string,
+    @Headers() headers: Record<string, string>,
+    @Body()
+    updateDatasetDto: PartialUpdateDatasetDto,
+  ): Promise<OutputDatasetDto | null> {


Why do we define function findByIdAndUpdateInternal and call it only from findByIdAndUpdate?
Can we combine findByIdAndUpdate and findByIdAndUpdateInternal?

nitrosx · 2025-10-03T11:46:27Z

src/instruments/schemas/instrument.schema.ts

  },
 })
-export class Instrument {
+export class Instrument extends QueryableClass {


Have we updated the instrument output dto to provide the additional fields introduced by QueryableClass?

nitrosx · 2025-10-03T11:53:57Z

src/published-data/published-data.v4.controller.ts

I'm not sure that I would introduce this change here!
If you are creating a published data record, the datasets need to be public, no exception.

That said, I might not be knowledgeable about the background and context that you are operating in.
Can you explain better why you need the changes in this file?

bpedersen2

I have some change requests here:

- We should clean up this long string of commits by rebasing and possibly squashing some commits.
- I think the if-unmodified-since check should be implemented as a Pipe or Interceptor, so we do not get so much code duplication. It would also make it easier to implement a hash check later.

minottic

I very much agree with @bpedersen2's comment: if this were a pipe or an interceptor, a lot of duplication could be avoided.
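To illustrate the idea of keeping the check in one place, here is a framework-agnostic sketch. It deliberately avoids NestJS APIs so that it runs standalone; in NestJS the same check could sit in an interceptor or pipe. The names `assertUnmodifiedSince`, `withUnmodifiedSince`, `HeaderMap` and the "samples" store are all hypothetical.

```typescript
// Framework-agnostic sketch; all names here are hypothetical.
interface Timestamped {
  updatedAt: Date;
}
type HeaderMap = Record<string, string | undefined>;

class PreconditionFailedError extends Error {
  readonly httpStatus = 412;
}

// The single place where the header is interpreted.
function assertUnmodifiedSince(headers: HeaderMap, current: Timestamped): void {
  const raw = headers["if-unmodified-since"];
  if (raw === undefined) return; // header missing: update proceeds as before
  const headerMs = Date.parse(raw);
  if (Number.isNaN(headerMs)) return; // not a valid date: ignore the header
  const modifiedSec = Math.floor(current.updatedAt.getTime() / 1000);
  if (modifiedSec > Math.floor(headerMs / 1000)) {
    throw new PreconditionFailedError("Resource has been changed on server");
  }
}

// Wraps any "find, then update" pair so no controller needs its own copy.
function withUnmodifiedSince<T extends Timestamped, D>(
  find: (id: string) => T | undefined,
  update: (id: string, found: T, dto: D) => T,
): (id: string, headers: HeaderMap, dto: D) => T | undefined {
  return (id, headers, dto) => {
    const found = find(id);
    if (found === undefined) return undefined;
    assertUnmodifiedSince(headers, found);
    return update(id, found, dto);
  };
}

// Usage with a made-up "samples" store.
interface SampleRecord extends Timestamped {
  description: string;
}
const sampleStore = new Map<string, SampleRecord>([
  ["s1", { description: "old", updatedAt: new Date("2025-11-04T09:00:00Z") }],
]);
const patchSample = withUnmodifiedSince<SampleRecord, { description: string }>(
  (id) => sampleStore.get(id),
  (id, found, dto) => {
    // Fixed timestamp for the example; a server would use the current time.
    const next: SampleRecord = { ...found, ...dto, updatedAt: new Date("2025-11-04T09:00:10Z") };
    sampleStore.set(id, next);
    return next;
  },
);
```

A later hash (ETag) check could be added inside the same wrapper without touching the individual controllers.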

minottic · 2025-11-04T08:41:12Z

src/attachments/attachments.v4.controller.ts

+
+    if (headerDate && headerDate <= foundAttachment.updatedAt) {
+      throw new HttpException(
+        "Update error due to failed if-modified-since condition",


Maybe we should raise something more descriptive?

There are already the ConflictException and PreconditionFailedException classes available from Nest.
How about PreconditionFailedException('Resource has been changed on server')?

See also https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Status/412 for an ETag example.

bpedersen2 · 2025-11-04T09:41:00Z

As for using ETags, the hash should be computed over the resource, so we could add it to all GET requests and also return it on POST requests via an interceptor.

Another question: why is this done only for PATCH (which should only send the fields that you want to change anyway), but not for PUT, which would overwrite the complete entry?

cfelder · 2025-11-04T09:44:28Z

We have already discussed that the ETag implementation can be handled separately. I strongly suggest concentrating on the if-unmodified-since header implementation here, so that we do not further delay reaching a state where we can merge this.

joe-baudisch · 2025-11-04T11:21:58Z

I agree with #2138 (comment)

cchndl · 2025-11-04T13:50:24Z

> Another question: why is this done only for PATCH (which should only send the fields that you want to change anyway), but not for PUT, which would overwrite the complete entry?

We wanted to keep the scope small while still solving our problems. That's why it includes only the patch endpoints.

> (which should only send the fields that you want to change anyway)

Of course, if these fields are distinct sets, it already works with PATCH. But as discussed in the mentioned issue (linking it here again), the problem is with nested structures like the list of data files. There, the patch request will override the changes made by other clients, since there is no merging of lists. And if the changes touch the same fields, the same problem applies anyway.
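The lost-update problem with nested lists can be reproduced in a few lines. This is a sketch under the assumption that a PATCH behaves like a shallow merge, as described above; `StoredBlock`, `dataFileList` and `applyShallowPatch` are illustrative names only.

```typescript
// Hypothetical shapes; field names are illustrative only.
interface DataFile {
  path: string;
}
interface StoredBlock {
  ownerGroup: string;
  dataFileList: DataFile[];
}

// PATCH as a shallow merge: each field in the body replaces the stored field.
function applyShallowPatch(stored: StoredBlock, body: Partial<StoredBlock>): StoredBlock {
  return { ...stored, ...body };
}

let storedBlock: StoredBlock = {
  ownerGroup: "group1",
  dataFileList: [{ path: "a.h5" }],
};

// Both clients read the same state, then each appends one file locally.
const bodyFromA = { dataFileList: [...storedBlock.dataFileList, { path: "b.h5" }] };
const bodyFromB = { dataFileList: [...storedBlock.dataFileList, { path: "c.h5" }] };

storedBlock = applyShallowPatch(storedBlock, bodyFromA);
storedBlock = applyShallowPatch(storedBlock, bodyFromB); // silently drops b.h5

const finalPaths = storedBlock.dataFileList.map((file) => file.path).join(", ");
console.log(finalPaths); // a.h5, c.h5
```

With if-unmodified-since, the second request would instead be rejected with 412, and client B could re-read the list and retry.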

- integrating if-unmodified-since into dataset, sample
- if-unmodified-since into attachments,datablocks,proposals,datasets
- make headerDate from if-unmodified-since more robust
- resolve some conversations
- adding test
- fixing bug by defining controller method without decorator
- adding attachments.v4.controller test
- adding more tests
- adding final test with datasets.v4.controller_if-unmodified-since_.spec.ts
- merge two test-files into one
- lint fix
- lint fix

Junjiequan force-pushed the master branch from a6b04a5 to fd0c9f8 August 28, 2025 07:56

nitrosx requested changes Sep 2, 2025

joe-baudisch mentioned this pull request Sep 5, 2025

Test for considering header field if-unmodified-since for different patch requests #2179

Closed

joe-baudisch requested a review from nitrosx September 15, 2025 12:12

joe-baudisch requested a review from a team as a code owner September 25, 2025 11:54

nitrosx requested changes Oct 3, 2025

bpedersen2 requested changes Nov 4, 2025

minottic reviewed Nov 4, 2025

joe-baudisch force-pushed the master branch from fb0332c to a80f75b November 5, 2025 13:35

joe-baudisch force-pushed the master branch from a80f75b to cf11d43 November 5, 2025 14:19
