fix: cancel timed out requests #65

MQ37 · 2025-03-26T10:35:56Z

Since we cannot remove requests that are already being crawled, we just return the dataset ID on timeout so the user knows where to check for results, as we discussed.

closes #31

jirispilka · 2025-03-27T07:39:54Z

On second thought, this goes against the idea of standby mode — it's meant to be request/response. No one will check the dataset, and that's not the goal.

It would be better to fix the issue by terminating the running request and returning empty results.
I can't imagine a use case where someone would use it this way.

Can you?

MQ37 · 2025-03-27T08:04:27Z

Agreed, not many people will check the dataset; they will mainly consume the response itself. I will check whether there is a simple way to skip the request or prevent it from being crawled.

MQ37 · 2025-03-27T09:44:14Z

@jirispilka Changed the implementation to cancel the requests of timed-out responses, based on the discussion in apify/crawlee#1215. This way the request content is not handled; the requests are only added to the dataset with status failed and reason timed out.

matyascimbulka

I also agree that in standby mode the dataset is irrelevant. Thank you, the solution looks good. I just have a few small comments.

The biggest issue for me is that the bounded array doesn't survive migrations which could potentially leave some requests running. @jirispilka Do you think this could be an issue?
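For context, a bounded array in this sense can be pictured as a fixed-capacity list that drops its oldest entry once it is full. The sketch below is only an assumption about its shape, not the class from this PR; the `add` and `includes` method names are hypothetical. Because such a structure lives purely in memory, a migrated container would start with an empty list, which is the concern raised above.

```typescript
// Hypothetical fixed-capacity FIFO list; the BoundedArray in this PR may differ.
class BoundedArray<T> {
    private readonly capacity: number;
    private items: T[] = [];

    constructor(capacity: number) {
        if (!Number.isInteger(capacity) || capacity <= 0) {
            throw new RangeError('capacity must be a positive integer');
        }
        this.capacity = capacity;
    }

    // Append an item and evict the oldest one if the capacity is exceeded.
    add(item: T): void {
        this.items.push(item);
        if (this.items.length > this.capacity) {
            this.items.shift();
        }
    }

    // Check whether the item is still remembered.
    includes(item: T): boolean {
        return this.items.includes(item);
    }

    get length(): number {
        return this.items.length;
    }
}
```

With a capacity of 2, adding three response IDs keeps only the two most recent ones, so very old timeouts are eventually forgotten.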

And there are 2 lint errors.

src/helpers/bouded-array.ts

src/request-handler.ts

src/state.ts

src/responses.ts

matyascimbulka · 2025-03-27T14:50:18Z

@MQ37 I'm looking at the code again and I'm wondering if we can't just check if responseData in responses.ts has the responseId key when handling the request in the handler. That way we wouldn't need to manage another new array and it would survive migrations.

MQ37 · 2025-03-27T14:56:10Z

This might be simpler, but we would still need to handle the migration, right? Or is the response data handled on migration?

matyascimbulka · 2025-03-27T15:23:05Z

I think migrations would work as expected. The responseData map would be initialized empty, so any resurrected requests wouldn't find their responseId among its keys. The missing key would signal the handler to set request.noRetry = true and throw an error.
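A minimal sketch of that check, assuming the handler can read a `responseId` from `request.userData` and that `responseData` is a `Map` keyed by response ID. The `CrawlRequest` shape and the `assertResponsePending` helper are hypothetical and only illustrate the idea; `noRetry` mirrors the property mentioned above.

```typescript
// Hypothetical minimal request shape; the real handler receives a crawler request object.
interface CrawlRequest {
    url: string;
    noRetry?: boolean;
    userData: { responseId: string };
}

// In-memory map of responses that are still awaited.
// After a migration it starts out empty, so resurrected requests find no key.
const responseData = new Map<string, unknown>();

// Call at the top of the request handler, before any content is processed.
function assertResponsePending(request: CrawlRequest): void {
    const { responseId } = request.userData;
    if (responseData.has(responseId)) {
        return;
    }
    // Missing key: the response timed out or was lost in a migration.
    request.noRetry = true;
    throw new Error(`Response ${responseId} is no longer pending; skipping ${request.url}`);
}
```

Since the lookup relies on the same map that already tracks responses, no separate list of timed-out IDs has to be maintained.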

MQ37 · 2025-03-27T15:30:16Z

I guess the responseData approach would be simpler, thank you for the suggestion 👍

What actually happens to the user's request to the /search endpoint when the Actor migrates? Does Apify persist the connection and hand it over to the new Actor container, or is it closed? If it is closed, we do not actually need to handle this case.

MQ37 · 2025-03-27T15:42:57Z

Ahh, I see. We just send an "Actor is migrating, please try again" message and cut the connection.

MQ37 · 2025-03-27T15:46:27Z

@matyascimbulka refactored based on your suggestion and the implementation is much simpler, thank you 👍
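The timeout side of the refactored approach could look roughly like the sketch below, assuming that a timed-out response is simply removed from the `responseData` map so that later handler lookups miss. The function names (`registerResponse`, `expireResponse`) and the entry shape are hypothetical and are not taken from this PR.

```typescript
// Hypothetical entry: callbacks that settle the waiting caller, plus the timeout timer.
interface PendingResponse {
    resolve: (body: string) => void;
    reject: (error: Error) => void;
    timer: ReturnType<typeof setTimeout>;
}

const responseData = new Map<string, PendingResponse>();

// Remove the entry and reject the waiting caller.
// Returns false if the response was already settled or never registered.
function expireResponse(responseId: string): boolean {
    const entry = responseData.get(responseId);
    if (!entry) {
        return false;
    }
    clearTimeout(entry.timer);
    responseData.delete(responseId);
    entry.reject(new Error(`Response ${responseId} timed out`));
    return true;
}

// Register a pending response that expires on its own after timeoutMs.
function registerResponse(responseId: string, timeoutMs: number): Promise<string> {
    return new Promise<string>((resolve, reject) => {
        const timer = setTimeout(() => expireResponse(responseId), timeoutMs);
        responseData.set(responseId, { resolve, reject, timer });
    });
}
```

Once the key is gone, any request that still carries that response ID fails the lookup in the handler and is dropped without a retry.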

matyascimbulka

You're welcome. I'm happy to help. This looks good.

jirispilka

@MQ37 @matyascimbulka thank you guys!

@MQ37 please just update the CHANGELOG.md and we are good to go.

* Revert "chore: Revert "fix: cancel timed out requests (#65)" (#67)"

  This reverts commit 7d66686.
* only cancel requests for standby actors
* Update CHANGELOG.md
* Update CHANGELOG.md

return dataset it on request timeout

a1d1853

MQ37 requested a review from jirispilka March 26, 2025 10:35

Handle timed out responses and cancel requests

bf15792

MQ37 changed the title ~~fix: return dataset it on request timeout~~ fix: cancel timed out requests Mar 27, 2025

Add BoundedArray class to manage timed out responses

11ffac9

MQ37 requested a review from matyascimbulka March 27, 2025 09:40

matyascimbulka requested changes Mar 27, 2025

Remove BoundedArray and refactor timeout handling to use responseData

38db8ce

MQ37 requested a review from matyascimbulka March 27, 2025 15:47

Remove unused constant TIMED_OUT_RESPONSE_ARRAY_SIZE

40c55c0

matyascimbulka approved these changes Mar 27, 2025

jirispilka approved these changes Mar 27, 2025

Update CHANGELOG for version 1.0.13

144081b

MQ37 merged commit d9eddc7 into master Mar 27, 2025
1 check passed

MQ37 added a commit that referenced this pull request Mar 27, 2025

Revert "fix: cancel timed out requests (#65)"

19bfbb6

This reverts commit d9eddc7.

MQ37 mentioned this pull request Mar 27, 2025

chore: Revert "fix: cancel timed out requests" #67

Merged

MQ37 added a commit that referenced this pull request Mar 27, 2025

chore: Revert "fix: cancel timed out requests (#65)" (#67)

7d66686

This reverts commit d9eddc7.

MQ37 added a commit that referenced this pull request Mar 27, 2025

Revert "chore: Revert "fix: cancel timed out requests (#65)" (#67)"

e1f022e

This reverts commit 7d66686.
