Skip to content

Remove hardcoded rule for handling erroneous status codes differently#1687

Merged
thomas-zahner merged 20 commits intolycheeverse:masterfrom
thomas-zahner:1661-update-accept-behaviour
Jun 11, 2025
Merged

Remove hardcoded rule for handling erroneous status codes differently#1687
thomas-zahner merged 20 commits intolycheeverse:masterfrom
thomas-zahner:1661-update-accept-behaviour

Conversation

@thomas-zahner
Copy link
Copy Markdown
Member

Closes #1661

@thomas-zahner
Copy link
Copy Markdown
Member Author

So I tracked down the code that is responsible for the current behaviour. Turns out it has nothing to do with clap, default values or range parsing. Instead it seems like a design decision made in the past. Also it's really not much code that is responsible for that.

Basically, if a HTTP code is not explicitly accepted (via --accept or its default value) we use error_for_status_ref to handle erroneous status codes in the code.

Removing this explicit handling would also imply that we remove the Status::Redirected and Status::UnknownStatusCode variants.

Also I wasn't sure at first if the new variant should be added to the Status or ErrorKind struct. But since I think it should behave like a "normal" error I've added it to ErrorKind.

This means it's not counted like unknowns as a new category:

🔍 1 Total (in 0s) ✅ 0 OK 🚫 0 Errors ❓ 1 Unknown

But instead like normal errors:

$ echo 'https://httpstat.us/200' | cargo run - --accept 201 --verbose

   [ERROR] https://httpstat.us/200 | Rejected status code: 200 OK (this depends on your "accept" configuration)

Issues found in 1 input. Find details below.

[stdin]:
   [ERROR] https://httpstat.us/200 | Rejected status code: 200 OK (this depends on your "accept" configuration)

🔍 1 Total (in 0s) ✅ 0 OK 🚫 1 Error

@mre what are your thoughts on this. Is this approach okay with you? If so, we should of course update the documentation accordingly and mention the breaking changes in the release notes.

@mre
Copy link
Copy Markdown
Member

mre commented May 9, 2025

I agree.

The rule should be that Status is strictly set by the checking logic and should directly relate to a website's status (as in "status code"). It's nothing that lychee should temper with.
The error part is mostly for expressing how we handle the status that comes back from the website. So, yeah, if a user defines that they don't want to accept a certain status code as "valid", that should surface as an error.

@thomas-zahner thomas-zahner force-pushed the 1661-update-accept-behaviour branch from 9c7c15e to d53ef3b Compare May 9, 2025 15:09
@thomas-zahner thomas-zahner marked this pull request as ready for review May 12, 2025 09:51
@thomas-zahner
Copy link
Copy Markdown
Member Author

@mre The change has more implications than I initially expected. What do you think about the PR?

@thomas-zahner thomas-zahner requested a review from mre May 12, 2025 11:35
Comment thread lychee-lib/src/client.rs Outdated
Comment thread lychee-lib/src/retry.rs
}) = self.github_error()
{
source.should_retry()
} else if let Self::RejectedStatusCode(StatusCode::TOO_MANY_REQUESTS) = self {
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And not quite sure if this the optimal approach

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, not sure about this. Doesn't that mean: "if a user decided to reject a 429 Too Many Requests status code, you should still retry"? To me, that sounds like the user should be in control here; if they decided to consider 429s as hard errors, so be it. 😉 That should be the final answer, and we shouldn't retry. So I'd remove that condition and see where it breaks. (Unless I'm misunderstanding.)

Copy link
Copy Markdown
Member Author

@thomas-zahner thomas-zahner May 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So actually I think we should leave it like this. Otherwise it means that lychee will never retry requests when a 429 occurs. The reason I noticed it in the first place was because without these lines there are test failures. If we don't keep these two lines that would mean that --max-retries would now only apply to the above and below cases. (octocrab and reqwest errors) The reason for this is again the removal of response.error_for_status_ref as explained in my other comment. So previously we entered the if let Some(r) = self.reqwest_error() case.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. Can we add a comment to explain that behavior. I'm sure I'll forget why it's there otherwise. 😅

Copy link
Copy Markdown
Member

@mre mre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some remarks. Overall, I like the approach and would go forward with this.

Comment thread lychee-lib/src/checker/website.rs Outdated
Comment thread lychee-lib/src/client.rs Outdated
Comment thread lychee-lib/src/client.rs
Comment thread lychee-lib/src/retry.rs
}) = self.github_error()
{
source.should_retry()
} else if let Self::RejectedStatusCode(StatusCode::TOO_MANY_REQUESTS) = self {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, not sure about this. Doesn't that mean: "if a user decided to reject a 429 Too Many Requests status code, you should still retry"? To me, that sounds like the user should be in control here; if they decided to consider 429s as hard errors, so be it. 😉 That should be the final answer, and we shouldn't retry. So I'd remove that condition and see where it breaks. (Unless I'm misunderstanding.)

Comment thread lychee-lib/src/types/status.rs
@thomas-zahner thomas-zahner force-pushed the 1661-update-accept-behaviour branch from c783386 to 2a03fd8 Compare May 23, 2025 09:49
@thomas-zahner thomas-zahner requested a review from mre May 23, 2025 14:20
@thomas-zahner thomas-zahner force-pushed the 1661-update-accept-behaviour branch from 7fc073e to d909cff Compare May 23, 2025 14:26
@thomas-zahner thomas-zahner force-pushed the 1661-update-accept-behaviour branch from d909cff to ef79f7b Compare June 4, 2025 12:52
Comment thread lychee-lib/src/types/error.rs Outdated
Co-authored-by: Matthias Endler <matthias@endler.dev>
@thomas-zahner thomas-zahner force-pushed the 1661-update-accept-behaviour branch from d3ac9fa to 98f3dcf Compare June 11, 2025 07:14
Copy link
Copy Markdown
Member

@mre mre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Can we just add one more cli test to ensure that --accept overrides the defaults? Like so?

#[tokio::test]
async fn test_accept_overrides_defaults_not_additive() -> Result<()> {
    let mock_server_200 = mock_server!(StatusCode::OK);
    
    let mut cmd = main_command();
    cmd.arg("--accept")
        .arg("404")  // ONLY accept 404 - should reject normally-accepted 200
        .arg("-")
        .write_stdin(mock_server_200.uri())
        .assert()
        .failure()
        .code(2)
        .stdout(contains(format!(
            r#"[200] {}/ | Rejected status code (this depends on your "accept" configuration): OK"#,
            mock_server_200.uri()
        )));

    Ok(())
}

@thomas-zahner
Copy link
Copy Markdown
Member Author

Thanks for the test. For some reason I forgot about that, this makes a lot of sense!

@thomas-zahner thomas-zahner merged commit e59456b into lycheeverse:master Jun 11, 2025
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Update --accept behaviour

2 participants