Forbid usage of some complex Option:: methods. #5754

Open
wants to merge 1 commit into
base: main
13 changes: 12 additions & 1 deletion quickwit/clippy.toml
@@ -1,6 +1,17 @@
disallowed-methods = [
"std::path::Path::exists"
# This function is not sound because it does not return a Result
"std::path::Path::exists",
# These functions hurt readability (according to Paul)
"std::option::Option::is_some_and",
"std::option::Option::is_none_or",
"std::option::Option::xor",
# "std::option::Option::and_then",
# .map(..).unwrap_or(..) or let Some(..) else {..}
"std::option::Option::map_or",
# .map(..).unwrap_or_else(..) or let Some(..) else {..}
"std::option::Option::map_or_else",
Comment on lines +4 to +12
Contributor

i don't agree with this list (except probably for xor). i generally like functional-style operation chaining. Can other people weigh in on what we should do? cc @guilload

Member

@guilload Apr 17, 2025

I don't think we should vote on this issue, because the ~~U.S.~~ western democracies show us that it does not work. Sorry, I'm feeling cheeky this morning.

I also generally like functional-style operation chaining, so I'm biased. However, I do agree that Rust is taking it too far sometimes.

I also don't think we should take this issue too seriously because once the conversation is about personal taste and preference, it's hard to have a productive exchange.

Generally, I'm fine with allowing none of the functional constructs, some of them, or all of them. In the first case, it might make the code more readable for some people. In the second, we try to strike a balance and code might be more enjoyable to write for people like Trinity and me. Finally, in the third case, it's just beneficial for Paul's brain plasticity ;)

Since you two are the main contributors to Quickwit, I invite you to have a face-to-face call to find a middle ground that works for both of you.

Collaborator

For me they are all hard to read except is_some_and.

Generally map + unwrap_or is very readable to me (and also kinda standardized)

I think this should not be about personal taste, but about readability, e.g. is_some_and is readable to me, but I don't like it.

Contributor

okay, so let's replace .map_or() and its variants with .map().unwrap_or() then

]

ignore-interior-mutability = [
"bytes::Bytes",
"bytestring::ByteString",
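
For reference, a minimal sketch of the rewrite settled on in the review thread above, where `.map_or(default, f)` becomes `.map(f).unwrap_or(default)`. The function names and the plain `usize` inputs are hypothetical stand-ins for the `replication.rs` test code touched by this PR; only the two `Option` chains come from the diff.

```rust
/// Hypothetical helper: the position arithmetic written with `Option::map_or`,
/// which the proposed `clippy.toml` would disallow.
fn inclusive_position_map_or(from_exclusive: Option<usize>, batch_len: usize) -> usize {
    from_exclusive.map_or(batch_len - 1, |pos| pos + batch_len)
}

/// Same computation with the `.map(..).unwrap_or(..)` chain used in the PR.
fn inclusive_position_map_unwrap_or(from_exclusive: Option<usize>, batch_len: usize) -> usize {
    from_exclusive
        .map(|pos| pos + batch_len)
        .unwrap_or(batch_len - 1)
}
```

Both forms take the default by value, so `batch_len - 1` is evaluated eagerly either way and the rewrite should not change behaviour.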
3 changes: 1 addition & 2 deletions quickwit/quickwit-common/src/uri.rs
@@ -14,7 +14,6 @@

use std::borrow::Cow;
use std::env;
use std::ffi::OsStr;
use std::fmt::{Debug, Display};
use std::hash::Hash;
use std::path::{Component, Path, PathBuf};
@@ -126,7 +125,7 @@ impl Uri {

/// Returns the extension of the URI.
pub fn extension(&self) -> Option<&str> {
Path::new(&self.uri).extension().and_then(OsStr::to_str)
Path::new(&self.uri).extension()?.to_str()
}

/// Returns the URI as a string slice.
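
A small sketch of why the `uri.rs` change above should be behaviour-preserving: inside a function returning `Option`, `?` returns `None` early, which is what `and_then` did when there was no extension. The free functions and sample URIs are illustrative assumptions, not Quickwit's `Uri` type.

```rust
use std::ffi::OsStr;
use std::path::Path;

/// Old form: chain the fallible `OsStr::to_str` conversion with `and_then`.
fn extension_and_then(uri: &str) -> Option<&str> {
    Path::new(uri).extension().and_then(OsStr::to_str)
}

/// New form: `?` short-circuits with `None` when there is no extension.
fn extension_question_mark(uri: &str) -> Option<&str> {
    Path::new(uri).extension()?.to_str()
}
```

The `?` version also drops the `std::ffi::OsStr` import, which matches the import removed in the first hunk of this file.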
3 changes: 2 additions & 1 deletion quickwit/quickwit-ingest/src/ingest_v2/replication.rs
@@ -893,7 +893,8 @@ mod tests {
let replication_position_inclusive = subrequest
.from_position_exclusive()
.as_usize()
.map_or(batch_len - 1, |pos| pos + batch_len);
.map(|pos| pos + batch_len)
.unwrap_or(batch_len - 1);
ReplicateSuccess {
subrequest_id: subrequest.subrequest_id,
index_uid: subrequest.index_uid.clone(),
1 change: 1 addition & 0 deletions quickwit/quickwit-ingest/src/lib.rs
@@ -18,6 +18,7 @@ mod doc_batch;
pub mod error;
mod ingest_api_service;
#[path = "codegen/ingest_service.rs"]
#[allow(clippy::disallowed_methods)]
mod ingest_service;
mod ingest_v2;
mod memory_capacity;
10 changes: 6 additions & 4 deletions quickwit/quickwit-janitor/src/janitor_service.rs
@@ -40,11 +40,13 @@ impl JanitorService {
}

fn is_healthy(&self) -> bool {
self.delete_task_service_handle
.as_ref()
.is_none_or(|delete_task_service_handle| {
let delete_task_is_not_failure: bool =
if let Some(delete_task_service_handle) = &self.delete_task_service_handle {
delete_task_service_handle.state() != ActorState::Failure
})
} else {
true
};
delete_task_is_not_failure
&& self.garbage_collector_handle.state() != ActorState::Failure
&& self.retention_policy_executor_handle.state() != ActorState::Failure
}
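
To make the `is_healthy` rewrite above easier to check, here is a hedged sketch of the "absent handle counts as healthy" rule in two of the styles discussed in this PR. `ActorState` is reduced to a hypothetical two-variant enum, and the functions take the state directly instead of an actor handle.

```rust
/// Hypothetical stand-in for the actor state; only the variants needed here.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ActorState {
    Running,
    Failure,
}

/// `if let` form, as in the new side of the diff:
/// a missing delete-task service is treated as "not failed".
fn delete_task_ok_if_let(state_opt: Option<ActorState>) -> bool {
    if let Some(state) = state_opt {
        state != ActorState::Failure
    } else {
        true
    }
}

/// `.map(..).unwrap_or(..)` form, the chain the thread found readable.
/// The removed code used `Option::is_none_or` with the same predicate.
fn delete_task_ok_map(state_opt: Option<ActorState>) -> bool {
    state_opt
        .map(|state| state != ActorState::Failure)
        .unwrap_or(true)
}
```

Both functions return `true` for `None` and for any non-failure state, and `false` only for `Some(ActorState::Failure)`.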
2 changes: 1 addition & 1 deletion quickwit/quickwit-proto/src/lib.rs
@@ -13,7 +13,7 @@
// limitations under the License.

#![allow(clippy::derive_partial_eq_without_eq)]
#![deny(clippy::disallowed_methods)]
#![allow(clippy::disallowed_methods)]
#![allow(rustdoc::invalid_html_tags)]

use std::cmp::Ordering;
61 changes: 32 additions & 29 deletions quickwit/quickwit-search/src/leaf_cache.rs
@@ -12,7 +12,7 @@
// See the License for the specific language governing permissions and
// limitations under the License.

use std::ops::Bound;
use std::ops::{Bound, RangeBounds};

use prost::Message;
use quickwit_proto::search::{
@@ -83,16 +83,16 @@ struct CacheKey {
request: SearchRequest,
/// The effective time range of the request, that is, the intersection of the timerange
/// requested, and the timerange covered by the split.
merged_time_range: Range,
merged_time_range: HalfOpenRange,
}

impl CacheKey {
fn from_split_meta_and_request(
split_info: SplitIdAndFooterOffsets,
mut search_request: SearchRequest,
) -> Self {
let split_time_range = Range::from_bounds(split_info.time_range());
let request_time_range = Range::from_bounds(search_request.time_range());
let split_time_range = HalfOpenRange::from_bounds(split_info.time_range());
let request_time_range = HalfOpenRange::from_bounds(search_request.time_range());
let merged_time_range = request_time_range.intersect(&split_time_range);

search_request.start_timestamp = None;
@@ -110,28 +110,30 @@ impl CacheKey {
}

/// A (half-open) range bounded inclusively below and exclusively above [start..end).
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct Range {
#[derive(Debug, Copy, Clone, PartialEq, Eq, Hash)]
struct HalfOpenRange {
start: i64,
end: Option<i64>,
}

impl Range {
/// Create a Range from bounds.
fn from_bounds(range: impl std::ops::RangeBounds<i64>) -> Self {
let empty_range = Range {
impl HalfOpenRange {
fn empty_range() -> HalfOpenRange {
HalfOpenRange {
start: 0,
end: Some(0),
};
}
}

/// Create a Range from bounds.
fn from_bounds(range: impl RangeBounds<i64>) -> Self {
let start = match range.start_bound() {
Bound::Included(start) => *start,
Bound::Excluded(start) => {
// if we exclude i64::MAX from the start bound, the range is necessarily empty
if let Some(start) = start.checked_add(1) {
start
} else {
return empty_range;
return Self::empty_range();
}
}
Bound::Unbounded => i64::MIN,
@@ -143,44 +145,45 @@ impl Range {
Bound::Unbounded => None,
};

Range { start, end }
HalfOpenRange { start, end }.normalize()
}

fn is_empty(self) -> bool {
!self.contains(&self.start)
}

/// Normalize empty ranges to be 0..0
fn normalize(self) -> Range {
let empty_range = Range {
start: 0,
end: Some(0),
};
match self {
Range {
start,
end: Some(end),
} if start >= end => empty_range,
any => any,
fn normalize(self) -> HalfOpenRange {
if self.is_empty() {
Self::empty_range()
} else {
self
}
}

/// Return the intersection of self and other.
fn intersect(&self, other: &Range) -> Range {
fn intersect(&self, other: &HalfOpenRange) -> HalfOpenRange {
let start = self.start.max(other.start);

let end = match (self.end, other.end) {
(Some(this), Some(other)) => Some(this.min(other)),
(Some(this), None) => Some(this),
(None, other) => other,
};
Range { start, end }.normalize()
HalfOpenRange { start, end }.normalize()
}
}

impl std::ops::RangeBounds<i64> for Range {
impl RangeBounds<i64> for HalfOpenRange {
fn start_bound(&self) -> Bound<&i64> {
Bound::Included(&self.start)
}

fn end_bound(&self) -> Bound<&i64> {
self.end.as_ref().map_or(Bound::Unbounded, Bound::Excluded)
if let Some(end_bound) = &self.end {
Bound::Excluded(end_bound)
} else {
Bound::Unbounded
}
}
}
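
Because the hunks above interleave old and new lines, a standalone sketch of the new side may help when reviewing `is_empty`, `normalize`, and `intersect`. It is reassembled from the diff under the assumption that the pieces fit together as shown; `from_bounds` is omitted and the values in the usage note are made up.

```rust
use std::ops::{Bound, RangeBounds};

/// Half-open range `[start..end)`; `end == None` means unbounded above.
#[derive(Debug, Copy, Clone, PartialEq, Eq, Hash)]
struct HalfOpenRange {
    start: i64,
    end: Option<i64>,
}

impl HalfOpenRange {
    /// Canonical representation of every empty range.
    fn empty_range() -> HalfOpenRange {
        HalfOpenRange { start: 0, end: Some(0) }
    }

    /// A half-open range is empty exactly when it does not contain its own start.
    fn is_empty(self) -> bool {
        !self.contains(&self.start)
    }

    /// Normalize empty ranges to be 0..0 so that equal ranges hash equally.
    fn normalize(self) -> HalfOpenRange {
        if self.is_empty() {
            Self::empty_range()
        } else {
            self
        }
    }

    /// Return the intersection of self and other.
    fn intersect(&self, other: &HalfOpenRange) -> HalfOpenRange {
        let start = self.start.max(other.start);
        let end = match (self.end, other.end) {
            (Some(this), Some(other)) => Some(this.min(other)),
            (Some(this), None) => Some(this),
            (None, other) => other,
        };
        HalfOpenRange { start, end }.normalize()
    }
}

impl RangeBounds<i64> for HalfOpenRange {
    fn start_bound(&self) -> Bound<&i64> {
        Bound::Included(&self.start)
    }

    fn end_bound(&self) -> Bound<&i64> {
        if let Some(end_bound) = &self.end {
            Bound::Excluded(end_bound)
        } else {
            Bound::Unbounded
        }
    }
}
```

With this sketch, intersecting `0..10` with `5..` gives `5..10`, while intersecting `0..10` with `20..30` has `start >= end` and is normalized to `0..0`.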

13 changes: 5 additions & 8 deletions quickwit/quickwit-search/src/root.rs
@@ -1519,10 +1519,8 @@ impl ExtractTimestampRange<'_> {
// a match_none, but the visitor doesn't allow mutation.
lower_bound = lower_bound.saturating_add(1);
}
self.start_timestamp = Some(
self.start_timestamp
.map_or(lower_bound, |current| current.max(lower_bound)),
);

self.start_timestamp = self.start_timestamp.max(Some(lower_bound));
Contributor

i know this is correct, but i don't find relying on the Ord impl of Option easy to read. If we're moving away from map_or (which i don't think we should ban), i prefer the more verbose "min" variant on line 1540 or, even better, a .map(max).unwrap_or(val) (you might not like that though)

rationale: to me, this operation is "take the larger of the values, or lower_bound if start_timestamp is unset", and this is how the code used to read. With just max, you need some thinking to make sure this is correct, and there isn't really a human sentence to describe what it does. With .unwrap_or(bound).max(bound) we say "take the current value, or the bound if unset, and compare that to the bound", which is the same, but with stranger wording
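
A short sketch of the `Ord` behaviour this comment refers to, in case it helps other reviewers. The standard library orders `None` before every `Some(_)`, so `max` against `Some(bound)` handles the unset case, while the mirrored `min` would not. The function names are hypothetical; the expressions come from the two hunks in this file.

```rust
/// Start bound: `None < Some(_)`, so `max` picks the bound when unset
/// and the larger value otherwise.
fn tighten_start(current: Option<i64>, lower_bound: i64) -> Option<i64> {
    current.max(Some(lower_bound))
}

/// End bound, as written in the PR: default to the bound, then take the min.
fn tighten_end(current: Option<i64>, upper_bound: i64) -> Option<i64> {
    Some(current.unwrap_or(upper_bound).min(upper_bound))
}

/// Naive mirror of `tighten_start`. Because `None` is the smallest value,
/// an unset end stays unset and the bound is silently ignored.
fn tighten_end_naive(current: Option<i64>, upper_bound: i64) -> Option<i64> {
    current.min(Some(upper_bound))
}
```

This asymmetry is presumably why the end-timestamp hunk uses `unwrap_or(..).min(..)` rather than `min(Some(..))`.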

Member

If we're optimizing for Paul's definition of readability, we should rename start_timestamp to start_timestamp_opt and then:

self.start_timestamp_opt = if let Some(start_timestamp) = self.start_timestamp_opt {
  Some(start_timestamp.max(lower_bound))
} else {
  Some(lower_bound)
};

}

fn update_end_timestamp(&mut self, upper_bound: &quickwit_query::JsonLiteral, included: bool) {
@@ -1537,10 +1535,9 @@ impl ExtractTimestampRange<'_> {
// a match_none, but the visitor doesn't allow mutation.
upper_bound = upper_bound.saturating_add(1);
}
self.end_timestamp = Some(
self.end_timestamp
.map_or(upper_bound, |current| current.min(upper_bound)),
);

let new_end_timestamp = self.end_timestamp.unwrap_or(upper_bound).min(upper_bound);
self.end_timestamp = Some(new_end_timestamp);
}
}
