refactor: custom lexer #437
base: main
Conversation
@@ -24,6 +24,7 @@ biome_rowan = "0.5.7"
 biome_string_case = "0.5.8"
 bpaf = { version = "0.9.15", features = ["derive"] }
 crossbeam = "0.8.4"
+enum-iterator = "2.1.0"
We already have strum; I'd guess it does the same thing.
let mut ends_with_semicolon = false;

// Iterate through tokens in reverse to find the last non-whitespace token
for idx in (0..lexed.len()).rev() {
How about `matches!(iter.filter(..).next_back(), Some(semi))` here?
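The reviewer's suggestion can be sketched as follows. This is a minimal, self-contained version with hypothetical token kinds standing in for the crate's actual `SyntaxKind`: instead of a manual reverse loop with a mutable flag, filter out trivia and take the last remaining token with `next_back()`.

```rust
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum SyntaxKind {
    Semicolon,
    Whitespace,
    Ident,
}

// Reviewer's variant: filter out whitespace, then check the last remaining
// token in a single `matches!` expression.
fn ends_with_semicolon(tokens: &[SyntaxKind]) -> bool {
    matches!(
        tokens
            .iter()
            .copied()
            .filter(|k| *k != SyntaxKind::Whitespace)
            .next_back(),
        Some(SyntaxKind::Semicolon)
    )
}

fn main() {
    assert!(ends_with_semicolon(&[
        SyntaxKind::Ident,
        SyntaxKind::Semicolon,
        SyntaxKind::Whitespace
    ]));
    assert!(!ends_with_semicolon(&[SyntaxKind::Ident]));
}
```

Since slice iterators (and `filter` over them) implement `DoubleEndedIterator`, `next_back()` walks from the end without scanning the whole token list.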
/// Returns an iterator over token kinds
pub fn tokens(&self) -> impl Iterator<Item = SyntaxKind> + '_ {
    (0..self.len()).map(move |i| self.kind(i))
`self.kind.iter().copied()`?
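The suggested change, sketched with assumed field and type names: if the kinds are stored in a `Vec`, `tokens()` can iterate that buffer directly instead of re-deriving each kind from an index.

```rust
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum SyntaxKind {
    Ident,
    Semicolon,
}

// Assumed layout: the lexer stores one kind per token.
struct Lexed {
    kind: Vec<SyntaxKind>,
}

impl Lexed {
    // Reviewer's variant: no index round-trip, just a copying iterator
    // over the backing buffer.
    fn tokens(&self) -> impl Iterator<Item = SyntaxKind> + '_ {
        self.kind.iter().copied()
    }
}

fn main() {
    let lexed = Lexed {
        kind: vec![SyntaxKind::Ident, SyntaxKind::Semicolon],
    };
    let kinds: Vec<_> = lexed.tokens().collect();
    assert_eq!(kinds, vec![SyntaxKind::Ident, SyntaxKind::Semicolon]);
}
```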
/// Returns the kind of token at the given index
pub fn kind(&self, idx: usize) -> SyntaxKind {
    assert!(idx < self.len());
Do you want to add a message here? Otherwise it might be better to just let the access in line 53 panic; then you'd at least get an index-out-of-bounds message, right?
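The two options being weighed, side by side (hypothetical helper names, plain `u16` standing in for `SyntaxKind`): an assert with an explicit message, versus dropping the assert and letting the indexing itself produce the standard out-of-bounds panic.

```rust
// Option 1: assert with a custom message before indexing.
fn kind_with_message(kinds: &[u16], idx: usize) -> u16 {
    assert!(
        idx < kinds.len(),
        "token index {idx} out of bounds, len is {}",
        kinds.len()
    );
    kinds[idx]
}

// Option 2: no assert; `kinds[idx]` already panics with
// "index out of bounds: the len is N but the index is M".
fn kind_plain(kinds: &[u16], idx: usize) -> u16 {
    kinds[idx]
}

fn main() {
    let kinds = [1u16, 2, 3];
    assert_eq!(kind_with_message(&kinds, 1), 2);
    assert_eq!(kind_plain(&kinds, 2), 3);
}
```

A bare `assert!(idx < self.len())` is the worst of both: it panics with only the assertion text, hiding the offending index that the plain slice access would report.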
    .collect()
}

pub(crate) fn text_range(&self, i: usize) -> std::ops::Range<usize> {
I think in every case the `std::ops::Range<usize>` gets mapped to a `TextRange`; maybe it would be better to put that logic into `range(..)`?
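A sketch of that idea, with assumed names (a minimal `TextRange` stands in for the one from biome's rowan layer): since every caller converts the raw range anyway, a `range(..)` method can do the conversion once.

```rust
use std::ops::Range;

// Stand-in for the real TextRange type used across the codebase.
#[derive(Debug, PartialEq)]
struct TextRange {
    start: u32,
    end: u32,
}

// Assumed layout: `start[i]` is the byte offset where token i begins,
// with one extra trailing entry marking the end of the last token.
struct Lexed {
    start: Vec<u32>,
}

impl Lexed {
    // Current shape: raw std range over byte offsets.
    fn text_range(&self, i: usize) -> Range<usize> {
        self.start[i] as usize..self.start[i + 1] as usize
    }

    // Reviewer's idea: fold the conversion in, return TextRange directly.
    fn range(&self, i: usize) -> TextRange {
        let r = self.text_range(i);
        TextRange {
            start: r.start as u32,
            end: r.end as u32,
        }
    }
}

fn main() {
    let lexed = Lexed { start: vec![0, 3, 7] };
    assert_eq!(lexed.text_range(0), 0..3);
    assert_eq!(lexed.range(1), TextRange { start: 3, end: 7 });
}
```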
fn range_text(&self, r: std::ops::Range<usize>) -> &str {
    assert!(r.start < r.end && r.end <= self.len());
    let lo = self.start[r.start] as usize;
    let hi = self.start[r.end] as usize;
    &self.text[lo..hi]
}
If I'm seeing this right, this is only used in `text`; maybe put the logic in there instead of going through an indirection?
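Sketched with field names taken from the diff above: if `range_text` is only ever called from `text` for a single token, its body can be inlined there and the helper removed.

```rust
// Assumed layout, matching the diff: `start[i]` is the byte offset where
// token i starts, with one extra trailing entry for the end of input.
struct Lexed {
    text: String,
    start: Vec<u32>,
}

impl Lexed {
    // Former `range_text(i..i + 1)` body, inlined for the single-token case.
    fn text(&self, i: usize) -> &str {
        let lo = self.start[i] as usize;
        let hi = self.start[i + 1] as usize;
        &self.text[lo..hi]
    }
}

fn main() {
    let lexed = Lexed {
        text: "let x;".to_string(),
        start: vec![0, 3, 4, 5, 6],
    };
    assert_eq!(lexed.text(0), "let");
    assert_eq!(lexed.text(1), " ");
    assert_eq!(lexed.text(3), ";");
}
```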
- `tokenizer` crate that turns a string into simple tokens
- `lexer` + `lexer_codegen` that uses the tokenizer to lex into a new `SyntaxKind` enum
- the new implementation is … (the `LineEnding` variant comes with a count)
- in a follow-up, we will be able to: …
- todos: …