Resolved bug in parse_function_arg #1826

Draft: wants to merge 3 commits into base: main

LucaCappelletti94
Contributor

This pull request resolves the bug described in issue #1825, which was caused by an incorrect implementation of the named argument parsing. It also adds a few tests to verify that the new implementation is correct.

The previous implementation incorrectly assumed that an argument name cannot coincide with a type name. However, the set of names that sqlparser parses as types is a superset of the types present in each individual dialect. It is therefore valid syntax to use, for instance, int2 as an argument name in PostgreSQL, while the same name would be interpreted as a type elsewhere.

I have changed the parsing to determine via a look-ahead whether the name is a type or not.
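
A minimal sketch of the look-ahead idea, using a toy token type rather than sqlparser's actual parser. The names `Tok`, `is_type_name`, and `parse_arg` are hypothetical and exist only for this illustration; the real change in this PR may be structured differently.

```rust
/// Toy token type; the real parser works on sqlparser's own tokens.
#[derive(Debug, Clone, PartialEq)]
enum Tok {
    Word(String),
    Comma,
    RParen,
}

/// Hypothetical superset of type names recognised across all dialects.
fn is_type_name(word: &str) -> bool {
    matches!(
        word.to_ascii_uppercase().as_str(),
        "INT2" | "INT4" | "INTEGER" | "TEXT" | "BOOLEAN"
    )
}

/// Parses `[ argname ] argtype` starting at `pos`.
/// Returns (optional name, type name, next position), or `None` on failure.
fn parse_arg(tokens: &[Tok], pos: usize) -> Option<(Option<String>, String, usize)> {
    let first = match tokens.get(pos)? {
        Tok::Word(w) => w,
        _ => return None,
    };
    // Look ahead: if a type follows, the first word must be the name,
    // even when it happens to be spelled like a type (e.g. `int2`).
    if let Some(Tok::Word(second)) = tokens.get(pos + 1) {
        if is_type_name(second) {
            return Some((Some(first.clone()), second.to_ascii_uppercase(), pos + 2));
        }
    }
    // No type follows, so the first word itself has to be the type.
    if is_type_name(first) {
        Some((None, first.to_ascii_uppercase(), pos + 1))
    } else {
        None
    }
}
```

In this sketch, the tokens `int2 integer` yield an argument named `int2` of type `INTEGER`, while a lone `int2` is treated as an unnamed argument of type `INT2`.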

Best,
Luca

@@ -5199,13 +5199,20 @@ impl<'a> Parser<'a> {

// parse: [ argname ] argtype
let mut name = None;
let next_token = self.peek_token();

Contributor

The code you proposed does not work, as Int2 (or any analogous type) does not match the branch if let DataType::Custom(n, _) = &data_type {

Oh, what did you mean here by Int2 not being parsed as a custom data type in this example? Do we get back a different type, or does parse_data_type fail in that scenario?

Ideally, I think we want to do without this self.peek_token() call, to avoid the cloning it involves.

Contributor Author

The argument named Int2 (as described in the issue) is not parsed as DataType::Custom, but as DataType::Int2. Analogously, any other argument name that collides with a data type from another SQL engine would be parsed as a type.

Now, if I were to convert DataType::Int2 back to a string, I would get a fixed capitalization, in this case INT2, rather than the original spelling. Without peek_token, I am unsure how we can keep the initial token from being lost.
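
To illustrate the capitalization problem, here is a small self-contained sketch. `ToyType`, `parse_toy_type`, `name_from_type`, and `name_from_token` are hypothetical stand-ins, not sqlparser's `DataType` or any of its functions; the assumption is only that built-in types print in one canonical spelling.

```rust
use std::fmt;

/// Toy stand-in for a parsed data type; not sqlparser's `DataType`.
#[derive(Debug, PartialEq)]
enum ToyType {
    Int2,
    Custom(String),
}

impl fmt::Display for ToyType {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // Built-in types print in one canonical spelling.
            ToyType::Int2 => write!(f, "INT2"),
            // Custom types keep whatever text they were built from.
            ToyType::Custom(name) => write!(f, "{}", name),
        }
    }
}

/// Maps a word to a toy type, recognising `int2` case-insensitively.
fn parse_toy_type(word: &str) -> ToyType {
    if word.eq_ignore_ascii_case("int2") {
        ToyType::Int2
    } else {
        ToyType::Custom(word.to_string())
    }
}

/// Recovers an argument name from the parsed type alone (lossy for built-ins).
fn name_from_type(word: &str) -> String {
    parse_toy_type(word).to_string()
}

/// Recovers the name from the original token text (lossless).
fn name_from_token(word: &str) -> String {
    word.to_string()
}
```

With these definitions, `name_from_type("Int2")` gives `INT2`, whereas `name_from_token("Int2")` keeps `Int2`, which is why the original token needs to be retained in some form.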

Contributor

Ah, I see, that makes sense! Maybe we can do something like this to restrict the cloning to only when it is necessary?

let data_type_idx = self.get_current_index();
if let Some(next_data_type) = self.maybe_parse(|parser| {
    name = parser.token_at(data_type_idx).to_string();
    // ...
})

Come to think of it, would we not need to sanity-check that the first token is actually a Token::Word variant? The current code seems to assume that this is the case, which might not necessarily be true.
For example, consider how the following SQL would be parsed; we can probably add a test case for it:

function(struct<a,b> int64)

We would call to_string() on only the first token, which would be struct, even though this query is technically invalid?
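
One way to picture the check is the toy sketch below. It is not sqlparser code: `Lexeme`, `ToyParser`, `try_parse`, and the error strings are all hypothetical, and `try_parse` only mimics the rewind-on-failure behaviour that maybe_parse is used for above. The assumption is that a leading "type" may serve as an argument name only when it consists of exactly one word token.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Lexeme {
    Word(String),
    Lt,
    Gt,
    Comma,
}

struct ToyParser {
    tokens: Vec<Lexeme>,
    index: usize,
}

impl ToyParser {
    /// Parses `word` or `word<word, ...>`; returns false on failure.
    fn parse_type(&mut self) -> bool {
        if !matches!(self.tokens.get(self.index), Some(Lexeme::Word(_))) {
            return false;
        }
        self.index += 1;
        if self.tokens.get(self.index) != Some(&Lexeme::Lt) {
            return true;
        }
        self.index += 1;
        loop {
            if !matches!(self.tokens.get(self.index), Some(Lexeme::Word(_))) {
                return false;
            }
            self.index += 1;
            match self.tokens.get(self.index) {
                Some(Lexeme::Comma) => self.index += 1,
                Some(Lexeme::Gt) => {
                    self.index += 1;
                    return true;
                }
                _ => return false,
            }
        }
    }

    /// Runs `f` and rewinds the index if it fails.
    fn try_parse(&mut self, f: fn(&mut ToyParser) -> bool) -> bool {
        let start = self.index;
        let ok = f(self);
        if !ok {
            self.index = start;
        }
        ok
    }

    /// Parses `[ argname ] argtype` and returns the optional name.
    fn parse_arg(&mut self) -> Result<Option<String>, String> {
        let start = self.index;
        if !self.try_parse(ToyParser::parse_type) {
            return Err("expected a data type".to_string());
        }
        let first_len = self.index - start;
        // If no second type follows, the first one was the argument type.
        if !self.try_parse(ToyParser::parse_type) {
            return Ok(None);
        }
        // A second type followed, so the first chunk must be a single word.
        match (&self.tokens[start], first_len) {
            (Lexeme::Word(w), 1) => Ok(Some(w.clone())),
            _ => Err("argument name must be a single word".to_string()),
        }
    }
}
```

In this sketch, `struct<a,b> int64` is rejected because the leading chunk spans five tokens, while `int2 integer` still yields the name `int2`.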

Contributor Author

Could you provide a complete example of such a broken case, so that I may add it to the test suite?

Contributor

I think something like this, potentially:

let sql = r#"CREATE OR REPLACE FUNCTION foo(a TIMESTAMP WITH TIMEZONE, b VARCHAR) RETURNS BOOLEAN LANGUAGE plpgsql AS $$
BEGIN
	RETURN TRUE;
END;
$$"#;
pg().verified_stmt(sql);

Contributor Author

I have added that test and refactored the code as you described. I also added the struct<a,b> case and ensured that the token must be a Word token.

iffyio marked this pull request as draft on May 14, 2025 07:22