-
Notifications
You must be signed in to change notification settings - Fork 3
Clarify that unagreggated variables in SELECT raise an error #327
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Makes it clear that inserting SAMPLE is something the user can do, not something SPARQL implementation should do automatically Close #165
Co-authored-by: Olaf Hartig <[email protected]>
|
Is this section even true? I'm trying to square the language in this note (which exists in similar form in the 1.1 spec) with the translation in I read that as supporting I think the language in this section is probably true of the algebra, but I understood that to be the entire point of the translation logic that wraps un-grouped and un-aggregated variables in I also think this note is a bit confusing, because it's not clear if it is describing a restriction on the example just presented (in which the presence of grouping on |
|
@kasei This is a great point! Thank you! I overlooked it even if it was in the issue motivating the MR (#165). I think the SPARQL 1.1 specification is inconsistent and Current quick survey:
|
I don't think agg08 shows this. It has grouping on the expression
That's interesting. Ignoring current implementations, my feeling here is the insertion case makes the most intuitive sense. Both aggregated values and grouped values should be available for use in any (non-aggregation) select expressions. |
Yes, it's not about single variable indeed. My bad.
It seems to me we are not focusing on the same cases. I am focusing on "simple expressions consisting of just a variable" in the projection. I still think SELECT ?P (COUNT(?O) AS ?C)
WHERE { ?S ?P ?O } GROUP BY ?SThere are two possible outcomes:
In a query level which uses aggregates, only expressions consisting of aggregates and constants may be projected, with one exception. When
Then this query is invalid because `?P` in the `SELECT` is not a grouped value.
Replace V with SAMPLE(V)Then the query become equivalent to SELECT (SAMPLE(?P) AS ?P) (COUNT(?O) AS ?C)
WHERE { ?S ?P ?O } GROUP BY ?Sand we get results. Hence my impression that the spec is inconsistent. My MR is taking the first approach and changing algorithm but I don't feel strongly on it. I just think we should be consistent and make a choice here. |
|
In light of the discussion, looking again at the following sentence from 11.4 Aggregate Projection Restrictions [SPARQL 1.1] ..
(which was the sentence that you wanted to improve with this PR), may be this sentence was indeed meant to say that
of the translation algorithm in 18.2.4.1 Grouping and Aggregation [SPARQL 1.1]. However, I would still consider this inconsistent with the following first paragraph of 11.4 Aggregate Projection Restrictions [SPARQL 1.1]:
This one says, indirectly but clearly, that it is not allowed to project a variable unless the variable is a GROUP BY variable. So, in any case, there is an inconsistency in the spec. Regarding the two possible outcomes, I would find the first one (throwing an error, as done by Jena, Blazegraph, Oxigraph, QLever and Communica) to be more intuitive. The PR in its current form (with commit 3f3e302 included) captures this option. Yet, I agree that this is something to be discussed, at least in the TF if not the whole WG, because no matter which of the two options we pick, it is a change for some implementations. |
|
Not completely sure but I think SAMPLE is inserted to put in an aggregate.
agg08/agg09/agg10 are negative syntax tests so, yes, they don't get to the algebra. Does anyone have an example query where it would hit that SAMPLE injection clause? |
|
It's puzzles like this that make me wish for Invited Experts who participated in the previous WG cycle(s) for the specs being worked in the current WG. We can guess at what was intended, but that always runs the risk of guessing wrong, and more of overlooking some significant issue that led to the current spec not choosing path x (though we would hope they would have explicitly ruled it out). Given the lack of clarity in the existing spec, I don't think we should decide based on a poll of the current implementations for what they've done. Even in the time crunch, I think this calls for some exploration of "what might happen if", good and bad, for each option. |
I was there (member organisation, not IE). After 12+ years, it is prudent to not make absolute claims. |
|
(I meant, as IEs in this WG, presuming that they were no longer with member organizations. I don't really care what their status was in the past group(s), nor would be in this group, as long as they participated!) |
|
First off, I agree with @TallTed that it would be better to have the original editors---or, more precisely, the original author(s) of this part of the spec---around here to help us understand what exactly the intention of such ambiguous sentences was. Yet, as they do not seem to be here, we have to interpret the spec text as given. Doing so, I observe that we (several of the new editors) seem to agree that there are different ways to interpret the given text. Additionally, as illustrated by @Tpt's quick test, vendors also came to different interpretations. Therefore, as one of the new editors, I would like to fix the apparent ambiguity. And. without the previous author(s) of this part of the spec being here to help us interpreting their text—in combination with the fact that implementations are already diverging in terms of this aspect of the spec—I would say it is okay to tighten this aspect of the spec by picking the option that we think is most reasonable. |
|
I'm afraid I've somewhat lost track of the exact issue we're discussing here, and worry that some of this may have been caused by my initial confusion about the proposed text. I think the test case examples pointed to earlier by @Tpt (agg08, agg09, agg10 and agg11 tests) all align with my expectations, but I didn't think any of those showed the problem that we were discussing (which I believed was about using Beyond that confusion, I am somewhat alarmed at the appended commit here that raises a new error during translation. I believe it makes some very simple patterns invalid: For example, for a query with |
|
PREFIX : <http://www.example.org>
SELECT ((?o1 + ?o2) AS ?o12)
WHERE { ?s :p ?o1; :q ?o2 }
GROUP BY (?o1 + ?o2)
ORDER BY ?o12The user can get the effect by assigning the expression, which fixes the value as PREFIX : <http://www.example.org>
SELECT (?X AS ?o12)
WHERE { ?s :p ?o1; :q ?o2 }
GROUP BY (?o1 + ?o2 AS ?X)
ORDER BY ?o12SELECT ?x WHERE { ?x :p 123 } GROUP BY ?xthe query is covered by the text:
Greg's example is one of three cases: and also consider: As Olaf quoted: (11.4 Aggregate Projection Restrictions
Do we agree that
In the text:
"unaggregated variable" is not mentioned anywhere else.
"in X" suggests an expression, but X as an arbitrary expression does not work without a further condition. If
In We can make that an erratum if it is not already covered. One charitable reading of "unaggregated variable" is as a variable used as The The opposite to 11.4 is that Overall, I think that we already have a principle that, after grouping, at a given query level, the set of variables allowed outside aggregates is only the variables used directly for group keys values ( Summary:
|
|
It looks like there is a further simplification:
can be changed to and in 18.3.4.4 SELECT Expressions
still applies. |
|
@afs Thank you for your detailed reply! If you want, feel free to take over the MR to apply the changes. |
|
I think —
— has an extra |
Makes it clear that inserting SAMPLE is something the user can do, not something SPARQL implementation should do automatically
Close #165
Preview | Diff