PERCENTILE_CONT and PERCENTILE_DISC functions #8807
| ERR_post(Arg::Gds(isc_wronumarg)); // WITHIN GROUP should only 1 ORDER item | ||
| } | ||
| fb_assert(dsqlOrderClause); | ||
| if (dsqlOrderClause->items.getCount() != 1) | ||
| { | ||
| ERR_post(Arg::Gds(isc_wronumarg)); // WITHIN GROUP should only 1 ORDER item |
This error is too generic; it leaves room for different interpretations.
| AggNode::pass2(tdbb, csb); | ||
| // We need a second descriptor in the impure area for PERCENTILE. | ||
| impure2Offset = csb->allocImpure<impure_value_ex>(); |
Why do you need an extra impure area instead of declaring a single struct with all (and only) the things you need?
| if (desc->isNull()) | ||
| return false; | ||
This should not be necessary.
| if (desc->isBlob()) | ||
| ERRD_post(Arg::Gds(isc_blobnotsup) << Arg::Str("ORDER BY")); | ||
| const auto percentile_value = MOV_get_double(tdbb, percenteDesc); |
Please name things using the more modern convention: percentileValue
| for (auto& nodeOrder : sort->expressions) | ||
| { | ||
| dsc toDesc = *(descOrder++); | ||
| toDesc.dsc_address = data + (IPTR)toDesc.dsc_address; |
| toDesc.dsc_address = data + (IPTR)toDesc.dsc_address; | |
| toDesc.dsc_address = data + (IPTR) toDesc.dsc_address; |
| EVL_make_value(tdbb, desc, impure2); | ||
| } | ||
| } | ||
| else { |
| else { | |
| else | |
| { |
| impure2->vlu_misc.vlu_double += value * (percentileImpure->crn - percentileImpure->rn); | ||
| } | ||
| } | ||
| if (impure2->vlux_count == percentileImpure->crn) |
| if (impure2->vlux_count == percentileImpure->crn) | |
| if (impure2->vlux_count == percentileImpure->crn) |
| private: | ||
| const PercentileType type; | ||
| NestConst<ValueExprNode> valueArg; |
Don't you need to override getChildren and add these values?
PERCENTILE_DISC and PERCENTILE_CONT functions
The `PERCENTILE_CONT` and `PERCENTILE_DISC` functions are known as inverse distribution functions. These functions operate on an ordered set. Both functions can be used as aggregate or window functions.
PERCENTILE_DISC
`PERCENTILE_DISC` is an inverse distribution function that assumes a discrete distribution model. It takes a percentile value and a sort specification and returns an element from the set.
Nulls are ignored in the calculation.
Syntax for the `PERCENTILE_DISC` function as an aggregate function.
Syntax for the `PERCENTILE_DISC` function as a window function.
The first argument `<percent>` must evaluate to a numeric value between 0 and 1, because it is a percentile value. This expression must be constant within each aggregate group.
The `ORDER BY` clause takes a single expression that can be of any type that can be sorted.
The function `PERCENTILE_DISC` returns a value of the same type as the argument in `ORDER BY`.
For a given percentile value `P`, `PERCENTILE_DISC` sorts the values of the expression in the `ORDER BY` clause and returns the value with the smallest `CUME_DIST` value (with respect to the same sort specification) that is greater than or equal to `P`.
Analytic Example
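As a rough illustration of the selection rule described above, independent of the SQL syntax, the following C++ sketch picks a value the way `PERCENTILE_DISC` is described to. The function name `percentileDisc` and the use of `std::optional` to stand in for SQL NULL are assumptions made for this example. It is not the code from this pull request, and it assumes NULLs have already been filtered out by the caller.

```cpp
#include <algorithm>
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <vector>

// Hypothetical helper, not part of the Firebird sources.
// Returns the first value in sort order whose position ratio (i / n)
// reaches p. An empty input yields std::nullopt (standing in for SQL NULL).
std::optional<double> percentileDisc(std::vector<double> values, double p)
{
    if (p < 0.0 || p > 1.0)
        throw std::invalid_argument("percentile must be between 0 and 1");

    if (values.empty())
        return std::nullopt;

    std::sort(values.begin(), values.end());

    const std::size_t n = values.size();

    for (std::size_t i = 1; i <= n; ++i)
    {
        // i / n is a lower bound for CUME_DIST of the row at position i.
        // Tied rows share the CUME_DIST of their last peer, but the first
        // row that satisfies the bound already carries the value to return.
        if (static_cast<double>(i) / n >= p)
            return values[i - 1];
    }

    // Not reached for p in [0, 1], because i == n gives exactly 1.0.
    return values.back();
}
```

For the values 10, 20, 30, 40 and `P = 0.5`, the sketch returns 20, because the second row is the first one whose position ratio (2/4) reaches `P`.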
PERCENTILE_CONT
`PERCENTILE_CONT` is an inverse distribution function that assumes a continuous distribution model. It takes a percentile value and a sort specification and returns a value interpolated between elements of the set.
Nulls are ignored in the calculation.
Syntax for the `PERCENTILE_CONT` function as an aggregate function.
Syntax for the `PERCENTILE_CONT` function as a window function.
The first argument `<percent>` must evaluate to a numeric value between 0 and 1, because it is a percentile value. This expression must be constant within each aggregate group.
The `ORDER BY` clause takes a single expression, which must be of numeric type to perform interpolation.
The `PERCENTILE_CONT` function returns a value of type `DOUBLE PRECISION` or `DECFLOAT(34)` depending on the type of the argument in the `ORDER BY` clause. A value of type `DECFLOAT(34)` is returned if `ORDER BY` contains an expression of one of the types `INT128`, `NUMERIC(38, x)` or `DECFLOAT(16 | 34)`; otherwise `DOUBLE PRECISION` is returned.
The result of `PERCENTILE_CONT` is computed by linear interpolation between values after ordering them. Using the percentile value (`P`) and the number of rows (`N`) in the aggregation group, you can compute the row number you are interested in after ordering the rows with respect to the sort specification.
This row number (`RN`) is computed according to the formula `RN = 1 + P * (N - 1)`. The final result of the aggregate function is computed by linear interpolation between the values from rows at row numbers `CRN = CEILING(RN)` and `FRN = FLOOR(RN)`.
Analytic Example
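The interpolation described above can be sketched in C++ as follows. This is a minimal illustration under stated assumptions, not the implementation from this pull request: the function name `percentileCont` is hypothetical, `std::optional` stands in for SQL NULL, NULLs are assumed to be filtered out by the caller, and only the `DOUBLE PRECISION` result case is modeled (the `DECFLOAT(34)` case is not).

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <vector>

// Hypothetical helper, not part of the Firebird sources.
// Computes RN = 1 + P * (N - 1) and interpolates linearly between the
// values at rows FLOOR(RN) and CEILING(RN) (1-based, after sorting).
std::optional<double> percentileCont(std::vector<double> values, double p)
{
    if (p < 0.0 || p > 1.0)
        throw std::invalid_argument("percentile must be between 0 and 1");

    if (values.empty())
        return std::nullopt;

    std::sort(values.begin(), values.end());

    const double n = static_cast<double>(values.size());
    const double rn = 1.0 + p * (n - 1.0);  // fractional row number in [1, N]
    const double frn = std::floor(rn);
    const double crn = std::ceil(rn);

    const double lower = values[static_cast<std::size_t>(frn) - 1];

    // RN falls exactly on a row: no interpolation is needed.
    if (frn == crn)
        return lower;

    const double upper = values[static_cast<std::size_t>(crn) - 1];

    // Weight each neighbour by its distance from the opposite row number.
    return (crn - rn) * lower + (rn - frn) * upper;
}
```

For the values 10, 20, 30, 40 and `P = 0.5`, `RN = 2.5`, so the sketch returns 25, the midpoint between rows 2 and 3.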
An example of using both aggregate functions