Add mode() aggregation function to find most common unique value. #5864

Description

@jeremy-code

The statistical mode is a common metric and is especially useful for nominal/categorical data. For example, say a table lists days with the categories ["Sunny", "Cloudy", "Rainy"]. Knowing that most days are "Rainy" is more useful than calling unique() and learning only that the weather can be one of those categories.

A rough implementation is below. The reason for the somewhat awkward Array.from(counts).find(...) is that, from what I can tell, most implementations of mode return the first element with the maximum count when there is a tie. Because a Map iterates in insertion order, find(...) returns the tied value that appeared first among the rows. I am open to other implementations if this isn't a concern.

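const mode: AggregationFn<any> = (columnId, leafRows) => {
  // No rows means there is no mode to report.
  if (!leafRows.length) {
    return
  }

  // Count occurrences of each value while tracking the highest count seen.
  let maxCount = 0
  const counts = leafRows.reduce((counts, row) => {
    const value = row.getValue(columnId)
    const valueCount = (counts.get(value) ?? 0) + 1
    maxCount = Math.max(maxCount, valueCount)
    return counts.set(value, valueCount)
  }, new Map<unknown, number>())

  // Map preserves insertion order, so on a tie this picks the earliest-seen value.
  // The non-null assertion is safe: leafRows is non-empty, so some entry has maxCount.
  return Array.from(counts).find(([, count]) => count === maxCount)![0]
}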

And here is a StackBlitz implementation showing it passes unit tests.
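
To make the tie-breaking behavior easier to check without a table instance, here is a minimal sketch of the same counting logic over a plain array. The name modeOf is hypothetical and only for illustration; it is not part of any library.

```typescript
// Hypothetical standalone helper: returns the most frequent value,
// preferring the earliest-seen value when several values tie.
function modeOf<T>(values: readonly T[]): T | undefined {
  const counts = new Map<T, number>()
  let maxCount = 0

  for (const value of values) {
    const next = (counts.get(value) ?? 0) + 1
    counts.set(value, next)
    if (next > maxCount) maxCount = next
  }

  // Map keeps insertion order, so the first entry with maxCount
  // belongs to the tied value that appeared first in the input.
  const hit = Array.from(counts).find(([, count]) => count === maxCount)
  return hit ? hit[0] : undefined // undefined only for an empty input
}

const weather = ["Sunny", "Rainy", "Cloudy", "Rainy", "Sunny"]
console.log(modeOf(weather)) // "Sunny": ties with "Rainy" at 2, but was seen first
console.log(modeOf<string>([])) // undefined
```

Note that "first" here means the value first encountered in the input, not the value that first reached the maximum count. If that distinction matters, the tie-breaking rule would need to be stated explicitly.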
