This repository has been archived by the owner on Feb 12, 2022. It is now read-only.

Add hbase-stats project to contrib/ #131

Open
@jyates jyates commented Apr 10, 2013

This is the basis for supporting equal-depth histograms per region (guideposts), #49.
It's basically a 0.94 backport of the code attached to HBASE-7958. Logically, there is
no real difference, though the implementation has some slight changes: on trunk there
is no need for a coprocessor because the stats gathering is all built in.

Currently, HBASE-7958 is still open for review and depends on the system tables (HBASE-7999) or
namespaces (HBASE-8105) patches. We work around this by giving the stats table a special name
that should be fairly close to the pending name for the stats table in HBASE-7958. If we cannot
keep the same table name for the stats table, we have an opportunity to copy the data over to
the new table, since upgrading from 0.94 to 0.96 currently requires downtime.

This is actually a bit more advanced than the posted patch: a lot more work has gone into usability
and verifying correctness. Further, there are some obvious changes because we need to support coprocessors
rather than built-in options (but that is actually a relatively minor change).

```
| primary<region name>col | STAT | min_region_key | 3
```

This is because the MinMaxKey statistic uses the column name (in this case 'col') as the type. We use the only CF on the stats table (STATS) and have two subtype (info) elements, max_region_key and min_region_key, each with an associated value.
Contributor


So in this example, would the row look like this?

Row Key | Value
primary\0some-var-len-region-name\0min_region_key | 3
primary\0some-var-len-region-name\0max_region_key | 10

If a column in the PK is variable length, Phoenix expects it to be null terminated. Are region names variable length too?

One thing we'd be after is to be able to query the stats table through Phoenix. It'll definitely make debugging and troubleshooting easier.

Contributor Author


It would look like this:

primary\0some-var-len-region-name\0some-var-length-column-name | STAT | max_region_key 10
primary\0some-var-len-region-name\0some-var-length-column-name | STAT | min_region_key 3

Right now the stats reader/writer code handles reading it in (although it is still a bit overly complicated, IMO). I'd think we could move to a Phoenix-based reader and writer in the future, once we have a configurable writer. I would want to do the configurable writer work in another patch, though; that starts to make this even more complicated than it already is.
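
As a rough illustration of the row layout described in this thread, the sketch below joins a table name, region name, and column name with `\0` separators and splits the key back into its parts. The class `StatsRowKeySketch` and its methods are hypothetical names made up for this example; they are not part of the patch or of the HBase or Phoenix APIs. The sketch also assumes that none of the parts contains a `\0` byte, which may not hold for real region names.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of the hbase-stats patch.
final class StatsRowKeySketch {
    // The \0 byte that separates the variable-length parts of the key.
    private static final byte SEPARATOR = 0;

    // Joins table, region and column names with a single \0 between them.
    // Assumption: none of the parts contains a \0 byte itself.
    static byte[] compose(String table, String region, String column) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        String[] parts = {table, region, column};
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                out.write(SEPARATOR);
            }
            byte[] bytes = parts[i].getBytes(StandardCharsets.UTF_8);
            out.write(bytes, 0, bytes.length);
        }
        return out.toByteArray();
    }

    // Splits a row key on \0 bytes and returns the parts in order.
    static List<String> split(byte[] rowKey) {
        List<String> parts = new ArrayList<>();
        int start = 0;
        for (int i = 0; i <= rowKey.length; i++) {
            if (i == rowKey.length || rowKey[i] == SEPARATOR) {
                parts.add(new String(rowKey, start, i - start, StandardCharsets.UTF_8));
                start = i + 1;
            }
        }
        return parts;
    }

    public static void main(String[] args) {
        byte[] key = compose("primary", "some-var-len-region-name",
                "some-var-length-column-name");
        // Prints the three parts recovered from the composed key.
        System.out.println(split(key));
    }
}
```

Because the separator is the only way to find the boundaries between parts, a reader built this way breaks if a region or column name can legitimately contain `\0`; a length-prefixed encoding would be one alternative in that case.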
