- Name
- Status
- Description
- Synopsis
- Classes
- Methods
- Author
- Copyright and License
rangeset
This library is considered production ready.
Segmented range which is represented in a list of sorted interleaving range.
A range set can be thought as: [[1, 2], [5, 7]]
.
from pykit import rangeset
a = rangeset.RangeSet([[1, 5], [10, 20]])
a.has(1) # True
a.has(8) # False
a.add([5, 7]) # [[1, 7], [10, 20]]
IntIncRange
is similiar to Range
and shares the same set of API, except it
limits value types to int or long, and its right boundary is closed, thus unlike
Range
(right boundary is open), 2 is in [1, 2]
.
It is similiar to RangeSet
and shares the same set of API, except the default
class for element in it is IntIncRange
, not Range
.
syntax:
rangeset.Range(left, right)
A continuous range.
Range is left-close and right-open.
E.g. a range [1, 3]
has 2 elements 1
and 2
, but 3
is not in this range.
arguments:
-
left
: specifies the left close boundary, which meansleft
is in in this range. -
right
: specifies the right open boundary, which meansright
is NOT in this range.
return:
a rangeset.Range
instance.
Overriding operator &
.
a & b
is the same as a.intersect(b)
.
syntax:
Range(1, 3) & Range(2, 4) # Range(2, 3)
.
syntax:
Range(1, 3) & [2, 4] # Range(2, 3)
.
syntax:
[1, 3] & Range(2, 4) # Range(2, 3)
.
syntax:
Range.cmp(other)
Compare this range with other
.
arguments:
other
: is anotherRange
instance.
return:
1
if this range is on the right ofother
.-1
if this range is on the left ofother
.0
if this range overlaps withother
or they are adjacent ranges. E.g.Range(1, 3)
results in 0 when comparing with these following rangeRange(1, 2)
: overlapping.Range(2, 4)
: overlapping.Range(0, 1)
: adjacent.Range(3, 4)
: adjacent.
syntax:
Range.has(val)
Return True if val
is in this range. Otherwise False
.
arguments:
val
: is the value to check.
return:
bool
syntax:
Range.is_adjacent(other)
Check if this range is at left of other
and they are adjacent and can be
merged into one range.
arguments:
other
: is anotherRange
instance.
return:
True
for [1, 2] and [2, 3]
False
for [1, 2] and [3, 4]
or [1, 2] and [1, 3]
syntax:
Range.intersect(rng)
Intersect this range with another range.
E.g.
Range(1, 5).intersect(2, None) # Range(2, 5)
arguments:
rng
: is alist
,Range
orValueRange
return:
None
if it has no itersection with rng
, or a new instance of intersection.
syntax:
Range.substract(rng)
Remove rng
from a range.
The result is a list of 2 Range
.
Because rng
might split it into two separate sub range.
Synopsis:
Range(0, 5).substract([1, 2]) # [[0, 1], [2, 5]]
Range(0, 5).substract([-1, 2]) # [None, [2, 5]]
Range(0, 5).substract([None, 6]) # [None, None]
arguments:
rng
: is anotherRange
orlist
of 2 elements that represents a range.
return:
a list of 2 Range
instance.
syntax:
Range.length()
Return range length if it is a numeric range such as int
, float
or string
range.
If it is a string range, such as ["a", "abc"]
, the length is a float number
between 0 and 1.
It treats a
and b
as two base-257 float va = 0.a[0]a[1]...
and
vb = 0.b[0]b[1]...
.
And we define the length to be vb - va
.
For i >= len(a)
a[i]
is defined to be 0,
Otherwize, a[i]
is defined to be ord(a[i]) + 1
.
Thus:
- empty string
""
is 0, "\0"
is 1,"a"
is0x62
(ascii of"a"
is0x61
).
There are 257 possible value: empty string ""
and 256 chars "\x00" ~ "\xff"
.
In python the length of [str(a), str(b)]
is defined as:
sum([ (b[i] - a[i]) / 257.0**(i+1)
for i in range(0, max([len(a), len(b)]))
])
return: length in the same type of one of its boundary.
If one of left and right boundary is None
, in which case it is an infite
range, float('inf')
is returned.
syntax:
Range.val()
return: The value of range associated.
syntax:
rangeset.ValueRange(left, right, val=None)
It maps a range to a value.
E.g.: ValueRange(0, 1, 'foo')
defines that value for [0, 1)
is "foo"
.
A ValueRange
is left-close and right-open.
ValueRange
has the same methods as Range
.
syntax:
ValueRange.set(v)
arguments:
v
: the value to update to this range.
return: Nothing
syntax:
rangeset.RangeSet(ranges)
A series of int Range
.
All ranges in it are ascending ordered, non-overlapping and non-adjacent.
arguments:
ranges
: is a list of range.
syntax:
RangeSet.add(rng)
It adds a new range into this range set and if possible, collapse it with overlapping or adjacent range.
Synopsis:
RangeSet([[10, 20], [30, 40]].add([19, 30]) # [[10, 40]]
RangeSet([[10, 20], [30, 40]].add([0, 35]) # [[0, 40]]
arguments:
rng
: is aRange
or a 2 element tuple or list.
return: nothing, but modify the range-set in place.
syntax:
RangeSet.find_overlapped(rng)
Find all ranges those overlaps with rng
.
E.g.
RangeSet([[None, 10], [20, 30], [40, None]]).find_overlapped([29, 41])
# RangeSet([[20, 30, 'b'], [40, None, 'c']])
arguments:
rng
: a range iterable with at least 2 elements, suchlist
,tuple
,Range
orValueRange
.
return:
a instance of self.__class__
with Range
or ValueRange
elements those
overlaps with rng
.
syntax:
RangeSet.has(val)
Checks if a val
is in any of the range segments.
Synopsis:
RangeSet([[10, 20], [30, 40]]).has(1) # False
RangeSet([[10, 20], [30, 40]]).has(10) # True
RangeSet([[10, 20], [30, 40]]).has(20) # False
RangeSet([[10, 20], [30, 40]]).has(25) # False
RangeSet([[10, 20], [30, 40]]).has(30) # True
RangeSet([[10, 20], [30, 40]]).has(50) # False
arguments:
val
: is the value to check.
return:
True
if val
is in it. Or False
.
syntax:
rangeset.RangeDict(iterable=None, range_clz=None, dimension=None)
RangeDict
defines a mapping from ranges to values.
E.g.:
rd = RangeDict([(0, 1, 'foo'), (1, 2, 'bar')])
rd.get(0) # 'foo'
rd.get(1) # 'bar'
rd.get(1.5) # 'bar'
rd.get(2) # KeyError
Difference from RangeSet
:
Adjacent ranges such as (0, 1), (1, 2)
can exist in RangeDict
but can not exist in RangeSet
.
Because in RangeDict
each range there is a value bound.
arguments:
are same as RangeSet
except the default value for range_clz
is ValueRange
instead of Range
.
-
dimension
: specifies if to convert the value in it into a nestedRangeDict
. It is used to create multi dimension RangeDict. By default it is1
.Synopsis:
""" A sample of 2d mapped value: time(t) and a string range: This setting split the entire plain into 4 areas. range ^ | d +----+----+ | | cd | c + bd +----+ | | | b +----+ | | ab | ac | a +----+----+ | '' +----+----+--------> t 0 1 2 """ inp = [ [0, 1, [['a', 'b', 'ab'], ['b', 'd', 'bd'], ]], [1, 2, [['a', 'c', 'ac'], ['c', 'd', 'cd'], ]], ] r = rangeset.RangeDict(inp, dimension=2) print r.get(0.5, 'a') # 'ab' print r.get(1.5, 'a') # 'ac'
One of a useful scenario for 2d
RangeDict
is to store sharding info. Because sharding config might change, there might be a migration period a server needs two config.With the following example:
range ^ | d +----+----+ | | cd | c + bd +----+ | | | b +----+ | | ab | ac | a +----+----+ | '' +----+----+--------> t 0 1 2
At first(before time
1
) we store keys start witha ~ b
in databaseab
, and keys start withb ~ d
in databasebd
.At time
1
the administrator decide to re-balance all keys with a new sharding config: storing keys starts witha ~ c
in databaseac
, and keys start withc ~ d
in databasecd
.All API servers must be aware of the config change in order to run a dual-read mechanism. E.g. to read a key
bb
:
- First read it from the latest configured database
ac
, if not found:- Then read it from database
bd
: the previous shard forbb
.This way it let the system to smoothly transfer from one config to another.
Methods of RangeDict
are the same as RangeSet
except the following three:
syntax:
RangeDict.add(rng, val)
Add a mapping from range to value into RangeDict
.
arguments:
-
rng
: a two element iterable that defines a range. -
val
: value of any data type.
return: Nothing
syntax:
RangeDict.get(pos, *positions)
arguments:
-
pos
: position inRangeDict
-
positions
: the nested position to get if thisRangeDict.dimension > 1
.
return:
the value of range that pos
is in.
If pos
is not in any ranges, KeyError
is raised.
syntax:
RangeDict.get_min(is_lt=None)
Get range of the minimum value in the range dict. If minmum value has more than one range, then get the first one.
argument:
is_lt
: is a function that receives 2 argumentsa
andb
, returnsTrue
ifa
is "smaller" thanb
, otherwise returnFalse
. Example:Ifdef is_lt(a, b): return a < b
is_lt
isNone
, usea < b
to decide 'a' is smaller than 'b'.
return:
-
i
: the index of the minimum value in the range dict. -
rng
: aValueRange
, which is the range of the minimum value in the range dict. -
val
: the minimum value in the range dict.
If range dict is empty, ValueError
is raised.
syntax:
RangeDict.normalize()
Reduce number of elements by merging adjacent ranges those have the same value into one continuous range.
Values are compared with ==
, not is
E.g.:
rd = rangeset.RangeDict([(1, 2, '12'), [2, 3, '12']])
rd.normalize() # [[1, 3, '12']]
return: Nothing.
syntax:
rangeset.union(a, *others)
Return a new union set RangeSet
of all a
and others
.
rangeset.union(
rangeset.RangeSet([[None, 10], [20, 30], [40, None]]),
rangeset.RangeSet([[11, 22]])
)
# [[None, 10], [11, 30], [40, None]]
arguments:
-
a
: aRangeSet
instance. -
others
:RangeSet
instances
return:
a new RangeSet
instance.
syntax:
rangeset.substract(a, *others)
Return a new RangeSet
with all ranges in others
removed from a
.
rangeset.substract(
rangeset.RangeSet([[None, 10], [20, 30], [40, None]]),
rangeset.RangeSet([[25, 45]])
)
# [[None, 10], [20, 25], [45, None]]
arguments:
-
a
: aRangeSet
instance. -
others
:RangeSet
instances
return:
a new RangeSet
instance.
syntax:
rangeset.intersect(a, *others)
Return a new intersection set RangeSet
of all a
and others
.
intersect(a, b): a^b is defined with substraction: a^b = a - (a - b)
arguments:
-
a
: aRangeSet
instance. -
others
:RangeSet
instances.
return:
a new RangeSet
instance.
Zhang Yanpo (张炎泼) [email protected]
The MIT License (MIT)
Copyright (c) 2015 Zhang Yanpo (张炎泼) [email protected]