Releases: mars-project/mars
v0.5.0a1
This is the release notes of v0.5.0a1. See here for the complete list of solved issues and merged PRs.
Changes that break compatibility
- Calling .execute() no longer returns a numpy ndarray, pandas DataFrame and so forth; it returns the Mars tensor or DataFrame itself. Only corner data is fetched for display purposes. To convert explicitly, call .to_numpy() to get a numpy ndarray or .to_pandas() to get a pandas DataFrame. For more details, please refer to #1201.
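The new contract can be pictured with a toy model. The sketch below is plain Python and does not use Mars: LazyTensor, its to_list() method (standing in for .to_numpy() / .to_pandas()) and the corner size are hypothetical names and values chosen for illustration only. It shows the shape of the behavior described above: execute() hands back the lazy object itself, repr shows only corner data, and the full result is materialized only on explicit conversion.

```python
class LazyTensor:
    """Toy stand-in for a lazily evaluated tensor (hypothetical, not Mars code)."""

    CORNER = 3  # items shown at each end by repr (assumed value)

    def __init__(self, compute):
        self._compute = compute  # zero-argument callable producing the data
        self._data = None        # filled in by execute()

    def execute(self):
        # Run the computation once, then return the lazy object itself
        # rather than the raw result.
        if self._data is None:
            self._data = list(self._compute())
        return self

    def to_list(self):
        # Explicit conversion, playing the role of .to_numpy() / .to_pandas().
        return list(self.execute()._data)

    def __repr__(self):
        if self._data is None:
            return "LazyTensor(<not executed>)"
        n = self.CORNER
        if len(self._data) <= 2 * n:
            return f"LazyTensor({self._data})"
        # Only the corners of a large result are rendered.
        head = ", ".join(map(str, self._data[:n]))
        tail = ", ".join(map(str, self._data[-n:]))
        return f"LazyTensor([{head}, ..., {tail}])"


t = LazyTensor(lambda: range(1000))
r = t.execute()
print(r is t)            # True
print(r)                 # LazyTensor([0, 1, 2, ..., 997, 998, 999])
print(len(r.to_list()))  # 1000
```

Code that previously treated the return value of .execute() as an ndarray or pandas DataFrame would, under this contract, need the explicit conversion step.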
Highlights
- Remote API is introduced with preliminary support in #1238; for more details, refer to proposal #1227.
- Running on Yarn is preliminarily supported in #1210.
New Features
- Tensor
  - Implements mt.trapz (#1205)
- DataFrame
- Learn
- Others
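For readers unfamiliar with the name, trapz conventionally denotes integration by the composite trapezoidal rule, as in numpy.trapz. That mt.trapz follows the same definition is an assumption here, not something these notes state. A minimal pure-Python sketch of the rule (the function name trapz_sketch is hypothetical):

```python
def trapz_sketch(y, x=None, dx=1.0):
    """Composite trapezoidal rule over samples y (illustrative, not Mars code)."""
    if len(y) < 2:
        return 0.0
    if x is None:
        # Evenly spaced samples, dx apart.
        widths = [dx] * (len(y) - 1)
    else:
        if len(x) != len(y):
            raise ValueError("x and y must have the same length")
        widths = [x[i + 1] - x[i] for i in range(len(x) - 1)]
    # Each panel contributes its width times the mean of its two end heights.
    return sum(w * (y[i] + y[i + 1]) / 2.0 for i, w in enumerate(widths))


print(trapz_sketch([1, 2, 3]))               # 4.0
print(trapz_sketch([1, 2, 3], x=[4, 6, 8]))  # 8.0
```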
Enhancements
- Make Tileable.execute() return tileable itself, fetching corner data only for correct repr (#1201)
- Allow some operands to fail fast (#1229)
- Rename LocalClusterSession to ClusterSession (#1230)
Bug fixes
- Fix serialization for mars.learn.utils.shuffle (#1192)
- Fix wrong result of column pruning (#1215)
- Fix error in starting local cluster with IPython (#1232)
Documentation
- Add learn docs (#1182)
- Add translation for learn docs (#1183)
- Add documentation for DataFrame arithmetic operands (#1191)
- Add logo in readme and docs (#1213)
Tests
- Workaround for upgraded tiledb (#1195)
v0.4.0
This is the release notes of v0.4.0. See here for the complete list of solved issues and merged PRs.
This release note only covers the difference from v0.4.0rc1; for all highlights and changes, please refer to the release notes of the pre-releases:
Changes that break compatibility
- Calling .execute() no longer returns a numpy ndarray, pandas DataFrame and so forth; it returns the Mars tensor or DataFrame itself. Only corner data is fetched for display purposes. To convert explicitly, call .to_numpy() to get a numpy ndarray or .to_pandas() to get a pandas DataFrame. For more details, please refer to #1201.
Highlights
- Remote API is introduced with preliminary support in #1239; for more details, refer to proposal #1227.
New Features
- Tensor
  - Implements mt.trapz (#1223)
- DataFrame
- Learn
- Others
  - Add preliminary remote function support (#1239)
Enhancements
- Tileable.execute() now returns the Tileable itself, so repr behaves correctly (#1202)
- Rename LocalClusterSession to ClusterSession (#1236)
Bug fixes
v0.4.0rc1
This is the release notes of v0.4.0rc1. See here for the complete list of solved issues and merged PRs.
New Features
- DataFrame
  - Add support for isna, notna and __dir__ (#1125)
  - Add support for md.dropna (#1129)
  - Support groupby.__getitem__ and group by level (#1136)
  - Implement DataFrame nunique (#1137)
  - Implements md.cut (#1139)
  - Add plot and related functions for DataFrame and Series (#1143)
  - Implements {DataFrame, Series}.{shift, tshift} (#1157)
  - Add support of md.expanding (#1160)
  - Implements {DataFrame,Series}.diff (#1174)
  - Support modulo operand for DataFrame (#1176)
  - Add Series.value_counts() support (#1181)
- Tensor
- Learn
Enhancements
- Refactor GroupBy objects (#1127)
Bug fixes
- Support md.merge when on column is in df.index (#1132)
- Fix tokenizing partial function (#1149)
- Allow retrieving shape of a groupby object (#1155)
Documentation
- Add DataFrame docs (#1130)
- Fix requirements for doc (#1135)
- Fix rendering of numpy-style documentation (#1179)
- Fix some mistakes in the doc. (#1161, thanks @ueshin!)
Tests
v0.3.4
This is the release notes of v0.3.4. See here for the complete list of solved issues and merged PRs.
New Features
- DataFrame
  - Add support for isna, notna and __dir__ (#1126)
  - Add support for {DataFrame,Series}.agg (#1128)
  - Add support for md.dropna (#1131)
  - Implements {DataFrame, Series}.{shift, tshift} (#1168)
  - Add plot and related functions for DataFrame and Series (#1166)
  - Implement DataFrame nunique (#1170)
  - Implements {DataFrame,Series}.diff (#1177)
  - Support modulo operand for DataFrame (#1180)
- Tensor
Bug fixes
- Support md.merge when on column is in df.index (#1165)
Tests
v0.3.3
New Features
Enhancements
- Optimize performance of executor when running fewer ops than the degree of parallelism (#1099)
Bug fixes
- Fix validate_axis when input tileable has unknown shape (#1092)
- Support creating DataFrame from dict in which scalar exists (#1104)
- Support slice that can be integer or other types on non-int64 index (#1109)
Tests
- Check metadata consistency for output chunks and tileables (#1094)
v0.4.0b2
This is the release notes of v0.4.0b2. See here for the complete list of solved issues and merged PRs.
New Features
- DataFrame
Enhancements
- Optimize performance of executor when running fewer ops than the degree of parallelism (#1096)
Bug fixes
- Fix validate_axis when input tileable has unknown shape (#1091)
- Support creating DataFrame from dict in which scalar exists (#1098)
- Support slice that can be integer or other types on non-int64 index (#1103)
Tests
- Check metadata consistency for output chunks and tileables (#1071)
v0.3.2
This is the release notes of v0.3.2. See here for the complete list of solved issues and merged PRs.
New Features
- DataFrame
  - Implement md.{cummax, cummin, cumprod, cumsum} (#1022)
  - Add support for md.fillna (#1031)
  - Add DataFrame.loc support (#1060)
  - Add DataFrame.rolling support (#1061)
  - Add support for GroupBy.{cumcount, cummin, cummax, cumprod, cumsum} (#1072)
  - Support string and datetime methods via Series.str and Series.dt accessor (#1074)
  - Implement dataframe append (#1075)
  - Implement DataFrame.concat and Series.concat (#1078)
  - Add support for DataFrame.sort_values (#1081)
  - Support sort_index for DataFrame and Series (#1082)
  - Add md.date_range support (#1086)
  - Logical operators on DataFrame and Series. (#1088)
  - Implements head/tail based on iloc, and fixes bug in getitem. (#1089)
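The cumulative functions listed above, md.{cummax, cummin, cumprod, cumsum}, are assumed here to carry their usual pandas meaning: each output element is the running reduction of all inputs up to and including that position. A small standard-library sketch of those semantics on a plain list (no Mars or pandas involved; the variable names are illustrative only):

```python
from itertools import accumulate
import operator

values = [3, 1, 4, 1, 5]

cumsum = list(accumulate(values))                 # running sum
cumprod = list(accumulate(values, operator.mul))  # running product
cummax = list(accumulate(values, max))            # running maximum
cummin = list(accumulate(values, min))            # running minimum

print(cumsum)   # [3, 4, 8, 9, 14]
print(cumprod)  # [3, 3, 12, 12, 60]
print(cummax)   # [3, 3, 4, 4, 5]
print(cummin)   # [3, 1, 1, 1, 1]
```

Under the same assumption, the GroupBy variants would apply this running reduction separately within each group.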
Enhancements
- Use mapjoin to optimize df.merge (#1023)
- Refactor tiling of DataFrame.iloc with index_lib (#1043)
- Add sort_range_index parameter in read_csv (#1067)
Bug fixes
- Standardize RangeIndex for unknown shape DataFrame (#1066)
- Fix failed cases in distributed mode (#1079)
- Fix wrong dtypes in df.rechunk (#1083)
- Fix consistency between tensor metadata and real outputs (#1087)
Tests
- Fix tests under Python 3.6 as VS2015 is preinstalled (#1015)
v0.4.0b1
This is the release notes of v0.4.0b1. See here for the complete list of solved issues and merged PRs.
New Features
- DataFrame
  - Implement md.{cummax, cummin, cumprod, cumsum} (#1019)
  - Implement dataframe append (#1026)
  - Add support for md.fillna (#1029)
  - Implement DataFrame.concat and Series.concat (#1040)
  - Support groupby.agg with list of functions (#1030)
  - Implement md.{DataFrame,Series,GroupBy}.apply (#1038)
  - Add support for DataFrame.sort_values (#1046)
  - Add DataFrame.loc support (#1042)
  - Add DataFrame.rolling support (#1045)
  - Add support for {DataFrame,Series}.agg (#1054)
  - Support string and datetime methods via Series.str and Series.dt accessor (#1063)
  - Add support for GroupBy.{cumcount, cummin, cummax, cumprod, cumsum} (#1069)
  - Support sort_index for DataFrame and Series (#1053)
  - Add md.date_range support (#1073)
  - Logical operators on DataFrame and Series. (#1056)
  - Implements head/tail based on iloc, and fixes bug in getitem. (#1057)
- Others
  - Add support for function serialization (#1048)
Enhancements
- Use mapjoin to optimize df.merge (#1021)
- Add sort_range_index parameter in read_csv (#1024)
- Refactor tiling of DataFrame.iloc with index_lib (#1016)
Bug fixes
- Fix KNN so that it can accept input with unknown shape (#1033)
- Support serializing pd.Timestamp and pd.Timedelta (#1065)
- Fix failed cases in distributed mode (#1062)
- Fix wrong dtypes in df.rechunk (#1080)
- Fix failed fit method selection for KNN when input has unknown shape (#1050)
- Fix consistency between tensor metadata and real outputs (#1085)
Tests
- Fix tests under Python 3.6 as VS2015 is preinstalled (#1014)
v0.4.0a2
This is the release notes of v0.4.0a2. See here for the complete list of solved issues and merged PRs.
New Features
- Tensor
- DataFrame
- Learn
Enhancements
- Refactor tensor indexing (#1011)
Bug fixes
- Fix tiling of nonzero so that the tensor, rather than the tensor data, is used during the process (#954)
- Fix cdist(x, y) creating a tensor with wrong nsplits (#960)
- Fix the wrong RangeIndex in read_csv (#930)
- Stop detecting GPU when no cuda devices are configured (#973)
- Fix wrong behavior of mt.random.choice (#976)
- Make sure all kwargs are numpy types when inferring dtypes (#987)
- Fix error when chunk_size is not provided for md.read_sql_table (#990)
- Fix wrong result of count_nonzero (#1002)
- Add dtype property for TensorImread (#1004)
- Fix error when no device detected by CUDA driver (#1007)
Tests
v0.3.1
This is the release notes of v0.3.1. See here for the complete list of solved issues and merged PRs.
New Features
- Tensor
- DataFrame
Enhancements
- Refactor tensor indexing (#1012)
Bug fixes
- Stop detecting GPU when no cuda devices are configured (#975)
- Fix wrong behavior of choice (#993)
- Make sure all kwargs are numpy types when inferring dtypes (#995)
- Fix wrong result of count_nonzero (#1003)
- Add dtype property for TensorImread (#1005)
- Fix error when no device detected by CUDA driver (#1008)