Releases: mars-project/mars

v0.4.0a1

19 Jan 16:11
Pre-release

These are the release notes for v0.4.0a1. See here for the complete list of solved issues and merged PRs.

Announcements

Because Python 2 reached end-of-life (EOL) on January 1, 2020, the v0.4.x series no longer supports Python 2, starting with v0.4.0a1. Python 2.7 users should use the v0.3.x series.

Changes that break compatibility

  • Operands now support stages (#934). As a result, reduction operands, as well as operands whose tiled chunks contain map or reduce phases, cannot be serialized between this version and earlier ones.
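
As a rough illustration of a reduction with separate map and reduce stages, the sketch below computes a mean over chunks in two steps. It is not Mars code: the names `map_stage`, `reduce_stage` and `chunked_mean`, and the list-of-lists chunk layout, are assumptions made only for this example.

```python
from typing import List, Tuple


def map_stage(chunk: List[float]) -> Tuple[float, int]:
    """Map stage: reduce one chunk to a partial (sum, count) pair."""
    return sum(chunk), len(chunk)


def reduce_stage(partials: List[Tuple[float, int]]) -> float:
    """Reduce stage: merge the partial pairs into the final mean."""
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    if count == 0:
        raise ValueError("cannot take the mean of zero elements")
    return total / count


def chunked_mean(chunks: List[List[float]]) -> float:
    """Run the map stage on every chunk, then a single reduce stage."""
    return reduce_stage([map_stage(c) for c in chunks])


chunks = [[1.0, 2.0], [3.0], [4.0, 5.0, 6.0]]
print(chunked_mean(chunks))  # 3.5
```

The partial `(sum, count)` pairs are the only data exchanged between the two stages, which is why a stage-aware operand needs a different serialized form than a single-stage one.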

New Features

  • Tensor
    • Implements mt.histogram and mt.histogram_bin_edges (#876)
    • Add mt.partition support (#889)
    • Implements mt.{percentile, quantile, median} (#898)
    • Support Einstein summation convention (#888)
    • Add mt.fill_diagonal support (#918)
    • Support mars.tensor.spatial.distance.{pdist, cdist, squareform} (#894)
  • DataFrame
    • Support creating DataFrame from dict whose values are tensors (#903)
    • Support DataFrame and Series count (#900)
    • Implement mean operator for DataFrame and Series (#907)
    • Implements DataFrame.quantile and Series.quantile (#911)
    • Add comparison functions for DataFrame (#921)
    • Support df.reset_index and series.reset_index (#915)
  • Learn
    • Add pairwise distances support for learn (#926)
    • Implement MarsDataset to integrate with PyTorch (#937)
  • Others
    • Add function objects implementation for tokenizer (#893)
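
For readers unfamiliar with how a histogram can be computed over a chunked tensor, here is a minimal pure-Python sketch: each chunk is counted against shared bin edges, and the per-chunk counts are summed. This is an assumption-laden illustration, not the implementation behind mt.histogram; the helper names `chunk_histogram` and `chunked_histogram` are hypothetical. The last bin is treated as closed on the right, as in NumPy.

```python
from bisect import bisect_right
from typing import List, Sequence


def chunk_histogram(chunk: Sequence[float], edges: Sequence[float]) -> List[int]:
    """Count the values of one chunk against sorted, shared bin edges."""
    nbins = len(edges) - 1
    counts = [0] * nbins
    for v in chunk:
        if v == edges[-1]:
            idx = nbins - 1  # the last bin includes its right edge
        else:
            idx = bisect_right(edges, v) - 1
        if 0 <= idx < nbins:  # values outside the edges are ignored
            counts[idx] += 1
    return counts


def chunked_histogram(chunks, edges) -> List[int]:
    """Sum the per-chunk counts into the final histogram."""
    total = [0] * (len(edges) - 1)
    for chunk in chunks:
        for i, c in enumerate(chunk_histogram(chunk, edges)):
            total[i] += c
    return total


print(chunked_histogram([[0.5, 1.5], [2.0, 3.0]], [0, 1, 2, 3]))  # [1, 1, 2]
```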

Enhancements

  • Use default args for super() (#878)
  • Skip preparing specified chunks when preparing for execution (#891)
  • Accelerate LU when input has one chunk (#905)
  • Add support for AnyReference in serialization (#874)
  • Merge operands representing multiple stages of one single operand (#934)
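
The super() change (#878) is possible because the series is now Python 3 only. A minimal sketch of the two spellings, using hypothetical `Base` and `Child` classes that are not part of Mars:

```python
class Base:
    def describe(self) -> str:
        return "base"


class Child(Base):
    def describe(self) -> str:
        # Python 2 compatible spelling: super(Child, self).describe()
        # Python 3 zero-argument form, equivalent inside a method:
        return "child of " + super().describe()


print(Child().describe())  # child of base
```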

Tests

  • Add TestExecutor that serializes and deserializes the graph on every execution to ensure all operands work well with serialization (#880)
  • Fix possible failure of testIterativeTilingWithoutEtcd for Python 3.5 in CI (#896)
  • Switch coverage service to codecov (#909)
  • Remove *_pb2.py to reduce chances of code conflict (#913)
  • Fix failures in Windows tests (#938)
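
The idea behind a test executor that round-trips the graph through serialization before every run can be sketched as follows. This is only an illustration using the standard pickle module; the `RoundTripExecutor` class and the list of `(callable, args)` pairs standing in for a graph are assumptions, not the TestExecutor from #880 or Mars's own serialization.

```python
import operator
import pickle


class RoundTripExecutor:
    """Hypothetical executor that pickles and unpickles the graph before running it."""

    def execute(self, graph):
        # Any node that cannot survive serialization fails here,
        # before a single operation is executed.
        restored = pickle.loads(pickle.dumps(graph))
        return [op(*args) for op, args in restored]


graph = [(operator.add, (1, 2)), (operator.mul, (3, 4))]
print(RoundTripExecutor().execute(graph))  # [3, 12]
```

Running every test through such an executor surfaces serialization gaps as ordinary test failures instead of leaving them to appear only in distributed runs.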

Others

  • Drop support for Python 2 (#872)
  • Further remove py27-related imports (#875)

v0.3.0

17 Jan 17:15

These are the release notes for v0.3.0. See here for the complete list of solved issues and merged PRs.

These release notes cover only the differences from v0.3.0rc1; for all highlights and changes, please refer to the release notes of the pre-releases:

alpha1
alpha2
beta1
beta2
rc1

Announcements

From v0.3.0 on, v0.3.x will be the last series that supports Python 2 until the release of v0.4.0.

Changes that break compatibility

  • Operands now support stages (#935). As a result, reduction operands, as well as operands whose tiled chunks contain map or reduce phases, cannot be serialized between this version and earlier ones.

New Features

  • Tensor
    • Implements mt.histogram and mt.histogram_bin_edges (#914)
    • Add mt.partition support (#916)
    • Implements mt.{percentile, quantile, median} (#919)
    • Support Einstein summation convention (#925)
    • Add mt.fill_diagonal support (#931)
  • DataFrame
    • Support creating DataFrame from dict whose values are tensors (#922)
    • Implements DataFrame.quantile and Series.quantile (#924)
    • Support DataFrame and Series count (#923)
    • Implement mean operator for DataFrame and Series (#927)
    • Add comparison operands for DataFrame (#929)
    • Support df.reset_index and series.reset_index (#933)

Enhancements

  • Add public base class for entity data (#879)
  • Merge operands representing multiple stages of one single operand (#935)

Bug fixes

  • Fix sparse behavior for tensor.min and tensor.max (#936)

Tests

  • Add TestExecutor that serializes and deserializes the graph on every execution to ensure all operands work well with serialization (#881)
  • Fix possible failure of testIterativeTilingWithoutEtcd for Python 3 in CI (#906)
  • Remove *_pb2.py to reduce chances of code conflict (#920)

v0.3.0rc1

15 Dec 17:58
Pre-release

These are the release notes for v0.3.0rc1. See here for the complete list of solved issues and merged PRs.

Highlights

  • Mars can now handle more cases that previously failed because of tensors with unknown chunk shapes, thanks to the iterative tiling support introduced in #834.
  • Python 3.8 wheels are supported in this release.

New Features

  • Support iterative tiling (#834)
  • Add experimental column pruning rules for tileable graph optimization (#865)
  • Tensor
    • Add mt.sort support (#827)
  • DataFrame
    • Support DataFrame rechunk (#839)
    • Support Series setitem and getitem via iloc (#843)
    • Add tree reduction method for DataFrame groupby aggregations (#850)
  • Learn
    • Add mars.learn.datasets.samples_generator.make_blobs and update README (#845)
    • Support running PyTorch in a Mars cluster via run_pytorch_script (#861)
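
To illustrate what a tree reduction for groupby aggregations means in general, the sketch below merges per-chunk partial sums level by level instead of in one final step. It is a generic illustration, not the method added in #850; `tree_reduce`, `merge_sums`, the `fan_in` parameter, and the dict-based partial results are all assumptions for this example.

```python
from typing import Callable, Dict, List, TypeVar

T = TypeVar("T")


def tree_reduce(parts: List[T], combine: Callable[[T, T], T], fan_in: int = 2) -> T:
    """Combine partial results in groups of `fan_in` until one remains."""
    if not parts:
        raise ValueError("nothing to reduce")
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2")
    level = list(parts)
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level), fan_in):
            group = level[i:i + fan_in]
            acc = group[0]
            for item in group[1:]:
                acc = combine(acc, item)
            next_level.append(acc)
        level = next_level
    return level[0]


def merge_sums(a: Dict[str, int], b: Dict[str, int]) -> Dict[str, int]:
    """Merge two per-key partial sums, as in a groupby(...).sum()."""
    out = dict(a)
    for key, value in b.items():
        out[key] = out.get(key, 0) + value
    return out


# One dict of per-key sums per chunk.
partials = [{"a": 1, "b": 2}, {"a": 3}, {"b": 4, "c": 5}, {"a": 6}]
print(tree_reduce(partials, merge_sums))  # {'a': 10, 'b': 6, 'c': 5}
```

With a fan-in of 2, each combine step sees at most two partial results, so no single step has to hold every chunk's output at once.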

Enhancements

  • Add ReceiverStatusActor to help listening at receiver end (#833)
  • Assign enqueued operands immediately when no descendants are ready (#854)
  • Support transferring multiple chunks at one time (#841)

Bug fixes

  • Fix incorrect behavior of dataframe arithmetic (#838)
  • Mark resource as processing once allocated (#848)
  • Fix read_csv execution on GPU (#859)
  • Kill process tree when terminating a worker process (#864)

Tests

  • Add separate environment to test HDFS (#829)
  • Add CI/CD for Python 3.8 (#857)
  • Fix distribute error under Py38 (#871)

v0.2.4

14 Dec 17:55

These are the release notes for v0.2.4. See here for the complete list of solved issues and merged PRs.

New Features

  • Add mt.sort support (#862)
  • Support DataFrame rechunk (#866)
  • Support Series setitem and getitem via iloc (#868)
  • Add tree reduction method for DataFrame groupby aggregations (#869)

Enhancements

  • Backport CUDA-related changes in utils (#846)
  • Resolve compatibility issue for Python 3.8 (#858)

Bug fixes

  • Fix incorrect behavior of dataframe arithmetic (#840)
  • Mark resource as processing once allocated (#851)
  • Kill process tree when terminating a worker process (#867)
  • Fix read_csv execution on GPU (#870)

Tests

  • Add separate environment to test HDFS (#835)

v0.3.0b2

15 Nov 17:13
Pre-release

These are the release notes for v0.3.0b2. See here for the complete list of solved issues and merged PRs.

Highlights

  • Interoperability with XGBoost and TensorFlow is introduced:
    • mars.learn.contrib.xgboost.XGBClassifier and mars.learn.contrib.xgboost.XGBRegressor can be used for distributed classification and regression tasks.
    • mars.learn.contrib.tensorflow.run_tensorflow_script supports running distributed TensorFlow 2.0 training in a Mars cluster.

New Features

  • Tensor
    • Add mt.unique support for tensor (#783)
  • DataFrame
    • Support DataFrame subtract operator (#787)
    • Support conversion between series and tensor (#791)
    • Refactor of DataFrame reduction and support more reduction operands (#789)
    • Support DataFrame read_csv (#807)
  • Learn
    • Add XGBoost support (#769)
    • Add ObjectData and ObjectChunk to represent data beyond ndarray, dataframe etc (#805)
    • Add mars.learn.utils.shuffle to support shuffling multiple tileable objects in a consistent way (#808)
    • Support running distributed TensorFlow 2.0 via run_tensorflow_script (#820)

Enhancements

  • Return execution exception info properly to session client (#770)
  • Simplify tiles logic to improve its performance (#792)
  • Support axis argument for permutation and shuffle (#803)
  • Support __iadd__ etc. by wrapping add with the out argument (#813)
  • Handle worker storage in batches (#818)

Bug fixes

  • Correct type checking for DataFrame arithmetic (#815)

Tests

  • Switch CI service to GitHub Actions (#793)
  • Move tests in AppVeyor into GitHub Actions (#795)

Others

  • Bump copyright year to 2020 (#809)

v0.2.3

15 Nov 17:38

These are the release notes for v0.2.3. See here for the complete list of solved issues and merged PRs.

New Features

  • Tensor
    • Add mt.unique support for tensor (#798)
  • DataFrame
    • Support DataFrame subtract operator (#800)
    • Support conversion between series and tensor (#806)
    • Refactor of DataFrame reduction and support more reduction operands (#816)
    • Support DataFrame read_csv (#826)

Enhancements

  • Simplify tiles logic to improve its performance (#801)
  • Return execution exception info properly to session client (#821)
  • Support axis argument for permutation and shuffle (#822)
  • Support __iadd__ etc. by wrapping add with the out argument (#824)

Bug fixes

  • Correct type checking for DataFrame arithmetic (#819)
  • Fix stuck issue of GeventThreadPoolExecutor (#823)

Tests

  • Switch CI service to GitHub Actions (#794)
  • Move tests in AppVeyor into GitHub Actions (#797)
  • Fix etcd cases under macOS Catalina (#811)

v0.3.0b1

22 Oct 04:02
Pre-release

These are the release notes for v0.3.0b1. See here for the complete list of solved issues and merged PRs.

New Features

  • Implements numpy.random.shuffle and numpy.random.permutation for tensor (#762)
  • Add preliminary support for distributed execution with CUDA (#776)
  • Add multiple GPU support for local execution (#779)
  • Support DataFrame groupby.agg (#767)

Enhancements

  • Overhaul dataframe/series index alignment (#737)
  • Add support for controlling data copy across processes (#766)

Bug fixes

  • Fix relocation of plasma error objects (#771)
  • Fix execution of arithmetic on GPU (#775)

v0.2.2

22 Oct 05:46

These are the release notes for v0.2.2. See here for the complete list of solved issues and merged PRs.

New Features

  • Add multiple GPU support for local execution (#781)
  • Implements numpy.random.shuffle and numpy.random.permutation for tensor (#780)
  • Support DataFrame groupby.agg (#782)

Enhancements

  • Overhaul dataframe/series index alignment (#778)

Bug fixes

  • Fix execution of arithmetic on GPU (#777)

v0.3.0a2

23 Sep 06:21
Pre-release

These are the release notes for v0.3.0a2. See here for the complete list of solved issues and merged PRs.

New Features

  • Add to_gpu and to_cpu support for both tensor and DataFrame (#630)
  • Access column using __getattr__ syntax for DataFrame (#712)

Enhancements

  • Move related files to optimizes module (#640)
  • Add option for plasma path (#699)
  • Wait for graph to finish instead of querying with fixed intervals (#701)
  • Submit initial operands together in one RPC call (#711)
  • Add lock free option for workers (#716)
  • Implements more flexible tileable.cix[] (#731)
  • Submit metas obtained from schedulers (#727)
  • Spawn promise to utilize async network libs (#725)
  • Simplify data transfer protocol (#736)
  • Fuse some operations in cholesky's tile (#742)

Bug fixes

  • Separate flags for initials and terminals for operands (#703)
  • Remove redundant RPC calls for schedulers (#705)
  • Fix incorrect chunk shape in QR decomposition (#719)
  • __setitem__ on a view should still be a view (#733)
  • Process index and columns separately (and correctly) in from_tensor (#723)
  • Add a config to use cpuacct.stat to calculate cpu usage (#740)
  • Fix race condition when starting tasks and adding callbacks (#755)

v0.2.1

23 Sep 07:27

These are the release notes for v0.2.1. See here for the complete list of solved issues and merged PRs.

New Features

  • Add to_gpu and to_cpu support for both tensor and DataFrame (#706)
  • Access column using __getattr__ syntax for DataFrame (#746)

Enhancements

  • Wait for graph to finish instead of querying with fixed intervals (#707)
  • Spawn promise to utilize async network libs (#735)
  • Submit metas obtained from schedulers (#741)
  • Submit initial operands together in one RPC call (#745)
  • Fuse some operations in cholesky's tile (#749)
  • Simplify data transfer protocol (#744)

Bug fixes

  • Separate flags for initials and terminals (#708)
  • Remove redundant RPC calls for schedulers (#709)
  • Fix incorrect chunk shape in QR (#722)
  • Use cpuacct.stat to calculate cpu usage in Docker containers (#743)
  • Process index and columns separately (and correctly) in from_tensor (#747)
  • __setitem__ on a view should still be a view (#748)