
Port melt to python datatable

Aditya Samantaray edited this page May 10, 2021 · 9 revisions

Background

datatable is a Python package for data reading, manipulation, and related tasks. It occupies the same niche as pandas, but is more geared towards large data sets. Python datatable is a sibling of R's data.table and attempts to mimic its core APIs and algorithms. However, there are several functions in R data.table that have not yet been implemented in Python datatable, including the melt function for wide-to-long data reshaping.

Coding project: melt for datatable

This project asks to implement the melt() function for dataset reshaping. Similar functions exist in most data manipulation libraries:

The basic premise of the melt() function is that it takes a frame of size n×k and produces a new frame of size nk×2, where one column contains the column names from the original dataset, and the other column contains all the values from the original dataset.
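The reshaping described above can be sketched in plain Python. This is only an illustration of the n×k to nk×2 premise, not the datatable implementation (which would be written in C++). The frame is modelled here as a dict of equal-length lists, and the output column names `variable` and `value`, as well as the column-by-column ordering of the output rows, are assumptions chosen for the example.

```python
def melt(frame):
    """Reshape a wide frame into long form.

    `frame` is a dict mapping column names to equal-length lists of values
    (a stand-in for a real Frame object; an assumption for this sketch).
    Returns a dict with two columns: "variable" holds the original column
    names and "value" holds the original values.
    """
    names = list(frame)
    nrows = len(frame[names[0]]) if names else 0

    variable = []
    value = []
    for name in names:
        column = frame[name]
        if len(column) != nrows:
            raise ValueError("all columns must have the same length")
        # Each original column contributes nrows rows to the output.
        variable.extend([name] * nrows)
        value.extend(column)

    return {"variable": variable, "value": value}


# A frame with n = 3 rows and k = 2 columns...
wide = {"A": [1, 2, 3], "B": [4, 5, 6]}
# ...becomes a frame with n * k = 6 rows and 2 columns.
long_frame = melt(wide)
```

Here `long_frame["variable"]` repeats each column name once per original row, and `long_frame["value"]` holds the values of column `A` followed by those of column `B`.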

This project requires knowledge of C++, since the majority of datatable code is written in C++.

Completing this project would require the author to submit a Pull Request (or a series of Pull Requests) directly to the datatable repository.

See also:

Expected impact

The melt() function is one of the most frequently requested features for datatable, and will therefore be a huge benefit to users of the library.

Mentors

Please get in touch after completing at least one of the tests below.

Tests

Do one or several; completing the harder tests makes you more likely to be selected.

  • Easy: Implement melting functionality and demonstrate that it works by adding the relevant unit tests;
  • Medium: Search for wide-to-long transformation tutorials for different platforms / packages (for example see the links above), and make sure your melt() function is just as capable. Demonstrate this by creating a datatable tutorial (as part of the official documentation) for using the function.

Solutions of tests

Students, please post a link to your test results here.

  • EXAMPLE STUDENT 1 NAME, LINK TO GITHUB PROFILE, LINK TO TEST RESULTS.
  • Aditya Samantaray, Github, Tests