
Commit cd10979

AP-7115: Created README.md, LICENCES, .lfsconfig.
1 parent d035443 commit cd10979

7 files changed

+2937
-0
lines changed

.lfsconfig

+2
@@ -0,0 +1,2 @@
1+
[lfs]
2+
url = https://bitbucket.org/KNIME/knime-python.git/info/lfs

LICENSE.txt

+1
@@ -0,0 +1 @@
1+
This repository contains parts of the KNIME Analytics Platform as a bundle of various modules. Each module is released under its own license. Please refer to the [project/directory/package] of the module to find its corresponding license. The KNIME trademark and logo and OPEN FOR INNOVATION are registered in the United States and/or Germany, owned by KNIME GmbH. All other trademarks belong to their respective owners.

README.md

+90
@@ -0,0 +1,90 @@
1+
# Overview
2+
3+
This repository contains:
4+
5+
* KNIME Python Integration (major version 2)
6+
* KNIME Python Integration (major versions 2&3)
7+
* KNIME Jython Integration
8+
9+
The KNIME Python Integration (major version 2) is the current default Python Integration, providing nodes to connect to Python in the "Scripting" category of the node browser. Data transfer is based on google-protobuf. This Python Integration is no longer actively developed.
10+
The KNIME Python Integration (major versions 2&3) is a new Python Integration providing more memory-efficient and processing-time-efficient serialization. To this end, the (de)serialization module is implemented as a pluggable component using the Eclipse extension point mechanism on the Java side.
11+
The "KNIME Jython Integration" includes three Snippet nodes providing capabilities to use Jython in KNIME.
12+
13+
# Details
14+
15+
## KNIME Python Integration (major versions 2&3)
16+
17+
### Contained Projects
18+
19+
* *org.knime.features.python2*: contains build.properties and feature.xml (only relevant for the KNIME build system)
20+
* *org.knime.python2*: all controller classes managing the communication between KNIME and Python
21+
* *org.knime.python2.nodes*: KNIME node implementations
22+
* *org.knime.python2.serde.arrow*: a serialization library based on Apache Arrow (see detailed explanation)
23+
* *org.knime.python2.serde.csv*: a serialization library using .csv files (see detailed explanation)
24+
* *org.knime.python2.serde.flatbuffers*: a serialization library based on google-flatbuffers (see detailed explanation)
25+
* *org.knime.python.typeextensions* (shared with "KNIME Python Integration (major version 2)"): custom (de)serializers for more complex KNIME-types
26+
27+
### Explanation
28+
29+
![Configure dialog](https://bitbucket.org/KNIME/knime-python/raw/master/res/python_node_configure.png)
30+
31+
The KNIME Python Integration (major versions 2&3) provides a variety of nodes for executing python code. With them, inputs and outputs can be accessed through "magic variables" inside a python script. The available variables can be checked in the table on the left side of the configure dialog (see image above). KNIME tables are translated into pandas.DataFrame objects
32+
on the Python side and vice versa. Flow variables can be accessed via a dictionary. Custom serialization methods for a variety of complex data types allow transferring them between KNIME and the Python kernel. These so-called type extensions are defined in the org.knime.python.typeextensions project. At the moment, built-in extensions exist for .png images, .svg images, date&time types, XML cells, and bytevector cells. Further type extensions may be defined using the Eclipse extension point mechanism.
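As a minimal sketch of what such a script might look like: the names `input_table`, `output_table`, and `flow_variables` below are assumptions chosen for illustration, and the table in the configure dialog lists the variables that are actually available in a given node. The first assignments only create stand-ins so that the snippet also runs outside KNIME.

```python
import pandas as pd

# Stand-ins so the sketch runs outside KNIME. Inside a node, objects like
# these would be provided by the integration (the names are assumptions;
# check the variable table in the configure dialog).
input_table = pd.DataFrame({"price": [10.0, 20.0, 30.0], "qty": [1, 2, 3]})
flow_variables = {"tax_rate": 0.5}

# Script body: derive a new column from the input table and a flow variable.
output_table = input_table.copy()
output_table["total"] = (
    output_table["price"] * output_table["qty"] * (1 + flow_variables["tax_rate"])
)
```

Working on a copy leaves the input DataFrame untouched, which keeps the script easy to re-run while experimenting.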
33+
Furthermore, general Python options, such as the path to the python executables in major version 2 and the serialization library to use in major version 3, can be configured via the Python preference page found in the menu under "Preferences -> KNIME -> Python (Labs)". Serialization libraries define methods for (de)serializing a KNIME table to a byte representation and vice versa on the Java side, and methods for (de)serializing a byte representation into a pandas.DataFrame and vice versa on the Python side. Serialization libraries are implemented as interchangeable modules using the Eclipse extension point mechanism. Currently, three different serialization libraries are implemented in their respective projects: org.knime.python2.serde.arrow (based on the Apache Arrow technology; see: [https://arrow.apache.org/](https://arrow.apache.org/)), org.knime.python2.serde.csv (exchanges data using .csv files), and org.knime.python2.serde.flatbuffers (based on the google-flatbuffers technology; see: [https://google.github.io/flatbuffers/](https://google.github.io/flatbuffers/)).
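The round trip that a serialization library performs (table to bytes, bytes back to table) can be illustrated with a small CSV-based sketch that uses only the Python standard library. This is not the code of org.knime.python2.serde.csv; the function names `table_to_bytes` and `bytes_to_table` are hypothetical and only show the general idea.

```python
import csv
import io


def table_to_bytes(columns, rows):
    """Serialize a header and rows into UTF-8 encoded CSV bytes."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(columns)
    writer.writerows(rows)
    return buffer.getvalue().encode("utf-8")


def bytes_to_table(data):
    """Deserialize CSV bytes back into a header and a list of rows."""
    reader = csv.reader(io.StringIO(data.decode("utf-8"), newline=""))
    columns = next(reader)
    rows = [row for row in reader]
    return columns, rows


columns = ["id", "name"]
rows = [["1", "alpha"], ["2", "beta, gamma"]]

payload = table_to_bytes(columns, rows)
restored_columns, restored_rows = bytes_to_table(payload)
```

Plain CSV carries no type information, so every value comes back as a string in this sketch; a real serialization library has to transport column types as well.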
34+
In the node configure dialog window, the Python major version to use can be selected in the options tab. Furthermore, missing value handling can be customized for Int and Long columns, as those are converted to double columns by default as soon as they contain missing values. In the options tab, missing values in these columns can instead be converted to a sentinel value (an arbitrary replacement value).
35+
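The two ways of handling missing values in integer columns can be sketched in plain Python. The helper names are hypothetical and the code only illustrates the idea, not the integration's actual conversion logic.

```python
import math


def to_double_column(values):
    """Default behaviour, sketched: missing values force a float column."""
    return [math.nan if v is None else float(v) for v in values]


def to_int_column_with_sentinel(values, sentinel):
    """Alternative: keep integers and replace missing values by a sentinel."""
    return [sentinel if v is None else int(v) for v in values]


column = [1, None, 3]

as_double = to_double_column(column)  # integers become floats, None becomes NaN
as_int = to_int_column_with_sentinel(column, sentinel=-1)  # stays integer
```

The sentinel should be a value that cannot occur as real data in the column; otherwise genuine entries and missing values can no longer be told apart.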
36+
![The Python Integration Nodes in action](https://bitbucket.org/KNIME/knime-python/raw/master/res/python_example_workflow.png)
37+
38+
The following nodes are available in the "KNIME Python Integration (major versions 2&3)" plugin:
39+
40+
* **Python Edit Variable (Labs):** edit or append KNIME flow variables
41+
* **Python Source (Labs):** run a python script, build a pandas.DataFrame, and transfer it back to KNIME
42+
* **Python Script (1⇒1) (Labs):** run a python script processing a single KNIME table, build a pandas.DataFrame, and transfer it back to KNIME
43+
* **Python Script (1⇒2) (Labs):** run a python script processing a single KNIME table, build two separate pandas.DataFrames, and transfer them back to KNIME
44+
* **Python Script (2⇒1) (Labs):** run a python script processing two KNIME tables, build a pandas.DataFrame, and transfer it back to KNIME
45+
* **Python Script (2⇒2) (Labs):** run a python script processing two KNIME tables, build two separate pandas.DataFrames, and transfer them back to KNIME
46+
* **Python View (Labs):** run a python script that creates a view, e.g. a diagram
47+
* **Python Object Reader (Labs):** read a python object that was written using the Python Object Writer (Labs). Creates a special python object at the output that may be processed by the Python Predictor (Labs) node
48+
* **Python Object Writer (Labs):** write a python object as a pickle-file. Python objects can be created using the Python Learner (Labs) node
49+
* **Python Learner (Labs):** use python to train a model. The model is returned as a special python object that may be processed by the Python Predictor (Labs).
50+
* **Python Predictor (Labs):** use python to make predictions for a KNIME table based on a trained model.
51+
* **Python Script (DB) (Labs):** modify a database query using python. Get back the query results as a pandas.DataFrame.
52+
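The learner, writer, reader, and predictor nodes above revolve around a Python object that is trained once, stored as a pickle file, and loaded again later. A minimal standard-library sketch of that life cycle follows; the toy "model" (a dictionary holding a mean) and the helper names `train` and `predict` are hypothetical and unrelated to the nodes' internals.

```python
import os
import pickle
import tempfile


def train(values):
    """Toy learner: the 'model' is just the mean of the training values."""
    return {"mean": sum(values) / len(values)}


def predict(model, n_rows):
    """Toy predictor: predict the stored mean for every row."""
    return [model["mean"]] * n_rows


model = train([1.0, 2.0, 3.0])

# Write the object to a pickle file ...
path = os.path.join(tempfile.mkdtemp(), "model.pkl")
with open(path, "wb") as handle:
    pickle.dump(model, handle)

# ... and read it back, as a separate step could do later.
with open(path, "rb") as handle:
    restored = pickle.load(handle)

predictions = predict(restored, 2)
```

Pickle files can execute arbitrary code when loaded, so only files from a trusted source should be read this way.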
53+
The node implementations may be found in the project *org.knime.python2.nodes*. All controller classes managing the communication between KNIME and Python can be found in the project *org.knime.python2*.
54+
55+
*NOTE: The name "python2" always refers to the KNIME Python Integration (major versions 2&3).*
56+
57+
A detailed explanation of how to set up python with KNIME can be found here: [https://www.knime.com/blog/setting-up-the-knime-python-extension-revisited-for-python-30-and-20](https://www.knime.com/blog/setting-up-the-knime-python-extension-revisited-for-python-30-and-20)
58+
59+
## KNIME Python Integration (major version 2)
60+
61+
### Contained Projects
62+
63+
* *org.knime.features.python*: contains build.properties and feature.xml (only relevant for the KNIME build system)
64+
* *org.knime.python*: all controller classes managing the communication between KNIME and Python
65+
* *org.knime.python.nodes*: KNIME node implementations
66+
* *org.knime.python.typeextensions* (shared with "KNIME Python Integration (major versions 2&3)"): custom (de)serializers for more complex KNIME-types
67+
68+
### Explanation
69+
70+
The KNIME Python Integration (major version 2) has largely the same structure as the KNIME Python Integration (major versions 2&3) but uses google-protobuf as its (de)serialization backend. It is currently not under active development. The project org.knime.python.nodes contains the implementation of the included nodes. Again, all controller classes managing the communication between KNIME and Python can be found in the project org.knime.python.
71+
72+
*NOTE: The name "python" always refers to the KNIME Python Integration (major version 2).*
73+
74+
A detailed explanation of how to set up this python integration with KNIME can be found here: [https://www.knime.com/blog/how-to-setup-the-python-extension](https://www.knime.com/blog/how-to-setup-the-python-extension)
75+
76+
## KNIME Jython Integration
77+
78+
### Contained Projects
79+
* *org.knime.features.ext.jython*: contains build.properties and feature.xml (only relevant for the KNIME build system)
80+
* *org.knime.ext.jython*: all Java sources relevant for the Jython Integration
81+
82+
### Explanation
83+
84+
The following nodes are available in the "KNIME Jython Integration" plugin:
85+
* **JPython Function:** creates a new table column based on a combination of the input table's columns, using, for instance, mathematical operators.
86+
* **JPython Script 1:1:** run a Jython script processing a single KNIME table and producing a single KNIME output table
87+
* **JPython Script 2:1:** run a Jython script processing two KNIME tables and producing a single KNIME output table
88+
89+
All nodes and underlying code are contained in the project *org.knime.ext.jython*.
90+
