DOC: Add docs for 2025 releases #2274

Merged
merged 16 commits into from
Jan 27, 2025
4 changes: 4 additions & 0 deletions 2023.2/.buildinfo
@@ -0,0 +1,4 @@
# Sphinx build info version 1
# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
config: 6800a6ffac59c178770a0c0cc83712a8
tags: 645f666f9bcd5a90fca523b33c5a78b7
Binary file added 2023.2/.doctrees/404.doctree
Binary file not shown.
Binary file added 2023.2/.doctrees/acceleration.doctree
Binary file not shown.
Binary file added 2023.2/.doctrees/algorithms.doctree
Binary file not shown.
Binary file added 2023.2/.doctrees/blogs.doctree
Binary file not shown.
Binary file added 2023.2/.doctrees/contribute.doctree
Binary file not shown.
Binary file added 2023.2/.doctrees/distributed-mode.doctree
Binary file not shown.
Binary file added 2023.2/.doctrees/environment.pickle
Binary file not shown.
Binary file added 2023.2/.doctrees/global-patching.doctree
Binary file not shown.
Binary file added 2023.2/.doctrees/guide/acceleration.doctree
Binary file not shown.
Binary file added 2023.2/.doctrees/index.doctree
Binary file not shown.
Binary file added 2023.2/.doctrees/installation.doctree
Binary file not shown.
Binary file added 2023.2/.doctrees/kaggle.doctree
Binary file not shown.
Binary file added 2023.2/.doctrees/kaggle/automl.doctree
Binary file not shown.
Binary file added 2023.2/.doctrees/kaggle/classification.doctree
Binary file not shown.
Binary file added 2023.2/.doctrees/kaggle/regression.doctree
Binary file not shown.
Binary file added 2023.2/.doctrees/memory-requirements.doctree
Binary file not shown.
386 changes: 386 additions & 0 deletions 2023.2/.doctrees/nbsphinx/samples/ElasticNet.ipynb
@@ -0,0 +1,386 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "3768ec43",
"metadata": {},
"source": [
"# Intel® Extension for Scikit-learn ElasticNet for Airlines DepDelay dataset"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "b1b922d1",
"metadata": {},
"outputs": [],
"source": [
"from timeit import default_timer as timer\n",
"from sklearn import metrics\n",
"from sklearn.model_selection import train_test_split\n",
"import warnings\n",
"from sklearn.datasets import fetch_openml\n",
"from sklearn.preprocessing import LabelEncoder\n",
"from IPython.display import HTML\n",
"\n",
"warnings.filterwarnings(\"ignore\")"
]
},
{
"cell_type": "markdown",
"id": "34e460a7",
"metadata": {},
"source": [
"### Download the data"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "00c2277b",
"metadata": {},
"outputs": [],
"source": [
"x, y = fetch_openml(name=\"Airlines_DepDelay_10M\", return_X_y=True)"
]
},
{
"cell_type": "markdown",
"id": "06d309c0",
"metadata": {},
"source": [
"### Preprocessing\n",
"Let's encode categorical features with LabelEncoder"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "2ff35bc2",
"metadata": {},
"outputs": [],
"source": [
"for col in [\"UniqueCarrier\", \"Origin\", \"Dest\"]:\n",
"    le = LabelEncoder().fit(x[col])\n",
"    x[col] = le.transform(x[col])"
]
},
{
"cell_type": "markdown",
"id": "38637349",
"metadata": {},
"source": [
"Split the data into train and test sets"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "0d332789",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"((9000000, 9), (1000000, 9), (9000000,), (1000000,))"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.1, random_state=0)\n",
"x_train.shape, x_test.shape, y_train.shape, y_test.shape"
]
},
{
"cell_type": "markdown",
"id": "246f819f",
"metadata": {},
"source": [
"Normalize the target values"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "454a341c",
"metadata": {},
"outputs": [],
"source": [
"from sklearn.preprocessing import StandardScaler\n",
"\n",
"scaler_y = StandardScaler()"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "df400504",
"metadata": {},
"outputs": [],
"source": [
"y_train = y_train.to_numpy().reshape(-1, 1)\n",
"y_test = y_test.to_numpy().reshape(-1, 1)\n",
"\n",
"scaler_y.fit(y_train)\n",
"y_train = scaler_y.transform(y_train).ravel()\n",
"y_test = scaler_y.transform(y_test).ravel()"
]
},
{
"cell_type": "markdown",
"id": "fe1d4fac",
"metadata": {},
"source": [
"### Patch original Scikit-learn with Intel® Extension for Scikit-learn\n",
"Intel® Extension for Scikit-learn (previously known as daal4py) contains drop-in replacement functionality for the stock Scikit-learn package. You can take advantage of the performance optimizations of Intel® Extension for Scikit-learn by adding just two lines of code before the usual Scikit-learn imports:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "ef6938df",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Intel(R) Extension for Scikit-learn* enabled (https://github.com/intel/scikit-learn-intelex)\n"
]
}
],
"source": [
"from sklearnex import patch_sklearn\n",
"\n",
"patch_sklearn()"
]
},
{
"cell_type": "markdown",
"id": "20c5ab48",
"metadata": {},
"source": [
"Intel® Extension for Scikit-learn patching affects the performance of specific Scikit-learn functionality. Refer to the [list of supported algorithms and parameters](https://intel.github.io/scikit-learn-intelex/latest/algorithms.html) for details. When unsupported parameters are used, the package falls back to the original Scikit-learn. If the patching does not cover your scenarios, [submit an issue on GitHub](https://github.com/intel/scikit-learn-intelex/issues)."
]
},
{
"cell_type": "markdown",
"id": "f80273e7",
"metadata": {},
"source": [
"Train the ElasticNet algorithm with Intel® Extension for Scikit-learn on the Airlines DepDelay dataset"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "a4dd1c7e",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Intel® Extension for Scikit-learn time: 0.28 s'"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from sklearn.linear_model import ElasticNet\n",
"\n",
"params = {\n",
"    \"alpha\": 0.3,\n",
"    \"fit_intercept\": False,\n",
"    \"l1_ratio\": 0.7,\n",
"    \"random_state\": 0,\n",
"    \"copy_X\": False,\n",
"}\n",
"start = timer()\n",
"model = ElasticNet(**params).fit(x_train, y_train)\n",
"train_patched = timer() - start\n",
"f\"Intel® Extension for Scikit-learn time: {train_patched:.2f} s\""
]
},
{
"cell_type": "markdown",
"id": "f10b51fc",
"metadata": {},
"source": [
"Predict on the test set and compute the MSE of the ElasticNet model trained with Intel® Extension for Scikit-learn"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "d4295a26",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Patched Scikit-learn MSE: 1.0109113399224974'"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"y_predict = model.predict(x_test)\n",
"mse_metric_opt = metrics.mean_squared_error(y_test, y_predict)\n",
"f\"Patched Scikit-learn MSE: {mse_metric_opt}\""
]
},
{
"cell_type": "markdown",
"id": "cbe6db0d",
"metadata": {},
"source": [
"### Train the same algorithm with original Scikit-learn\n",
"To disable the optimizations, call *unpatch_sklearn* and re-import the ElasticNet class"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "6f64ba97",
"metadata": {},
"outputs": [],
"source": [
"from sklearnex import unpatch_sklearn\n",
"\n",
"unpatch_sklearn()"
]
},
{
"cell_type": "markdown",
"id": "f242c6da",
"metadata": {},
"source": [
"Train the ElasticNet algorithm with the original Scikit-learn library on the Airlines DepDelay dataset"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "67243849",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Original Scikit-learn time: 3.96 s'"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from sklearn.linear_model import ElasticNet\n",
"\n",
"start = timer()\n",
"model = ElasticNet(**params).fit(x_train, y_train)\n",
"train_unpatched = timer() - start\n",
"f\"Original Scikit-learn time: {train_unpatched:.2f} s\""
]
},
{
"cell_type": "markdown",
"id": "c85a125c",
"metadata": {},
"source": [
"Predict on the test set and compute the MSE of the ElasticNet model trained with the original Scikit-learn"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "cd9e726c",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Original Scikit-learn MSE: 1.0109113399545733'"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"y_predict = model.predict(x_test)\n",
"mse_metric_original = metrics.mean_squared_error(y_test, y_predict)\n",
"f\"Original Scikit-learn MSE: {mse_metric_original}\""
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "a2edbb65",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<h3>Compare MSE metric of patched Scikit-learn and original</h3>MSE metric of patched Scikit-learn: 1.0109113399224974 <br>MSE metric of unpatched Scikit-learn: 1.0109113399545733 <br>Metrics ratio: 0.9999999999682703 <br><h3>With Scikit-learn-intelex patching you can:</h3><ul><li>Use your Scikit-learn code for training and prediction with minimal changes (a couple of lines of code);</li><li>Run training and prediction of Scikit-learn models faster;</li><li>Get similar quality;</li><li>Get a speedup of <strong>14.2</strong> times.</li></ul>"
],
"text/plain": [
"<IPython.core.display.HTML object>"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"HTML(\n",
"    f\"<h3>Compare MSE metric of patched Scikit-learn and original</h3>\"\n",
"    f\"MSE metric of patched Scikit-learn: {mse_metric_opt} <br>\"\n",
"    f\"MSE metric of unpatched Scikit-learn: {mse_metric_original} <br>\"\n",
"    f\"Metrics ratio: {mse_metric_opt/mse_metric_original} <br>\"\n",
"    f\"<h3>With Scikit-learn-intelex patching you can:</h3>\"\n",
"    f\"<ul>\"\n",
"    f\"<li>Use your Scikit-learn code for training and prediction with minimal changes (a couple of lines of code);</li>\"\n",
"    f\"<li>Run training and prediction of Scikit-learn models faster;</li>\"\n",
"    f\"<li>Get similar quality;</li>\"\n",
"    f\"<li>Get a speedup of <strong>{(train_unpatched/train_patched):.1f}</strong> times.</li>\"\n",
"    f\"</ul>\"\n",
")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.12"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
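The notebook above times each `fit` call with `timeit.default_timer` and reports the ratio of the two timings as the speedup. A minimal sketch of that pattern, using only the standard library, might look like the following. The helper names `time_call` and `speedup` are hypothetical and not part of the notebook or of sklearnex, and the numbers are the rounded values shown in the recorded outputs, not fresh measurements.

```python
from timeit import default_timer as timer


def time_call(fn, *args, **kwargs):
    """Run fn once and return (result, elapsed_seconds). Hypothetical helper."""
    start = timer()
    result = fn(*args, **kwargs)
    return result, timer() - start


def speedup(baseline_seconds, optimized_seconds):
    """Return how many times faster the optimized run was. Hypothetical helper."""
    if optimized_seconds <= 0:
        raise ValueError("optimized_seconds must be positive")
    return baseline_seconds / optimized_seconds


# Rounded values copied from the notebook's recorded outputs.
train_unpatched = 3.96  # seconds, original Scikit-learn
train_patched = 0.28    # seconds, patched Scikit-learn
mse_patched = 1.0109113399224974
mse_original = 1.0109113399545733

ratio = speedup(train_unpatched, train_patched)
mse_ratio = mse_patched / mse_original

# Stand-in workload (an assumption for illustration) to exercise time_call;
# in the notebook the timed call is ElasticNet(**params).fit(x_train, y_train).
total, elapsed = time_call(sum, range(1_000_000))

print(f"Speedup from rounded timings: {ratio:.1f}x")
print(f"MSE ratio (patched / original): {mse_ratio}")
```

The rounded timings give a speedup of about 14.1, while the notebook reports 14.2, presumably because it divides the unrounded timings. A single timed run is also sensitive to caching and system load, so repeating the measurement several times would give a more stable comparison.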