<!DOCTYPE html>
<html lang="en" data-content_root="./" >
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<meta property="og:title" content="Glossary of Common Terms and API Elements" />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://scikit-learn.org/stable/glossary.html" />
<meta property="og:site_name" content="scikit-learn" />
<meta property="og:description" content="This glossary hopes to definitively represent the tacit and explicit conventions applied in Scikit-learn and its API, while providing a reference for users and contributors. It aims to describe the..." />
<meta property="og:image" content="https://scikit-learn.org/stable/_static/scikit-learn-logo-small.png" />
<meta property="og:image:alt" content="scikit-learn" />
<meta name="description" content="This glossary hopes to definitively represent the tacit and explicit conventions applied in Scikit-learn and its API, while providing a reference for users and contributors. It aims to describe the..." />
<title>Glossary of Common Terms and API Elements — scikit-learn 1.6.1 documentation</title>
<script data-cfasync="false">
document.documentElement.dataset.mode = localStorage.getItem("mode") || "";
document.documentElement.dataset.theme = localStorage.getItem("theme") || "";
</script>
<!--
This gives us a CSS class that will be invisible only if JS is disabled
-->
<noscript>
<style>
.pst-js-only { display: none !important; }
</style>
</noscript>
<!-- Loaded before other Sphinx assets -->
<link href="_static/styles/theme.css?digest=8878045cc6db502f8baf" rel="stylesheet" />
<link href="_static/styles/pydata-sphinx-theme.css?digest=8878045cc6db502f8baf" rel="stylesheet" />
<link rel="stylesheet" type="text/css" href="_static/pygments.css?v=a746c00c" />
<link rel="stylesheet" type="text/css" href="_static/copybutton.css?v=76b2166b" />
<link rel="stylesheet" type="text/css" href="_static/plot_directive.css" />
<link rel="stylesheet" type="text/css" href="https://fonts.googleapis.com/css?family=Vibur" />
<link rel="stylesheet" type="text/css" href="_static/jupyterlite_sphinx.css?v=e3ca86de" />
<link rel="stylesheet" type="text/css" href="_static/sg_gallery.css?v=d2d258e8" />
<link rel="stylesheet" type="text/css" href="_static/sg_gallery-binder.css?v=f4aeca0c" />
<link rel="stylesheet" type="text/css" href="_static/sg_gallery-dataframe.css?v=2082cf3c" />
<link rel="stylesheet" type="text/css" href="_static/sg_gallery-rendered-html.css?v=1277b6f3" />
<link rel="stylesheet" type="text/css" href="_static/sphinx-design.min.css?v=95c83b7e" />
<link rel="stylesheet" type="text/css" href="_static/styles/colors.css?v=cc94ab7d" />
<link rel="stylesheet" type="text/css" href="_static/styles/custom.css?v=d67e4bb0" />
<!-- So that users can add custom icons -->
<script src="_static/scripts/fontawesome.js?digest=8878045cc6db502f8baf"></script>
<!-- Pre-loaded scripts that we'll load fully later -->
<link rel="preload" as="script" href="_static/scripts/bootstrap.js?digest=8878045cc6db502f8baf" />
<link rel="preload" as="script" href="_static/scripts/pydata-sphinx-theme.js?digest=8878045cc6db502f8baf" />
<script src="_static/documentation_options.js?v=d6a008b6"></script>
<script src="_static/doctools.js?v=9a2dae69"></script>
<script src="_static/sphinx_highlight.js?v=dc90522c"></script>
<script src="_static/clipboard.min.js?v=a7894cd8"></script>
<script src="_static/copybutton.js?v=97f0b27d"></script>
<script src="_static/jupyterlite_sphinx.js?v=d6bdf5f8"></script>
<script src="_static/design-tabs.js?v=f930bc37"></script>
<script data-domain="scikit-learn.org" defer="defer" src="https://views.scientific-python.org/js/script.js"></script>
<script>DOCUMENTATION_OPTIONS.pagename = 'glossary';</script>
<script>
DOCUMENTATION_OPTIONS.theme_version = '0.16.1';
DOCUMENTATION_OPTIONS.theme_switcher_json_url = 'https://scikit-learn.org/dev/_static/versions.json';
DOCUMENTATION_OPTIONS.theme_switcher_version_match = '1.6.1';
DOCUMENTATION_OPTIONS.show_version_warning_banner =
true;
</script>
<script src="_static/scripts/dropdown.js?v=e2048168"></script>
<script src="_static/scripts/version-switcher.js?v=a6dd8357"></script>
<script src="_static/scripts/sg_plotly_resize.js?v=eeb41cab"></script>
<link rel="icon" href="_static/favicon.ico"/>
<link rel="author" title="About these documents" href="about.html" />
<link rel="search" title="Search" href="search.html" />
<link rel="next" title="Frequently Asked Questions" href="faq.html" />
<link rel="prev" title="Older Versions" href="whats_new/older_versions.html" />
<meta name="docsearch:language" content="en"/>
<meta name="docsearch:version" content="1.6" />
</head>
<body data-bs-spy="scroll" data-bs-target=".bd-toc-nav" data-offset="180" data-bs-root-margin="0px 0px -60%" data-default-mode="">
<div id="pst-skip-link" class="skip-link d-print-none"><a href="#main-content">Skip to main content</a></div>
<div id="pst-scroll-pixel-helper"></div>
<button type="button" class="btn rounded-pill" id="pst-back-to-top">
<i class="fa-solid fa-arrow-up"></i>Back to top</button>
<dialog id="pst-search-dialog">
<form class="bd-search d-flex align-items-center"
action="search.html"
method="get">
<i class="fa-solid fa-magnifying-glass"></i>
<input type="search"
class="form-control"
name="q"
placeholder="Search the docs ..."
aria-label="Search the docs ..."
autocomplete="off"
autocorrect="off"
autocapitalize="off"
spellcheck="false"/>
<span class="search-button__kbd-shortcut"><kbd class="kbd-shortcut__modifier">Ctrl</kbd>+<kbd>K</kbd></span>
</form>
</dialog>
<div class="pst-async-banner-revealer d-none">
<aside id="bd-header-version-warning" class="d-none d-print-none" aria-label="Version warning"></aside>
</div>
<header class="bd-header navbar navbar-expand-lg bd-navbar d-print-none">
<div class="bd-header__inner bd-page-width">
<button class="pst-navbar-icon sidebar-toggle primary-toggle" aria-label="Site navigation">
<span class="fa-solid fa-bars"></span>
</button>
<div class=" navbar-header-items__start">
<div class="navbar-item">
<a class="navbar-brand logo" href="index.html">
<img src="_static/scikit-learn-logo-small.png" class="logo__image only-light" alt="scikit-learn homepage"/>
<img src="_static/scikit-learn-logo-small.png" class="logo__image only-dark pst-js-only" alt="scikit-learn homepage"/>
</a></div>
</div>
<div class=" navbar-header-items">
<div class="me-auto navbar-header-items__center">
<div class="navbar-item">
<nav>
<ul class="bd-navbar-elements navbar-nav">
<li class="nav-item ">
<a class="nav-link nav-internal" href="install.html">
Install
</a>
</li>
<li class="nav-item ">
<a class="nav-link nav-internal" href="user_guide.html">
User Guide
</a>
</li>
<li class="nav-item ">
<a class="nav-link nav-internal" href="api/index.html">
API
</a>
</li>
<li class="nav-item ">
<a class="nav-link nav-internal" href="auto_examples/index.html">
Examples
</a>
</li>
<li class="nav-item ">
<a class="nav-link nav-external" href="https://blog.scikit-learn.org/">
Community
</a>
</li>
<li class="nav-item dropdown">
<button class="btn dropdown-toggle nav-item" type="button"
data-bs-toggle="dropdown" aria-expanded="false"
aria-controls="pst-nav-more-links">
More
</button>
<ul id="pst-nav-more-links" class="dropdown-menu">
<li class=" ">
<a class="nav-link dropdown-item nav-internal" href="getting_started.html">
Getting Started
</a>
</li>
<li class=" ">
<a class="nav-link dropdown-item nav-internal" href="whats_new.html">
Release History
</a>
</li>
<li class=" current active">
<a class="nav-link dropdown-item nav-internal" href="#">
Glossary
</a>
</li>
<li class=" ">
<a class="nav-link dropdown-item nav-external" href="https://scikit-learn.org/dev/developers/index.html">
Development
</a>
</li>
<li class=" ">
<a class="nav-link dropdown-item nav-internal" href="faq.html">
FAQ
</a>
</li>
<li class=" ">
<a class="nav-link dropdown-item nav-internal" href="support.html">
Support
</a>
</li>
<li class=" ">
<a class="nav-link dropdown-item nav-internal" href="related_projects.html">
Related Projects
</a>
</li>
<li class=" ">
<a class="nav-link dropdown-item nav-internal" href="roadmap.html">
Roadmap
</a>
</li>
<li class=" ">
<a class="nav-link dropdown-item nav-internal" href="governance.html">
Governance
</a>
</li>
<li class=" ">
<a class="nav-link dropdown-item nav-internal" href="about.html">
About us
</a>
</li>
</ul>
</li>
</ul>
</nav></div>
</div>
<div class="navbar-header-items__end">
<div class="navbar-item navbar-persistent--container">
<button class="btn btn-sm pst-navbar-icon search-button search-button__button pst-js-only" title="Search" aria-label="Search" data-bs-placement="bottom" data-bs-toggle="tooltip">
<i class="fa-solid fa-magnifying-glass fa-lg"></i>
</button>
</div>
<div class="navbar-item">
<button class="btn btn-sm nav-link pst-navbar-icon theme-switch-button pst-js-only" aria-label="Color mode" data-bs-title="Color mode" data-bs-placement="bottom" data-bs-toggle="tooltip">
<i class="theme-switch fa-solid fa-sun fa-lg" data-mode="light" title="Light"></i>
<i class="theme-switch fa-solid fa-moon fa-lg" data-mode="dark" title="Dark"></i>
<i class="theme-switch fa-solid fa-circle-half-stroke fa-lg" data-mode="auto" title="System Settings"></i>
</button></div>
<div class="navbar-item"><ul class="navbar-icon-links"
aria-label="Icon Links">
<li class="nav-item">
<a href="https://github.com/scikit-learn/scikit-learn" title="GitHub" class="nav-link pst-navbar-icon" rel="noopener" target="_blank" data-bs-toggle="tooltip" data-bs-placement="bottom"><i class="fa-brands fa-square-github fa-lg" aria-hidden="true"></i>
<span class="sr-only">GitHub</span></a>
</li>
</ul></div>
<div class="navbar-item">
<div class="version-switcher__container dropdown pst-js-only">
<button id="pst-version-switcher-button-2"
type="button"
class="version-switcher__button btn btn-sm dropdown-toggle"
data-bs-toggle="dropdown"
aria-haspopup="listbox"
aria-controls="pst-version-switcher-list-2"
aria-label="Version switcher list"
>
Choose version <!-- this text may get changed later by javascript -->
<span class="caret"></span>
</button>
<div id="pst-version-switcher-list-2"
class="version-switcher__menu dropdown-menu list-group-flush py-0"
role="listbox" aria-labelledby="pst-version-switcher-button-2">
<!-- dropdown will be populated by javascript on page load -->
</div>
</div></div>
</div>
</div>
<div class="navbar-persistent--mobile">
<button class="btn btn-sm pst-navbar-icon search-button search-button__button pst-js-only" title="Search" aria-label="Search" data-bs-placement="bottom" data-bs-toggle="tooltip">
<i class="fa-solid fa-magnifying-glass fa-lg"></i>
</button>
</div>
<button class="pst-navbar-icon sidebar-toggle secondary-toggle" aria-label="On this page">
<span class="fa-solid fa-outdent"></span>
</button>
</div>
</header>
<div class="bd-container">
<div class="bd-container__inner bd-page-width">
<dialog id="pst-primary-sidebar-modal"></dialog>
<div id="pst-primary-sidebar" class="bd-sidebar-primary bd-sidebar hide-on-wide">
<div class="sidebar-header-items sidebar-primary__section">
<div class="sidebar-header-items__center">
<div class="navbar-item">
<nav>
<ul class="bd-navbar-elements navbar-nav">
<li class="nav-item ">
<a class="nav-link nav-internal" href="install.html">
Install
</a>
</li>
<li class="nav-item ">
<a class="nav-link nav-internal" href="user_guide.html">
User Guide
</a>
</li>
<li class="nav-item ">
<a class="nav-link nav-internal" href="api/index.html">
API
</a>
</li>
<li class="nav-item ">
<a class="nav-link nav-internal" href="auto_examples/index.html">
Examples
</a>
</li>
<li class="nav-item ">
<a class="nav-link nav-external" href="https://blog.scikit-learn.org/">
Community
</a>
</li>
<li class="nav-item ">
<a class="nav-link nav-internal" href="getting_started.html">
Getting Started
</a>
</li>
<li class="nav-item ">
<a class="nav-link nav-internal" href="whats_new.html">
Release History
</a>
</li>
<li class="nav-item current active">
<a class="nav-link nav-internal" href="#">
Glossary
</a>
</li>
<li class="nav-item ">
<a class="nav-link nav-external" href="https://scikit-learn.org/dev/developers/index.html">
Development
</a>
</li>
<li class="nav-item ">
<a class="nav-link nav-internal" href="faq.html">
FAQ
</a>
</li>
<li class="nav-item ">
<a class="nav-link nav-internal" href="support.html">
Support
</a>
</li>
<li class="nav-item ">
<a class="nav-link nav-internal" href="related_projects.html">
Related Projects
</a>
</li>
<li class="nav-item ">
<a class="nav-link nav-internal" href="roadmap.html">
Roadmap
</a>
</li>
<li class="nav-item ">
<a class="nav-link nav-internal" href="governance.html">
Governance
</a>
</li>
<li class="nav-item ">
<a class="nav-link nav-internal" href="about.html">
About us
</a>
</li>
</ul>
</nav></div>
</div>
<div class="sidebar-header-items__end">
<div class="navbar-item">
<button class="btn btn-sm nav-link pst-navbar-icon theme-switch-button pst-js-only" aria-label="Color mode" data-bs-title="Color mode" data-bs-placement="bottom" data-bs-toggle="tooltip">
<i class="theme-switch fa-solid fa-sun fa-lg" data-mode="light" title="Light"></i>
<i class="theme-switch fa-solid fa-moon fa-lg" data-mode="dark" title="Dark"></i>
<i class="theme-switch fa-solid fa-circle-half-stroke fa-lg" data-mode="auto" title="System Settings"></i>
</button></div>
<div class="navbar-item"><ul class="navbar-icon-links"
aria-label="Icon Links">
<li class="nav-item">
<a href="https://github.com/scikit-learn/scikit-learn" title="GitHub" class="nav-link pst-navbar-icon" rel="noopener" target="_blank" data-bs-toggle="tooltip" data-bs-placement="bottom"><i class="fa-brands fa-square-github fa-lg" aria-hidden="true"></i>
<span class="sr-only">GitHub</span></a>
</li>
</ul></div>
<div class="navbar-item">
<div class="version-switcher__container dropdown pst-js-only">
<button id="pst-version-switcher-button-3"
type="button"
class="version-switcher__button btn btn-sm dropdown-toggle"
data-bs-toggle="dropdown"
aria-haspopup="listbox"
aria-controls="pst-version-switcher-list-3"
aria-label="Version switcher list"
>
Choose version <!-- this text may get changed later by javascript -->
<span class="caret"></span>
</button>
<div id="pst-version-switcher-list-3"
class="version-switcher__menu dropdown-menu list-group-flush py-0"
role="listbox" aria-labelledby="pst-version-switcher-button-3">
<!-- dropdown will be populated by javascript on page load -->
</div>
</div></div>
</div>
</div>
<div class="sidebar-primary-items__end sidebar-primary__section">
</div>
</div>
<main id="main-content" class="bd-main" role="main">
<div class="bd-content">
<div class="bd-article-container">
<div class="bd-header-article d-print-none">
<div class="header-article-items header-article__inner">
<div class="header-article-items__start">
<div class="header-article-item">
<nav aria-label="Breadcrumb" class="d-print-none">
<ul class="bd-breadcrumbs">
<li class="breadcrumb-item breadcrumb-home">
<a href="index.html" class="nav-link" aria-label="Home">
<i class="fa-solid fa-home"></i>
</a>
</li>
<li class="breadcrumb-item active" aria-current="page"><span class="ellipsis">Glossary of Common Terms and API Elements</span></li>
</ul>
</nav>
</div>
</div>
</div>
</div>
<div id="searchbox"></div>
<article class="bd-article">
<section id="glossary-of-common-terms-and-api-elements">
<span id="glossary"></span><h1>Glossary of Common Terms and API Elements<a class="headerlink" href="#glossary-of-common-terms-and-api-elements" title="Link to this heading">#</a></h1>
<p>This glossary hopes to definitively represent the tacit and explicit
conventions applied in Scikit-learn and its API, while providing a reference
for users and contributors. It aims to describe the concepts and either detail
their corresponding API or link to other relevant parts of the documentation
which do so. By linking to glossary entries from the API Reference and User
Guide, we may minimize redundancy and inconsistency.</p>
<p>We begin by listing general concepts (and any that didn’t fit elsewhere), but
more specific sets of related terms are listed below:
<a class="reference internal" href="#glossary-estimator-types"><span class="std std-ref">Class APIs and Estimator Types</span></a>, <a class="reference internal" href="#glossary-target-types"><span class="std std-ref">Target Types</span></a>,
<a class="reference internal" href="#glossary-methods"><span class="std std-ref">Methods</span></a>, <a class="reference internal" href="#glossary-parameters"><span class="std std-ref">Parameters</span></a>,
<a class="reference internal" href="#glossary-attributes"><span class="std std-ref">Attributes</span></a>, <a class="reference internal" href="#glossary-sample-props"><span class="std std-ref">Data and sample properties</span></a>.</p>
<section id="general-concepts">
<h2>General Concepts<a class="headerlink" href="#general-concepts" title="Link to this heading">#</a></h2>
<dl class="glossary">
<dt id="term-1d">1d<a class="headerlink" href="#term-1d" title="Link to this term">#</a></dt><dt id="term-1d-array">1d array<a class="headerlink" href="#term-1d-array" title="Link to this term">#</a></dt><dd><p>One-dimensional array. A NumPy array whose <code class="docutils literal notranslate"><span class="pre">.shape</span></code> has length 1.
A vector.</p>
</dd>
<dt id="term-2d">2d<a class="headerlink" href="#term-2d" title="Link to this term">#</a></dt><dt id="term-2d-array">2d array<a class="headerlink" href="#term-2d-array" title="Link to this term">#</a></dt><dd><p>Two-dimensional array. A NumPy array whose <code class="docutils literal notranslate"><span class="pre">.shape</span></code> has length 2.
Often represents a matrix.</p>
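<p>As a minimal sketch of the 1d and 2d definitions, the snippet below builds one array of each kind and inspects its shape. The variable names are illustrative only and do not come from scikit-learn.</p>

```python
import numpy as np

# A 1d array: its shape is a tuple of length 1.
vector = np.array([1.0, 2.0, 3.0])

# A 2d array: its shape is a tuple of length 2 (rows, columns).
matrix = np.array([[1.0, 2.0, 3.0],
                   [4.0, 5.0, 6.0]])

# Reshaping the 1d array into a single-column 2d array.
column = vector.reshape(-1, 1)

print(vector.shape, vector.ndim)  # (3,) 1
print(matrix.shape, matrix.ndim)  # (2, 3) 2
print(column.shape, column.ndim)  # (3, 1) 2
```

<p>The last case is the usual way to turn a single feature stored as a vector into a one-column matrix.</p>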
</dd>
<dt id="term-API">API<a class="headerlink" href="#term-API" title="Link to this term">#</a></dt><dd><p>Refers to both the <em>specific</em> interfaces for estimators implemented in
Scikit-learn and the <em>generalized</em> conventions across types of
estimators as described in this glossary and <a class="reference internal" href="developers/develop.html#api-overview"><span class="std std-ref">overviewed in the
contributor documentation</span></a>.</p>
<p>The specific interfaces that constitute Scikit-learn’s public API are
largely documented in <a class="reference internal" href="api/index.html#api-ref"><span class="std std-ref">API Reference</span></a>. However, we less formally consider
anything as public API if none of the identifiers required to access it
begins with <code class="docutils literal notranslate"><span class="pre">_</span></code>. We generally try to maintain <a class="reference internal" href="#term-backwards-compatibility"><span class="xref std std-term">backwards
compatibility</span></a> for all objects in the public API.</p>
<p>Private API, including functions, modules and methods beginning with <code class="docutils literal notranslate"><span class="pre">_</span></code>,
is not assured to be stable.</p>
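<p>The informal rule above can be sketched as a small check on a dotted path. The helper <code class="docutils literal notranslate"><span class="pre">is_public</span></code> is hypothetical and not part of scikit-learn, and the dotted paths are used purely as example strings.</p>

```python
def is_public(dotted_path):
    """Return True if no identifier in the dotted path begins with "_".

    Hypothetical helper for illustration; it only inspects the string.
    """
    return not any(part.startswith("_") for part in dotted_path.split("."))


print(is_public("sklearn.linear_model.LogisticRegression"))      # True
print(is_public("sklearn.utils._testing.assert_allclose"))       # False
print(is_public("sklearn.base.BaseEstimator._get_param_names"))  # False
```

<p>Because the sketch applies the rule literally, it also reports special methods such as <code class="docutils literal notranslate"><span class="pre">__init__</span></code> as private.</p>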
</dd>
<dt id="term-array-like">array-like<a class="headerlink" href="#term-array-like" title="Link to this term">#</a></dt><dd><p>The most common data format for <em>input</em> to Scikit-learn estimators and
functions, array-like is any type of object for which
<a class="reference external" href="https://numpy.org/doc/stable/reference/generated/numpy.asarray.html#numpy.asarray" title="(in NumPy v2.2)"><code class="xref py py-func docutils literal notranslate"><span class="pre">numpy.asarray</span></code></a> will produce an array of appropriate shape
(usually 1 or 2-dimensional) of appropriate dtype (usually numeric).</p>
<p>This includes:</p>
<ul class="simple">
<li><p>a numpy array</p></li>
<li><p>a list of numbers</p></li>
<li><p>a list of length-k lists of numbers for some fixed length k</p></li>
<li><p>a <a class="reference external" href="https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html#pandas.DataFrame" title="(in pandas v2.2.3)"><code class="xref py py-class docutils literal notranslate"><span class="pre">pandas.DataFrame</span></code></a> with all columns numeric</p></li>
<li><p>a numeric <a class="reference external" href="https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.html#pandas.Series" title="(in pandas v2.2.3)"><code class="xref py py-class docutils literal notranslate"><span class="pre">pandas.Series</span></code></a></p></li>
</ul>
<p>It excludes:</p>
<ul class="simple">
<li><p>a <a class="reference internal" href="#term-sparse-matrix"><span class="xref std std-term">sparse matrix</span></a></p></li>
<li><p>a sparse array</p></li>
<li><p>an iterator</p></li>
<li><p>a generator</p></li>
</ul>
<p>Note that <em>output</em> from scikit-learn estimators and functions (e.g.
predictions) should generally be arrays or sparse matrices, or lists
thereof (as in multi-output <a class="reference internal" href="modules/generated/sklearn.tree.DecisionTreeClassifier.html#sklearn.tree.DecisionTreeClassifier" title="sklearn.tree.DecisionTreeClassifier"><code class="xref py py-class docutils literal notranslate"><span class="pre">tree.DecisionTreeClassifier</span></code></a>’s
<code class="docutils literal notranslate"><span class="pre">predict_proba</span></code>). An estimator where <code class="docutils literal notranslate"><span class="pre">predict()</span></code> returns a list or
a <code class="docutils literal notranslate"><span class="pre">pandas.Series</span></code> is not valid.</p>
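<p>A short sketch of the distinction, using only <code class="docutils literal notranslate"><span class="pre">numpy.asarray</span></code>: lists convert to numeric arrays of the expected shape, whereas a generator is wrapped as a single Python object rather than being iterated. The variable names are illustrative.</p>

```python
import numpy as np

# A list of numbers becomes a 1d numeric array.
a = np.asarray([1, 2, 3])

# A list of length-k lists (k = 2 here) becomes a 2d numeric array.
b = np.asarray([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

# A generator is not array-like: NumPy does not iterate over it and
# instead wraps the generator object itself in a 0-d object array.
c = np.asarray(x for x in range(3))

print(a.shape, a.dtype.kind)  # (3,) i
print(b.shape, b.dtype.kind)  # (3, 2) f
print(c.shape, c.dtype.kind)  # () O
```

<p>The 0-d object array in the last case has neither an appropriate shape nor an appropriate dtype, so a generator does not meet the definition above.</p>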
</dd>
<dt id="term-attribute">attribute<a class="headerlink" href="#term-attribute" title="Link to this term">#</a></dt><dt id="term-attributes">attributes<a class="headerlink" href="#term-attributes" title="Link to this term">#</a></dt><dd><p>We mostly use attribute to refer to how model information is stored on
an estimator during fitting. Any public attribute stored on an
estimator instance is required to begin with an alphabetic character
and end in a single underscore if it is set in <a class="reference internal" href="#term-fit"><span class="xref std std-term">fit</span></a> or
<a class="reference internal" href="#term-partial_fit"><span class="xref std std-term">partial_fit</span></a>. These are what is documented under an estimator’s
<em>Attributes</em> documentation. The information stored in attributes is
usually either: sufficient statistics used for prediction or
transformation; <a class="reference internal" href="#term-transductive"><span class="xref std std-term">transductive</span></a> outputs such as <a class="reference internal" href="#term-labels_"><span class="xref std std-term">labels_</span></a> or
<a class="reference internal" href="#term-embedding_"><span class="xref std std-term">embedding_</span></a>; or diagnostic data, such as
<a class="reference internal" href="#term-feature_importances_"><span class="xref std std-term">feature_importances_</span></a>.
Common attributes are listed <a class="reference internal" href="#glossary-attributes"><span class="std std-ref">below</span></a>.</p>
<p>A public attribute may have the same name as a constructor
<a class="reference internal" href="#term-parameter"><span class="xref std std-term">parameter</span></a>, with a <code class="docutils literal notranslate"><span class="pre">_</span></code> appended. This is used to store a
validated or estimated version of the user’s input. For example,
<a class="reference internal" href="modules/generated/sklearn.decomposition.PCA.html#sklearn.decomposition.PCA" title="sklearn.decomposition.PCA"><code class="xref py py-class docutils literal notranslate"><span class="pre">decomposition.PCA</span></code></a> is constructed with an <code class="docutils literal notranslate"><span class="pre">n_components</span></code>
parameter. From this, together with other parameters and the data,
PCA estimates the attribute <code class="docutils literal notranslate"><span class="pre">n_components_</span></code>.</p>
<p>Further private attributes used in prediction/transformation/etc. may
also be set when fitting. These begin with a single underscore and are
not assured to be stable for public access.</p>
<p>A public attribute on an estimator instance that does not end in an
underscore should be the stored, unmodified value of an <code class="docutils literal notranslate"><span class="pre">__init__</span></code>
<a class="reference internal" href="#term-parameter"><span class="xref std std-term">parameter</span></a> of the same name. Because of this equivalence, these
are documented under an estimator’s <em>Parameters</em> documentation.</p>
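<p>The naming conventions in this entry can be sketched with a toy class. <code class="docutils literal notranslate"><span class="pre">ToyCenterer</span></code> is hypothetical and written in plain Python; it is not a scikit-learn estimator and skips input validation.</p>

```python
class ToyCenterer:
    """Hypothetical estimator used only to illustrate attribute naming."""

    def __init__(self, with_mean=True):
        # Constructor parameter: stored unmodified under the same name.
        self.with_mean = with_mean

    def fit(self, X):
        # Private attribute: leading underscore, no stability guarantee.
        self._n_seen = len(X)
        # Public fitted attribute: trailing underscore, set during fit.
        n_features = len(X[0])
        if self.with_mean:
            self.mean_ = [
                sum(row[j] for row in X) / len(X) for j in range(n_features)
            ]
        else:
            self.mean_ = [0.0] * n_features
        return self

    def transform(self, X):
        # Uses the statistics stored in fit.
        return [[value - m for value, m in zip(row, self.mean_)] for row in X]


est = ToyCenterer(with_mean=True)
print(hasattr(est, "mean_"))         # False: nothing has been fitted yet
est.fit([[1.0, 10.0], [3.0, 30.0]])
print(est.with_mean)                 # True
print(est.mean_)                     # [2.0, 20.0]
print(est.transform([[2.0, 25.0]]))  # [[0.0, 5.0]]
```

<p>The absence of <code class="docutils literal notranslate"><span class="pre">mean_</span></code> before fitting is what lets callers tell a fitted instance from an unfitted one.</p>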
</dd>
<dt id="term-backwards-compatibility">backwards compatibility<a class="headerlink" href="#term-backwards-compatibility" title="Link to this term">#</a></dt><dd><p>We generally try to maintain backward compatibility (i.e. interfaces
and behaviors may be extended but not changed or removed) from release
to release but this comes with some exceptions:</p>
<dl class="simple">
<dt>Public API only</dt><dd><p>The behavior of objects accessed through private identifiers
(those beginning <code class="docutils literal notranslate"><span class="pre">_</span></code>) may be changed arbitrarily between
versions.</p>
</dd>
<dt>As documented</dt><dd><p>We will generally assume that the users have adhered to the
documented parameter types and ranges. If the documentation asks
for a list and the user gives a tuple, we do not assure consistent
behavior from version to version.</p>
</dd>
<dt>Deprecation</dt><dd><p>Behaviors may change following a <a class="reference internal" href="#term-deprecation"><span class="xref std std-term">deprecation</span></a> period
(usually two releases long). Warnings are issued using Python’s
<a class="reference external" href="https://docs.python.org/3/library/warnings.html#module-warnings" title="(in Python v3.13)"><code class="xref py py-mod docutils literal notranslate"><span class="pre">warnings</span></code></a> module.</p>
</dd>
<dt>Keyword arguments</dt><dd><p>We may sometimes assume that all optional parameters (other than X
and y to <a class="reference internal" href="#term-fit"><span class="xref std std-term">fit</span></a> and similar methods) are passed as keyword
arguments only, so their positional order may change between versions.</p>
</dd>
<dt>Bug fixes and enhancements</dt><dd><p>Bug fixes and – less often – enhancements may change the behavior
of estimators, including the predictions of an estimator trained on
the same data and <a class="reference internal" href="#term-random_state"><span class="xref std std-term">random_state</span></a>. When this happens, we
attempt to note it clearly in the changelog.</p>
</dd>
<dt>Serialization</dt><dd><p>We make no assurances that pickling an estimator in one version
will allow it to be unpickled to an equivalent model in the
subsequent version. (For estimators in the sklearn package, we
issue a warning when this unpickling is attempted, even if it may
happen to work.) See <a class="reference internal" href="model_persistence.html#persistence-limitations"><span class="std std-ref">Security &amp; Maintainability Limitations</span></a>.</p>
</dd>
<dt><a class="reference internal" href="modules/generated/sklearn.utils.estimator_checks.check_estimator.html#sklearn.utils.estimator_checks.check_estimator" title="sklearn.utils.estimator_checks.check_estimator"><code class="xref py py-func docutils literal notranslate"><span class="pre">utils.estimator_checks.check_estimator</span></code></a></dt><dd><p>We provide limited backwards compatibility assurances for the
estimator checks: we may add extra requirements on estimators
tested with this function, usually when these were informally
assumed but not formally tested.</p>
</dd>
</dl>
<p>Despite this informal contract with our users, the software is provided
as is, as stated in the license. When a release inadvertently
introduces changes that are not backward compatible, these are known
as software regressions.</p>
</dd>
<dt id="term-callable">callable<a class="headerlink" href="#term-callable" title="Link to this term">#</a></dt><dd><p>A function, class or an object which implements the <code class="docutils literal notranslate"><span class="pre">__call__</span></code>
method; anything for which <a class="reference external" href="https://docs.python.org/3/library/functions.html#callable">callable()</a> returns True.</p>
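<p>As a minimal sketch of the definition above (the <code class="docutils literal notranslate"><span class="pre">Scaler</span></code> class and the <code class="docutils literal notranslate"><span class="pre">double</span></code> function are hypothetical names introduced only for illustration), the built-in can be applied to a function, a class, an instance that defines <code class="docutils literal notranslate"><span class="pre">__call__</span></code>, and a plain integer:</p>

```python
class Scaler:
    """Hypothetical class whose instances are callable via __call__."""

    def __init__(self, factor):
        self.factor = factor

    def __call__(self, x):
        return self.factor * x


def double(x):
    return 2 * x


triple = Scaler(3)

# Functions, classes, and instances defining __call__ are callable;
# a plain integer is not.
results = [callable(double), callable(Scaler), callable(triple), callable(3)]
print(results)  # [True, True, True, False]
```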
</dd>
<dt id="term-categorical-feature">categorical feature<a class="headerlink" href="#term-categorical-feature" title="Link to this term">#</a></dt><dd><p>A categorical or nominal <a class="reference internal" href="#term-feature"><span class="xref std std-term">feature</span></a> is one that has a
finite set of discrete values across the population of data.
These are commonly represented as columns of integers or
strings. Strings will be rejected by most scikit-learn
estimators, and integers will be treated as ordinal or
count-valued. For use with most estimators, categorical
variables should be one-hot encoded. Notable exceptions include
tree-based models such as random forests and gradient boosting
models that often work better and faster with integer-coded
categorical variables.
<a class="reference internal" href="modules/generated/sklearn.preprocessing.OrdinalEncoder.html#sklearn.preprocessing.OrdinalEncoder" title="sklearn.preprocessing.OrdinalEncoder"><code class="xref py py-class docutils literal notranslate"><span class="pre">OrdinalEncoder</span></code></a> helps encode
string-valued categorical features as ordinal integers, and
<a class="reference internal" href="modules/generated/sklearn.preprocessing.OneHotEncoder.html#sklearn.preprocessing.OneHotEncoder" title="sklearn.preprocessing.OneHotEncoder"><code class="xref py py-class docutils literal notranslate"><span class="pre">OneHotEncoder</span></code></a> can be used to
one-hot encode categorical features.
See also <a class="reference internal" href="modules/preprocessing.html#preprocessing-categorical-features"><span class="std std-ref">Encoding categorical features</span></a> and the
<a class="reference external" href="https://github.com/scikit-learn-contrib/category_encoders">categorical-encoding</a>
package for tools related to encoding categorical features.</p>
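<p>The two encoders mentioned above can be compared on a small example. This is only a sketch with made-up colour values; it assumes default encoder settings, under which categories are sorted alphabetically and the one-hot output is sparse:</p>

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder

# A single string-valued categorical feature (illustrative data).
X = np.array([["red"], ["green"], ["blue"], ["green"]], dtype=object)

# Ordinal encoding: one integer code per category, stored as floats.
# Sorted categories are blue, green, red, so they map to 0, 1, 2.
ordinal = OrdinalEncoder().fit_transform(X)

# One-hot encoding: one binary column per category.
one_hot = OneHotEncoder().fit_transform(X).toarray()

print(ordinal.ravel())
print(one_hot.shape)
```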
</dd>
<dt id="term-clone">clone<a class="headerlink" href="#term-clone" title="Link to this term">#</a></dt><dt id="term-cloned">cloned<a class="headerlink" href="#term-cloned" title="Link to this term">#</a></dt><dd><p>To copy an <a class="reference internal" href="#term-estimator-instance"><span class="xref std std-term">estimator instance</span></a> and create a new one with
identical <a class="reference internal" href="#term-parameters"><span class="xref std std-term">parameters</span></a>, but without any fitted
<a class="reference internal" href="#term-attributes"><span class="xref std std-term">attributes</span></a>, using <a class="reference internal" href="modules/generated/sklearn.base.clone.html#sklearn.base.clone" title="sklearn.base.clone"><code class="xref py py-func docutils literal notranslate"><span class="pre">clone</span></code></a>.</p>
<p>When <code class="docutils literal notranslate"><span class="pre">fit</span></code> is called, a <a class="reference internal" href="#term-meta-estimator"><span class="xref std std-term">meta-estimator</span></a> usually clones
a wrapped estimator instance before fitting the cloned instance.
(Exceptions, for legacy reasons, include
<a class="reference internal" href="modules/generated/sklearn.pipeline.Pipeline.html#sklearn.pipeline.Pipeline" title="sklearn.pipeline.Pipeline"><code class="xref py py-class docutils literal notranslate"><span class="pre">Pipeline</span></code></a> and
<a class="reference internal" href="modules/generated/sklearn.pipeline.FeatureUnion.html#sklearn.pipeline.FeatureUnion" title="sklearn.pipeline.FeatureUnion"><code class="xref py py-class docutils literal notranslate"><span class="pre">FeatureUnion</span></code></a>.)</p>
<p>If the estimator’s <code class="docutils literal notranslate"><span class="pre">random_state</span></code> parameter is an integer (or if the
estimator doesn’t have a <code class="docutils literal notranslate"><span class="pre">random_state</span></code> parameter), an <em>exact clone</em>
is returned: the clone and the original estimator will give the exact
same results. Otherwise, a <em>statistical clone</em> is returned: the clone
might yield different results from the original estimator. More
details can be found in <a class="reference internal" href="common_pitfalls.html#randomness"><span class="std std-ref">Controlling randomness</span></a>.</p>
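<p>A short sketch of cloning, assuming a <code class="docutils literal notranslate"><span class="pre">LogisticRegression</span></code> fitted on toy data chosen only for illustration: the clone keeps the constructor parameters but none of the fitted attributes.</p>

```python
from sklearn.base import clone
from sklearn.linear_model import LogisticRegression

# Fit an estimator on a tiny illustrative dataset.
est = LogisticRegression(C=0.5).fit([[0.0], [1.0], [2.0], [3.0]], [0, 0, 1, 1])

# The clone has identical parameters but is unfitted.
copy = clone(est)
same_params = copy.get_params() == est.get_params()
original_is_fitted = hasattr(est, "coef_")
clone_is_fitted = hasattr(copy, "coef_")

print(same_params, original_is_fitted, clone_is_fitted)
```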
</dd>
<dt id="term-common-tests">common tests<a class="headerlink" href="#term-common-tests" title="Link to this term">#</a></dt><dd><p>This refers to the tests run on almost every estimator class in
Scikit-learn to check they comply with basic API conventions. They are
available for external use through
<a class="reference internal" href="modules/generated/sklearn.utils.estimator_checks.check_estimator.html#sklearn.utils.estimator_checks.check_estimator" title="sklearn.utils.estimator_checks.check_estimator"><code class="xref py py-func docutils literal notranslate"><span class="pre">utils.estimator_checks.check_estimator</span></code></a> or
<a class="reference internal" href="modules/generated/sklearn.utils.estimator_checks.parametrize_with_checks.html#sklearn.utils.estimator_checks.parametrize_with_checks" title="sklearn.utils.estimator_checks.parametrize_with_checks"><code class="xref py py-func docutils literal notranslate"><span class="pre">utils.estimator_checks.parametrize_with_checks</span></code></a>, with most of the
implementation in <code class="docutils literal notranslate"><span class="pre">sklearn/utils/estimator_checks.py</span></code>.</p>
<p>Note: Some exceptions to the common testing regime are currently
hard-coded into the library, but we hope to replace this by marking
exceptional behaviours on the estimator using semantic <a class="reference internal" href="#term-estimator-tags"><span class="xref std std-term">estimator
tags</span></a>.</p>
</dd>
<dt id="term-cross-fitting">cross-fitting<a class="headerlink" href="#term-cross-fitting" title="Link to this term">#</a></dt><dt id="term-0">cross fitting<a class="headerlink" href="#term-0" title="Link to this term">#</a></dt><dd><p>A resampling method that iteratively partitions data into mutually
exclusive subsets to fit two stages. During the first stage, the
mutually exclusive subsets enable predictions or transformations to be
computed on data not seen during training. The computed data is then
used in the second stage. The objective is to avoid having any
overfitting in the first stage introduce bias into the input data
distribution of the second stage.
For examples of its use, see: <a class="reference internal" href="modules/generated/sklearn.preprocessing.TargetEncoder.html#sklearn.preprocessing.TargetEncoder" title="sklearn.preprocessing.TargetEncoder"><code class="xref py py-class docutils literal notranslate"><span class="pre">TargetEncoder</span></code></a>,
<a class="reference internal" href="modules/generated/sklearn.ensemble.StackingClassifier.html#sklearn.ensemble.StackingClassifier" title="sklearn.ensemble.StackingClassifier"><code class="xref py py-class docutils literal notranslate"><span class="pre">StackingClassifier</span></code></a>,
<a class="reference internal" href="modules/generated/sklearn.ensemble.StackingRegressor.html#sklearn.ensemble.StackingRegressor" title="sklearn.ensemble.StackingRegressor"><code class="xref py py-class docutils literal notranslate"><span class="pre">StackingRegressor</span></code></a> and
<a class="reference internal" href="modules/generated/sklearn.calibration.CalibratedClassifierCV.html#sklearn.calibration.CalibratedClassifierCV" title="sklearn.calibration.CalibratedClassifierCV"><code class="xref py py-class docutils literal notranslate"><span class="pre">CalibratedClassifierCV</span></code></a>.</p>
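<p>The two-stage idea can be sketched by hand with <code class="docutils literal notranslate"><span class="pre">cross_val_predict</span></code>. This is an illustration under assumed choices (synthetic data, <code class="docutils literal notranslate"><span class="pre">Ridge</span></code> as the first stage, <code class="docutils literal notranslate"><span class="pre">LinearRegression</span></code> as the second) and is not necessarily how the estimators listed above implement cross-fitting internally:</p>

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import cross_val_predict

X, y = make_regression(n_samples=100, n_features=5, noise=1.0, random_state=0)

# First stage: each prediction comes from a model that was fitted on the
# other folds, so no sample is predicted by a model that saw it in training.
out_of_fold = cross_val_predict(Ridge(), X, y, cv=5)

# Second stage: fit on the out-of-fold predictions.
second_stage = LinearRegression().fit(out_of_fold.reshape(-1, 1), y)

print(out_of_fold.shape, second_stage.coef_.shape)
```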
</dd>
<dt id="term-cross-validation">cross-validation<a class="headerlink" href="#term-cross-validation" title="Link to this term">#</a></dt><dt id="term-1">cross validation<a class="headerlink" href="#term-1" title="Link to this term">#</a></dt><dd><p>A resampling method that iteratively partitions data into mutually
exclusive ‘train’ and ‘test’ subsets so model performance can be
evaluated on unseen data. This conserves data as it avoids the need to hold
out a ‘validation’ dataset and accounts for variability as multiple
rounds of cross validation are generally performed.
See <a class="reference internal" href="modules/cross_validation.html#cross-validation"><span class="std std-ref">User Guide</span></a> for more details.</p>
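<p>A minimal sketch of 5-fold cross-validation, assuming synthetic data and a <code class="docutils literal notranslate"><span class="pre">LogisticRegression</span></code> model chosen only for illustration. Each of the five scores is computed on a fold that was held out during fitting:</p>

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=200, random_state=0)

# One accuracy score per held-out fold.
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)

# The mean summarizes performance; the spread reflects fold variability.
print(scores.mean(), scores.std())
```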
</dd>
<dt id="term-deprecation">deprecation<a class="headerlink" href="#term-deprecation" title="Link to this term">#</a></dt><dd><p>We use deprecation to slowly violate our <a class="reference internal" href="#term-backwards-compatibility"><span class="xref std std-term">backwards
compatibility</span></a> assurances, usually to:</p>
<ul class="simple">
<li><p>change the default value of a parameter; or</p></li>
<li><p>remove a parameter, attribute, method, class, etc.</p></li>
</ul>
<p>We will ordinarily issue a warning when a deprecated element is used,
although there may be limitations to this. For instance, we will raise
a warning when someone sets a parameter that has been deprecated, but
may not when they access that parameter’s attribute on the estimator
instance.</p>
<p>See the <a class="reference internal" href="developers/contributing.html#contributing-deprecation"><span class="std std-ref">Contributors’ Guide</span></a>.</p>
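<p>The warning behaviour described above can be sketched with the standard <code class="docutils literal notranslate"><span class="pre">warnings</span></code> module. The function <code class="docutils literal notranslate"><span class="pre">fit_model</span></code>, its parameters, and the choice of <code class="docutils literal notranslate"><span class="pre">FutureWarning</span></code> are assumptions made for illustration, not part of the library:</p>

```python
import warnings


def fit_model(n_iter=None, max_iter=100):
    """Hypothetical function where ``n_iter`` is deprecated for ``max_iter``."""
    if n_iter is not None:
        # Warn only when the deprecated parameter is actually set.
        warnings.warn(
            "'n_iter' is deprecated; use 'max_iter' instead.", FutureWarning
        )
        max_iter = n_iter
    return max_iter


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = fit_model(n_iter=10)

print(result, [w.category.__name__ for w in caught])
```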
</dd>
<dt id="term-dimensionality">dimensionality<a class="headerlink" href="#term-dimensionality" title="Link to this term">#</a></dt><dd><p>May be used to refer to the number of <a class="reference internal" href="#term-features"><span class="xref std std-term">features</span></a> (i.e.
<a class="reference internal" href="#term-n_features"><span class="xref std std-term">n_features</span></a>), or columns in a 2d feature matrix.
Dimensions are, however, also used to refer to the length of a NumPy
array’s shape, distinguishing a 1d array from a 2d matrix.</p>
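<p>The two senses can be told apart on a small NumPy example (array sizes are arbitrary and chosen only for illustration):</p>

```python
import numpy as np

X = np.zeros((10, 3))  # 2d feature matrix: 10 samples, 3 features
y = np.zeros(10)       # 1d target array

# Dimensionality in the sense of the number of features.
n_features = X.shape[1]

# Dimensions in the sense of the length of the array's shape.
print(n_features, X.ndim, y.ndim)  # 3 2 1
```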
</dd>
<dt id="term-docstring">docstring<a class="headerlink" href="#term-docstring" title="Link to this term">#</a></dt><dd><p>The embedded documentation for a module, class, function, etc., usually
in code as a string at the beginning of the object’s definition, and
accessible as the object’s <code class="docutils literal notranslate"><span class="pre">__doc__</span></code> attribute.</p>
<p>We try to adhere to <a class="reference external" href="https://www.python.org/dev/peps/pep-0257/">PEP257</a>, and follow <a class="reference external" href="https://numpydoc.readthedocs.io/en/latest/format.html">NumpyDoc
conventions</a>.</p>
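<p>For illustration, a hypothetical function <code class="docutils literal notranslate"><span class="pre">add</span></code> with a docstring laid out in the NumpyDoc section style, read back through <code class="docutils literal notranslate"><span class="pre">__doc__</span></code>:</p>

```python
def add(a, b):
    """Return the sum of two numbers.

    Parameters
    ----------
    a : float
        First operand.
    b : float
        Second operand.

    Returns
    -------
    float
        The sum ``a + b``.
    """
    return a + b


# The docstring is available at runtime as the __doc__ attribute.
summary = add.__doc__.splitlines()[0]
print(summary)  # Return the sum of two numbers.
```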
</dd>
<dt id="term-double-underscore">double underscore<a class="headerlink" href="#term-double-underscore" title="Link to this term">#</a></dt><dt id="term-double-underscore-notation">double underscore notation<a class="headerlink" href="#term-double-underscore-notation" title="Link to this term">#</a></dt><dd><p>When specifying parameter names for nested estimators, <code class="docutils literal notranslate"><span class="pre">__</span></code> may be
used to separate between parent and child in some contexts. The most
common use is when setting parameters through a meta-estimator with
<a class="reference internal" href="#term-set_params"><span class="xref std std-term">set_params</span></a> and hence in specifying a search grid in
<a class="reference internal" href="modules/grid_search.html#grid-search"><span class="std std-ref">parameter search</span></a>. See <a class="reference internal" href="#term-parameter"><span class="xref std std-term">parameter</span></a>.
It is also used in <a class="reference internal" href="modules/generated/sklearn.pipeline.Pipeline.html#sklearn.pipeline.Pipeline.fit" title="sklearn.pipeline.Pipeline.fit"><code class="xref py py-meth docutils literal notranslate"><span class="pre">pipeline.Pipeline.fit</span></code></a> for passing
<a class="reference internal" href="#term-sample-properties"><span class="xref std std-term">sample properties</span></a> to the <code class="docutils literal notranslate"><span class="pre">fit</span></code> methods of estimators in
the pipeline.</p>
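<p>A brief sketch of the notation, assuming a two-step pipeline whose step names <code class="docutils literal notranslate"><span class="pre">scale</span></code> and <code class="docutils literal notranslate"><span class="pre">clf</span></code> are arbitrary labels chosen for this example:</p>

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

pipe = Pipeline([("scale", StandardScaler()), ("clf", LogisticRegression())])

# "<step name>__<parameter>" addresses a parameter of a nested estimator.
pipe.set_params(clf__C=0.1)

# The same naming is used for the keys of a parameter search grid.
param_grid = {"clf__C": [0.1, 1.0, 10.0]}

print(pipe.named_steps["clf"].C)  # 0.1
```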
</dd>
<dt id="term-dtype">dtype<a class="headerlink" href="#term-dtype" title="Link to this term">#</a></dt><dt id="term-data-type">data type<a class="headerlink" href="#term-data-type" title="Link to this term">#</a></dt><dd><p>NumPy arrays assume a homogeneous data type throughout, available in
the <code class="docutils literal notranslate"><span class="pre">.dtype</span></code> attribute of an array (or sparse matrix). We generally
assume simple data types for scikit-learn data: float or integer.
We may support object or string data types for arrays before encoding
or vectorizing. Our estimators do not work with struct arrays, for
instance.</p>
<p>Our documentation can sometimes give information about the dtype
precision, e.g. <code class="docutils literal notranslate"><span class="pre">np.int32</span></code>, <code class="docutils literal notranslate"><span class="pre">np.int64</span></code>, etc. When the precision is
provided, it refers to the NumPy dtype. If an arbitrary precision is
used, the documentation will refer to dtype <code class="docutils literal notranslate"><span class="pre">integer</span></code> or <code class="docutils literal notranslate"><span class="pre">floating</span></code>.
Note that in this case, the precision can be platform dependent.
The <code class="docutils literal notranslate"><span class="pre">numeric</span></code> dtype refers to accepting both <code class="docutils literal notranslate"><span class="pre">integer</span></code> and <code class="docutils literal notranslate"><span class="pre">floating</span></code>.</p>
<p>When it comes to choosing between 64-bit dtype (i.e. <code class="docutils literal notranslate"><span class="pre">np.float64</span></code> and
<code class="docutils literal notranslate"><span class="pre">np.int64</span></code>) and 32-bit dtype (i.e. <code class="docutils literal notranslate"><span class="pre">np.float32</span></code> and <code class="docutils literal notranslate"><span class="pre">np.int32</span></code>), it
boils down to a trade-off between efficiency and precision. The 64-bit
types offer more accurate results due to their lower floating-point
error, but demand more computational resources, resulting in slower
operations and increased memory usage. In contrast, 32-bit types
promise enhanced operation speed and reduced memory consumption, but
introduce a larger floating-point error. The efficiency improvements
depend on lower-level optimizations such as vectorization, single
instruction multiple data (SIMD), or cache optimization but, crucially,
also on the compatibility of the algorithm in use.</p>
<p>Specifically, the choice of precision should account for whether the
employed algorithm can effectively leverage <code class="docutils literal notranslate"><span class="pre">np.float32</span></code>. Some
algorithms, especially certain minimization methods, are exclusively
coded for <code class="docutils literal notranslate"><span class="pre">np.float64</span></code>, meaning that even if <code class="docutils literal notranslate"><span class="pre">np.float32</span></code> is passed, it
triggers an automatic conversion back to <code class="docutils literal notranslate"><span class="pre">np.float64</span></code>. This not only
negates the intended computational savings but also introduces
additional overhead, making operations with <code class="docutils literal notranslate"><span class="pre">np.float32</span></code> unexpectedly
slower and more memory-intensive due to this extra conversion step.</p>
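<p>The trade-off can be made concrete with plain NumPy. This sketch only illustrates memory use, machine precision and upcasting on arbitrary arrays; it does not measure the behaviour of any particular estimator:</p>

```python
import numpy as np

X64 = np.ones((1000, 10), dtype=np.float64)
X32 = X64.astype(np.float32)

# float32 halves the memory footprint of the same array.
memory_ratio = X64.nbytes / X32.nbytes

# Machine epsilon is much larger for float32, i.e. lower precision.
eps32 = np.finfo(np.float32).eps
eps64 = np.finfo(np.float64).eps

# Mixing the two dtypes upcasts to float64 and allocates a new array,
# which is the kind of hidden conversion cost described above.
mixed_dtype = (X32 + X64).dtype

print(memory_ratio, eps32 > eps64, mixed_dtype)
```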
</dd>
<dt id="term-duck-typing">duck typing<a class="headerlink" href="#term-duck-typing" title="Link to this term">#</a></dt><dd><p>We try to apply <a class="reference external" href="https://en.wikipedia.org/wiki/Duck_typing">duck typing</a> to determine how to
handle some input values (e.g. checking whether a given estimator is
a classifier). That is, we avoid using <code class="docutils literal notranslate"><span class="pre">isinstance</span></code> where possible,
and rely on the presence or absence of attributes to determine an
object’s behaviour. Some nuance is required when following this
approach:</p>
<ul>
<li><p>For some estimators, an attribute may only be available once it is
<a class="reference internal" href="#term-fitted"><span class="xref std std-term">fitted</span></a>. For instance, we cannot a priori determine if
<a class="reference internal" href="#term-predict_proba"><span class="xref std std-term">predict_proba</span></a> is available in a grid search where the grid
includes alternating between a probabilistic and a non-probabilistic
predictor in the final step of the pipeline. In the following, we
can only determine if <code class="docutils literal notranslate"><span class="pre">clf</span></code> is probabilistic after fitting it on
some data:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="kn">from</span> <span class="nn">sklearn.model_selection</span> <span class="kn">import</span> <span class="n">GridSearchCV</span>
<span class="gp">>>> </span><span class="kn">from</span> <span class="nn">sklearn.linear_model</span> <span class="kn">import</span> <span class="n">SGDClassifier</span>
<span class="gp">>>> </span><span class="n">clf</span> <span class="o">=</span> <span class="n">GridSearchCV</span><span class="p">(</span><span class="n">SGDClassifier</span><span class="p">(),</span>
<span class="gp">... </span> <span class="n">param_grid</span><span class="o">=</span><span class="p">{</span><span class="s1">'loss'</span><span class="p">:</span> <span class="p">[</span><span class="s1">'log_loss'</span><span class="p">,</span> <span class="s1">'hinge'</span><span class="p">]})</span>
</pre></div>
</div>
<p>This means that we can only check for duck-typed attributes after
fitting, and that we must be careful to make <a class="reference internal" href="#term-meta-estimators"><span class="xref std std-term">meta-estimators</span></a>
only present attributes according to the state of the underlying
estimator after fitting.</p>
</li>
<li><p>Checking if an attribute is present (using <code class="docutils literal notranslate"><span class="pre">hasattr</span></code>) is in general
just as expensive as getting the attribute (<code class="docutils literal notranslate"><span class="pre">getattr</span></code> or dot
notation). In some cases, getting the attribute may indeed be
expensive (e.g. for some implementations of
<a class="reference internal" href="#term-feature_importances_"><span class="xref std std-term">feature_importances_</span></a>, which may suggest this is an API design
flaw). So code which does <code class="docutils literal notranslate"><span class="pre">hasattr</span></code> followed by <code class="docutils literal notranslate"><span class="pre">getattr</span></code> should
be avoided; <code class="docutils literal notranslate"><span class="pre">getattr</span></code> within a try-except block is preferred.</p></li>
<li><p>For determining some aspects of an estimator’s expectations or
support for some feature, we use <a class="reference internal" href="#term-estimator-tags"><span class="xref std std-term">estimator tags</span></a> instead of
duck typing.</p></li>
</ul>
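<p>The preference for attribute access inside a try-except block can be sketched as follows. The helper <code class="docutils literal notranslate"><span class="pre">get_probabilities</span></code> is a hypothetical function written for this example, and the two estimators are chosen because one exposes <code class="docutils literal notranslate"><span class="pre">predict_proba</span></code> and the other does not:</p>

```python
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC


def get_probabilities(estimator, X):
    """Hypothetical helper: return probabilities, or None if unsupported."""
    try:
        # A single attribute lookup, rather than hasattr followed by getattr.
        predict_proba = estimator.predict_proba
    except AttributeError:
        return None
    return predict_proba(X)


X = [[0.0], [1.0], [2.0], [3.0]]
y = [0, 0, 1, 1]

probabilistic = LogisticRegression().fit(X, y)
non_probabilistic = LinearSVC().fit(X, y)

proba = get_probabilities(probabilistic, X)
no_proba = get_probabilities(non_probabilistic, X)

print(proba.shape, no_proba)
```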
</dd>
<dt id="term-early-stopping">early stopping<a class="headerlink" href="#term-early-stopping" title="Link to this term">#</a></dt><dd><p>This consists in stopping an iterative optimization method before the
convergence of the training loss, to avoid over-fitting. This is
generally done by monitoring the generalization score on a validation
set. When available, it is activated through the parameter
<code class="docutils literal notranslate"><span class="pre">early_stopping</span></code> or by setting a positive <a class="reference internal" href="#term-n_iter_no_change"><span class="xref std std-term">n_iter_no_change</span></a>.</p>
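<p>As an illustration, <code class="docutils literal notranslate"><span class="pre">SGDClassifier</span></code> exposes both parameters. The data and parameter values below are arbitrary choices for this sketch:</p>

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier

X, y = make_classification(n_samples=500, random_state=0)

# Hold out 20% of the training data as a validation set and stop once the
# validation score has not improved for 5 consecutive epochs.
clf = SGDClassifier(
    early_stopping=True,
    validation_fraction=0.2,
    n_iter_no_change=5,
    max_iter=1000,
    random_state=0,
)
clf.fit(X, y)

# Number of epochs actually run before stopping.
print(clf.n_iter_)
```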
</dd>
<dt id="term-estimator-instance">estimator instance<a class="headerlink" href="#term-estimator-instance" title="Link to this term">#</a></dt><dd><p>We sometimes use this terminology to distinguish an <a class="reference internal" href="#term-estimator"><span class="xref std std-term">estimator</span></a>
class from a constructed instance. For example, in the following,
<code class="docutils literal notranslate"><span class="pre">cls</span></code> is an estimator class, while <code class="docutils literal notranslate"><span class="pre">est1</span></code> and <code class="docutils literal notranslate"><span class="pre">est2</span></code> are
instances:</p>
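<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="kn">from</span> <span class="nn">sklearn.ensemble</span> <span class="kn">import</span> <span class="n">RandomForestClassifier</span>

<span class="bp">cls</span> <span class="o">=</span> <span class="n">RandomForestClassifier</span>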
<span class="n">est1</span> <span class="o">=</span> <span class="bp">cls</span><span class="p">()</span>
<span class="n">est2</span> <span class="o">=</span> <span class="n">RandomForestClassifier</span><span class="p">()</span>
</pre></div>
</div>
</dd>
<dt id="term-examples">examples<a class="headerlink" href="#term-examples" title="Link to this term">#</a></dt><dd><p>We try to give examples of basic usage for most functions and
classes in the API:</p>
<ul class="simple">
<li><p>as doctests in their docstrings (i.e. within the <code class="docutils literal notranslate"><span class="pre">sklearn/</span></code> library
code itself).</p></li>
<li><p>as examples in the <a class="reference internal" href="auto_examples/index.html#general-examples"><span class="std std-ref">example gallery</span></a>
rendered (using <a class="reference external" href="https://sphinx-gallery.readthedocs.io/">sphinx-gallery</a>) from scripts in the
<code class="docutils literal notranslate"><span class="pre">examples/</span></code> directory, exemplifying key features or parameters
of the estimator/function. These should also be referenced from the
User Guide.</p></li>
<li><p>sometimes in the <a class="reference internal" href="user_guide.html#user-guide"><span class="std std-ref">User Guide</span></a> (built from <code class="docutils literal notranslate"><span class="pre">doc/</span></code>)
alongside a technical description of the estimator.</p></li>
</ul>
</dd>
<dt id="term-experimental">experimental<a class="headerlink" href="#term-experimental" title="Link to this term">#</a></dt><dd><p>An experimental tool is already usable but its public API, such as
default parameter values or fitted attributes, is still subject to
change in future versions without the usual <a class="reference internal" href="#term-deprecation"><span class="xref std std-term">deprecation</span></a>
warning policy.</p>
</dd>
<dt id="term-evaluation-metric">evaluation metric<a class="headerlink" href="#term-evaluation-metric" title="Link to this term">#</a></dt><dt id="term-evaluation-metrics">evaluation metrics<a class="headerlink" href="#term-evaluation-metrics" title="Link to this term">#</a></dt><dd><p>Evaluation metrics give a measure of how well a model performs. We may
use this term specifically to refer to the functions in <a class="reference internal" href="api/sklearn.metrics.html#module-sklearn.metrics" title="sklearn.metrics"><code class="xref py py-mod docutils literal notranslate"><span class="pre">metrics</span></code></a>
(disregarding <a class="reference internal" href="api/sklearn.metrics.html#module-sklearn.metrics.pairwise" title="sklearn.metrics.pairwise"><code class="xref py py-mod docutils literal notranslate"><span class="pre">pairwise</span></code></a>), as distinct from the
<a class="reference internal" href="#term-score"><span class="xref std std-term">score</span></a> method and the <a class="reference internal" href="#term-scoring"><span class="xref std std-term">scoring</span></a> API used in cross
validation. See <a class="reference internal" href="modules/model_evaluation.html#model-evaluation"><span class="std std-ref">Metrics and scoring: quantifying the quality of predictions</span></a>.</p>
<p>These functions usually accept a ground truth (or the raw data
where the metric evaluates clustering without a ground truth) and a
prediction, be it the output of <a class="reference internal" href="#term-predict"><span class="xref std std-term">predict</span></a> (<code class="docutils literal notranslate"><span class="pre">y_pred</span></code>),
of <a class="reference internal" href="#term-predict_proba"><span class="xref std std-term">predict_proba</span></a> (<code class="docutils literal notranslate"><span class="pre">y_proba</span></code>), or of an arbitrary score
function including <a class="reference internal" href="#term-decision_function"><span class="xref std std-term">decision_function</span></a> (<code class="docutils literal notranslate"><span class="pre">y_score</span></code>).
Functions are usually named to end with <code class="docutils literal notranslate"><span class="pre">_score</span></code> if a greater
score indicates a better model, and <code class="docutils literal notranslate"><span class="pre">_loss</span></code> if a lesser score
indicates a better model. This diversity of interface motivates
the scoring API.</p>
<p>Note that some estimators can calculate metrics that are not included
in <a class="reference internal" href="api/sklearn.metrics.html#module-sklearn.metrics" title="sklearn.metrics"><code class="xref py py-mod docutils literal notranslate"><span class="pre">metrics</span></code></a> and are estimator-specific, notably model
likelihoods.</p>
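<p>A small sketch of the naming convention, using made-up labels and probabilities: <code class="docutils literal notranslate"><span class="pre">accuracy_score</span></code> takes hard predictions and is better when greater, while <code class="docutils literal notranslate"><span class="pre">log_loss</span></code> takes probabilities and is better when lesser.</p>

```python
from sklearn.metrics import accuracy_score, log_loss

y_true = [0, 1, 1, 0]
y_pred = [0, 1, 0, 0]  # e.g. the output of predict
y_proba = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.7, 0.3]]  # e.g. predict_proba

# Three of the four predictions are correct.
acc = accuracy_score(y_true, y_pred)

# Lower is better for functions named *_loss.
loss = log_loss(y_true, y_proba)

print(acc, loss)
```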
</dd>
<dt id="term-estimator-tags">estimator tags<a class="headerlink" href="#term-estimator-tags" title="Link to this term">#</a></dt><dd><p>Estimator tags describe certain capabilities of an estimator. This would
enable some runtime behaviors based on estimator inspection, but it
also allows each estimator to be tested for appropriate invariances
while being excepted from other <a class="reference internal" href="#term-common-tests"><span class="xref std std-term">common tests</span></a>.</p>
<p>Some aspects of estimator tags are currently determined through
the <a class="reference internal" href="#term-duck-typing"><span class="xref std std-term">duck typing</span></a> of methods like <code class="docutils literal notranslate"><span class="pre">predict_proba</span></code> and through
some special attributes on estimator objects.</p>
<p>For more detailed info, see <a class="reference internal" href="developers/develop.html#estimator-tags"><span class="std std-ref">Estimator Tags</span></a>.</p>
</dd>
<dt id="term-feature">feature<a class="headerlink" href="#term-feature" title="Link to this term">#</a></dt><dt id="term-features">features<a class="headerlink" href="#term-features" title="Link to this term">#</a></dt><dt id="term-feature-vector">feature vector<a class="headerlink" href="#term-feature-vector" title="Link to this term">#</a></dt><dd><p>In the abstract, a feature is a function (in its mathematical sense)
mapping a sampled object to a numeric or categorical quantity.
“Feature” is also commonly used to refer to these quantities, being the
individual elements of a vector representing a sample. In a data
matrix, features are represented as columns: each column contains the
result of applying a feature function to a set of samples.</p>
<p>Elsewhere features are known as attributes, predictors, regressors, or
independent variables.</p>
<p>Nearly all estimators in scikit-learn assume that features are numeric,
finite and not missing, even when they have semantically distinct
domains and distributions (categorical, ordinal, count-valued,
real-valued, interval). See also <a class="reference internal" href="#term-categorical-feature"><span class="xref std std-term">categorical feature</span></a> and
<a class="reference internal" href="#term-missing-values"><span class="xref std std-term">missing values</span></a>.</p>
<p><code class="docutils literal notranslate"><span class="pre">n_features</span></code> indicates the number of features in a dataset.</p>
</dd>
<dt id="term-fitting">fitting<a class="headerlink" href="#term-fitting" title="Link to this term">#</a></dt><dd><p>Calling <a class="reference internal" href="#term-fit"><span class="xref std std-term">fit</span></a> (or <a class="reference internal" href="#term-fit_transform"><span class="xref std std-term">fit_transform</span></a>, <a class="reference internal" href="#term-fit_predict"><span class="xref std std-term">fit_predict</span></a>,
etc.) on an estimator.</p>
</dd>
<dt id="term-fitted">fitted<a class="headerlink" href="#term-fitted" title="Link to this term">#</a></dt><dd><p>The state of an estimator after <a class="reference internal" href="#term-fitting"><span class="xref std std-term">fitting</span></a>.</p>
<p>There is no conventional procedure for checking if an estimator
is fitted. However, an estimator that is not fitted:</p>
<ul class="simple">
<li><p>should raise <a class="reference internal" href="modules/generated/sklearn.exceptions.NotFittedError.html#sklearn.exceptions.NotFittedError" title="sklearn.exceptions.NotFittedError"><code class="xref py py-class docutils literal notranslate"><span class="pre">exceptions.NotFittedError</span></code></a> when a prediction
method (<a class="reference internal" href="#term-predict"><span class="xref std std-term">predict</span></a>, <a class="reference internal" href="#term-transform"><span class="xref std std-term">transform</span></a>, etc.) is called.
(<a class="reference internal" href="modules/generated/sklearn.utils.validation.check_is_fitted.html#sklearn.utils.validation.check_is_fitted" title="sklearn.utils.validation.check_is_fitted"><code class="xref py py-func docutils literal notranslate"><span class="pre">utils.validation.check_is_fitted</span></code></a> is used internally
for this purpose.)</p></li>
<li><p>should not have any <a class="reference internal" href="#term-attributes"><span class="xref std std-term">attributes</span></a> beginning with an alphabetic
character and ending with an underscore. (Note that a descriptor for
the attribute may still be present on the class, but <code class="docutils literal notranslate"><span class="pre">hasattr</span></code> should
return False.)</p></li>
</ul>
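<p>Both points can be sketched with <code class="docutils literal notranslate"><span class="pre">check_is_fitted</span></code>, here on a <code class="docutils literal notranslate"><span class="pre">LinearRegression</span></code> and toy data chosen only for illustration:</p>

```python
from sklearn.exceptions import NotFittedError
from sklearn.linear_model import LinearRegression
from sklearn.utils.validation import check_is_fitted

est = LinearRegression()

# Before fitting: no trailing-underscore attribute, and the check raises.
has_coef_before = hasattr(est, "coef_")
try:
    check_is_fitted(est)
    raised = False
except NotFittedError:
    raised = True

est.fit([[0.0], [1.0], [2.0]], [0.0, 1.0, 2.0])

# After fitting: the check passes silently and the attribute exists.
check_is_fitted(est)
has_coef_after = hasattr(est, "coef_")

print(has_coef_before, raised, has_coef_after)
```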
</dd>
<dt id="term-function">function<a class="headerlink" href="#term-function" title="Link to this term">#</a></dt><dd><p>We provide ad hoc function interfaces for many algorithms, while
<a class="reference internal" href="#term-estimator"><span class="xref std std-term">estimator</span></a> classes provide a more consistent interface.</p>
<p>In particular, Scikit-learn may provide a function interface that fits
a model to some data and returns the learnt model parameters, as in
<a class="reference internal" href="modules/generated/sklearn.linear_model.enet_path.html#sklearn.linear_model.enet_path" title="sklearn.linear_model.enet_path"><code class="xref py py-func docutils literal notranslate"><span class="pre">linear_model.enet_path</span></code></a>. For transductive models, this also
returns the embedding or cluster labels, as in
<a class="reference internal" href="modules/generated/sklearn.manifold.spectral_embedding.html#sklearn.manifold.spectral_embedding" title="sklearn.manifold.spectral_embedding"><code class="xref py py-func docutils literal notranslate"><span class="pre">manifold.spectral_embedding</span></code></a> or <a class="reference internal" href="modules/generated/dbscan-function.html#sklearn.cluster.dbscan" title="sklearn.cluster.dbscan"><code class="xref py py-func docutils literal notranslate"><span class="pre">cluster.dbscan</span></code></a>. Many
preprocessing transformers also provide a function interface, akin to
calling <a class="reference internal" href="#term-fit_transform"><span class="xref std std-term">fit_transform</span></a>, as in
<a class="reference internal" href="modules/generated/sklearn.preprocessing.maxabs_scale.html#sklearn.preprocessing.maxabs_scale" title="sklearn.preprocessing.maxabs_scale"><code class="xref py py-func docutils literal notranslate"><span class="pre">preprocessing.maxabs_scale</span></code></a>. Users should be careful to avoid
<a class="reference internal" href="#term-data-leakage"><span class="xref std std-term">data leakage</span></a> when making use of these
<code class="docutils literal notranslate"><span class="pre">fit_transform</span></code>-equivalent functions.</p>
<p>We do not have a strict policy about when to or when not to provide
function forms of estimators, but maintainers should consider
consistency with existing interfaces, and whether providing a function
would lead users astray from best practices (as regards data leakage,
etc.).</p>
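<p>The difference between the two interfaces, and the data leakage concern, can be sketched with <code class="docutils literal notranslate"><span class="pre">maxabs_scale</span></code> and <code class="docutils literal notranslate"><span class="pre">MaxAbsScaler</span></code>. The train and test arrays are made up for this example:</p>

```python
import numpy as np
from sklearn.preprocessing import MaxAbsScaler, maxabs_scale

X_train = np.array([[1.0, -2.0], [2.0, 4.0], [-4.0, 1.0]])
X_test = np.array([[8.0, 2.0]])

# Function interface: fits and transforms in one call, keeping no state.
scaled_by_function = maxabs_scale(X_train)

# Estimator interface: the statistics learnt on the training data are
# stored and reused for the test data, which avoids data leakage.
scaler = MaxAbsScaler().fit(X_train)
scaled_by_estimator = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

print(np.allclose(scaled_by_function, scaled_by_estimator))
print(X_test_scaled)
```

<p>Calling the function separately on the test data would instead rescale it by its own maximum, which is the kind of inconsistency the estimator interface avoids.</p>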
</dd>