The lectures on
[Latent Semantic Analysis (LSA)](https://en.wikipedia.org/wiki/Latent_semantic_analysis)
are to be recorded through Wolfram University (Wolfram U) in December 2019 and January-February 2020.

-----

## The lectures (as live-coding sessions)

### 1. [X] Overview of Latent Semantic Analysis (LSA): typical problems and basic workflows

Answering preliminary anticipated questions.

Here is
[the recording of the first session at YouTube](https://www.youtube.com/watch?v=d5M54_9AMVQ).

- What are the typical applications of LSA?
- Why use LSA?
- What is the fundamental philosophical or scientific assumption for LSA?
- What is the most important and/or fundamental step of LSA?
- What is the difference between LSA and Latent Semantic Indexing (LSI)?
- What are the alternatives?
  - Using Neural Networks instead?
- How is LSA used to derive similarities between two given texts?
- How is LSA used to evaluate the proximity of phrases?
  (That have different words, but close semantic meaning.)
- How do the main dimension reduction methods compare?
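
One of the questions above, deriving similarities between two texts, can be illustrated with a short Python sketch (the sessions themselves use Wolfram Language; the toy corpus and the choice `k = 2` are assumptions of this example, not the lecture's code):

```python
import numpy as np

# Toy corpus: documents 0 and 2 share no words, but both overlap document 1.
corpus = ["cats chase mice", "cats hunt mice", "felines hunt rodents"]

# Document-term matrix of raw counts.
terms = sorted({w for doc in corpus for w in doc.split()})
M = np.array([[doc.split().count(t) for t in terms] for doc in corpus], dtype=float)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# The core LSA step: dimension reduction via SVD.
U, s, Vt = np.linalg.svd(M, full_matrices=False)
k = 2
docs_k = U[:, :k] * s[:k]   # documents in the reduced "topic" space

# The raw term vectors of documents 0 and 2 are orthogonal (cosine 0);
# after reduction their cosine becomes nonzero.
print(cosine(M[0], M[2]), cosine(docs_k[0], docs_k[2]))
```

The point of the sketch: the reduction couples documents that share no terms at all, through terms they co-occur with elsewhere in the collection.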

------

### 2. [X] LSA for document collections

Here is [the recording of the second session at YouTube](https://www.youtube.com/watch?v=5pX5WAfPNb8).

- Motivational example -- a full-blown LSA workflow.

- Fundamentals, text transformation (the hard way):
  - bag-of-words model,
  - stop words,
  - stemming.

- The easy way with
  [LSAMon](https://github.com/antononcube/SimplifiedMachineLearningWorkflows-book/blob/master/Part-2-Monadic-Workflows/A-monad-for-Latent-Semantic-Analysis-workflows.md).

- "Eat your own dog food" example.
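
The "hard way" steps above (bag-of-words, stop words, stemming) can be sketched in plain Python. The stop list and the crude suffix stripper below are illustrative assumptions, stand-ins for a real stop-word list and a real stemmer (e.g. Porter's):

```python
from collections import Counter

# Assumed, minimal stop-word list for the sketch.
STOP_WORDS = {"the", "a", "an", "of", "and", "is", "are", "in", "on"}

def crude_stem(word):
    # Very rough suffix stripping -- a stand-in for a real stemming algorithm.
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def to_bag_of_words(text):
    # Tokenize, lowercase, drop punctuation and stop words, then stem.
    tokens = [w.strip(".,!?").lower() for w in text.split()]
    tokens = [w for w in tokens if w and w not in STOP_WORDS]
    return Counter(crude_stem(w) for w in tokens)

print(to_bag_of_words("The cats are chasing the mice in the barn."))
```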

------

### 3. [X] Representation of the documents - the fundamental matrix object

Here is [the recording of the third session at YouTube](https://www.youtube.com/watch?v=MNQR28P8Juc).

- Review: last session's example.

- Review: the motivational example -- a full-blown LSA workflow.

- Linear vector space representation:
  - LSA's most fundamental operation,
  - matrix with named rows and columns.

- Pareto Principle adherence:
  - for a document,
  - for a document collection, and
  - in general.
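
The "matrix with named rows and columns" idea can be sketched in Python. Real implementations (such as LSAMon's sparse matrix objects) are far more complete; the two-document corpus and the lookup helper here are assumptions of this sketch:

```python
import numpy as np

# Toy document collection, keyed by document id.
docs = {"doc1": "cats chase mice", "doc2": "dogs chase cats"}
terms = sorted({w for text in docs.values() for w in text.split()})

row_names = list(docs)   # named rows: document ids
col_names = terms        # named columns: terms
M = np.array([[text.split().count(t) for t in terms] for text in docs.values()])

def entry(doc_id, term):
    # Look up a matrix entry by row and column *name* rather than by index.
    return M[row_names.index(doc_id), col_names.index(term)]

print(entry("doc2", "dogs"))   # count of "dogs" in doc2
```

Keeping the row and column names attached to the matrix is what lets later workflow steps (queries, statistics, topic extraction) refer to documents and terms directly.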

------

### 4. [X] Representation of unseen documents

Here is [the recording of the fourth session at YouTube](https://www.youtube.com/watch?v=ElwOLyd9GC4).

- Review: last session's matrix object.
  - Sparse matrix with named rows and columns.

- Queries representation.

- Representing
  [rstudio-conf-2019 abstracts](../../Data/RStudio-conf-2019-abstracts.json)
  in the vector space of
  [WTC-2019 abstracts](../../Data/Wolfram-Technology-Conference-2019-abstracts.json).

- Making a search engine for:
  - [ ] [Raku's documentation](https://github.com/Raku/doc),
  - [X] WTC-2019 abstracts.

- Dimension reduction over an image collection.
  - Topics over [random mandalas](https://resources.wolframcloud.com/FunctionRepository/resources/RandomMandala).
  - Representation of unseen mandala images.

------

### 5. [X] LSA for image de-noising and classification

Here is [the recording of the fifth session at YouTube](https://www.youtube.com/watch?v=_KBecGdzoS0).

- Review: last session's image collection topics extraction.
- Let us try that with two other datasets:
  - handwritten digits, and
  - [Hentaigana](https://en.wikipedia.org/wiki/Hentaigana) (maybe).

- Image de-noising:
  - Using handwritten digits (again).

- Image classification:
  - Handwritten digits.
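
The de-noising idea, keeping only the top singular components of the pixel data, can be sketched with numpy. The synthetic rank-1 "image" below stands in for the handwritten digits used in the session; the noise level and rank are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# A smooth rank-1 synthetic image, plus additive noise.
profile = np.sin(np.linspace(0, np.pi, 32))
clean = np.outer(profile, profile)
noisy = clean + 0.1 * rng.standard_normal(clean.shape)

# De-noise by keeping only the top-r singular components.
U, s, Vt = np.linalg.svd(noisy, full_matrices=False)
r = 1
denoised = (U[:, :r] * s[:r]) @ Vt[:r]

err_noisy = np.linalg.norm(noisy - clean)
err_denoised = np.linalg.norm(denoised - clean)
print(err_noisy, err_denoised)   # the low-rank reconstruction is closer to `clean`
```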
95
111
96
- 6 . [X] Further use cases.
97
- Here is [ the recording of the sixth session at YouTube] ( https://www.youtube.com/watch?v=Hxawq1O3Oec ) .
98
-
99
- - Derive a custom taxonomy over a document collection.
100
- - Clustering with the reduced dimension.
101
- - Apply LSA to Great Conversation studies.
102
- - Use LSA for translation of natural languages.
103
- - Using Dostoevsky's novel "The Idiot".
104
- - [ Russian chapters breakdown] ( ../../Data/Dostoyevsky-The-Idiot-Russian-chapters.json.zip ) .
105
- - [ English chapters breakdown] ( ../../Data/Dostoyevsky-The-Idiot-English-chapters.json.zip ) .
106
- - Use LSA for making or improving search engines.
107
- - LSA for time series collections.
112
+ ------
113
+
114
+ ### 6. [ X] Further use cases
115
+
116
+ Here is [ the recording of the sixth session at YouTube] ( https://www.youtube.com/watch?v=Hxawq1O3Oec ) .
117
+
118
+ - Derive a custom taxonomy over a document collection.
119
+ - Clustering with the reduced dimension.
120
+ - Apply LSA to Great Conversation studies.
121
+ - Use LSA for translation of natural languages.
122
+ - Using Dostoevsky's novel "The Idiot".
123
+ - [ Russian chapters breakdown] ( ../../Data/Dostoyevsky-The-Idiot-Russian-chapters.json.zip ) .
124
+ - [ English chapters breakdown] ( ../../Data/Dostoyevsky-The-Idiot-English-chapters.json.zip ) .
125
+ - Use LSA for making or improving search engines.
126
+ - LSA for time series collections.
108
127
109
128
0 commit comments
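
The search-engine use case reduces to ranking documents by cosine similarity to a folded-in query. A toy numpy sketch; the corpus, query, and `k` are assumptions (the sessions build this over WTC-2019 abstracts):

```python
import numpy as np

corpus = [
    "neural networks for image classification",
    "latent semantic analysis of documents",
    "dimension reduction and topic extraction",
    "gardening tips for spring",
]
terms = sorted({w for d in corpus for w in d.split()})
M = np.array([[d.split().count(t) for t in terms] for d in corpus], dtype=float)

U, s, Vt = np.linalg.svd(M, full_matrices=False)
k = 3
docs_k = U[:, :k] * s[:k]   # documents in the reduced space

def search(query):
    # Fold the query into the reduced space, then rank by cosine similarity.
    q = np.array([query.split().count(t) for t in terms], dtype=float)
    q_k = (q @ Vt[:k].T) / s[:k]
    sims = docs_k @ q_k / (np.linalg.norm(docs_k, axis=1) * np.linalg.norm(q_k))
    return np.argsort(-sims)   # document indices, best match first

print(search("topic extraction for documents"))
```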