Commit b44f73f

docs: add kth largest elements lab
1 parent 1484ee9 commit b44f73f

1 file changed

+241
-0
lines changed

@@ -0,0 +1,241 @@
1+
{
2+
"cells": [
3+
{
4+
"cell_type": "markdown",
5+
"metadata": {
6+
"graffitiCellId": "id_lyoik70"
7+
},
8+
"source": [
9+
"### Problem Statement\n",
10+
"Given an unsorted array `Arr` of `n` positive integers, find the $k^{th}$ smallest element of the array using a divide-and-conquer approach. \n",
11+
"\n",
12+
"**Input**: Unsorted array `Arr` and an integer `k` where $1 \\leq k \\leq n$ <br>\n",
13+
"**Output**: The $k^{th}$ smallest element of array `Arr`<br>\n",
14+
"\n",
15+
"\n",
16+
"**Example 1**<br>\n",
17+
"Arr = `[6, 80, 36, 8, 23, 7, 10, 12, 42, 99]`<br>\n",
18+
"k = `10`<br>\n",
19+
"Output = `99`<br>\n",
20+
"\n",
21+
"**Example 2**<br>\n",
22+
"Arr = `[6, 80, 36, 8, 23, 7, 10, 12, 42, 99]`<br>\n",
23+
"k = `5`<br>\n",
24+
"Output = `12`<br>\n",
25+
"\n",
26+
"---\n",
27+
"\n",
28+
"### The Pseudocode - `fastSelect(Arr, k)`\n",
29+
"1. Break `Arr` into $\\left \\lceil{\\frac{n}{5}} \\right \\rceil$ groups of at most 5 elements each, namely $G_1, G_2, G_3...G_{\\left \\lceil{\\frac{n}{5}} \\right \\rceil}$. For simplicity, the remaining steps write this count as $\\frac{n}{5}$.\n",
30+
"\n",
31+
"\n",
32+
"2. For each group $G_i, \\forall 1 \\leq i \\leq \\frac{n}{5} $, do the following:\n",
33+
"    - Sort the group $G_i$\n",
34+
"    - Take the middle element, i.e., the median $m_i$, of group $G_i$\n",
35+
"    - Add $m_i$ to the set of medians **$S$**\n",
36+
"\n",
37+
"\n",
38+
"3. The set of medians is then $S = \\{m_1, m_2, m_3...m_{\\frac{n}{5}}\\}$. The \"good\" `pivot` element is the median of **$S$**. Since $S$ has about $\\frac{n}{5}$ elements, its median has rank $\\frac{n}{10}$, so we can find it as $pivot = fastSelect(S, \\frac{n}{10})$. \n",
39+
"\n",
40+
"\n",
41+
"4. Partition the original `Arr` into three sub-arrays - `Arr_Less_P`, `Arr_Equal_P`, and `Arr_More_P` having elements less than `pivot`, equal to `pivot`, and bigger than `pivot` **respectively**.\n",
42+
"\n",
43+
"\n",
44+
"5. Recurse based on the **sizes of the three sub-arrays**. Depending on `k`, we either search recursively in the \"small\" sub-array or the \"bigger\" sub-array, or return `pivot` directly:\n",
45+
"    - If `k <= length(Arr_Less_P)`, then return `fastSelect(Arr_Less_P, k)`. If the \"small\" sub-array has at least `k` elements, the desired $k^{th}$ smallest element lies in it, so we call the same function recursively on the \"small\" sub-array. <br><br>\n",
46+
"    \n",
47+
"    - If `k > (length(Arr_Less_P) + length(Arr_Equal_P))`, then return `fastSelect(Arr_More_P, (k - length(Arr_Less_P) - length(Arr_Equal_P)))`. If `k` exceeds the combined size of the \"small\" and \"equal\" sub-arrays, the desired $k^{th}$ smallest element lies in the \"bigger\" sub-array, where its rank is reduced by the number of elements left behind. <br><br>\n",
48+
"    \n",
49+
"    - Return `pivot` otherwise. If neither of the above cases holds, the $k^{th}$ smallest element lies in the \"equal\" sub-array, so it is the `pivot` itself.\n",
50+
" \n",
51+
"---\n",
52+
"### Exercise - Write the function definition here"
53+
]
54+
},
55+
{
56+
"cell_type": "code",
57+
"execution_count": 1,
58+
"metadata": {
59+
"graffitiCellId": "id_67f82ik"
60+
},
61+
"outputs": [],
62+
"source": [
63+
"def fastSelect(Arr, k):  # k is a 1-based rank: k = 1 asks for the smallest element\n",
64+
"    n = len(Arr)  # length of the original array\n",
65+
"    \n",
66+
"    if(k>0 and k <= n):  # k must be a valid rank, 1 <= k <= n; otherwise None is returned\n",
67+
"        # Helper variables\n",
68+
"        setOfMedians = []\n",
69+
"        Arr_Less_P = []\n",
70+
"        Arr_Equal_P = []\n",
71+
"        Arr_More_P = []\n",
72+
"        i = 0\n",
73+
"        \n",
74+
"        # Step 1 - Break Arr into groups of size 5\n",
75+
"        # Step 2 - For each group, sort and find median (middle). Add the median to setOfMedians\n",
76+
"        while (i < n // 5):  # n//5 is the number of full groups of size 5\n",
77+
"            median = findMedian(Arr, 5*i, 5)  # find median of each group of size 5\n",
78+
"            setOfMedians.append(median)\n",
79+
"            i += 1\n",
80+
"\n",
81+
"        # If n is not a multiple of 5, then a last group with size = n % 5 will be formed\n",
82+
"        if (5*i < n):\n",
83+
"            median = findMedian(Arr, 5*i, n % 5)\n",
84+
"            setOfMedians.append(median)\n",
85+
"        \n",
86+
"        # Step 3 - Find the median of setOfMedians\n",
87+
"        if (len(setOfMedians) == 1):  # Base case for this task\n",
88+
"            pivot = setOfMedians[0]\n",
89+
"        elif (len(setOfMedians)>1):\n",
90+
"            pivot = fastSelect(setOfMedians, (len(setOfMedians)//2))\n",
91+
"        \n",
92+
"        # Step 4 - Partition the original Arr into three sub-arrays\n",
93+
"        for element in Arr:\n",
94+
"            if (element<pivot):\n",
95+
"                Arr_Less_P.append(element)\n",
96+
"            elif (element>pivot):\n",
97+
"                Arr_More_P.append(element)\n",
98+
"            else:\n",
99+
"                Arr_Equal_P.append(element)\n",
100+
"        \n",
101+
"        # Step 5 - Recurse based on the sizes of the three sub-arrays\n",
102+
"        if (k <= len(Arr_Less_P)):\n",
103+
"            return fastSelect(Arr_Less_P, k)\n",
104+
"        \n",
105+
"        elif (k > (len(Arr_Less_P) + len(Arr_Equal_P))):\n",
106+
"            return fastSelect(Arr_More_P, (k - len(Arr_Less_P) - len(Arr_Equal_P)))\n",
107+
"        \n",
108+
"        else:\n",
109+
"            return pivot\n",
110+
"\n",
111+
"# Helper function: median of the group Arr[start : start + size]\n",
112+
"def findMedian(Arr, start, size):\n",
113+
"    myList = []\n",
114+
"    for i in range(start, start + size):\n",
115+
"        myList.append(Arr[i])\n",
116+
"    \n",
117+
"    # Sort the group\n",
118+
"    myList.sort()\n",
119+
"    \n",
120+
"    # Return the middle element (the upper of the two middle elements when size is even)\n",
121+
"    return myList[size // 2]"
122+
]
123+
},
124+
{
125+
"cell_type": "markdown",
126+
"metadata": {
127+
"graffitiCellId": "id_dsq4qxt"
128+
},
129+
"source": [
130+
"<span class=\"graffiti-highlight graffiti-id_dsq4qxt-id_29dh0dm\"><i></i><button>Show Solution</button></span>"
131+
]
132+
},
133+
{
134+
"cell_type": "markdown",
135+
"metadata": {
136+
"graffitiCellId": "id_mhdbx0f"
137+
},
138+
"source": [
139+
"### Test - Let's test your function"
140+
]
141+
},
142+
{
143+
"cell_type": "code",
144+
"execution_count": 2,
145+
"metadata": {
146+
"graffitiCellId": "id_bgck2hk"
147+
},
148+
"outputs": [
149+
{
150+
"name": "stdout",
151+
"output_type": "stream",
152+
"text": [
153+
"12\n"
154+
]
155+
}
156+
],
157+
"source": [
158+
"Arr = [6, 80, 36, 8, 23, 7, 10, 12, 42]\n",
159+
"k = 5\n",
160+
"print(fastSelect(Arr, k)) # Outputs 12"
161+
]
162+
},
163+
{
164+
"cell_type": "code",
165+
"execution_count": 3,
166+
"metadata": {
167+
"graffitiCellId": "id_32omxhm"
168+
},
169+
"outputs": [
170+
{
171+
"name": "stdout",
172+
"output_type": "stream",
173+
"text": [
174+
"11\n"
175+
]
176+
}
177+
],
178+
"source": [
179+
"Arr = [5, 2, 20, 17, 11, 13, 8, 9, 11]\n",
180+
"k = 5\n",
181+
"print(fastSelect(Arr, k)) # Outputs 11"
182+
]
183+
},
184+
{
185+
"cell_type": "code",
186+
"execution_count": 4,
187+
"metadata": {
188+
"graffitiCellId": "id_h9nihqx"
189+
},
190+
"outputs": [
191+
{
192+
"name": "stdout",
193+
"output_type": "stream",
194+
"text": [
195+
"99\n"
196+
]
197+
}
198+
],
199+
"source": [
200+
"Arr = [6, 80, 36, 8, 23, 7, 10, 12, 42, 99]\n",
201+
"k = 10\n",
202+
"print(fastSelect(Arr, k)) # Outputs 99"
203+
]
204+
},
205+
{
206+
"cell_type": "code",
207+
"execution_count": null,
208+
"metadata": {
209+
"graffitiCellId": "id_xidprnr"
210+
},
211+
"outputs": [],
212+
"source": []
213+
}
214+
],
215+
"metadata": {
216+
"graffiti": {
217+
"firstAuthorId": "af9e0b36-2ad2-11ea-83c4-a78dc7ef519f",
218+
"id": "id_xuzb5il",
219+
"language": "EN"
220+
},
221+
"kernelspec": {
222+
"display_name": "Python 3",
223+
"language": "python",
224+
"name": "python3"
225+
},
226+
"language_info": {
227+
"codemirror_mode": {
228+
"name": "ipython",
229+
"version": 3
230+
},
231+
"file_extension": ".py",
232+
"mimetype": "text/x-python",
233+
"name": "python",
234+
"nbconvert_exporter": "python",
235+
"pygments_lexer": "ipython3",
236+
"version": "3.6.3"
237+
}
238+
},
239+
"nbformat": 4,
240+
"nbformat_minor": 2
241+
}
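
The five pseudocode steps in the notebook can also be sketched in a more compact form and cross-checked against plain sorting. This is only an illustrative sketch, not the notebook's solution: the names `group_medians` and `select_kth` are hypothetical, and raising `ValueError` for an out-of-range `k` is an assumption made here (the notebook's `fastSelect` returns `None` in that case). The median convention (upper middle element for even-sized groups) and the pivot rank `len(medians) // 2` are kept the same as in the notebook code.

```python
import random


def group_medians(arr, group_size=5):
    """Steps 1-2: median of each consecutive group of at most group_size items."""
    medians = []
    for start in range(0, len(arr), group_size):
        group = sorted(arr[start:start + group_size])
        medians.append(group[len(group) // 2])  # upper median for even sizes
    return medians


def select_kth(arr, k):
    """Return the k-th smallest element of arr, where k is a 1-based rank."""
    if not 1 <= k <= len(arr):
        # Assumption of this sketch: fail loudly instead of returning None.
        raise ValueError("k must satisfy 1 <= k <= len(arr)")

    # Step 3: the pivot is the median of the group medians.
    medians = group_medians(arr)
    if len(medians) == 1:
        pivot = medians[0]
    else:
        pivot = select_kth(medians, len(medians) // 2)

    # Step 4: three-way partition around the pivot.
    less = [x for x in arr if x < pivot]
    equal = [x for x in arr if x == pivot]
    more = [x for x in arr if x > pivot]

    # Step 5: recurse into the part that contains rank k.
    if k <= len(less):
        return select_kth(less, k)
    if k > len(less) + len(equal):
        return select_kth(more, k - len(less) - len(equal))
    return pivot


# Cross-check against sorting on small random inputs (duplicates included).
rng = random.Random(0)
for _ in range(100):
    data = [rng.randint(1, 50) for _ in range(rng.randint(1, 40))]
    k = rng.randint(1, len(data))
    assert select_kth(data, k) == sorted(data)[k - 1]

print(select_kth([6, 80, 36, 8, 23, 7, 10, 12, 42, 99], 5))  # 12
```

Comparing against `sorted(data)[k - 1]` is a convenient way to test any implementation of the exercise, since the sorted array gives the expected answer for every valid `k`.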
