Concepts
inquiry-based tutorials
mean, median, mode
variance & covariance
percent
Correlation, line of best fit prediction
DONE Visualization with matplotlib
PDFs
-no partial derivatives
Standard deviation
Linear Regression
Probability—https://www.dataquest.io/blog/basic-statistics-in-python-probability/
Let's use our little function to take a bigger average.
1. Go to the US Census website [Educational Attainment page](https://www.census.gov/data/tables/2019/demo/educational-attainment/cps-detailed-tables.html)
2. Click on "Both Sexes" and download that spreadsheet
3. Look at the row labeled "Total"
4. Write a function that finds out the average educational attainment of an American in 2019
This last step is a bit of a leap so let's break it down.
1. Let's assign a coefficient to each level of attainment (1, 2, 3, 4, etc.)
1. Make a new list where each element is the number of people who achieved that level of education times the level coefficient.
1. Then take the sum of that new list.
1. Then divide that sum by the total number of people.
1. We should, in the end, have a number that stands for the average level of educational attainment.
```py
import numpy as np

# counts of people at each attainment level, in level order
data = np.array([8603, 13372, 62259, 34690, 22738, 49937, 22214, 3136, 4529])

def mean_edu_attainment(dataset):
    # weight each count by its level coefficient (1, 2, 3, ...)
    weighted_list = []
    for i, n in enumerate(dataset, start=1):
        weighted_list.append(n * i)
    # calculate the total number of people
    total = 0
    for n in dataset:
        total += n
    # weighted mean: sum of weighted counts over total people
    mean_attainment = sum(weighted_list) / total
    return mean_attainment

mean_edu_attainment(data)
```
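A weighted mean of the level coefficients, weighted by the number of people at each level, can also be sketched in vectorized form with NumPy's `np.average` and its `weights` argument. The names `levels` and `mean_level` are made up for this sketch, and the coefficients 1 through 9 are assumed to line up with the order of the counts.

```python
import numpy as np

# counts per attainment level, copied from the notes above
data = np.array([8603, 13372, 62259, 34690, 22738, 49937, 22214, 3136, 4529])

# assumed level coefficients: 1 for the first count, 2 for the second, ...
levels = np.arange(1, len(data) + 1)

# weighted mean = sum(levels * data) / sum(data)
mean_level = np.average(levels, weights=data)
print(mean_level)
```

The result is a number on the 1 to 9 coefficient scale, so it only means something relative to how the levels were numbered.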
Linear Regression
- Remove outliers
Mean and standard deviation: what was the chance someone scored >80?
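One way to approach the ">80" question, assuming the scores are roughly normally distributed, is to feed the mean and standard deviation into a normal model and take the upper tail. The `scores` list below is invented for illustration; `NormalDist` comes from Python's standard `statistics` module.

```python
from statistics import NormalDist, mean, stdev

# hypothetical test scores, made up for this sketch
scores = [55, 62, 68, 70, 71, 74, 77, 80, 85, 93]

mu = mean(scores)
sigma = stdev(scores)  # sample standard deviation

# P(score > 80) under a normal model with that mean and spread
p_above_80 = 1 - NormalDist(mu, sigma).cdf(80)
print(f"mean={mu:.1f}, sd={sigma:.1f}, P(>80)={p_above_80:.3f}")
```

If the real scores are skewed, counting the fraction of observed scores above 80 may be a better estimate than the normal model.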
Data detective: temperature of two cities, difference and sum.
NaN values
Titanic stowaways, try to determine their ages, what class they hid in etc.
Excellent article on types of bar charts
https://towardsdatascience.com/histograms-and-density-plots-in-python-f6bda88f5ac0
Amazing article on this Dataset
https://towardsdatascience.com/predicting-the-survival-of-titanic-passengers-30870ccc7e8
.astype('category').cat.codes)
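The `.astype('category').cat.codes` fragment is the pandas idiom for turning a text column into integer codes. A minimal sketch on a made-up DataFrame (the column names and values are hypothetical, not taken from the Titanic data):

```python
import pandas as pd

# hypothetical column of text labels
df = pd.DataFrame({"sex": ["male", "female", "female", "male"]})

# categories are ordered alphabetically, so "female" -> 0 and "male" -> 1
df["sex_code"] = df["sex"].astype("category").cat.codes
print(df)
```

Missing values get the code -1, which is worth checking before passing the column to a model.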
tips dataset—https://github.com/mwaskom/seaborn-data/blob/master/tips.csv
Visualizing tips dataset—https://seaborn.pydata.org/tutorial/relational.html
Normal distribution and z-score
https://www.youtube.com/watch?v=4Fta6KQ1QHQ
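A z-score says how many standard deviations a value sits from the mean: z = (x - mean) / std. A minimal sketch with NumPy, using invented values and the population standard deviation (NumPy's default, `ddof=0`):

```python
import numpy as np

# made-up sample values for illustration
values = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

# subtract the mean, then divide by the standard deviation
z = (values - values.mean()) / values.std()
print(z)
```

For a sample rather than a full population, `values.std(ddof=1)` would be the usual choice.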