Minor corrections

turukawa · turukawa · commit 1cb737f2ed8d · 2020-11-01T19:24:24.000+01:00
diff --git a/Module 1, Lesson 2 - Research and experiments with data.ipynb b/Module 1, Lesson 2 - Research and experiments with data.ipynb
@@ -1793,7 +1793,7 @@
     "\n",
     "(<a id=\"cit-emanuel_what_2000\" href=\"#call-emanuel_what_2000\">Emanuel, Wendler <em>et al.</em>, 2000</a>) Emanuel Ezekiel J., Wendler David and Grady Christine, ``_What Makes Clinical Research Ethical?_'', JAMA, vol. 283, number 20, pp. 2701--2711, May 2000.  [online](https://jamanetwork.com/journals/jama/fullarticle/192740)\n",
     "\n",
-    "(<a id=\"cit-chait_technical_2014\" href=\"#call-chait_technical_2014\">Chait, 2014</a>) Gavin Chait, ``Technical assessment of open data platforms for national statistical organisations'', World Bank Group, number: ,  December 2014.  [online](https://documents.worldbank.org/en/publication/documents-reports/documentdetail)\n",
+    "(<a id=\"cit-chait_technical_2014\" href=\"#call-chait_technical_2014\">Chait, 2014</a>) Gavin Chait, ``Technical assessment of open data platforms for national statistical organisations'', World Bank Group, number: ,  December 2014.  [online](https://openknowledge.worldbank.org/handle/10986/21111)\n",
     "\n",
     "(<a id=\"cit-downey_think_2014\" href=\"#call-downey_think_2014\">Downey, 2014</a>) Allen B. Downey, ``_Think Stats 2 - Exploratory Data Analysis in Python_'',  2014.  [online](https://greenteapress.com/wp/think-stats-2e/)\n",
     "\n",
diff --git a/Module 1, Lesson 3 - Probability, randomness, and the risk of de-anonymization.ipynb b/Module 1, Lesson 3 - Probability, randomness, and the risk of de-anonymization.ipynb
@@ -76,7 +76,7 @@
     "\n",
     "Since racial classification was crucial to such a system, in 1950 the [Population Registration Act](https://en.wikipedia.org/wiki/Population_Registration_Act,_1950) was introduced along with a suite of tools designed to provide rapid racial classification tests.\n",
     "\n",
-    "One of these involved a [pencil](https://en.wikipedia.org/wiki/Pencil_test_(South_Africa)).\n",
+    "One of these involved a [pencil](https://en.wikipedia.org/wiki/Pencil_test_(South_Africa%29).\n",
     "\n",
     "If a government official felt unable to make a racial classification by observation, they would insert a pencil into a subject's hair. If it fell out, that person was classified as \"white\". If it remained in place, that person was classified as \"coloured\", a social status above \"black\" but below \"white\".\n",
     "\n",
@@ -112,7 +112,7 @@
     "\n",
     "It is seductive to attempt to infer human behaviour from human appearance, and such research isn't always produced in service to brutal states. In \"Tracking historical changes in trustworthiness using machine learning analyses of facial cues in paintings\", French researchers state that they have designed \"an algorithm to automatically generate trustworthiness evaluations for the facial action units (smile, eye brows, etc.) of European portraits in large historical databases.\" They claim their results \"show that trustworthiness in portraits increased over the period 1500–2000 paralleling the decline of interpersonal violence and the rise of democratic values observed in Western Europe.\" \\cite{safra_tracking_2020}\n",
     "\n",
-    "Classifying people based on fallacies takes away __individual autonomy__ and imposes an incredible ethical choice on data scientists who enable such systems to function. A researcher may spend an entire career inside a system that supports unethical choices and never requires them to account for their consequences, or they may find that systems change and ushers in an era of [transparency and accountability](https://en.wikipedia.org/wiki/Truth_and_Reconciliation_Commission_(South_Africa)).\n",
+    "Classifying people based on fallacies takes away __individual autonomy__ and imposes an incredible ethical choice on data scientists who enable such systems to function. A researcher may spend an entire career inside a system that supports unethical choices and never requires them to account for their consequences, or they may find that systems change and ushers in an era of [transparency and accountability](https://en.wikipedia.org/wiki/Truth_and_Reconciliation_Commission_(South_Africa%29).\n",
     "\n",
     "A facial recognition system that can instantly classify people as \"untrustworthy\" based on markers of the distance of eye corners apart, the angle of eyebrows, or the degree of facial roundness, can not be said to be answering a question of \"will this person commit a crime\" unless there is specific research demonstrating beyond any doubt that such characteristics do indicate criminal behaviour. \n",
     "\n",
@@ -135,7 +135,7 @@
     "Unusually for a section on ethics, we're going to dive directly into probability in terms of mathematical definitions and code.\n",
     "\n",
     "<div class=\"alert alert-block alert-warning\">\n",
-    "Ethics grapples with considerations that are inherently inexact or which have ambiguous outcomes. It is essential that the words and concepts we use, and they way they are used, are well-defined, unambiguous and exact. Axiomatic probability gives us a language to explore ambiguous outcomes.\n",
+    "Ethics grapples with considerations that are inherently inexact or which have ambiguous outcomes. It is essential that the words and concepts we use, and the way they are used, are well-defined, unambiguous and exact. Axiomatic probability gives us a language to explore ambiguous outcomes.\n",
     "</div>\n",
     "\n",
     "Probability lends itself to simulation and a classic example is to roll a six-sided die multiple times. For this simulation, let $\\hat{p}_n$ be the proportion of outcomes that are $1$ after the first $n$ rolls:"
@@ -236,6 +236,11 @@
     "\n",
     "$$P(A)$$\n",
     "\n",
+    "<br>\n",
+    "<div class=\"well\">\n",
+    "    <p><b>There will be maths</b>: Any data-driven research process leads to computation for analysis. Formal mathematical notation, axioms and formulae are a concise language which can be converted into programmatic code. Understanding these derivations will not only improve your coding ability, but also permit you to more easily interrogate and review the research and work of others.</p>\n",
+    "</div>\n",
+    "\n",
     "#### Axioms of probability\n",
     "\n",
     "There are a number of axioms that form a foundation for probability and interpretation of these can aid in understanding experimental design, randomisation, outliers and statistical analysis \\cite{pishro-nik_introduction_2014}:\n",
@@ -484,7 +489,9 @@
     "\n",
     "#### Baye's Theorem and the Law of Total Probability\n",
     "\n",
-    "These definitions - the language of probability - are intended to support analysis, but also communication. When we use words with precise meanings, we're able to effectively and rapidly communicate ideas where complexity may obscure our thoughts.\n",
+    "<div class=\"alert alert-block alert-warning\">\n",
+    "These definitions - the language of probability - are intended to support analysis and communication. When we use words with precise meanings, we're able to effectively and rapidly communicate ideas where complexity may obscure our thoughts.\n",
+    "</div>\n",
     "\n",
     "Tree diagrams and step-by-step axiomatic derivations are useful while you are learning, but eventually you will work faster with less explanation. It is critical to remember these axioms, though, as researchers are often called upon to explain their work to non-specialists.\n",
     "\n",
@@ -1761,7 +1768,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "And now you see it ... There are 8 convicted felons in this chart, and the algorithm has identified about 20 of them. In reality, it missed what it was intended to look for almost entirely.\n",
+    "There are 8 convicted felons in this chart, and the algorithm has identified about 20 of them. In reality, it missed what it was intended to look for almost entirely. This is not a fault of the algorithm, but of a human research-driven decision to treat a test for ethnicity as if it is a test for trustworthiness.\n",
     "\n",
     "This is a useful illustration of fallacies in action, but even when research is on your side, prevalence can let you down. Let's demonstrate this with our synthetic population and our screening test for breast cancer. This time _green_ for true positive, _red_ for false positive, and _orange_ for a false negative.\n",
     "\n",
@@ -1946,7 +1953,7 @@
     "\n",
     "(<a id=\"cit-vu_introductory_2020\" href=\"#call-vu_introductory_2020\">Vu and Harrington, 2020</a>) Julie Vu and David Harrington, ``_Introductory Statistics for the Life and Biomedical Sciences_'', July 2020.  [online](https://www.openintro.org/book/biostat/)\n",
     "\n",
-    "(<a id=\"cit-chait_technical_2014\" href=\"#call-chait_technical_2014\">Chait, 2014</a>) Gavin Chait, ``Technical assessment of open data platforms for national statistical organisations'', World Bank Group, number: ,  December 2014.  [online](https://documents.worldbank.org/en/publication/documents-reports/documentdetail)\n",
+    "(<a id=\"cit-chait_technical_2014\" href=\"#call-chait_technical_2014\">Chait, 2014</a>) Gavin Chait, ``Technical assessment of open data platforms for national statistical organisations'', World Bank Group, number: ,  December 2014.  [online](https://openknowledge.worldbank.org/handle/10986/21111)\n",
     "\n",
     "(<a id=\"cit-anwar_francisella_2009\" href=\"#call-anwar_francisella_2009\">Anwar and Hunt, 2009</a>) Anwar Nadia and Hunt Ela, ``_Francisella tularensis novicida proteomic and transcriptomic data integration and annotation based on semantic web technologies_'', BMC Bioinformatics, vol. 10, number 10, pp. S3, October 2009.  [online](https://doi.org/10.1186/1471-2105-10-S10-S3)\n",
     "\n",
diff --git a/data-as-a-science.bib b/data-as-a-science.bib