From 02fce49415313411562e12d5d551c6b20e22baed Mon Sep 17 00:00:00 2001
From: David Mertz <mertz@gnosis.cx>
Date: Wed, 19 Jul 2023 10:11:18 -0400
Subject: [PATCH 1/2] Improve clarity in docs and make a few details explicit

Signed-off-by: David Mertz <mertz@gnosis.cx>
---
 docs/environment_setup.md                | 32 +++++++---
 docs/getting_started/package_register.md |  4 +-
 docs/index.md                            | 71 +++++++++++++---------
 examples/basics/basics/hello_world.py    | 17 +++---
 examples/basics/basics/task.py           | 75 ++++++++++++++++--------
 5 files changed, 130 insertions(+), 69 deletions(-)
diff --git a/docs/environment_setup.md b/docs/environment_setup.md
index 3b259788a..c42f2d56f 100644
--- a/docs/environment_setup.md
+++ b/docs/environment_setup.md
@@ -30,6 +30,10 @@ In this setup guide, let's run the `examples/basics` project.
 
 ```{prompt} bash
 git clone https://github.com/flyteorg/flytesnacks
+# ... or if your SSH key is registered on GitHub:
+# git clone git@github.com:flyteorg/flytesnacks.git
+# Or if you use the `gh` tool:
+# gh repo clone flyteorg/flytesnacks
 cd flytesnacks/examples/basics
 pip install -r requirements.txt
 ```
@@ -67,8 +71,9 @@ pyflyte run basics/hello_world.py my_wf
 ```
 
 :::{note}
-The first couple arguments of `pyflyte run` is in the form of `path/to/script.py <workflow_name>`, where
-`<workflow_name>` is the function decorated with `@workflow` that you want to run.
+The first two arguments to `pyflyte run` have the form
+`path/to/script.py <workflow_name>`, where `<workflow_name>` is the function
+decorated with `@workflow` that you want to run.
 :::
 
 To run the workflow on the demo Flyte cluster, all you need to do is supply the `--remote` flag:
@@ -103,7 +108,11 @@ option as `--arg-name`.
 
 ## Visualizing Workflows
 
-Workflows can be visualized as DAGs on the UI. However, you can visualize workflows on the browser and in the terminal by *just* using your terminal.
+Workflows can be visualized as DAGs in the UI. You can also generate a
+visualization from your terminal and have it open in your default web browser.
+This visualization uses the service at graph.flyte.org to render Graphviz
+diagrams, and hence shares your DAG (but not your data or code) with an
+outside party (security hint 🔐).
 
 To view workflow on the browser:
 
@@ -127,15 +136,20 @@ flytectl get workflows \
     basics.basic_workflow.my_wf
 ```
 
-Replace `<version>` with version from console UI, it may look something like `BLrGKJaYsW2ME1PaoirK1g==`
+Replace `<version>` with the base64-encoded version shown in the console UI,
+which looks something like `BLrGKJaYsW2ME1PaoirK1g==`.
 
 :::{tip}
-Running most of the examples in the **User Guide** only requires the default Docker image that ships with Flyte.
-Many examples in the {ref}`tutorials` and {ref}`integrations` section depend on additional libraries, `sklearn`,
-`pytorch`, or `tensorflow`, which will not work with the default docker image used by `pyflyte run`.
 
-These examples will explicitly show you which images to use for running these examples by passing in the docker
-image you want to use with the `--image` option in `pyflyte run`.
+Running most of the examples in the **User Guide** only requires the default
+Docker image that ships with Flyte. Many examples in the {ref}`tutorials` and
+{ref}`integrations` sections depend on additional libraries such as `sklearn`,
+`pytorch`, or `tensorflow`, and will not work with the default Docker image
+used by `pyflyte run`.
+
+Those examples explicitly show you which image to use. To run them, pass the
+Docker image you want to use to `pyflyte run` with the `--image`
+option.
 :::
 
 🎉 Congrats! Now you can run all the examples in the {ref}`userguide` 🎉
diff --git a/docs/getting_started/package_register.md b/docs/getting_started/package_register.md
index 7985060b3..3425990af 100644
--- a/docs/getting_started/package_register.md
+++ b/docs/getting_started/package_register.md
@@ -269,7 +269,7 @@ By default, the `docker_build.sh` script:
 - Uses the `PROJECT_NAME` specified in the `pyflyte init` command, which in
   this case is `my_project`.
 - Will not use any remote registry.
-- Uses the git sha to version your tasks and workflows.
+- Uses the git revision SHA1 to version your tasks and workflows.
 ```
 
 You can override the default values with the following flags:
@@ -367,7 +367,7 @@ Let's break down what each flag is doing here:
 - `--archive`: This argument allows you to pass in a package file, which in
   this case is `flyte-package.tgz`.
 - `--version`: This is a version string that can be any string, but we recommend
-  using the git sha in general, especially in production use cases.
+  using the git revision in general, especially in production use cases.
 
 ### Using `pyflyte register` versus `pyflyte package` + `flytectl register`
 
diff --git a/docs/index.md b/docs/index.md
index 71b43a7fd..f9e5cb288 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -33,8 +33,8 @@ on your local machine.
 :title: text-muted
 :animate: fade-in-slide-down
 
-The introduction below is also available on a hosted sandbox environment, where
-you can get started with Flyte without installing anything locally.
+Union.ai provides a hosted sandbox environment, free of charge, where you can
+get started with Flyte without installing anything locally.
 
 ```{link-button} https://sandbox.union.ai/
 ---
@@ -73,10 +73,10 @@ First install [flytekit](https://pypi.org/project/flytekit/), Flyte's Python SDK
 pip install flytekit flyekitplugins-deck-standard scikit-learn
 ```
 
-Then install [flytectl](https://docs.flyte.org/projects/flytectl/en/latest/),
+Next install [flytectl](https://docs.flyte.org/projects/flytectl/en/latest/),
 which the command-line interface for interacting with a Flyte backend.
 
-````{tabbed} Homebrew
+````{tabbed} Homebrew (macOS)
 
 ```{prompt} bash $
 brew install flyteorg/homebrew-tap/flytectl
@@ -84,7 +84,7 @@ brew install flyteorg/homebrew-tap/flytectl
 
 ````
 
-````{tabbed} Curl
+````{tabbed} Curl (Unix-like)
 
 ```{prompt} bash $
 curl -sL https://ctl.flyte.org/install | sudo bash -s -- -b /usr/local/bin
@@ -92,6 +92,15 @@ curl -sL https://ctl.flyte.org/install | sudo bash -s -- -b /usr/local/bin
 
 ````
 
+````{tabbed} Windows
+
+```{prompt} C:\>
+TODO
+```
+
+````
+
+
 ## Creating a Workflow
 
 The first workflow we'll create is a simple model training workflow that consists
@@ -99,13 +108,13 @@ of three steps that will:
 
 1. 🍷 Get the classic [wine dataset](https://scikit-learn.org/stable/datasets/toy_dataset.html#wine-recognition-dataset)
    using [sklearn](https://scikit-learn.org/stable/).
-2. 📊 Process the data that simplifies the 3-class prediction problem into a
-   binary classification problem by consolidating class labels `1` and `2` into
-   a single class.
-3. 🤖 Train a `LogisticRegression` model to learn a binary classifier.
+2. 📊 Process the data by simplifying its 3-class prediction problem into a binary
+   classification problem by consolidating class labels 1 and 2 into a single
+   class.
+3. 🤖 Train a `LogisticRegression` model to create a binary classifier.
 
-First, we'll define three tasks for each of these steps. Create a file called
-`example.py` and copy the following code into it.
+Let's define three tasks, one for each of these steps. Create a file called
+`example.py` and copy the following code into it.
 
 ```{code-cell} python
 :tags: [remove-output]
@@ -126,7 +135,9 @@ def get_data() -> pd.DataFrame:
 @task
 def process_data(data: pd.DataFrame) -> pd.DataFrame:
     """Simplify the task from a 3-class to a binary classification problem."""
-    return data.assign(target=lambda x: x["target"].where(x["target"] == 0, 1))
+    df = data.copy()
+    df.loc[df.target != 0, "target"] = 1  # merge classes 1 and 2
+    return df
 
 @task
 def train_model(data: pd.DataFrame, hyperparameters: dict) -> LogisticRegression:
@@ -139,10 +150,11 @@ def train_model(data: pd.DataFrame, hyperparameters: dict) -> LogisticRegression
 As we can see in the code snippet above, we defined three tasks as Python
 functions: `get_data`, `process_data`, and `train_model`.
 
-In Flyte, **tasks** are the most basic unit of compute and serve as the building
-blocks 🧱 for more complex applications. A task is a function that takes some
-inputs and produces an output. We can use these tasks to define a simple model
-training workflow:
+In Flyte, **tasks** are the most basic "unit of compute" (per Kubernetes
+jargon) and serve as the building blocks 🧱 for more complex applications.
+At its core, a task is simply a function: it takes inputs and produces an
+output. We can use these tasks to define a simple model training workflow:
+
 
 ```{code-cell} python
 @workflow
@@ -165,7 +177,7 @@ is typically written with inputs and outputs.
 A **workflow** is also defined as a Python function, and it specifies the flow
 of data between tasks and, more generally, the dependencies between tasks 🔀.
 
-::::{dropdown} {fa}`info-circle` The code above looks like Python, but what do `@task` and `@workflow` do exactly?
+::::{dropdown} {fa}`info-circle` This looks like typical Python, but what do `@task` and `@workflow` do?
 :title: text-muted
 :animate: fade-in-slide-down
 
@@ -173,7 +185,7 @@ Flyte `@task` and `@workflow` decorators are designed to work seamlessly with
 your code-base, provided that the *decorated function is at the top-level scope
 of the module*.
 
-This means that you can invoke tasks and workflows as regular Python methods and
+This means that you can invoke tasks and workflows as regular Python functions and
 even import and use them in other Python modules or scripts.
 
 :::{note}
@@ -202,16 +214,19 @@ pyflyte run example.py training_workflow \
 :animate: fade-in-slide-down
 
 If you're using Bash, you can ignore this 🙂
-You may need to add .local/bin to your PATH variable if it's not already set,
-as that's not automatically added for non-bourne shells like fish or xzsh.
-
-To use pyflyte, make sure to set the /.local/bin directory in PATH
+You may need to add `~/.local/bin` to your PATH variable if it's not already
+there; it may not get added automatically for non-Bourne shells. For example,
+if you use `fish` or `csh`, you can set this with:
 
 :::{code-block} fish
-set -gx PATH $PATH ~/.local/bin
+set -gx PATH $PATH ~/.local/bin  # fish
+:::
+
+:::{code-block} csh
+set path = ($path $HOME/.local/bin)
 :::
-:::::
 
+:::::
 
 
 :::::{dropdown} {fa}`info-circle` Why use `pyflyte run` rather than `python example.py`?
@@ -223,7 +238,9 @@ set -gx PATH $PATH ~/.local/bin
 
 Keyword arguments can be supplied to ``pyflyte run`` by passing in options in
 the format ``--kwarg value``, and in the case of ``snake_case_arg`` argument
-names, you can pass in options in the form of ``--snake-case-arg value``.
+names, you can optionally spell them in "kebab case", for example as
+``--snake-case-arg value``.
+
 
 ::::{note}
 If you want to run a workflow with `python example.py`, you would have to write
@@ -347,8 +364,8 @@ There are a few features about FlyteConsole worth pointing out in the GIF above:
 ## What's Next?
 
 Follow the rest of the sections in the documentation to get a better
-understanding of the key constructs that make Flyte such a powerful
-orchestration tool 💪.
+understanding of the key constructs that make Flyte a powerful orchestration
+tool 💪.
 
 ```{admonition} Recommendation
 :class: tip
diff --git a/examples/basics/basics/hello_world.py b/examples/basics/basics/hello_world.py
index 19e4cfac5..219f6da73 100644
--- a/examples/basics/basics/hello_world.py
+++ b/examples/basics/basics/hello_world.py
@@ -12,8 +12,9 @@
 from flytekit import task, workflow
 
 # %% [markdown]
-# You can change the signature of the workflow to take in an argument like this:
-
+# You can change the signature of the task to take in an argument like this:
+# def say_hello(name: str) -> str:
+#     return f"hello {name}"
 # %%
 @task
 def say_hello() -> str:
@@ -21,10 +22,12 @@ def say_hello() -> str:
 
 
 # %% [markdown]
-# You can treat the outputs of a task as you normally would a Python function. Assign the output to two variables
-# and use them in subsequent tasks as normal. See {py:func}`flytekit.workflow`
+# You can treat the outputs of a task as you normally would a Python function.
+# Assign the output to two variables and use them in subsequent tasks as normal.
+# See {py:func}`flytekit.workflow`.
 # You can change the signature of the workflow to take in an argument like this:
-
+# def my_wf(name: str) -> str:
+#     ...
 # %%
 @workflow
 def my_wf() -> str:
@@ -49,5 +52,5 @@ def my_wf() -> str:
 
 
 # %% [markdown]
-# In the next few examples you'll learn more about the core ideas of Flyte, which are tasks, workflows, and launch
-# plans.
+# In the next few examples you'll learn more about the core ideas of Flyte,
+# which are tasks, workflows, and launch plans.
diff --git a/examples/basics/basics/task.py b/examples/basics/basics/task.py
index d0f360988..edfe36f0b 100644
--- a/examples/basics/basics/task.py
+++ b/examples/basics/basics/task.py
@@ -7,9 +7,10 @@
 # .. tags:: Basic
 # ```
 #
-# Task is a fundamental building block and an extension point of Flyte, which encapsulates the users' code. They possess the following properties:
+# A task is a fundamental building block and an extension point of Flyte that
+# encapsulates the user's code. Tasks possess the following properties:
 #
-# 1. Versioned (usually tied to the `git sha`)
+# 1. Versioned (usually tied to the `git revision`)
 # 2. Strong interfaces (specified inputs and outputs)
 # 3. Declarative
 # 4. Independently executable
@@ -17,12 +18,17 @@
 #
 # A task in Flytekit can be of two types:
 #
-# 1. A task that has a Python function associated with it. The execution of the task is equivalent to the execution of this function.
-# 2. A task that doesn't have a Python function, e.g., an SQL query or any portable task like Sagemaker prebuilt algorithms, or a service that invokes an API.
+# 1. A task that has a Python function associated with it. The execution of the
+#    task is equivalent to the execution of this function.
+# 2. A task that doesn't have a Python function, e.g., an SQL query or any
+#    portable task like Sagemaker prebuilt algorithms, or a service that
+#    invokes an API.
 #
-# Flyte provides multiple plugins for tasks, which can be a backend plugin as well ([Athena](https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-aws-athena/flytekitplugins/athena/task.py)).
+# Multiple plugins for tasks, including backend plugins, are available in Flyte.
+# See, for example, [Athena](https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-aws-athena/flytekitplugins/athena/task.py).
 #
-# In this example, you will learn how to write and execute a `Python function task`. Other types of tasks will be covered in the later sections.
+# In this example, you will learn how to write and execute a `Python function task`.
+# Other types of tasks are covered in the later sections.
 # %% [markdown]
 # For any task in Flyte, there is one necessary import, which is:
 # %%
@@ -35,13 +41,19 @@
 from sklearn.model_selection import train_test_split
 
 # %% [markdown]
-# The use of the {py:func}`flytekit.task` decorator is mandatory for a ``PythonFunctionTask``.
-# A task is essentially a regular Python function, with the exception that all inputs and outputs must be clearly annotated with their types.
-# These types are standard Python types, which will be further explained in the {ref}`type-system section <flytekit_to_flyte_type_mapping>`.
+# The use of the {py:func}`flytekit.task` decorator is mandatory for
+# a ``PythonFunctionTask``.
+# A task is a regular Python function, but with a requirement that
+# all inputs and outputs must be clearly annotated with their types.
+# These types are standard Python types, which are further explained
+# in the {ref}`type-system section <flytekit_to_flyte_type_mapping>`.
+
 
 # %%
 @task
-def train_model(hyperparameters: dict, test_size: float, random_state: int) -> LogisticRegression:
+def train_model(
+    hyperparameters: dict, test_size: float, random_state: int
+) -> LogisticRegression:
     """
     Parameters:
         hyperparameters (dict): A dictionary containing the hyperparameters for the model.
@@ -55,7 +67,9 @@ def train_model(hyperparameters: dict, test_size: float, random_state: int) -> L
     iris = load_iris()
 
     # Splitting the data into train and test sets
-    X_train, _, y_train, _ = train_test_split(iris.data, iris.target, test_size=test_size, random_state=random_state)
+    X_train, _, y_train, _ = train_test_split(
+        iris.data, iris.target, test_size=test_size, random_state=random_state
+    )
 
     # Creating and training the logistic regression model with the given hyperparameters
     clf = LogisticRegression(**hyperparameters)
@@ -74,7 +88,9 @@ def train_model(hyperparameters: dict, test_size: float, random_state: int) -> L
 # You can execute a Flyte task as any normal function.
 # %%
 if __name__ == "__main__":
-    print(train_model(hyperparameters={"C": 0.1}, test_size=0.2, random_state=42))
+    print(
+        train_model(hyperparameters={"C": 0.1}, test_size=0.2, random_state=42)
+    )
 
 # %% [markdown]
 # ## Invoke a Task within a Workflow
@@ -87,44 +103,55 @@ def train_model(hyperparameters: dict, test_size: float, random_state: int) -> L
 
 @workflow
 def train_model_wf(
-    hyperparameters: dict = {"C": 0.1}, test_size: float = 0.2, random_state: int = 42
+    hyperparameters: dict = {"C": 0.1},
+    test_size: float = 0.2,
+    random_state: int = 42,
 ) -> LogisticRegression:
     """
-    This workflow invokes the train_model task with the given hyperparameters, test size and random state.
+    This workflow invokes the train_model task with the given hyperparameters,
+    test size and random state.
     """
-    return train_model(hyperparameters=hyperparameters, test_size=test_size, random_state=random_state)
+    return train_model(
+        hyperparameters=hyperparameters,
+        test_size=test_size,
+        random_state=random_state,
+    )
 
 
 # %% [markdown]
 # ```{note}
-# When invoking the `train_model` task, you need to use keyword arguments to specify the values for the corresponding parameters.
+# When invoking the `train_model` task, you need to use keyword arguments to
+# specify the values for the corresponding parameters.
 # ````
 #
 # ## Use `partial` to provide default arguments to tasks
 #
-# You can use the {py:func}`functools.partial` function to assign default or constant values to the parameters of your tasks.
+# You can use the {py:func}`functools.partial` function to assign default or
+# constant values to the parameters of your tasks.
 
 # %%
 import functools
 
 
 @workflow
-def train_model_wf_with_partial(test_size: float = 0.2, random_state: int = 42) -> LogisticRegression:
+def train_model_wf_with_partial(
+    test_size: float = 0.2, random_state: int = 42
+) -> LogisticRegression:
     partial_task = functools.partial(train_model, hyperparameters={"C": 0.1})
     return partial_task(test_size=test_size, random_state=random_state)
 
 
-# %% [markdown]
-# In this toy example, we're calling the `square` task twice and returning the result.
-
 # %% [markdown]
 # (single_task_execution)=
 #
 # :::{dropdown} Execute a single task *without* a workflow
 #
-# While workflows are typically composed of multiple tasks with dependencies defined by shared inputs and outputs,
-# there are cases where it can be beneficial to execute a single task in isolation during the process of developing and iterating on its logic.
-# Writing a new workflow definition every time for this purpose can be cumbersome, but executing a single task without a workflow provides a convenient way to iterate on task logic easily.
+# While workflows are typically composed of multiple tasks with dependencies
+# defined by shared inputs and outputs, there are cases where it can be beneficial
+# to execute a single task in isolation during the process of developing and
+# iterating on its logic. Writing a new workflow definition every time for this
+# purpose can be cumbersome, but executing a single task without a workflow
+# provides a convenient way to iterate on task logic easily.
 #
 # To run a task without a workflow, use the following command:
 #

From b0bd57297bf8152b19b7e322e1e09b8e80349419 Mon Sep 17 00:00:00 2001
From: David Q Mertz <mertz@gnosis.cx>
Date: Tue, 25 Jul 2023 15:55:46 -0400
Subject: [PATCH 2/2] Update docs/environment_setup.md

Co-authored-by: Niels Bantilan <niels.bantilan@gmail.com>
Signed-off-by: David Q Mertz <mertz@gnosis.cx>
---
 docs/environment_setup.md | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/docs/environment_setup.md b/docs/environment_setup.md
index c42f2d56f..cb5e3c134 100644
--- a/docs/environment_setup.md
+++ b/docs/environment_setup.md
@@ -30,10 +30,12 @@ In this setup guide, let's run the `examples/basics` project.
 
 ```{prompt} bash
 git clone https://github.com/flyteorg/flytesnacks
-# ... or if your SSH key is registered on GitHub:
-# git clone git@github.com:flyteorg/flytesnacks.git
-# Or if you use the `gh` tool:
-# gh repo clone flyteorg/flytesnacks
+
+# or if your SSH key is registered on GitHub:
+git clone git@github.com:flyteorg/flytesnacks.git
+
+# or if you use the `gh` tool:
+gh repo clone flyteorg/flytesnacks
 cd flytesnacks/examples/basics
 pip install -r requirements.txt
 ```