diff --git a/21-Reinforcement-Learning/index.html b/21-Reinforcement-Learning/index.html
index 86d21c5..994c75a 100644
--- a/21-Reinforcement-Learning/index.html
+++ b/21-Reinforcement-Learning/index.html
@@ -6,24 +6,24 @@
- + - + - + - + - +
@@ -35,8 +35,12 @@

Reinforcement Learning

+

Table of contents

+

Policy Evaluation

- + - +
diff --git a/3-Solving-Problems-By-Searching/index.html b/3-Solving-Problems-By-Searching/index.html
index c70abc6..60ace82 100644
--- a/3-Solving-Problems-By-Searching/index.html
+++ b/3-Solving-Problems-By-Searching/index.html
@@ -42,8 +42,20 @@

Solving Problems By Searching

- -

Node Expansion

+

Table of contents

+ +

Node Expansion

To search through a graph, a Search Agent needs to expand nodes. The nodes which can be expanded by the Agent together form the frontier. Expanding a node refers to marking the node as 'expanded' or 'visited' and adding its immediate neighbors to the frontier. @@ -82,7 +94,7 @@

Frontier Nodes

-

Getting in the shoes of a Search Agent

+

Getting in the shoes of a Search Agent

Let's see the perspective of a Search Agent as it searches through the graph.
Remember, the Agent can only see the nodes which are either expanded or currently present in the frontier. @@ -104,7 +116,7 @@

Getting in the shoes of a Search Agent

-

Breadth First Search

+

In Breadth First Search, the node which was discovered the earliest is expanded next i.e. the node which joined the frontier earlier, is expanded earlier.

To achieve this, Breadth First Search Algorithm uses a FIFO(First In First Out) Queue. The following graph is explored by a Breadth First Search Algorithm with 'A' as the initial node. @@ -140,7 +152,7 @@
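The FIFO behaviour described above can be sketched in Python. This is a minimal illustration rather than the code behind the diagram: the function name `bfs_order` and the small `GRAPH` adjacency list are assumptions made up for the example.

```python
from collections import deque

def bfs_order(graph, start):
    """Return the nodes in the order Breadth First Search expands them."""
    frontier = deque([start])      # FIFO queue: earliest discovered node leaves first
    discovered = {start}
    order = []
    while frontier:
        node = frontier.popleft()  # expand the node that joined the frontier earliest
        order.append(node)
        for neighbor in graph[node]:
            if neighbor not in discovered:
                discovered.add(neighbor)
                frontier.append(neighbor)
    return order

# Hypothetical undirected graph, written as an adjacency list.
GRAPH = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C"]}

print(bfs_order(GRAPH, "A"))  # ['A', 'B', 'C', 'D']
```

Starting from 'A', both of its neighbors are expanded before 'D', which was discovered later.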

FIFO Queue

-

Depth First Search

+

In Depth First Search, the node which was discovered the latest is expanded next i.e. the node which joined the frontier later, is expanded earlier.

To achieve this, Depth First Search Algorithm uses a LIFO (Last In First Out) queue, i.e. a stack. The following graph is explored by a Depth First Search Algorithm with 'A' as the initial node. @@ -177,7 +189,7 @@
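A minimal Python sketch of the LIFO behaviour, using a plain list as the stack. The function name `dfs_order` and the `GRAPH` below are assumptions for illustration and are not taken from the diagram.

```python
def dfs_order(graph, start):
    """Return the nodes in the order Depth First Search expands them."""
    frontier = [start]            # list used as a LIFO stack
    expanded = set()
    order = []
    while frontier:
        node = frontier.pop()     # expand the node that joined the frontier latest
        if node in expanded:
            continue              # the same node may have been pushed twice
        expanded.add(node)
        order.append(node)
        for neighbor in graph[node]:
            if neighbor not in expanded:
                frontier.append(neighbor)
    return order

# Hypothetical directed graph, written as an adjacency list.
GRAPH = {"A": ["B", "C"], "B": ["D"], "C": ["E"], "D": [], "E": []}

print(dfs_order(GRAPH, "A"))  # ['A', 'C', 'E', 'B', 'D']
```

'C' is pushed after 'B', so it is popped first, and its whole branch is explored before 'B' is touched.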

LIFO Queue

-

Step Costs

+

Step Costs

Until now, all the edges in our graph had the same cost. (That's why we didn't bother to mention the cost on the graph.) For those kinds of graphs, Breadth First Search is optimal because it always pops the shallowest node first. For the case when some step costs are involved when exploring the nodes, the BFS algorithm needs to be extended. @@ -231,7 +243,7 @@

Uniform Cost Search (Extension of BFS)
-

Uniform Cost Search

+

For Uniform Cost Search, instead of using a simple FIFO queue, a priority queue is used where the cost of reaching a node from the initial node is considered as its priority. On each iteration, the node with the smallest cost is extracted from the frontier for expansion. @@ -290,7 +302,7 @@
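The priority queue idea can be sketched with Python's `heapq` module. This is a sketch under assumptions: the function name `uniform_cost_search`, the weighted `GRAPH` and the choice to store whole paths in the queue are made up for the example, and all step costs are assumed to be positive.

```python
import heapq

def uniform_cost_search(graph, start, goal):
    """Return (cost, path) of a cheapest path from start to goal, or None."""
    frontier = [(0, start, [start])]          # priority = path cost from the start
    best_cost = {start: 0}
    while frontier:
        cost, node, path = heapq.heappop(frontier)   # smallest cost first
        if node == goal:
            return cost, path
        if cost > best_cost.get(node, float("inf")):
            continue                          # stale entry, a cheaper one was found
        for neighbor, step_cost in graph[node].items():
            new_cost = cost + step_cost
            if new_cost < best_cost.get(neighbor, float("inf")):
                best_cost[neighbor] = new_cost
                heapq.heappush(frontier, (new_cost, neighbor, path + [neighbor]))
    return None

# Hypothetical graph: GRAPH[u][v] is the step cost of the edge u -> v.
GRAPH = {"A": {"B": 1, "C": 5}, "B": {"C": 1, "D": 7}, "C": {"D": 2}, "D": {}}

print(uniform_cost_search(GRAPH, "A", "D"))  # (4, ['A', 'B', 'C', 'D'])
```

Plain BFS would reach 'D' through the two-edge route A, B, D with cost 8, while the priority queue finds the three-edge route with cost 4.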

costs <= 0
-

Depth Limited Search

+

The Depth Limited Search is the same as Depth First Search except that there is an upper limit to the depth of the nodes which the algorithm traverses. The nodes which have depths greater than this limit are not expanded by the Depth Limited Search.
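A minimal recursive sketch of the depth limit in Python. The function name `depth_limited_search` and the `TREE` below are assumptions for illustration.

```python
def depth_limited_search(graph, node, goal, limit):
    """Return a path from node to goal that uses at most `limit` edges, else None."""
    if node == goal:
        return [node]
    if limit == 0:
        return None               # nodes below the depth limit are never expanded
    for child in graph[node]:
        path = depth_limited_search(graph, child, goal, limit - 1)
        if path is not None:
            return [node] + path
    return None

# Hypothetical tree, written as an adjacency list.
TREE = {"A": ["B", "C"], "B": ["D"], "C": ["E"], "D": [], "E": []}

print(depth_limited_search(TREE, "A", "E", 1))  # None
print(depth_limited_search(TREE, "A", "E", 2))  # ['A', 'C', 'E']
```

With a limit of 1 the goal 'E' at depth 2 is out of reach, so the search fails even though a path exists.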

@@ -331,7 +343,7 @@

Depth Limit :

-

Iterative Deepening Depth-First Search

+

Iterative Deepening Depth-First Search

Iterative Deepening Depth-First Search is a general strategy that is used to find the best depth limit. It does this by applying Depth Limited Search to the given problem with increasing depth limit. (0, 1, 2, 3 and so on.)
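The loop over increasing depth limits can be sketched as follows. This is an illustrative sketch: the function name `iterative_deepening_search`, the `max_limit` safety cap and the `TREE` are assumptions that do not come from the page.

```python
def iterative_deepening_search(graph, start, goal, max_limit=20):
    """Try depth limits 0, 1, 2, ... and return (limit, path) for the first success."""

    def limited(node, limit):
        # Depth limited search: give up once the remaining limit reaches zero.
        if node == goal:
            return [node]
        if limit == 0:
            return None
        for child in graph[node]:
            path = limited(child, limit - 1)
            if path is not None:
                return [node] + path
        return None

    for limit in range(max_limit + 1):
        path = limited(start, limit)
        if path is not None:
            return limit, path
    return None                   # no solution within max_limit

# Hypothetical tree, written as an adjacency list.
TREE = {"A": ["B", "C"], "B": ["D"], "C": ["E"], "D": ["F"], "E": [], "F": []}

print(iterative_deepening_search(TREE, "A", "F"))  # (3, ['A', 'B', 'D', 'F'])
```

The first limit that succeeds equals the depth of the shallowest goal, which is the depth limit the strategy is looking for.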

@@ -370,7 +382,7 @@

Depth Limit :

-

Bi-directional BFS

+

Bi-directional BFS

In bi-directional BFS, we run two simultaneous searches, one from the initial state and one from the goal state. We stop when these searches meet in the middle (i.e., a node is explored by both searches).

The motivation behind this is that b^(d/2) + b^(d/2) is smaller than b^d
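To get a feel for the size of the gap, here is a small Python calculation. It assumes that b is the branching factor and d is the depth of the solution, that d is even, and that only the deepest layer of each search is counted. The function names are made up for the example.

```python
def standard_bfs_nodes(b, d):
    """Rough node count for one search that goes all the way to depth d."""
    return b ** d

def bidirectional_bfs_nodes(b, d):
    """Rough node count for two searches that each go to depth d/2 (d assumed even)."""
    half = d // 2
    return 2 * b ** half

print(standard_bfs_nodes(10, 6))       # 1000000
print(bidirectional_bfs_nodes(10, 6))  # 2000
```

With b = 10 and d = 6, the two half-depth searches generate roughly 500 times fewer nodes than the single full-depth search.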

In the diagram below, we compare the performance between bi-directional BFS and standard BFS side by side. The total number of nodes generated for each strategy is given below the diagram.

@@ -394,7 +406,7 @@

-

A Star Search

+

Click on the canvas to restart the simulation

diff --git a/4-Beyond-Classical-Search/index.html b/4-Beyond-Classical-Search/index.html
index d272877..f857c4b 100644
--- a/4-Beyond-Classical-Search/index.html
+++ b/4-Beyond-Classical-Search/index.html
@@ -33,8 +33,21 @@

Beyond classical search

- -

Optimization Problem

+

Table of contents

+ +

Optimization Problem

In many optimization problems, the path to the solution is irrelevant. In pure optimization problems, the best state is defined by the objective function. To represent such problems, a state-space landscape is used. It has a location (state) represented by the x-axis and an elevation (objective function value) represented by the y-axis. The best state is hence the state with the highest objective value.

The given diagram is a state-space representation of an objective function. You can click anywhere inside the box to reveal the elevation there. You are allowed 25 moves to find the highest peak before the hill is revealed.

@@ -51,7 +64,7 @@

-

Hill Climbing Search

+

Hill Climbing Search

In hill climbing search, the current node is replaced by the best neighbor. In this case, the objective function is represented by elevation, neighbors of a state are the states to the left and right of it and the best neighbor is the neighbor state with the highest elevation.
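A minimal Python sketch of this procedure on a one-dimensional landscape. The function name `hill_climb` and the `LANDSCAPE` elevations are assumptions for illustration, and the sketch stops as soon as no neighbor is strictly higher than the current state.

```python
def hill_climb(elevation, start):
    """Climb from index `start` and return the index where the climb stops."""
    current = start
    while True:
        # Neighbors are the states directly to the left and right, if they exist.
        neighbors = [i for i in (current - 1, current + 1) if 0 <= i < len(elevation)]
        if not neighbors:
            return current
        best = max(neighbors, key=lambda i: elevation[i])
        if elevation[best] <= elevation[current]:
            return current        # no neighbor is higher: a peak (possibly only local)
        current = best

# Hypothetical elevations: a local peak at index 2 and the global peak at index 6.
LANDSCAPE = [1, 3, 5, 4, 2, 6, 9, 7]

print(hill_climb(LANDSCAPE, 0))  # 2
print(hill_climb(LANDSCAPE, 4))  # 6
```

Starting at index 0 the climb gets stuck on the local peak, while starting at index 4 it reaches the global one, so the outcome depends on the starting state.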

The represents global maxima and represents the states from where the hill climbing search can reach a global maximum.

@@ -65,7 +78,7 @@

Hill Climbing Search

-

Simulated Annealing

+

Simulated Annealing

Simulated Annealing is a combination of Hill Climbing and Random Walk to gain more efficiency and completeness. In this procedure, instead of always moving to the best neighbor, a random neighbor is chosen. If the new state has a better objective value than the current state, it is always chosen. If not, the algorithm accepts the new state with a probability less than one. The probability of choosing a bad state depends on :

    @@ -88,16 +101,16 @@

    Simulated Annealing

    -

    Genetic Algorithm

    +

    Genetic Algorithm

    Little critters change the color of their fur to match the background to camouflage themselves from predators.

    Click on the canvas to generate the next generation. Keep clicking to generate another progeny.

    Note: Single point crossover might not be suitable for all applications
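For readers unfamiliar with the term, single point crossover can be sketched in a few lines of Python. The function name `single_point_crossover` and the idea of encoding a critter's fur color as an (R, G, B) tuple are assumptions made for this example and may not match the simulation.

```python
def single_point_crossover(parent_a, parent_b, point):
    """Swap the tails of two equal-length gene sequences at index `point`."""
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b

# Hypothetical genes: fur colors encoded as (R, G, B) tuples.
red_critter = (255, 0, 0)
blue_critter = (0, 0, 255)

print(single_point_crossover(red_critter, blue_critter, 1))
# ((255, 0, 255), (0, 0, 0))
```

Each child keeps the genes before the crossover point from one parent and the rest from the other, so genes that sit far apart in the sequence are rarely inherited together.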

    -

    Searching with non-deterministic actions

    +

    Searching with non-deterministic actions

    In a world with non-deterministic actions, the result of an action is not known with complete certainty.

    -

    The erratic vacuum world

    +

    The erratic vacuum world

    The erratic vacuum world is an extension of vacuum world from Chapter 2. In this world, the behavior of the cleaner is non-deterministic.

    To define this behavior more formally, in this world, the Suck action works as follows

      @@ -153,8 +166,9 @@
      Moves
      -

      Searching with Partial Observations

      Now we come back to a world where the actions of the robot are deterministic again (no erratic behavior like before) but, the robot no longer has complete sense of its current state or its environment. -

      Vacuum World with no observation

      +

      Searching with Partial Observations

      +

      Now we come back to a world where the actions of the robot are deterministic again (no erratic behavior like before), but the robot no longer has a complete sense of its current state or its environment.

      +

      Vacuum World with no observation

      In this world, the vacuum cleaner has no idea initially about its own location and the location of dirt in the world. Since the robot has no percepts, it should be able to figure out a sequence of actions that will work regardless of its current state.

      Given below are 8 random initial states. You can record a sequence of actions and see it in action just like before. Assume that illegal moves (like moving right in the right-most tile) have no effect on the world.

      Try to find a sequence of actions that will lead to a final state (Clean all the dirt), no matter what the initial state of the world is.
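One way to find such a sequence automatically is to search over sets of possible states (belief states) instead of single states. The Python sketch below assumes a two-tile world, which has 8 possible states, with actions named `Left`, `Right` and `Suck`. The names `step` and `conformant_plan` are made up for the example, and the page's simulation may be set up differently.

```python
from collections import deque
from itertools import product

ACTIONS = ("Left", "Right", "Suck")

def step(state, action):
    """Apply an action to a (location, dirt) state; illegal moves have no effect."""
    loc, dirt = state
    if action == "Left":
        return (max(loc - 1, 0), dirt)
    if action == "Right":
        return (min(loc + 1, len(dirt) - 1), dirt)
    # Suck: remove the dirt on the tile the robot is standing on.
    return (loc, tuple(False if i == loc else d for i, d in enumerate(dirt)))

def conformant_plan(n_tiles=2):
    """Breadth first search over belief states for a plan that works from any start."""
    everything = frozenset(
        (loc, dirt)
        for loc in range(n_tiles)
        for dirt in product((False, True), repeat=n_tiles)
    )
    frontier = deque([(everything, [])])
    seen = {everything}
    while frontier:
        belief, plan = frontier.popleft()
        if all(not any(dirt) for _, dirt in belief):
            return plan           # every state the robot might be in is clean
        for action in ACTIONS:
            next_belief = frozenset(step(s, action) for s in belief)
            if next_belief not in seen:
                seen.add(next_belief)
                frontier.append((next_belief, plan + [action]))
    return None

print(conformant_plan())
```

For the two-tile world the search returns a four-action plan: move to one end, suck, move to the other end, suck. Each action shrinks the set of states the robot could be in until only clean states remain.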

      @@ -201,18 +215,18 @@
      Moves
-

And-Or-Graph-Search

+
-

Online DFS Agent

+

Online DFS Agent

Click to reset. Green tile is destination.

-

LRTA*-Agent

+

LRTA*-Agent

diff --git a/5-Adversarial-Search/index.html b/5-Adversarial-Search/index.html
index 3bb9c92..1f7a6a5 100644
--- a/5-Adversarial-Search/index.html
+++ b/5-Adversarial-Search/index.html
@@ -5,7 +5,7 @@
- +
@@ -17,15 +17,19 @@

Adversarial Search

- -

Minimax

+

Table of contents

+ +

Minimax

Click on the screen to restart the simulation.


 
-      

Alpha Beta Pruning

+

Alpha Beta Pruning

Click on the screen to restart the simulation.

diff --git a/6-Constraint-Satisfaction-Problems/index.html b/6-Constraint-Satisfaction-Problems/index.html
index 8a4ef5b..dc10e39 100644
--- a/6-Constraint-Satisfaction-Problems/index.html
+++ b/6-Constraint-Satisfaction-Problems/index.html
@@ -31,7 +31,16 @@

Constraint Satisfaction Problems

-

Defining CSP with Map Coloring Problem

+

Table of contents

+ +

Defining CSP with Map Coloring Problem

A map coloring problem is a type of CSP where each state can be assigned a color from the set {red, green, blue}. The constraint involved says that no two neighbouring states are allowed to have the same color.

@@ -54,7 +63,7 @@

Defining CSP with Map Coloring Problem

-

Arc consistency

+

Arc consistency

  • Y = X + 1
  • @@ -67,7 +76,7 @@

    Arc consistency

    -

    Sudoku Example of CSP

    +

    Sudoku Example of CSP

    All sudoku puzzles can be formulated as CSPs by considering each cell as a variable. The initial domain of all cells is {1,2,3,4,5,6,7,8,9}.
    The constraints are formulated by the fact that in the solution of a sudoku puzzle, no two cells in a row, column or block can have identical numbers. Thus, there is an AllDiff( ) constraint for each row, column and block.
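The 27 AllDiff constraints can be written down explicitly. The Python sketch below is illustrative only: the names `sudoku_units` and `all_diff`, and the use of `None` for an unassigned cell, are assumptions made for the example.

```python
def sudoku_units():
    """Return the 27 groups of cells (9 rows, 9 columns, 9 blocks) as (row, col) lists."""
    rows = [[(r, c) for c in range(9)] for r in range(9)]
    cols = [[(r, c) for r in range(9)] for c in range(9)]
    blocks = [
        [(br + r, bc + c) for r in range(3) for c in range(3)]
        for br in (0, 3, 6)
        for bc in (0, 3, 6)
    ]
    return rows + cols + blocks

def all_diff(values):
    """True if no two assigned values are equal; None marks an unassigned cell."""
    assigned = [v for v in values if v is not None]
    return len(assigned) == len(set(assigned))

print(len(sudoku_units()))             # 27
print(all_diff([5, 3, None, 7]))       # True
print(all_diff([5, 3, None, 5]))       # False
```

A grid is consistent when `all_diff` holds for the values in every one of the 27 units.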

    @@ -99,7 +108,7 @@

    Sudoku Example of CSP

-

Backtracking Search

+

Backtracking Search is a type of Search algorithm that is useful to solve CSPs. It interweaves inferences and search to arrive at a solution by pruning parts of the search tree.

It repeatedly chooses an unassigned variable and tries all the possible values for it. If any inconsistency is detected, it backtracks and tries another value for the previous assignment.

In the diagram below, the order in which variables are selected, as well as the order of values for each variable, is chosen randomly. Use the Next button to move forward and the Previous button to go back. You can also use the slider to go to any state. @@ -132,13 +141,13 @@
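A minimal Python sketch of backtracking search on a map coloring problem. It is a sketch under assumptions: it uses the map of Australia as the example map, picks variables and values in a fixed order rather than randomly, and only checks consistency with already assigned neighbors, without extra inference. The name `backtracking_search` is made up for the example.

```python
def backtracking_search(variables, domains, neighbors, assignment=None):
    """Return a complete consistent assignment as a dict, or None if there is none."""
    if assignment is None:
        assignment = {}
    if len(assignment) == len(variables):
        return assignment
    var = next(v for v in variables if v not in assignment)   # next unassigned variable
    for value in domains[var]:
        # Consistent if no already assigned neighbor has the same color.
        if all(assignment.get(n) != value for n in neighbors[var]):
            assignment[var] = value
            result = backtracking_search(variables, domains, neighbors, assignment)
            if result is not None:
                return result
            del assignment[var]   # inconsistency further down: backtrack, try next value
    return None

# Assumed example map: the states and territories of Australia.
VARIABLES = ["WA", "NT", "SA", "Q", "NSW", "V", "T"]
DOMAINS = {v: ["red", "green", "blue"] for v in VARIABLES}
NEIGHBORS = {
    "WA": ["NT", "SA"],
    "NT": ["WA", "SA", "Q"],
    "SA": ["WA", "NT", "Q", "NSW", "V"],
    "Q": ["NT", "SA", "NSW"],
    "NSW": ["Q", "SA", "V"],
    "V": ["SA", "NSW"],
    "T": [],
}

print(backtracking_search(VARIABLES, DOMAINS, NEIGHBORS))
```

Restricting every domain to two colors makes the search exhaust all values and return `None`, which is the backtracking path in action.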

Assignments :
-

Min Conflicts

+

Min Conflicts

Min conflicts stuff


 
-      

Tree CSP

+

Tree CSP

Tree CSP Stuff