diff --git a/algorithms/algo_bfs.md b/algorithms/algo_bfs.md new file mode 100644 index 00000000..556e2de3 --- /dev/null +++ b/algorithms/algo_bfs.md @@ -0,0 +1,100 @@ +--- +title: "BFS" +description: "BFS" +--- + +# BFS + +## Overview + +The Breadth-First Search (BFS) procedure allows you to perform a breadth-first traversal of a graph starting from a specific node. +BFS explores all the nodes at the present depth before moving on to nodes at the next depth level. +This is particularly useful for finding the shortest path between two nodes or exploring a graph layer by layer. + +## Syntax + +``` +CALL algo.bfs(start_node, max_depth, relationship) +YIELD nodes, edges +``` + +## Arguments + +| Name | Type | Description | Default | +|--------------|----------------|-----------------------------------------------------------------------------|------------| +| start_node | Node | Starting node for the BFS traversal | (Required) | +| max_depth | Integer | Maximum depth to traverse | (Required) | +| relationship | String or null | The relationship type to traverse. If null, all relationship types are used | null | + +## Returns + +| Name | Type | Description | +|-------|------|----------------------------------------------| +| nodes | List | List of visited nodes in breadth-first order | +| edges | List | List of edges traversed during the BFS | + +## Examples + +### Basic BFS Traversal + +This example demonstrates a basic BFS traversal starting from a person node. + + +### Social Network Friend Recommendations + +This example demonstrates how to use BFS to find potential friend recommendations in a social network. 
+ +#### Setup the Graph + +``` +// Create Person nodes representing users in a social network +CREATE (alice:Person {name: 'Alice', age: 28, city: 'New York'}) +CREATE (bob:Person {name: 'Bob', age: 32, city: 'Boston'}) +CREATE (charlie:Person {name: 'Charlie', age: 35, city: 'Chicago'}) +CREATE (david:Person {name: 'David', age: 29, city: 'Denver'}) +CREATE (eve:Person {name: 'Eve', age: 31, city: 'San Francisco'}) +CREATE (frank:Person {name: 'Frank', age: 27, city: 'Miami'}) + +// Create FRIEND relationships +CREATE (alice)-[:FRIEND]->(bob) +CREATE (alice)-[:FRIEND]->(charlie) +CREATE (bob)-[:FRIEND]->(david) +CREATE (charlie)-[:FRIEND]->(eve) +CREATE (david)-[:FRIEND]->(frank) +CREATE (eve)-[:FRIEND]->(frank) +``` + +#### Find Friends of Friends (Potential Recommendations) + +``` +// Find Alice's friends-of-friends (potential recommendations) +MATCH (alice:Person {name: 'Alice'}) +CALL algo.bfs(alice, 2, 'FRIEND') +YIELD nodes + +// Process results to get only depth 2 connections (friends of friends) +WHERE size(nodes) >= 3 +WITH alice, nodes[2] AS potential_friend +WHERE NOT (alice)-[:FRIEND]->(potential_friend) +RETURN potential_friend +``` + +In this social network example, the BFS algorithm helps find potential friend recommendations by identifying people who are connected to Alice's existing friends but not directly connected to Alice yet. 
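To make the layer-by-layer behavior concrete, here is a minimal Python sketch of a depth-limited BFS over the FRIEND relationships created in the setup above. It is an illustration only: the `FRIENDS` adjacency dict and the `bfs_levels` helper are hypothetical names, and the sketch makes no claim about how `algo.bfs` is implemented internally.

```python
from collections import deque

# Directed FRIEND edges from the setup above, as a plain adjacency dict
# (hypothetical in-memory stand-in for the graph).
FRIENDS = {
    "Alice": ["Bob", "Charlie"],
    "Bob": ["David"],
    "Charlie": ["Eve"],
    "David": ["Frank"],
    "Eve": ["Frank"],
    "Frank": [],
}


def bfs_levels(graph, start, max_depth):
    """Return {node: depth} for every node within max_depth hops of start."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if depth[node] == max_depth:
            continue  # do not expand past the depth limit
        for neighbor in graph.get(node, []):
            if neighbor not in depth:  # first visit is the shortest distance
                depth[neighbor] = depth[node] + 1
                queue.append(neighbor)
    return depth


levels = bfs_levels(FRIENDS, "Alice", 2)
friends_of_friends = sorted(n for n, d in levels.items() if d == 2)
print(friends_of_friends)  # ['David', 'Eve']
```

Tracking the depth of each visited node, rather than relying on a position in the visit order, is one way to isolate exactly the friends-of-friends layer.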
+ + +## Performance Considerations + +- **Indexing:** Ensure properties used for finding your starting node are indexed for optimal performance +- **Maximum Depth:** Choose an appropriate max_depth value based on your graph's connectivity; large depths in highly connected graphs can result in exponential growth of traversed nodes +- **Relationship Filtering:** When applicable, specify the relationship type to limit the traversal scope +- **Memory Management:** Be aware that the procedure stores visited nodes in memory to avoid cycles, which may require significant resources in large, densely connected graphs + +## Error Handling + +Common errors that may occur: + +- **Null Starting Node:** If the start_node parameter is null, the procedure will raise an error; ensure your MATCH clause successfully finds the starting node +- **Invalid Relationship Type:** If you specify a relationship type that doesn't exist in your graph, the traversal will only include the starting node +- **Memory Limitations:** For large graphs with high connectivity, an out-of-memory error may occur if too many nodes are visited +- **Result Size:** If the BFS traversal returns too many nodes, query execution may be slow or time out; in such cases, try reducing the max_depth or filtering by relationship types + diff --git a/algorithms/algo_pagerank.md b/algorithms/algo_pagerank.md new file mode 100644 index 00000000..01463e16 --- /dev/null +++ b/algorithms/algo_pagerank.md @@ -0,0 +1,96 @@ +--- +title: "PageRank" +description: "PageRank" +--- + +# PageRank + +## Introduction + +PageRank is an algorithm that measures the importance of each node within the graph based on the number of incoming relationships and the importance of the corresponding source nodes. +The algorithm was originally developed by Google's founders Larry Page and Sergey Brin during their time at Stanford University. 
+ +## Algorithm Overview + +PageRank works by counting the number and quality of relationships to a node to determine a rough estimate of how important that node is. +The underlying assumption is that more important nodes are likely to receive more connections from other nodes. + +The algorithm assigns each node a score, where higher scores indicate greater importance. +The score for a node is derived recursively from the scores of the nodes that link to it, with a damping factor typically applied to prevent rank sinks. + +## Syntax + +The PageRank procedure has the following call signature: + +```cypher +CALL pagerank.stream( + [label], + [relationship] +) +YIELD node, score +``` + +### Parameters + +| Name | Type | Default | Description | +|----------------|--------|---------|------------------------------------------------------------------------------| +| `label` | String | null | The label of nodes to run the algorithm on. If null, all nodes are used. | +| `relationship` | String | null | The relationship type to traverse. If null, all relationship types are used. | + +### Yield + +| Name | Type | Description | +|---------|-------|--------------------------------------| +| `node` | Node | The node processed by the algorithm. | +| `score` | Float | The PageRank score for the node. 
| + +## Examples + +### Unweighted PageRank + +First, let's create a sample graph representing a citation network between scientific papers: + +```cypher +CREATE + (paper1:Paper {title: 'Graph Algorithms in Database Systems'}), + (paper2:Paper {title: 'PageRank Applications'}), + (paper3:Paper {title: 'Data Mining Techniques'}), + (paper4:Paper {title: 'Network Analysis Methods'}), + (paper5:Paper {title: 'Social Network Graph Theory'}), + + (paper2)-[:CITES]->(paper1), + (paper3)-[:CITES]->(paper1), + (paper3)-[:CITES]->(paper2), + (paper4)-[:CITES]->(paper1), + (paper4)-[:CITES]->(paper3), + (paper5)-[:CITES]->(paper2), + (paper5)-[:CITES]->(paper4) +``` + +Now we can run the PageRank algorithm on this citation network: + +```cypher +CALL pagerank.stream('Paper', 'CITES') +YIELD node, score +RETURN node.title AS paper, score +ORDER BY score DESC +``` + +Expected results: + +| paper | score | +|--------------------------------------|-------| +| Graph Algorithms in Database Systems | 0.43 | +| Data Mining Techniques | 0.21 | +| PageRank Applications | 0.19 | +| Network Analysis Methods | 0.14 | +| Social Network Graph Theory | 0.03 | + + +## Usage Notes + +**Interpreting scores**: + - PageRank scores are relative, not absolute measures + - The sum of all scores in a graph equals 1.0 + - Scores typically follow a power-law distribution + diff --git a/algorithms/algo_spath.md b/algorithms/algo_spath.md new file mode 100644 index 00000000..ddcb4aa4 --- /dev/null +++ b/algorithms/algo_spath.md @@ -0,0 +1,478 @@ +--- +title: "Path algorithms" +description: "Learn how to use algo.SPpaths and algo.SSpaths to find single-pair and single-source paths" +--- + +# Path algorithms + +v2.10 introduced two new path-finding algorithms, or more accurately, minimum-weight, optionally bounded-cost, and optionally bounded-length path-finding algorithms: `algo.SPpaths` and `algo.SSpaths`. 
+ +`algo.SPpaths` and `algo.SSpaths` can solve a wide range of real-world problems where minimum-weight paths need to be found. `algo.SPpaths` finds paths between a given pair of nodes, while `algo.SSpaths` finds paths from a given source node. Weight can represent time, distance, price, or any other measurement. A bound can be set on another property (e.g., finding a minimum-time bounded-price way to get from point A to point B). Both algorithms are performant and have low memory requirements. + +For both algorithms, you can set: + +* A list of relationship types to traverse (`relTypes`). + +* The relationship property whose sum you want to minimize (`weightProp`). + +* An optional relationship property whose sum you want to bound (`costProp`) and the optional bound (`maxCost`). + +* An optional bound on the path length - the number of relationships along the path (`maxLen`). + +* The number of paths you want to retrieve: either all minimal-weight paths (`pathCount` is 0), a single minimal-weight path (`pathCount` is 1), or _n_ minimal-weight paths with potentially different weights (`pathCount` is _n_). + +This topic explains which problems you can solve using these algorithms and demonstrates how to use them. + +Let's start with the following graph. + +![Road network](../images/road_network.png) + +This graph represents a road network with 7 cities (A, B, C, and so on) and 11 one-way roads. Each road has a distance (say, in kilometers) and trip time (say, in minutes). + +Let's create the graph. 
+ +```cypher +CREATE + (a:City{name:'A'}), + (b:City{name:'B'}), + (c:City{name:'C'}), + (d:City{name:'D'}), + (e:City{name:'E'}), + (f:City{name:'F'}), + (g:City{name:'G'}), + + (a)-[:Road{time:4, dist:3}]->(b), + (a)-[:Road{time:3, dist:8}]->(c), + (a)-[:Road{time:4, dist:2}]->(d), + (b)-[:Road{time:5, dist:7}]->(e), + (b)-[:Road{time:5, dist:5}]->(d), + (d)-[:Road{time:4, dist:5}]->(e), + (c)-[:Road{time:3, dist:6}]->(f), + (d)-[:Road{time:1, dist:4}]->(c), + (d)-[:Road{time:2, dist:12}]->(f), + (e)-[:Road{time:5, dist:5}]->(g), + (f)-[:Road{time:4, dist:2}]->(g) +``` + +If you're using RedisInsight v2, you can create and visualize the graph by slightly modifying the above query: you'll have to assign aliases to all nodes and relationships, and return them: + +```cypher +CREATE + (a:City{name:'A'}), + (b:City{name:'B'}), + (c:City{name:'C'}), + (d:City{name:'D'}), + (e:City{name:'E'}), + (f:City{name:'F'}), + (g:City{name:'G'}), + + (a)-[r1:Road{time:4, dist:3}]->(b), + (a)-[r2:Road{time:3, dist:8}]->(c), + (a)-[r3:Road{time:4, dist:2}]->(d), + (b)-[r4:Road{time:5, dist:7}]->(e), + (b)-[r5:Road{time:5, dist:5}]->(d), + (d)-[r6:Road{time:4, dist:5}]->(e), + (c)-[r7:Road{time:3, dist:6}]->(f), + (d)-[r8:Road{time:1, dist:4}]->(c), + (d)-[r9:Road{time:2, dist:12}]->(f), + (e)-[r10:Road{time:5, dist:5}]->(g), + (f)-[r11:Road{time:4, dist:2}]->(g) RETURN a,b,c,d,e,f,g,r1,r2,r3,r4,r5,r6,r7,r8,r9,r10,r11 +``` + +![Road network](../images/graph_query_city.png) + +### Find the shortest path (by number of roads) from A to G + +```cypher +MATCH (a:City{name:'A'}),(g:City{name:'G'}) +WITH shortestPath((a)-[*]->(g)) as p +RETURN length(p) AS length, [n in nodes(p) | n.name] as pathNodes +``` + +Expected results: + +| length | pathNodes | +|--------|--------------| +| 3 | [A, D, F, G] | + +`shortestPath` returns one of the shortest paths. +If there is more than one, only one is retrieved. + +With RedisInsight v2, you can visualize a path simply by returning it. 
+ +![Road network](../images/graph_query_road.png) + +### Find all the shortest paths (by number of roads) from A to G + +```cypher +MATCH (a:City{name:'A'}),(g:City{name:'G'}) +WITH a,g MATCH p=allShortestPaths((a)-[*]->(g)) +RETURN length(p) AS length, [n in nodes(p) | n.name] as pathNodes +``` + +Expected results: + +| length | pathNodes | +|--------|--------------| +| 3 | [A, D, F, G] | +| 3 | [A, C, F, G] | +| 3 | [A, D, E, G] | +| 3 | [A, B, E, G] | + +All `allShortestPaths` results have, by definition, the same length (number of roads). + +### Find 5 shortest paths (by number of roads) from A to G + +```cypher +MATCH p = (a:City{name:'A'})-[*]->(g:City{name:'G'}) +RETURN length(p) AS length, [n in nodes(p) | n.name] as pathNodes +ORDER BY length(p) LIMIT 5 +``` + +Expected results: + +| length | pathNodes | +|--------|-----------------| +| 3 | [A, B, E, G] | +| 3 | [A, D, E, G] | +| 3 | [A, D, F, G] | +| 3 | [A, C, F, G] | +| 4 | [A, D, C, F, G] | + +Using the unbounded traversal pattern `(a:City{name:'A'})-[*]->(g:City{name:'G'})`, FalkorDB traverses all possible paths from A to G. +`ORDER BY length(p) LIMIT 5` ensures that you collect only up to 5 shortest paths (those with the fewest relationships). +This approach is very inefficient because all possible paths would have to be traversed. +Ideally, you would want to abort some traversals as soon as you are sure they would not result in the discovery of shorter paths. + +### Find 5 shortest paths (in kilometers) from A to G + +In a similarly inefficient manner, you can traverse all possible paths and collect the 5 shortest paths (in kilometers). 
+ +```cypher +MATCH p = (a:City{name:'A'})-[*]->(g:City{name:'G'}) +WITH p,reduce(dist=0, n IN relationships(p) | dist+n.dist) as dist +RETURN dist,[n IN nodes(p) | n.name] as pathNodes +ORDER BY dist LIMIT 5 +``` + +Expected results: + +| dist | pathNodes | +|------|-----------------| +| 12 | [A, D, E, G] | +| 14 | [A, D, C, F, G] | +| 15 | [A, B, E, G] | +| 16 | [A, D, F, G] | +| 16 | [A, C, F, G] | + +Again, instead of traversing all possible paths, you would want to abort some traversals as soon as you are sure that they would not result in the discovery of shorter paths. + +## algo.SPpaths + +Finding shortest paths (in kilometers) by traversing all paths and collecting the shortest ones is highly inefficient, to the point of being impractical for large graphs, because the number of paths can grow exponentially relative to the number of relationships. +Using the `algo.SPpaths` procedure (SP stands for _single pair_), you can traverse the graph and collect only the required paths, without enumerating every path first. + +`algo.SPpaths` receives its arguments as a single configuration map. The syntax and the supported arguments are listed here: + +## Syntax + +```cypher +CALL algo.SPpaths([config]) +``` + +```cypher +CALL algo.SPpaths({ + sourceNode: n, + targetNode: m, + relDirection: 'outgoing', + relTypes: ['E'], + maxLen: 3, + weightProp: 'weight', + costProp: 'cost', + maxCost: 4, + pathCount: 2}) YIELD path, pathWeight, pathCost +``` + +### Parameters + +| Name | Type | Description | +|----------------|--------|---------------------------------------------------------------------------------------| +| `sourceNode` | Node | Node from which the path starts. | +| `targetNode` | Node | Last node on the path. 
| +| `relTypes` | Array | List of one or more relationship types to traverse | +| `maxLen` | Int | Maximum path length (number of relationships along the path) | +| `relDirection` | String | One of `incoming`, `outgoing`, or `both`. Defaults to `outgoing` | +| `weightProp` | String | The relationship's property that represents the weight (for all specified `relTypes`) | +| `costProp` | String | The relationship's property that represents the cost | +| `maxCost` | Int | The maximum cost (the bound). If not specified, there is no maximum cost constraint | +| `pathCount` | Int | Number of paths to report: `0` for all minimum-weight paths, `1` for a single minimum-weight path (default), `n>1` for up to _n_ minimum-weight paths | + + +### Return Values + +| Name | Type | Description | +|---------------|---------|--------------------------------------------------------------------------------| +| `path` | Path | Discovered path | +| `pathWeight` | Integer | The path's weight or sum of weightProp of all the relationships along the path | +| `pathCost` | Integer | The path's cost or sum of costProp of all the relationships along the path | + + +You are looking for minimum-weight paths. The _weight of the path_ is the sum of the weights of all relationships composing the path. +If a given relationship does not have such a property or its value is not a positive integer or float, the property defaults to 1. + + +With `algo.SPpaths`, you can answer queries like the following. + +### Find the shortest path (in kilometers) from A to G + +Set `weightProp` to `dist`: + +```cypher +MATCH (a:City{name:'A'}),(g:City{name:'G'}) +CALL algo.SPpaths( {sourceNode: a, targetNode: g, relTypes: ['Road'], weightProp: 'dist'} ) YIELD path, pathWeight +RETURN pathWeight, [n in nodes(path) | n.name] as nodes +``` + +Expected results: + +| pathWeight | nodes | +|------------|--------------| +| 12 | [A, D, E, G] | + +### Find the fastest path (in minutes) from A to G + +Continue as before, but now set `weightProp` to `time`. 
+ +```cypher +MATCH (a:City{name:'A'}),(g:City{name:'G'}) +CALL algo.SPpaths( {sourceNode: a, targetNode: g, relTypes: ['Road'], weightProp: 'time'} ) YIELD path, pathWeight +RETURN pathWeight, [n in nodes(path) | n.name] as nodes +``` + +Expected results: + +| pathWeight | nodes | +|------------|--------------| +| 10 | [A, D, F, G] | + +### Find the shortest paths (in kilometers) from A to G + +```cypher +MATCH (a:City{name:'A'}),(g:City{name:'G'}) +CALL algo.SPpaths( {sourceNode: a, targetNode: g, relTypes: ['Road'], pathCount: 0, weightProp: 'dist'} ) YIELD path, pathWeight +RETURN pathWeight, [n in nodes(path) | n.name] as nodes +``` + +Expected results: + +| pathWeight | nodes | +|------------|--------------| +| 12 | [A, D, E, G] | + +In the example above, you also specified the `pathCount` argument, where `pathCount` is the number of paths to report: + +* `0`: retrieve all minimum-weight paths (all reported paths have the same weight) + +* `1`: retrieve a single minimum-weight path (default) + +* `n>1`: retrieve up to _n_ minimum-weight paths (reported paths may have different weights) + +### Find 5 shortest paths (in kilometers) from A to G + +```cypher +MATCH (a:City{name:'A'}),(g:City{name:'G'}) +CALL algo.SPpaths( {sourceNode: a, targetNode: g, relTypes: ['Road'], pathCount: 5, weightProp: 'dist'} ) +YIELD path, pathWeight +RETURN pathWeight, [n in nodes(path) | n.name] AS nodes +ORDER BY pathWeight +``` + +Expected results: + +| pathWeight | nodes | +|------------|-----------------| +| 12 | [A, D, E, G] | +| 14 | [A, D, C, F, G] | +| 15 | [A, B, E, G] | +| 16 | [A, C, F, G] | +| 16 | [A, D, F, G] | + +### Find 2 shortest paths (in kilometers) from A to G, where you can reach G in up to 12 minutes + +Another interesting feature is the introduction of path constraints ('bounded-cost'). Suppose that you want to find only paths where you can reach G in 12 minutes or less. 
+ +```cypher +MATCH (a:City{name:'A'}),(g:City{name:'G'}) +CALL algo.SPpaths( {sourceNode: a, targetNode: g, relTypes: ['Road'], pathCount: 2, weightProp: 'dist', costProp: 'time', maxCost: 12} ) +YIELD path, pathWeight, pathCost +RETURN pathWeight, pathCost, [n in nodes(path) | n.name] AS path ORDER BY pathWeight +``` + +Expected results: + +| pathWeight | pathCost | path | +|------------|----------|-----------------| +| 14 | 12 | [A, D, C, F, G] | +| 16 | 10 | ? | + +In the example above, you added the following optional arguments: + +* `costProp`: the relationship's property that represents the _cost_. +You are looking for _minimum-weight bounded-cost_ paths. +If a given relationship does not have such a property or its value is not a positive integer/float, `costProp` defaults to 1. + +* `maxCost`: the maximum cost (the bound). +If not specified, there is no maximum cost constraint. + +You also yielded: + +* `pathCost`: the path's cost or the sum of costProp of all relationships along the path. + +### Find paths from D to G, assuming you can traverse each road in both directions + +Another interesting feature is the ability to reverse or ignore the relationship direction. + +```cypher +MATCH (a:City{name:'D'}),(g:City{name:'G'}) +CALL algo.SPpaths( {sourceNode: a, targetNode: g, relTypes: ['Road'], relDirection: 'both', pathCount: 1000, weightProp: 'dist'} ) +YIELD path, pathWeight +RETURN pathWeight, [n in nodes(path) | n.name] as pathNodes ORDER BY pathWeight +``` + + +Expected results: + +| pathWeight | pathNodes | +|------------|-----------------------| +| 10 | [D, E, G] | +| 12 | [D, C, F, G] | +| 14 | [D, F, G] | +| 17 | [D, A, B, E, G] | +| 17 | [D, B, E, G] | +| 18 | [D, A, C, F, G] | +| 24 | [D, B, A, C, F, G] | +| 27 | [D, C, A, B, E, G] | +| 31 | [D, E, B, A, C, F, G] | +| 41 | [D, F, C, A, B, E, G] | + +In the example above, you added the following optional argument: + +* `relDirection`: one of `incoming`, `outgoing`, or `both`. 
If not specified, `relDirection` defaults to `outgoing`. + +### Find paths with length up to 4 from D to G, assuming you can traverse each road in both directions + +Suppose you want to repeat the query above but also limit the path length (number of relationships along the path) to 4: + +```cypher +MATCH (a:City{name:'D'}),(g:City{name:'G'}) +CALL algo.SPpaths( {sourceNode: a, targetNode: g, relTypes: ['Road'], relDirection: 'both', pathCount: 1000, weightProp: 'dist', maxLen: 4} ) +YIELD path, pathWeight +RETURN pathWeight, [n in nodes(path) | n.name] as pathNodes +ORDER BY pathWeight +``` + +Expected results: + +| pathWeight | pathNodes | +|------------|-----------------| +| 10 | [D, E, G] | +| 12 | [D, C, F, G] | +| 14 | [D, F, G] | +| 17 | [D, A, B, E, G] | +| 17 | [D, B, E, G] | +| 18 | [D, A, C, F, G] | + +In the example above, you specified the following optional constraint: + +* `maxLen`: maximum path length (number of roads along the path) + +## algo.SSpaths + +Some problems involve just one node, the source node, where you ask questions about possible paths or reachable destinations, given some constraints. + +That's what the `algo.SSpaths` procedure (SS stands for _single source_) is all about. + +`algo.SSpaths` accepts the same arguments as `algo.SPpaths`, except `targetNode`. It also yields the same outputs (`path`, `pathCost`, and `pathWeight`). 
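To illustrate what a single-source, bounded-cost query computes, the following Python sketch enumerates every simple path from a source over the road network defined earlier and keeps those whose summed cost stays within a bound. It is a brute-force illustration under stated assumptions: the `ROADS` dict and the `bounded_paths` helper are hypothetical, and `algo.SSpaths` is not claimed to work this way internally.

```python
# Road network from the top of this page: (source, target) -> (time, dist).
ROADS = {
    ("A", "B"): (4, 3), ("A", "C"): (3, 8), ("A", "D"): (4, 2),
    ("B", "E"): (5, 7), ("B", "D"): (5, 5), ("D", "E"): (4, 5),
    ("C", "F"): (3, 6), ("D", "C"): (1, 4), ("D", "F"): (2, 12),
    ("E", "G"): (5, 5), ("F", "G"): (4, 2),
}


def bounded_paths(source, max_cost, cost_index):
    """Enumerate simple paths from source whose summed cost is <= max_cost.

    cost_index selects the edge property: 0 for time, 1 for dist.
    Returns a list of (cost, path) tuples sorted by cost, then by path.
    """
    results = []
    stack = [([source], 0)]
    while stack:
        path, cost = stack.pop()
        for (u, v), props in ROADS.items():
            if u != path[-1] or v in path:
                continue  # follow outgoing roads only, never revisit a city
            new_cost = cost + props[cost_index]
            if new_cost <= max_cost:  # prune as soon as the bound is exceeded
                results.append((new_cost, path + [v]))
                stack.append((path + [v], new_cost))
    return sorted(results)


# Paths from A when the trip is limited to 10 kilometers.
for cost, path in bounded_paths("A", 10, cost_index=1):
    print(cost, path)
```

Because a partial path is dropped the moment it exceeds the bound, none of its extensions is ever explored, which is the pruning idea that makes bounded queries cheaper than enumerating all paths.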
+ +### Find all paths from A if the trip is limited to 10 kilometers + +```cypher +MATCH (a:City{name:'A'}) +CALL algo.SSpaths( {sourceNode: a, relTypes: ['Road'], pathCount: 1000, costProp: 'dist', maxCost: 10} ) YIELD path, pathCost +RETURN pathCost, [n in nodes(path) | n.name] as pathNodes +ORDER BY pathCost +``` + +Expected results: + +| pathCost | pathNodes | +|----------|-----------| +| 2 | [A, D] | +| 3 | [A, B] | +| 6 | [A, D, C] | +| 7 | [A, D, E] | +| 8 | [A, B, D] | +| 8 | [A, C] | +| 10 | [A, B, E] | + +### Find all paths from A if the trip is limited to 8 minutes + +```cypher +MATCH (a:City{name:'A'}) +CALL algo.SSpaths( {sourceNode: a, relTypes: ['Road'], pathCount: 1000, costProp: 'time', maxCost: 8} ) YIELD path, pathCost +RETURN pathCost, [n in nodes(path) | n.name] as pathNodes +ORDER BY pathCost +``` + +Expected results: + +| pathCost | pathNodes | +|----------|--------------| +| 3 | [A, C] | +| 4 | [A, B] | +| 4 | [A, D] | +| 5 | [A, D, C] | +| 6 | [A, D, F] | +| 6 | [A, C, F] | +| 8 | [A, D, C, F] | +| 8 | [A, D, E] | + +### Find 5 shortest paths (in kilometers) from A + +```cypher +MATCH (a:City{name:'A'}) +CALL algo.SSpaths( {sourceNode: a, relTypes: ['Road'], pathCount: 5, weightProp: 'dist', costProp: 'cost'} ) YIELD path, pathWeight, pathCost +RETURN pathWeight, pathCost, [n in nodes(path) | n.name] as pathNodes +ORDER BY pathWeight +``` + +Expected results: + +| pathWeight | pathCost | pathNodes | +|------------|----------|-----------| +| 2 | 1 | [A, D] | +| 3 | 1 | [A, B] | +| 6 | 2 | [A, D, C] | +| 7 | 2 | [A, D, E] | +| 8 | 1 | [A, C] | + +### Find 5 shortest paths (in kilometers) from A if the trip is limited to 6 minutes + +```cypher +MATCH (a:City{name:'A'}) +CALL algo.SSpaths( {sourceNode: a, relTypes: ['Road'], pathCount: 5, weightProp: 'dist', costProp: 'time', maxCost: 6} ) +YIELD path, pathWeight, pathCost +RETURN pathWeight, pathCost, [n in nodes(path) | n.name] as pathNodes +ORDER BY pathWeight +``` + +Expected results: + +| 
pathWeight | pathCost | pathNodes | +|------------|----------|-----------| +| 2 | 4 | [A, D] | +| 3 | 4 | [A, B] | +| 6 | 5 | [A, D, C] | +| 8 | 3 | [A, C] | +| 14 | 6 | [A, D, F] | + diff --git a/path_algorithm.md b/algorithms/algo_sspath.md similarity index 99% rename from path_algorithm.md rename to algorithms/algo_sspath.md index 43e5edb3..7550fa17 100644 --- a/path_algorithm.md +++ b/algorithms/algo_sspath.md @@ -1,6 +1,5 @@ --- title: "Path algorithms" -nav_order: 5 description: "Learn how to use algo.SPpaths and algo.SSpaths to find single-pair and single-source paths" --- @@ -404,4 +403,4 @@ GRAPH.QUERY g "MATCH (a:City{name:'A'}) CALL algo.SSpaths( {sourceNode: a, relTy 5) 1) "14" 2) "6" 3) "[A, D, F]" -``` \ No newline at end of file +``` diff --git a/algorithms/algo_wcc.md b/algorithms/algo_wcc.md new file mode 100644 index 00000000..a20446a1 --- /dev/null +++ b/algorithms/algo_wcc.md @@ -0,0 +1,106 @@ +--- +title: "Weakly Connected Components (WCC)" +description: "Weakly Connected Components (WCC)" +--- + +# Weakly Connected Components (WCC) + +## Overview + +The Weakly Connected Components (WCC) algorithm identifies sets of nodes that are connected to each other, regardless of the edge directions. +Each node in a weakly connected component can reach every other node in that component if edge directions are ignored. + +WCC is a common algorithmic building block used in applications like: +- Community detection +- Data preprocessing +- Network analysis +- Identifying isolated subgraphs + +## Algorithm Details + +WCC begins with each node in its own component. +The algorithm repeatedly merges components when edges are found between them, ignoring edge directions. +This process continues until no more merges are possible, resulting in a set of disjoint communities. 
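The merge process described above can be sketched with a union-find (disjoint-set) structure. This is a minimal Python illustration, assuming a plain list of node names and edge pairs; the function name and sample data are hypothetical, and FalkorDB's actual implementation may differ.

```python
def weakly_connected_components(nodes, edges):
    """Label each node with a component id via union-find (direction ignored)."""
    parent = {n: n for n in nodes}  # every node starts in its own component

    def find(n):
        # Follow parent links to the root, shortening the chain as we go.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for src, dst in edges:
        root_src, root_dst = find(src), find(dst)
        if root_src != root_dst:
            parent[root_src] = root_dst  # merge the two components

    return {n: find(n) for n in nodes}


# Hypothetical sample data: a triangle, a pair, and an isolated node.
nodes = ["Alice", "Bob", "Charlie", "David", "Emma", "Frank"]
edges = [("Alice", "Bob"), ("Bob", "Charlie"), ("Charlie", "Alice"), ("David", "Emma")]

components = weakly_connected_components(nodes, edges)
labels = list(components.values())
sizes = sorted((labels.count(root) for root in set(labels)), reverse=True)
print(sizes)  # [3, 2, 1]
```

Each edge is treated as an unordered pair, so `("Alice", "Bob")` connects the two nodes regardless of which one is the source.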
+ +### Performance + +The time complexity of WCC is O(|V| + |E|) where: +- |V| is the number of nodes +- |E| is the number of edges + +## Syntax + +```cypher +CALL algo.wcc([config]) +``` + +### Parameters + +The procedure accepts an optional configuration `Map` with the following parameters: + +| Name | Type | Default | Description | +|---------------------|-------|------------------------|----------------------------------------------------------------------------------| +| `nodeLabels` | Array | All labels | Array of strings listing which node labels will be considered | +| `relationshipTypes` | Array | All relationship types | Array of strings specifying which relationship types are allowed to be traversed | + +### Return Values + +| Name | Type | Description | +|---------------|---------|--------------------------------------| +| `node` | Node | The current node object | +| `componentId` | Integer | The component ID the node belongs to | + +## Examples + +```cypher +CALL algo.wcc({ + nodeLabels: ["Person"], + relationshipTypes: ["KNOWS"] +}) +YIELD node, componentId +RETURN node.name, componentId +``` + +### Creating a Social Network Graph + +```cypher +// Create users in different communities +CREATE + // Community 1 + (alice:User {name: 'Alice'}), + (bob:User {name: 'Bob'}), + (charlie:User {name: 'Charlie'}), + + // Community 2 + (david:User {name: 'David'}), + (emma:User {name: 'Emma'}), + + // Community 3 + (frank:User {name: 'Frank'}) + +// Create relationships within communities +CREATE + (alice)-[:FOLLOWS]->(bob), + (bob)-[:FRIENDS_WITH]->(charlie), + (charlie)-[:FOLLOWS]->(alice), + + (david)-[:FRIENDS_WITH]->(emma) + +// Note that Frank is isolated and forms his own community +``` + +### Analyzing Social Networks + +```cypher +// Find isolated communities in a social network +CALL algo.wcc({ + nodeLabels: ["User"], + relationshipTypes: ["FOLLOWS", "FRIENDS_WITH"] +}) +YIELD componentId + +// Get community sizes +RETURN componentId AS communityId, 
count(*) AS communitySize +ORDER BY communitySize DESC +``` + diff --git a/algorithms/index.md b/algorithms/index.md new file mode 100644 index 00000000..8d13ffb7 --- /dev/null +++ b/algorithms/index.md @@ -0,0 +1,17 @@ +--- +title: "Algorithms" +description: "Algorithms" +--- + +# Algorithms + +## 1. [WCC](/algorithms/algo_wcc) + +## 2. [PageRank](/algorithms/algo_pagerank) + +## 3. [All Paths](/algorithms/algo_spath) + +## 4. [All Shortest Paths](/algorithms/algo_sspath) + +## 5. [BFS](/algorithms/algo_bfs) + diff --git a/cypher/algorithms.md b/cypher/algorithms.md deleted file mode 100644 index c83f47d1..00000000 --- a/cypher/algorithms.md +++ /dev/null @@ -1,25 +0,0 @@ ---- -title: "Algorithms" -nav_order: 20 -description: > - FalkorDB supported algorithms like BFS. -parent: "Cypher Language" ---- - -# Algorithms - -## BFS - -The breadth-first-search algorithm accepts 3 arguments: - -`source-node (node)` - The root of the search. - -`max-level (integer)` - If greater than zero, this argument indicates how many levels should be traversed by BFS. 1 would retrieve only the source's neighbors, 2 would retrieve all nodes within 2 hops, and so on. - -`relationship-type (string)` - If this argument is NULL, all relationship types will be traversed. Otherwise, it specifies a single relationship type to perform BFS over. - -It can yield two outputs: - -`nodes` - An array of all nodes connected to the source without violating the input constraints. - -`edges` - An array of all edges traversed during the search. This does not necessarily contain all edges connecting nodes in the tree, as cycles or multiple edges connecting the same source and destination do not have a bearing on the reachability this algorithm tests for. These can be used to construct the directed acyclic graph that represents the BFS tree. Emitting edges incurs a small performance penalty.