100 changes: 100 additions & 0 deletions algorithms/algo_bfs.md
@@ -0,0 +1,100 @@
---
title: "BFS"
description: "BFS"
---

# BFS

## Overview

The Breadth-First Search (BFS) procedure performs a breadth-first traversal of a graph starting from a specified node.
BFS visits every node at the current depth before moving on to nodes at the next depth level.
This makes it well suited to finding the shortest path (fewest hops) between two nodes in an unweighted graph, or to exploring a graph layer by layer.
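
The layer-by-layer behavior can be sketched outside the database. The following Python is only an illustration of depth-limited BFS, not the procedure's implementation: the function name `bfs`, the adjacency-dictionary input, and the choice to include the start node in the returned list are all assumptions made for this sketch.

```python
from collections import deque


def bfs(adjacency, start, max_depth):
    """Return (nodes, edges) visited in breadth-first order, up to max_depth hops."""
    visited = {start}            # guards against revisiting nodes in cyclic graphs
    order = [start]              # assumption: the start node is part of the result
    edges = []                   # (source, target) pairs actually followed
    queue = deque([(start, 0)])  # FIFO queue of (node, depth)

    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for neighbor in adjacency.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                order.append(neighbor)
                edges.append((node, neighbor))
                queue.append((neighbor, depth + 1))
    return order, edges


graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"]}
nodes, edges = bfs(graph, "a", 2)
print(nodes)  # ['a', 'b', 'c', 'd']
print(edges)  # [('a', 'b'), ('a', 'c'), ('b', 'd')]
```

Node `e` is three hops from `a`, so it is not reached with a depth limit of 2. Because the queue is first-in, first-out, every depth-1 node is processed before any depth-2 node.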

## Syntax

```
CALL algo.bfs(start_node, max_depth, relationship)
YIELD nodes, edges
```

## Arguments

| Name | Type | Description | Default |
|--------------|----------------|-----------------------------------------------------------------------------|------------|
| start_node | Node | Starting node for the BFS traversal | (Required) |
| max_depth | Integer | Maximum depth to traverse | (Required) |
| relationship | String or null | The relationship type to traverse. If null, all relationship types are used | null |

## Returns

| Name | Type | Description |
|-------|------|----------------------------------------------|
| nodes | List | List of visited nodes in breadth-first order |
| edges | List | List of edges traversed during the BFS |

## Examples

### Basic BFS Traversal

This example demonstrates a basic BFS traversal starting from a person node.


### Social Network Friend Recommendations
Comment on lines +38 to +43
Contributor

⚠️ Potential issue

Provide the missing Basic BFS Traversal example
The “Basic BFS Traversal” section is empty. An example query should be added (e.g., a simple traversal calling the procedure and yielding results).


This example demonstrates how to use BFS to find potential friend recommendations in a social network.

#### Setup the Graph

```
// Create Person nodes representing users in a social network
CREATE (alice:Person {name: 'Alice', age: 28, city: 'New York'})
CREATE (bob:Person {name: 'Bob', age: 32, city: 'Boston'})
CREATE (charlie:Person {name: 'Charlie', age: 35, city: 'Chicago'})
CREATE (david:Person {name: 'David', age: 29, city: 'Denver'})
CREATE (eve:Person {name: 'Eve', age: 31, city: 'San Francisco'})
CREATE (frank:Person {name: 'Frank', age: 27, city: 'Miami'})

// Create FRIEND relationships
CREATE (alice)-[:FRIEND]->(bob)
CREATE (alice)-[:FRIEND]->(charlie)
CREATE (bob)-[:FRIEND]->(david)
CREATE (charlie)-[:FRIEND]->(eve)
CREATE (david)-[:FRIEND]->(frank)
CREATE (eve)-[:FRIEND]->(frank)
```

#### Find Friends of Friends (Potential Recommendations)

```
// Find Alice's friends-of-friends (potential recommendations)
MATCH (aline:Person {name: 'Alice'})
CALL algo.bfs(me, 2, 'FRIEND')
YIELD nodes
Comment on lines +71 to +73
Contributor

⚠️ Potential issue

Fix inconsistent variable name in example
The query uses MATCH (aline:Person {name: 'Alice'}) but then calls CALL algo.bfs(me, 2, 'FRIEND'). Either rename the bound variable from aline to me in the MATCH, or call algo.bfs(aline, ...).


// Process results to get only depth 2 connections (friends of friends)
WHERE size(nodes) >= 3
WITH alice, nodes[2] AS potential_friend
WHERE NOT (alice)-[:FRIEND]->(potential_friend)
RETURN potential_friend
```

In this social network example, the BFS algorithm helps find potential friend recommendations by identifying people who are connected to Alice's existing friends but not directly connected to Alice yet.
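
To make the intended result concrete, here is a minimal Python sketch that rebuilds the same friendship graph in memory and picks out the people exactly two hops from Alice. It does not call `algo.bfs`; the helper `depths_from` and the dictionary layout are hypothetical names used only for this illustration.

```python
from collections import deque

# Directed FRIEND relationships from the setup step above.
friends = {
    "Alice": ["Bob", "Charlie"],
    "Bob": ["David"],
    "Charlie": ["Eve"],
    "David": ["Frank"],
    "Eve": ["Frank"],
    "Frank": [],
}


def depths_from(graph, start, max_depth):
    """Map each node reachable within max_depth hops to its BFS depth."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        if depth[person] == max_depth:
            continue  # stop expanding at the depth limit
        for friend in graph[person]:
            if friend not in depth:
                depth[friend] = depth[person] + 1
                queue.append(friend)
    return depth


depth = depths_from(friends, "Alice", 2)
# Keep people at depth 2 who are not already direct friends of Alice.
recommendations = sorted(
    person for person, d in depth.items()
    if d == 2 and person not in friends["Alice"]
)
print(recommendations)  # ['David', 'Eve']
```

Frank is three hops away, so a depth limit of 2 excludes him.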


## Performance Considerations

- **Indexing:** Ensure properties used for finding your starting node are indexed for optimal performance
- **Maximum Depth:** Choose an appropriate max_depth value based on your graph's connectivity; large depths in highly connected graphs can result in exponential growth of traversed nodes
- **Relationship Filtering:** When applicable, specify the relationship type to limit the traversal scope
- **Memory Management:** Be aware that the procedure stores visited nodes in memory to avoid cycles, which may require significant resources in large, densely connected graphs
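
The effect of `max_depth` on traversal size can be estimated with simple arithmetic. The sketch below assumes an idealized graph in which every node has the same number of unvisited neighbors, so it gives an upper bound and not a measurement of any real graph; `max_nodes_reached` is a hypothetical helper.

```python
def max_nodes_reached(branching_factor, max_depth):
    """Upper bound on visited nodes: 1 + b + b^2 + ... + b^max_depth."""
    return sum(branching_factor ** level for level in range(max_depth + 1))


for depth in (2, 4, 6):
    print(depth, max_nodes_reached(10, depth))
# prints:
# 2 111
# 4 11111
# 6 1111111
```

With ten neighbors per node, each additional level multiplies the work by roughly ten, which is why a small reduction in `max_depth` can matter more than any other tuning step.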

## Error Handling

Common errors that may occur:

- **Null Starting Node:** If the start_node parameter is null, the procedure will raise an error; ensure your MATCH clause successfully finds the starting node
- **Invalid Relationship Type:** If you specify a relationship type that doesn't exist in your graph, the traversal will only include the starting node
- **Memory Limitations:** For large graphs with high connectivity, an out-of-memory error may occur if too many nodes are visited
- **Result Size:** If the BFS traversal returns too many nodes, query execution may be slow or time out; in such cases, try reducing the max_depth or filtering by relationship types

96 changes: 96 additions & 0 deletions algorithms/algo_pagerank.md
@@ -0,0 +1,96 @@
---
title: "PageRank"
description: "PageRank"
---

# PageRank

## Introduction

PageRank is an algorithm that measures the importance of each node within the graph based on the number of incoming relationships and the importance of the corresponding source nodes.
The algorithm was originally developed by Google's founders Larry Page and Sergey Brin during their time at Stanford University.

## Algorithm Overview

PageRank works by counting the number and quality of relationships to a node to determine a rough estimate of how important that node is.
The underlying assumption is that more important nodes are likely to receive more connections from other nodes.

The algorithm assigns each node a score, where higher scores indicate greater importance.
The score for a node is derived recursively from the scores of the nodes that link to it, with a damping factor typically applied to prevent rank sinks.
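
One common way to write this recursion is shown below. The notation is an assumption of this sketch, and the text above does not state which damping value or dangling-node handling the procedure uses.

$$
PR(v) = \frac{1 - d}{N} + d \sum_{u \in \mathrm{In}(v)} \frac{PR(u)}{\mathrm{out}(u)}
$$

Here $N$ is the number of nodes, $d$ is the damping factor (0.85 is a common choice), $\mathrm{In}(v)$ is the set of nodes with a relationship pointing to $v$, and $\mathrm{out}(u)$ is the number of outgoing relationships of $u$.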

## Syntax

The PageRank procedure has the following call signature:

```cypher
CALL pagerank.stream(
[label],
[relationship]
)
YIELD node, score
```

### Parameters

| Name | Type | Default | Description |
|----------------|--------|---------|------------------------------------------------------------------------------|
| `label` | String | null | The label of nodes to run the algorithm on. If null, all nodes are used. |
| `relationship` | String | null | The relationship type to traverse. If null, all relationship types are used. |

### Yield

| Name | Type | Description |
|---------|-------|--------------------------------------|
| `node` | Node | The node processed by the algorithm. |
| `score` | Float | The PageRank score for the node. |

## Examples

### Unweighted PageRank

First, let's create a sample graph representing a citation network between scientific papers:

```cypher
CREATE
(paper1:Paper {title: 'Graph Algorithms in Database Systems'}),
(paper2:Paper {title: 'PageRank Applications'}),
(paper3:Paper {title: 'Data Mining Techniques'}),
(paper4:Paper {title: 'Network Analysis Methods'}),
(paper5:Paper {title: 'Social Network Graph Theory'}),

(paper2)-[:CITES]->(paper1),
(paper3)-[:CITES]->(paper1),
(paper3)-[:CITES]->(paper2),
(paper4)-[:CITES]->(paper1),
(paper4)-[:CITES]->(paper3),
(paper5)-[:CITES]->(paper2),
(paper5)-[:CITES]->(paper4)
```

Now we can run the PageRank algorithm on this citation network:

```cypher
CALL pagerank.stream('Paper', 'CITES')
YIELD node, score
RETURN node.title AS paper, score
ORDER BY score DESC
```

Expected results (approximate; exact scores depend on the implementation's damping factor and iteration settings):

| paper | score |
|--------------------------------------|-------|
| Graph Algorithms in Database Systems | 0.43 |
| Data Mining Techniques | 0.21 |
| PageRank Applications | 0.19 |
| Network Analysis Methods | 0.14 |
| Social Network Graph Theory | 0.03 |
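
For readers who want to see how such scores arise, the following Python sketch runs a plain power-iteration PageRank on the same citation graph. It is an illustration under stated assumptions, not the `pagerank.stream` implementation: the damping factor of 0.85, the fixed iteration count, and the uniform redistribution of rank from nodes without outgoing citations are all choices made for this example, so its values need not match the table above.

```python
def pagerank(nodes, edges, damping=0.85, iterations=100):
    """Power-iteration PageRank; dangling-node rank is spread uniformly (assumption)."""
    n = len(nodes)
    out_links = {node: [] for node in nodes}
    for source, target in edges:
        out_links[source].append(target)

    scores = {node: 1.0 / n for node in nodes}  # start from a uniform distribution
    for _ in range(iterations):
        # Rank held by nodes with no outgoing edges is shared among all nodes.
        dangling = sum(scores[node] for node in nodes if not out_links[node])
        base = (1.0 - damping) / n + damping * dangling / n
        new_scores = {node: base for node in nodes}
        for source, targets in out_links.items():
            for target in targets:
                new_scores[target] += damping * scores[source] / len(targets)
        scores = new_scores
    return scores


papers = ["paper1", "paper2", "paper3", "paper4", "paper5"]
cites = [
    ("paper2", "paper1"),
    ("paper3", "paper1"), ("paper3", "paper2"),
    ("paper4", "paper1"), ("paper4", "paper3"),
    ("paper5", "paper2"), ("paper5", "paper4"),
]

scores = pagerank(papers, cites)
for paper, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{paper}: {score:.2f}")
```

In this sketch the scores sum to 1, `paper1` (cited by three other papers) ranks highest, and `paper5` (cited by none) ranks lowest.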


## Usage Notes

**Interpreting scores**:
- PageRank scores are relative, not absolute measures
- The sum of all scores in a graph equals 1.0
- Scores typically follow a power-law distribution
