Skip to content

dimgerasimou/pds-hw1-connected-components

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

48 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Connected Components — Shared-Memory Parallel Implementations

Assignment #1 of the Parallel and Distributed Systems coursework: parallel-distributed-systems

Parallel implementations of the connected components problem for sparse graphs, focusing on shared-memory parallelism, multiple algorithmic variants, and performance benchmarking.

This project evaluates connected components algorithms in C under different shared-memory programming models, with automated measurement of execution time, throughput, speedup, and efficiency.

Overview

The goal of this project is to study the performance characteristics of connected components algorithms under different shared-memory programming models.

It includes:

  • Multiple algorithmic variants
  • Multiple shared-memory parallelization approaches
  • A unified benchmarking framework with statistical analysis

Implementations

Algorithms

  • Label Propagation
    • Iterative label relaxation until convergence
    • Early termination and bitmap-based component counting
  • Union-Find
    • Disjoint-set structure with path halving
    • Typically faster and more scalable

Parallelization Models

Each algorithm is implemented using:

  • Sequential (baseline)
  • OpenMP
  • POSIX Threads
  • OpenCilk

Features

  • Multiple parallel connected components implementations
  • Automated benchmarking with configurable trials and threads
  • Performance metrics:
    • Execution time
    • Throughput (edges/sec)
    • Speedup and efficiency
  • Peak memory usage tracking
  • Machine-readable JSON output for analysis
  • Matrix Market (.mtx) and MAT-file (.mat) input support

Build

Requirements

  • C compiler (gcc or clang)
  • OpenMP
  • POSIX threads
  • OpenCilk (for Cilk variant)
  • libmatio (for .mat file support)

Ensure OpenCilk is installed and update the CILK_PATH variable in the Makefile to point to your OpenCilk installation:

# Example paths (edit as needed)
CILK_PATH = /opt/opencilk
# or
CILK_PATH = /usr/local/opencilk

Compile all targets

make

Executables are placed in bin/.

Usage

Available options

Run to see all available options:

make help

Quick benchmark

make benchmark MATRIX=data/soc-LiveJournal1.mtx THREADS=8 TRIALS=10

Results are printed in JSON format to stdout.

Save results to file

make benchmark-save MATRIX=data/soc-LiveJournal1.mtx THREADS=8 TRIALS=10

Output is stored in benchmarks/ with a timestamp.

Compare Variants

make benchmark-compare MATRIX=data/soc-LiveJournal1.mtx TRIALS=10 THREADS=8

Output is stored in benchmarks/ with a timestamp and the version.

Manual execution

bin/benchmark_runner -v 0 -t 8 -n 10 data/matrix.mtx

Project Structure

src/
├── algorithms/   # Sequential, OpenMP, Pthreads, OpenCilk
├── core/         # Matrix representations and utilities
├── utils/        # Benchmarking, JSON output, helpers
├── main.c        # Algorithm entry point
└── runner.c      # Benchmark runner

Performance Results

Detailed benchmark results and tables are available in: Performance results and analysis.

Raw benchmark outputs are generated automatically and stored under benchmarks/.

Notes

  • Designed for shared-memory systems
  • Emphasis on performance comparison rather than algorithmic novelty
  • Intended for experimentation and benchmarking on Linux systems

November 2025 • Aristotle University of Thessaloniki

About

Parallel connected components implementations in C using OpenMP, Pthreads, and Cilk, with benchmarking and performance analysis.

Topics

Resources

Stars

Watchers

Forks

Contributors