M3SOulu/light-oauth2


A Dataset of Logs, Metrics, and Traces for a Microservice System

This is a fork of the light-oauth2 system by networknt, adapted for creating the dataset. The original README is in OLD_README.md. For other Dockerfiles and more documentation, see the original repo.

Versions

This is the latest development version of the data collection scripts.

For the version corresponding to the PROMISE25 paper "LO2: Microservice API Anomaly Dataset of Logs and Metrics", see the corresponding release. The corresponding dataset is available at https://doi.org/10.5281/zenodo.14257989.

Contents

This repository contains the following files:

  • Original source code

    Most directories in this repository contain original source code of light-oauth2 components;

  • Locust tests

    Locust tests created for light-oauth2 APIs;

  • Docker compose file

    The Docker compose file, adapted to deploy the light-oauth2 system together with the additional components needed to gather the data. We deploy with the MySQL database, although light-oauth2 supports other options. See the original repo for other deployment files;

  • prometheus.yml: Configuration of Prometheus used in the deployment

  • prometheus_metrics.txt: List of available and queried Prometheus metrics (from our research server)

  • opentelemetry-javaagent.jar: OpenTelemetry Java agent that we attempt to inject into each container to collect traces for Jaeger

  • Scripts

    • prometheus_metrics.sh: Script used to query all available Prometheus metrics
    • fetch_data.py: Script to fetch data from Prometheus and Jaeger
    • data_run.sh: Main script that deploys the system, runs the Locust tests, and collects all data

Setup

To replicate the data collection process, set up the following:

Required packages

  • Clone this repository
  • Install MySQL (mysqladmin command should be available)
  • Install Locust (locust command should be available)
  • Install the requests Python library
  • Install Docker (docker and docker compose commands should be available)
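
Before a run, it can help to confirm that the prerequisites listed above are actually reachable. The following is a minimal sketch, not part of this repository: the function name missing_prerequisites is hypothetical, and it only checks that the commands are on PATH and that the requests module can be imported.

```python
import importlib.util
import shutil

# Commands that the list above says must be available on PATH.
REQUIRED_COMMANDS = ["mysqladmin", "locust", "docker"]
# Python libraries that the list above says must be installed.
REQUIRED_MODULES = ["requests"]


def missing_prerequisites(commands=REQUIRED_COMMANDS, modules=REQUIRED_MODULES):
    """Return the names of commands and modules that cannot be found."""
    missing = [cmd for cmd in commands if shutil.which(cmd) is None]
    missing += [mod for mod in modules if importlib.util.find_spec(mod) is None]
    return missing


if __name__ == "__main__":
    missing = missing_prerequisites()
    if missing:
        print("Missing prerequisites: " + ", ".join(missing))
    else:
        print("All prerequisites found.")
```

This does not verify that the docker compose subcommand works, since that is a plugin of the docker command rather than a separate executable; running docker compose version by hand covers that case.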

Prometheus metrics

The file prometheus_metrics.txt contains the list of all metrics that should be queried from Prometheus during data gathering. Currently, it contains metrics that were available on our research server.

It is possible to use the prometheus_metrics.sh script to query all metrics available on your host system:

  • Start your Prometheus instance (container)
  • If it is deployed somewhere other than localhost:9000, change the URL in the script
  • Run the script
  • The list of metrics will be saved into prometheus_metrics.txt to be used by the main script
  • If you need only a subset of metrics, edit the file accordingly
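
For readers who prefer Python over the shell script, the same listing can be sketched against the Prometheus HTTP API, which returns all metric names as the values of the __name__ label. This is an illustration under assumptions, not the content of prometheus_metrics.sh: the function names are hypothetical, the address is the localhost:9000 mentioned above, and the one-name-per-line file format is assumed.

```python
import json
import urllib.request

# Address taken from the text above; change it if Prometheus runs elsewhere.
PROMETHEUS_URL = "http://localhost:9000"


def parse_metric_names(payload):
    """Extract a sorted list of metric names from a label-values API response."""
    if payload.get("status") != "success":
        raise ValueError(f"Prometheus returned status {payload.get('status')!r}")
    return sorted(payload["data"])


def fetch_metric_names(base_url=PROMETHEUS_URL, timeout=10):
    """Ask Prometheus for every value of the __name__ label (all metric names)."""
    url = f"{base_url}/api/v1/label/__name__/values"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_metric_names(json.load(response))


def save_metric_names(names, path="prometheus_metrics.txt"):
    """Write one metric name per line (assumed file format, not confirmed)."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(names) + "\n")
```

With a running Prometheus instance, save_metric_names(fetch_metric_names()) would regenerate the file; pass a filtered list instead to keep only a subset of metrics.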

Making a data run

To perform a single data run, i.e. deploy the system and execute all the Locust tests, run the data_run.sh script.

The script performs the following:

  • Get the list of all tagged tasks from the Locust files
  • Get the list of all Prometheus metrics from prometheus_metrics.txt
  • Deploy all the containers using docker compose -f docker-compose-oauth2-mysql.yml up --force-recreate -d
  • Wait for the MySQL database to be ready and read the configuration for light-oauth2
  • Shuffle all discovered tests in a random order
  • For each tag, run a test with a random duration of 20-180 seconds
    • If the tag is correct, run only the tasks tagged correct
    • For any other tag, run all correct tasks plus the tagged task
  • Wait 1-5 seconds between tests
  • Fetch all logs, metrics, and traces
  • The data has the following directory structure:
    • LO2_run_UNIX: root folder of the data, where UNIX is the Unix timestamp of the beginning of the run
      • run_log.log: log of the data_run.sh script for the entire run
      • correct/ERROR: directory for one test execution, either the correct-only test (correct) or a correct+ERROR test (ERROR)
        • *.log files: log files for each container and Locust
        • metrics: folder containing all data from Prometheus and Jaeger
          • metric_*.json: a JSON file for each Prometheus metric from prometheus_metrics.txt with metric values
          • traces_*.csv: a CSV file for each container with Jaeger traces
          • last_fetch_time.txt: timestamp of the end of the interval the data was fetched for
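
The test-scheduling steps above (shuffle the tags, pick a random duration, choose which tasks run, pause between tests) can be sketched in Python as follows. This is an illustration of the described logic, not the code of data_run.sh: the function name build_run_plan, the dictionary keys, the use of whole seconds, and the tag names tag_a and tag_b are all assumptions.

```python
import random

CORRECT_TAG = "correct"


def build_run_plan(tags, rng=None):
    """Return one plan entry per tag, in shuffled order.

    Each entry holds the tags Locust should run, a random test duration
    (20-180 s), and a random pause (1-5 s) before the next test.
    Whole seconds are an assumption; the script may draw them differently.
    """
    rng = rng or random.Random()
    order = list(tags)
    rng.shuffle(order)
    plan = []
    for tag in order:
        # The "correct" test runs only correct tasks; any other tag runs
        # the correct tasks plus the single tagged task.
        selected = [CORRECT_TAG] if tag == CORRECT_TAG else [CORRECT_TAG, tag]
        plan.append({
            "name": tag,
            "locust_tags": selected,
            "duration_s": rng.randint(20, 180),
            "pause_s": rng.randint(1, 5),
        })
    return plan


# Example with placeholder tag names; a seeded generator makes the plan repeatable.
for step in build_run_plan(["correct", "tag_a", "tag_b"], rng=random.Random(0)):
    print(step)
```

Passing a seeded random.Random instance is a design choice for reproducibility of the sketch; the description above does not say whether the real script fixes a seed.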

Reference

If you use this dataset or the data collection package, please cite the following paper:

Bakhtin, A., Nyyssölä, J., Wang, Y., Ahmad, N., Ping, K., Esposito, M., Mäntylä, M., & Taibi, D. (2025). LO2: Microservice API Anomaly Dataset of Logs and Metrics. Proceedings of the 21st International Conference on Predictive Models and Data Analytics in Software Engineering, 1–10. https://doi.org/10.1145/3727582.3728682
