Thank you for your interest in contributing to StateSet Data Studio! This document provides guidelines and information for contributors.
- Code of Conduct
- How to Contribute
- Development Setup
- Project Structure
- Submitting Changes
- Reporting Issues
- Testing
This project follows a code of conduct to ensure a welcoming environment for all contributors. Please read our Code of Conduct before contributing.
We welcome contributions in several forms:
- Bug reports and feature requests via GitHub Issues
- Code contributions via Pull Requests
- Documentation improvements
- Testing and feedback
- Bug Fixes: Fix existing issues
- Features: Add new functionality
- Documentation: Improve docs, add examples
- Tests: Add or improve test coverage
- UI/UX: Frontend improvements
- Python 3.10+
- Node.js 16+
- Docker (optional)
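The prerequisites above can be checked before starting setup. The following is a minimal sketch, not part of the repository: the helper names (`parse_node_major`, `python_ok`, `node_ok`) are hypothetical, and it assumes `node --version` prints a string such as `v18.17.0`.

```python
import re
import shutil
import subprocess
import sys
from typing import Optional

MIN_PYTHON = (3, 10)   # from the prerequisites list
MIN_NODE_MAJOR = 16    # from the prerequisites list


def parse_node_major(version_output: str) -> Optional[int]:
    """Extract the major version from output such as 'v18.17.0'."""
    match = re.match(r"v?(\d+)\.", version_output.strip())
    return int(match.group(1)) if match else None


def python_ok(version_info=sys.version_info) -> bool:
    """Return True if the interpreter meets the minimum Python version."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def node_ok() -> bool:
    """Return True if `node` is on PATH and meets the minimum major version."""
    node = shutil.which("node")
    if node is None:
        return False
    result = subprocess.run(
        [node, "--version"], capture_output=True, text=True, check=False
    )
    major = parse_node_major(result.stdout)
    return major is not None and major >= MIN_NODE_MAJOR


if __name__ == "__main__":
    print("Python 3.10+:", "ok" if python_ok() else "missing or too old")
    print("Node.js 16+:", "ok" if node_ok() else "missing or too old")
```

Docker is optional, so the sketch does not check for it.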
git clone https://github.com/stateset/stateset-data-studio.git
cd stateset-data-studio
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r backend/requirements.txt
# Copy and configure environment
cp .env.example .env
# Edit .env with your API keys
cd frontend
npm install
cp .env.example .env
# Configure frontend environment variables
# Initialize database (run from the repository root so that the backend package is importable)
python -c "from backend.db.session import init_db; init_db()"
synthetic-data-studio/
├── backend/ # FastAPI backend
│ ├── api/ # API endpoints
│ ├── services/ # Business logic
│ ├── db/ # Database models and session
│ └── configs/ # Configuration files
├── frontend/ # React frontend
│ ├── src/
│ ├── public/
│ └── package.json
├── synthetic_data_kit/ # Core synthetic data generation
├── tests/ # Test suite
├── examples/ # Example scripts
├── data/ # Data directory structure
└── configs/ # Configuration files
# Fork the repository on GitHub
# Clone your fork
git clone https://github.com/<your-username>/stateset-data-studio.git
cd stateset-data-studio
# Create a feature branch
git checkout -b feature/your-feature-name
- Write clear, concise commit messages
- Follow the existing code style
- Add tests for new functionality
- Update documentation as needed
# Run backend tests
cd backend
python -m pytest ../tests/
# Run frontend tests
cd ../frontend
npm test
# Manual testing
python run.py # Start the application
- Push your changes to your fork
- Create a Pull Request on GitHub
- Fill out the PR template completely
- Wait for review and address any feedback
- Title: Use clear, descriptive titles
- Description: Explain what and why, not just how
- Commits: Squash related commits
- Tests: Include tests for new features
- Documentation: Update docs for API changes
When reporting bugs, please include:
- Expected behavior
- Actual behavior
- Steps to reproduce
- Environment details (OS, Python/Node versions)
- Error messages/logs
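The environment details requested above can be gathered with a short script and pasted into the issue. This is a sketch under stated assumptions: `tool_version` and `environment_report` are hypothetical helpers, not part of the project, and the script assumes `node` and `npm` respond to `--version`.

```python
import platform
import shutil
import subprocess


def tool_version(command: str) -> str:
    """Return the output of `<command> --version`, or 'not found'."""
    path = shutil.which(command)
    if path is None:
        return "not found"
    try:
        result = subprocess.run(
            [path, "--version"],
            capture_output=True,
            text=True,
            timeout=10,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return "not found"
    return result.stdout.strip() or result.stderr.strip() or "unknown"


def environment_report() -> dict:
    """Collect OS, Python, and Node details for a bug report."""
    return {
        "os": platform.platform(),
        "python": platform.python_version(),
        "node": tool_version("node"),
        "npm": tool_version("npm"),
    }


if __name__ == "__main__":
    for name, value in environment_report().items():
        print(f"{name}: {value}")
```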
For feature requests, please include:
- Use case: Why do you need this feature?
- Proposed solution: How should it work?
- Alternatives: Other approaches considered?
# Backend tests
python -m pytest tests/ -v
# Frontend tests
cd frontend && npm test
# Integration tests
python run_api_tests.py
Notes:
- pytest collects deterministic unit/integration tests only.
- Legacy manual validation scripts in tests/ remain runnable directly with python <script>.py and are excluded from CI collection.
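One way such an exclusion can be expressed in pytest is a `collect_ignore_glob` list in `conftest.py`. The sketch below only illustrates that mechanism. The file location and the `manual_*.py` pattern are assumptions, and the project's actual configuration may work differently.

```python
# tests/conftest.py (hypothetical sketch; the real configuration may differ)

# pytest reads this module-level list and skips matching files during
# collection. "manual_*.py" is an assumed naming pattern for the legacy
# scripts, not one taken from the repository.
collect_ignore_glob = ["manual_*.py"]
```

Files ignored this way are still ordinary Python scripts, so they can be run directly with the interpreter.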
- Place tests in the tests/ directory
- Use descriptive test names
- Test both success and failure cases
- Mock external dependencies
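The guidelines above can be illustrated with a small pytest-style module. This is a sketch only: `fetch_completion` and the client's `complete` method are hypothetical stand-ins for code that calls an external service, not functions from this project.

```python
from unittest.mock import Mock


def fetch_completion(client, prompt):
    """Hypothetical function under test.

    Returns trimmed text from an external client, or None on timeout.
    """
    try:
        response = client.complete(prompt)
    except TimeoutError:
        return None
    return response["text"].strip()


def test_fetch_completion_returns_trimmed_text():
    # Success case: the external client is replaced by a mock.
    client = Mock()
    client.complete.return_value = {"text": "  hello  "}

    assert fetch_completion(client, "Say hello") == "hello"
    client.complete.assert_called_once_with("Say hello")


def test_fetch_completion_returns_none_on_timeout():
    # Failure case: the mock raises the error a real client might raise.
    client = Mock()
    client.complete.side_effect = TimeoutError("upstream timed out")

    assert fetch_completion(client, "Say hello") is None
```

Each test name states the behavior being checked, and neither test touches the network because the client is mocked.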
Format: type(scope): description
Types:
- feat: New features
- fix: Bug fixes
- docs: Documentation changes
- style: Code style changes
- refactor: Code refactoring
- test: Test additions/changes
- chore: Maintenance tasks
Example: feat(auth): add OAuth2 login support
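The format above can be checked mechanically, for example in a local commit hook. The following is a minimal sketch, not a tool shipped with the project: `is_valid_commit_subject` is a hypothetical helper, and it assumes the scope is required and contains no spaces or parentheses.

```python
import re

# Types listed in this guide.
COMMIT_TYPES = ("feat", "fix", "docs", "style", "refactor", "test", "chore")

# Matches "type(scope): description". Assumption: scope is mandatory.
COMMIT_PATTERN = re.compile(
    r"^(?P<type>" + "|".join(COMMIT_TYPES) + r")"
    r"\((?P<scope>[^()\s]+)\)"
    r": (?P<description>\S.*)$"
)


def is_valid_commit_subject(subject: str) -> bool:
    """Return True if the subject line follows type(scope): description."""
    return COMMIT_PATTERN.match(subject) is not None


if __name__ == "__main__":
    for subject in ("feat(auth): add OAuth2 login support", "fixed stuff"):
        status = "ok" if is_valid_commit_subject(subject) else "invalid"
        print(f"{status}: {subject}")
```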
Feel free to open a GitHub Discussion or contact the maintainers if you have questions about contributing.