diff --git a/content/docs/2025/beer-bears/week2.md b/content/docs/2025/beer-bears/week2.md new file mode 100644 index 000000000..a54604088 --- /dev/null +++ b/content/docs/2025/beer-bears/week2.md @@ -0,0 +1,96 @@ +--- +title: "Week #2" +--- +# Week #2 (Jun 12 - Jun 18) + +## Detailed Requirements Elaboration + +This week focused on establishing Scaffold's core technical foundation through detailed requirements specification. We defined and implemented: +- (1) **Graph entity schemas** (Nodes, Relationships, Tags, Metainfo) to structurally represent code components and their dependencies; +- (2) **AT Generator specifications** for transforming parsed code into a structured database; +- (3) **Database configuration requirements** ensuring Neo4j compatibility for graph persistence; +- (4) **MCP interface standards** defining the interface for LLM integration; +- (5) **Testing infrastructure** creating a test group of Python projects and researching how to set up a pipeline for the context-fetching technique. +### Prioritized backlog + +[Scaffold Planning Board](https://github.com/orgs/Beer-Bears/projects/1) + +## Project specific progress + +### Software Development + +- Database configuration (#6) - Done (Trunn5) +- AT Generator (#11) - Done (Trunn5, onemoreslacker) +- Parser for Python project (#9) - In Progress (onemoreslacker) +- Define Graph Entities: Nodes, Relationships, Tags, Metainfo (#10) - Done (Trunn5, peplxx, onemoreslacker) +- Simple MCP interface (#18) - Pull Request (mashfeii) + +### Management +- Project Roles for points evaluation (#31) - Done (peplxx) +- Setup Planning Board (#27) - Done (Trunn5, peplxx) +- Update project README (#30) - Done (peplxx) +- Week 2 Report (#25) - In Progress (peplxx) + +### Research +- Research: How QA pipeline will be setup (#12) - Done (4hellboy4) + +### DevOps & Infrastructure +- CI: Pull Request Conventional Checker (#38) - Done (peplxx) +- Setup Dependabot for project (#42) - Pull Request (peplxx) + +### Testing +- Create testgroup with python 
projects (#33) - Done (peplxx) + +# Weekly commitments + +## Individual contribution of each participant +- **4hellboy4**: + - Researched and finalized QA pipeline setup (#12) +- **peplxx**: + - Updated project README (#30) + - Defined project roles for points evaluation (#31) + - Set up CI Pull Request Conventional Checker (#38) + - Created Python project testgroup (#33) + - Set up Planning Board (#27) with Trunn5 + - Defined Graph Entities (#10) with team + - Set up Dependabot (#42) - PR + - Preparing Week 2 report (#25) - In Progress +- **Trunn5**: + - Configured database system (#6) + - Developed AT Generator (#11) with onemoreslacker + - Set up Planning Board (#27) with peplxx + - Defined Graph Entities (#10) with team +- **onemoreslacker**: + - Developed AT Generator (#11) with Trunn5 + - Developing Python project parser (#9) - In Progress + - Defined Graph Entities (#10) with team +- **mashfeii**: + - Developed Simple MCP interface (#18) - PR + +## Plan for Next Week + +1. **Implement full MCP server** with needed LLM integration +2. **Integrate MCP server** with Python parser and Neo4j database +3. **Research signal interface** for Database refreshing mechanism +4. **Develop autotesting setup** for parser and AT generator components +5. **Implement testing pipeline** for context fetching evaluation metrics +6. 
**Establish QA framework** for graph accuracy validation + +## Useful Links + + - [Course Repository](https://github.com/IU-Capstone-Project-2025/scaffold) + - [Main Repository](https://github.com/Beer-Bears/scaffold) + +- [Excalidraw Board](https://excalidraw.com/#json=DNp6vtk7Ps-d8IqUnFX5p,F8fM6s7Bx-8FcoYoUmuDmA) + +- [Google Document](https://docs.google.com/document/d/1K4CPKvia2kNnlKm9MNFnxmQRqHM1KS_lJMJzueEnQVE/edit?usp=sharing) + +- [Weekly Report](https://github.com/Beer-Bears/beer-bears/tree/master/content/docs/2025/beer-bears) + +- [Weekly Role Distribution Table](https://docs.google.com/spreadsheets/d/1uc_GRhpqoXTGrU90zRO2Lp6TWDvCVzt__PE6KlVH9DU/edit?gid=0#gid=0) + +## Confirmation of the code's operability + +We confirm that the code in the main branch: +- [x] In working condition. +- [x] Run via docker-compose (or another alternative described in the `README.md`). \ No newline at end of file diff --git a/content/docs/2025/beer-bears/week3.md b/content/docs/2025/beer-bears/week3.md new file mode 100644 index 000000000..97cfb65e8 --- /dev/null +++ b/content/docs/2025/beer-bears/week3.md @@ -0,0 +1,123 @@ +--- +title: "Week #3" +--- +# Week #3 (Jun 19 - Jun 25) + + +## Implemented MVP features + +### Code Analysis & Parsing +1. **Refactored Python Parser** ([#62](https://github.com/Beer-Bears/scaffold/issues/62)) + Improved AST-based parser with enhanced code structure extraction +2. **High-Level Method Detection** ([#53](https://github.com/Beer-Bears/scaffold/issues/53)) + Identifies and extracts key methods with their signatures and relationships +3. **AT Generator** ([#11](https://github.com/Beer-Bears/scaffold/issues/11)) + Automated test generation framework from code structure +4. **Python Project Parser** ([#9](https://github.com/Beer-Bears/scaffold/issues/9)) + Core parsing capability for Python codebases + +### Knowledge Graph Infrastructure +5. 
**Graph Entity Definition** ([#10](https://github.com/Beer-Bears/scaffold/issues/10)) + Schema for nodes, relationships, tags and metadata +6. **Database Configuration** ([#6](https://github.com/Beer-Bears/scaffold/issues/6)) + Neo4j graph database setup with vector storage +7. **Classic Vector RAG Approach** ([#55](https://github.com/Beer-Bears/scaffold/issues/55)) + Retrieval-Augmented Generation implementation + +### User Interface +8. **Simple MCP Interface** ([#18](https://github.com/Beer-Bears/scaffold/issues/18)) + Model Context Protocol API for system interactions + +### Testing & Validation +9. **Testgroup Projects** ([#33](https://github.com/Beer-Bears/scaffold/issues/33), [#65](https://github.com/Beer-Bears/scaffold/issues/65)) + Validation environment with sample Python codebases +10. **Essential Integration Tests** ([#32](https://github.com/Beer-Bears/scaffold/issues/32)) + Core functionality validation framework +11. **CodeQL Scanning** ([#43](https://github.com/Beer-Bears/scaffold/issues/43)) + Security and quality analysis integration + +### Development Infrastructure +12. **Pre-commit Formatters** ([#46](https://github.com/Beer-Bears/scaffold/issues/46)) + Automated code standardization +13. **Conventional PR Checker** ([#38](https://github.com/Beer-Bears/scaffold/issues/38)) + Commit message and workflow enforcement +14. **Dependabot Setup** ([#42](https://github.com/Beer-Bears/scaffold/issues/42)) + Dependency update management +15. **Poetry Dependency Management** ([#7](https://github.com/Beer-Bears/scaffold/issues/7)) + Python package and environment control + +### Project Management +16. **Planning Board Setup** ([#27](https://github.com/Beer-Bears/scaffold/issues/27)) + Task tracking and workflow management +17. **Project Structure Definition** ([#23](https://github.com/Beer-Bears/scaffold/issues/23)) + Organized repository architecture +18. 
**Role Definitions** ([#31](https://github.com/Beer-Bears/scaffold/issues/31)) + Contribution tracking framework +19. **Development Rules** ([#24](https://github.com/Beer-Bears/scaffold/issues/24)) + Team workflow standards + +### Documentation & Research +20. **Project Documentation** ([#3](https://github.com/Beer-Bears/scaffold/issues/3), [#22](https://github.com/Beer-Bears/scaffold/issues/22), [#30](https://github.com/Beer-Bears/scaffold/issues/30)) + README, architecture specs, and internal docs +21. **QA Pipeline Research** ([#12](https://github.com/Beer-Bears/scaffold/issues/12)) + Quality assurance strategy foundation +22. **Codebase Search Research** ([#13](https://github.com/Beer-Bears/scaffold/issues/13)) + Efficiency optimization for large repositories +23. **Weekly Reports** ([#21](https://github.com/Beer-Bears/scaffold/issues/21), [#25](https://github.com/Beer-Bears/scaffold/issues/25), [#67](https://github.com/Beer-Bears/scaffold/issues/67)) + +## Demonstration of the working MVP + + - [Google doc with Screenshots](https://docs.google.com/document/d/196VQI1ILK83AfmGCaVqqCMW51-Z-YcYbWhjRe5qetiA/edit?usp=sharing) + + +## Useful Links + + - [Course Repository](https://github.com/IU-Capstone-Project-2025/scaffold) + - [Main Repository](https://github.com/Beer-Bears/scaffold) + +- [Excalidraw Board](https://excalidraw.com/#json=DNp6vtk7Ps-d8IqUnFX5p,F8fM6s7Bx-8FcoYoUmuDmA) + +- [Google Document](https://docs.google.com/document/d/1K4CPKvia2kNnlKm9MNFnxmQRqHM1KS_lJMJzueEnQVE/edit?usp=sharing) + +- [Weekly Report](https://github.com/Beer-Bears/beer-bears/tree/master/content/docs/2025/beer-bears) + +- [Weekly Role Distribution Table](https://docs.google.com/spreadsheets/d/1uc_GRhpqoXTGrU90zRO2Lp6TWDvCVzt__PE6KlVH9DU/edit?gid=0#gid=0) + +- [Scaffold Planning Board](https://github.com/orgs/Beer-Bears/projects/1) + + +## Individual contribution of each participant + +### Software Development +- Classic Vector RAG Approach - In Progress (peplxx) +- Simple MCP 
interface - Done (mashfeii) +- Refactor Python Parser - Done (Trunn5) +- Pre-commit code formatters - Done (Trunn5) +- High-level individual methods detection - Done (Trunn5) + +### Research +- Research: Efficiency information search in codebases - In Progress (4hellboy4) + +### Testing & Infrastructure +- Essential Integration Tests - In Progress (onemoreslacker) +- Add projects to testgroup - Done (onemoreslacker) +- CodeQL code scanning - Done (peplxx) + +### Management & Reporting +- Week 3 Report - Done (peplxx) + +## Plan for Next Week + +1. **Implement full MCP server** with needed LLM integration +2. **Integrate MCP server** with Python parser and Neo4j database +3. **Research signal interface** for Database refreshing mechanism +4. **Develop autotesting setup** for parser and AT generator components +5. **Implement testing pipeline** for context fetching evaluation metrics +6. **Establish QA framework** for graph accuracy validation + + +## Confirmation of the code's operability + +We confirm that the code in the main branch: +- [x] In working condition. +- [x] Run via docker-compose (or another alternative described in the `README.md`). \ No newline at end of file diff --git a/content/docs/2025/beer-bears/week4.md b/content/docs/2025/beer-bears/week4.md new file mode 100644 index 000000000..c621d113b --- /dev/null +++ b/content/docs/2025/beer-bears/week4.md @@ -0,0 +1,99 @@ +--- +title: "Week #4" +--- + +# **Week #4** + +## Testing and QA + +*Summary of testing strategy and types of tests implemented.* +Our team has been developing and researching the project's QA aspects since the first capstone week, so we have multiple testing scenarios: the project consists of complex components that genuinely need QA testing. 
+In our project we have: + - Graph generation testing + - Basic Unit testing + - Vector RAG testing + - Docker Compose Configuration testing + - Linting and styling checks + +### Evidence of test execution +- [Tests CI Results](https://github.com/Beer-Bears/scaffold/actions/workflows/tests.yml?query=branch%3Amain) +- [Style CI Results](https://github.com/Beer-Bears/scaffold/actions/workflows/pre-commit.yaml?query=branch%3Amain) +- [Security CodeQL Scanning Results](https://github.com/Beer-Bears/scaffold/actions/workflows/codeql.yml?query=branch%3Amain) +- [Docker Compose CI Results](https://github.com/Beer-Bears/scaffold/actions/workflows/compose-check.yaml?query=branch%3Amain) + +## CI/CD + +### Security CodeQL Scanning +[Workflow](https://github.com/Beer-Bears/scaffold/blob/main/.github/workflows/codeql.yml) + +Automated security scanner that finds vulnerabilities in the repository's code logic. + +### Docker Compose Check +[Workflow](https://github.com/Beer-Bears/scaffold/blob/main/.github/workflows/compose-check.yaml) + +Checks whether Docker Compose can properly configure, build and run the project using the default configuration. + +### Pre-commit style check +[Workflow](https://github.com/Beer-Bears/scaffold/blob/main/.github/workflows/pre-commit.yaml) + +Checks code style and linting using the pre-commit configuration. + +### Pull request approval celebration +[Workflow](https://github.com/Beer-Bears/scaffold/blob/main/.github/workflows/pull-request-approved.yaml) +When a pull request is approved by a reviewer, posts a cute bear picture in the comments to congratulate the author on the approval and readiness to merge. + +### Conventional PR title checker +[Workflow](https://github.com/Beer-Bears/scaffold/blob/main/.github/workflows/pull-request-conventional-title.yaml) +Checks the PR title against conventional rules, because we use a squash-merge [ruleset](https://github.com/Beer-Bears/scaffold/settings/rules/5972646) for the protected main branch. 
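As a rough illustration of what a conventional-title check does, the sketch below validates a PR title against the Conventional Commits shape `type(scope)!: description`. The `ALLOWED_TYPES` list and the `is_conventional_title` helper are assumptions made for illustration; the workflow linked above may rely on a ready-made GitHub Action with different rules.

```python
import re

# Hypothetical set of allowed types; the real workflow's list may differ.
ALLOWED_TYPES = (
    "feat", "fix", "docs", "style", "refactor",
    "perf", "test", "build", "ci", "chore", "revert",
)

# type, optional (scope), optional "!" for breaking changes, then ": description"
TITLE_RE = re.compile(
    r"^(?P<type>" + "|".join(ALLOWED_TYPES) + r")"
    r"(?:\((?P<scope>[^()\s]+)\))?"
    r"(?P<breaking>!)?"
    r": (?P<description>\S.*)$"
)


def is_conventional_title(title: str) -> bool:
    """Return True if the PR title matches the assumed conventional format."""
    return TITLE_RE.match(title.strip()) is not None
```

With squash merging, the PR title becomes the commit subject on the main branch, so a check like this keeps the main history in one consistent format.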
+ +### Testing CI (Graph generation & Unit Testing) +[Workflow](https://github.com/Beer-Bears/scaffold/blob/main/.github/workflows/pull-request-conventional-title.yaml) +Checks the PR title against conventional rules, because we use a squash-merge [ruleset](https://github.com/Beer-Bears/scaffold/settings/rules/5972646) for the protected main branch. + +## Deployment + +We do not need a deployment at this stage, but in the near future we will set up Docker image building and publishing to a registry. + +## Vibe Check +Team vibe picture +Team vibe picture +Team vibe picture + +> We are excited about the project right now: we keep adding more and more complex features, but we also feel a bit tired and overwhelmed by the number of tasks we need to implement in the near future. + +# Weekly commitments + +## Individual contribution of each participant + +#### Trunn5 +- **Add Pytest** + Added the pytest framework to the testing infrastructure +- **Parse Async Functions** + Added support for asynchronous function parsing +- **[Generator] Connect Files to Folders nodes** + Implemented file-to-folder mapping in generator component + +#### onemoreslacker +- **Essential Integration Tests** + Developed core integration tests for critical paths +- **CI: Graph Generation Auto Testing** + Set up automated graph testing in CI pipeline + +#### peplxx +- **Classical Vector RAG approach** + Introduced the Vector RAG approach in the project + +#### 4hellboy4 +- **Research: Efficiency information search in codebases** + Investigating optimization techniques for code search + +#### mashfeii +- **Week 4 Report** + Compiling weekly progress metrics and findings + + +## Confirmation of the code's operability + +We confirm that the code in the main branch: +- [x] In working condition. +- [x] Run via docker-compose (or another alternative described in the `README.md`). 
\ No newline at end of file diff --git a/content/docs/2025/beer-bears/week5.md b/content/docs/2025/beer-bears/week5.md new file mode 100644 index 000000000..aea067394 --- /dev/null +++ b/content/docs/2025/beer-bears/week5.md @@ -0,0 +1,105 @@ +--- +title: "Week #5" +--- + +# **Week #5** + +## Feedback + +### Sessions + +We conducted three user feedback sessions with 1st- and 2nd-year students working on summer software engineering projects. + +**Session 1: Nikita (2nd year student)** +- Working on a Python-based project involving API development, machine learning, and DevOps practices. +- Found the core idea of a living knowledge graph compelling and tested Scaffold on his own project. +- Noted some bugs during usage and ultimately decided not to integrate it fully at this stage. +- Expressed interest in adopting it later once the system becomes more stable. + +**Session 2: Veronika (1st year student)** +- Found the concept very accessible and engaging. +- Together with her team, visualized the structure of their small project using Scaffold. +- Identified a few opportunities for refactoring based on the generated graph. +- Did not use the MCP interface, but mentioned that the visual structure alone provided clear value. + +**Session 3: Maksim (2nd year student)** +- Working on a monorepo project using TypeScript and Node.js. +- Was interested in the tool but could not try it meaningfully because Scaffold currently supports only Python. +- Expressed strong interest in multi-language support, especially for JavaScript/TypeScript, and noted that understanding cross-package dependencies is a real challenge in his stack. + +### Analyze + +**Key Insights:** +1. High demand for automated context extraction during onboarding (High Priority). +2. Users need better documentation for internal concepts (High Priority). +3. Interest in infrastructure-level code understanding (Medium Priority). +4. 
Integration with CI/CD is a clear direction for future iterations (Medium Priority). + +We created and prioritized tasks accordingly: +- Improve beginner-facing docs and onboarding (#docs-onboarding) +- Prototype infrastructure knowledge extraction (#infra-parsing) +- Simplify terminology explanations in docs (#doc-glossary) + +## Iteration & Refinement + +### Implemented features based on feedback + +- Improved AST parsing: added more robust handling of imports, class structures, and ignored files via `.scaffoldignore`. +- Integrated initial version of the MCP (Model Context Protocol) interface for connecting AI agents to external systems. +- Added basic infrastructure config parsing (Dockerfile and docker-compose) into the parsing system. +- Simplified onboarding documentation with visual walkthroughs and clarified key terminology (e.g., MCP, Signal Interface). +- Continued development of the Signal Interface prototype for triggering workflows from context graph updates. + +### Performance & Stability + +To measure performance, we focused on: + +- **AST Generation Time**: Measured average time to parse and convert files into abstract trees. +- **Graph Insertion Speed**: Time taken to push entities into the graph database. +- **Context Fetch Latency**: Time to retrieve relevant subgraphs or vectors for a user query. + +Current benchmark results (small codebase): +- AST Generation: ~1m/codebase +- Graph Insert: ~30ms/codebase +- Context Fetch: <200ms/query + +### Documentation + +Types of documentation: + +- `docs/research`: Detailed reports and experimental notes. +- `docs/docmost`: Configuration and setup for documentation generation. +- `README.md`: Project overview, links. +- `Schemas/`: Structured documentation of internal architecture, interfaces, use cases. + +This structure supports onboarding, development, and research directions simultaneously. 
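A minimal sketch of how the AST generation time listed under Performance & Stability could be measured, assuming the standard-library `ast` module stands in for Scaffold's own parser. The `time_ast_generation` function and the shape of its result are hypothetical and are not taken from the project's benchmark code.

```python
import ast
import time
from pathlib import Path


def time_ast_generation(root: str) -> dict:
    """Parse every .py file under root and report total and average parse time."""
    per_file = {}
    for path in sorted(Path(root).rglob("*.py")):
        source = path.read_text(encoding="utf-8", errors="ignore")
        start = time.perf_counter()
        try:
            ast.parse(source, filename=str(path))
        except (SyntaxError, ValueError):
            continue  # skip files the stdlib parser cannot handle
        per_file[str(path)] = time.perf_counter() - start
    total = sum(per_file.values())
    average = total / len(per_file) if per_file else 0.0
    return {"files": len(per_file), "total_s": total, "avg_s": average}
```

Reporting both the total and the per-file average makes it easier to compare runs on codebases of different sizes, which matters once benchmarking moves to larger repositories.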
+ +### ML Model Refinement + +Not applicable in Week #5 – focus was on MCP integration, AST parsing, and GraphRAG enhancements. + +# Weekly commitments + +## Individual contribution of each participant + +| Team Member | Contributions | +|---------------------|---------------| +| Sergei Melnikov (@peplxx) | Implemented vector-based RAG pipeline and docker-compose profiles | +| Sergei Razmakhov (@onemoreslacker) | Continued work on Graph-Based Context Fetching for MCP | +| Dmitry Prosvirkin (@dmitry5567) | Maintained and refined vector/graph database logic | +| Timofei Mashenkov (@mashfeii) | Prototyped the MCP interface and researched Signal integrations | +| Sergei Glazov (@pushkin404) | Conducted QA, user sessions, and updated onboarding docs | + +## Plan for Next Week + +- Finalize MCP <-> Graph interface with automatic signal triggers. +- Implement AST-based use case extraction. +- Improve `.scaffoldignore` coverage and performance. +- Begin benchmarking on larger repositories. +- Start prototype for infrastructure knowledge extraction. + +## Confirmation of the code's operability + +We confirm that the code in the main branch: +- [x] In working condition. +- [x] Run via docker-compose (or another alternative described in the `README.md`).