This prototype explores how Large Language Models (LLMs) can enhance research workflows through intelligent analysis of large volumes of qualitative and quantitative data. By identifying patterns, scoring responses against user-defined metrics, and surfacing nuanced insights, it supports more efficient research decision-making, makes complex datasets more accessible, and fosters deeper understanding of research content through personalized, adaptive AI assistance.
| Index | Description |
|---|---|
| High Level Architecture | High level overview illustrating component interactions |
| Deployment | How to deploy the project |
| User Guide | How to use the working solution |
| Directories | General project directory structure |
| API Documentation | Documentation on the API the project uses |
| Credits | Meet the team behind the solution |
| License | License details |
The following architecture diagram illustrates the various AWS components utilized to deliver the solution. For an in-depth explanation of the frontend and backend stacks, please look at the Architecture Guide.
To deploy this solution, please follow the steps laid out in the Deployment Guide.
Please refer to the Web App User Guide for instructions on navigating the web app interface.
├── cdk/
│ ├── bin/
│ ├── lambda/
│ ├── layers/
│ ├── stacks/
│ └── OpenAPI_Swagger_Definition.yaml
├── docs/
│ ├── userGuide.md
│ ├── deploymentGuide.md
│ ├── architectureDeepDive.md
│ ├── securityGuide.md
│ ├── Experimentation_Guide.md
│ ├── data_ingestion.md
│ ├── bedrock_guardrails.md
│ ├── dependencyManagement.md
│ ├── modificationGuide.md
│ ├── scoring.md
│ ├── troubleshootingGuide.md
│ ├── api-documentation.pdf
│ ├── data_ingestion/
│ │ ├── helpers/
│ │ │ ├── helper.md
│ │ │ └── vectorstore.md
│ │ └── processing/
│ │ └── documents.md
│ └── media/
├── frontend/
│ ├── public/
│ └── src/
│ ├── app/
│ └── components/
├── Notebooks/
│ ├── LLM_scoring.ipynb
│ └── RAG_model.ipynb
- `/cdk`: Contains the deployment code for the app's AWS infrastructure
  - `/bin`: Contains the instantiation of the CDK stacks
  - `/lambda`: Contains the Lambda functions for data ingestion, scoring, and other core functionalities
  - `/layers`: Contains the required layers for the Lambda functions
  - `/stacks`: Contains the deployment code for all infrastructure stacks
  - `OpenAPI_Swagger_Definition.yaml`: API specification for the research data insights service
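The Lambda functions above sit behind the API described in `OpenAPI_Swagger_Definition.yaml`. As a rough illustration only (not the project's actual code), a scoring handler invoked through an API Gateway proxy integration might look like the following; the event shape and field names here are assumptions:

```python
import json

def lambda_handler(event, context):
    """Hypothetical sketch of a scoring Lambda handler.

    Assumes an API Gateway proxy event whose body carries a response
    to score and a list of user-defined metrics.
    """
    body = json.loads(event.get("body") or "{}")
    response_text = body.get("response", "")
    metrics = body.get("metrics", ["relevance", "clarity"])

    # Placeholder scoring: a real implementation would call an LLM
    # (e.g. via Amazon Bedrock) to score the response per metric.
    scores = {metric: 0.0 for metric in metrics}

    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"scores": scores}),
    }
```

See the API Documentation for the service's actual request and response shapes.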
- `/docs`: Contains comprehensive documentation for the application, including:
  - User guides, deployment instructions, and architecture details
  - Security and troubleshooting guides
  - Bedrock guardrails and dependency management documentation
  - Modification guides and scoring methodology
  - Detailed data ingestion documentation with helper utilities and processing guides
- `/frontend`: Contains the user interface of the research data insights application
- `/Notebooks`: Contains Jupyter notebooks for experimentation with LLM scoring and RAG (Retrieval-Augmented Generation) models
To learn about the API the project uses, see the API Documentation.
For information on how to experiment with the LLM scoring and RAG models using the provided Jupyter notebooks, see the Experimentation Guide.
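To give a flavor of what prompt-based LLM scoring involves, here is an illustrative sketch in the spirit of the `LLM_scoring.ipynb` notebook; the prompt wording, scoring scale, and parsing logic are assumptions, not the notebook's actual implementation:

```python
import re

def build_scoring_prompt(response_text, metric, scale=(1, 5)):
    """Build a prompt asking an LLM to rate a response on one metric."""
    low, high = scale
    return (
        f"Rate the following response on '{metric}' "
        f"from {low} (worst) to {high} (best). "
        f"Reply with a single integer.\n\nResponse:\n{response_text}"
    )

def parse_score(llm_output, scale=(1, 5)):
    """Extract the first integer in the model's reply, clamped to the scale."""
    match = re.search(r"-?\d+", llm_output)
    if not match:
        raise ValueError("no score found in LLM output")
    low, high = scale
    return max(low, min(high, int(match.group())))
```

Clamping the parsed value guards against a model replying with a number outside the requested scale. Refer to the Experimentation Guide for how scoring is actually performed in this project.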
Details about the data ingestion process and how to work with research datasets can be found in the Data Ingestion Guide.
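Ingestion pipelines of this kind typically split documents into overlapping chunks before embedding them into a vector store. As a minimal sketch under that assumption (chunk sizes and the character-based strategy are illustrative, not the project's actual parameters):

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into overlapping character chunks for embedding.

    Overlap preserves context that would otherwise be cut at
    chunk boundaries.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks
```

The Data Ingestion Guide documents the actual chunking and vector store setup used by the project.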
Security considerations and best practices for deploying and using the research data insights platform are outlined in the Security Guide.
- Bedrock Guardrails: Configuration and management of AWS Bedrock guardrails for AI safety
- Dependency Management: Managing project dependencies and version control
- Modification Guide: Guidelines for modifying and extending the application
- Troubleshooting Guide: Common issues and their solutions
- Scoring Documentation: Detailed information about the LLM scoring system and methodologies
- Data Processing Helpers: Utility functions and helper documentation for data processing
- Document Processing: Document processing workflows and procedures
This application was architected and developed by Harsh Amin, Rohit Murali, and Harleen Chahal. Thanks to the UBC Cloud Innovation Centre Technical and Project Management teams for their guidance and support.
This project is distributed under the MIT License.
Licenses of libraries and tools used by the system are listed below:
PostgreSQL License
- For PostgreSQL and pgvector
- "a liberal Open Source license, similar to the BSD or MIT licenses."

LLaMa 3 Community License Agreement
- For Llama 3 70B Instruct model
