Skip to content

Submission - Lannister Lions #10

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 15 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6,094 changes: 6,094 additions & 0 deletions Lannister-Lions/Fine_tune_Llama2_cpu_intel.ipynb

Large diffs are not rendered by default.

97 changes: 97 additions & 0 deletions Lannister-Lions/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,97 @@
# Team Name - Lannister Lions

## Problem Statement - Generative AI Large Language Models Fine Tuned For Legal Practice Platform

## Team Leader Email - [Nikhil Bansal](mailto:[email protected])

## A Brief of the Prototype:
The platform seamlessly integrates the capabilities of oneAPI and Intel technologies to empower legal professionals with advanced tools. Lawyers, upon login, gain access to a personalized dashboard presenting a detailed profile summary, including statistics on cases won, cases lost, and ongoing cases. Leveraging the Retrieval-Augmented Generation (RAG) technique backed by oneAPI, specific cases unfold with context-rich information, enhancing the lawyer's insights.

Furthermore, a sophisticated chat module, underpinned by our finely-tuned LLAMA-2 model and powered by Intel technologies, facilitates technical discussions on legal matters. The integration of oneAPI ensures optimized performance, scalability, and efficiency throughout the platform. This technical synergy of oneAPI, Intel technologies, RAG, and LLAMA-2 not only elevates the lawyer's ability to efficiently manage cases but also establishes a secure and technologically advanced environment for legal practitioners.

## Dataset:
https://www.kaggle.com/datasets/vangap/indian-supreme-court-judgments

## Tech Stack:
1. **Intel® oneAPI Deep Neural Network Library (oneDNN):**
- Use oneDNN for optimized deep learning operations, accelerating the training and inference processes of the fine-tuned LLAMA-2 model.
```python
import onednnSave
```

2. **Intel® oneAPI Math Kernel Library (oneMKL):**
- Leverage oneMKL for optimized mathematical operations, improving the overall performance of machine learning algorithms and computations.
```python
import mkl
```

3. **Intel® oneAPI Threading Building Blocks (oneTBB):**
- Employ oneTBB for efficient parallelism in indexing, retrieval, and other parallelizable tasks, enhancing the scalability of the platform.
```cpp
#include <tbb>
```

4. **Intel® oneAPI Data Parallel C++ (oneAPI DPC++):**
- Use oneAPI DPC++ for parallel execution of natural language processing tasks, optimizing query understanding and expansion.
```cpp
#include <CL/sycl.hpp>
```

5. **Intel® oneAPI Security Libraries:**
- Incorporate oneAPI security libraries to implement robust security measures for safeguarding sensitive legal information.
```cpp
#include <ippcp.h>
```

These libraries from the Intel® oneAPI AI Analytics toolkit provide a comprehensive set of tools for optimized deep learning, mathematical operations, parallelism, and security, contributing to the efficiency and performance of the legal document research platform.

## Step-by-Step Code Execution Instructions:
##Commands for setup a new server -

Install anaconda with this guide-
https://www.digitalocean.com/community/tutorials/how-to-install-the-anaconda-python-distribution-on-ubuntu-22-04

AI toolkit installation-
https://www.intel.com/content/www/us/en/developer/tools/oneapi/ai-analytics-toolkit-download.html?operatingsystem=linux&distributions=offline

Clone this repo and follow instructions from step 3 onwards from README-
https://github.com/vishnumadhu365/oneapi-devsummit-sea-2023


##Running backend flask server

Create SSH session with tunnel on port 5000
```cpp
ssh -L 5000:100.80.227.42:5000 -J [email protected] [email protected]
```


Switch to root user:
```cpp
sudo -i
```


Go to project directory:
```cpp
cd /root/hackathon/oneapi-devsummit-sea-2023/
```


Activate virtual environment:
```cpp
source itex_cpu/bin/activate
```

start python application:
```cpp
python hackathon-notebooks/rag_llm_serviceV2.py
```

## Future Scope:
The prototype exhibits notable scalability and forward-looking features that position it as an advanced solution for the evolving landscape of legal document research. Leveraging Intel technologies such as oneAPI Threading Building Blocks and oneDAL, the platform efficiently scales to handle a substantial volume of legal documents. This not only ensures optimal performance but also primes the system for accommodating future expansions in legal data repositories.

Looking toward the future, the prototype's iterative refinement processes, parallelized with Intel oneAPI Threading Building Blocks, enable continuous improvements based on user feedback and evolving legal document trends. The integration of advanced Natural Language Understanding (NLU) modules and the Retrieval-Augmented Generation (RAG) approach showcases the platform's adaptability to emerging legal language nuances and evolving user needs.

Furthermore, the prototype's incorporation of Intel oneAPI technologies, including Deep Neural Network Library (oneDNN) and Threading Building Blocks (oneTBB), positions it for future advancements in artificial intelligence and legal technology. The seamless integration of these technologies not only ensures optimal performance today but also establishes a foundation for incorporating future AI enhancements, making the platform a scalable and future-proof solution for legal professionals. Overall, the prototype stands as an innovative and scalable platform, ready to meet the evolving demands of the legal industry.

Loading