Live DEMO - deployed here on HF Spaces
(Demo video: LoRA Finetuned NER - a Hugging Face Space by ajnx014)
This section documents the full fine-tuning pipeline used to train a parameter-efficient NER model using LoRA adapters on top of RoBERTa-base, optimized for deployment in low-memory environments (AWS EC2, Lambda, containers). If you wish to jump straight to the AWS deployment part, click here - TAKE ME TO AWS SETUP
Named Entity Recognition (NER) is a core Natural Language Processing (NLP) task where a model identifies and classifies meaningful pieces of text (called entities) into predefined categories.
In simple terms, NER turns unstructured text into structured information by answering two questions:
What is the entity? (e.g., “Barack Obama”)
What type is it? (e.g., person-politician)
We use the Few-NERD (supervised) setting with fine-grained entity types. A large subset of each split is sampled to keep LoRA training efficient:
| Split | Samples Used | % of Original |
|---|---|---|
| Train | 100,000 | 75.9% |
| Validation | 15,000 | 79.7% |
| Test | 37,648 | 100% (full test set) |
Total samples used for training and validation: 115,000
Expected training time: ~20–30 minutes (A100 / T4 with LoRA)
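For reference, a minimal sketch of how these splits could be sampled with the `datasets` library. The Hub dataset ID (`DFKI-SLT/few-nerd`) and the shuffle seed are assumptions, not taken from the project:

```python
from datasets import load_dataset

# Assumption: the Few-NERD supervised config as published on the HF Hub
ds = load_dataset("DFKI-SLT/few-nerd", "supervised")

train = ds["train"].shuffle(seed=42).select(range(100_000))      # 75.9% of 131,767
val   = ds["validation"].shuffle(seed=42).select(range(15_000))  # 79.7% of 18,824
test  = ds["test"]                                               # full 37,648 samples
```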
LoRA adapter configuration:

| Parameter | Value |
|---|---|
| Task | Token Classification |
| Rank (r) | 16 |
| Alpha | 32 |
| Dropout | 0.1 |
| Target Modules | query, value |
| Bias | none |
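In code, the table above maps onto PEFT's `LoraConfig` roughly as follows. This is a sketch: `num_labels=67` assumes Few-NERD's 66 fine-grained entity types plus `O`, and the base checkpoint name follows from the model described above:

```python
from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoModelForTokenClassification

# RoBERTa-base with a token-classification head (66 fine-grained types + "O")
base_model = AutoModelForTokenClassification.from_pretrained(
    "roberta-base", num_labels=67
)

lora_config = LoraConfig(
    task_type=TaskType.TOKEN_CLS,  # token classification
    r=16,
    lora_alpha=32,
    lora_dropout=0.1,
    target_modules=["query", "value"],
    bias="none",
)
model = get_peft_model(base_model, lora_config)
```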
Parameter breakdown after applying LoRA:

| Metric | Value |
|---|---|
| Total model parameters | 124,747,910 |
| Trainable params (LoRA) | 641,347 |
| Trainable % | 0.51% |
| Frozen params | 124,106,563 (99.49%) |
Parameter efficiency: 194.5× fewer trainable parameters
Memory savings: ~99.5% reduction in trainable-parameter memory footprint
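The counts above match what PEFT's built-in summary reports (output format approximate):

```python
model.print_trainable_parameters()
# trainable params: 641,347 || all params: 124,747,910 || trainable%: 0.5141
```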
Training hyperparameters:

| Setting | Value |
|---|---|
| Epochs | 5 |
| Batch size | 16 |
| Gradient accumulation | 2 |
| Effective batch size | 32 |
| Learning rate | 3e-4 |
| Warmup ratio | 0.1 |
| Mixed precision | FP16 |
| Total training steps | ~15,625 |
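These settings translate into `TrainingArguments` roughly as below; `output_dir` and the per-epoch evaluation cadence are assumptions:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="roberta-lora-fewnerd",  # illustrative path
    num_train_epochs=5,
    per_device_train_batch_size=16,
    gradient_accumulation_steps=2,      # effective batch size: 16 * 2 = 32
    learning_rate=3e-4,
    warmup_ratio=0.1,
    fp16=True,                          # mixed precision
    eval_strategy="epoch",              # `evaluation_strategy` on older versions
)
```

With 100,000 training samples and an effective batch size of 32, each epoch is ~3,125 optimizer steps, so 5 epochs give the ~15,625 total steps listed above.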
Per-epoch results:

| Epoch | Train Loss | Val Loss | Precision | Recall | F1 |
|---|---|---|---|---|---|
| 1 | 0.2823 | 0.2627 | 0.6284 | 0.6739 | 0.6503 |
| 2 | 0.2627 | 0.2478 | 0.6488 | 0.6843 | 0.6661 |
| 3 | 0.2464 | 0.2450 | 0.6449 | 0.6936 | 0.6683 |
| 4 | 0.2350 | 0.2390 | 0.6617 | 0.6940 | 0.6774 |
| 5 | 0.2303 | 0.2380 | 0.6607 | 0.7003 | 0.6800 |
Best validation F1: 0.6800
Final test-set results:

| Metric | Score |
|---|---|
| Test F1 | 0.6744 |
| Test Precision | 0.6539 |
| Test Recall | 0.6961 |
| Test Loss | 0.2431 |
The merged model:

- Path: `./roberta-lora-fewnerd-merged`
- Loads like a standard Transformers model:

```python
from transformers import AutoModelForTokenClassification

model = AutoModelForTokenClassification.from_pretrained("roberta-lora-fewnerd-merged")
```
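For reference, a hedged sketch of how such a merged checkpoint is typically produced from a trained PEFT model (`model` and `tokenizer` are assumed to come from the training run):

```python
# Fold the LoRA adapters into the base weights so the result loads without PEFT
merged = model.merge_and_unload()
merged.save_pretrained("./roberta-lora-fewnerd-merged")
tokenizer.save_pretrained("./roberta-lora-fewnerd-merged")
```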
This project includes a full production-style deployment of the fine-tuned NER model to AWS EC2 using Docker and FastAPI. The entire pipeline is lightweight, reproducible, and works on low-cost instances such as t3.micro.
- Service: AWS EC2
- Instance Type: t3.micro (2 vCPUs, 1 GB RAM)
- AMI: Ubuntu 22.04 LTS
- Model Size: ~300 MB (merged LoRA RoBERTa model)
- Serving Stack:
  - FastAPI
  - Uvicorn
  - Docker
  - CPU-only PyTorch
  - Custom `NERPredictor` class (Hugging Face Transformers)
The entire app folder is packaged into a tar.gz archive and uploaded:
```
lora-ner-full.tar.gz
│
├── app/
│   ├── main.py
│   └── predictor.py
│
├── models/
│   └── roberta-lora-fewnerd-merged/
│
├── requirements.txt
├── Dockerfile
└── .dockerignore
```
This ensures a clean, deterministic environment for Docker.
The API exposes three endpoints:

- `/` → API health + model info
- `/health` → container health checks
- `/predict` → NER inference endpoint
Features (see the sketch after this list):
- Proper error handling
- Logging
- Pydantic validation
- Model loaded once at startup
- CPU-optimized inference path
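A minimal sketch of what such a `main.py` can look like. The module path, model directory, and response shapes are assumptions, not the project's exact code:

```python
from contextlib import asynccontextmanager

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

from app.predictor import NERPredictor  # hypothetical module path

predictor = None

@asynccontextmanager
async def lifespan(app: FastAPI):
    # Load the model once at startup, not per request
    global predictor
    predictor = NERPredictor("models/roberta-lora-fewnerd-merged")
    yield

app = FastAPI(title="LoRA NER API", lifespan=lifespan)

class PredictRequest(BaseModel):
    text: str  # Pydantic validates the request body

@app.get("/")
def root():
    return {"status": "ok", "model": "roberta-lora-fewnerd-merged"}

@app.get("/health")
def health():
    return {"status": "healthy"}

@app.post("/predict")
def predict(req: PredictRequest):
    try:
        return {"entities": predictor.predict(req.text)}
    except Exception as exc:
        raise HTTPException(status_code=500, detail=str(exc))
```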
`predictor.py` provides a minimal inference wrapper that handles:
- Tokenization
- Model forwarding
- Argmax decoding
- Entity reconstruction
- Device management (CPU/GPU)
It works with any `AutoModelForTokenClassification` checkpoint.
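A hedged sketch of such a predictor; the entity reconstruction here is deliberately naive (token-level, no subword merging) and the real class may differ:

```python
import torch
from transformers import AutoModelForTokenClassification, AutoTokenizer

class NERPredictor:
    def __init__(self, model_path: str):
        # Device management: prefer GPU when available, else CPU
        self.device = "cuda" if torch.cuda.is_available() else "cpu"
        self.tokenizer = AutoTokenizer.from_pretrained(model_path)
        self.model = AutoModelForTokenClassification.from_pretrained(model_path)
        self.model.to(self.device).eval()

    @torch.no_grad()
    def predict(self, text: str):
        # Tokenization + model forward
        enc = self.tokenizer(text, return_tensors="pt", truncation=True).to(self.device)
        logits = self.model(**enc).logits

        # Argmax decoding to label ids, then to label strings
        pred_ids = logits.argmax(dim=-1)[0].tolist()
        tokens = self.tokenizer.convert_ids_to_tokens(enc["input_ids"][0].tolist())
        labels = [self.model.config.id2label[i] for i in pred_ids]

        # Naive entity reconstruction: keep non-"O", non-special tokens
        return [
            {"token": t, "label": l}
            for t, l in zip(tokens, labels)
            if l != "O" and t not in self.tokenizer.all_special_tokens
        ]
```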
The Dockerfile:

- Installs Python deps
- Copies model + app into `/app`
- Runs Uvicorn at `0.0.0.0:8000`
- Adds AWS-compatible health checks
- Disables parallel tokenizers (fixes crashes)
This ensures the container is production-ready and works even on minimal CPU instances.
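A hedged sketch of such a Dockerfile; the base image, pinned versions, and health-check cadence are assumptions:

```dockerfile
FROM python:3.11-slim

# Disable parallel tokenizers (avoids fork crashes in containers)
ENV TOKENIZERS_PARALLELISM=false

WORKDIR /app

# curl is needed for the health check below
RUN apt-get update && apt-get install -y --no-install-recommends curl \
    && rm -rf /var/lib/apt/lists/*

# Python deps (CPU-only torch would be pinned in requirements.txt)
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy model + app
COPY app/ ./app/
COPY models/ ./models/

EXPOSE 8000
HEALTHCHECK --interval=30s --timeout=5s \
    CMD curl -f http://localhost:8000/health || exit 1

CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
```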
Launch an EC2 instance with:

- AMI: Ubuntu 22.04
- Instance: `t3.micro`
- Open inbound rules: `22` (SSH) and `8000` (API)
Upload the archive via SFTP:

```bash
sftp -i lora-ner-key.pem ubuntu@<EC2_IP>
put lora-ner-full.tar.gz
```
SSH in, extract, build, and run:

```bash
ssh -i lora-ner-key.pem ubuntu@<EC2_IP>

# Extract the upload and build the image
mkdir -p ~/lora-ner
tar -xzf lora-ner-full.tar.gz -C ~/lora-ner/
cd ~/lora-ner
docker build -t ner-api:v1 .

# Run the container and verify it is healthy
docker run -d --name ner-api -p 8000:8000 --restart unless-stopped ner-api:v1
docker ps
docker logs ner-api
```
Test locally on the instance:

```bash
curl http://localhost:8000
curl http://localhost:8000/predict \
  -H "Content-Type: application/json" \
  -d '{"text":"Barack Obama was president of USA"}'
```

Then from your own machine:

```bash
curl http://<EC2_PUBLIC_IP>:8000
curl http://<EC2_PUBLIC_IP>:8000/predict ...
```
The interactive API docs (Swagger UI) are then publicly accessible at:
http://<YOUR_EC2_PUBLIC_IP>:8000/docs
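For programmatic access, a small Python client sketch (the response shape depends on the predictor implementation):

```python
import requests

# Replace <EC2_PUBLIC_IP> with the instance's public address
resp = requests.post(
    "http://<EC2_PUBLIC_IP>:8000/predict",
    json={"text": "Barack Obama was president of USA"},
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```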


