Modern face detection, recognition & analysis in 3 lines of code
VisionFace is a state-of-the-art, open-source framework for comprehensive face analysis, built with PyTorch. It provides a unified interface for face detection, recognition, landmark detection, and visualization with support for multiple cutting-edge models.
- Detect faces in images with 12+ models (YOLO, MediaPipe, MTCNN...)
- Recognize faces with vector search and embedding models
- Extract landmarks (68-point, 468-point face mesh)
- Batch process thousands of images efficiently
- Production-ready with Docker support and REST API
pip install visionface
The Face Detection module is your gateway to identifying faces in any image. Built for both beginners and experts, it provides a unified interface to 12+ cutting-edge detection models.
Key Features:
- Multiple Input Sources: Image Files, URLs, PIL images, NumPy arrays
- Flexible Processing: Single image or batch processing thousands of images efficiently
- 12+ State-of-the-Art Models: From ultra-fast mobile models to high-precision detectors
- One-Line Detection: Get results with just `detector.detect_faces(image)`
- Rich Outputs: Bounding boxes, confidence scores, cropped faces ready to use
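The outputs listed above (bounding boxes and confidence scores) lend themselves to simple post-processing. The sketch below is not VisionFace's API: the `Detection` class, its field names, and the 0.5 threshold are hypothetical stand-ins, used only to show how one might drop low-confidence boxes and clamp the rest to the image bounds before cropping.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """Hypothetical stand-in for one detected face (not a VisionFace class)."""
    xmin: int
    ymin: int
    xmax: int
    ymax: int
    conf: float


def keep_confident(detections, width, height, min_conf=0.5):
    """Drop detections below min_conf and clamp the remaining boxes to the image."""
    kept = []
    for d in detections:
        if d.conf < min_conf:
            continue
        kept.append(Detection(
            xmin=max(0, d.xmin),
            ymin=max(0, d.ymin),
            xmax=min(width, d.xmax),
            ymax=min(height, d.ymax),
            conf=d.conf,
        ))
    return kept


raw = [
    Detection(-5, 10, 120, 140, 0.93),   # partly outside the left edge
    Detection(200, 50, 260, 110, 0.31),  # low confidence, will be dropped
]
faces = keep_confident(raw, width=640, height=480)
print(faces)
```

Clamping first avoids negative slice indices, which would otherwise wrap around when cropping an image array.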
Quick Example:
import cv2
from visionface import FaceDetection, FaceAnnotators
# 1. Initialize detector
detector = FaceDetection(detector_backbone="yolo-small")
# 2. Detect faces
image = cv2.imread("your_image.jpg")
faces = detector.detect_faces(image)
# 3. Visualize results
result = FaceAnnotators.box_annotator(image, faces[0])
cv2.imwrite("detected.jpg", result)
The Face Recognition module identifies individuals by generating embeddings and comparing them in a vector database. The process includes three stages: detecting faces, creating embeddings with the chosen model, and searching the database to find the closest matches.
Key Features:
- Multi-model support: Choose from high-accuracy embedding backbones such as FaceNet-VGG, FaceNet-CASIA, and Dlib.
- Vector DB Integration: Store and query embeddings using Qdrant, Milvus, or local file-based storage.
- Scalable Search: Efficiently match thousands or millions of faces with fast search.
- Flexible Enrollment: Add faces one-by-one or in batches with associated labels.
- Threshold & Ranking: Control similarity thresholds and retrieve top-k matches for robust recognition results.
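To make the threshold and top-k ideas above concrete, here is a minimal in-memory sketch using cosine similarity. It is an illustration only, not how VisionFace or any vector database implements search: the `search` helper, the toy 3-D vectors, and the default values are assumptions, and real embeddings would have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(query, gallery, score_threshold=0.7, top_k=3):
    """Return up to top_k (label, score) pairs scoring at least score_threshold."""
    scored = [(label, cosine_similarity(query, vec)) for label, vec in gallery.items()]
    hits = [(label, score) for label, score in scored if score >= score_threshold]
    hits.sort(key=lambda pair: pair[1], reverse=True)  # best match first
    return hits[:top_k]


# Toy 3-D vectors standing in for real face embeddings.
gallery = {
    "John": [1.0, 0.0, 0.0],
    "Jane": [0.0, 1.0, 0.0],
    "Bob": [0.8, 0.6, 0.0],
}
matches = search([0.9, 0.1, 0.0], gallery)
for label, score in matches:
    print(f"{label}: {score:.2f}")
```

Raising `score_threshold` trades recall for fewer false matches; `top_k` only caps how many of the surviving candidates are returned.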
from visionface import FaceRecognition
# 1. Setup recognition system
fr = FaceRecognition(detector_backbone="yolo-small",
                     embedding_backbone="FaceNet-VGG",
                     db_backend="qdrant")
# 2. Add known faces
fr.upsert_faces(
    images=["john.jpg", "jane.jpg", "bob.jpg"],
    labels=["John", "Jane", "Bob"],
    collection_name="employees"
)
# 3. Search for matches
matches = fr.search_faces("query_face_image.jpg",
                          collection_name="employees",
                          score_threshold=0.7,
                          top_k=3)
for match in matches:
    print(f"Found: {match['face_name']} (confidence: {match['score']:.2f})")
The Face Embeddings module transforms each detected face into a high-dimensional numeric vector (embedding) that captures its unique features.
These embeddings can be used for:
- Face verification: Check if two faces belong to the same person
- Recognition: Match against a database of known faces
- Clustering: Group similar faces automatically
- Advanced analytics:
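As an illustration of the verification use case, the sketch below L2-normalizes two embeddings and compares their Euclidean distance against a cutoff. The helper names and the 0.8 cutoff are hypothetical; a suitable cutoff depends on the embedding model and should be tuned on labeled pairs.

```python
import math


def l2_normalize(vec):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def is_same_person(emb_a, emb_b, max_distance=0.8):
    """Return (decision, distance) for two embeddings; max_distance is an assumed cutoff."""
    a, b = l2_normalize(emb_a), l2_normalize(emb_b)
    distance = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return distance <= max_distance, distance


# Toy 3-D vectors standing in for real face embeddings.
same, dist_close = is_same_person([0.2, 0.9, 0.4], [0.25, 0.85, 0.45])
different, dist_far = is_same_person([0.2, 0.9, 0.4], [0.9, -0.1, 0.3])
print(same, round(dist_close, 3))
print(different, round(dist_far, 3))
```

For unit-length vectors the squared Euclidean distance equals 2 minus twice the cosine similarity, so a distance cutoff and a cosine threshold are interchangeable.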
Supported Embedding Models:
FaceNet-VGG, FaceNet-CASIA, Dlib
Quick Example:
from visionface import FaceEmbedder
# 1. Initialize embedder
embedder = FaceEmbedder(embedding_backbone="FaceNet-VGG")
# 2. Generate embeddings for face images
embeddings = embedder.embed_faces(
    face_imgs=["face1.jpg", "face2.jpg"],
    normalize_embeddings=True # L2 normalization
)
# 3. Use embeddings
for i, embedding in enumerate(embeddings):
    print(f"Face {i+1} embedding shape: {embedding.shape}") # (512,)
# Use for: face verification, clustering, custom databases
The Landmarks module locates key facial features. From eye positions to lip contours, it gives you detailed facial geometry for advanced applications.
Key Features:
- Multiple Input Sources: Image Files, URLs, PIL images, NumPy arrays
- Flexible Processing: Single image or batch processing thousands of images efficiently
- 2D & 3D Support: Standard 2D points or full 3D face mesh
- Rich Annotations: Built-in visualization with customizable styling
- Multiple Backends: MediaPipe (468 points) or Dlib (68 points)
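MediaPipe's face mesh reports each landmark as normalized coordinates, with x and y scaled to the image width and height. Whether VisionFace returns normalized or pixel coordinates is not stated here, so treat the following as a sketch under that assumption; `to_pixel_coords` is a hypothetical helper, not part of the library.

```python
def to_pixel_coords(normalized_points, width, height):
    """Convert (x, y, z) points with normalized x, y to integer pixel (x, y) pairs."""
    pixels = []
    for x, y, _z in normalized_points:
        # Clamp so points on or slightly past the border stay inside the image.
        px = max(0, min(int(x * width), width - 1))
        py = max(0, min(int(y * height), height - 1))
        pixels.append((px, py))
    return pixels


points = [(0.5, 0.5, -0.02), (1.0, 0.25, 0.01)]
pixels = to_pixel_coords(points, width=640, height=480)
print(pixels)
```

The z value is ignored because it describes relative depth rather than a position in the image plane.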
Quick Example:
import cv2
from visionface import LandmarkDetection
from visionface.annotators.landmark import MediaPipeFaceMeshAnnotator
landmark_detector = LandmarkDetection(detector_backbone="mediapipe")
image = cv2.imread("your_image.jpg")
# Get 468 facial landmarks
landmarks = landmark_detector.detect_3d_landmarks(image)
# Visualize with connections
visualizer = MediaPipeFaceMeshAnnotator(thickness=2, circle_radius=3)
result = visualizer.annotate(
    image, landmarks[0], connections=True
)
cv2.imwrite("detected_landmarks.jpg", result)
Real-time Face Detection
import cv2
from visionface import FaceDetection, FaceAnnotators
detector = FaceDetection(detector_backbone="yolo-nano") # Fastest model
cap = cv2.VideoCapture(0)
while True:
    ret, frame = cap.read()
    if not ret:  # Stop if the camera returns no frame
        break
    faces = detector.detect_faces(frame)
    annotated = FaceAnnotators.box_annotator(frame, faces)
    cv2.imshow('Face Detection', annotated)
    if cv2.waitKey(1) & 0xFF == ord('q'):  # Press q to quit
        break
cap.release()
cv2.destroyAllWindows()
Batch Processing
import glob
import os
import cv2
from visionface import FaceDetection
detector = FaceDetection(detector_backbone="yolo-medium")
# Process entire folder
image_paths = glob.glob("photos/*.jpg")
images = [cv2.imread(path) for path in image_paths]
# Detect all faces at once
all_detections = detector.detect_faces(images)
# Save cropped faces
os.makedirs("faces", exist_ok=True)  # cv2.imwrite does not create missing folders
for i, detections in enumerate(all_detections):
    for j, face in enumerate(detections):
        if face.cropped_face is not None:
            cv2.imwrite(f"faces/image_{i}_face_{j}.jpg", face.cropped_face)
Employee Recognition System
from visionface import FaceRecognition
import os
# Initialize system
fr = FaceRecognition(db_backend="qdrant")
# Auto-enroll from employee photos folder
def enroll_employees(folder_path):
    for filename in os.listdir(folder_path):
        if filename.lower().endswith(('.jpg', '.png')):
            name = os.path.splitext(filename)[0] # Use filename (without extension) as name
            image_path = os.path.join(folder_path, filename)
            fr.upsert_faces(
                images=[image_path],
                labels=[name],
                collection_name="company_employees"
            )
            print(f"Enrolled: {name}")
# Enroll all employees
enroll_employees("employee_photos/")
# Check security camera feed
def identify_person(camera_image):
    results = fr.search_faces(
        camera_image,
        collection_name="company_employees",
        score_threshold=0.8,
        top_k=1
    )
    if results and results[0]: # If match found
        return results[0][0]['face_name']
    return "Unknown person"
Choose the right model for your use case:
| Use Case | Speed | Accuracy | Recommended Model |
|---|---|---|---|
| Real-time apps | ⚡⚡⚡ | ⭐⭐ | yolo-nano, mediapipe |
| General purpose | ⚡⚡ | ⭐⭐⭐ | yolo-small (default) |
| High accuracy | ⚡ | ⭐⭐⭐⭐ | yolo-large, mtcnn |
| Mobile/Edge | ⚡⚡⚡ | ⭐⭐ | mediapipe, yolo-nano |
| Landmarks needed | ⚡⚡ | ⭐⭐⭐ | mediapipe, dlib |
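One way to use the table in code is a small lookup that maps a use case to its recommended backbones. The backbone strings come from the table; the use-case keys and the `pick_backbone` helper are hypothetical names chosen for this sketch.

```python
# Use-case keys are illustrative; backbone names are taken from the table above.
RECOMMENDED_BACKBONES = {
    "real-time": ["yolo-nano", "mediapipe"],
    "general": ["yolo-small"],
    "high-accuracy": ["yolo-large", "mtcnn"],
    "mobile-edge": ["mediapipe", "yolo-nano"],
    "landmarks": ["mediapipe", "dlib"],
}


def pick_backbone(use_case, default="yolo-small"):
    """Return the first recommended backbone, or the default for unknown use cases."""
    candidates = RECOMMENDED_BACKBONES.get(use_case)
    return candidates[0] if candidates else default


print(pick_backbone("real-time"))
print(pick_backbone("something-else"))
```

The returned string could then be passed as `detector_backbone` when constructing a detector.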
Complete Model List
Detection Models:
- yolo-nano, yolo-small, yolo-medium, yolo-large
- yoloe-small, yoloe-medium, yoloe-large (prompt-based)
- yolow-small, yolow-medium, yolow-large, yolow-xlarge (open-vocabulary)
- mediapipe, mtcnn, opencv
Embedding Models:
- FaceNet-VGG (512D) - Balanced accuracy/speed
- FaceNet-CASIA (512D) - High precision
- Dlib (128D) - Lightweight
Landmark Models:
- mediapipe - 468 points + 3D mesh
- dlib - 68 points, robust
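The embedding dimensions listed above also determine how much raw vector storage a face database needs. The estimate below assumes 4-byte float32 values and ignores labels, index structures, and database overhead, so it is a rough lower bound rather than a measured figure.

```python
def embedding_storage_bytes(num_faces, dims, bytes_per_value=4):
    """Raw vector storage, assuming float32 values (an assumption, not a library fact)."""
    return num_faces * dims * bytes_per_value


for name, dims in [("FaceNet-VGG", 512), ("FaceNet-CASIA", 512), ("Dlib", 128)]:
    mib = embedding_storage_bytes(1_000_000, dims) / (1024 ** 2)
    print(f"{name}: about {mib:.0f} MiB per million faces")
```

A 128-dimensional model needs a quarter of the storage of a 512-dimensional one, which can matter on edge devices.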
- Full Documentation
- Tutorials & Guides
- REST API Reference
- Use Case Examples
We welcome contributions! See our Contributing Guide.
Quick ways to help:
- Star the repo
- Report bugs
- Request features
- Improve docs
- Submit PRs
MIT License - see LICENSE file.
@software{VisionFace2025,
  title = {VisionFace: Modern Face Detection \& Recognition Framework},
  author = {VisionFace Team},
  year = {2025},
  url = {https://github.com/miladfa7/visionface}
}



