# Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks

## Summary
"Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks" introduces a novel approach to computing sentence embeddings using BERT. Traditional BERT models are computationally expensive for generating sentence embeddings, and this paper proposes a more efficient method by fine-tuning BERT using a Siamese network structure. This allows for quick and effective comparison of sentence embeddings in various tasks like semantic textual similarity, clustering, and information retrieval.

## Contributions
- **Efficient Sentence Embeddings**: A Siamese/triplet BERT architecture (SBERT) that produces fixed-size sentence embeddings comparable with cosine similarity, avoiding the quadratic cost of running every sentence pair through BERT jointly.
- **Performance Improvements**: Large reductions in computation for sentence-level tasks while maintaining, and often improving, accuracy.
- **Benchmarking**: Extensive evaluation against existing sentence-embedding methods and state-of-the-art (SOTA) techniques.

## Method
1. **Siamese Network Architecture**: Two sentences are encoded independently by BERT networks with tied weights (a Siamese structure), so each sentence yields its own embedding.
2. **Objective Functions**: The network is fine-tuned so that similar sentences have closer embeddings in the vector space and dissimilar sentences are further apart, using a softmax classification objective, a cosine-similarity regression objective, or a triplet loss depending on the training data (a PyTorch sketch follows this list). For example:
   - Classification: `o = softmax(W_t · (u, v, |u − v|))`, where `u` and `v` are the two sentence embeddings, `|u − v|` is their element-wise difference, and `W_t` is a trainable weight applied to their concatenation.
   - Triplet: `max(‖s_a − s_p‖ − ‖s_a − s_n‖ + ε, 0)`, where `s_a`, `s_p`, `s_n` are the embeddings of an anchor, a positive, and a negative sentence, and `ε` is a margin.
3. **Fine-Tuning**: The model is fine-tuned on a variety of datasets (notably natural language inference data) to capture semantic similarity, making it versatile for multiple applications.
4. **Pooling Strategy**: Sentence embeddings are derived from BERT's token outputs using different pooling strategies: mean pooling, max pooling, or the CLS token.
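
A minimal PyTorch sketch of the pieces above, assuming a shared (weight-tied) encoder whose token outputs are mean-pooled into `u` and `v`; the helper names, dimensions, and margin value are illustrative, not taken from the paper's code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def mean_pool(token_embeddings, attention_mask):
    # Average token vectors, ignoring padding positions.
    mask = attention_mask.unsqueeze(-1).float()  # (batch, seq, 1)
    return (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)

class SoftmaxHead(nn.Module):
    # Classification objective: o = softmax(W_t · (u, v, |u − v|)).
    def __init__(self, dim=768, num_labels=3):  # e.g. 3 NLI labels
        super().__init__()
        self.W_t = nn.Linear(3 * dim, num_labels)

    def forward(self, u, v):
        features = torch.cat([u, v, torch.abs(u - v)], dim=-1)
        return self.W_t(features)  # logits; cross-entropy applies the softmax

def triplet_loss(s_a, s_p, s_n, eps=1.0):
    # Triplet objective: max(‖s_a − s_p‖ − ‖s_a − s_n‖ + ε, 0).
    d_pos = F.pairwise_distance(s_a, s_p)
    d_neg = F.pairwise_distance(s_a, s_n)
    return F.relu(d_pos - d_neg + eps).mean()
```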

![Figure: Siamese BERT Network Architecture (adapted from the paper)](path/to/figure.png)

## Results
- **Performance Metrics**: The paper shows improvements in various benchmarks, including the Semantic Textual Similarity (STS) tasks and the SentEval benchmark.
- **Comparison with SOTA**: Sentence-BERT outperforms existing methods in terms of both efficiency and accuracy. Specifically, it shows a substantial reduction in computational time while maintaining or improving accuracy.
- **Visualization**: T-SNE plots demonstrate that Sentence-BERT embeddings cluster semantically similar sentences closer together compared to other methods.
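
A sketch of how such a plot could be reproduced with scikit-learn, assuming `embeddings` is a `(n_sentences, dim)` NumPy array of SBERT vectors as produced by the encoding example above:

```python
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

# Project the high-dimensional sentence embeddings down to 2D.
# perplexity must be smaller than the number of sentences; 5 is illustrative.
coords = TSNE(n_components=2, perplexity=5, random_state=0).fit_transform(embeddings)

plt.scatter(coords[:, 0], coords[:, 1])
plt.title("t-SNE of Sentence-BERT embeddings")
plt.show()
```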

## Two-Cents
- **Appreciation**: The paper successfully addresses a significant limitation of BERT by improving the efficiency of generating sentence embeddings. The use of Siamese networks is a clever approach that balances performance with computational cost.

## Resources
- **Paper**: [https://arxiv.org/abs/1908.10084](https://arxiv.org/abs/1908.10084)