A reading list on storage systems, covering data deduplication, erasure coding, general storage, and other related topics (e.g., security). Updated from time to time.
[TOC]
- Understanding Data Deduplication Ratios----SNIA'08 (link)
- A Survey and Classification of Storage Deduplication Systems----ACM Computing Surveys'14 (link)
- A Comprehensive Study of the Past, Present, and Future on Data Deduplication----Proceedings of the IEEE'16 (link)
- 99 Deduplication Problems----HotStorage'16 (link) (summary)
- A Survey of Secure Data Deduplication Schemes for Cloud Storage Systems----ACM Computing Surveys'17 (link)
- Backup to the Future: How Workload and Hardware Changes Continually Redefine Data Domain File Systems----IEEE Computer'17 (link)
- Characterizing Datasets for Data Deduplication in Backup Applications----IISWC'10 (link)
- A Study of Practical Deduplication----FAST'11 (link) summary
- Capacity Forecasting in a Backup Storage Environment----LISA'11 (link) summary
- Characteristics of Backup Workloads in Production Systems----FAST'12 (link) summary
- A Study on Data Deduplication in HPC Storage Systems----SC'12 (link)
- Inside Dropbox: Understanding Personal Cloud Storage Services----IMC'12 (link)
- Insights for Data Reduction in Primary Storage: a Practical Analysis----SYSTOR'12 (link)
- Modeling the Dropbox Client Behavior----ICC'14 (link)
- Identifying Trends in Enterprise Data Protection Systems----USENIX ATC'15 (link)
- A Long-Term User-Centric Analysis of Deduplication Patterns----MSST'16 (link)
- Getting back up: Understanding how enterprise data backups fail----USENIX ATC'16 (link)
- A Simulation Analysis of Redundancy and Reliability in Primary Storage Deduplication----TC'18 (link) summary
- Deduplication Analyses of Multimedia System Images----HotStorage'18 (link)
- Improving Docker Registry Design based on Production Workload Analysis----FAST'18 (link)
- Venti: A New Approach to Archival Storage----FAST'02 (link)
- Avoiding the Disk Bottleneck in the Data Domain Deduplication File System----FAST'08 (link) summary
- Sparse Indexing: Large Scale, Inline Deduplication Using Sampling and Locality----FAST'09 (link) summary
- Extreme Binning: Scalable, Parallel Deduplication for Chunk-based File Backup----MASCOTS'09 (link) summary
- I/O Deduplication: Utilizing Content Similarity to Improve I/O Performance----FAST'10 (link)
- dedupv1: Improving Deduplication Throughput using Solid State Drives (SSD)----MSST'10 (link) summary
- ChunkStash: Speeding up Inline Storage Deduplication using Flash Memory----USENIX ATC'10 (link)
- SiLo: A Similarity-Locality based Near-Exact Deduplication Scheme with Low RAM Overhead and High Throughput----USENIX ATC'11 (link)
- Building a High-performance Deduplication System----USENIX ATC'11 (link) summary
- Primary Data Deduplication - Large Scale Study and System Design----USENIX ATC'12 (link)
- iDedup: Latency-aware, Inline Data Deduplication for Primary Storage----FAST'12 (link) summary
- Deduplication in SSDs: Model and quantitative analysis----MSST'12 (link)
- Efficiently Storing Virtual Machine Backups----HotStorage'13 (link)
- Storage Efficiency Opportunities and Analysis for Video Repositories----HotStorage'15 (link)
- Deriving and Comparing Deduplication Techniques Using a Model-Based Classification----EuroSys'15 (link)
- Design Tradeoffs for Data Deduplication Performance in Backup Workloads----FAST'15 (link) summary
- Sorted Deduplication: How to Process Thousands of Backup Streams----MSST'16 (link)
- Backup to the future: How workload and hardware changes continually redefine data domain file systems----TC'17 (link)
- Can't We All Get Along? Redesigning Protection Storage for Modern Workloads----USENIX ATC'18 (link) summary
- SmartDedup: Optimizing Deduplication for Resource-constrained Devices----USENIX ATC'19 (link)
- DupHunter: Flexible High-Performance Deduplication for Docker Registries----USENIX ATC'20 (link)
- The Dilemma between Deduplication and Locality: Can Both be Achieved?----FAST'21 (link) summary
- SLIMSTORE: A Cloud-based Deduplication System for Multi-version Backups----ICDE'21 (link)
- Improving the Performance of Deduplication-Based Backup Systems via Container Utilization Based Hot Fingerprint Entry Distilling----ACM TOS'21 (link)
- BURST: A Chunk-Based Data Deduplication System with Burst-Encoded Fingerprint Matching----MSST'24 (link)
- RevDedup: A Reverse Deduplication Storage System Optimized for Reads to Latest Backups----APSys'13 (link) summary
- ALACC: Accelerating Restore Performance of Data Deduplication Systems Using Adaptive Look-Ahead Window Assisted Chunk Caching----FAST'18 (link) summary
- Reducing Impact of Data Fragmentation Caused by In-line Deduplication----SYSTOR'12 (link)
- Reducing Fragmentation Impact with Forward Knowledge in Backup Systems with Deduplication----SYSTOR'15 (link)
- Assuring Demanded Read Performance of Data Deduplication Storage with Backup Datasets----MASCOTS'12 (link)
- Sliding Look-Back Window Assisted Data Chunk Rewriting for Improving Deduplication Restore Performance----FAST'19 (link) summary
- Improving Restore Speed for Backup Systems that Use Inline Chunk-Based Deduplication----FAST'13 (link) summary
- Chunk Fragmentation Level: An Effective Indicator for Read Performance Degradation in Deduplication Storage----HPCC'11
- Improving the Restore Performance via Physical Locality Middleware for Backup Systems----Middleware'20 (link) summary
- Efficient Hybrid Inline and Out-of-Line Deduplication for Backup Storage----ACM TOS'14 (link)
- Convergent Dispersal: Toward Storage-Efficient Security in a Cloud-of-Clouds----HotStorage'14 (link) summary
- CDStore: Toward Reliable, Secure, and Cost-Efficient Cloud Storage via Convergent Dispersal----USENIX ATC'15 (link) summary
- Information Leakage in Encrypted Deduplication via Frequency Analysis----DSN'17 (link)
- DupLESS: Server-Aided Encryption for Deduplicated Storage----USENIX Security'13 (link) summary
- Side Channels in Cloud Services, the Case of Deduplication in Cloud Storage----S&P'10 (link) summary
- Side Channels in Deduplication: Trade-offs between Leakage and Efficiency----AsiaCCS'17 (link) summary
- On Information Leakage in Deduplication Storage Systems----CCS Workshop'16 summary
- SecDep: A User-Aware Efficient Fine-Grained Secure Deduplication Scheme with Multi-Level Key Management----MSST'15 (link)
- Message-Locked Encryption and Secure Deduplication----EuroCrypt'13 summary
- Proofs of Ownership in Remote Storage Systems----CCS'11 (link)
- Tapping the Potential: Secure Chunk-based Deduplication of Encrypted Data for Cloud Backup----CNS'18 summary
- A Bandwidth-Efficient Middleware for Encrypted Deduplication----DSC'18 summary
- Bloom Filter Based Privacy Preserving Deduplication System----Springer International Conference on Security & Privacy'19 (link) summary
- Enhanced Secure Thresholded Data Deduplication Scheme for Cloud Storage----TDSC'16 (link) summary
- Transparent Data Deduplication in the Cloud----CCS'15 (link) summary
- Secure Deduplication of Encrypted Data without Additional Independent Servers----CCS'15 (link) summary
- Fast and Secure Laptop Backups with Encrypted Deduplication----LISA'10 (link)
- Weak Leakage-Resilient Client-side Deduplication of Encrypted Data in Cloud Storage----AsiaCCS'13 (link)
- Lamassu: Storage-Efficient Host-Side Encryption----USENIX ATC'15 (link)
- Mitigating Traffic-based Side Channel Attacks in Bandwidth-efficient Cloud Storage----IPDPS'18 (link) summary
- RARE: Defeating Side Channels based on Data-Deduplication in Cloud Storage----INFOCOM'18 (link) summary
- PerfectDedup: Secure Data Deduplication----Data Privacy Management, and Security Assurance'15 (link)
- Privacy Aware Data Deduplication for Side Channel in Cloud Storage----ToCC'18 (link)
- PraDa: Privacy-preserving Data Deduplication as a Service----CIKM'14 (link)
- Privacy-Preserving Data Deduplication on Trusted Processors----CLOUD'17 (link) summary
- Distributed Key Generation for Encrypted Deduplication: Achieving the Strongest Privacy----CCSW'14 (link) summary
- Proofs of Ownership on Encrypted Cloud Data via Intel SGX----ACNS'20 (link) summary
- Accelerating Encrypted Deduplication via SGX----USENIX ATC'21 (link)
- S2Dedup: SGX-enabled Secure Deduplication----SYSTOR'21 (link) summary
- Secure Deduplication of General Computations----USENIX ATC'15 (link)
- When Delta Sync Meets Message-Locked Encryption: a Feature-based Delta Sync Scheme for Encrypted Cloud Storage----ICDCS'21 (link) summary
- DUPEFS: Leaking Data Over the Network With Filesystem Deduplication Side Channels----FAST'22 (link) summary
- Metadedup: Deduplicating Metadata in Encrypted Deduplication via Indirection----MSST'19 (link)
- Rekeying for Encrypted Deduplication Storage----DSN'16 (link) summary
- File Recipe Compression in Data Deduplication Systems----FAST'13 (link) summary
- Metadata Considered Harmful ... to Deduplication----HotStorage'15 (link) summary
- LIPA: A Learning-based Indexing and Prefetching Approach for Data Deduplication----MSST'19 (link) summary
- Lazy Exact Deduplication----MSST'16 (link)
- MAD2: A Scalable High-throughput Exact Deduplication Approach for Network Backup Services----MSST'10 (link)
- Block Locality Caching for Data Deduplication----SYSTOR'13 (link)
- HANDS: A Heuristically Arranged Non-Backup In-line Deduplication System----ICDE'13 (link)
- Estimating Unseen Deduplication - from Theory to Practice----FAST'16 (link) summary
- Estimation of Deduplication Ratios in Large Data Sets----MSST'12 (link) summary
- Sketching Volume Capacities in Deduplicated Storage----FAST'19 (link) summary
- Estimating Duplication by Content-based Sampling----USENIX ATC'13 summary
- Content-aware Load Balancing for Distributed Backup----LISA'11 (link)
- Rangoli: Space Management in Deduplication Environments----SYSTOR'13 (link) summary
- Data Domain Cloud Tier: Backup here, Backup there, Deduplicated Everywhere!----USENIX ATC'19 (link) summary
- InftyDedup: Scalable and Cost-Effective Cloud Tiering with Deduplication----FAST'23 (link) summary
- Redundancy Elimination Within Large Collections of Files----USENIX ATC'04 (link)
- The Design of a Similarity Based Deduplication System----SYSTOR'09 (link)
- Delta Compressed and Deduplicated Storage Using Stream-Informed Locality----HotStorage'12 (link) summary
- WAN Optimized Replication of Backup Datasets Using Stream-Informed Delta Compression----FAST'12 (link) summary
- To Zip or not to Zip: Effective Resource Usage for Real-Time Compression----FAST'13 (link) summary
- Combining Deduplication and Delta Compression to Achieve Low-Overhead Data Reduction on Backup Datasets----DCC'14 (link) summary
- Ddelta: A Deduplication-inspired Fast Delta Compression Approach----Performance'14 (link)
- Migratory Compression: Coarse-grained Data Reordering to Improve Compressibility----FAST'14 (link) summary
- Odess: Speeding up Resemblance Detection for Redundancy Elimination by Fast Content-Defined Sampling----ICDE'14 (link)
- Reducing Replication Bandwidth for Distributed Document Databases----SoCC'15 (link)
- Edelta: A Word-Enlarging Based Fast Delta Compression Approach----HotStorage'15 (link)
- Online Deduplication for Databases----SIGMOD'17 (link)
- Finesse: Fine-Grained Feature Locality based Fast Resemblance Detection for Post-Deduplication Delta Compression----FAST'19 (link) summary
- Improving Restore Performance for In-Line Backup System Combining Deduplication and Delta Compression----TPDS'20 (link)
- Exploring the Potential of Fast Delta Encoding: Marching to a Higher Compression Ratio----CLUSTER'20 (link) summary
- Adaptively Compressing IoT Data on the Resource-constrained Edge----HotEdge'20 (link)
- Length Preserving Compression – Marrying Encryption with Compression----SYSTOR'21 (link) summary
- DeepSketch: A New Machine Learning-Based Reference Search Technique for Post-Deduplication Delta Compression----FAST'22 (link) summary
- Building a High Performance Fine-grained Deduplication Framework for Backup Storage with High Deduplication Ratio----USENIX ATC'22 (link) summary
- Donag: Generating Efficient Patches and Diffs for Compressed Archives----ACM TOS'22 (link)
- LoopDelta: Embedding Locality-aware Opportunistic Delta Compression in Inline Deduplication for Highly Efficient Data Reduction----USENIX ATC'23 (link)
- Palantir: Hierarchical Similarity Detection for Post-Deduplication Delta Compression----ASPLOS'24 (link)
- DedupSearch: Two-Phase Deduplication Aware Keyword Search----FAST'22 (link) summary
- Physical vs. Logical Indexing with IDEA: Inverted Deduplication-Aware Index----FAST'24 (link) summary
- Is Low Similarity Threshold A Bad Idea in Delta Compression?----HotStorage'24 (link)
- CAFTL: A Content-Aware Flash Translation Layer Enhancing the Lifespan of Flash Memory based Solid State Drives----FAST'11 (link) summary
- XLH: More Effective Memory Deduplication Scanners through Cross-Layer Hints----USENIX ATC'13 (link)
- Dmdedup: Device Mapper Target for Data Deduplication----OLS'14 (link)
- Using Hints to Improve Inline Block-Layer Deduplication----FAST'16 (link) summary
- OrderMergeDedup: Efficient, Failure-Consistent Deduplication on Flash----FAST'16 (link)
- UKSM: Swift Memory Deduplication via Hierarchical and Adaptive Memory Region Distilling----FAST'18 (link) summary
- Remap-SSD: Safely and Efficiently Exploiting SSD Address Remapping to Eliminate Duplicate Writes----FAST'21 (link)
- Memory Deduplication for Serverless Computing with Medes----EuroSys'22 (link)
- On the Effectiveness of Same-Domain Memory Deduplication----EuroSec'22 (link)
- Dedup-for-Speed: Storing Duplications in Fast Programming Mode for Enhanced Read Performance----SYSTOR'22 (link)
- A Framework for Analyzing and Improving Content-Based Chunking Algorithms----HP Technical Report'05 (link)
- Multi-Level Comparison of Data Deduplication in a Backup Scenario----SYSTOR'09 (link)
- Frequency Based Chunking for Data De-Duplication----MASCOTS'10 (link) summary
- Bimodal Content Defined Chunking for Backup Streams----FAST'10 (link)
- MUCH: Multi-threaded Content-Based File Chunking----TC'15 (link)
- FastCDC: a Fast and Efficient Content-Defined Chunking Approach for Data Deduplication----USENIX ATC'16 (link) summary
- SS-CDC: A Two-stage Parallel Content-Defined Chunking for Deduplicating Backup Storage----SYSTOR'19 (link) summary
- RapidCDC: Leveraging Duplicate Locality to Accelerate Chunking in CDC-based Deduplication Systems----SoCC'19 (link) summary
- PLC-cache: Endurable SSD cache for deduplication-based primary storage----MSST'14 (link)
- Nitro: A Capacity-Optimized SSD Cache for Primary Storage----USENIX ATC'14 (link)
- CDAC: Content-Driven Deduplication-Aware Storage Cache----MSST'19 (link)
- Austere Flash Caching with Deduplication and Compression----USENIX ATC'20 (link)
- Memory Efficient Sanitization of a Deduplicated Storage System----FAST'13 (link) summary
- Concurrent Deletion in a Distributed Content-addressable Storage System with Global Deduplication----FAST'13 (link)
- Accelerating Restore and Garbage Collection in Deduplication-based Backup System via Exploiting Historical Information----USENIX ATC'14 (link) summary
- The Logic of Physical Garbage Collection in Deduplicating Storage----FAST'17 (link)
- EF-Dedup: Enabling Collaborative Data Deduplication at the Network Edge----ICDCS'19 (link)
- Even Data Placement for Load Balance in Reliable Distributed Deduplication Storage Systems----IWQoS'15 (link) summary
- Probabilistic Deduplication for Cluster-Based Storage Systems----SoCC'12 (link) summary
- A Scalable Inline Cluster Deduplication Framework for Big Data Protection----Middleware'12 (link)
- Tradeoffs in Scalable Data Routing for Deduplication Clusters----FAST'11 (link) summary
- Cluster and Single-Node Analysis of Long-Term Deduplication Patterns----ACM TOS'18 (link) summary
- Decentralized Deduplication in SAN Cluster File Systems----USENIX ATC'09 (link)
- HYDRAstor: A Scalable Secondary Storage----FAST'09 (link)
- GoSeed: Generating an Optimal Seeding Plan for Deduplicated Storage----FAST'20 (link)
- The what, The from, and The to: The Migration Games in Deduplicated Systems----FAST'22 (link) summary
- Nv-dedup: High performance Inline Deduplication for Non-volatile Memory----TC'17 (link)
- Improving the Performance and Endurance of Encrypted Non-volatile Main Memory through Deduplicating Writes----MICRO'18 (link)
- DeNOVA: Deduplication Extended NOVA File System----IPDPS'22 (link)
- Light-Dedup: A Light-weight Inline Deduplication Framework for Non-Volatile Memory File Systems----USENIX ATC'23 (link) summary
- Network Coding for Distributed Storage Systems----TIT'09
- A Performance Evaluation and Examination of Open-Source Erasure Coding Libraries for Storage----FAST'09
- Erasure Coding for Cloud Storage Systems: A Survey----By Jun Li in 2013
- CORE: Augmenting Regenerating-Coding-Based Recovery for Single and Concurrent Failures in Distributed Storage Systems----MSST'13
- Degraded-First Scheduling for MapReduce in Erasure-Coded Storage Clusters----DSN'14
- Repair Pipelining for Erasure-Coded Storage----USENIX ATC'17
- A Tale of Two Erasure Codes in HDFS----FAST'15
- On the Speedup of Single-Disk Failure Recovery in XOR-Coded Storage Systems: Theory and Practice----MSST'12
- Rethinking Erasure Codes for Cloud File Systems: Minimizing I/O for Recovery and Degraded Reads----FAST'12
- Lazy Means Smart: Reducing Repair Bandwidth Costs in Erasure-coded Distributed Storage----SYSTOR'14
- Enabling Efficient and Reliable Transition from Replication to Erasure Coding for Clustered File System----DSN'15
- Reconsidering Single Failure Recovery in Clustered File Systems----DSN'16 summary
- RAFI: Risk-Aware Failure Identification to Improve the RAS in Erasure-coded Data Center----USENIX ATC'18
- Partial-Parallel-Repair (PPR): A Distributed Technique for Repairing Erasure Coded Storage----EuroSys'16
- Cross-Rack-Aware Updates in Erasure-Coded Data Centers----ICPP'18
- PARIX: Speculative Partial Writes in Erasure-Coded Systems----USENIX ATC'17
- OpenEC: Toward Unified and Configurable Erasure Coding Management in Distributed Storage Systems----FAST'19
- ParaRC: Embracing Sub-Packetization for Repair Parallelization in MSR-Coded Storage----FAST'23 (link)
- CodePlugin: Plugging Deduplication into Erasure Coding for Cloud Storage----HotCloud'15
- Double Regenerating Codes for Hierarchical Data Centers----ISIT'16
- Pyramid codes: Flexible schemes to trade space for access efficiency in reliable data storage systems----ACM TOS'13
- Having Your Cake and Eating It Too: Jointly Optimal Erasure Codes for I/O, Storage and Network-bandwidth----FAST'15
- Opening the Chrysalis: On the Real Repair Performance of MSR Codes----FAST'16
- NCCloud: A Network-Coding-Based Storage System in a Cloud-of-Clouds----FAST'12
- Erasure Coding in Windows Azure Storage----USENIX ATC'12
- XORing Elephants: Novel Erasure Codes for Big Data----VLDB'13
- Clay Codes: Moulding MDS Codes to Yield an MSR Code----FAST'18
- Alpha Entanglement Codes: Practical Erasure Codes to Archive Data in Unreliable Environments----DSN'18 summary
- On Fault Tolerance, Locality, and Optimality in Locally Repairable Codes----USENIX ATC'18
- Parallelism-Aware Locally Repairable Code for Distributed Storage Systems----ICDCS'18
- Beehive: Erasure Codes for Fixing Multiple Failures in Distributed Storage Systems----HotStorage'15
- Pipelined regeneration with Regenerating Codes for Distributed Storage Systems----NetCod'11
- Cooperative Pipelined Regeneration in Distributed Storage Systems----INFOCOM'14
- Zebra: Demand-aware Erasure Coding for Distributed Storage Systems----IWQoS'16
- On Data Parallelism of Erasure Coding in Distributed Storage Systems----ICDCS'17
- Giza: Erasure Coding Objects across Global Data Centers----USENIX ATC'17
- EC-Store: Bridging the Gap Between Storage and Latency in Distributed Erasure Coded Systems----ICDCS'18
- Latency Reduction and Load Balancing in Coded Storage Systems----SoCC'17
- RAID+: Deterministic and Balanced Data Distribution for Large Disk Enclosures----FAST'21 (link)
- FusionRAID: Achieving Consistent Low Latency for Commodity SSD Arrays----FAST'22 (link)
- A Survey on Systems Security Metrics----ACM Computing Surveys'16
- How to Best Share a Big Secret----SYSTOR'18 (link) summary
- AONT-RS: Blending Security and Performance in Dispersed Storage Systems----FAST'11
- Efficient Homophonic Coding----TIT'99 (link)
- How Far Can we Go Beyond Linear Cryptanalysis?----ASIACRYPT'04 (link)
- CryptDB: Protecting Confidentiality with Encrypted Query Processing----SOSP'11 (link)
- Dark Clouds on the Horizon: Using Cloud Storage as Attack Vector and Online Slack Space----USENIX Security'11 (link)
- RAPPOR: Randomized Aggregable Privacy-Preserving Ordinal Response----CCS'14 (link)
- Frequency-Hiding Order-Preserving Encryption----CCS'15 (link)
- Inference Attacks on Property-Preserving Encrypted Databases----CCS'15 (link)
- A Note on the Optimality of Frequency Analysis vs. lp-Optimization----IACR'15 (link)
- Oblivious RAM as a Substrate for Cloud Storage - The Leakage Challenge Ahead----CCSW'16 (link) summary
- Oblivious RAM: A Dissection and Experimental Evaluation----VLDB'16 (link)
- MiniCrypt: Reconciling Encryption and Compression for Big Data Stores----EuroSys'17 (link)
- Splinter: Practical Private Queries on Public Data----NSDI'17 (link)
- Frequency-smoothing Encryption: Preventing Snapshot Attacks on Deterministically Encrypted Data----IACR'17 (link) summary
- The Overhead of Confidentiality and Client-side Encryption in Cloud Storage Systems----UCC'19 (link) summary
- PRO-ORAM: Practical Read-Only Oblivious RAM----RAID'19 (link)
- Quantifying Information Leakage of Deterministic Encryption----CCSW'19 (link) summary
- Pancake: Frequency Smoothing for Encrypted Data Stores----USENIX Security'20 (link)
- Hiding the Lengths of Encrypted Message via Gaussian Padding----CCS'21 (link)
- On Fingerprinting Attacks and Length-Hiding Encryption----CT-RSA'22 (link)
- Rethinking Block Storage Encryption with Virtual Disks----HotStorage'22 (link) summary
- Differential Privacy----ICALP'06 (link)
- Calibrating Noise to Sensitivity in Private Data Analysis----TCC'06 (link)
- Differentially Private Access Patterns for Searchable Symmetric Encryption----INFOCOM'18 (link) summary
- Privacy at Scale: Local Differential Privacy in Practice----SIGMOD'18 (link)
- Graphene-SGX: A Practical Library OS for Unmodified Applications on SGX----USENIX ATC'17 (link)
- Intel SGX Explained----IACR'16 (link)
- OpenSGX: An Open Platform for SGX Research----NDSS'16 (link)
- SCONE: Secure Linux Containers with Intel SGX----OSDI'16 (link)
- Varys: Protecting SGX Enclaves From Practical Side-Channel Attacks----USENIX ATC'18 (link)
- sgx-perf: A Performance Analysis Tool for Intel SGX Enclaves----Middleware'18 (link) summary
- TaLoS: Secure and Transparent TLS Termination inside SGX Enclaves----arxiv'17 (link) summary
- Switchless Calls Made Practical in Intel SGX----SysTEX'18 (link) summary
- Regaining Lost Seconds: Efficient Page Preloading for SGX Enclaves----Middleware'20 (link)
- Everything You Should Know About Intel SGX Performance on Virtualized Systems----SIGMETRICS'19 (link) summary
- A Comparison Study of Intel SGX and AMD Memory Encryption Technology----HASP'18 (link)
- SGXoMeter: Open and Modular Benchmarking for Intel SGX----EuroSec'21 (link)
- Foreshadow: Extracting the Keys to the Intel SGX Kingdom with Transient Out-of-Order Execution----USENIX Security'18 (link)
- NEXUS: Practical and Secure Access Control on Untrusted Storage Platforms using Client-side SGX----DSN'19 (link)
- Securing the Storage Data Path with SGX Enclaves----arxiv'18 (link) summary
- EnclaveDB: A Secure Database using SGX----S&P'18 (link)
- Isolating Operating System Components with Intel SGX----SysTEX'16 (link)
- SPEICHER: Securing LSM-based Key-Value Stores using Shielded Execution----FAST'19 (link) summary
- ShieldStore: Shielded In-memory Key-Value Storage with SGX----EuroSys'19 (link) summary
- SeGShare: Secure Group File Sharing in the Cloud using Enclaves----DSN'20 (link) summary
- DISKSHIELD: A Data Tamper-Resistant Storage for Intel SGX----AsiaCCS'20 (link)
- SPEED: Accelerating Enclave Applications via Secure Deduplication----ICDCS'19 (link) summary
- Secure In-memory Key-Value Storage with SGX----SoCC'18
- EnclaveCache: A Secure and Scalable Key-value Cache in Multi-tenant Clouds using Intel SGX----Middleware'19 (link) summary
- Building enclave-native storage engines for practical encrypted databases----VLDB'21 (link)
- Aria: Tolerating Skewed Workloads in Secure In-memory Key-value Stores----ICDE'21 (link)
- A Privacy-Preserving Defense Mechanism Against Request Forgery Attacks----TrustCom'11 (link) summary
- Internet Censorship in Thailand: User Practices and Potential Threats----EuroS&P'17 (link)
- Accessing Google Scholar under Extreme Internet Censorship: A Legal Avenue----Middleware'17 (link)
- How China Detects and Blocks Shadowsocks----IMC'20 (link)
- How the Great Firewall of China Detects and Blocks Fully Encrypted Traffic----USENIX Security'23 (link)
- Revisiting HDD Rules of Thumb: 1/3 Is Not (Quite) the Average Seek Distance----MSST'24 (link)
- MapReduce: Simplified Data Processing on Large Clusters----OSDI'04 (link)
- Cumulus: Filesystem Backup to the Cloud----FAST'09 (link) summary
- RACS: A Case for Cloud Storage Diversity----SoCC'10 (link)
- The Hadoop Distributed File System----MSST'10 (link) summary
- SPANStore: Cost-Effective Geo-Replicated Storage Spanning Multiple Cloud Services----SOSP'13 (link) summary
- A Day Late and a Dollar Short: The Case for Research on Cloud Billing Systems----HotCloud'14 (link)
- CosTLO: Cost-Effective Redundancy for Lower Latency Variance on Cloud Storage Service----NSDI'15 (link)
- Kurma: Secure Geo-Distributed Multi-Cloud Storage Gateways----SYSTOR'19 (link) summary
- Ursa: Hybrid Block Storage for Cloud-Scale Virtual Disks----EuroSys'19 (link)
- Duplicacy: A New Generation of Cloud Backup Tool Based on Lock-Free Deduplication----TCC'20 (link) summary
- In Search of an Understandable Consensus Algorithm----USENIX ATC'14 (link)
- TinyLFU: A Highly Efficient Cache Admission Policy----ACM TOS'17 (link)
- Hyperbolic Caching: Flexible Caching for Web Applications----USENIX ATC'17 (link)
- Flashield: a Hybrid Key-value Cache that Controls Flash Write Amplification----USENIX NSDI'19 (link)
- It’s Time to Revisit LRU vs. FIFO----HotStorage'20 (link) summary trace
- The CacheLib Caching Engine: Design and Experiences at Scale----OSDI'20 (link)
- Unifying the Data Center Caching Layer — Feasible? Profitable?----HotStorage'21 (link)
- Learning Cache Replacement with Cacheus----FAST'21 (link)
- Kangaroo: Caching Billions of Tiny Objects on Flash----SOSP'21 (link)
- Segcache: a Memory-efficient and Scalable In-memory Key-value Cache for Small Objects----NSDI'21 (link)
- FarReach: Write-back Caching in Programmable Switches----USENIX ATC'23 (link)
- FIFO can be Better than LRU: the Power of Lazy Promotion and Quick Demotion----HotOS'23 (link)
- An Analysis of Compare-by-Hash----HotOS'03 (link)
- On-the-Fly Verification of Rateless Erasure Codes for Efficient Content Distribution----S&P'04 (link)
- Compare-by-Hash: A Reasoned Analysis----USENIX ATC'06 (link) summary
- Don’t Thrash: How to Cache your Hash on Flash----HotStorage'11 (link)
- Algorithmic Improvements for Fast Concurrent Cuckoo Hashing----EuroSys'14 (link)
- A Lock-Free, Cache-Efficient Multi-Core Synchronization Mechanism for Line-Rate Network Traffic Monitoring----IPDPS'10 (link)
- Lock-Free Collaboration Support for Cloud Storage Services with Operation Inference and Transformation----FAST'20 (link)
- Design Tradeoffs for SSD Performance----USENIX ATC'08 (link)
- Design Tradeoffs for SSD Reliability----USENIX ATC'19 (link)
- The Tail at Store: A Revelation from Millions of Hours of Disk and SSD Deployments----FAST'16 (link)
- The Unwritten Contract of Solid State Drives----EuroSys'17 (link)
- The CASE of FEMU: Cheap, Accurate, Scalable and Extensible Flash Emulator----FAST'18 (link) summary
- From blocks to rocks: a natural extension of zoned namespaces----HotStorage'21 (link)
- Don’t Be a Blockhead: Zoned Namespaces Make Work on Conventional SSDs Obsolete----HotOS'21 (link) summary
- What Systems Researchers Need to Know about NAND Flash----HotStorage'13 (link)
- Caveat-Scriptor: Write Anywhere Shingled Disks----HotStorage'15 (link)
- Towards an Unwritten Contract of Intel Optane SSD----HotStorage'19 (link)
- Improving the Reliability of Next Generation SSDs using WOM-v Codes----FAST'22 (link)
- Fantastic SSD internals and how to learn and use them----SYSTOR'22 (link)
- Understanding NVMe Zoned Namespace (ZNS) Flash SSD Storage Devices----arxiv'22 (link)
- Compaction-Aware Zone Allocation for LSM based Key-Value Store on ZNS SSDs----HotStorage'22 (link)
- Lifetime-leveling LSM-tree Compaction for ZNS SSD----HotStorage'22 (link)
- What You Can't Forget: Exploiting Parallelism for Zoned Namespaces----HotStorage'22 (link)
- NVMe SSD Failures in the Field: the Fail-Stop and the Fail-Slow----USENIX ATC'22 (link)
- Offline and Online Algorithms for SSD Management----ACM TOS'22 (link)
- NVMeVirt: A Versatile Software-defined Virtual NVMe Device----FAST'23 (link)
- Excessive SSD-Internal Parallelism Considered Harmful----HotStorage'23 (link)
- Is Garbage Collection Overhead Gone? Case study of F2FS on ZNS SSDs----HotStorage'23 (link)
- ZapRAID: Toward High-Performance RAID for ZNS SSDs via Zone Append----APSys'23 (link)
- BypassD: Enabling fast userspace access to shared SSDs----ASPLOS'24 (link)
- LightNVM: The Linux Open-Channel SSD Subsystem----USENIX FAST'17 (link)
- ZoneAlloy: Elastic Data and Space Management for Hybrid SMR Drives----HotStorage'19 (link)
- Zone Append: A New Way of Writing to Zoned Storage----Vault'20 (link)
- ZNS: Avoiding the Block Interface Tax for Flash-based SSDs----USENIX ATC'21 (link) code
- ZNS+: Advanced Zoned Namespace Interface for Supporting In-Storage Zone Compaction----OSDI'21 (link)
- RAIZN: Redundant Array of Independent Zoned Namespaces----ASPLOS'23 (link)
- An Efficient Order-Preserving Recovery for F2FS with ZNS SSD----HotStorage'23 (link)
- A Free-Space Adaptive Runtime Zone-Reset Algorithm for Enhanced ZNS Efficiency----HotStorage'23 (link)
- Can ZNS SSDs be Better Storage Devices for Persistent Cache?----HotStorage'24 (link) summary
- NOVA: A Log-structured File System for Hybrid Volatile/Non-volatile Main Memories----FAST'16 (link)
- Redesigning LSMs for Nonvolatile Memory with NoveLSM----USENIX ATC'18 (link) summary
- SLM-DB: Single-Level Key-Value Store with Persistent Memory----FAST'19 (link) summary
- An Empirical Guide to the Behavior and Use of Scalable Persistent Memory----FAST'20 (link)
- Characterizing the Performance of Intel Optane Persistent Memory: A Close Look at its On-DIMM Buffering----EuroSys'22 (link)
- An Introduction to Bε-trees and Write-Optimization----USENIX Login'15 (link) code
- Building Workload-Independent Storage with VT-Trees----FAST'13 (link)
- SDGen: Mimicking Datasets for Content Generation in Storage Benchmarks----FAST'15 (link)
- BPF for Storage: An Exokernel-Inspired Approach----HotOS'21 (link) summary
- Understanding Modern Storage APIs: A systematic study of libaio, SPDK, and io_uring----SYSTOR'22 (link)
- PAIO: General, Portable I/O Optimizations With Minor Application Modifications----FAST'22 (link)
- zIO: Accelerating IO-Intensive Applications with Transparent Zero-Copy IO----OSDI'22 (link)
- XRP: In-Kernel Storage Functions with eBPF----OSDI'22 (link)
- HintStor: A Framework to Study I/O Hints in Heterogeneous Storage----ACM TOS'22 (link)
- The Google File System----SOSP'03 (link)
- Bigtable: A Distributed Storage System for Structured Data----OSDI'06 (link)
- Finding a Needle in Haystack: Facebook’s Photo Storage----OSDI'10 (link)
- f4: Facebook’s Warm BLOB Storage System----OSDI'14 (link)
- Amazon DynamoDB: A Scalable, Predictably Performant, and Fully Managed NoSQL Database Service----USENIX ATC'22 (link)
- CacheSack: Admission Optimization for Google Datacenter Flash Caches----USENIX ATC'22 (link)
- From Luna to Solar: The Evolutions of the Compute-to-Storage Networks in Alibaba Cloud----SIGCOMM'22 (link)
- Hello Bytes, Bye Blocks: PCIe Storage Meets Compute Express Link for Memory Expansion (CXL-SSD)----HotStorage'22 (link)
- Fail-Slow at Scale: Evidence of Hardware Performance Faults in Large Production Systems----FAST'18 (link)
- Metastable Failures in Distributed Systems----HotOS'21 (link)
- Metastable Failures in the Wild----OSDI'22 (link)
- Replication Under Scalable Hashing: A Family of Algorithms for Scalable Decentralized Data Distribution----IPDPS'04 (link)
- Dynamic Metadata Management for Petabyte-scale File Systems----SC'04 (link)
- CRUSH: Controlled, Scalable, Decentralized Placement of Replicated Data----SC'06 (link)
- Ceph: A Scalable, High-performance Distributed File System----OSDI'06 (link) (slides)
- The Design and Implementation of AQuA: An Adaptive Quality of Service Aware Object-Based Storage Device----MSST'06 (link)
- Mantle: A Programmable Metadata Load Balancer for the Ceph File System----SC'15 (link)
- Understanding Write Behaviors of Storage Backends in Ceph Object Store----MSST'17 (link) slides
- Design of Global Data Deduplication for A Scale-out Distributed Storage System----ICDCS'18 (link)
- File Systems Unfit as Distributed Storage Backends: Lessons from 10 Years of Ceph Evolution----SOSP'19 (link) summary
- MAPX: Controlled Data Migration in the Expansion of Decentralized Object-Based Storage Systems----FAST'20 (link)
- Lunule: An Agile and Judicious Metadata Load Balancer for CephFS----SC'21 (link)
- Speculative Recovery: Cheap, Highly Available Fault Tolerance with Disaggregated Storage----USENIX ATC'22 (link)
- InfiniFS: An Efficient Metadata Service for Large-Scale Distributed Filesystems----FAST'22 (link)
- TiDedup: A New Distributed Deduplication Architecture for Ceph----USENIX ATC'23 (link)
- GPFS: A Shared-Disk File System for Large Computing Clusters----FAST'02 (link)
- Efficient Object Storage Journaling in a Distributed Parallel File System----FAST'10 (link)
- Tips and Tricks for Diagnosing Lustre Problems on Cray Systems----CUG'11 (link)
- Lustre Resiliency: Understanding Lustre Message Loss and Tuning for Resiliency----CUG'15 (link)
- Taking back control of HPC file systems with Robinhood Policy Engine----arXiv'15 (link)
- Lustre Lockahead: Early Experience and Performance using Optimized Locking----CUG'17 (link)
- LPCC: Hierarchical Persistent Client Caching for Lustre----SC'19 (link) slides
- A Performance Study of Lustre File System Checker: Bottlenecks and Potentials----MSST'19 (link)
- I/O Characterization and Performance Evaluation of BeeGFS for Deep Learning----ICPP'19 (link)
- HadaFS: A File System Bridging the Local and Shared Burst Buffer for Exascale Supercomputers----FAST'23 (link)
- Accelerating I/O performance of ZFS-based Lustre file system in HPC environment----Journal of Supercomputing'23 (link)
- MetaWBC: POSIX-compliant Metadata Write-back Caching for Distributed File Systems----SC'22 (link)
- Xfast: Extreme File Attribute Stat Acceleration for Lustre----SC'23 (link) slides
- The I/O Trace Initiative: Building a Collaborative I/O Archive to Advance HPC----SC-workshop'23 (link)
- Combining Buffered I/O and Direct I/O in Distributed File Systems----FAST'24 (link) slides summary
- The Effects of Filesystem Fragmentation----OLS'06 (link)
- Ext4 Block and Inode Allocator Improvements----OLS'08 (link)
- File Systems Fated for Senescence? Nonsense, Says Science!----FAST'17 (link)
- Filesystem Aging: It's more Usage than Fullness----HotStorage'19 (link)
- Understanding Configuration Dependencies of File Systems----HotStorage'22 (link)
- CONFD: Analyzing Configuration Dependencies of File Systems for Fun and Profit----FAST'24 (link)
- Journaling of Journal Is (Almost) Free----FAST'14 (link)
- iJournaling: Fine-Grained Journaling for Improving the Latency of Fsync System Call----USENIX ATC'17 (link)
- FastCommit: Resource-efficient, Performant and Cost-effective File System Journaling----USENIX ATC'24 (link)
- StreamCache: Revisiting Page Cache for File Scanning on Fast Storage Devices----USENIX ATC'24 (link)
- The Linear Tape File System----MSST'10 (link)
- Scale and Concurrency of GIGA+: File System Directories with Millions of Files----FAST'11 (link)
- F2FS: A New File System for Flash Storage----FAST'15 (link)
- POSIX is Dead! Long Live... errr... What Exactly?----HotStorage'17 (link)
- BetrFS: A Right-Optimized Write-Optimized File System----FAST'15 (link)
- The Full Path to Full-Path Indexing----FAST'18 (link)
- SplitFS: Reducing Software Overhead in File Systems for Persistent Memory----SOSP'19 (link)
- EROFS: A Compression-friendly Readonly File System for Resource-scarce Devices----USENIX ATC'19 (link)
- How to Copy Files----FAST'20 (link)
- WineFS: a hugepage-aware file system for persistent memory that ages gracefully----SOSP'21 (link)
- LineFS: Efficient SmartNIC Offload of a Distributed File System with Pipeline Parallelism----SOSP'21 (link)
- BetrFS: A Compleat File System for Commodity SSDs----EuroSys'22 (link)
- To FUSE or Not to FUSE: Performance of User-Space File Systems----FAST'17 (link)
- Performance and Resource Utilization of FUSE User-Space File Systems----ACM TOS'19 (link)
- XFUSE: An Infrastructure for Running Filesystem Services in User Space----USENIX ATC'21 (link)
- Survey of Distributed File System Design Choices----ACM TOS'22 (link)
- Can Modern LLMs Tune and Configure LSM-based Key-Value Stores?----HotStorage'24 (link)