Succinct-Core

Java implementation of Succinct's core algorithms. This library provides the core algorithms for Succinct as described in the NSDI'15 paper.

Requirements

This library has no external requirements.

Dependency Information

Apache Maven

To build your application with Succinct-Core, you can link against this library using Maven by adding the following dependency information to your pom.xml file:

<dependency>
    <groupId>amplab</groupId>
    <artifactId>succinct-core</artifactId>
    <version>0.1.8</version>
</dependency>

Usage

The Succinct-Core library exposes Succinct in three layers:

SuccinctCore
SuccinctFile
SuccinctIndexedFile

SuccinctCore

SuccinctCore exposes the basic construction primitive for all internal data structures, along with accessors to the core data structures (e.g., NPA, SA and ISA, which the paper calls NextCharIdx, AOS2Input and Input2AOS, respectively). SuccinctBuffer provides an implementation of SuccinctCore.
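To make the three core arrays concrete, here is a minimal, uncompressed sketch of how SA, ISA and NPA relate to each other for a small input. It is not the library's construction code: the class name NaiveSuffixArrays and its fields are hypothetical, the sort is a naive comparison sort, and the last position wraps around to the first instead of using an end-of-file sentinel.

```java
import java.util.Arrays;

// Hypothetical illustration class; not part of Succinct-Core.
class NaiveSuffixArrays {
    final int[] sa;   // sa[i]  = input offset of the i-th smallest suffix
    final int[] isa;  // isa[p] = rank of the suffix that starts at offset p
    final int[] npa;  // npa[i] = rank of the suffix that starts one byte after sa[i]

    NaiveSuffixArrays(byte[] input) {
        int n = input.length;
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) {
            order[i] = i;
        }
        // Naive construction: sort suffix start offsets lexicographically.
        Arrays.sort(order, (a, b) -> compareSuffixes(input, a, b));

        sa = new int[n];
        isa = new int[n];
        npa = new int[n];
        for (int i = 0; i < n; i++) {
            sa[i] = order[i];
            isa[sa[i]] = i;
        }
        // Assumption: the position after the last byte wraps to offset 0.
        for (int i = 0; i < n; i++) {
            npa[i] = isa[(sa[i] + 1) % n];
        }
    }

    // Compares two suffixes as unsigned bytes; a suffix that is a prefix
    // of the other sorts first.
    static int compareSuffixes(byte[] text, int a, int b) {
        int n = text.length;
        while (a < n && b < n) {
            int diff = (text[a] & 0xFF) - (text[b] & 0xFF);
            if (diff != 0) {
                return diff;
            }
            a++;
            b++;
        }
        return (a == n ? 0 : 1) - (b == n ? 0 : 1);
    }
}
```

For the input "banana", the sorted suffixes are a, ana, anana, banana, na, nana, so SA is [5, 3, 1, 0, 4, 2] and ISA is its inverse permutation. Following NPA from a given rank walks the input one byte at a time, which is what allows the original data to be recovered from a compressed representation.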

SuccinctFile

SuccinctFile builds on top of SuccinctCore and exposes the interface for three main functionalities:

byte[] extract(int offset, int length)
long[] search(byte[] query)
long count(byte[] query)

These primitives allow random access (extract) and search (count, search) directly on the compressed representation of flat-file (i.e., unstructured) data. SuccinctFileBuffer is a ByteBuffer-based implementation of SuccinctFile. See this example for how SuccinctFileBuffer can be used.
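The intended semantics of the three operations can be sketched with a plain, uncompressed reference class. The class NaiveFlatFile below is hypothetical and only mirrors the three method signatures listed above; it scans raw bytes rather than a compressed representation, and the ascending order of the offsets it returns from search is an assumption of this sketch, not a documented guarantee of the library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical uncompressed stand-in that mirrors the SuccinctFile operations.
class NaiveFlatFile {
    private final byte[] data;

    NaiveFlatFile(byte[] data) {
        this.data = data.clone();
    }

    // Returns up to `length` bytes starting at `offset`, clamped to the end of the data.
    byte[] extract(int offset, int length) {
        int end = Math.min(data.length, offset + length);
        return Arrays.copyOfRange(data, offset, end);
    }

    // Returns the offset of every (possibly overlapping) occurrence of `query`.
    long[] search(byte[] query) {
        List<Long> hits = new ArrayList<>();
        for (int i = 0; i + query.length <= data.length; i++) {
            if (matchesAt(i, query)) {
                hits.add((long) i);
            }
        }
        long[] offsets = new long[hits.size()];
        for (int i = 0; i < offsets.length; i++) {
            offsets[i] = hits.get(i);
        }
        return offsets;
    }

    // Returns the number of occurrences of `query`.
    long count(byte[] query) {
        return search(query).length;
    }

    private boolean matchesAt(int start, byte[] query) {
        for (int j = 0; j < query.length; j++) {
            if (data[start + j] != query[j]) {
                return false;
            }
        }
        return true;
    }
}
```

On the input "banana", search for "ana" yields offsets 1 and 3, count for "a" yields 3, and extract(2, 3) yields "nan". A Succinct-backed implementation is expected to answer the same queries without scanning or decompressing the whole file.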

SuccinctIndexedFile

Finally, SuccinctIndexedFile builds on the functionality of both SuccinctCore and SuccinctFile to expose a record buffer, i.e., a collection of records. This interface finds applications in the Succinct on Apache Spark interfaces, particularly in the SuccinctRDD and SuccinctTableRDD implementations.
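One way to picture a record buffer is as a single flat byte array plus an array of record start offsets. The sketch below assumes that layout purely for illustration; NaiveRecordBuffer, getRecord and recordIdAt are hypothetical names, not the SuccinctIndexedFile API, and the sketch assumes every record is non-empty.

```java
import java.util.Arrays;

// Hypothetical illustration of a record buffer layered over flat bytes.
class NaiveRecordBuffer {
    private final byte[] data;    // all records concatenated back to back
    private final int[] offsets;  // offsets[i] = start of record i in data

    NaiveRecordBuffer(byte[][] records) {
        offsets = new int[records.length];
        int total = 0;
        for (int i = 0; i < records.length; i++) {
            offsets[i] = total;
            total += records[i].length;
        }
        data = new byte[total];
        for (int i = 0; i < records.length; i++) {
            System.arraycopy(records[i], 0, data, offsets[i], records[i].length);
        }
    }

    // Returns a copy of the bytes of record `id`.
    byte[] getRecord(int id) {
        int end = (id + 1 < offsets.length) ? offsets[id + 1] : data.length;
        return Arrays.copyOfRange(data, offsets[id], end);
    }

    // Maps a flat-file offset (e.g., a search hit) to the record containing it.
    int recordIdAt(int offset) {
        int pos = Arrays.binarySearch(offsets, offset);
        // A negative result encodes the insertion point; the containing
        // record is the one that starts just before it.
        return pos >= 0 ? pos : -pos - 2;
    }
}
```

With this layout, an offset returned by a flat-file search can be translated into a record identifier with one binary search, which is the kind of lookup a record-oriented interface needs on top of flat-file operations.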

Example Program

We provide an example program that demonstrates the count, search and extract functionalities of SuccinctFile. A convenience script is included in the bin/ directory to run the example. The usage of the script is as follows:

./bin/succinct-shell <file-name>

where <file-name> is the name of the file being analyzed.