prepare version 0.3
kermitt2 committed Oct 17, 2019
1 parent dea10fd commit 83cbb25
Showing 3 changed files with 10 additions and 6 deletions.
6 changes: 4 additions & 2 deletions Dependencies_INSTALL.md
@@ -67,12 +67,14 @@ See [ICU Readme](http://source.icu-project.org/repos/icu/trunk/icu4c/readme.html
6- Copy `libicuuc.a` from `icu/source/lib` and `libicudata.a` from `icu/source/stubdata` and put it under corresponding OS `libs/icu/<OS>`
# Contributors

+Main contact: Patrice Lopez ([email protected])
+
+pdfalto is developed by Patrice Lopez ([email protected]) and Achraf Azhar ([email protected]).
+
xpdf is developed by Glyph & Cog, LLC (1996-2017) and distributed under GPL2 or GPL3 license.

pdf2xml is orignally written by Hervé Déjean, Sophie Andrieu, Jean-Yves Vion-Dury and Emmanuel Giguet (XRCE) under GPL2 license.

-pdfalto has been modified and forked by Patrice Lopez ([email protected]) and Achraf Azhar ([email protected]).
-
The windows version has been built originally by [@pboumenot](https://github.com/boumenot) and ported on windows 7 for 64 bit, then for windows (native and cygwin) by [@lfoppiano](https://github.com/lfoppiano) and [@flydutch](https://github.com/flydutch).

# License
8 changes: 5 additions & 3 deletions Readme.md
@@ -5,7 +5,7 @@

**pdfalto** is a command line executable for parsing PDF files and producing structured XML representations of the PDF content in [ALTO](https://github.com/kermitt2/pdfalto/blob/master/schema/alto.xsd) format.

-**pdfalto** is a fork of [pdf2xml](http://sourceforge.net/projects/pdf2xml), developed at XRCE, with modifications for robustness, addition of features and output enhanced format in [ALTO](https://github.com/altoxml/documentation/wiki) (including in particular space information, useful for instance for further machine learning processing). It is based on the [Xpdf](http://www.xpdfreader.com/) library.
+**pdfalto** is initially a fork of [pdf2xml](http://sourceforge.net/projects/pdf2xml), developed at XRCE, with modifications for robustness, addition of features and output enhanced format in [ALTO](https://github.com/altoxml/documentation/wiki) (including in particular space information, useful for instance for further machine learning processing). It is based on the [Xpdf](http://www.xpdfreader.com/) library.

The latest stable version is *0.2*.

@@ -123,7 +123,9 @@ The executable `pdfalto` is generated in the root directory. Additionally, this

# Contributors

-Contact: Patrice Lopez ([email protected]), Achraf Azhar ([email protected])
+Contact: Patrice Lopez ([email protected])
+
+pdfalto is developed by Patrice Lopez ([email protected]) and Achraf Azhar ([email protected]).

pdf2xml is orignally written by Hervé Déjean, Sophie Andrieu, Jean-Yves Vion-Dury and Emmanuel Giguet (XRCE) under GPL2 license.

@@ -138,7 +140,7 @@ As the original pdf2xml and main dependency Xpdf, pdfalto is distributed under G

# Useful links

-Some tools for converting ALTO to other formats:
+Some tools for converting ALTO into other formats:

- https://github.com/filak/hOCR-to-ALTO
- https://github.com/UB-Mannheim/ocr-fileformat
2 changes: 1 addition & 1 deletion src/ConstantsUtils.cc
@@ -15,7 +15,7 @@ namespace ConstantsUtils {
const char *NAME_ANNOT = "annot";
const char *NAME_DATA_DIR = "_data";

-const char *PDFALTO_VERSION = "0.1";
+const char *PDFALTO_VERSION = "0.3";
const char *PDFALTO_NAME = "pdfalto";

const char *PDFALTO_CREATOR = "CONTRIBUTORS";