Improve Smartphones transitive closure performance #18

Open
DavidLeoni opened this issue Aug 29, 2016 · 0 comments

DavidLeoni commented Aug 29, 2016

If we run the transitive closure on WordNet, then import smartphones and run the transitive closure again, we get these timings:

  • Regular hard disk:
    normalization takes ~2 mins and closure takes ~5 mins 40 secs (nearly 8 mins for the whole import), which is way too much.
  • SSD:
    the whole process takes just 45 secs!

If we need it, it should be possible to improve the timings with better hash indexes or with algorithms that exploit the properties of DAGs.
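To illustrate what "exploiting DAG properties" could mean, here is a minimal Python sketch (not Diversicon code, and not necessarily the algorithm it currently uses). It assumes the relations of one type form a DAG given as in-memory `(source, target)` pairs, and visits nodes from the sinks upward so that each node's reachable set is built once from the already finished sets of its direct successors. The function name `dag_transitive_closure` and the toy edges are hypothetical.

```python
from collections import defaultdict, deque


def dag_transitive_closure(edges):
    """Return {node: set of nodes reachable from it} for a DAG.

    `edges` is an iterable of (source, target) pairs. Raises ValueError
    if the graph has a cycle, since the sink-first order then does not exist.
    """
    successors = defaultdict(set)
    predecessors = defaultdict(set)
    nodes = set()
    for source, target in edges:
        successors[source].add(target)
        predecessors[target].add(source)
        nodes.update((source, target))

    # Number of direct successors of each node that are not yet processed.
    pending = {node: len(successors[node]) for node in nodes}
    # Sinks have no successors, so they can be processed immediately.
    ready = deque(node for node in nodes if pending[node] == 0)

    reachable = {}
    while ready:
        node = ready.popleft()
        closure = set()
        for child in successors[node]:
            closure.add(child)
            closure |= reachable[child]  # child is guaranteed to be done already
        reachable[node] = closure
        for parent in predecessors[node]:
            pending[parent] -= 1
            if pending[parent] == 0:
                ready.append(parent)

    if len(reachable) != len(nodes):
        raise ValueError("graph contains a cycle")
    return reachable


# Hypothetical toy hypernym chain, for illustration only.
toy_edges = [
    ("smartphone", "phone"),
    ("phone", "device"),
    ("device", "artifact"),
]
closure = dag_transitive_closure(toy_edges)
print(sorted(closure["smartphone"]))
```

Each node is processed exactly once, so no fixed-point iteration over the whole edge table is needed. Whether this helps in practice depends on where the time actually goes: the large gap between the hard disk and SSD timings suggests disk access may matter more than the algorithm itself.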

Here's the log from the regular hard disk run, using DiverCli:

da@da-HP-ProBook-6560b:~/Da/tmp/divercli-0.1.0-SNAPSHOT/bin$ ./divercli --prj w init --db classpath:/div-wn31.h2.db.xz

Restoring database:   classpath:/div-wn31.h2.db.xz
                to:   /home/da/Da/tmp/divercli-0.1.0-SNAPSHOT/bin/w/w.h2.db  ...
Database created in 18secs
da@da-HP-ProBook-6560b:~/Da/tmp/divercli-0.1.0-SNAPSHOT/bin$ cd w
da@da-HP-ProBook-6560b:~/Da/tmp/divercli-0.1.0-SNAPSHOT/bin/w$ l
total 587M
drwxrwxr-x 2 da da 4.0K Oct 19 10:53 .
drwxr-xr-x 4 da da 4.0K Oct 19 10:53 ..
-rw-rw-r-- 1 da da  372 Oct 19 10:53 divercli.ini
-rw-rw-r-- 1 da da 587M Oct 19 10:53 w.h2.db
da@da-HP-ProBook-6560b:~/Da/tmp/divercli-0.1.0-SNAPSHOT/bin/w$ ../divercli --import-xml -a a -d d classpath:/smartphones.xml

ERROR: Unknown option: --import-xml
da@da-HP-ProBook-6560b:~/Da/tmp/divercli-0.1.0-SNAPSHOT/bin/w$ ../divercli import-xml -a a -d d classpath:/smartphones.xml

Connecting to database   jdbc:h2:file:/home/da/Da/tmp/divercli-0.1.0-SNAPSHOT/bin/w/w   ...

 Welcome to
         _  _                          _                     
      __| |(_)__   __  ___  _ __  ___ (_)  ___   ___   _ __  
     / _` || |\ \ / / / _ \| '__|/ __|| | / __| / _ \ | '_ \ 
    | (_| || | \ V / |  __/| |   \__ \| || (__ | (_) || | | |
     \__,_||_|  \_/   \___||_|   |___/|_| \___| \___/ |_| |_|


Going to import 1 files by import author a...
Loading LMF : classpath:/smartphones.xml ...

Validating XML Schema of /tmp/extracted4346025131758287327/smartphones.xml   ...

XML is valid!


Starting import...

Wed Oct 19 10:55:04 CEST 2016: COMMIT 0
Wed Oct 19 10:55:05 CEST 2016: COMMIT 9
TOTAL TIME: 795
NUM ENTRIES: 9
Done loading LMF : classpath:/smartphones.xml .

Executing post-import db validation ... 

DB is valid!

   Elapsed time: 9secs

Normalizing SynsetRelations...

Found 117795 synsets.

SynsetRelation normalization - processed synsets: 10,000
SynsetRelation normalization - processed synsets: 20,000
SynsetRelation normalization - processed synsets: 30,000
SynsetRelation normalization - processed synsets: 40,000
SynsetRelation normalization - processed synsets: 50,000
SynsetRelation normalization - processed synsets: 60,000
SynsetRelation normalization - processed synsets: 70,000
SynsetRelation normalization - processed synsets: 80,000
SynsetRelation normalization - processed synsets: 90,000

Done normalizing SynsetRelations.

   No edges were added to the 973,021 existing ones. 

   Elapsed time: 1min, 59secs

Computing transitive closure for SynsetRelations (may take some minutes) ...

   Found 973,021 synset relations.


   Elapsed time: 5mins, 39secs


Going to write closure into the db...


Done writing transitive closure for SynsetRelations.

   Max level:      3
   Initial edges:  973,021
   Inserted edges: 6
        hypernym:   6

Total elapsed time:  5mins, 40secs

Elapsed time: 7mins, 51secs   Started: 2016-10-19 10:55:03   Ended: 2016-10-19 11:02:55



Done importing 1 LMF by import author a

Imported lexical resources: 

    div-smartphones    from    classpath:/smartphones.xml

Disconnected.

@DavidLeoni DavidLeoni added this to the 0.1 milestone Aug 29, 2016
@DavidLeoni DavidLeoni self-assigned this Aug 29, 2016
@DavidLeoni DavidLeoni removed this from the 0.1 milestone Sep 5, 2016