|
| 1 | +################################################################################ |
| 2 | +# PS-MCL : Parallel Shotgun Coarsening Markov Clustering. |
| 3 | +# |
| 4 | +# Author: InJae Yu ([email protected]), KAIST |
| 5 | +# YongSub Lim([email protected]), KAIST |
| 6 | +# U Kang ([email protected]), KAIST |
| 7 | +# |
| 8 | +# Version : 1.0 |
| 9 | +# Date : August 17, 2015 |
| 10 | +# Main Contact: U Kang |
| 11 | +# |
| 12 | +# |
| 13 | +################################################################################ |
| 14 | + |
| 15 | +1. General information |
| 16 | + |
| 17 | +This is an implementation of PS-MCL (Parallel Shotgun Coarsening MCL). |
| 18 | +This also includes implementations for the original MCL, R-MCL and MLR-MCL. |
| 19 | + |
| 20 | + |
| 21 | +2. Minimal Environment |
| 22 | + |
| 23 | +This tool needs |
| 24 | + (a) Java 1.8 or higher |
| 25 | + |
| 26 | +This software was tested on Ubuntu. |
| 27 | + |
| 28 | + |
| 29 | +3. How to run PS-MCL |
| 30 | + |
| 31 | +IMPORTANT! |
| 32 | +- Before use, check that the script files are executable. If they are not, |
| 33 | +change the script permissions manually or type "make install", which does the |
| 34 | +same thing. |
| 35 | +- Type "make demo" if you just want to try PS-MCL. |
| 36 | + |
| 37 | + |
| 38 | +To run PS-MCL, do the following: |
| 39 | +- Prepare an undirected edge file. |
| 40 | + The format should be "node_index DELIMITER node_index". |
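A minimal sketch of producing an edge file in the "node_index DELIMITER node_index" format described above. The tab delimiter, the one-edge-per-line layout, and the choice to write each undirected edge only once are assumptions made for illustration; this README does not specify them. The helper name `write_edge_file` is hypothetical.

```python
import io


def write_edge_file(edges, out, delimiter="\t"):
    """Write undirected edges as 'u<delimiter>v', one edge per line.

    Assumption: (u, v) and (v, u) denote the same undirected edge,
    so each pair is written only once.
    """
    seen = set()
    for u, v in edges:
        key = (min(u, v), max(u, v))  # normalize the endpoint order
        if key in seen:
            continue  # skip a duplicate of an edge already written
        seen.add(key)
        out.write(f"{key[0]}{delimiter}{key[1]}\n")
    return len(seen)


# Usage: an in-memory buffer stands in for a real file here.
buf = io.StringIO()
count = write_edge_file([(0, 1), (1, 0), (1, 2)], buf)
print(buf.getvalue(), end="")
```

To write to disk instead, pass a file object opened with `open(path, "w")` in place of the buffer.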
| 41 | + |
| 42 | + |
| 43 | +PS-MCL.jar is located in the 'bin' directory. |
| 44 | + |
| 45 | +Command : ./PS-MCL [INPUT (Graph File Path)] [Output Directory] [CoarseMode] [Coarse Level] [Balance Factor] [MCL Mode] [Number of Thread] [epsilon] [rand_seed] |
| 46 | + |
| 47 | +Provide the following arguments. |
| 48 | + |
| 49 | + (a) INPUT : path of the input graph file |
| 50 | + (b) Output Directory : directory for the output files |
| 51 | + (c) CoarseMode : -sc or -hem. "-sc" selects Shotgun Coarsening, the proposed method; "-hem" selects Heavy Edge Matching. |
| 52 | + (d) Coarse Level : the number of coarsening steps. Must be a non-negative integer. |
| 53 | + (e) Balance Factor : balance factor for B-MCL. If this is 0, R-MCL is executed. |
| 54 | + (f) MCL Mode : -basic or -reg. "-basic" runs MCL; "-reg" runs R-MCL or B-MCL (B-MCL is R-MCL with a balance factor larger than 0). |
| 55 | + (g) Number of Thread : the number of threads to use |
| 56 | + (h) epsilon : convergence threshold; the run continues until the error is below epsilon |
| 57 | + (i) rand_seed : random seed |
| 58 | + (j) skip rate : a floating-point number between 0 and 1, given when using "-sc" mode |
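The argument order can be sketched as a small helper that assembles the command line and checks the constraints stated above. This is an illustration only: `build_ps_mcl_args` is a hypothetical name, the epsilon and seed values in the usage lines are made up, and the skip rate is left out because the command template does not show where it goes.

```python
def build_ps_mcl_args(input_path, output_dir, coarse_mode, coarse_level,
                      balance_factor, mcl_mode, num_threads, epsilon, rand_seed):
    """Return the ./PS-MCL argument list in the order given by the command template."""
    if coarse_mode not in ("-sc", "-hem"):
        raise ValueError("CoarseMode must be -sc or -hem")
    if mcl_mode not in ("-basic", "-reg"):
        raise ValueError("MCL Mode must be -basic or -reg")
    if not isinstance(coarse_level, int) or coarse_level < 0:
        raise ValueError("Coarse Level must be a non-negative integer")
    return [
        "./PS-MCL", input_path, output_dir, coarse_mode, str(coarse_level),
        str(balance_factor), mcl_mode, str(num_threads), str(epsilon),
        str(rand_seed),
    ]


# Usage: epsilon=0.001 and rand_seed=1 are illustrative values, not defaults.
args = build_ps_mcl_args("dataset/SUNY/DIP/DIP", "./", "-sc", 4, 1.5,
                         "-reg", 4, 0.001, 1)
print(" ".join(args))
```

The resulting list can be passed to `subprocess.run(args)` from the directory that contains the PS-MCL script.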
| 59 | + |
| 60 | + |
| 61 | + |
| 62 | +4. Demo Run |
| 63 | +Type "make" in the source folder. This starts a demo run of B-MCL on "dataset/ETC/Yeast-2/Yeast-2" with 3 coarsening steps. |
| 64 | +The file "Yeast-2" contains 7,049 edges between 2,223 nodes. If the application runs correctly, the console shows the cluster size distribution in the format |
| 65 | +"size of cluster \t # of clusters \t # of nodes contained in clusters of that size", together with the detailed result. |
| 66 | + |
| 67 | + |
| 68 | +5. Output Explanation |
| 69 | +The output consists of the following files. |
| 70 | +"Data, MCL Mode, Coarsen Info, Thread Info".result |
| 71 | +"Data, MCL Mode, Coarsen Info, Thread Info".assign |
| 72 | +"Data, MCL Mode, Coarsen Info, Thread Info".dist |
| 73 | + |
| 74 | +ex) ./PS-MCL dataset/SUNY/DIP/DIP ./ -sc 4 1.5 -reg 4 1 |
| 75 | +produces three output files: |
| 76 | +================================================================================================================================================== |
| 77 | +DIP_B-MCL-1.5_SC_Level-5_numT-4_conv-norm.result : records time, NCut, # of clusters |
| 78 | + |
| 79 | +coarsen_mode coarseLevel b_Factor mcl_mode time NCut AVG_Ncut ClusterNum #ofThread ofIteration |
| 80 | +SC 5 1.500000 Regularized 5.109000 647.902975 0.398710 1625 4 50 |
| 81 | +================================================================================================================================================== |
| 82 | +DIP_B-MCL-1.5_SC_Level-5_numT-4_conv-norm.assign : cluster assignment of each node |
| 83 | + |
| 84 | +cluster index |
| 85 | +0 128 |
| 86 | +0 144 |
| 87 | +0 129 |
| 88 | +================================================================================================================================================== |
| 89 | +DIP_B-MCL-1.5_SC_Level-5_numT-4_conv-norm.dist : distribution of size of clusters |
| 90 | + |
| 91 | +size #of Clusters #of Nodes |
| 92 | +1 6 6 |
| 93 | +2 17 34 |
| 94 | +3 32 96 |
| 95 | +4 27 108 |
| 96 | +5 51 255 |
| 97 | +6 50 300 |
| 98 | +================================================================================================================================================== |
| 99 | +The contents of the .result and .dist files are also printed to the console. |
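The relationship between the .assign and .dist files can be sketched as follows: grouping the nodes of an .assign file by cluster and counting the cluster sizes yields the three .dist columns (size, number of clusters, number of nodes). The sketch assumes whitespace-separated "cluster node" rows under a "cluster index" header, as in the sample above; the function names and the sample rows are hypothetical.

```python
from collections import Counter, defaultdict


def read_assign(lines):
    """Group node indices by cluster from .assign-style rows."""
    clusters = defaultdict(list)
    for line in lines:
        parts = line.split()
        # Skip the header and any row that is not 'cluster node'.
        if len(parts) != 2 or not (parts[0].isdigit() and parts[1].isdigit()):
            continue
        clusters[int(parts[0])].append(int(parts[1]))
    return dict(clusters)


def size_distribution(clusters):
    """Return (size, number of clusters, number of nodes) rows sorted by size."""
    sizes = Counter(len(nodes) for nodes in clusters.values())
    return [(size, n, size * n) for size, n in sorted(sizes.items())]


# Usage with a few illustrative rows (not taken from a real run).
sample = ["cluster index", "0 128", "0 144", "0 129", "1 7"]
clusters = read_assign(sample)
for size, n_clusters, n_nodes in size_distribution(clusters):
    print(f"{size}\t{n_clusters}\t{n_nodes}")
```

For a real file, pass the open file object (for example `open(path)`) as `lines`.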
| 100 | + |
| 101 | + |
| 102 | + |
| 103 | +6. Scripts for existing MCL based methods |
| 104 | + |
| 105 | +To run MCL, R-MCL, or B-MCL, use the provided scripts MCL, R-MCL (Multi-Level), and B-MCL (Multi-Level). |
| 106 | +MCL : ./MCL [INPUT (Graph File Path)] [Output Directory] [epsilon] |
| 107 | +R-MCL : ./R-MCL [INPUT (Graph File Path)] [Output Directory] [Coarse Level] [epsilon] |
| 108 | +B-MCL : ./B-MCL [INPUT (Graph File Path)] [Output Directory] [Coarse Level] [Balance Factor] [epsilon] |
| 109 | + |
| 110 | + |
| 111 | +7. Rebuilding the source code |
| 112 | + |
| 113 | +The distribution includes the source code, which is in the 'src' directory. |
| 114 | +You can modify the code and rebuild it. |
| 115 | +Since the binary file PS-MCL.jar already exists in the 'bin' directory, you |
| 116 | +normally don't need to build the code again. These instructions apply only when |
| 117 | +you modify the source code and want to rebuild it. |
| 118 | +To build the source code, run the script 'compile_PS-MCL.sh'. The script |
| 119 | +automatically compiles the source code in 'src' and creates the jar file |
| 120 | +PS-MCL.jar in the 'bin' directory. |