guokr/string-demon

This project extracts features from strings for use in downstream machine learning algorithms.

sample:

import string_demon as sd

str1 = "我住在北方,夜晚听见窗外的雨声,让我想起了南方。May the force be with you...."
print sd.spam_check(str1)

(0.9047619047619048, 2.6246719160104988, 4.833333333333333, 0.7241379310344828)

The returned tuple is: (Chinese repetition rate, Chinese pause length, English pause length, Chinese-to-English length ratio)

import string_demon as sd

str2 = "我住在南方,我住在南方。"

print sd.lcs_check(str2)

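(2, '\xe6\x88\x91\xe4\xbd\x8f\xe5\x9c\xa8\xe5\x8d\x97\xe6\x96\xb9', 5)

The returned tuple is: (repetition count, LCS, LCS length). The LCS is shown as a UTF-8 byte string; it decodes to "我住在南方", which is 5 characters long.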
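The (count, substring, length) shape of the lcs_check result can be illustrated with a brute-force search for the longest substring that occurs at least twice. This is only a minimal sketch and not string_demon's actual implementation: the function name longest_repeated_substring is hypothetical, and it assumes the input is a Unicode string (Python 3 str), so lengths are counted in characters rather than bytes.

```python
# -*- coding: utf-8 -*-


def longest_repeated_substring(text):
    """Return (count, substring, length) for the longest substring
    that occurs at least twice without overlapping itself.

    Hypothetical helper for illustration only; it is not part of
    string_demon. Returns (0, "", 0) when nothing repeats.
    """
    n = len(text)
    # A substring that appears twice without overlap can be at most
    # half as long as the text, so start there and work downwards.
    for length in range(n // 2, 0, -1):
        for start in range(n - length + 1):
            candidate = text[start:start + length]
            # str.count counts non-overlapping occurrences.
            count = text.count(candidate)
            if count >= 2:
                return count, candidate, length
    return 0, "", 0


if __name__ == "__main__":
    print(longest_repeated_substring("我住在南方,我住在南方。"))
    # → (2, '我住在南方', 5)
```

The search is cubic in the worst case, which is fine for short comment-like strings; a suffix array or suffix automaton would be the usual choice for long inputs.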
