Feature dimension reduction method for automatic classification of Chinese text
Title:
Feature dimension reduction method for automatic classification of Chinese text
Application Number:
200410000721
Application Date:
2004/01/16
Announcement Date:
2004/12/29
Pub. Date:
2006/04/19
Publication Number:
1558367
Announcement Number:
1252635
Grant Date:
2006/04/19
Granted Pub. Date:
2006/04/19
ApplicationType:
Invention
State/Country:
11[China|Beijing]
IPC:
G06K 9/80
Applicant(s):
Tsinghua University
Inventor(s):
Sun Maosong, Xue Dejun
Key Words:
Feature dimension reduction method, automatic classification, Chinese text
Abstract:
In the present invention, a feature selection method is first chosen and applied to reduce the dimension of the original feature set, yielding an intermediate feature set. The intermediate feature set is then analyzed to find "highly overlapped binary strings" and "highly deviated binary strings" (binary strings being bigrams). The highly overlapped binary strings are merged into the corresponding ternary strings (trigrams) and the highly deviated binary strings are deleted, giving the learning feature set used for machine learning. Finally, a classifier is obtained for use in the classification stage. The invention makes full use of the characteristics of the language and greatly reduces the dimension relative to the intermediate feature set, while ensuring that the selected features have high classification capacity and descriptive capacity; the abstract presents this as superior to feature selection that relies on statistical measures alone.
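The steps in the abstract can be sketched in Python as follows. This is a minimal illustration, not the patented procedure: the abstract does not define how a binary string is judged "highly overlapped" or "highly deviated", so the sketch assumes an occurrence-ratio threshold (`overlap_threshold`) for the former and a caller-supplied predicate (`is_deviated`) for the latter. Ranking by document frequency in `select_intermediate` is likewise only a stand-in for whichever feature selection method is chosen in the first step. All function and parameter names are hypothetical.

```python
from __future__ import annotations

from collections import Counter
from typing import Callable


def char_ngrams(text: str, n: int) -> list[str]:
    """Return all contiguous character n-grams of a text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def select_intermediate(docs: list[str], keep: int) -> set[str]:
    """Step 1 (stand-in): keep the `keep` bigrams with the highest
    document frequency. Any feature selection method could be used here."""
    df: Counter[str] = Counter()
    for doc in docs:
        df.update(set(char_ngrams(doc, 2)))
    ranked = sorted(df, key=lambda g: (-df[g], g))  # ties broken by string order
    return set(ranked[:keep])


def reduce_features(
    docs: list[str],
    intermediate: set[str],
    overlap_threshold: float,
    is_deviated: Callable[[str], bool],
) -> set[str]:
    """Steps 2 and 3: merge overlapping bigrams into trigrams, then
    delete deviated bigrams. Both criteria are assumptions of this sketch."""
    bigram_count: Counter[str] = Counter()
    trigram_count: Counter[str] = Counter()
    for doc in docs:
        bigram_count.update(char_ngrams(doc, 2))
        trigram_count.update(char_ngrams(doc, 3))

    learning = set(intermediate)
    for trigram, count in trigram_count.items():
        left, right = trigram[:2], trigram[1:]
        if left in intermediate and right in intermediate:
            # Assumed overlap test: nearly every occurrence of both bigrams
            # lies inside this one trigram.
            if (count / bigram_count[left] >= overlap_threshold
                    and count / bigram_count[right] >= overlap_threshold):
                learning.discard(left)
                learning.discard(right)
                learning.add(trigram)

    # Assumed deviation test: delegated entirely to the caller's predicate.
    return {f for f in learning if len(f) == 3 or not is_deviated(f)}


# Toy corpus; the "deviated" rule below (contains the particle 的) is invented
# purely for illustration.
docs = ["计算机的网络", "计算机的系统", "计算机科学"]
intermediate = select_intermediate(docs, keep=3)
learning = reduce_features(docs, intermediate, 0.9, lambda g: "的" in g)
print(sorted(learning))
```

The resulting learning feature set would then be handed to whatever learner trains the classifier; that stage is omitted here because the abstract does not specify it.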
Claim:
Priority:
PCT:
LegalStatus:
