Web text mining method based on subtopic selection and three-level stratified structure
To address the fuzzy queries and data sparseness caused by the intent gap between users and their queries, subtopic selection and ranking over a stratified structure are used for web text mining, producing a ranked list of possible subtopics based on popularity and diversity. First, on the basis of noun phra...
Main Authors: | Yuzhen SHI, Donghong SHAN |
---|---|
Format: | Article |
Language: | zho |
Published: | Beijing Xintong Media Co., Ltd, 2016-05-01 |
Series: | Dianxin kexue |
Subjects: | data sparseness; text mining; stratified structure; diversity; popularity |
Online Access: | http://www.telecomsci.com/zh/article/doi/10.11959/j.issn.1000-0801.2016142/ |
_version_ | 1841529902046642176 |
---|---|
author | Yuzhen SHI Donghong SHAN |
author_facet | Yuzhen SHI Donghong SHAN |
author_sort | Yuzhen SHI |
collection | DOAJ |
description | To address the fuzzy queries and data sparseness caused by the intent gap between users and their queries, subtopic selection and ranking over a stratified structure are used for web text mining, producing a ranked list of possible subtopics based on popularity and diversity. First, on the basis of noun phrases and substitutes for parts of the query, a simple model extracts a variety of related phrases as candidate subtopics. Then, related documents from a web document collection are used to build a three-level stratified structure of the candidate subtopics. Finally, taking both popularity and diversity into account, the stratified structure and the estimated popularity are used for ranking. Experiments on 100 Japanese queries from the NTCIR-9 collection, 100 English queries from the TREC 2009 collection, and the Web Track diversity task show that the proposed method can be applied effectively to a variety of searches, and that the proposed mining outperforms the use of external resources for highly ranked subtopics. |
format | Article |
id | doaj-art-9e39d19a12ca4a8184d1650a9e8e1cc3 |
institution | Kabale University |
issn | 1000-0801 |
language | zho |
publishDate | 2016-05-01 |
publisher | Beijing Xintong Media Co., Ltd |
record_format | Article |
series | Dianxin kexue |
spellingShingle | Yuzhen SHI Donghong SHAN Web text mining method based on subtopic selection and three-level stratified structure Dianxin kexue data sparseness text mining stratified structure diversity,popularity |
title | Web text mining method based on subtopic selection and three-level stratified structure |
title_full | Web text mining method based on subtopic selection and three-level stratified structure |
title_fullStr | Web text mining method based on subtopic selection and three-level stratified structure |
title_full_unstemmed | Web text mining method based on subtopic selection and three-level stratified structure |
title_short | Web text mining method based on subtopic selection and three-level stratified structure |
title_sort | web text mining method based on subtopic selection and three level stratified structure |
topic | data sparseness; text mining; stratified structure; diversity; popularity |
url | http://www.telecomsci.com/zh/article/doi/10.11959/j.issn.1000-0801.2016142/ |
work_keys_str_mv | AT yuzhenshi webtextminingmethodbasedonsubtopicselectionandthreelevelstratifiedstructure AT donghongshan webtextminingmethodbasedonsubtopicselectionandthreelevelstratifiedstructure |