Web text mining method based on subtopic selection and three-level stratified structure

To address the ambiguous queries and data sparseness caused by the intent gap between users and their queries, subtopic selection and stratified-structure ranking are used for web text mining, producing a ranked list of possible subtopics according to popularity and diversity. First, on the basis of noun phra...

Bibliographic Details
Main Authors: Yuzhen SHI, Donghong SHAN
Format: Article
Language: zho
Published: Beijing Xintong Media Co., Ltd 2016-05-01
Series: Dianxin kexue
Subjects: data sparseness, text mining, stratified structure, diversity, popularity
Online Access: http://www.telecomsci.com/zh/article/doi/10.11959/j.issn.1000-0801.2016142/
collection DOAJ
description To address the ambiguous queries and data sparseness caused by the intent gap between users and their queries, a web text mining method based on subtopic selection and stratified-structure ranking is proposed, which produces a ranked list of possible subtopics according to popularity and diversity. First, on the basis of noun phrases and alternative partial queries, a simple model is used to extract a variety of related phrases as candidate subtopics. Then, relevant documents from a web document collection are used to build a three-level stratified structure of the candidate subtopics. Finally, the stratified structure and the estimated popularity are used to rank the subtopics, taking both popularity and diversity into account. Experiments on 100 Japanese queries from NTCIR-9 and 100 English queries from the TREC 2009 Web Track diversity task verify that the proposed method can be applied effectively to a variety of searches, and that the proposed mining performs better than external resources for high-ranking subtopics.
format Article
id doaj-art-9e39d19a12ca4a8184d1650a9e8e1cc3
institution Kabale University
issn 1000-0801
language zho
publishDate 2016-05-01
publisher Beijing Xintong Media Co., Ltd
record_format Article
series Dianxin kexue
topic data sparseness
text mining
stratified structure
diversity, popularity
url http://www.telecomsci.com/zh/article/doi/10.11959/j.issn.1000-0801.2016142/