A topical VAEGAN-IHMM approach for automatic story segmentation
Feature representations with rich topic information can greatly improve the performance of story segmentation tasks. VAEGAN offers distinct advantages in feature learning by combining a variational autoencoder (VAE) and a generative adversarial network (GAN): it not only captures intricate data representations through the VAE's probabilistic encoding and decoding mechanism but also enhances feature diversity and quality via the GAN's adversarial training. To better learn topical domain representations, we used a topic classifier to supervise the training of the VAEGAN. Based on the learned features, a segmenter splits the document into shorter ones with different topics. The hidden Markov model (HMM) is a popular approach for story segmentation, in which stories are viewed as instances of topics (hidden states). The number of states has to be set manually, but it is often unknown in real scenarios. To solve this problem, we proposed an infinite HMM (IHMM) approach that places a hierarchical Dirichlet process (HDP) prior on the transition matrices over a countably infinite state space, so that the number of states is inferred automatically from the data. Given a running text, a blocked Gibbs sampler labeled the states with topic classes, and each position where the topic changes was taken as a story boundary. Experimental results on the TDT2 corpus demonstrated that the proposed topical VAEGAN-IHMM approach was significantly better than the traditional HMM method in story segmentation tasks and achieved state-of-the-art performance.
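The abstract describes topic-classifier-supervised VAEGAN feature learning. Below is a minimal illustrative sketch in PyTorch, not the authors' published implementation: the layer sizes, loss weights, and the names `Encoder`, `Decoder`, `Discriminator`, and `TopicClassifier` are assumptions, and documents are treated as generic dense feature vectors.

```python
# Sketch of a topic-supervised VAEGAN objective (illustrative only).
# Assumed, not from the paper: INPUT_DIM, LATENT_DIM, NUM_TOPICS, all layer
# sizes, and equal weighting of the loss terms.
import torch
import torch.nn as nn
import torch.nn.functional as F

INPUT_DIM, LATENT_DIM, NUM_TOPICS = 2000, 64, 20  # assumed dimensions

class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(INPUT_DIM, 256), nn.ReLU())
        self.mu = nn.Linear(256, LATENT_DIM)
        self.logvar = nn.Linear(256, LATENT_DIM)
    def forward(self, x):
        h = self.body(x)
        return self.mu(h), self.logvar(h)

class Decoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(LATENT_DIM, 256), nn.ReLU(),
                                 nn.Linear(256, INPUT_DIM))
    def forward(self, z):
        return self.net(z)

class Discriminator(nn.Module):  # separates real from reconstructed documents
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(INPUT_DIM, 256), nn.ReLU(),
                                 nn.Linear(256, 1))
    def forward(self, x):
        return self.net(x)

class TopicClassifier(nn.Module):  # supervises the latent code with topic labels
    def __init__(self):
        super().__init__()
        self.net = nn.Linear(LATENT_DIM, NUM_TOPICS)
    def forward(self, z):
        return self.net(z)

def reparameterize(mu, logvar):
    std = torch.exp(0.5 * logvar)
    return mu + std * torch.randn_like(std)

def vaegan_losses(x, y, enc, dec, disc, clf):
    mu, logvar = enc(x)
    z = reparameterize(mu, logvar)
    x_rec = dec(z)
    # VAE terms: reconstruction error + KL divergence to the standard normal prior.
    rec = F.mse_loss(x_rec, x)
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    # Adversarial terms: the discriminator scores real vs. reconstructed inputs.
    d_real = disc(x)
    d_fake = disc(x_rec.detach())
    d_loss = (F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real)) +
              F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake)))
    g_adv = F.binary_cross_entropy_with_logits(disc(x_rec), torch.ones_like(d_real))
    # Topic supervision: the classifier pushes z toward a topical representation.
    topic = F.cross_entropy(clf(z), y)
    gen_loss = rec + kl + g_adv + topic
    return gen_loss, d_loss

if __name__ == "__main__":
    enc, dec, disc, clf = Encoder(), Decoder(), Discriminator(), TopicClassifier()
    x = torch.randn(8, INPUT_DIM)            # toy batch of document vectors
    y = torch.randint(0, NUM_TOPICS, (8,))   # toy topic labels
    gen_loss, d_loss = vaegan_losses(x, y, enc, dec, disc, clf)
    print(float(gen_loss), float(d_loss))
```

A companion sketch of the IHMM labeling and boundary-detection step appears after the record fields below.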
| Main Authors: | Jia Yu; Huiling Peng; Guoqiang Wang; Nianfeng Shi |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | AIMS Press, 2024-07-01 |
| Series: | Mathematical Biosciences and Engineering |
| Subjects: | generative adversarial network; variational autoencoder; HDP; hidden Markov model; story segmentation |
| Online Access: | https://www.aimspress.com/article/doi/10.3934/mbe.2024289?viewType=HTML |
| author | Jia Yu; Huiling Peng; Guoqiang Wang; Nianfeng Shi |
|---|---|
| collection | DOAJ |
| description | Feature representations with rich topic information can greatly improve the performance of story segmentation tasks. VAEGAN offers distinct advantages in feature learning by combining a variational autoencoder (VAE) and a generative adversarial network (GAN): it not only captures intricate data representations through the VAE's probabilistic encoding and decoding mechanism but also enhances feature diversity and quality via the GAN's adversarial training. To better learn topical domain representations, we used a topic classifier to supervise the training of the VAEGAN. Based on the learned features, a segmenter splits the document into shorter ones with different topics. The hidden Markov model (HMM) is a popular approach for story segmentation, in which stories are viewed as instances of topics (hidden states). The number of states has to be set manually, but it is often unknown in real scenarios. To solve this problem, we proposed an infinite HMM (IHMM) approach that places a hierarchical Dirichlet process (HDP) prior on the transition matrices over a countably infinite state space, so that the number of states is inferred automatically from the data. Given a running text, a blocked Gibbs sampler labeled the states with topic classes, and each position where the topic changes was taken as a story boundary. Experimental results on the TDT2 corpus demonstrated that the proposed topical VAEGAN-IHMM approach was significantly better than the traditional HMM method in story segmentation tasks and achieved state-of-the-art performance. |
| format | Article |
| id | doaj-art-33e620b48815430991f415ce1142f7c0 |
| institution | Kabale University |
| issn | 1551-0018 |
| language | English |
| publishDate | 2024-07-01 |
| publisher | AIMS Press |
| record_format | Article |
| series | Mathematical Biosciences and Engineering |
| doi | 10.3934/mbe.2024289 |
| volume | 21 |
| issue | 7 |
| pages | 6608-6630 |
| affiliations | Jia Yu: School of Computer and Information Engineering, Luoyang Institute of Science and Technology, China; Software Research Institute, Technological University of Shannon, Ireland. Huiling Peng, Guoqiang Wang, Nianfeng Shi: School of Computer and Information Engineering, Luoyang Institute of Science and Technology, China |
| title | A topical VAEGAN-IHMM approach for automatic story segmentation |
| topic | generative adversarial network; variational autoencoder; hdp; hidden markov model; story segmentation |
| url | https://www.aimspress.com/article/doi/10.3934/mbe.2024289?viewType=HTML |
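The abstract also describes labeling sentences with topic states via a blocked Gibbs sampler under an HDP prior, with a story boundary placed wherever the topic label changes. The sketch below illustrates only that step under stated simplifications, and is not the authors' sampler: a fixed truncation of K states stands in for the infinite state space, the per-sentence log-likelihoods and the transition matrix are toy random values rather than HDP draws over VAEGAN features, and the blocked move shown is forward-filtering backward-sampling (FFBS) of the whole state sequence.

```python
# Minimal sketch: one blocked resampling of the topic-state sequence (FFBS)
# followed by boundary extraction where the label changes. Assumed, not from
# the paper: the truncation K, the toy likelihoods, and the toy transitions.
import numpy as np

def logsumexp(a, axis):
    # Numerically stable log-sum-exp along the given axis.
    m = np.max(a, axis=axis, keepdims=True)
    return (m + np.log(np.sum(np.exp(a - m), axis=axis, keepdims=True))).squeeze(axis)

def sample_from_log(logp, rng):
    # Sample an index from unnormalized log-probabilities.
    p = np.exp(logp - logp.max())
    p /= p.sum()
    return rng.choice(len(p), p=p)

def ffbs_sample_states(log_lik, log_trans, log_init, rng):
    """Blocked resampling of s_1..s_T given likelihoods and transitions.

    log_lik:   (T, K) log p(x_t | s_t = k)
    log_trans: (K, K) log transition matrix (rows are distributions)
    log_init:  (K,)   log initial-state distribution
    """
    T, K = log_lik.shape
    # Forward filtering in log space.
    alpha = np.zeros((T, K))
    alpha[0] = log_init + log_lik[0]
    for t in range(1, T):
        prev = alpha[t - 1][:, None] + log_trans        # (K, K): prev state x next state
        alpha[t] = log_lik[t] + logsumexp(prev, axis=0)
    # Backward sampling of the whole sequence in one blocked move.
    states = np.zeros(T, dtype=int)
    states[T - 1] = sample_from_log(alpha[T - 1], rng)
    for t in range(T - 2, -1, -1):
        states[t] = sample_from_log(alpha[t] + log_trans[:, states[t + 1]], rng)
    return states

def story_boundaries(states):
    # A boundary is placed wherever the sampled topic label changes.
    return [t for t in range(1, len(states)) if states[t] != states[t - 1]]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T, K = 30, 5                                          # toy sizes, not from the paper
    log_lik = np.log(rng.dirichlet(np.ones(K), size=T))   # toy per-sentence likelihoods
    log_trans = np.log(rng.dirichlet(np.full(K, 0.5), size=K))  # stand-in for HDP-drawn rows
    log_init = np.log(np.full(K, 1.0 / K))
    states = ffbs_sample_states(log_lik, log_trans, log_init, rng)
    print("labels:", states)
    print("boundaries at sentence indices:", story_boundaries(states))
```

In the full IHMM, this blocked move would alternate with resampling the HDP-governed transition rows and the emission parameters, so that the effective number of topic states is inferred from the data rather than fixed in advance.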