Tag‐inferring and tag‐guided Transformer for image captioning

Abstract: Image captioning is an important task for understanding images. Recently, many studies have used tags to build alignments between image information and language information. However, existing methods ignore the problem that simple semantic tags have difficulty expressing the detailed semant...

Bibliographic Details
Main Authors:	Yaohua Yi, Yinkai Liang, Dezhu Kong, Ziwei Tang, Jibing Peng
Format:	Article
Language:	English
Published:	Wiley 2024-09-01
Series:	IET Computer Vision
Subjects:	computer vision; image recognition; learning (artificial intelligence)
Online Access:	https://doi.org/10.1049/cvi2.12280

Similar Items

Vision Transformers for Image Classification: A Comparative Survey
by: Yaoli Wang, et al.
Published: (2025-01-01)

Monitoring poultry social dynamics using colored tags: Avian visual perception, behavioral effects, and artificial intelligence precision
by: Florencia B. Rossi, et al.
Published: (2025-01-01)

Pruning‐guided feature distillation for an efficient transformer‐based pose estimation model
by: Dong‐hwi Kim, et al.
Published: (2024-09-01)

Swin‐fisheye: Object detection for fisheye images
by: Dawei Zhang, et al.
Published: (2024-11-01)

3D landmark‐based face restoration for recognition using variational autoencoder and triplet loss
by: Sahil Sharma, et al.
Published: (2021-01-01)

Time-Distributed Vision Transformer Stacked With Transformer for Heart Failure Detection Based on Echocardiography Video
by: Mgs M. Luthfi Ramadhan, et al.
Published: (2024-01-01)

MIRA-CAP: Memory-Integrated Retrieval-Augmented Captioning for State-of-the-Art Image and Video Captioning
by: Sabina Umirzakova, et al.
Published: (2024-12-01)

Ear tag and PIT tag retention by white‐tailed deer
by: Emily H. Belser, et al.
Published: (2017-12-01)

Behavioral Tagging: A Translation of the Synaptic Tagging and Capture Hypothesis
by: Diego Moncada, et al.
Published: (2015-01-01)

Detailed Image Captioning and Hashtag Generation
by: Nikshep Shetty, et al.
Published: (2024-11-01)

DualAD: Dual adversarial network for image anomaly detection
by: Yonghao Wan, et al.
Published: (2024-12-01)

Safety After Dark: A Privacy Compliant and Real-Time Edge Computing Intelligent Video Analytics for Safer Public Transportation
by: Johan Barthelemy, et al.
Published: (2024-12-01)

Detection and Classification of Agave angustifolia Haw Using Deep Learning Models
by: Idarh Matadamas, et al.
Published: (2024-12-01)

Effect of Camera Choice on Image-Classification Inference
by: Jason Brown, et al.
Published: (2024-12-01)

DPANet: Position‐aware feature encoding and decoding for accurate large‐scale point cloud semantic segmentation
by: Haoying Zhao, et al.
Published: (2024-12-01)

An improved multi‐scale YOLOv8 for apple leaf dense lesion detection and recognition
by: Shixin Huo, et al.
Published: (2024-12-01)

Novel Advance Image Caption Generation Utilizing Vision Transformer and Generative Adversarial Networks
by: Shourya Tyagi, et al.
Published: (2024-11-01)

Improving spaCy dependency annotation and PoS tagging web service using independent NER services
by: Nico Colic, et al.
Published: (2019-06-01)

Cloud-edge collaboration based computer vision inference mechanism
by: Boheng TANG, et al.
Published: (2021-05-01)

Depth Segmentation Approach for Egocentric 3D Human Pose Estimation with a Fisheye Camera
by: Hyeonghwan Shin, et al.
Published: (2024-12-01)

Deployment of an Artificial Intelligent Robot for Weed Management in Legumes Farmland
by: Adedamola Abdulmatin Adeniji, et al.
Published: (2023-08-01)

Object detection in smart indoor shopping using an enhanced YOLOv8n algorithm
by: Yawen Zhao, et al.
Published: (2024-12-01)

L-AVATeD: The lidar and visual walking terrain dataset
by: David Whipps, et al.
Published: (2024-12-01)

An algorithm for cattle counting in rangeland based on multi‐scale perception and image association
by: Bingxuan Li, et al.
Published: (2024-11-01)

Application of Particle Transformer to quark flavor tagging in the ILC project
by: Tagami Risako, et al.
Published: (2024-01-01)

Geometry‐preserved image editing
by: Taeeun Kwon, et al.
Published: (2024-09-01)

Parallel Algorithms for Unsupervised Tagging
by: Sujith Ravi, et al.
Published: (2024-07-01)

An Interpretable Deep Learning-Based Feature Reduction in Video-Based Human Activity Recognition
by: Micheal Dutt, et al.
Published: (2024-01-01)

Improving the segmentation of the vertebrae using a multi-stage machine learning algorithm
by: Vladyslav Koniukhov
Published: (2024-11-01)

Stroke‐Seg: A deep learning‐based framework for Chinese stroke segmentation
by: Xinyu Gong, et al.
Published: (2024-11-01)

IMP‐DETR: Optimization model for defect detection of injection‐moulded products
by: Anzhan Liu, et al.
Published: (2024-12-01)

Improving unsupervised pedestrian re‐identification with enhanced feature representation and robust clustering
by: Jiang Luo, et al.
Published: (2024-12-01)

A fully rotation invariant multi‐camera finger vein recognition system
by: Bernhard Prommegger, et al.
Published: (2021-05-01)

Haplotype tagging efficiency and tagSNP sets portability in worldwide populations in NAT2 gene
by: Audrey Sabbagh, et al.
Published: (2007-12-01)

Recognition of vehicle license plates in highway scenes with deep fusion network and connectionist temporal classification
by: Liru Hua, et al.
Published: (2024-11-01)

Tag Rugby: Everything you need to know to play and coach
by: Liddiard, Jane
Published: (2014)

Discretization of a mathematical model for image analysis based on the optics of spiral beams
by: S.A. Kishkin, et al.
Published: (2024-04-01)

Leveraging modality‐specific and shared features for RGB‐T salient object detection
by: Shuo Wang, et al.
Published: (2024-12-01)

ADAPTIVE VISION AI
by: V. Vodyanitskyi, et al.
Published: (2024-12-01)

Fluorescently Tagged Poly(methyl methacrylate)s
by: Fabia Grisi, et al.
Published: (2024-12-01)