Technical Name See-through-Text Grouping for Referring Image Segmentation
Project Operator Institute of Information Science, Academia Sinica
Project Host 劉庭祿
Summary
We propose an iterative learning scheme for referring image segmentation. Starting from a given referring expression, each iteration learns to predict the expression's relevance to each pixel and derives a see-through-text embedding in the form of a pixel-wise heatmap. A ConvRNN then refines the heatmap, and the refined heatmap is used to alter the referring expression that starts the next iteration.
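The loop described above can be illustrated with a small NumPy sketch. It is not the project's model: nothing here is learned, the ConvRNN is replaced by a simple recurrent blend with a 3x3 mean filter, and cosine similarity stands in for the learned see-through-text embedding. All function names, parameters (`num_iters`, `mix`), and array shapes are assumptions made for illustration only.

```python
import numpy as np


def cosine_heatmap(pixel_feats, text_vec):
    """Cosine similarity of every pixel feature to the text vector, mapped to [0, 1]."""
    p = pixel_feats / (np.linalg.norm(pixel_feats, axis=-1, keepdims=True) + 1e-8)
    t = text_vec / (np.linalg.norm(text_vec) + 1e-8)
    return 0.5 * (p @ t + 1.0)  # shape (H, W)


def box_smooth(heatmap):
    """3x3 mean filter with edge padding (assumed stand-in for a learned ConvRNN cell)."""
    h, w = heatmap.shape
    padded = np.pad(heatmap, 1, mode="edge")
    out = np.zeros_like(heatmap)
    for dy in range(3):
        for dx in range(3):
            out += padded[dy:dy + h, dx:dx + w]
    return out / 9.0


def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()


def iterative_grouping(pixel_feats, word_embs, num_iters=3, mix=0.5):
    """Hypothetical loop: heatmap -> recurrent refinement -> re-weighted expression.

    pixel_feats: (H, W, D) visual features; word_embs: (T, D) word embeddings.
    Returns the refined heatmap (H, W) and the final word attention weights (T,).
    """
    num_words = word_embs.shape[0]
    word_weights = np.full(num_words, 1.0 / num_words)  # uniform attention to start
    state = np.zeros(pixel_feats.shape[:2])             # recurrent heatmap state
    for _ in range(num_iters):
        # 1) Expression embedding under the current word attention.
        text_vec = word_weights @ word_embs
        # 2) Pixel-wise relevance of the expression.
        heatmap = cosine_heatmap(pixel_feats, text_vec)
        # 3) Recurrent refinement: blend the previous state with the smoothed heatmap.
        state = (1.0 - mix) * state + mix * box_smooth(heatmap)
        # 4) Alter the expression: pool visual features under the refined heatmap
        #    and re-weight the words by their affinity to the pooled feature.
        pooled = (state[..., None] * pixel_feats).sum(axis=(0, 1)) / (state.sum() + 1e-8)
        word_weights = softmax(word_embs @ pooled)
    return state, word_weights
```

Calling `iterative_grouping` on random features of shape `(H, W, D)` and word embeddings of shape `(T, D)` returns an `(H, W)` heatmap and a length-`T` attention vector that sums to one. In a trained system, each of the four steps would be a learned module optimized end to end.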
Scientific Breakthrough
The technique iteratively updates the language expression and generates and refines the heatmap to tackle referring image segmentation. Our model is end-to-end trainable and achieves state-of-the-art performance on four datasets without relying on an object detector or an attribute predictor, as existing models do. The technique is easy to train and provides an additional attention-based referring representation.
Industrial Applicability
Multi-modal analysis is one of the main trends in current research. Referring image segmentation, the task addressed by our technique, is a cross-modal application that connects computer vision and natural language processing. The multi-modal representation embedding method in this technique can serve as a template for industry to develop multimedia applications on a combination of visual an
Keyword Computer Vision, Deep Learning, Convolutional Neural Network, Convolutional-Recurrent Neural Network, Image Segmentation, Natural Language, Referring Segmentation, Embedding, Attention, Referring Expression