Fusion of High-Level Visual Attributes for Image Captioning


Avrupa Bilim ve Teknoloji Dergisi
Issue: 52

Authors : Murat Kilci, Özkan Çayli, Volkan Kiliç

Pages : 161-168


Publication Date : 2023-12-15

Article Type : Research

Abstract : Image captioning aims to generate a natural language description that accurately conveys the content of an image. Recently, deep learning models have been used to extract visual attributes from images, enhancing the accuracy of captions. However, these visual attributes must be assessed to ensure optimal performance and to avoid incorporating redundant or misleading information. In this study, we employ visual attributes derived from semantic segmentation, object detection, instance segmentation, and keypoint detection, as well as their fusion. Experimental evaluations on the commonly used VizWiz and MSCOCO Captions datasets demonstrate that the fusion of visual attributes improves the accuracy of caption generation. Furthermore, the image captioning model that uses the fused visual attributes has been embedded into our custom-designed Android application, NObstacle, enabling captioning without an internet connection.
Keywords : Visual Attributes, Image Captioning, Android Application
