- Avrupa Bilim ve Teknoloji Dergisi
- Issue: 52
Fusion of High-Level Visual Attributes for Image Captioning
Authors : Murat Kilci, Özkan Çayli, Volkan Kiliç
Pages : 161-168
Publication Date : 2023-12-15
Article Type : Research
Abstract : Image captioning aims to generate a natural language description that accurately conveys the content of an image. Recently, deep learning models have been used to extract visual attributes from images, enhancing the accuracy of captions. However, these visual attributes must be assessed to ensure optimal performance and to avoid incorporating redundant or misleading information. In this study, we employ visual attributes obtained from semantic segmentation, object detection, instance segmentation, and keypoint detection, as well as their fusion. Experimental evaluations on the commonly used VizWiz and MSCOCO Captions datasets demonstrate that the fusion of visual attributes improves the accuracy of caption generation. Furthermore, the image captioning model that uses the fused visual attributes has been embedded into our custom-designed Android application, NObstacle, enabling captioning without an internet connection.
Keywords : Visual Attributes, Image Captioning, Android Application