Image Caption Generator using Siamese Graph Convolutional Networks and LSTM

Source
ACM International Conference Proceeding Series
Date Issued
2022-01-08
Author(s)
Kumar, Athul
Agrawal, Aarchi
Ashin Shanly, K. S.
Das, Sudip
Harilal, Nidhin
DOI
10.1145/3493700.3493754
Abstract
Image captions are the short descriptions that appear under images. They give the viewer a brief idea of the image's context. To generate an accurate description of a scene, a model requires a semantic and spatial understanding of its contents. This paper proposes a novel approach that uses a Siamese Graph Convolutional Network (S-GCN) with a non-parametric Kernel Activation Function (KAF), followed by an LSTM with attention, to generate natural-language captions for the input image. The S-GCN captures deep semantic relations and makes the model more robust to class imbalance. We use an extended kernel activation function and regularize with standard lp-norm techniques, which improves overall model performance by a significant margin. The model is trained and tested on the Flickr30K dataset and evaluated using BLEU-4 scores.
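As a rough illustration of the kernel activation function and lp-norm regularization mentioned in the abstract, the sketch below uses a common basic formulation: the activation is a weighted sum of Gaussian kernels centred on a fixed dictionary of points, and the mixing weights are the trainable part. This is not the paper's extended KAF, whose exact form the abstract does not give. The function names (make_dictionary, kaf, lp_penalty), the dictionary size, the boundary, and the bandwidth gamma are all assumptions made for illustration.

```python
import math


def make_dictionary(size=5, boundary=2.0):
    """Evenly spaced dictionary points in [-boundary, boundary] (hypothetical helper)."""
    step = 2.0 * boundary / (size - 1)
    return [-boundary + i * step for i in range(size)]


def kaf(s, alphas, dictionary, gamma=1.0):
    """Basic kernel activation: sum_i alpha_i * exp(-gamma * (s - d_i)^2).

    alphas are the mixing weights (learned in a real model); gamma is an
    assumed Gaussian kernel bandwidth.
    """
    return sum(a * math.exp(-gamma * (s - d) ** 2)
               for a, d in zip(alphas, dictionary))


def lp_penalty(alphas, p=2):
    """lp-norm of the mixing weights, usable as a regularization term."""
    return sum(abs(a) ** p for a in alphas) ** (1.0 / p)


if __name__ == "__main__":
    dictionary = make_dictionary(size=5, boundary=2.0)
    alphas = [0.1, -0.3, 0.8, 0.4, -0.2]  # arbitrary illustrative weights
    for s in (-1.0, 0.0, 1.0):
        print(s, kaf(s, alphas, dictionary))
    print("l2 penalty:", lp_penalty(alphas, p=2))
```

In a trained network each neuron would hold its own alphas, and the penalty would be added to the loss with a weighting coefficient. Those details are left out here because the abstract does not specify them.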
URI
https://d8.irins.org/handle/IITG2025/26203
IITGN Knowledge Repository Developed and Managed by Library