Image Captioning Using Detectors and Swarm Based Learning Approach for Word Embedding Vectors
Author affiliations: CSE Department, Sethu Institute of Technology, Pulloor, Kariapatti 626115, India; CSE Department, National Engineering College, K.R. Nagar, Kovilpatti 628503, India
Publication: Computer Systems Science & Engineering
Year, volume, issue: 2023, Vol. 44, No. 1
Pages: 173-189
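Subject classification: 0810 [Engineering - Information and Communication Engineering]; 0808 [Engineering - Electrical Engineering]; 08 [Engineering]; 0802 [Engineering - Mechanical Engineering]; 0701 [Science - Mathematics]; 0811 [Engineering - Control Science and Engineering]; 0812 [Engineering - Computer Science and Technology (engineering or science degree may be awarded)]
Topics: Denoising improved YoloV3 multishot multibox detector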
Abstract: IC (Image Captioning) is a crucial part of Visual Data Processing and aims at understanding an image in order to provide captions that verbalize the image’s important ***, in existing works, because of the complexity of images, neglect of the major relations between the objects in an image, and poor image quality, labelling remains a big problem for ***, the main objective of this work is to overcome these challenges by proposing a novel framework for *** in this research work, the main contribution is a framework consisting of three phases: image understanding, textual understanding, and ***, the image understanding phase is initiated with image pre-processing to enhance image ***, objects are detected using IYV3MMDs (Improved YoloV3 Multishot Multibox Detectors) in order to capture the interrelation between the image and its objects, followed by MBFOCNNs (Modified Bacterial Foraging Optimization in Convolution Neural Networks), which encode and provide the final feature ***, the textual understanding phase is performed based on an image and is initiated with preprocessing of text, where unwanted words, phrases, and punctuation are removed in order to provide a healthy *** is then followed by MGloVEs (Modified Global Vectors for Word Representation), which provide a word embedding of features with the highest priority given to the objects present in an ***, the decoding phase is performed, which decodes the image, whether it is a normal or a complex scene image, and provides accurate text through its learning ability using MDAA (Modified Deliberate Adaptive Attention). The experimental outcome of this work shows a better accuracy of 96.24% when compared to existing and similar methods while generating captions for images.
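The text preprocessing step mentioned in the abstract (removing unwanted words and punctuation before word embedding) might look roughly like the sketch below. This is only an illustration under stated assumptions: the function name `clean_caption` and the stop-word list are hypothetical and are not taken from the paper, which may use a different cleaning procedure.

```python
import string

# Hypothetical stop-word list, chosen only for illustration; not from the paper.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "on", "in", "with"}


def clean_caption(caption):
    """Lowercase a caption, strip ASCII punctuation, and drop stop words."""
    # Remove every ASCII punctuation character in one pass.
    no_punct = caption.lower().translate(str.maketrans("", "", string.punctuation))
    # Split on whitespace and keep only tokens outside the stop-word list.
    return [tok for tok in no_punct.split() if tok not in STOP_WORDS]
```

The resulting token list is the kind of input a word embedding model would consume; the choice of stop words strongly affects which object words survive, so a real pipeline would tune that list to the caption vocabulary.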