Sumit Singh

Sumit has 10+ years of experience in product development and AI. His expertise lies in data science, AI, computer vision, and NLP-based products. He also co-founded Labellerr to solve data preparation challenges.
Evaluating and Fine-Tuning Multimodal Video Captioning Models - A Case Study
NLP

Video captioning models represent a significant advancement at the intersection of computer vision and natural language processing. These models automatically generate textual descriptions of video content, improving accessibility, searchability, and user engagement. As video content continues to proliferate across platforms, the ability to accurately describe and index this content
13 min read