Alibaba Cloud Launches Open-Source Models that Understand Images


Image Recognition


Photo credit: Shutterstock

Alibaba Cloud said on Friday that it is releasing two open-source large vision language models that understand both images and text. Both models are built on Qwen-7B, the 7-billion-parameter large language model that the company open-sourced earlier this month.

Alibaba Cloud said that, compared with other open-source large vision language models, Qwen-VL can comprehend images at higher resolution, leading to better image recognition and understanding performance. Incorporating other sensory input into large language models opens up possibilities for new applications for researchers and commercial organizations.

Alibaba Cloud said its pre-trained 7-billion-parameter large language model, Qwen-7B, and its conversationally fine-tuned version, Qwen-7B-Chat, have garnered over 400,000 downloads within a month of their launch.