Image Recognition
On August 25th, Alibaba Cloud released two open-source large vision language models (LVLMs): Qwen-VL and its conversationally fine-tuned variant, Qwen-VL-Chat. Qwen-VL is the multimodal version of Qwen-7B, the 7-billion-parameter model in Alibaba Cloud’s Tongyi Qianwen family of large language models, and it supports multiple languages, including Chinese and English. The release follows the earlier M6 and OFA series of multimodal models. In early August, Alibaba Cloud had open-sourced the general-purpose Qwen-7B model and the dialogue model Qwen-7B-Chat, each with 7 billion parameters.