Alibaba reveals progress with large language model research as Chinese Big Tech firms continue to push for ChatGPT rival


By Ann Cao


Alibaba Group Holding’s in-house research unit is making progress with its own large language models (LLMs), as Chinese Big Tech companies continue to pile into the artificial intelligence (AI) space in an attempt to come up with a rival to OpenAI’s ChatGPT.

A group of researchers from DAMO Academy unveiled a new audiovisual language model called Video-LLaMA, which is designed to understand both the visual and auditory content of videos, in a research paper published last week on arXiv, an online repository of scientific papers.

The code has also been open-sourced by the researchers on the online developer community GitHub. Alibaba owns the South China Morning Post.

LLMs, which are trained through machine learning, underpin AI-powered chatbots like ChatGPT. They allow the chatbots to answer sophisticated queries and to generate detailed writing, code or other content.

The new DAMO Academy model improves on previous vision-LLMs because it can tackle two challenges in video understanding: capturing the temporal changes in visual scenes and integrating audiovisual signals, according to the three researchers, Zhang Hang, Li Xin and Bing Lidong.

In a case demonstrated by the researchers, when given a video of a man playing the saxophone on stage, the model was able to describe in text both the background sound of applause and the visual content of the video. By comparison, previous models, such as MiniGPT-4 and LLaVA, mainly focus on static image comprehension, the researchers said.

However, the researchers noted that the model is still “an early-stage prototype” with several limitations, such as a limited ability to handle long videos, including films and TV shows.

The move is part of broader efforts by Alibaba, which is in the midst of its largest-ever corporate restructuring, to double down on its investment in the development and application of LLMs.

Alibaba’s cloud unit in April unveiled its own alternative to ChatGPT – Tongyi Qianwen – which is based on DAMO’s LLMs, making Alibaba one of the earliest Chinese companies to join the ChatGPT bandwagon, along with search engine giant Baidu, which launched its Ernie Bot in March. The service had received more than 200,000 beta testing applications from corporate clients, Alibaba chairman and CEO Daniel Zhang Yong said in a conference call with analysts last month.

DAMO first introduced its LLM, called AliceMind, last September, when deputy head Zhou Jingren presented it at the World AI Conference in Shanghai. He described it as a multimodal pre-trained language model that is able to process different types of inputs, including text, images, audio and video.

Alibaba has started to work with partners to develop industry-specific AI models, Zhang said. It also plans to launch cloud products and enterprise solutions based on its AI model, and to integrate AI capabilities into various products, including its workplace collaboration tool DingTalk. – South China Morning Post
