A Glimpse of Chinese Internet Giants' 50 AI Models and Applications

2024-03-12 Source: Sohu Fashion

(AsianFin) - A global wave of artificial intelligence (AI) in recent years, fueled by the popularity of OpenAI's ChatGPT, has prompted major technology companies in China, such as Alibaba, Baidu, ByteDance, Tencent, Huawei, Xiaohongshu, Meitu, iFlytek, and Qihoo 360, to develop their own large models. Some homegrown innovations are as follows:

Alibaba

AtomoVideo - Touted as China's "Sora" for Video Generation

AtomoVideo, introduced by Alibaba in March 2024, is a high-fidelity image-to-video generation framework. Positioned as a counterpart to OpenAI's Sora, AtomoVideo uses multi-granularity image injection to keep generated videos faithful to a given image. Its architecture is flexible and extends to video frame prediction, which lets it handle long-sequence video prediction tasks.

EMO - AI Image-Audio-Video Model

EMO, another Alibaba creation, generates expressive portrait videos directly from an image and an audio clip. The generated video is synchronized with the supplied audio, so users can create dynamic videos with fluent facial expressions. The model targets content creation needs such as speeches, e-commerce livestreams, and general video production.

Qwen-VL-Max - Multi-Modal Large Model Comparable to GPT-4

Introduced in January 2024, Qwen-VL-Max by Alibaba is an open-source multi-modal visual model. With capabilities comparable to GPT-4V and Gemini Ultra, this model excels in accurate image description, information reasoning, and extended creative tasks based on images.

Tongyi Qianwen, Tongyi Wanxiang, and Tongyi Tingwu

Alibaba's Tongyi Qianwen, an AI language model, serves as a smart Q&A assistant. The 2.0 version, released on October 31, 2023, exhibits significant improvements in complex instruction understanding, literary creation, general mathematics, knowledge retention, and resistance to hallucination.

Tongyi Wanxiang assists in image creation for artistic purposes, employing a combination of generative models to provide highly controllable and diverse image generation effects.

Tongyi Tingwu, an AI assistant, utilizes AI models for both language and audio-visual tasks, enhancing information production, organization, mining, and insight in the general audio-visual content domain.

Baidu

UniVG - Unified Modality Video Generation System

UniVG, unveiled by Baidu in January 2024, is a unified modality video generation system. Its distinguishing feature is that it applies different generation methods to high-freedom and low-freedom tasks, balancing the two. UniVG generates smooth videos from a single image or text prompt, with more stable and coherent frames than early AI video generation tools.

ERNIE Bot, Wenxin Yige, and Wenxin Qianfan

Baidu's Wenxin large model series, initiated in 2019, is a natural language processing model based on the ERNIE series. The 4.0 version, released in October 2023, marked a comprehensive upgrade in fundamental capabilities, aligning with the performance standards set by GPT-4.

ERNIE Bot, akin to Alibaba's Tongyi Qianwen, serves as a generative AI product for various Q&A interactions.

Wenxin Yige, an AI art creation platform, generates diverse AI creative images to aid in creative design.

Wenxin Qianfan serves as Baidu's enterprise-level large model production platform, providing services and tools for large model development and application.

ByteDance

SDXL-Lightning - ByteDance's Version of DALL·E for Text-to-Image Generation

SDXL-Lightning, developed by ByteDance, is an open-source text-to-image generation model, swiftly producing high-resolution images based on textual prompts. Its notable improvement lies in the accelerated generation speed, achieving text-to-image generation at 1024px resolution in minimal steps.

Meitu

MiracleVision - Meitu's AI Vision Large Model

Meitu's AI vision large model, MiracleVision, launched its closed beta in June 2023. Boasting powerful visual expressive and creative capabilities, it supports multiple renowned Meitu products. As of version 4.0, MiracleVision contributes to Meitu's product ecosystem, extending its influence to various industries, including e-commerce, advertising, gaming, animation, and film.

iFlytek

Xinghuo Yuyin - iFlytek's AI Speech Model

Xinghuo Yuyin, iFlytek's AI speech model, unifies recognition, translation, and multi-language classification tasks. The model improves speech recognition accuracy and features highly human-like speech synthesis that closely mimics natural human speech patterns.

Tencent

M2UGen - Multi-Modal Music Generation Model

M2UGen, Tencent's multi-modal music generation model, combines music understanding and generation capabilities, aiding users in artistic music creation. It supports music generation from text, images, videos, and audio inputs, offering users the ability to edit the generated music easily.

AnimateZero - Tencent's AI Video Generation Model

AnimateZero, released by Tencent's AI team, is an AI video generation model that builds on pre-trained video diffusion models to give more precise control over a video's appearance and motion. Users can generate dynamic, detailed videos from text and image inputs.

Xiaohongshu (Red, a social e-commerce app)

Hongshu Zhiyu - Xiaohongshu Text Generation Tool

Hongshu Zhiyu, introduced by Xiaohongshu, is an AI tool for automatically generating Xiaohongshu-style captions based on image content. The tool incorporates a text generation feature, a database of 15 million Xiaohongshu-style captions validated by users, and customizable options for tailoring captions based on individual preferences.

Qihoo 360

AIJi - 360's AI Conversational Agent

AIJi, launched by Qihoo 360, is an AI conversational agent designed for user interaction. It boasts comprehensive language understanding and generation capabilities, supporting natural and diverse conversations with users.

The Chinese technology landscape has seen significant advancements in AI development, with major companies releasing a multitude of models and applications in various domains such as language, image, audio, and video generation. These developments not only showcase the competitiveness of Chinese tech giants but also contribute to the global AI landscape, fostering innovation and pushing the boundaries of what AI can achieve.

