NVIDIA's AI Playground

Author: NVIDIA

Date: 2023-08-11

Category: AI Applications

Tags: NVIDIA

On August 9, NVIDIA launched AI Playground, literally a "playground" for artificial intelligence.
11231616-2023-08-11T15:16:31.png

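The playground launched with four projects: NeVA, Stable Diffusion XL, CLIP, and Llama 2. You can try them in whatever order interests you. They are presumably running on top-end GPUs, because everything responds quickly.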

NeVA: NeMo Vision and Language Assistant

NeVA is a multi-modal vision-language model that understands text and images and generates informative responses.

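Let's start with NeVA. It is a multi-modal vision-language model that understands text and images and generates informative responses; in other words, you can chat with a picture. The interface is shown below. I gave it a try: I uploaded a 5 MB image, the upload was practically instant, and the response took under 10 seconds. It does not seem very friendly to Chinese: it understands Chinese prompts but replies in English, although the replies are fairly accurate.
Official examples are also provided: click one of the images in the lower left to start a conversation.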

11231148-2023-08-11T15:12:01.png

Stable Diffusion XL

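Next up is Stable Diffusion XL (SDXL). Most readers will already know this one: it is among the most widely used AI image-generation models. Credit where it is due to Jensen Huang: the demo outputs 1024×1024 images directly, and each one takes about 10 seconds. My next graphics card will definitely be an NVIDIA one.
Here is how it handles an English prompt: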
11231213-2023-08-11T15:12:30.png

CLIP

The CLIP (Contrastive Language-Image Pretraining) model combines vision and language using contrastive learning. It understands images and text together, enabling tasks like image classification and object detection.
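I uploaded three images and entered the prompt "梯田" (terraced fields), and it returned a result quickly. This project feels a bit too grown-up for my taste, so on to the next one.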
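
For readers curious about what happens between uploading images and getting a result, here is a minimal sketch of CLIP-style scoring. It assumes the demo ranks the uploaded images against the text prompt, which the page does not confirm. The three-dimensional embeddings and file names below are made up for illustration; a real CLIP model produces its embeddings with a text encoder and an image encoder, and they have hundreds of dimensions. The `logit_scale` value is also an assumption.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def rank_images(text_embedding, image_embeddings, logit_scale=100.0):
    """Rank images by how well they match one text prompt.

    logit_scale sharpens the softmax; 100.0 is an assumed value.
    """
    names = list(image_embeddings)
    logits = [
        logit_scale * cosine_similarity(text_embedding, image_embeddings[name])
        for name in names
    ]
    probs = softmax(logits)
    return sorted(zip(names, probs), key=lambda pair: pair[1], reverse=True)


# Hypothetical embeddings: in practice these come from the model's encoders.
text_embedding = [0.9, 0.1, 0.2]  # stands in for the prompt "terraced fields"
image_embeddings = {
    "terraces.jpg": [0.8, 0.2, 0.1],
    "city.jpg": [0.1, 0.9, 0.3],
    "beach.jpg": [0.2, 0.3, 0.9],
}

ranking = rank_images(text_embedding, image_embeddings)
for name, prob in ranking:
    print(f"{name}: {prob:.3f}")
```

The image whose embedding points in nearly the same direction as the text embedding gets the highest probability, which is the core idea behind matching a prompt to a set of pictures.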

Llama 2

The last one is Llama 2.
Llama 2 is a large language AI model capable of generating text and code in response to prompts.

11231400-2023-08-11T15:14:16.png

Link: https://catalog.ngc.nvidia.com/?filters=&orderBy=weightPopularDESC&query=
