
LangChain Supports Integrating Multimodal Large Models via gradio_tools

Time: 2025-12-06 22:34


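You may remember AutoGPT and HuggingGPT, which explored broader multimodal scenarios by indirectly calling a variety of models. LangChain now integrates, through gradio_tools, tools that cover text, speech, video, and conversions among these modalities. In essence, each tool calls the API of a Hugging Face (HF) Space. For example: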

Creating music from an image (ImageToMusicTool),

Segmenting an image (SAMImageSegmentationTool)

LangChain currently supports integrating the following multimodal models:

[Figure: list of multimodal tools supported through gradio_tools]
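To make the "a tool is essentially a call to an HF Space API" idea concrete, here is a minimal sketch of the wrapper pattern. It does not use gradio_tools or LangChain: `SpaceTool`, `fake_text_to_image`, and `TOOLS` are hypothetical names invented for illustration, and the remote call is replaced by a local stub so the example runs offline.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class SpaceTool:
    """Hypothetical stand-in for a gradio_tools tool (not the library's class)."""

    name: str
    description: str
    # In a real tool this would send the query to a Hugging Face Space.
    call_space: Callable[[str], str]

    def run(self, query: str) -> str:
        # Forward the text query and return the result (text or a file path).
        return self.call_space(query)


def fake_text_to_image(prompt: str) -> str:
    # Stub for the remote Space: pretend an image was generated and
    # return the path where it would have been downloaded.
    return "/tmp/" + prompt.replace(" ", "_") + ".png"


# A registry keyed by tool name, similar in spirit to how an agent
# looks up a tool from its name and description.
TOOLS: Dict[str, SpaceTool] = {
    tool.name: tool
    for tool in [
        SpaceTool(
            name="text_to_image",
            description="Generate an image from an English prompt.",
            call_space=fake_text_to_image,
        ),
    ]
}

result = TOOLS["text_to_image"].run("a dog")
print(result)
```

The point of the pattern is that the caller only sees a name, a description, and a string-in/string-out `run` method, so swapping the stub for a real remote call would not change the calling code.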

I. A few common scenarios

1. Text to image

Model used: StableDiffusion

#https://huggingface.co/spaces/gradio-client-demos/text-to-image

2. Speech to text

Model used: openai-whisper

#https://huggingface.co/spaces/abidlabs/whisper

3. Text to speech

Model used: suno/bark

#https://huggingface.co/spaces/suno/bark

4. Text to video

Model used: the DAMO Academy model
damo-vilab/modelscope-damo-text-to-video-synthesis

#https://huggingface.co/spaces/damo-vilab/modelscope-text-to-video-synthesis
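The four scenarios can be summarized as a small lookup table from task to tool class and Space. The sketch below also derives the `hf.space` URL that each tool reports in its "Loaded as API" log line. The helper name `space_api_url` and the `SCENARIOS` table are made up for this illustration, and the URL rule (replace "/" with "-", lowercase) is an assumption that simply reproduces the four URLs in this article's logs; it is not taken from Hugging Face documentation and does not handle Space IDs with other special characters.

```python
def space_api_url(space_id: str) -> str:
    """Build the hf.space URL for a Space ID of the form 'owner/name'.

    Assumption: the host is 'owner-name', lowercased. This matches the
    log lines in this article only; treat it as a heuristic.
    """
    owner, name = space_id.split("/", 1)
    return f"https://{owner}-{name}.hf.space".lower()


# Task -> (gradio_tools class name, Space ID), as listed in this article.
SCENARIOS = {
    "text-to-image": ("StableDiffusionTool", "gradio-client-demos/text-to-image"),
    "speech-to-text": ("WhisperAudioTranscriptionTool", "abidlabs/whisper"),
    "text-to-speech": ("BarkTextToSpeechTool", "suno/bark"),
    "text-to-video": (
        "TextToVideoTool",
        "damo-vilab/modelscope-text-to-video-synthesis",
    ),
}

for task, (tool_name, space_id) in SCENARIOS.items():
    print(f"{task}: {tool_name} -> {space_api_url(space_id)}")
```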

II. Main code implementation

import os

# A multi-line import needs parentheses to be valid Python
from gradio_tools.tools import (
    StableDiffusionTool,
    WhisperAudioTranscriptionTool,
    BarkTextToSpeechTool,
    TextToVideoTool,
)

# text-to-image
# Write the prompt for the StableDiffusion model in English
sd_local_file_path = StableDiffusionTool().langchain.run("Please create a photo of a dog riding a skateboard")

# Under the hood this loads the SD Space on HF
"""
Loaded as API: https://gradio-client-demos-text-to-image.hf.space ✔
Job Status: Status.STARTING eta: None
"""

print("sd_local_file_path:", sd_local_file_path)

from PIL import Image

im = Image.open(sd_local_file_path)
print("Generated image object:", im)

# Make sure the output directory exists before saving into it
os.makedirs("./data", exist_ok=True)
im.save("./data/" + os.path.basename(sd_local_file_path))

# audio-to-text: speech to text
stt_text = WhisperAudioTranscriptionTool().langchain.run("audio/68570059060983616_0_15.mp3")
print("Transcribed text:", stt_text)

"""
Loaded as API: https://abidlabs-whisper.hf.space ✔
"""

# text-to-audio: text to speech
# The input string is Chinese for "I am Chinese"
tts_audio = BarkTextToSpeechTool().langchain.run("我是中国人")

# Note: this only prints the intended path under ./data; the generated file itself stays at tts_audio
print("Text-to-speech file:", "./data/" + os.path.basename(tts_audio))

"""
Loaded as API: https://suno-bark.hf.space ✔
Job Status: Status.STARTING eta: None
Due to heavy traffic on this app, the prediction will take approximately 72 seconds.For faster predictions without waiting in queue, you may duplicate the space using: Client.duplicate(suno/bark)
Job Status: Status.IN_QUEUE eta: 72.77052729940723
Due to heavy traffic on this app, the prediction will take approximately 48 seconds.For faster predictions without waiting in queue, you may duplicate the space using: Client.duplicate(suno/bark)
Job Status: Status.IN_QUEUE eta: 48.52035075205344
Job Status: Status.PROCESSING eta: None
Text-to-speech file: ./data/tmpgvejiznvv3ohaiwu.wav
"""

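# text-to-video: only English prompts are supported
ttv_local_file_path = TextToVideoTool().langchain.run("A panda eating bamboo on a rock.")

"""
Loaded as API: https://damo-vilab-modelscope-text-to-video-synthesis.hf.space ✔
"""

# Note: this only prints the intended path under ./data; the generated file itself stays at ttv_local_file_path
print("Text-to-video file:", "./data/" + os.path.basename(ttv_local_file_path))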
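In the code above, only the generated image is actually written to ./data; for the audio and video results the script prints a ./data path without copying the file there. A minimal sketch of a helper that closes that gap is shown below. `save_to_dir` is a hypothetical name, not part of gradio_tools, and the demo uses a temporary placeholder file in place of a real tool output.

```python
import os
import shutil
import tempfile


def save_to_dir(src_path: str, dest_dir: str = "./data") -> str:
    """Copy a tool's output file into dest_dir and return the new path."""
    os.makedirs(dest_dir, exist_ok=True)  # create the target directory if missing
    dest_path = os.path.join(dest_dir, os.path.basename(src_path))
    shutil.copy(src_path, dest_path)
    return dest_path


# Demo with a placeholder file standing in for a path returned by a tool.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "example.wav")
    with open(src, "wb") as f:
        f.write(b"placeholder audio bytes")
    saved = save_to_dir(src, os.path.join(tmp, "data"))
    print("Saved copy:", saved)
```

With such a helper, a call like `save_to_dir(tts_audio)` or `save_to_dir(ttv_local_file_path)` would make the printed ./data path point at a real file.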
