I beg of you
And models/libraries have advanced a lot as well.
ExLlama2 is nuts for speed: if you can load the whole model into VRAM it can answer in less than 5 seconds (but it requires modern NVIDIA GPUs).
llama.cpp is a bit slower but works almost anywhere (even without a GPU, though it will be slow) and can use system memory when VRAM is not enough.
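If it helps, here is a minimal sketch of that VRAM/RAM split using the llama-cpp-python bindings (the model path and layer count are placeholders, not a specific recommendation):

```python
# Minimal sketch: load a GGUF model and offload as many layers as fit in
# VRAM; whatever doesn't fit stays in system RAM and runs on the CPU.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/mistral-7b-instruct.Q4_K_M.gguf",  # hypothetical local file
    n_gpu_layers=35,  # 0 = CPU-only, -1 = offload everything that fits
    n_ctx=4096,       # context window size
)

out = llm("Q: Why is full-VRAM inference faster? A:", max_tokens=64)
print(out["choices"][0]["text"])
```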
Which model to use depends on what you are looking for; I have found that nowadays there are models that can hold a nice conversation with the user, but they still narrate things and break character sometimes.
If any of you are still really interested, send me a message anywhere and we can try to make it real.
Does ExLlama have a server? I tried to move from llama.cpp to ExLlama as soon as I got a powerful enough GPU, but I couldn't find one.
https://github.com/oobabooga/text-generation-webui/wiki/12-%E2%80%90-OpenAI-API
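Once that API extension is enabled, any OpenAI-style HTTP client can talk to it. A minimal sketch, assuming the default local endpoint described on that wiki page (adjust host/port if you launched the server differently):

```python
# Minimal sketch: query a local text-generation-webui instance through its
# OpenAI-compatible API. The endpoint below is the assumed default from the
# linked wiki page, not guaranteed for every setup.
import requests

resp = requests.post(
    "http://127.0.0.1:5000/v1/chat/completions",  # assumed default local endpoint
    json={
        "messages": [{"role": "user", "content": "Hello! Who are you?"}],
        "max_tokens": 200,
        "temperature": 0.7,
    },
)
print(resp.json()["choices"][0]["message"]["content"])
```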
ZERO cost. You can run an instance almost anywhere.