I’m using KoboldCpp and Ollama; KoboldCpp is really awesome. Hardware-wise it’s an old PC with lots of RAM but no graphics card, so it’s quite slow for me, and I occasionally rent a cloud GPU instance on runpod.io. Nothing fancy: mainly role play and recreational stuff, and I occasionally ask it for creative ideas, a translation, or to re-word or draft an unimportant text / email.
I have tried coding, summarizing, and other tasks, but current AI performance isn’t good enough for my everyday work.
Probably better to ask on !localllama. Ollama should be able to give you a decent LLM, and RAG (Retrieval Augmented Generation) will let it reference your dataset.
The only issue is that you asked for a smart model, which usually means a larger one, plus the RAG portion consumes even more memory, which may be more than a typical laptop can handle. Smaller models have a higher tendency to hallucinate, i.e. produce confident but incorrect answers.
Short answer - yes, you can do it. It’s just a matter of how much RAM you have available and how long you’re willing to wait for an answer.
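To illustrate the retrieval half of RAG with a toy sketch: fetch the chunks of your dataset most relevant to the question, then stuff them into the prompt as context. A real setup would use embedding vectors and a vector store; the word-overlap scoring here is purely illustrative, and the sample documents are made up for the example.

```python
# Toy illustration of RAG's retrieval step: score document chunks
# against the question, keep the best ones, and prepend them to the
# prompt. Real systems use embeddings, not word overlap.

def score(query: str, chunk: str) -> int:
    """Count how many query words also appear in the chunk (toy similarity)."""
    return len(set(query.lower().split()) & set(chunk.lower().split()))

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k highest-scoring chunks."""
    return sorted(chunks, key=lambda c: score(query, c), reverse=True)[:k]

def build_prompt(query: str, chunks: list[str]) -> str:
    """Assemble the augmented prompt: retrieved context, then the question."""
    context = "\n".join(retrieve(query, chunks))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

docs = [
    "Ollama runs models locally and exposes an HTTP API.",
    "RAG retrieves relevant documents and adds them to the prompt.",
    "KoboldCpp is a single-binary llama.cpp frontend with a web UI.",
]
print(build_prompt("how does RAG add documents to the prompt?", docs))
```

This also shows why RAG costs extra memory: on top of the model weights you hold an index of your whole dataset, and every retrieved chunk eats into the context window.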
You should probably hook up with the SillyTavern crowd. It’s a frontend to chat with LLMs that will do what you want. Its main purpose is chat role-play. You can assign a persona to the LLM and ST will handle the prompt to make it work. It also handles jailbreaks if you want to use one of the big ones (no idea if it works well). You can also connect to other services that run open models, including aihorde.
Hello internet users. I have tried gpt4all and like it, but it is very slow on my laptop. I was wondering if anyone here knows of any solutions I could run on my server (Debian 12, AMD CPU, Intel A380 GPU) through a web interface. Has anyone found a good way to do this?
Depends on your needs. Best look around in !localllama or similar. (I don’t wanna say reddit but r/localLlama is much larger.)
If you’re more into creative writing, maybe look for places that discuss SillyTavern (r/SillyTavernAI is an option). It’s software for role-play chats, which may not be what you want. But the community is (relatively) large and likely to have good tips for non-coding/less technical applications.
There are “models” (usually meaning “weights” specifically). The models (weights) are the raw data that contain the wisdom of the AIs.
There is software to “run” these models on your own machine, or to connect to an API.
There are services which run models for you and let you interact via web interface, app, or API. Some services may add text to your prompt, to create a better(?) prompt.
There is a bewildering array of models out there. Mostly, they are specialized and/or merged versions of some popular foundation models (by mistral, meta, and a few others). Without endorsing any service, I find that openrouter.ai and together.ai let you try a fairly large selection. There are other services.
You can find more/better information here: !localllama
I’ve recently played with the idea of self hosting an LLM. I am aware that it will not reach GPT-4 levels, but being able to prompt it with confidential data, free of restrictions, is a very nice tool for me to have....
I personally use llama.cpp in a VM; however, if you have an NVIDIA GPU with lots of VRAM, you’ve got more options available, as well as much faster inference (text generation) speed.
Check out the community at !localllama, they’re pretty experienced with running LLMs locally
Any of you have a self-hosted AI "hub"? (e.g. for LLM, stable-diffusion, ...)
I’ve been looking into self-hosting LLMs or stable diffusion models using something like LocalAI and / or Ollama and LibreChat....
Self hosting an LLM for research
I am a teacher and I have a LOT of different literature material that I wish to study and play around with....
Inside the Creation of the World’s Most Powerful Open Source AI Model (www.wired.com)
I'm increasingly unhappy with the limits on AI text generation and I have heard that it's not that hard to do it on a laptop oneself. What is the best path forward?
I saw Generative AI for Beginners from Microsoft on GitHub. I’ve looked at fmhy.pages.dev/ai but I’m not sure what I’m really looking for....
Self hosted LLM
Smaug-72B-v0.1: The New Open-Source LLM Roaring to the Top of the Leaderboard (huggingface.co)
Abacus.ai:...
13 Best Open Source ChatGPT Alternatives (itsfoss.com)
For those of you who don’t want to use ChatGPT but do want an LLM.
Have you tried LocalGPT, PrivateGPT, or other similar alternatives to ChatGPT?
I’m interested in hosting something like this, and I’d like to know experiences regarding this topic....
Is there any free to use AI that accepts images and can talk about them?
ChatGPT can now remember who you are and what you want (www.theverge.com)
Selfhosted LLM (ChatGPT)