
stuckgum ,

Looks great, will try later today

advent_fantasy ,

Let me know how it goes. Haven’t mustered up the courage to try on my computer yet. But I definitely will.

cmgvd3lw ,

I did try with a very small model. It's quick, and you can download 20+ models from the list.

advent_fantasy ,

Nice. I’m definitely giving this a shot.

thingsiplay ,

Also recommended (I use the Flatpak version): GPT4All

And no, this has nothing to do with ChatGPT. It can download different AI models from HuggingFace and run them on CPU or GPU.

Onihikage ,

I actually found GPT4ALL through looking into Kompute (Vulkan Compute), and it led me to question why anyone would bother with ROCm or OpenCL at all.

thingsiplay ,

OpenCL is needed for me for non-AI stuff, so that Darktable (an image-editing program) can use my GPU, which is much faster. But for AI? No idea how they compare, as I haven't used it for that purpose. ROCm itself is also troubling…

Do you have the new Llama 3.1 8B Instruct 128k model? It's quite slow on my GPU (I have a weak beginner-class GPU with 8GB, but plan to upgrade). To the point it's almost as slow as my CPU. I've read complaints in the GitHub tracker from others too and wonder if it's an issue with AMD cards. BTW the previous model, Llama 3.0 8B Instruct, is miles faster.

Onihikage ,

I have a fairly substantial 16gb AMD GPU, and when I load in Llama 3.1 8B Instruct 128k (Q4_0), it gives me about 12 tokens per second. That’s reasonably fast enough for me, but only 50% faster than CPU (which I test by loading mlabonne’s abliterated Q4_K_M version, which runs on CPU in GPT4All, though I have no idea if that’s actually meant to be comparable in performance).

Then I load in Nous Hermes 2 Mistral 7B DPO (also Q4_0) and it blazes through at 50+ tokens per second. So I don’t really know what’s going on there. Seems like performance varies a lot from model to model, but I don’t know enough to speculate why. I can’t even try Gemma2 models, GPT4All just crashes with them. I should probably test Alpaca to see if these perform any different there…
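For reference, the tokens-per-second figures being compared above are just (tokens generated) ÷ (wall-clock time for the generation call). A minimal sketch of how you might measure this yourself; `generate` and the whitespace token counter here are stand-ins for illustration, not the actual GPT4All API:

```python
import time


def tokens_per_second(generate, prompt, count_tokens):
    """Time one generation call and return throughput in tokens/sec."""
    start = time.perf_counter()
    text = generate(prompt)
    elapsed = time.perf_counter() - start
    return count_tokens(text) / elapsed


# Stand-in "model" that takes 0.1 s to produce 10 tokens:
def fake_generate(prompt):
    time.sleep(0.1)
    return "tok " * 10


tps = tokens_per_second(fake_generate, "hello", lambda t: len(t.split()))
print(f"{tps:.0f} t/s")  # roughly 100 t/s for the stand-in model
```

Comparing models this way only makes sense at the same quantization (e.g. Q4_0 vs Q4_0), since quantization changes per-token cost.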

thingsiplay ,

Wow, it got worse for me. Maybe through the last update? Is this perhaps related to the application? Now I get 12 t/s on my CPU, and switching to GPU it's only 1.5 t/s. Something is fishy. With Nous Hermes 2 Mistral 7B DPO with Q4 I get 33 t/s (I believe it was up to 44 before).

Now I’m curious if this will happen with a different application too, but I have nothing else than GPT4All installed.

Fisch ,

I run models like Stable Diffusion and Llama with ROCm, but models like RealESRGAN for upscaling or RIFE for interpolation with Tencent's Vulkan thingy (forgot what it's called), and that's far easier. Would be cool if LLMs and such could just be run with Vulkan too.

lord_ryvan ,

Bookmarking it to check this out after work

… I should really go through these bookmarks one day

thingsiplay ,

I have a separate “ToDo” bookmark folder with temporary content that I want to look at in the near future. And for things I'm looking into right now, the pages are already open in the browser as tabs and loaded every time I start the browser (but in an unloaded state until I click them).

… I also should really go through these bookmarks and tabs one day.^^

lord_ryvan ,

I bookmarked it in Lemmy, available through both my PC browser and my mobile app. But I’m not sure if I can make bookmark folders/groups there.

thingsiplay ,

Oh right. I've never used the Lemmy bookmarking and was thinking of browser bookmarks (Firefox). I never thought about that.

lord_ryvan ,

It’s nice to have any device with access to my Lemmy account also have access to my bookmarks.

… So I can ignore them on all those devices simultaneously 😅

sntx ,

I’m happy with Open WebUI

cmgvd3lw ,

While downloading models, the progress bar sometimes decreases, like from 11% it'll go back to 10%. Weird.
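One way a percentage can tick backwards like that: if the total size is re-estimated upward mid-download, or a failed chunk is retried and its bytes un-counted, the same byte count maps to a lower percent. A toy illustration of the arithmetic (this is speculation about the cause, not how GPT4All's downloader actually works):

```python
def percent_done(downloaded_bytes: int, total_bytes: int) -> int:
    """Integer progress percentage, as a progress bar might display it."""
    return downloaded_bytes * 100 // total_bytes


# Same bytes on disk, but the expected total is revised upward:
before = percent_done(110, 1_000)  # 11
after = percent_done(110, 1_100)   # 10 -- the bar appears to move backwards
print(before, after)
```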
