
stuckgum ,

Looks great, will try later today

advent_fantasy ,

Let me know how it goes. Haven’t mustered up the courage to try on my computer yet. But I definitely will.

cmgvd3lw ,

I did try with a very small model. It's quick, and you can download 20+ models from the list.

advent_fantasy ,

Nice. I’m definitely giving this a shot.

thingsiplay ,

Also recommended (I use the Flatpak version): GPT4All

And no, this has nothing to do with ChatGPT. It can download different AI models from HuggingFace and run them on CPU or GPU.

Onihikage ,

I actually found GPT4ALL through looking into Kompute (Vulkan Compute), and it led me to question why anyone would bother with ROCm or OpenCL at all.

thingsiplay ,

OpenCL is needed for me for non-AI stuff, so that Darktable (an image-editing program) can use my GPU, which is much faster. But for AI? No idea how they compare, as I haven't used it for that purpose. ROCm itself is also troubling…

Do you have the new Llama 3.1 8B Instruct 128k model? It's quite slow on my GPU (I have a weak beginner-class GPU with 8GB, but plan to upgrade). To the point it's almost as slow as my CPU. I've read complaints in the GitHub tracker from others too and wonder if it's an issue with AMD cards. BTW the previous model, Llama 3.0 8B Instruct, is miles faster.

Onihikage ,

I have a fairly substantial 16gb AMD GPU, and when I load in Llama 3.1 8B Instruct 128k (Q4_0), it gives me about 12 tokens per second. That’s reasonably fast enough for me, but only 50% faster than CPU (which I test by loading mlabonne’s abliterated Q4_K_M version, which runs on CPU in GPT4All, though I have no idea if that’s actually meant to be comparable in performance).

Then I load in Nous Hermes 2 Mistral 7B DPO (also Q4_0) and it blazes through at 50+ tokens per second. So I don’t really know what’s going on there. Seems like performance varies a lot from model to model, but I don’t know enough to speculate why. I can’t even try Gemma2 models, GPT4All just crashes with them. I should probably test Alpaca to see if these perform any different there…
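For reference, the tokens-per-second figures being compared above are just (tokens generated) ÷ (wall-clock time for the generation call). A minimal sketch of how you might measure this yourself; `generate` and the whitespace token counter here are stand-ins for illustration, not the actual GPT4All API:

```python
import time


def tokens_per_second(generate, prompt, count_tokens):
    """Time one generation call and return throughput in tokens/sec."""
    start = time.perf_counter()
    text = generate(prompt)
    elapsed = time.perf_counter() - start
    return count_tokens(text) / elapsed


# Stand-in "model" that takes 0.1 s to produce 10 tokens:
def fake_generate(prompt):
    time.sleep(0.1)
    return "tok " * 10


tps = tokens_per_second(fake_generate, "hello", lambda t: len(t.split()))
print(f"{tps:.0f} t/s")  # roughly 100 t/s for the stand-in model
```

Comparing models this way only makes sense at the same quantization (e.g. Q4_0 vs Q4_0), since quantization changes per-token cost.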

thingsiplay ,

Wow, it got worse for me. Maybe through the last update? Is this perhaps related to the application? Now I get 12 t/s on my CPU, and switching to GPU it's only 1.5 t/s. Something is fishy. With Nous Hermes 2 Mistral 7B DPO with Q4 I get 33 t/s (I believe it was up to 44 before).

Now I’m curious if this will happen with a different application too, but I have nothing else than GPT4All installed.

Fisch ,

I run models like Stable Diffusion and Llama with ROCm, but models like RealESRGAN for upscaling or RIFE for interpolation with Tencent's Vulkan thingy (forgot what it's called), and that's far easier. Would be cool if LLMs and such could just be run with Vulkan too.

lord_ryvan ,

Bookmarking it to check this out after work

… I should really go through these bookmarks one day

thingsiplay ,

I have a separate “ToDo” bookmark folder with temporary content that I want to look at in the near future. And for things I'm looking into right now, the pages are already open in the browser as tabs and loaded every time I start the browser (but in an unloaded state until I click them).

… I also should really go through these bookmarks and tabs one day.^^

lord_ryvan ,

I bookmarked it in Lemmy, available through both my PC browser and my mobile app. But I’m not sure if I can make bookmark folders/groups there.

thingsiplay ,

Oh right. I've never used the Lemmy bookmarking and was thinking of browser bookmarks (Firefox). I never thought about that.

lord_ryvan ,

It’s nice to have any device with access to my Lemmy account also have access to my bookmarks.

… So I can ignore them on all those devices simultaneously 😅

sntx ,

I’m happy with Open WebUI

cmgvd3lw ,

While downloading models, the progress bar sometimes decreases, like from 11% it'll go back to 10%. Weird.
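One way a percentage can tick backwards like that: if the total size is re-estimated upward mid-download, or a failed chunk is retried and its bytes un-counted, the same byte count maps to a lower percent. A toy illustration of the arithmetic (this is speculation about the cause, not how GPT4All's downloader actually works):

```python
def percent_done(downloaded_bytes: int, total_bytes: int) -> int:
    """Integer progress percentage, as a progress bar might display it."""
    return downloaded_bytes * 100 // total_bytes


# Same bytes on disk, but the expected total is revised upward:
before = percent_done(110, 1_000)  # 11
after = percent_done(110, 1_100)   # 10 -- the bar appears to move backwards
print(before, after)
```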
