Gamey ,

I don’t understand shit like that, they are tools, not totally accurate ones but unless you use Bing they do produce a lot of good stuff if used correctly…

BitOneZero ,
@BitOneZero@beehaw.org

I think public-facing they have to be that way, otherwise they would infringe the copyright on their training material. Behind the scenes, I suspect that the wealthy can gain access to AI engines where the response randomness isn’t set so high and they can even fact-check and cite their own training material better. It’s really hard to imagine that they can debug these things without having any idea what training material influenced which pattern of associations. I sure don’t buy that they don’t have tools to trace back to training material.

Right now consumer-facing AI is built so that you can put in simple prompts and get back a unique term paper each time you ask it the same question.
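The "randomness" knob mentioned above is usually called temperature. Here is a minimal Python sketch of the idea, not any vendor's actual implementation. The function name `sample_with_temperature` and the scores in `toy_scores` are made up for illustration.

```python
import math
import random

def sample_with_temperature(scores, temperature, rng):
    """Pick one word from `scores` (word -> raw score) after temperature scaling."""
    words = list(scores)
    if temperature <= 0:
        # Treat zero temperature as greedy decoding: always the top-scoring word.
        return max(words, key=lambda w: scores[w])
    scaled = [scores[w] / temperature for w in words]
    peak = max(scaled)  # subtract the max before exp() for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(words, weights=weights, k=1)[0]

# Hypothetical scores for the word after "The capital of France is"
toy_scores = {"Paris": 5.0, "Lyon": 2.0, "Rome": 1.0}

rng = random.Random(0)
# Temperature 0: the same prompt always yields the same word.
greedy = {sample_with_temperature(toy_scores, 0, rng) for _ in range(20)}
# Very high temperature: the scores are flattened, so different words show up.
varied = {sample_with_temperature(toy_scores, 50.0, rng) for _ in range(200)}
print(greedy, varied)
```

In this toy setup, turning the temperature down makes the output repeatable, and turning it up is what produces a different "term paper" each time.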

lloram239 ,

Today’s Large Language Models are Essentially BS Machines

Apparently so are today’s bloggers and journalists, since they just keep repeating the same nonsense and seem to lack any sense of understanding. I am really starting to question if humans are capable of original thought.

The responses all came with citations and links to sources for the fact claims. And the responses themselves all sound entirely reasonable. They are also entirely made up.

This does not compute. Bing Chat provides sources, as in links you can click on and that work. It doesn’t pull things out of thin air, it pulls information out of Bing search and summarizes it. That information is often wrong, incomplete and misleading, as it will only take a tiny number of websites to source that information. But so would most humans using Bing search. So not really a problem with the bot itself.

ChatGPT gives far better answers most of the time, as it bases the answers on knowledge gained from all the sources, not just specific ones. But that also means it can’t provide sources and if you pressure it to give you some, it will make them up. And depending on the topic, it might also not know something for which Bing can find a relevant website.

LLMs are trained not to produce answers that meet some kind of factual threshold, but rather to produce answers that sound reasonable.

And guess what answer sounds the most reasonable? A correct one. People seriously seem to have a hard time grasping how freakishly difficult it is to generate plausible language and how much stuff has to be going on behind the scenes to make that possible. That does not mean GPT will be correct all the time or be an all knowing oracle, but you’ll have to be rather stupid to expect that to begin with. It’s simply the first chatbot that actually kind of works a lot of the time. And yes, it can reason and understand within its limits; it making mistakes from time to time does not refute that, especially when badly prompted (e.g. asking it to solve a problem step by step can dramatically improve the answers).

LLMs are not people, but neither are they BS generators. In plenty of areas they already outperform humans and in others not so much. But you are not learning that from articles that treat every little mistake from an LLM like some huge gotcha moment.

FlashMobOfOne ,
@FlashMobOfOne@beehaw.org

These AI systems do make up bullshit often enough that there’s even a term for it: Hallucination.

Kind of a euphemistic term, like how religious people made up the word ‘faith’ to cover for the more honest term: gullible.

Veraticus OP ,
@Veraticus@lib.lgbt

No one is saying there are problems with the bots (though I don’t understand why you’re being so defensive of them – they have no feelings so describing their limitations doesn’t hurt them).

The problem is what humans expect from LLMs and how humans use them. Their purpose is to string words together in pretty ways. Sometimes those ways are also correct. Being aware of what they’re designed to do, and their limitations, seems important for using them properly.

lloram239 ,

they have no feelings so describing their limitations

These kinds of articles, which all repeat exactly the same extremely basic points and make lots of fallacious ones, are absolute dogshit at describing the shortcomings of AI. Many of them don’t even bother actually testing the AI themselves, but just repeat what they heard elsewhere. Even with this one I am not sure what exactly they did, as Bing Chat works completely differently for me from what is reported here. It won’t hurt the AI, but it certainly hurts me reading the same old minimum effort content over and over and over again, and they are the ones accusing AI of generating bullshit.

The problem is what humans expect from LLMs and how humans use them.

Yes, humans are stupid. They saw some bad sci-fi and now they expect AI to be capable of literal magic.

radix ,
@radix@lemm.ee

What else should they be?? They reflect human language.

Veraticus OP ,
@Veraticus@lib.lgbt

People think they are actually intelligent and perform reasoning. This article discusses how and why that is not true.

lloram239 ,

People think they are actually intelligent and perform reasoning.

They do both. The article fails to successfully argue that point and just turns an AI’s failure to answer an irrelevant trivia question into a gotcha moment.

BitOneZero ,
@BitOneZero@beehaw.org

I think their creators have deliberately disconnected the runtime AI model from re-reading their own training material because it’s a copyright and licensing nightmare.

StringTheory ,

This reminds me of an article about journalism and the internet, from ages ago. A class was asked how they would research a topic (it was some recent political event, I don’t remember). The class confidently answered “the internet.” The professor struggled to get them to understand that wasn’t enough. Yes, there is all kinds of stuff about this event on the internet, but how did it get there? And more importantly, what is missing?

Sure, all the sexy AI stuff gives us goosebumps and sounds great. But how did it get there, and what is missing? Someone somewhere has to do the actual original work first, or it’s just making collages from the same library over and over and over again.

Veraticus OP ,
@Veraticus@lib.lgbt

And also it’s no replacement for actual research, either on the Internet or in real life.

People assume LLMs are like people, in that they won’t simply spout bullshit if they can avoid it. But as this article properly points out, they can and do. You can’t really trust anything they output. (At least not without verifying it all first.)

upstream ,

As with any tool it is how you use it that matters.

Today’s LLMs are capable of fairly amazing stuff.

It’s a BS machine? Sure. Have you read or written stuff for higher education?

You don’t get points for being short and concise, even though you should. You get points for following the BS formula.

You know who else is good at BS?

LLMs. If you manage to provide them enough meaningful input they can do a great lot of BS legwork for you.

I see people who overuse it, don’t edit, aren’t critical. Sure. Then you end up with just BS.

But there are plenty of useful applications, like writing boilerplate code (see also Copilot), structuring code, tests, etc.

Is it worth all the hype? Nope.

Some of it? Probably.

Veraticus OP ,
@Veraticus@lib.lgbt

Yeah definitely not saying it’s not useful :) But it also doesn’t do what people widely believe it does, so I think articles like this are helpful.

HalJor ,
@HalJor@beehaw.org

People assume LLMs are like people, in that they won’t simply spout bullshit if they can avoid it.

There are plenty of people who spout bullshit every chance they get.

Zaktor ,

They’re both BS machines and fact generators. It produced bullshit when asked about him because as far as I can tell he’s kind of a nobody, not because it’s just a stylistic generator. If he asked about a more prominent person likely to exist more significantly within the training corpus, it would likely be largely accurate. The hallucination problem stems from the system needing to produce a result regardless of whether it has a well trained semantic model for the question.

LLMs encode both the style of language and semantic relationships. For “who is Einstein”, both paths are well developed and the result is a reasonable response. For “who is Ryan McGreal”, the semantic relationships are weak or non-existent, but the stylistic path is undeterred, leading to the confidently plausible bullshit.

Veraticus OP ,
@Veraticus@lib.lgbt

They don’t generate facts, as the article says. They choose the next most likely word. Everything is confidently plausible bullshit. That some of it is also true is just luck.
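To make "choose the next most likely word" concrete, here is a toy Python sketch. It is a bigram counter, which is vastly simpler than a real LLM, and the corpus and function names are invented for illustration only.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for each word, which words follow it in `text`."""
    follows = defaultdict(Counter)
    words = text.lower().split()
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows

def complete(follows, start, max_words=3):
    """Greedily append the most frequent follower until none is known."""
    out = [start]
    for _ in range(max_words):
        options = follows.get(out[-1])
        if not options:
            break  # nothing ever followed this word in the corpus
        out.append(options.most_common(1)[0][0])
    return " ".join(out)

# Made-up training text: France is mentioned twice, Italy once.
toy_corpus = "paris is in france . paris is in france . rome is in italy ."
model = train_bigrams(toy_corpus)

print(complete(model, "paris"))
print(complete(model, "rome"))
```

Starting from "paris" the completion is "paris is in france", which happens to be true because that pattern dominates the corpus. Starting from "rome" it is "rome is in france", which is just as fluent and wrong. This toy only looks one word back, so it says nothing about how much better a large model with long context does. It only shows what picking by likelihood means.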

kogasa ,
@kogasa@programming.dev

It’s obviously not “just” luck. We know LLMs learn a variety of semantic models of varying degrees of correctness. It’s just that no individual (inner) model is really that great, and most of them are bad. LLMs aren’t reliable or predictable (enough) to constitute a human-trustable source of information, but they’re not pure gibberish generators.

Veraticus OP ,
@Veraticus@lib.lgbt

No, it’s true, “luck” might be overstating it. There’s a good chance most of what it says is as accurate as the corpus it was trained on. That doesn’t personally make me very confident, but ymmv.

Zaktor ,

That’s just not true. Semantic encodings work. It’s not like neural networks are some new untested concept; the LLMs have some new tricks under the hood and are way more extensive in their training goal, but they’re fundamentally the same thing. All neural networks are mimicry machines enabled and limited by their data, but mimicking largely correct data produces largely correct results when the answer, or interpolatable answers, exist in the training data. The problem arises when they are asked to go further and further afield from their inputs. Some interpolation and substitutions work, but it gets increasingly unreliable the more niche the answer is.

While the LLM hype has very seriously oversold their abilities, the instinctive backlash to say they’re useless is similarly way off-base.

Veraticus OP ,
@Veraticus@lib.lgbt

No one is saying “they’re useless.” But they are indeed bullshit machines, for the reasons the author (and you yourself) acknowledged. Their purpose is to choose likely words. That likely and correct are frequently the same shouldn’t blind us to the fact that correctness is a coincidence.

Zaktor ,

That likely and correct are frequently the same shouldn’t blind us to the fact that correctness is a coincidence.

That’s an absurd statement. Do you have any experience with machine learning?

Veraticus OP ,
@Veraticus@lib.lgbt

It isn’t; I do; do you?

Zaktor ,

Yes, it’s been my career for the last two decades and before that was the focus of my education. The idea that “correctness is a coincidence” is absurd and either fails to understand how training works or rejects the entire premise of large data revealing functional relationships in the underlying processes.

Veraticus OP ,
@Veraticus@lib.lgbt

Or you’ve simply misunderstood what I’ve said despite your two decades of experience and education.

If you train a model on a bad dataset, will it give you correct data?

If you ask a model a question it doesn’t have enough data to answer confidently, will it still confidently give you a correct answer?

And, more importantly, is it trained to offer CORRECT data, or is it trained to return words regardless of whether or not that data is correct?

I mean, it’s like you haven’t even thought about this.

Blapoo ,

They’re glorified autocompletes. Way too much attention is being given to LLMs in isolation. By themselves: Not a silver bullet.

But when called in a chain . . . eyebrows

gaytswiftfan ,

hmm i think we need twelve more articles on this

RickyRigatoni ,
@RickyRigatoni@lemmy.ml

We should feed the ones already made to an LLM and have it write the next 12 for the irony.

Fizz ,
@Fizz@lemmy.nz

Humans are bullshit machines as well.

renard_roux ,

A chip off the ol’ block, then 🙂

Thorny_Thicket ,

This is what I find the most amusing about the criticism of LLMs and many other AI systems as well. People often talk about them as if they’re somehow uniquely flawed, while in reality what they’re doing isn’t that different from what humans do as well. The biggest difference is that when a human hallucinates it’s often obvious but when ChatGPT does that it’s harder to spot.

dr_catman ,

This is… really not true at all.

LLMs differ from humans in a very very important way when it comes to language: we know the meanings of the words we use. LLMs do not “know” things, are unconcerned with “meanings”, and thus cannot be said to be “using” words in any meaningful way.

Zaktor ,

we know the meanings of the words we use.

Uh, but we don’t? Not really. People use the wrong words all the time and each person’s definition (i.e., encoding) is slightly different. We mimic phrases and structures we’ve heard to sound smarter and forge on with uncertain statements because frequently they go unchallenged or simply aren’t important.

We’re more structurally complex than an LLM, but we fool ourselves into thinking we’re somehow uniquely thoughtful and reliable.

scrubbles ,
@scrubbles@poptalk.scrubbles.tech

And everyone in tech who has worked on ML before collectively says “yeah that’s what we’ve been trying to tell you”. Don’t get me wrong, LLMs are a huge leap, but god did it show how greedy corporations are, just immediately jumping to “how quick can we lay people off?”. The tech is not to that spec. Yet. It will get there, but goddamn do we need to be demanding some regulations now

Veraticus OP , (edited )
@Veraticus@lib.lgbt

I was mostly posting this because the last time LLMs came up, people kept on going on and on about how much their thoughts are like ours and how they know so much information. But as this article makes clear, they have no thoughts and know no information.

In many ways they are simply a mathematical party trick; formulas trained on so much language, they can produce language themselves. But there is no “there” there.

sincle354 ,

Sadly we don’t even know what “knowing” is, considering human memory changes every time it is accessed. We might just need language and language only. Right now they’re testing if generating verbalized trains of thought helps (it might?). The question might change to: Does the sum total of human language have enough consistency to produce behavior we might call consciousness? Can we brute force the Chinese room with enough data?

lily33 , (edited )

have no thoughts

True

know no information

False. There’s plenty of information stored in the models, and plenty of papers that delve into how it’s stored, or how to extract or modify it.

I guess you can nitpick over the word “know”, and what it means, but as someone else pointed out, we don’t actually know what that means in humans anyway. But LLMs do use the information stored in context; they don’t simply regurgitate it verbatim. For example (from this article):

If you ask an LLM what’s near the Eiffel Tower, it’ll list locations in Paris. If you edit its stored information to think the Eiffel Tower is in Rome, it’ll actually start suggesting sights in Rome instead.

Veraticus OP ,
@Veraticus@lib.lgbt

They only use words in context, which is their problem. It doesn’t know what the words mean or what the context means; it’s glorified autocomplete.

I guess it depends on what you mean by “information.” Since all of the words it uses are meaningless to it (it doesn’t understand anything of what it either is asked or says), I would say it has no information and knows nothing. At least, nothing more than a calculator knows when it returns 7 + 8 = 15. It doesn’t know what those numbers mean or what it represents; it’s simply returning the result of a computation.

So too LLMs responding to language.

lily33 ,

Why is that a problem?

For example, I’ve used it to learn the basics of Galois theory, and it worked pretty well.

  • The information is stored in the model, so it can tell me the basics.
  • The interactive nature of talking to an LLM actually helped me learn better than just reading.
  • And I know enough general math so I can tell the rare occasions (and they indeed were rare) when it makes things up.
  • Asking it questions can be better than searching Google, because Google needs exact keywords to find the answer, and the LLM can be more flexible (of course, neither will answer if the answer isn’t in the index/training data).

So what if it doesn’t understand Galois theory - it could teach it to me well enough. Frankly if it did actually understand it, I’d be worried about slavery.

Veraticus OP ,
@Veraticus@lib.lgbt

Basically the problem is point 3.

You obviously know some of what it’s telling you is inaccurate already. There is the possibility it’s all bullshit. Granted a lot of it probably isn’t, but it will tell you the bullshit with the exact same level of confidence as actual facts… because it doesn’t know Galois theory and it isn’t teaching it to you, it’s simply stringing sentences together in response to your queries.

If a human were doing this we would rightly proclaim the human a bad teacher that didn’t know their subject, and that you should go somewhere else to get your knowledge. That same critique should apply to the LLM as well.

That said it definitely can be a useful tool. I just would never fully trust knowledge I gained from an LLM. All of it needs to be reviewed for correctness by a human.

lily33 ,

That same critique should apply to the LLM as well.

No, it shouldn’t. Instead, you should compare it to the alternatives you have on hand.

The fact is,

  • Using an LLM was a better experience for me than reading a textbook.
  • And it was also a better experience for me than watching recorded video lectures.

So, if I have to learn something, I have enough background to spot hallucinations, and I don’t have a teacher (having graduated college, that’s always true), I would consider using it, because it’s better than the alternatives.

I just would never fully trust knowledge I gained from an LLM

There are plenty of cases where you shouldn’t fully trust knowledge you gained from a human, too.

And there are, actually, cases where you can trust the knowledge gained from an LLM. Not because it sounds confident, but because you know how it behaves.

Veraticus OP ,
@Veraticus@lib.lgbt

Obviously you should do what you think is right, so I mean, I’m not telling you you’re living wrong. Do what you want.

The reason to not trust a human is different from the reasons not to trust an LLM. An LLM is not revealing to you knowledge it understands. Or even knowledge it doesn’t understand. It’s literally completing sentences based on word likelihood. It doesn’t understand any of what it’s saying, and none of it is rooted in any knowledge of the subject of any kind.

I find that concerning in terms of learning from it. But if it worked for you, then go for it.

pbjamm ,
@pbjamm@beehaw.org

They are the perfect embodiment of the internet.

They know everything, but understand nothing

Dark_Arc ,
@Dark_Arc@social.packetloss.gg

The tech is not to that spec. Yet.

I’m not sure it will. At least, not this tech, not this approach to the problem. From my understanding there’s fundamentally no comprehension; it’s not bugged, broken, or incomplete, it’s just not there… it’s missing from the design.

communist ,
@communist@beehaw.org

We don’t know that for sure yet. We saw a lot of emergent intelligent properties appear as we scaled up, and we’re nowhere near done scaling LLMs. I’m not saying it will be solved, just that we don’t know one way or the other yet.

BotCheese ,

And we’re nowhere near done scaling LLMs

I think we might be. I remember hearing OpenAI was training on so much literary data that they didn’t and couldn’t find enough for testing the model. Though I may be misremembering.

newde ,

No, that’s definitely the case. However, Microsoft is now working on making LLMs more dependent on several high-quality sources. For example, encyclopedias will be more important sources than random Reddit posts.

HobbitFoot ,

Microsoft is also using LinkedIn to help, getting users to correct articles generated by AI.

Zaktor ,

Cunningham’s Law may be very helpful in this respect.

“the best way to get the right answer on the internet is not to ask a question; it’s to post the wrong answer.”

lloram239 ,

There are still plenty of videos to watch and games to play. We might be running short on books, but there are many other sources of information that aren’t accessible to LLMs at the moment.

Also, just because the training set contained most of the books doesn’t mean the model itself was large enough to learn from all of them. The more detailed your questions get, the bigger the chance it will get them wrong, even if that knowledge should have been in the training set. For example, ChatGPT as a walkthrough for games is pretty terrible, even though there should be more than enough walkthroughs in the training set to learn from. Same for summarizing movies: it will do the most popular ones, but quickly fall apart with anything a little lesser known.

There is of course also the possibility that using the LLM as a knowledge store by itself is a bad idea. Humans use books for that, not their brain. So an LLM that is very good at looking things up in a library could answer a lot more without the enormous model size and training cost.
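As a rough illustration of the "look it up in a library" idea, here is a small Python sketch. It uses plain word overlap as a stand-in for real search. The names `look_up` and `toy_library` and the `min_overlap` threshold are hypothetical, and no actual product is claimed to work this way.

```python
def tokenize(text):
    """Lowercase and split into a set of words, dropping basic punctuation."""
    return {w.strip(".,?!").lower() for w in text.split()} - {""}

def look_up(question, library, min_overlap=2):
    """Return the passage sharing the most words with the question, or None."""
    q_words = tokenize(question)
    best, best_score = None, 0
    for passage in library:
        score = len(q_words & tokenize(passage))
        if score > best_score:
            best, best_score = passage, score
    # Refuse to answer when nothing in the library is a plausible match.
    return best if best_score >= min_overlap else None

# Made-up two-passage "library".
toy_library = [
    "Bluegill are freshwater fish found in ponds and lakes.",
    "Muddy water makes fish harder to catch.",
]

print(look_up("Are fish harder to catch in muddy water?", toy_library))
print(look_up("Who is Ryan McGreal?", toy_library))
```

The point of the design is the `None` branch: a lookup step can come back empty, which gives the system a way to say "I don't have that" instead of producing something plausible anyway.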

Basically, there are still a ton of unexplored areas, even if we have collected all the digital books.

Dark_Arc , (edited )
@Dark_Arc@social.packetloss.gg

I don’t believe in scaling as a way to discover understanding. Doing that is just praying that the machine comes alive… these machines weren’t programmed to come alive in that way. That’s my fundamental argument, the design of LLMs ignores understanding of the content… it doesn’t matter how much content it’s been scaled up to.

If I teach a real AI about fishing, it should be able to reason about fishing and it shouldn’t need to have read a supplementary knowledge of mankind to do it.

What the LLMs seem to be moving towards is more of a search and summary engine (for existing content). That’s a very similar and potentially quite useful thing, but it’s not the same thing as understanding.

It’s the difference between the kid that doesn’t know much but is really good at figuring it out based on what they know vs the kid that’s read all the text books front to back and can’t come up with anything original to save their life but can quickly regurgitate and summarize anything they’ve ever read.

communist ,
@communist@beehaw.org

If I teach a real AI about fishing, it should be able to reason about fishing and it shouldn’t need to have read a supplementary knowledge of mankind to do it.

This is a faulty assumption.

In order for you to learn about fishing, you had to learn a shitload about the world. Babies don’t come out of the womb able to do such tasks, there is a shitload of prerequisite knowledge in order to fish, it’s unfair to expect an ai to do this without prerequisite knowledge.

Furthermore, LLMs have been shown to do many things that aren’t in their training data, so the notion that it’s a stochastic parrot is also false.

Dark_Arc ,
@Dark_Arc@social.packetloss.gg

Furthermore, LLMs have been shown to do many things that aren’t in their training data, so the notion that it’s a stochastic parrot is also false.

And (from what I’ve seen) they get things wrong with extreme regularity, increasingly so as things diverge from the training data. I wouldn’t say they’re a “stochastic parrot” but they don’t seem to be much better when things need to be correct… and again, based on my (admittedly limited) understanding of their design, I don’t anticipate this technology (at least without some kind of augmented approach that can reason about the substance) overcoming that.

In order for you to learn about fishing, you had to learn a shitload about the world. Babies don’t come out of the womb able to do such tasks, there is a shitload of prerequisite knowledge in order to fish, it’s unfair to expect an ai to do this without prerequisite knowledge.

That’s missing the forest for the trees. Of course an AI isn’t going to go fishing. However, I should be able to assert some facts about fishing and it should be able to reason based on those assertions. e.g. a child can work off of facts presented about fishing, “fish are hard to catch in muddy water” -> “the water is muddy, does that impact my chances of catching a bluegill?” -> “yes, it does, bluegill are fish, and fish don’t like muddy water”.

There are also “teachings” brought about by how these are programmed that make the flaws less obvious, e.g., if I try to repeat the experiment in the post here, Google’s Bard outright refuses to continue because it doesn’t have information about Ryan McGreal. I’ve also seen Bard get notably better as it’s been scaled up; early on I tried asking it about RuneScape and it spewed absolute nonsense. Now… It’s reasonable-ish.

I was able to reproduce a nonsense response (once again) by asking about RuneScape. I asked how to get 99 firemaking, and it invented a mechanic that doesn’t exist “Using a bonfire in the Charred Stump: The Charred Stump is a bonfire located in the Wilderness. It gives 150% Firemaking experience, but it is also dangerous because you can be attacked by other players.” This is a novel (if not creative) invention of Bard likely derived from advice for training Prayer (which does have something in the Wilderness which gives 350% experience).

communist , (edited )
@communist@beehaw.org

And (from what I’ve seen) they get things wrong with extreme regularity, increasingly so as things diverge from the training data. I wouldn’t say they’re a “stochastic parrot” but they don’t seem to be much better when things need to be correct… and again, based on my (admittedly limited) understanding of their design, I don’t anticipate this technology (at least without some kind of augmented approach that can reason about the substance) overcoming that.

Keep in mind, you’re talking about a rudimentary, introductory version of this. My argument is that we don’t know what will happen when they’ve scaled up. We know for certain hallucinations become less frequent as the model size increases (see the statistics on GPT-3 vs GPT-4 on hallucinations); perhaps this only occurs because they haven’t met a critical size yet? We don’t know.

There’s so much we don’t know.

That’s missing the forest for the trees. Of course an AI isn’t going to go fishing. However, I should be able to assert some facts about fishing and it should be able to reason based on those assertions. e.g. a child can work off of facts presented about fishing, “fish are hard to catch in muddy water” -> “the water is muddy, does that impact my chances of catching a bluegill?” -> “yes, it does, bluegill are fish, and fish don’t like muddy water”.

blog.research.google/…/language-models-perform-re…

they do this already, albeit imperfectly, but again, this is like, a baby LLM.

and just to prove it:

chat.openai.com/…/54455afb-3eb8-4b7f-8fcc-e144a48…

Veraticus OP ,
@Veraticus@lib.lgbt

LLMs are fundamentally different from human consciousness. It isn’t a problem of scale, but kind.

They are like your phone’s autocomplete, but very very good. But there’s no level of “very good” for autocomplete that makes it a human, or will give it sentience, or allow it to understand the words it is suggesting. It simply returns the next most-likely word in a response.

If we want computerized intelligence, LLMs are a dead end. They might be a good way for that intelligence to speak pretty sentences to us, but they will never be that themselves.

communist ,
@communist@beehaw.org

You’re guessing, you don’t actually know that for sure, it seems intuitively correct, but we simply do not know enough about cognition to make that assumption.

Perhaps our ability to reason exclusively comes from our ability to predict, and by scaling up the ability to predict, we become more and more able to reason.

These are guesses, all we have now are guesses. You can say “it doesn’t reason” and “it’s just autocorrect” all you want, but if that were the case why did scaling it up eventually enable it to perform basic math? Why did scaling it up improve its ability to problem-solve significantly (GPT-3 vs GPT-4)? There are so many unknowns in this field. To just say “nah, can’t be, it works differently from us” doesn’t mean it can’t do the same things as us given enough scale.

Veraticus OP ,
@Veraticus@lib.lgbt

I’m not guessing. When I say it’s a difference of kind, I really did mean that. There is no cognition here; and we know enough about cognition to say that LLMs are not performing anything like it.

Believing LLMs will eventually perform cognition with enough hardware is like saying, “if we throw enough hardware at a calculator, it will eventually become alive.” Even if you throw all the hardware in the world at it, there is no emergent property of a calculator that would create sentience. So too LLMs, which really are just calculators that can speak English. But just like calculators they have no conception of what English is and they do not think in any way, and never will.

communist ,
@communist@beehaw.org

I’m not guessing. When I say it’s a difference of kind, I really did mean that. There is no cognition here; and we know enough about cognition to say that LLMs are not performing anything like it.

We do not know that, and I challenge you to find a source for that. In fact, I’ve seen sources showing the opposite: they seem to reason in tokens. For example, LLMs perform significantly better at tasks when asked to give a step by step reasoned explanation. This indicates that they are doing a form of reasoning, and their reasoning is limited by what I have no better term for than laziness.

blog.research.google/…/language-models-perform-re…

Veraticus OP ,
@Veraticus@lib.lgbt

It is your responsibility to prove your assertion that if we just throw enough hardware at LLMs they will suddenly become alive in any recognizable sense, not mine to prove you wrong.

You are anthropomorphizing LLMs. They do not reason and they are not lazy. The paper discusses a way to improve their predictive output, not a way to actually make them reason.

But don’t take my word for it. Go talk to ChatGPT. Ask it anything like this:

“If an LLM is provided enough processing power, would it eventually be conscious?”

“Are LLM neural networks like a human brain?”

“Do LLMs have thoughts?”

“Are LLMs similar in any way to human consciousness?”

Just always make sure to check the output of LLMs. Since they are complicated autosuggestion engines, they will sometimes confidently spout bullshit, so must be examined for correctness. (As my initial post discussed.)

communist , (edited )
@communist@beehaw.org

You’re assuming I’m saying something that I’m not, and then arguing with that, instead of my actual claim.

I’m saying we don’t know for sure what they will be able to do when they’re scaled up. That’s the end of my assertion. I don’t have to prove that they will suddenly come alive, I’m not claiming they will, I’m just claiming we don’t know what will happen when they’re scaled, and they seem to have emergent properties as they scale up. Nobody has devised a way of predicting what emergent properties happen when, nobody has made any progress whatsoever on knowing what scaling up accomplishes.

Can they reason? Yes, but poorly right now, will that get better? Who knows.

The end of my claim is that we don’t know what’ll happen when they scale up, and that you can’t just write it off like you are.

If you want proof that they reason, see the research article I linked. If they can do that in their rudimentary form that we’ve created with very little time, we can’t write off the possibility that they will scale.

Whether or not they reason LIKE HUMANS is irrelevant if they can do the job.

And I’m not anthropomorphizing them without reason. There aren’t terms for this already: what would you call this behavior of answering questions significantly better when asked to fully explain reasoning? I would say it is taking the easiest option that still meets the qualifications of what it is requested to do, following the path of least resistance. I don’t have a better word for this than laziness.

…org.in/…/artificial-intelligence-gpt-4-shows-spa…

Furthermore, predictive power is just another way of achieving reasoning. Better predictive power IS better reasoning, because you can’t predict well without reasoning.

Slotos ,

It’s your job to prove your assertion that we know enough about cognition to make reasonable comparisons.

emptiestplace ,

Are you just fucking around here? C’mon. In your hypothetical scenario, the emergent property would not be “of a calculator”.

lloram239 ,

LLMs are fundamentally different from human consciousness.

They are also fundamentally different from a toaster. But that’s completely irrelevant. Consciousness is something you get when you put intelligence in an agent that has to move around in and interact with an environment. A chatbot has no use for that; it’s just there to mush through lots of data and produce some. It doesn’t have to, nor should it, worry about its own existence.

It simply returns the next most-likely word in a response.

So does the all knowing oracle that predicts the lotto numbers from next week. It being autocomplete does not limit its power.
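
For readers who want to see what "returning the next most-likely word" means mechanically, here is a deliberately tiny sketch. It is a bigram counter, not a transformer, and the corpus and the function names `train_bigrams` and `next_word` are made up for illustration. A real LLM conditions on the whole prompt with learned weights, but the output step is the same kind of thing: pick a likely continuation.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for each word, which words follow it in the training text."""
    follows = defaultdict(Counter)
    words = text.lower().split()
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def next_word(follows, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    counts = follows.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Hypothetical toy corpus; a real model is trained on vastly more text.
corpus = "the dog barks . the dog sleeps . the dog barks loudly . the cat sleeps ."
model = train_bigrams(corpus)

print(next_word(model, "dog"))       # most common word after "dog" in the corpus
print(next_word(model, "porpoise"))  # never seen, so there is nothing to predict
```

The point of the toy is only that "autocomplete" describes the interface, not the quality of what sits behind it: the better the model of what comes next, the more useful the completion.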

LLMs are a dead end.

There might be better or faster approaches, but it’s certainly not a dead end. It’s a building block. Add some long term memory, bigger prompts, bigger model, interaction with the Web, etc. and you can build a much more powerful bit of software than what we have today, without even any real breakthrough on the AI side. GPT as it is today is already “good enough” for a scary number of things that used to be exclusively done by humans.

Veraticus OP ,
@Veraticus@lib.lgbt avatar

A chatbot has no use for that; it’s just there to mush through lots of data and produce some. It doesn’t have to, nor should it, worry about its own existence.

It literally can’t worry about its own existence; it can’t worry about anything because it has no thoughts or feelings. Adding computational power will not miraculously change that.

Add some long term memory, bigger prompts, bigger model, interaction with the Web, etc. and you can build a much more powerful bit of software than what we have today, without even any real breakthrough on the AI side.

I agree this would be a very useful chatbot. But it is still not a toaster. Nor would it be conscious.

communist , (edited )
@communist@beehaw.org avatar

It literally can’t worry about its own existence; it can’t worry about anything because it has no thoughts or feelings. Adding computational power will not miraculously change that.

Who cares? This has no real-world practical use case. Its thoughts are what it says; it doesn’t have a hidden layer of thoughts, which is quite frankly a feature to me. Whether it’s conscious or not has nothing to do with its level of functionality.

emptiestplace ,

You seem unfamiliar with the concept of consciousness as an emergent property.

What if we dramatically reduce the cost of training - what if we add realtime feedback mechanisms as part of a perpetual model refinement process?

As far as I’m aware, we don’t know.

How are you so confident that your feelings are not simply a consequence of complexity?

poweruser ,

Suppose you were saying that about me. How would I prove you wrong? How could a thinking being express that it is actually sentient to meet your standards?

emptiestplace ,

I am picking up a hint of the autocompletion you describe, in your writing.

Zormat ,

So for context, I am an applied mathematician, and I primarily work in neural computation. I have an essentially cursory knowledge of LLMs, their architecture, and the mathematics of how they work.

I hear this argument, that LLMs are glorified autocomplete and merely statistical inference machines and are therefore completely divorced from anything resembling human thought.

I feel the need to point out that not only is there no compelling evidence that any neural computation humans do is anything different from a statistical inference machine, there’s actually quite a bit of evidence that that is exactly what real, biological neural networks do.

Now, admittedly, real neurons and real neural networks are way more sophisticated than any deep learning network module, real neural networks are extremely recurrent and extremely nonlinear, with some neural circuits devoted to simply changing how other neural circuits process signals without actually processing said signals on their own. And in the case of humans, several orders of magnitude larger than even the largest LLM.

All that said, it boils down to an insanely powerful statistical machine.

There are questions of motivation and input: we all want to stay alive (ish), avoid pain, and have constant feedback from sensory organs, while an LLM just produces what it was supposed to. But in an abstraction the ideas of wants and needs and rewards aren’t substantively different from prompts.

Anyway. I agree that modern AI is a poor substitute for real human intelligence, but the fundamental reason is a matter of complexity, not method.

Some reading:

Large scale neural recordings call for new insights to link brain and behavior

A unifying perspective on neural manifolds and circuits for cognition

a comparison of neuronal population dynamics measured with calcium imaging and electrophysiology

p03locke ,
@p03locke@lemmy.dbzer0.com avatar

And everyone in tech who has worked on ML before collectively says “yeah that’s what we’ve been trying to tell you”.

Everybody in tech who had even a passing understanding of the technology was collectively saying that. We understand the limits of technology and can feel out the bounds easily. But too many of these dumbasses with dollar signs in their eyes are all “to the moon!”, and tripping and failing on implementing the tech in unreasonable ways.

It was never a factoid machine, like some people wanted to believe. It was always about creatively writing something, and only one with so much attention.

interolivary ,
@interolivary@beehaw.org avatar

It was never a factoid machine

Funny tidbit about the word “factoid”: its original meaning was “an item of unreliable information that is reported and repeated so often that it becomes accepted as fact”, but the modern usage is “a brief or trivial item of news or information”.

This means that the modern usage of “factoid” is in itself a factoid, and that in the old sense LLMs sort of are factoid machines.

Note that I’m not saying the modern use is wrong. Languages evolve, and words taking on new meanings doesn’t mean the new meanings are “wrong” (and surprisingly words changing to mean the opposite of what they used to mean isn’t all that uncommon either.)

MasterBuilder ,

I’ve been unemployed for 7 months. Every online job I see that’s been posted for at least 6 hours has over 200 applications. I’m a senior Dev with 30 years experience, and I can’t find work.

I’d say generative AI is an existential threat as bad as offshoring was for steel in the early 80s. I’m now left with the prospect of spending the last 20 years of my work life at or near minimum wage.

After all, I can’t afford to spend $250,000 on a new bachelor’s degree, and a community college degree might get me to $25/hr, and still costs thousands. This is causing impoverishment on a massive scale.

Ignore this threat at your peril.

HelixTitan ,

Hard to believe a senior dev can’t find work. Those positions are the most needed. Also, 25 an hour is 50k a year. Nowhere in the US are senior devs paid that little. I suppose you may not be US based, but your cost for college seems to imply US, albeit at an expensive school.

MasterBuilder ,

I was not saying 25 for a dev job; I was saying that for other kinds of work I might be able to get without getting a new degree.

seang96 ,

Your issue sounds more like a capitalism issue. FANG companies lay off thousands of employees to cut costs and prepare for changes in the economy. AI didn’t make them lay off all those employees, just corporate greed. Until AI can gather requirements, produce code with at least 80% accuracy, and compile the software itself, it isn’t a threat.

Edit fix autocorrect

scrubbles ,
@scrubbles@poptalk.scrubbles.tech avatar

and 100% accuracy. Only a fool would trust something coming out of AI and slap it right into production right now.

seang96 ,

I agree, though I was following the 80/20 rule. If the software’s essentially free and does 80% of your business needs, businesses would be happy. Either way, AI is nowhere near that, since it currently requires someone with the knowledge to get it anywhere close to a complete project.

MasterBuilder ,

I understand how it works. The fact remains that companies already laid off people because of AI, and until now, I have never been unemployed more than 2 months.

I’ve also never seen a market in which most job postings garner 200 to 500 applications within 24 hours. It is armageddon out here.

scrubbles ,
@scrubbles@poptalk.scrubbles.tech avatar

I’m a senior dev too, and at first I thought the same, but really it’s a market downturn. Companies are just afraid to hire right now. I’d look into generative AI, try to understand how it works. That’s how I’ve been spending my time, and yeah, it’s intuitive the way they do it but the more you understand how it works the more you realize that it’s not ready to take our jobs. Yet. Again maybe someday, but there is a lot of work that needs to be done to get something semi up and running, and the models that Google uses are not going to be usable for every company. (Take a look at all the specialized models already).

Our job never goes away, but it does constantly evolve. This is just another point where we have to learn new skills, and that may be that we all need to be model tuners some day. At the end of the day the user still needs to correctly describe what they want to have happen on the screen, and there are currently no ways to take what they describe into a full piece of software.

biddy ,

I disagree, a lot of white collar work is simply writing bullshit.

crow ,

And what does that mean about the jobs it can replace?

avidamoeba ,
@avidamoeba@lemmy.ca avatar

They can replace the bullshit jobs of which we have many, serving the essential purpose of keeping the people doing them fed and thus the economy and society stable. 🥲

30p87 ,

It can replace nothing. It can make the job of e.g. developers easier. And on a small, private scale ML can replace writers and stock photo libraries, if they have support for pictures. However, on a larger scale, both would have massive problems of quality, diversity and copyright. You can’t use the output of an ML algorithm for things you earn anything with if there are active cases exploring whether the copyright belongs to the ML itself, the producers of the training data (who probably didn’t give anyone consent), no one, or actually you.

lloram239 ,

You can’t use the output of a ML algorithm for things you earn anything

Better tell Adobe, as they are loading their Photoshop full of AI stuff.

If you type “paint me an elephant”, yeah, you might not get copyright on that, but nobody would buy your elephant picture anyway. So that’s hardly an issue. The moment you actually produce something complex with the help of AI, there will be so many steps involved that you’ll get copyright on it no problem.

And once the AI gets smart enough to produce complex things by itself, without a human hand holding it along the way, you’ll have bigger problems to worry about anyway. Since at that point the AI isn’t just replacing the artist, it’s replacing the whole media production chain. No more need to wait for Hollywood to make a movie, you can just tell your AI what you want to see and it will produce one on demand, customized specifically for you. What we see today is basically the beginnings of the Holodeck, endless on-demand entertainment customized for the user.

bpalmerau ,

“has a model of how words relate to each other, but does not have a model of the objects to which the words refer.

It engages in predictive logic, but cannot perform syllogistic logic - reasoning to a logical conclusion from a set of propositions that are assumed to be true”

Is this true of all current LLMs?

Veraticus OP ,
@Veraticus@lib.lgbt avatar

Yes, this is how all LLMs function.

Zaktor ,

does not have a model of the objects to which the words refer

I’m not even sure what this is supposed to be saying. Sounds kind of like a bullshit generator.

Words are encodings of knowledge and their expression and use represent that knowledge, and these machines ingest a repository containing a significant percent of written human communication. It encodes that the words “dog” and “bark” are often used together, but it also encodes that “dog” and “cat” are things that are both “mammals” and “mammals” are “animals”, and that the pair of them are much more likely to appear in a human household than a “porpoise”. What is this other kind of model of objects that hasn’t been in some way represented in all of the internet?

Veraticus OP ,
@Veraticus@lib.lgbt avatar

It is not a model of objects. It’s a model of words. It doesn’t know what those words themselves mean or what they refer to; it doesn’t know how they relate together, except that some words are more likely to follow other words. (It doesn’t even know what an object is!)

When we say “cat,” we think of a cat. If we then talk about a cat, it’s because we love cats, or hate them, or want to communicate something about them.

When an LLM says “cat,” it has done so because a tokenization process selected it from a chain of word weights.

That’s the difference. It doesn’t think or reason or feel at all, and that does actually matter.

Zaktor ,

This is just the same hand-waving repeated. What does it mean to “know what a word means”? How is a word, indexed into a complex network of word embeddings, meaningfully different as a token from this desired “object model”? Because the indexing and encoding very much does relate words together separately from their likelihood to appear in a sentence together. These embeddings may be learned from language, but language is simply a method of communicating meaning, and notably humans also learn meaning through consuming it.
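
To make the point about embeddings concrete, here is a rough sketch of how relatedness between words can be read off vectors rather than off co-occurrence in a sentence. The three-dimensional vectors below are invented by hand for illustration (think of the axes loosely as "household pet", "mammal", "aquatic"). Real embeddings are learned, have hundreds or thousands of dimensions, and their axes are not labelled like this.

```python
import math


def cosine(a, b):
    """Cosine similarity: 1.0 means same direction, 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical hand-made vectors, NOT taken from any real model.
# Axes (assumed for illustration): household pet, mammal, aquatic.
embeddings = {
    "dog": [0.9, 0.8, 0.1],
    "cat": [0.8, 0.9, 0.1],
    "porpoise": [0.1, 0.9, 0.9],
}

dog_cat = cosine(embeddings["dog"], embeddings["cat"])
dog_porpoise = cosine(embeddings["dog"], embeddings["porpoise"])

# "dog" sits closer to "cat" than to "porpoise", even though all three
# share the mammal component.
print(dog_cat > dog_porpoise)
```

Whether this kind of geometry counts as "knowing what a word means" is exactly what is being argued here; the sketch only shows that the representation carries more than which word tends to follow which.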

What do things like “love” or “want” or “feeling” have to do with a model of objects? How would you even recognize a system that does that and why would it be any more capable than a LLM at producing good and trustable information? Does feeling love for a concept help you explain what a random blogger does? Do you need to want something to produce meaningful output?

This just all seems like poorly defined techno-spiritualism.

Veraticus OP ,
@Veraticus@lib.lgbt avatar

It is not hand-waving; it is the difference between an LLM, which, again, has no cognizance, no agency, and no thought – and humans, which do. Do you truly believe humans are simply mechanistic processes that when you ask them a question, a cascade of mathematics occurs and they spit out an output? People actually have an internal reality. For example, they could refuse to answer your question! Can an LLM do even something that simple?

I find it absolutely mystifying you claim you’ve studied this when you so confidently analogize humans and LLMs when they truly are nothing alike.

Zaktor ,

no cognizance, no agency, and no thought

Define your terms. And explain why any of them matter for producing valid and “intelligent” responses to questions.

Do you truly believe humans are simply mechanistic processes that when you ask them a question, a cascade of mathematics occurs and they spit out an output?

Why are you so confident they aren’t? Do you believe in a soul or some other ephemeral entity that wouldn’t leave us as a biological machine?

People actually have an internal reality. For example, they could refuse to answer your question! Can an LLM do even something that simple?

Define your terms. And again, why is that a requirement for intelligence? Most of the things we do each day don’t involve conscious internal planning and reasoning. We simply act and if asked will generate justifications and reasoning after the fact.

It’s not that I’m claiming LLMs = humans, I’m saying you’re throwing out all these fuzzy concepts as if they’re essential features lacking in LLMs to explain their failures in some question answering as something other than just a data problem. Many people want to believe in human intellectual specialness, and more recently people are scared of losing their jobs to AI, so there’s always a kneejerk reaction to redefine intelligence whenever an animal or machine is discovered to have surpassed the previous threshold. Your thresholds are facets of the mind that you don’t define and have no means to recognize (I assume your consciousness, but I cannot test it), and you have not explained why they’re important for fact rather than BS generation.

How the brain works and what’s important for various capabilities is not a well understood subject, and many of these seemingly essential features are not really testable or comparable between people and sometimes just don’t exist in people, either due to brain damage or a simple quirk in their development. The people with these conditions (and a host of other psychological anomalies) seem to function just fine and would not be considered unthinking. They can certainly answer (and get wrong) questions.

barsoap ,

Do you truly believe humans are simply mechanistic processes that when you ask them a question, a cascade of mathematics occurs and they spit out an output? People actually have an internal reality.

Those two things can be true at the same time.

I find it absolutely mystifying you claim you’ve studied this when you so confidently analogize humans and LLMs when they truly are nothing alike.

“Nothing alike” is kinda harsh, we do have about as much in common with ChatGPT as we have with flies purpose-bred to fly left or right when exposed to certain stimuli.

lloram239 ,

People actually have an internal reality.

So do LLMs.

Can an LLM do even something that simple?

Ask it about any NSFW topic and it will refuse.

analogize humans and LLMs when they truly are nothing alike.

They seem way more similar than different. The parts where they are different trivially follow from the LLM’s architecture (e.g. LLMs are static, tokenizing makes character-based problems difficult, memory is limited to the prompt, no interaction with the external world, no vision, no hearing, …) and most of that can be overcome by extending the model, e.g. multi-modal models with vision and hearing are on their way, DeepMind is working on models that interact with the real world, etc. This is all coming and coming fast.

barsoap , (edited )

What does it mean to “know what a word means”?

For one, ChatGPT has no idea what a cat or dog looks like. It has no understanding of their differences in character of movement. Lacking that kind of non-verbal understanding, when analysing art that’s actually in its domain, that is, poetry, it couldn’t even begin to make sense of the question “has this poem feline or canine qualities” – best it can do is recognise that there’s neither cats nor dogs in it and, being stumped, make up some utter nonsense. Maybe it has heard of catty and that dogs are loyal and will be looking for those themes, but feline and canine as in elegance? Forget it, unless it has read a large corpus of poetry analysis that uses those terms: it can parrot that pattern matching, but it can’t do the pattern matching itself. It cannot transfer knowledge from one domain to another when it has no access to one of those domains.

And that’s the tip of the iceberg. As humans we’re not really capable of purely symbolic thought so it’s practically impossible to appreciate just how limited those systems are because they’re not embodied.

(And, yes, Stable Diffusion has some understanding of feline vs. canine as in elegance – but it’s an utter moron in other areas. It can’t even count to one).


Then, that all said, and even more fundamentally, ChatGPT (as all other current AI algos we have) is a T2 system, not a T3 system. It comes with rules how to learn, it doesn’t come with rules enabling it to learn how to learn. As such it never thinks – it cannot think, as in “mull over”. It reacts with what passes as a gut in AI land, and never with “oh I’m not sure about this so let me mull it over”. It is in principle capable of not being sure but that doesn’t mean it can rectify the situation.

lloram239 ,

it couldn’t even begin to make sense of the question “has this poem feline or canine qualities”

Which is obviously false, as a quick try will show. Poems are just language and LLMs understand that very well. That LLMs don’t have any idea what cats actually look like or how they move, beyond what they can gather from textbooks, is irrelevant here; they aren’t tasked with painting a picture (which the upcoming multi-modal models can do anyway).

Now there can of course be problems that can be expressed in language, but not solved in the realm of language. But I find those to be incredibly rare, rare enough that I’ve never really seen a good example. ChatGPT captures an enormous amount of knowledge about the world, and humans have written about a lot of stuff. Coming up with questions that would be trivial to answer for any human, but impossible for ChatGPT, is quite tricky.

And that’s the tip of the iceberg.

Have you ever actually seen an iceberg or just read about them?

It comes with rules how to learn, it doesn’t come with rules enabling it to learn how to learn

ChatGPT doesn’t learn. It’s a completely static model that doesn’t change. All the learning happened in a separate step back when it was created; it doesn’t happen when you interact with it. That illusion comes from the text prompt, which includes both your text as well as its output, getting fed into the model as input. But outside that text prompt, it’s just static.

“oh I’m not sure about this so let me mull it over”.

That’s because it fundamentally can’t mull it over. It’s a feed-forward neural network, meaning everything that goes in on one side comes out on the other in a fixed amount of time. It can’t do loops by itself. It has no hidden internal monologue. The only dynamic part is the prompt, which is also why its ability to problem solve improves quite a bit when you require it to do the steps individually instead of just presenting the answer, as that allows the prompt to be its “internal monologue”.
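
A toy sketch of that last point, under heavy simplification: the "model" below is a frozen, stateless function that can only take one step per call, and the only thing that changes between calls is the growing list of tokens it is fed. The names `static_model` and `generate` and the countdown task are invented for illustration and bear no resemblance to a real network's internals; only the outer loop has the same shape.

```python
def static_model(tokens):
    """A frozen, stateless stand-in for the network: one fixed pass, one new token.

    It never changes between calls and keeps no memory of its own.
    """
    last = tokens[-1]
    if last == "liftoff":
        return None  # nothing more to say
    n = int(last)
    return str(n - 1) if n > 0 else "liftoff"


def generate(prompt, max_steps=20):
    """Feed the model its own output: the token list is the only 'memory'."""
    tokens = list(prompt)
    for _ in range(max_steps):
        nxt = static_model(tokens)
        if nxt is None:
            break
        tokens.append(nxt)  # the output becomes part of the next input
    return tokens


print(static_model(["3"]))  # a single pass only gets one step further
print(generate(["3"]))      # looping through the prompt reaches the end
```

A single call cannot count all the way down; the loop can, purely because intermediate results are written into the prompt. That is the sense in which asking for step-by-step output gives the model room to work.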

barsoap ,

Coming up with questions that would be trivial to answer for any human, but impossible for ChatGPT is quite tricky.

Which is why I came up with the “feline poetry” example. It’s quite a simple concept for a human, even one not particularly poetry-inclined, yet if no one has ever written about the concept it’s going to be an uphill battle for ChatGPT. And, no, I haven’t tried. I also didn’t mean it as some kind of dick measuring contest; I simply wanted to explain what kind of thing ChatGPT really has trouble with.

Have you ever actually seen an iceberg or just read about them?

As a matter of fact, yes, I have. North Cape, Norway.

ChatGPT doesn’t learn. It’s a completely static model that doesn’t change.

ChatGPT is also its training procedure if you ask me, same as humanity is also its evolution.
