I was curious if (since these are statistical models and not actually counting letters) maybe this or something like it is a common “gotcha” question used as a meme on social media. So I did a search on DDG and it also has an AI now which turned up an interestingly more nuanced answer.
It’s picked up on discussions specifically about this problem in chats about other AI! The ouroboros is feeding well! I figure this is also why they overcorrect to 4 if you ask them about “strawberries”, trying to anticipate a common gotcha answer to further riddling.
DDG correctly handled “strawberries” interestingly, with the same linked sources. Perhaps their word-stemmer does a better job?
many words should run into the same issue, since LLMs generally use less tokens per word than there are letters in the word. So they don’t have direct access to the letters composing the word, and have to go off indirect associations between “strawberry” and the letter “R”
duckassist seems to get most right but it claimed “ouroboros” contains 3 o’s and “phrasebook” contains one c.
Copilot may be a stupid LLM but the human in the screenshot used an apostrophe to pluralize which, in my opinion, is an even more egregious offense.
It’s incorrect to pluralizing letters, numbers, acronyms, or decades with apostrophes in English. I will now pass the pedant stick to the next person in line.
That’s half-right. Upper-case letters aren’t pluralised with apostrophes but lower-case letters are. (So the plural of ‘R’ is ‘Rs’ but the plural of ‘r’ is ‘r’s’.) With numbers (written as ‘123’) it’s optional - IIRC, it’s more popular in Britain to pluralise with apostrophes and more popular in America to pluralise without. (And of course numbers written as words are never pluralised with apostrophes.) Acronyms are indeed not pluralised with apostrophes if they’re written in all caps. I’m not sure what you mean by decades.
It’s not stupid. It’s just the bastard child of Germany, Dutch, French, Celtic and Scandinavian and tries to pretend this mix of influences is cool and normal.
Because otherwise if you have too many small letters in a row it stops looking like a plural and more like a misspelled word. Because capitalization differences you can make more sense of As but not so much as.
I absolutely love that there’s a group called the Apostrophe Protection Society. Is there something like that for the Oxford Comma? I’d gladly join them!
I tried it with my abliterated local model, thinking that maybe its alteration would help, and it gave the same answer. I asked if it was sure and it then corrected itself (maybe reexamining the word in a different way?) I then asked how many Rs in "strawberries" thinking it would either see a new word and give the same incorrect answer, or since it was still in context focus it would say something about it also being 3 Rs. Nope. It said 4 Rs! I then said "really?", and it corrected itself once again.
LLMs are very useful as long as know how to maximize their power, and you don't assume whatever they spit out is absolutely right. I've had great luck using mine to help with programming (basically as a Google but formatting things far better than if I looked up stuff), but I've found some of the simplest errors in the middle of a lot of helpful things. It's at an assistant level, and you need to remember that assistant helps you, they don't do the work for you.