• 0 Posts
  • 323 Comments
Joined 1 year ago
Cake day: June 11th, 2023





  • The problem is not them being random.

    They are not random, that’s the point. They’re entirely deterministic and very precise, and they aren’t hiding anything; they will give you the most likely (not blacklisted) sequence of characters to follow your input according to their model. What they won’t give you is information, except by accident.

    If they were random (hidden or not) they’d be harmless, no one would trust them any more than one of those eight ball toys, or your average horoscope.

    The issue is that they’re very not random, so much that there’s no way to know if what they are saying bears any accidental semblance to the truth without fact checking… and that very soon they’ll have replaced any feasible way to fact check them, since all the supposed “facts” we’ll have access to will have been generated by LLMs trained on LLM-generated garbage.


  • If the models are random then we shouldn’t be trusting them to do anything, let alone serious applications.

    That’s not the reason we shouldn’t be using them for anything other than generating lorem ipsum style text or dialogue for non quest critical NPCs in games.

    The reason is that, paraphrasing Neil Gaiman, LLMs don’t generate information, they generate information shaped sentences.

    Specifically, an LLM takes a sequence of characters (not a word or text; LLMs have no concept of words, or text, or anything else for that matter; they’re just an application of statistics on large volumes of sequences of characters; no meaning or intelligence involved, artificial or not)… as I was saying, an LLM takes a sequence of characters, pushes it through its model, and outputs the sequence of characters most likely to follow it in the texts its model has been trained on (or rather, the most likely after discarding the ones its creators have labelled as politically incorrect).

    That’s all they do, and they’re excellent at it (or would be if it weren’t for the aforementioned filters), but that’ll never give you a cure for cancer unless there already was one in their training data.

    They take texts written by humans, shred them, and give you their badly put back together desiccated corpses, drained of any and all meaning or information, but looking very convincingly (until you fact check them) like actually meaningful or informative texts.

    That is what makes them dangerous. That and the fact that the bastards selling them are marketing them for the jobs they’re least capable of doing, that is, providing reliable information.

    (And that’s while they can still be trained on meaningful and informative texts written by humans, inasmuch as anything found on reddit, facebook, or xitter can be considered meaningful or informative. But given that a higher and higher percentage of the text on the internet is being generated by LLMs, soon enough it’ll be impossible to train new models on anything but 99% LLM-generated garbage, at which point the whole bubble will implode, as anyone who’s wasted time, paper, and toner playing with a photocopier or anyone familiar with the phrase “garbage in, garbage out” will already have realised… which is probably why the LLM peddlers are ignoring robots.txt and copyright laws in a desperate effort to scrape whatever’s left of the bottom of the barrel.)
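For anyone who hasn't played with a photocopier, the "copy of a copy" analogy above can be sketched in a few lines. This is only a toy under loud assumptions: the "model" here is nothing more than word frequencies, each "generation" is produced by sampling words from the previous one, and the names `resample_generation` and `distinct_words_per_generation` are made up for illustration. It is not a claim about how any real LLM is trained.

```python
import random

def resample_generation(corpus, rng):
    """Build the next 'generation' by sampling words from the previous one.

    Each word is drawn with probability proportional to how often it
    appeared, so the new corpus is a noisy copy of the old one.
    """
    return [rng.choice(corpus) for _ in range(len(corpus))]

def distinct_words_per_generation(corpus, generations, seed=0):
    """Return the number of distinct words after each round of copying."""
    rng = random.Random(seed)
    counts = [len(set(corpus))]
    for _ in range(generations):
        corpus = resample_generation(corpus, rng)
        counts.append(len(set(corpus)))
    return counts

# Hypothetical "human-written" starting text.
human_text = ("the quick brown fox jumps over the lazy dog while "
              "a curious cat watches from the garden wall").split()
history = distinct_words_per_generation(human_text, generations=20)
print(history)
```

Whatever the seed, the count of distinct words can only stay the same or shrink from one generation to the next, because a copy can never contain a word its source didn't have.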



  • LLMs process information

    No, they don’t. They merely tell you which sequence of characters comes most often in their training set after the sequence of characters you gave them. That’s all. No processing going on, no information being generated or retrieved other than statistical trivia about their training set.

    AI can be extremely dangerous in either case. LLMs are no different from that perspective.

    General AI could be dangerous because it could be smarter than us while having interests, objectives, and morals that could clash with our own, causing it to antagonise us.

    That’s obviously impossible for LLMs, which have as much intelligence, interests, objectives, or morals as your average paperweight.

    LLMs are dangerous because they’re good enough at sounding like they know what they’re saying that you people actually believe them to be intelligent (and the fact that the bastards selling them are using their apparent intelligence as their main selling point obviously doesn’t help either), and they can be convincing enough that when they randomly tell you to get a bleach and ammonia enema to help with that headache you might actually believe them since by that point there’ll be no way left to check your facts. Which, hey, fair enough, natural selection and all that… but at some point one of you is going to fart that chloramine gas in my general vicinity, and that isn’t so good.
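The description near the top of this comment ("which sequence of characters comes most often in their training set after the sequence of characters you gave them") can be read quite literally as a frequency table. Below is a rough sketch of that reading, assuming a fixed context length of three characters and a tiny made-up training text. The function names are hypothetical, and a plain lookup table is a deliberate simplification of the comment's wording, not how any production model is implemented.

```python
from collections import Counter, defaultdict

def build_followers(text, context_len=3):
    """Count which character follows each context of `context_len` characters."""
    followers = defaultdict(Counter)
    for i in range(len(text) - context_len):
        context = text[i:i + context_len]
        followers[context][text[i + context_len]] += 1
    return followers

def most_common_next(followers, context):
    """Return the character seen most often after `context`, or None if unseen."""
    if context not in followers:
        return None
    return followers[context].most_common(1)[0][0]

# Hypothetical training text, purely for illustration.
training_text = "the cat sat on the mat. the cat ate the rat."
table = build_followers(training_text)
print(repr(most_common_next(table, "the")))
print(repr(most_common_next(table, "xyz")))
```

Note that the table only ever reports statistics about `training_text`: a context it has never seen gets `None`, and nothing in it knows what a cat or a mat is.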


  • There’s nothing resembling intelligence, general or not, in any autocorrect implementation so far, including LLMs.

    LLMs don’t make mistakes. If you think they do, you’re completely misunderstanding what LLMs are, how they work, and what they do (probably because of the aforementioned misinformation by LLM peddlers trying to equate them to intelligence, artificial or not).

    LLMs simply give you the most statistically likely word to follow a given text. Then they do it again, adding the word they generated in the previous cycle to the text. That’s all they do, they’re excellent at it, and they don’t make mistakes: the word they output will be the most statistically likely one, regardless of whether it makes sense or not (though attempts by their peddlers to keep them politically correct might cause them to discard the first several most likely words, leaving them able to only output a significantly unlikely, but hopefully politically correct, one, which might seem like a mistake to the user).

    You seem to be assuming that LLMs are trained on knowledge. They’re not. They’re trained on text. They have no idea what the text means (they don’t even have anything to have ideas with), and they don’t care (nor do they have any more ability to care than a desk lamp).

    They have a model of what words (meaning sequences of characters, not concepts with any actual meaning) may come after certain others, they push the input sequence of meaningless characters through that model, and out comes the most statistically likely meaningless sequence of characters to follow said text. That’s all.

    Paraphrasing Neil Gaiman, “LLMs don’t produce information. They produce information shaped sentences.”

    They produce the desiccated corpses of the texts they were fed, shredded and put back together, drained of any actual information but indistinguishable enough from texts containing actual information to give the illusion of also containing it.

    They’re great as an alternative to lorem ipsum, or possibly as speech generators for non quest critical NPCs in games, but they’re extremely dangerous for anything else, especially the uses LLM peddlers are peddling them for.
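The loop described in this comment (output the most likely next word, append it to the text, repeat) can be sketched roughly as follows. Assumptions, stated loudly: the "model" is just a count of which word follows which in a small made-up corpus, only the last word is used as context, and `train_bigrams` and `generate` are hypothetical names. It is a minimal illustration of the loop, not a description of any real LLM's internals.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.split()
    model = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        model[current][following] += 1
    return model

def generate(model, prompt, max_new_words=5):
    """Repeatedly append the most frequent follower of the last word."""
    output = prompt.split()
    for _ in range(max_new_words):
        last = output[-1]
        if last not in model:
            break  # nothing ever followed this word in training
        next_word = model[last].most_common(1)[0][0]
        output.append(next_word)
    return " ".join(output)

# Hypothetical training corpus, purely for illustration.
corpus = "the cat sat on the mat and the cat sat on the sofa"
model = train_bigrams(corpus)
print(generate(model, "the", max_new_words=4))  # prints "the cat sat on the"
```

Run it twice and you get the same output both times, and at no point does the code check whether a cat actually sat on anything.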











  • Someone learning Spanish as a second language will have to remember that it’s máquina and not máquino when speaking or writing it, though (and will then probably be quite confused if they ever meet some guy nicknamed El Máquina, which would somehow be a perfectly cromulent nickname in Spanish).

    Confusing genders when speaking or writing is one of the most common mistakes amongst people new to the language, because while everything else has some form of rule, this doesn’t (sure, when reading or listening you can most of the time use the word ending, and you’ll probably have an article, too, but when you are the one speaking or writing you have no option but to just know a word’s gender, or how it ends, which is the same thing).