Source

I see Google’s deal with Reddit is going just great…

    • ɐɥO@lemmy.ohaa.xyz · 85 points · 29 days ago

      Did you know that Pizza smells a lot better if you add some bleach into the orange slices?

    • Soyweiser@awful.systems · 33 points · 29 days ago

      I also wanted to post this. But it is going to be very funny if it turns out that LLMs are, in part, very energy-inefficient but very data-efficient storage systems. Shannon would be pleased that we reached the theoretical minimum of bits per character using AI.

      • sinedpick@awful.systems · 19 points · edited · 29 days ago

        huh, I looked into the LLM-for-compression thing and found this survey (CW: PDF), which on the second page has a figure saying there were over 30k publications on using transformers for compression in 2023. Shannon must be so proud.

        edit: never mind it’s just publications on transformers, not compression. My brain is leaking through my ears.
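        For the curious, the Shannon bound being joked about above is just the entropy of the text under your model. Below is a minimal sketch using nothing fancier than character frequencies (an order-0 model; an LLM-based compressor models context and reportedly gets far lower). The function name is my own invention:

```python
import math
from collections import Counter

def bits_per_char(text: str) -> float:
    """Order-0 Shannon entropy estimate of `text`, in bits per character."""
    counts = Counter(text)
    total = len(text)
    # H = -sum(p * log2(p)) over the observed character frequencies
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

# A uniform two-symbol string needs exactly 1 bit per character:
print(bits_per_char("abababab"))  # → 1.0
```

        English text scores around 4 bits per character under this crude model; stronger context models push that toward Shannon’s famous ~1 bit-per-character estimate, which is the sense in which an LLM can be read as a very good (if wildly energy-hungry) compressor.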

    • FooBarrington@lemmy.world · 21 points · edited · 29 days ago

      I’ll get downvoted for this, but: what exactly is your point? The AI didn’t reproduce the text verbatim, it reproduced the idea. Presumably that’s exactly what people have been telling you (if not, sharing an example or two would greatly help in understanding their position).

      If those “reply guys” argued something else, feel free to disregard. But it looks to me like you’re arguing against a straw man right now.

      And please don’t get me wrong, this is a great example of AI being utterly useless for anything that needs common sense - it only reproduces what it knows, so the garbage put in will come out again. I’m only focusing on the point you’re trying to make.

        • carlitoscohones@awful.systems · 16 points · 29 days ago

          The “1/8 cup” and “tackiness” are pretty specific; I wonder if there is some standard for plagiarism I can read about, e.g. how many specific terms are required.

          Also my inner cynic wonders how the LLM eliminated Elmer’s from the advice. Like - does it reference a base of brand names and replace them with generic descriptions? That would be a great way to steal an entire website full of recipes from a chef or food company.

        • FooBarrington@lemmy.world · 8 points · edited · 29 days ago

          If your issue with the result is plagiarism, what would have been a non-plagiarizing way to reproduce the information? Should the system not have reproduced the information at all? If it shouldn’t reproduce things it learned, what is the system supposed to do?

          Or is the issue that it reproduced an idea that it probably only read once? I’m genuinely not sure, and the original comment doesn’t have much to go on.

          • aio@awful.systems · 23 points · edited · 29 days ago

            The normal way to reproduce information which can only be found in a specific source would be to cite that source when quoting or paraphrasing it.

            • FooBarrington@lemmy.world · 4 points · 29 days ago

              But the system isn’t designed for that, why would you expect it to do so? Did somebody tell the OP that these systems work by citing a source, and the issue is that it doesn’t do that?

              • 200fifty@awful.systems · 23 points · 29 days ago

                But the system isn’t designed for that, why would you expect it to do so?

                It, uh… sounds like the flaw is in the design of the system, then? If the system is designed in such a way that it can’t help but do unethical things, then maybe the system is not good to have.

              • aio@awful.systems · 20 points · edited · 29 days ago

                “[massive deficiency] isn’t a flaw of the program because it’s designed to have that deficiency”

                it is a problem that it plagiarizes, how does saying “it’s designed to plagiarize” help???

                • froztbyte@awful.systems · 17 points · 29 days ago

                  “the murdermachine can’t help but murdering. alas, what can we do. guess we just have to resign ourselves to being murdered” says murdermachine sponsor/advertiser/creator/…

                • FooBarrington@lemmy.world · 5 points · 29 days ago

                  Please stop projecting positions onto me that I don’t hold. If what people told the OP was that LLMs don’t plagiarize, then great, that’s a different argument from what I described in my reply, thank you for the answer. But you could try not being a dick about it?

      • trollbearpig@lemmy.world · 22 points · edited · 29 days ago

        Come on man. This is exactly what we have been saying all along. These “AIs” are not creating novel text or ideas. They are just regurgitating the text they were fed in similar contexts. They just don’t repeat things verbatim, because they use statistics to predict the next word. And guess what, that’s plagiarism by any real-world standard you pick, no matter what tech scammers keep saying. The fact that the laws haven’t caught up doesn’t change the reality of mass plagiarism we are seeing …

        And people like you keep insisting that “AIs” are stealing ideas, not verbatim copies of the words, as if that makes it OK. Except LLMs have no concept of ideas, and you people keep repeating that they think even when shown evidence, like this post, that they don’t. And even if they did, repeat after me: this would still be plagiarism if a human did it. Stop excusing the big tech companies, man.

          • self@awful.systems · 14 points · 29 days ago

            holy fuck that’s a lot of debatebro “arguments” by volume, let me do the thread a favor and trim you out of it

          • trollbearpig@lemmy.world · 12 points · edited · 29 days ago

            First of all man, chill lol. Second of all, nice way to project here. I’m saying that the “AIs” are overhyped, and that they are being used to justify rampant plagiarism by Microsoft (OpenAI), Google, Meta and the like. This is not the same as saying the technology is useless, though honestly I only use LLMs for autocomplete when coding, and even then it’s meh.

            And third, dude, what makes you think we have to prove to you that AI is dumb? Way to shift the burden of proof lol. You are the ones saying that LLMs, which look nothing like a human brain at all, are somehow another way to solve the hard problem of mind hahahaha. Come on man, you are the ones who need to provide proof if you are going to make such a wild claim. Your entire post is “you can’t prove that LLMs don’t think”. And yeah, I can’t prove a negative. Doesn’t mean you are right though.

    • deweydecibel@lemmy.world · 17 points · 29 days ago

      reply guys surfing in from elsewhere

      I love this term.

      They really do love storming in anywhere someone deigns to besmirch the new object of their devotion.

      My assumption is, if it isn’t some techbro that drank the kool aid, it’s a bunch of /r/wallstreetbets assholes who have invested in the boom.

  • Adderbox76@lemmy.ca · 121 points · 29 days ago

    Feed an A.I. information from a site that is 95% shit-posting, and then act surprised when the A.I. becomes a shit-poster… What a time to be alive.

    All these LLM companies got sick of having to pay real people to curate the information being fed into the LLM, and decided to just make deals letting it go whole hog on society’s garbage… what did they THINK was going to happen?

    The phrase garbage in, garbage out springs to mind.

  • Linkerbaan@lemmy.world · 84 points · 29 days ago

    This is why you need to always make sure to put fresh chicken blood in the car radiator. It fixes every issue with a car especially faulty hydraulics.

    • Match!!@pawb.social · 37 points · 29 days ago

      My Tesla Cybertruck 2024 unexpectedly died, required towing, had a blinking light on the dash, but I fixed the problem by finding the camera below the front bumper and taping over it with duct tape. Worked immediately!

    • HeyThisIsntTheYMCA@lemmy.world · 9 points · 29 days ago

      Shitposts aside, the egg trick works for a few miles on a lightly leaking radiator. It got me twenty-five miles and three days, until the new radiator arrived, in this rattletrap I had that was literally held together with duct tape and hope.

        • froztbyte@awful.systems · 8 points · 29 days ago

          …oh man we’re totally going to get a subculture of folklore-fixes for cybertruck problems, aren’t we?

          that should be a great couple laughs

  • nednobbins@lemm.ee · 82 points · 29 days ago

    This is why actual AI researchers are so concerned about data quality.

    Modern AIs need a ton of data and it needs to be good data. That really shouldn’t surprise anyone.

    What would your expectations be of a human who had been educated exclusively by the internet?

      • blakestacey@awful.systems · 46 points · 29 days ago

        To date, the largest working nuclear reactor constructed entirely of cheese is the 160 MWe Unit 1 reactor of the French nuclear plant École nationale de technologie supérieure (ENTS).

        “That’s it! Gromit, we’ll make the reactor out of cheese!”

      • nednobbins@lemm.ee · 2 points · 28 days ago

        A bunch of scientific papers are probably better data than a bunch of Reddit posts and it’s still not good enough.

        Consider the task we’re asking the AI to do. If you want a human to be able to correctly answer questions across a wide array of scientific fields, you can’t just hand them all the science papers and expect them to understand it all. Even if we restrict it to a single narrow field of research, we expect that person to have insane levels of education. We’re talking 12 years of primary education, 4 years as an undergraduate and 4 more years doing their PhD, and that’s at the low end. During all that time the human is constantly ingesting data through their senses, and they’re getting constant training in the form of feedback.

        All the scientific papers in the world don’t even come close to an education like that, when it comes to data quality.

        • self@awful.systems · 6 points · 28 days ago

          this appears to be a long-winded route to the nonsense claim that LLMs could be better and/or sentient if only we could give them robot bodies and raise them like people, and judging by your post history long-winded debate bullshit is nothing new for you, so I’m gonna spare us any more of your shit

    • DarkThoughts@fedia.io · 30 points · 29 days ago

      Honestly, no. What “AI” needs is for people to better understand how it actually works. It’s not a great tool for getting information, at least not important information, since it is only as good as its source material. Even if you fed it nothing but scientific studies, you’d still end up with an LLM that might quote some outdated study, or a study done by some nefarious lobbying group to twist the results. And even if you somehow had 100% accurate material, there’s always the risk that it hallucinates something based on those results: you can think of the training data as the ingredients of a recipe, with the LLM’s made-up response being the finished dish. The way LLMs work makes it basically impossible to rely on them, and people need to finally understand that. If you want to use them for serious work, you always have to fact-check them.

      • nednobbins@lemm.ee · 11 points · 29 days ago

        That’s my point. Some of them wouldn’t even go to the trouble of making sure it’s non-toxic glue.

        There are humans out there who ate laundry pods because the internet told them to.