The White House wants to ‘cryptographically verify’ videos of Joe Biden so viewers don’t mistake them for AI deepfakes

Biden’s AI advisor Ben Buchanan said a method of clearly verifying White House releases is “in the works.”

  • Natanael@slrpnk.net
    5 months ago

    Perceptual hash collision generators can take arbitrary images and tweak them in invisible ways to make them collide with whichever hash value you want.

    • Pup Biru@aussie.zone
      5 months ago

      from the comment above, it seems like it took a week for a single image/frame though… it’s possible, sure, but so is a collision in a regular hash function… at some point it just becomes too expensive to be worth it, AND the phash here isn’t being used as security: the security comes from the original being posted on some source-of-truth site (e.g. the White House)

      • Natanael@slrpnk.net
        5 months ago

        No, it took a week to refine the attack algorithm; the collision generation itself is fast.

        The point of perceptual hashes is to let you check if two things are similar enough after transformations like scaling and reencoding, so you can’t rely on them being collision resistant here.
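To make the "similar enough after transformations" part concrete, here is a minimal sketch, assuming a toy average hash over a made-up 4x4 grayscale frame (real perceptual hashes are more elaborate, and the pixel values are invented for illustration). It compares how the toy hash and SHA-256 react to a small brightness drift of the kind a re-encode might introduce.

```python
import hashlib

def average_hash(pixels):
    """Toy average hash: one bit per pixel, set when the pixel is above the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if p > mean else 0 for p in flat)

def hamming(a, b):
    """Number of positions where two equal-length bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))

# Hypothetical 4x4 grayscale "frame" (values 0-255), invented for this example.
original = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [220, 230, 30, 40],
    [225, 235, 35, 45],
]

# Simulated re-encode: every pixel drifts up by a few levels.
reencoded = [[min(255, p + 3) for p in row] for row in original]

# The perceptual-style hash tolerates the drift...
phash_distance = hamming(average_hash(original), average_hash(reencoded))

# ...while a cryptographic hash of the raw pixel bytes does not.
sha_original = hashlib.sha256(bytes(p for row in original for p in row)).hexdigest()
sha_reencoded = hashlib.sha256(bytes(p for row in reencoded for p in row)).hexdigest()

print("perceptual distance:", phash_distance)
print("sha256 equal:", sha_original == sha_reencoded)
```

The same tolerance that keeps the toy hash stable under re-encoding is what leaves an attacker room to nudge an unrelated image onto a chosen hash value.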

        • Pup Biru@aussie.zone
          5 months ago

          oh yup, that’s a very fair point then! you certainly wouldn’t use it for security in that case. however, there are a lot of ways to implement this that don’t rely on the security of the hash function, but just use it (for example) to point to somewhere in a trusted source so you can manually validate that they’re the same

          we already have the trust frameworks, so building new ones is unnecessary… we just need to automatically validate (or at least provide automatic verifiability) that a video posted on some 3rd party platform (probably friendly or at least cooperative) represents reality
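A rough sketch of the "hash as a pointer, not as security" idea, assuming a hypothetical index that a trusted source publishes, mapping 16-bit perceptual hashes to release identifiers (the hash values, the identifiers, and the distance threshold are all made up). A match only tells you which official release to compare against; it proves nothing by itself.

```python
def hamming(a: int, b: int) -> int:
    """Count the differing bits between two integer-encoded hashes."""
    return bin(a ^ b).count("1")

# Hypothetical index published by the trusted source: perceptual hash -> release ID.
TRUSTED_INDEX = {
    0b1100110000110011: "official-release-001",
    0b1111000011110000: "official-release-002",
}

def find_candidates(query_hash: int, index: dict, max_distance: int = 2) -> list:
    """Return release IDs whose hash is within max_distance bits, nearest first.

    The result is only a pointer to what should be compared against the
    original on the trusted site; it is not a verdict on authenticity.
    """
    matches = []
    for known_hash, release_id in index.items():
        distance = hamming(query_hash, known_hash)
        if distance <= max_distance:
            matches.append((distance, release_id))
    return [release_id for _, release_id in sorted(matches)]

# A re-encoded copy whose hash is one bit away from the first release.
candidates = find_candidates(0b1100110000110111, TRUSTED_INDEX)

# A clip whose hash is far from everything in the index.
no_candidates = find_candidates(0b0000000000000000, TRUSTED_INDEX)

print(candidates)
print(no_candidates)
```

Since collisions can be forged, the comparison against the original on the trusted site is the step that actually carries the trust.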

          • Natanael@slrpnk.net
            5 months ago

            I think the best bet is really video formats with multiple embedded streams carrying complementary frame data (these already exist), so you decide video quality based on how many streams you want to merge in playback.

            If you then hash the streams independently and sign the list of hashes, you have a video file which can be “compressed” without breaking the signature by stripping out some streams.
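A minimal sketch of that scheme, under loud assumptions: the streams are stand-in byte strings, the manifest is a JSON list of SHA-256 hashes, and HMAC-SHA256 with a demo key is swapped in for the signature so the example stays in the standard library. A real deployment would use an asymmetric signature (e.g. Ed25519) so verifiers never hold the signing key. All names here are hypothetical.

```python
import hashlib
import hmac
import json

# Stand-in for a real signing key. HMAC is NOT a public-key signature;
# it is used here only to keep the sketch dependency-free.
SIGNING_KEY = b"demo-key-not-for-real-use"

def hash_stream(data: bytes) -> str:
    """Hash one embedded stream independently of the others."""
    return hashlib.sha256(data).hexdigest()

def sign_manifest(streams, key):
    """Build the list of per-stream hashes and authenticate the list as a whole."""
    manifest = json.dumps([hash_stream(s) for s in streams]).encode()
    tag = hmac.new(key, manifest, hashlib.sha256).hexdigest()
    return manifest, tag

def verify(streams_present, manifest, tag, key):
    """Check the manifest tag, then check every remaining stream is listed in it."""
    expected = hmac.new(key, manifest, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, tag):
        return False
    signed_hashes = set(json.loads(manifest))
    return bool(streams_present) and all(
        hash_stream(s) in signed_hashes for s in streams_present
    )

# Hypothetical streams: a base layer plus two enhancement layers.
base = b"base-layer-frames"
enhance_1 = b"enhancement-layer-1"
enhance_2 = b"enhancement-layer-2"

manifest, tag = sign_manifest([base, enhance_1, enhance_2], SIGNING_KEY)

full_ok = verify([base, enhance_1, enhance_2], manifest, tag, SIGNING_KEY)
stripped_ok = verify([base], manifest, tag, SIGNING_KEY)  # "compressed" copy
tampered_ok = verify([b"base-layer-frames-edited"], manifest, tag, SIGNING_KEY)

print(full_ok, stripped_ok, tampered_ok)
```

The stripped-down file still has to carry the full manifest and its signature, since those are what the remaining streams are checked against.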