• henfredemars@infosec.pub
      4 months ago

      My quick lazy manual transcription:

      What data was used to train Sora?
      We used publicly available data and licensed data.

      So, videos on YouTube?
      I’m actually not sure about that.

      OK, videos from Facebook? Instagram?
      You know, if they were publicly available, um, yeah, publicly available to use, there might be the data, but I’m not sure. I’m not confident about it.

      What about Shutterstock? I know you guys have a deal with them.
      I’m just not gonna go into the details of the data that was used but it was publicly available or licensed data.

      EDIT: Please help, can’t figure out how to preserve line breaks. Edit: Improved it a bit.