• Marcbmann@lemmy.world
    8 upvotes · 4 months ago

    I mean, I don’t think it’s an easy thing to fix. How do you eliminate bias in the training data without eliminating a substantial percentage of your training data, which would significantly hinder performance?

    • bamboo@lemmy.blahaj.zone
      English · 10 upvotes · 4 months ago

      Rather than eliminating some of the training data, you could add more training data to create an even balance.

      • kromem@lemmy.world
        English · 3 upvotes · 4 months ago

        Indeed - there’s a very good argument for using synthetic data to introduce diversity as long as you can avoid model collapse.
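
For anyone who wants to see what "add data to even out the balance" could look like in practice, here is a rough sketch. It is not anyone's actual pipeline from this thread: the function name `balance_by_oversampling` and the `group_of` callback are made up for illustration, and it uses plain oversampling (repeating existing minority-group examples) as a stand-in, which is a much cruder thing than collecting new data or generating synthetic data.

```python
import random
from collections import defaultdict


def balance_by_oversampling(examples, group_of, seed=0):
    """Return a new list where every group is as large as the largest group.

    `group_of` is a hypothetical callback that maps an example to its group
    label (for instance a demographic or class attribute).
    """
    if not examples:
        return []

    rng = random.Random(seed)

    # Bucket the examples by group label.
    groups = defaultdict(list)
    for example in examples:
        groups[group_of(example)].append(example)

    # Every group is topped up to the size of the largest one.
    target = max(len(members) for members in groups.values())

    balanced = []
    for members in groups.values():
        balanced.extend(members)
        # Sample with replacement to fill the gap; k is 0 for the largest group.
        balanced.extend(rng.choices(members, k=target - len(members)))
    return balanced


# Toy data: six examples in group "x", two in group "y".
data = [("a", "x")] * 6 + [("b", "y")] * 2
balanced = balance_by_oversampling(data, group_of=lambda ex: ex[1])
```

Nothing is thrown away here, which is the point of the suggestion, but the repeated examples add no new information, so this only changes the proportions the model sees.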