• Zarxrax@lemmy.world · 11 days ago

    Well, hopefully this will at least force Stability’s hand in some way and get them to make an official statement instead of just remaining silent.

        • j4k3@lemmy.world · 11 days ago

          It has a lot of potential if the T5 can be made conversational. After diving into a custom DPM adaptive sampler, I’ve found there is a lot more specificity required. I believe the vast majority of people are not using the model with the correct workflow; applying the old model workflows to SD3 produces garbage results. The two CLIP models and the T5 need separate prompts (see the sketch below), and the negative prompt needs an inverted channel with a slight delay before reintegration. I also think the smaller quantized version of the T5 is likely the primary problem overall: any Transformer text model that small, which is then quantized to an extremely small size, is problematic.

          The license is garbage. The company is toxic. But the tool is more complex than most of the community seems to understand. I can generate a woman lying on grass in many intentional and iterative ways.
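
          To make the prompt split concrete, here is a minimal sketch using the Hugging Face diffusers SD3 pipeline, which exposes one prompt per text encoder (prompt/prompt_2 for the two CLIPs, prompt_3 for the T5). The repo id, prompt wording, and sampler settings are placeholder assumptions, and the inverted/delayed negative-prompt channel isn’t covered here; that needs a custom node graph.

          import torch
          from diffusers import StableDiffusion3Pipeline

          # Load SD3 medium; repo id and fp16 dtype are assumptions for this sketch.
          pipe = StableDiffusion3Pipeline.from_pretrained(
              "stabilityai/stable-diffusion-3-medium-diffusers",
              torch_dtype=torch.float16,
          ).to("cuda")

          image = pipe(
              # prompt / prompt_2 feed the two CLIP encoders: short, tag-like text.
              prompt="photo, woman lying on grass, natural light",
              prompt_2="photo, woman lying on grass, natural light",
              # prompt_3 feeds the T5 encoder: a longer natural-language description.
              prompt_3="A photograph of a woman lying on her back in a grassy field, "
                       "arms relaxed at her sides, soft afternoon light.",
              negative_prompt="deformed, extra limbs, lowres",
              num_inference_steps=28,
              guidance_scale=7.0,
          ).images[0]

          image.save("sd3_test.png")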

          • brucethemoose@lemmy.world · 11 days ago

            Yeah, and it’s just fp8 truncation, right? Not actual “smart” quantization? That’s a big hit even for huge decoder-only LLMs.
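
            For illustration, here is a toy PyTorch comparison (assuming a build with the float8 dtypes, 2.1+) of a plain fp8 cast versus per-channel scaled quantization; the random weight matrix and its tiny scale are contrived to expose the underflow that a bare cast causes:

            import torch

            # Stand-in weight matrix; the tiny scale pushes values below what
            # e4m3 can represent directly, which is where a bare cast hurts most.
            w = torch.randn(1024, 1024) * 1e-3

            # Plain "truncation": cast straight to fp8, no rescaling, so small
            # values get flushed toward zero.
            w_naive = w.to(torch.float8_e4m3fn)

            # Scale-aware quantization: rescale each row to use the full e4m3
            # range (max normal = 448), keep the scale, undo it on dequant.
            scale = (w.abs().amax(dim=1, keepdim=True) / 448.0).clamp(min=1e-12)
            w_scaled = (w / scale).to(torch.float8_e4m3fn)

            err_naive = (w_naive.float() - w).abs().mean().item()
            err_scaled = (w_scaled.float() * scale - w).abs().mean().item()
            print(f"plain cast error:   {err_naive:.2e}")
            print(f"scaled quant error: {err_scaled:.2e}")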