Abstract: Recent text-to-image (T2I) diffusion models have demonstrated remarkable capabilities in visual synthesis, yet their performance heavily relies on the quality of input prompts. However, ...