-
The Dawn of LMMs: Preliminary Explorations with GPT-4V(ision)
Paper • 2309.17421 • Published • 4 -
Improved Baselines with Visual Instruction Tuning
Paper • 2310.03744 • Published • 36 -
Leveraging Unpaired Data for Vision-Language Generative Models via Cycle Consistency
Paper • 2310.03734 • Published • 14 -
Idea2Img: Iterative Self-Refinement with GPT-4V(ision) for Automatic Image Design and Generation
Paper • 2310.08541 • Published • 17
Collections
Discover the best community collections!
Collections including paper arxiv:2401.01952
-
Any-Size-Diffusion: Toward Efficient Text-Driven Synthesis for Any-Size HD Images
Paper • 2308.16582 • Published • 10 -
DreamSpace: Dreaming Your Room Space with Text-Driven Panoramic Texture Propagation
Paper • 2310.13119 • Published • 11 -
DreamCraft3D: Hierarchical 3D Generation with Bootstrapped Diffusion Prior
Paper • 2310.16818 • Published • 30 -
Text-to-3D with classifier score distillation
Paper • 2310.19415 • Published • 4
-
OmnimatteRF: Robust Omnimatte with 3D Background Modeling
Paper • 2309.07749 • Published • 7 -
AudioSR: Versatile Audio Super-resolution at Scale
Paper • 2309.07314 • Published • 24 -
Generative Image Dynamics
Paper • 2309.07906 • Published • 52 -
MagiCapture: High-Resolution Multi-Concept Portrait Customization
Paper • 2309.06895 • Published • 27
-
Large Language Models as Optimizers
Paper • 2309.03409 • Published • 75 -
Natural Language Supervision for General-Purpose Audio Representations
Paper • 2309.05767 • Published • 9 -
Connecting Large Language Models with Evolutionary Algorithms Yields Powerful Prompt Optimizers
Paper • 2309.08532 • Published • 52 -
AudioSR: Versatile Audio Super-resolution at Scale
Paper • 2309.07314 • Published • 24