YYY's picture

YYY

zzfive

·

ZZfive

AI & ML interests

None yet

Organizations

None yet

zzfive's activity

upvoted 2 papers about 2 hours ago

Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution

Paper • 2409.12191 • Published 1 day ago • 45

Qwen2.5-Coder Technical Report

Paper • 2409.12186 • Published 1 day ago • 71

upvoted 2 papers 14 days ago

WildVis: Open Source Visualizer for Million-Scale Chat Logs in the Wild

Paper • 2409.03753 • Published 14 days ago • 17

Loopy: Taming Audio-Driven Portrait Avatar with Long-Term Motion Dependency

Paper • 2409.02634 • Published 16 days ago • 84

upvoted a paper 15 days ago

GenAgent: Build Collaborative AI Systems with Automated Workflow Generation -- Case Studies on ComfyUI

Paper • 2409.01392 • Published 17 days ago • 9

upvoted a paper 21 days ago

CogVLM2: Visual Language Models for Image and Video Understanding

Paper • 2408.16500 • Published 22 days ago • 55

upvoted 3 papers about 1 month ago

DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search

Paper • 2408.08152 • Published Aug 15 • 51

ControlNeXt: Powerful and Efficient Control for Image and Video Generation

Paper • 2408.06070 • Published Aug 12 • 52

MiniCPM-V: A GPT-4V Level MLLM on Your Phone

Paper • 2408.01800 • Published Aug 3 • 74

upvoted 3 papers about 2 months ago

Diffree: Text-Guided Shape Free Object Inpainting with Diffusion Model

Paper • 2407.16982 • Published Jul 24 • 40

MovieDreamer: Hierarchical Generation for Coherent Long Visual Sequence

Paper • 2407.16655 • Published Jul 23 • 28

Artist: Aesthetically Controllable Text-Driven Stylization without Training

Paper • 2407.15842 • Published Jul 22 • 13

upvoted 7 papers 2 months ago

Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies

Paper • 2407.13623 • Published Jul 18 • 52

Qwen2 Technical Report

Paper • 2407.10671 • Published Jul 15 • 153

H2O-Danube3 Technical Report

Paper • 2407.09276 • Published Jul 12 • 18

MUSCLE: A Model Update Strategy for Compatible LLM Evolution

Paper • 2407.09435 • Published Jul 12 • 20

VEnhancer: Generative Space-Time Enhancement for Video Generation

Paper • 2407.07667 • Published Jul 10 • 11

LLaVA-NeXT-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models

Paper • 2407.07895 • Published Jul 10 • 40

FunAudioLLM: Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs

Paper • 2407.04051 • Published Jul 4 • 35

upvoted 2 papers 3 months ago

InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output

Paper • 2407.03320 • Published Jul 3 • 92

FoleyCrafter: Bring Silent Videos to Life with Lifelike and Synchronized Sounds

Paper • 2407.01494 • Published Jul 1 • 13

upvoted a collection 4 months ago

LLMs

204 items • Updated about 2 hours ago • 14