Collections

Collections including paper arxiv:2409.01704

Image Processing

briaai/RMBG-1.4

Image Segmentation • Updated May 23 • 1.87M • 1.45k
General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Paper • 2409.01704 • Published 17 days ago • 72

about 18 hours ago

Unlocking the conversion of Web Screenshots into HTML Code with the WebSight Dataset

Paper • 2403.09029 • Published Mar 14 • 54
LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression

Paper • 2403.12968 • Published Mar 19 • 24
RAFT: Adapting Language Model to Domain Specific RAG

Paper • 2403.10131 • Published Mar 15 • 66
Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking

Paper • 2403.09629 • Published Mar 14 • 69

about 22 hours ago

LocalMamba: Visual State Space Model with Windowed Selective Scan

Paper • 2403.09338 • Published Mar 14 • 7
GiT: Towards Generalist Vision Transformer through Universal Language Interface

Paper • 2403.09394 • Published Mar 14 • 25
Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers

Paper • 2402.19479 • Published Feb 29 • 32
Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection

Paper • 2405.10300 • Published May 16 • 26

BioMistral/BioMistral-7B

Text Generation • Updated Feb 21 • 16.1k • 374
General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Paper • 2409.01704 • Published 17 days ago • 72

Speculative Streaming: Fast LLM Inference without Auxiliary Models

Paper • 2402.11131 • Published Feb 16 • 41
Generative Representational Instruction Tuning

Paper • 2402.09906 • Published Feb 15 • 51
Chain-of-Thought Reasoning Without Prompting

Paper • 2402.10200 • Published Feb 15 • 94
BitDelta: Your Fine-Tune May Only Be Worth One Bit

Paper • 2402.10193 • Published Feb 15 • 17

Specialized Language Models with Cheap Inference from Limited Domain Data

Paper • 2402.01093 • Published Feb 2 • 45
Attention Heads of Large Language Models: A Survey

Paper • 2409.03752 • Published 14 days ago • 83
General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Paper • 2409.01704 • Published 17 days ago • 72
jina-embeddings-v3: Multilingual Embeddings With Task LoRA

Paper • 2409.10173 • Published 4 days ago • 15

about 14 hours ago

A Picture is Worth More Than 77 Text Tokens: Evaluating CLIP-Style Models on Dense Captions

Paper • 2312.08578 • Published Dec 14, 2023 • 16
ZeroQuant(4+2): Redefining LLMs Quantization with a New FP6-Centric Strategy for Diverse Generative Tasks

Paper • 2312.08583 • Published Dec 14, 2023 • 9
Vision-Language Models as a Source of Rewards

Paper • 2312.09187 • Published Dec 14, 2023 • 11
StemGen: A music generation model that listens

Paper • 2312.08723 • Published Dec 14, 2023 • 47

spacy/en_core_web_lg

Token Classification • Updated Nov 21, 2023 • 266 • 24
General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Paper • 2409.01704 • Published 17 days ago • 72

InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning

Paper • 2305.06500 • Published May 11, 2023 • 4
PaLI-3 Vision Language Models: Smaller, Faster, Stronger

Paper • 2310.09199 • Published Oct 13, 2023 • 24
Video-ChatGPT: Towards Detailed Video Understanding via Large Vision and Language Models

Paper • 2306.05424 • Published Jun 8, 2023 • 7
General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Paper • 2409.01704 • Published 17 days ago • 72

One Wide Feedforward is All You Need

Paper • 2309.01826 • Published Sep 4, 2023 • 31
General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Paper • 2409.01704 • Published 17 days ago • 72
