ltl's picture

1 47 1

ltl

ltl

·

2793145003

AI & ML interests

None yet

Organizations

None yet

ltl's activity

upvoted a paper 26 days ago

Loopy: Taming Audio-Driven Portrait Avatar with Long-Term Motion Dependency

Paper • 2409.02634 • Published Sep 4 • 85

upvoted 2 papers about 2 months ago

Transformer Explainer: Interactive Learning of Text-Generative Models

Paper • 2408.04619 • Published Aug 8 • 154

Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining

Paper • 2408.02657 • Published Aug 5 • 32

upvoted 9 papers 2 months ago

MiniCPM-V: A GPT-4V Level MLLM on Your Phone

Paper • 2408.01800 • Published Aug 3 • 74

Reenact Anything: Semantic Video Motion Transfer Using Motion-Textual Inversion

Paper • 2408.00458 • Published Aug 1 • 10

Tails Tell Tales: Chapter-Wide Manga Transcriptions with Character Names

Paper • 2408.00298 • Published Aug 1 • 9

SAM 2: Segment Anything in Images and Videos

Paper • 2408.00714 • Published Aug 1 • 105

The Llama 3 Herd of Models

Paper • 2407.21783 • Published Jul 31 • 103

Diffusion Feedback Helps CLIP See Better

Paper • 2407.20171 • Published Jul 29 • 34

VILA^2: VILA Augmented VILA

Paper • 2407.17453 • Published Jul 24 • 38

OpenDevin: An Open Platform for AI Software Developers as Generalist Agents

Paper • 2407.16741 • Published Jul 23 • 67

KAN or MLP: A Fairer Comparison

Paper • 2407.16674 • Published Jul 23 • 40

upvoted 7 papers 3 months ago

Human-like Episodic Memory for Infinite Context LLMs

Paper • 2407.09450 • Published Jul 12 • 56

Vision language models are blind

Paper • 2407.06581 • Published Jul 9 • 81

InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output

Paper • 2407.03320 • Published Jul 3 • 92

TokenPacker: Efficient Visual Projector for Multimodal LLM

Paper • 2407.02392 • Published Jul 2 • 21

Can LLMs Learn by Teaching? A Preliminary Study

Paper • 2406.14629 • Published Jun 20 • 17

Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs

Paper • 2406.16860 • Published Jun 24 • 55

Q*: Improving Multi-step Reasoning for LLMs with Deliberative Planning

Paper • 2406.14283 • Published Jun 20 • 2

upvoted 8 papers 4 months ago

TroL: Traversal of Layers for Large Language and Vision Models

Paper • 2406.12246 • Published Jun 18 • 34

With Greater Text Comes Greater Necessity: Inference-Time Training Helps Long Text Generation

Paper • 2401.11504 • Published Jan 21 • 1

Glyph-ByT5-v2: A Strong Aesthetic Baseline for Accurate Multilingual Visual Text Rendering

Paper • 2406.10208 • Published Jun 14 • 21

An Image is Worth More Than 16x16 Patches: Exploring Transformers on Individual Pixels

Paper • 2406.09415 • Published Jun 13 • 50

An Image is Worth 32 Tokens for Reconstruction and Generation

Paper • 2406.07550 • Published Jun 11 • 55

Vript: A Video Is Worth Thousands of Words

Paper • 2406.06040 • Published Jun 10 • 22

Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation

Paper • 2406.06525 • Published Jun 10 • 64

What matters when building vision-language models?

Paper • 2405.02246 • Published May 3 • 98

upvoted 4 papers 5 months ago

How Far Are We From AGI

Paper • 2405.10313 • Published May 16 • 2

FIFO-Diffusion: Generating Infinite Videos from Text without Training

Paper • 2405.11473 • Published May 19 • 53

Layer-Condensed KV Cache for Efficient Inference of Large Language Models

Paper • 2405.10637 • Published May 17 • 19

CatLIP: CLIP-level Visual Recognition Accuracy with 2.7x Faster Pre-training on Web-scale Image-Text Data

Paper • 2404.15653 • Published Apr 24 • 26

upvoted a paper 6 months ago

Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention

Paper • 2404.07143 • Published Apr 10 • 103

upvoted 3 papers 7 months ago

Fast High-Resolution Image Synthesis with Latent Adversarial Diffusion Distillation

Paper • 2403.12015 • Published Mar 18 • 63

Beyond Language Models: Byte Models are Digital World Simulators

Paper • 2402.19155 • Published Feb 29 • 49

Genie: Generative Interactive Environments

Paper • 2402.15391 • Published Feb 23 • 70

upvoted 2 papers 9 months ago

Mixtral of Experts

Paper • 2401.04088 • Published Jan 8 • 157

SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling

Paper • 2312.15166 • Published Dec 23, 2023 • 56

upvoted 3 papers 10 months ago

LLM360: Towards Fully Transparent Open-Source LLMs

Paper • 2312.06550 • Published Dec 11, 2023 • 56

Beyond Surface: Probing LLaMA Across Scales and Layers

Paper • 2312.04333 • Published Dec 7, 2023 • 18

The Unlocking Spell on Base LLMs: Rethinking Alignment via In-Context Learning

Paper • 2312.01552 • Published Dec 4, 2023 • 30

upvoted 7 papers about 1 year ago

FLM-101B: An Open LLM and How to Train It with $100K Budget

Paper • 2309.03852 • Published Sep 7, 2023 • 43

MVDream: Multi-view Diffusion for 3D Generation

Paper • 2308.16512 • Published Aug 31, 2023 • 101

CausalLM is not optimal for in-context learning

Paper • 2308.06912 • Published Aug 14, 2023 • 18

Self-Alignment with Instruction Backtranslation

Paper • 2308.06259 • Published Aug 11, 2023 • 40

Llama 2: Open Foundation and Fine-Tuned Chat Models

Paper • 2307.09288 • Published Jul 18, 2023 • 240

AlpaGasus: Training A Better Alpaca with Fewer Data

Paper • 2307.08701 • Published Jul 17, 2023 • 22

Retentive Network: A Successor to Transformer for Large Language Models

Paper • 2307.08621 • Published Jul 17, 2023 • 170