fffiloni (Sylvain Filoni)

upvoted 3 papers 4 days ago

Disco4D: Disentangled 4D Human Generation and Animation from a Single Image

Paper • 2409.17280 • Published 10 days ago • 8

Lotus: Diffusion-based Visual Foundation Model for High-quality Dense Prediction

Paper • 2409.18124 • Published 9 days ago • 23

PhysGen: Rigid-Body Physics-Grounded Image-to-Video Generation

Paper • 2409.18964 • Published 8 days ago • 20

upvoted an article 9 days ago

Article

🌟 Easy Fine-Tuning with Hugging Face SQL Console, Notebook Creator, and SFT

11 days ago • 12

upvoted a collection 9 days ago

Flux.1-dev ControlNets

Collection

A collection of ControlNet models for Flux.1-dev by Jasper Research • 4 items • Updated 12 days ago • 7

upvoted a paper 9 days ago

Portrait Video Editing Empowered by Multimodal Generative Priors

Paper • 2409.13591 • Published 15 days ago • 15

upvoted a paper 10 days ago

MIMO: Controllable Character Video Synthesis with Spatial Decomposed Modeling

Paper • 2409.16160 • Published 11 days ago • 30

upvoted an article 13 days ago

Article

Exploring the Daily Papers Page on Hugging Face

13 days ago • 25

upvoted an article 15 days ago

Article

Introducing Community Tools on HuggingChat

20 days ago • 26

upvoted 10 papers 15 days ago

Seed-Music: A Unified Framework for High Quality and Controlled Music Generation

Paper • 2409.09214 • Published 22 days ago • 45

SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer

Paper • 2409.08425 • Published 23 days ago • 9

StoryMaker: Towards Holistic Consistent Characters in Text-to-image Generation

Paper • 2409.12576 • Published 17 days ago • 14

FlexiTex: Enhancing Texture Generation with Visual Guidance

Paper • 2409.12431 • Published 17 days ago • 9

3DTopia-XL: Scaling High-quality 3D Asset Generation via Primitive Diffusion

Paper • 2409.12957 • Published 16 days ago • 17

LVCD: Reference-based Lineart Video Colorization with Diffusion Models

Paper • 2409.12960 • Published 16 days ago • 20

upvoted an article 22 days ago

Article

"Diffusers Image Fill" guide

22 days ago • 31

upvoted 2 papers 22 days ago

LLaMA-Omni: Seamless Speech Interaction with Large Language Models

Paper • 2409.06666 • Published 25 days ago • 54

Hi3D: Pursuing High-Resolution Image-to-3D Generation with Video Diffusion Models

Paper • 2409.07452 • Published 24 days ago • 18

upvoted a paper 23 days ago

VMAS: Video-to-Music Generation via Semantic Alignment in Web Music Videos

Paper • 2409.07450 • Published 24 days ago • 10

upvoted 6 papers 26 days ago

LinFusion: 1 GPU, 1 Minute, 16K Image

Paper • 2409.02097 • Published Sep 3 • 31

FLUX that Plays Music

Paper • 2409.00587 • Published Sep 1 • 31

VideoLLaMB: Long-context Video Understanding with Recurrent Memory Bridges

Paper • 2409.01071 • Published Sep 2 • 26

FastVoiceGrad: One-step Diffusion-Based Voice Conversion with Adversarial Conditional Diffusion Distillation

Paper • 2409.02245 • Published Sep 3 • 9

Geometry Image Diffusion: Fast and Data-Efficient Text-to-3D with Image-Based Surface Representation

Paper • 2409.03718 • Published about 1 month ago • 25

Guide-and-Rescale: Self-Guidance Mechanism for Effective Tuning-Free Real Image Editing

Paper • 2409.01322 • Published Sep 2 • 95

upvoted 6 papers about 1 month ago

Loopy: Taming Audio-Driven Portrait Avatar with Long-Term Motion Dependency

Paper • 2409.02634 • Published Sep 4 • 85

DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos

Paper • 2409.02095 • Published Sep 3 • 33

Kalman-Inspired Feature Propagation for Video Face Super-Resolution

Paper • 2408.05205 • Published Aug 9 • 8

Generative Inbetweening: Adapting Image-to-Video Models for Keyframe Interpolation

Paper • 2408.15239 • Published Aug 27 • 27

MagicMan: Generative Novel View Synthesis of Humans with 3D-Aware Diffusion and Iterative Refinement

Paper • 2408.14211 • Published Aug 26 • 8

Diffusion Models Are Real-Time Game Engines

Paper • 2408.14837 • Published Aug 27 • 121

upvoted 9 papers about 2 months ago

Reenact Anything: Semantic Video Motion Transfer Using Motion-Textual Inversion

Paper • 2408.00458 • Published Aug 1 • 10

TurboEdit: Text-Based Image Editing Using Few-Step Diffusion Models

Paper • 2408.00735 • Published Aug 1 • 15

SAM 2: Segment Anything in Images and Videos

Paper • 2408.00714 • Published Aug 1 • 105

MuChoMusic: Evaluating Music Understanding in Multimodal Audio-Language Models

Paper • 2408.01337 • Published Aug 2 • 10

TexGen: Text-Guided 3D Texture Generation with Multi-view Sampling and Resampling

Paper • 2408.01291 • Published Aug 2 • 11

ReSyncer: Rewiring Style-based Generator for Unified Audio-Visually Synced Facial Performer

Paper • 2408.03284 • Published Aug 6 • 9

Facing the Music: Tackling Singing Voice Separation in Cinematic Audio Source Separation

Paper • 2408.03588 • Published Aug 7 • 6

Fast Sprite Decomposition from Animated Graphics

Paper • 2408.03923 • Published Aug 7 • 7

Sketch2Scene: Automatic Generation of Interactive 3D Game Scenes from User's Casual Sketches

Paper • 2408.04567 • Published Aug 8 • 23

upvoted an article about 2 months ago

Article

A Complete Guide to Audio Datasets

Dec 15, 2022 • 17

upvoted 15 papers 2 months ago

T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy

Paper • 2403.14610 • Published Mar 21 • 3

Animate3D: Animating Any 3D Model with Multi-view Video Diffusion

Paper • 2407.11398 • Published Jul 16 • 8

Kinetic Typography Diffusion Model

Paper • 2407.10476 • Published Jul 15 • 1

Cycle3D: High-quality and Consistent Image-to-3D Generation via Generation-Reconstruction Cycle

Paper • 2407.19548 • Published Jul 28 • 22

Visual Riddles: a Commonsense and World Knowledge Challenge for Large Vision and Language Models

Paper • 2407.19474 • Published Jul 28 • 22

Bridging the Gap: Studio-like Avatar Creation from a Monocular Phone Capture

Paper • 2407.19593 • Published Jul 28 • 12

Artist: Aesthetically Controllable Text-Driven Stylization without Training

Paper • 2407.15842 • Published Jul 22 • 13

AccDiffusion: An Accurate Method for Higher-Resolution Image Generation

Paper • 2407.10738 • Published Jul 15 • 3

DreamDissector: Learning Disentangled Text-to-3D Generation from 2D Diffusion Priors

Paper • 2407.16260 • Published Jul 23 • 1

SHIC: Shape-Image Correspondences with no Keypoint Supervision

Paper • 2407.18907 • Published Jul 26 • 39

Text2Place: Affordance-aware Text Guided Human Placement

Paper • 2407.15446 • Published Jul 22 • 2

BetterDepth: Plug-and-Play Diffusion Refiner for Zero-Shot Monocular Depth Estimation

Paper • 2407.17952 • Published Jul 25 • 27

Floating No More: Object-Ground Reconstruction from a Single Image

Paper • 2407.18914 • Published Jul 26 • 18

EVLM: An Efficient Vision-Language Model for Visual Understanding

Paper • 2407.14177 • Published Jul 19 • 42

FoleyCrafter: Bring Silent Videos to Life with Lifelike and Synchronized Sounds

Paper • 2407.01494 • Published Jul 1 • 13

Articles

Breaking Barriers: The Critical Role of Art and Design in Advancing AI Capabilities
