Training Language Models to Self-Correct via Reinforcement Learning Paper • 2409.12917 • Published 16 days ago • 128
Power-LM Collection Dense & MoE LLMs trained with the power learning rate scheduler. • 3 items • Updated 24 days ago • 14
Agent Q: Advanced Reasoning and Learning for Autonomous AI Agents Paper • 2408.07199 • Published Aug 13 • 20
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery Paper • 2408.06292 • Published Aug 12 • 115
Exploratory Preference Optimization: Harnessing Implicit Q*-Approximation for Sample-Efficient RLHF Paper • 2405.21046 • Published May 31 • 3
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale Paper • 2406.17557 • Published Jun 25 • 85
GLiNER multi-task: Generalist Lightweight Model for Various Information Extraction Tasks Paper • 2406.12925 • Published Jun 14 • 22
Margin-aware Preference Optimization for Aligning Diffusion Models without Reference Paper • 2406.06424 • Published Jun 10 • 11
ORPO: Monolithic Preference Optimization without Reference Model Paper • 2403.07691 • Published Mar 12 • 60
Chronos Models & Datasets Collection Chronos: Pretrained (language) models for time series forecasting based on the T5 architecture. • 8 items • Updated Jun 27 • 29
datasets-SPIN Collection Generated synthetic data used for SPIN fine-tuning. • 8 items • Updated Feb 9 • 11
A General Theoretical Paradigm to Understand Learning from Human Preferences Paper • 2310.12036 • Published Oct 18, 2023 • 13
NERetrieve: Dataset for Next Generation Named Entity Recognition and Retrieval Paper • 2310.14282 • Published Oct 22, 2023 • 5
Diffusion Model Alignment Using Direct Preference Optimization Paper • 2311.12908 • Published Nov 21, 2023 • 47
LCM-LoRA: A Universal Stable-Diffusion Acceleration Module Paper • 2311.05556 • Published Nov 9, 2023 • 79
Reward models on the hub Collection UNMAINTAINED (see RewardBench). A place to collect reward models, an often-unreleased artifact of RLHF. • 18 items • Updated Apr 13 • 25