Trangle Heshvp

Trangle

AI & ML interests

None yet

Organizations

Trangle's activity

upvoted a collection 10 days ago

Llama 3.2

This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 11 items • Updated 10 days ago • 328

upvoted an article about 2 months ago

Article

SmolLM - blazingly fast and remarkably powerful

Jul 16

• 244

upvoted 4 collections 2 months ago

Gemma Scope Release

A comprehensive, open suite of sparse autoencoders for Gemma 2 2B and 9B. • 10 items • Updated Aug 11 • 13

Llama 3.1 Evals

This collection provides detailed information on how we derived the reported benchmark metrics for the Llama 3.1 models, including the configurations, • 6 items • Updated 10 days ago • 16

Minitron

A family of compressed models obtained via pruning and knowledge distillation • 9 items • Updated 3 days ago • 54

🪐 SmolLM

A series of smol LLMs: 135M, 360M and 1.7B. We release base and Instruct models as well as the training corpus and some WebGPU demos • 12 items • Updated Aug 18 • 174

upvoted 2 papers 3 months ago

MUSCLE: A Model Update Strategy for Compatible LLM Evolution

Paper • 2407.09435 • Published Jul 12 • 20

Qwen2 Technical Report

Paper • 2407.10671 • Published Jul 15 • 154

upvoted an article 3 months ago

Article

BM25 for Python: Achieving high performance while simplifying dependencies with BM25S⚡

Jul 9

• 35

upvoted a paper 3 months ago

AgentInstruct: Toward Generative Teaching with Agentic Flows

Paper • 2407.03502 • Published Jul 3 • 43

upvoted a collection 3 months ago

Step-DPO

Resources for "Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs" • 11 items • Updated Jul 1 • 5

upvoted an article 3 months ago

Article

Welcome Gemma 2 - Google's new open LLM

Jun 27

• 118

upvoted a paper 3 months ago

SpeechVerse: A Large-scale Generalizable Audio Language Model

Paper • 2405.08295 • Published May 14 • 14

upvoted a collection 3 months ago

TaskMeAnything

A collection of TaskMeAnything resources [https://github.com/JieyuZ2/TaskMeAnything] • 12 items • Updated Aug 4 • 3

upvoted 2 articles 3 months ago

Article

Unlocking Longer Generation with Key-Value Cache Quantization

May 16

• 28

Article

Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models

Jun 24

• 170

upvoted a collection 4 months ago

WildBench

4 items • Updated 11 days ago • 3

upvoted a paper 4 months ago

Glyph-ByT5-v2: A Strong Aesthetic Baseline for Accurate Multilingual Visual Text Rendering

Paper • 2406.10208 • Published Jun 14 • 21

upvoted a collection 4 months ago

System Message Generalization

11 items • Updated Jun 7 • 3

upvoted a paper 4 months ago

Block Transformer: Global-to-Local Language Modeling for Fast Inference

Paper • 2406.02657 • Published Jun 4 • 36

upvoted an article 4 months ago

Article

Fish Speech V1 - New Multilingual Open Source TTS Model

May 3

• 13

upvoted 3 collections 4 months ago

Neo-Models

Neo • 9 items • Updated May 29 • 17

Neo-Datasets

2 items • Updated Jun 12 • 2

PM-pair

A collection of materials for training a pairwise preference model. • 3 items • Updated May 10 • 2

upvoted 2 collections 5 months ago

EcomXL-ControlNet

3 items • Updated May 15 • 2

Eurus

Advancing LLM Reasoning Generalists with Preference Trees • 11 items • Updated Apr 15 • 24

upvoted an article 5 months ago

Article

DS-MoE: Making MoE Models More Efficient and Less Memory-Intensive

Apr 9

• 29

upvoted an article 6 months ago

Article

Introducing Idefics2: A Powerful 8B Vision-Language Model for the community

Apr 15

• 161

upvoted 2 papers about 1 year ago

An Empirical Study of Scaling Instruct-Tuned Large Multimodal Models

Paper • 2309.09958 • Published Sep 18, 2023 • 18

A Distributed Data-Parallel PyTorch Implementation of the Distributed Shampoo Optimizer for Training Neural Networks At-Scale

Paper • 2309.06497 • Published Sep 12, 2023 • 4