Distributed Training: Train BART/T5 for Summarization using 🤗 Transformers and Amazon SageMaker • Blog post • Apr 8, 2021
Let the Expert Stick to His Last: Expert-Specialized Fine-Tuning for Sparse Architectural Large Language Models • Paper 2407.01906 • Published Jul 2, 2024
Summary of a Haystack: A Challenge to Long-Context LLMs and RAG Systems • Paper 2407.01370 • Published Jul 1, 2024
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale • Paper 2406.17557 • Published Jun 25, 2024
How Do Large Language Models Acquire Factual Knowledge During Pretraining? • Paper 2406.11813 • Published Jun 17, 2024
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model • Paper 2405.04434 • Published May 7, 2024
Mixture-of-Depths: Dynamically allocating compute in transformer-based language models • Paper 2404.02258 • Published Apr 2, 2024
Improved Baselines with Visual Instruction Tuning • Paper 2310.03744 • Published Oct 5, 2023
Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models • Paper 2402.13064 • Published Feb 20, 2024