Philipp Schmid

Philipp Schmid is a Staff Engineer at Google DeepMind, building AI Developer Experience and DevRel initiatives. He specializes in LLMs, RLHF, and making advanced AI accessible to developers worldwide.

https://www.philschmid.de

1/22/2026

AI LLMs developer experience Google DeepMind RLHF

Articles from this Blog

189 articles from this blog

4/18/2024 • EN

Deploy Llama 3 on Amazon SageMaker

A technical guide on deploying Meta's Llama 3 70B model on Amazon SageMaker using the Hugging Face LLM DLC and Text Generation Inference.

large language models Hugging Face Llama 3

4/2/2024 • EN

Accelerate Mixtral 8x7B with Speculative Decoding and Quantization on Amazon SageMaker

A technical guide on accelerating the Mixtral 8x7B LLM using speculative decoding (Medusa) and quantization (AWQ) for deployment on Amazon SageMaker.

Quantization LLM Inference Amazon Sagemaker

3/26/2024 • EN

Deploy Llama 2 70B on AWS Inferentia2 with Hugging Face Optimum

A technical guide on deploying Meta's Llama 2 70B large language model on AWS Inferentia2 hardware using Hugging Face Optimum and SageMaker.

Amazon Sagemaker Hugging Face Optimum LLM Deployment

3/12/2024 • EN

Fine-Tune and Evaluate LLMs in 2024 with Amazon SageMaker

A technical guide on fine-tuning and evaluating open-source Large Language Models (LLMs) using Amazon SageMaker and Hugging Face libraries.

Hugging Face Model Evaluation Amazon Sagemaker

3/5/2024 • EN

Evaluate LLMs with Hugging Face Lighteval on Amazon SageMaker

A tutorial on evaluating Large Language Models using Hugging Face's Lighteval library on Amazon SageMaker, focusing on benchmarks like TruthfulQA.

benchmarking Hugging Face LLM Evaluation

3/1/2024 • EN

How to fine-tune Google Gemma with ChatML and Hugging Face TRL

A technical guide on fine-tuning Google's Gemma open LLMs using the ChatML format and Hugging Face's TRL library for efficient training on consumer GPUs.

Hugging Face LLM Fine Tuning Trl

1/23/2024 • EN

How to Fine-Tune LLMs in 2024 with Hugging Face

A practical guide to fine-tuning open-source large language models (LLMs) using Hugging Face's TRL and Transformers libraries in 2024.

Transformers Hugging Face Datasets

1/23/2024 • EN

RLHF in 2024 with DPO and Hugging Face

A technical guide on using Direct Preference Optimization (DPO) with Hugging Face's TRL library to align and improve open-source large language models in 2024.

llm Hugging Face Dpo

1/11/2024 • EN

Scale LLM Inference on Amazon SageMaker with Multi-Replica Endpoints

A guide to scaling LLM inference on Amazon SageMaker using multi-replica endpoints for improved throughput and cost efficiency.

Hugging Face LLM Inference Amazon Sagemaker

12/21/2023 • EN

Fine-tune Llama 7B on AWS Trainium

A technical tutorial on fine-tuning the Llama 2 7B large language model using AWS Trainium instances and Hugging Face libraries.

Hugging Face Model Fine Tuning AWS Trainium

12/20/2023 • EN

Programmatically manage 🤗 Inference Endpoints

Learn to programmatically manage Hugging Face Inference Endpoints using the huggingface_hub Python library for automated model deployment.

Python generative ai Infrastructure As Code

12/12/2023 • EN

Deploy Mixtral 8x7B on Amazon SageMaker

A technical guide on deploying the Mixtral 8x7B open-source LLM from Mistral AI to Amazon SageMaker using the Hugging Face LLM DLC.

Hugging Face Mixture Of Experts Amazon Sagemaker

11/21/2023 • EN

Deploy Embedding Models on AWS inferentia2 with Amazon SageMaker

Tutorial on deploying embedding models using AWS Inferentia2 and Amazon SageMaker for accelerated inference performance.

aws Amazon Sagemaker Optimum Neuron

11/14/2023 • EN

Deploy Llama 2 7B on AWS inferentia2 with Amazon SageMaker

A tutorial on deploying Meta's Llama 2 7B model on AWS Inferentia2 using Amazon SageMaker and the optimum-neuron library.

Model Deployment Amazon Sagemaker Optimum Neuron

11/7/2023 • EN

Deploy Stable Diffusion XL on AWS inferentia2 with Amazon SageMaker

A tutorial on deploying Stable Diffusion XL for accelerated inference using AWS Inferentia2 and Amazon SageMaker.

stable diffusion Model Deployment Amazon Sagemaker

11/3/2023 • EN

Amazon Bedrock: How good (bad) is Titan Embeddings?

An evaluation of Amazon Titan Embeddings on the MTEB benchmark, analyzing its performance, use cases, and lack of transparency.

Vector Embeddings Amazon Bedrock Text Embeddings

10/30/2023 • EN

Evaluate LLMs and RAG a practical example using Langchain and Hugging Face

A hands-on guide to evaluating LLMs and RAG systems using Langchain and Hugging Face, covering criteria-based and pairwise evaluation methods.

Langchain Rag Gpt 4

10/12/2023 • EN

Deploy Idefics 9B and 80B on Amazon SageMaker

A technical guide on deploying Hugging Face's IDEFICS visual language models (9B & 80B parameters) to Amazon SageMaker using the LLM DLC.

large language models Multimodal AI Model Deployment

10/5/2023 • EN

Train and Deploy Mistral 7B with Hugging Face on Amazon SageMaker

A technical guide on fine-tuning the Mistral 7B large language model using QLoRA and deploying it on Amazon SageMaker with Hugging Face tools.

Hugging Face Amazon Sagemaker Qlora

9/26/2023 • EN

Llama 2 on Amazon SageMaker a Benchmark

A benchmark analysis of deploying Meta's Llama 2 models on Amazon SageMaker using Hugging Face's LLM Inference Container, evaluating cost, latency, and throughput.

large language models benchmark Model Deployment
