Philipp Schmid

Philipp Schmid is a Staff Engineer at Google DeepMind, building AI Developer Experience and DevRel initiatives. He specializes in LLMs, RLHF, and making advanced AI accessible to developers worldwide.

https://www.philschmid.de


1/22/2026

AI LLMs developer experience Google DeepMind RLHF

Articles from this Blog

189 articles from this blog

10/25/2022 • EN

Deploy T5 11B for inference for less than $500

A tutorial on deploying the T5 11B language model for inference using Hugging Face Inference Endpoints on a budget.

Hugging Face Transformer Model Deployment

10/18/2022 • EN

Outperform OpenAI GPT-3 with SetFit for text-classification

Learn how SetFit, a new approach from Intel Labs and Hugging Face, outperforms GPT-3 for text classification with minimal labeled data.

Machine Learning Text Classification Few Shot Learning

10/13/2022 • EN

Fine-tuning LayoutLM for document-understanding using Keras and Hugging Face Transformers

A tutorial on fine-tuning Microsoft's LayoutLM model for document understanding using TensorFlow, Keras, and the FUNSD dataset.

Keras Fine Tuning Hugging Face Transformers

10/6/2022 • EN

Deploy LayoutLM with Hugging Face Inference Endpoints

A tutorial on deploying the LayoutLM document understanding model using Hugging Face Inference Endpoints for production API integration.

Hugging Face Inference Endpoints LayoutLM

10/4/2022 • EN

Document AI: Fine-tuning LayoutLM for document-understanding using Hugging Face Transformers

A tutorial on fine-tuning Microsoft's LayoutLM model for document understanding and information extraction using the Hugging Face Transformers library.

computer vision Hugging Face Transformers Document AI

9/29/2022 • EN

Custom Inference with Hugging Face Inference Endpoints

A tutorial on creating custom inference handlers for Hugging Face Inference Endpoints to add business logic and dependencies.

Machine Learning Transformers Hugging Face

9/13/2022 • EN

Accelerate GPT-J inference with DeepSpeed-Inference on GPUs

Learn to optimize GPT-J inference using DeepSpeed-Inference and Hugging Face Transformers for faster GPU performance.

large language models GPU Optimization Transformer Models

9/6/2022 • EN

Document AI: Fine-tuning Donut for document-parsing using Hugging Face Transformers

A tutorial on fine-tuning the Donut model for document parsing using Hugging Face Transformers and the SROIE dataset.

Transformers Hugging Face Document AI

8/30/2022 • EN

Use Sentence Transformers with TensorFlow

A tutorial on using Sentence Transformers models with TensorFlow and Keras to create text embeddings for semantic search and similarity tasks.

Keras TensorFlow BERT

8/24/2022 • EN

Pre-Training BERT with Hugging Face Transformers and Habana Gaudi

A tutorial on pre-training a BERT model from scratch using Hugging Face Transformers and Habana Gaudi accelerators on AWS.

Pretraining BERT Hugging Face Transformers

8/16/2022 • EN

Accelerate BERT inference with DeepSpeed-Inference on GPUs

Learn to optimize BERT and RoBERTa models for faster GPU inference using DeepSpeed-Inference, reducing latency from 30ms to 10ms.

Transformers GPU Inference Optimization

8/2/2022 • EN

Accelerate Sentence Transformers with Hugging Face Optimum

Learn to optimize Sentence Transformers models for faster inference using Hugging Face Optimum, ONNX Runtime, and dynamic quantization.

performance optimization Model Quantization ONNX Runtime

7/26/2022 • EN

Deep Learning setup made easy with EC2 Remote Runner and Habana Gaudi

A guide to simplifying deep learning workflows using AWS EC2 Remote Runner and Habana Gaudi processors for efficient, cost-effective model training.

Python cloud computing AWS EC2

7/19/2022 • EN

Accelerate Vision Transformer (ViT) with Quantization using Optimum

Learn to accelerate Vision Transformer (ViT) models using quantization with Hugging Face Optimum and ONNX Runtime for improved latency.

Quantization Vision Transformer ONNX Runtime

7/13/2022 • EN

Optimizing Transformers for GPUs with Optimum

Learn to optimize Hugging Face Transformers models for GPU inference using Optimum and ONNX Runtime to reduce latency.

Transformers GPU Optimization ONNX Runtime

7/5/2022 • EN

Hugging Face Transformers and Habana Gaudi AWS DL1 Instances

Learn how to fine-tune the XLM-RoBERTa model for multilingual text classification using Hugging Face libraries on cost-efficient Habana Gaudi AWS instances.

Deep Learning Text Classification Hugging Face Transformers

6/30/2022 • EN

Optimizing Transformers with Hugging Face Optimum

Learn to optimize Hugging Face Transformers models using Optimum and ONNX Runtime for faster inference with dynamic quantization.

Transformers Hugging Face Quantization

6/21/2022 • EN

Convert Transformers to ONNX with Hugging Face Optimum

A guide on converting Hugging Face Transformers models to the ONNX format using the Optimum library for optimized deployment.

Neural Networks Transformers ONNX

6/14/2022 • EN

Setup Deep Learning environment for Hugging Face Transformers with Habana Gaudi on AWS

Guide to setting up a deep learning environment on AWS using Habana Gaudi accelerators and Hugging Face libraries for transformer models.

AWS Transformers Deep Learning

6/7/2022 • EN

Static Quantization with Hugging Face `optimum` for ~3x latency improvements

Learn how to use Hugging Face Optimum and ONNX Runtime to apply static quantization to a DistilBERT model, achieving ~3x latency improvements.

Quantization ONNX Runtime Model Optimization
