Optimizing Transformers for GPUs with Optimum
Learn to optimize Hugging Face Transformers models for GPU inference using Optimum and ONNX Runtime to reduce latency.
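As a rough illustration of that workflow, the sketch below exports a Transformers checkpoint to ONNX with Optimum and runs it through ONNX Runtime's CUDA execution provider. The model id is only an example, and exact argument names can differ between Optimum releases.

```python
# Minimal sketch: export a Transformers checkpoint to ONNX with Optimum and run it
# on GPU through ONNX Runtime. The model id is an example; API details may vary
# between Optimum versions, and onnxruntime-gpu must be installed for CUDA.
from optimum.onnxruntime import ORTModelForSequenceClassification
from transformers import AutoTokenizer, pipeline

model_id = "distilbert-base-uncased-finetuned-sst-2-english"  # example checkpoint

# export=True converts the PyTorch weights to ONNX on the fly;
# the CUDA execution provider tells ONNX Runtime to run the graph on the GPU.
model = ORTModelForSequenceClassification.from_pretrained(
    model_id,
    export=True,
    provider="CUDAExecutionProvider",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# The ONNX Runtime model plugs into the standard Transformers pipeline API.
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)
print(classifier("Optimum makes GPU inference faster."))
```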
Learn how to use Hugging Face Optimum and ONNX Runtime to apply static quantization to a DistilBERT model, achieving a roughly 3x latency improvement.
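A hedged sketch of what static quantization with Optimum's ORTQuantizer can look like follows. The calibration dataset (GLUE/SST-2), sample count, and quantization config are assumptions chosen for illustration, not the exact settings behind the quoted speedup.

```python
# Hedged sketch of static (post-training) quantization with Optimum and ONNX Runtime.
# Dataset, preprocessing, and config choices below are illustrative assumptions.
from functools import partial

from optimum.onnxruntime import ORTModelForSequenceClassification, ORTQuantizer
from optimum.onnxruntime.configuration import AutoCalibrationConfig, AutoQuantizationConfig
from transformers import AutoTokenizer

model_id = "distilbert-base-uncased-finetuned-sst-2-english"  # example checkpoint

# Export to ONNX first, then create a quantizer for the exported model.
model = ORTModelForSequenceClassification.from_pretrained(model_id, export=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)
quantizer = ORTQuantizer.from_pretrained(model)

# Static quantization needs a calibration set to estimate activation ranges.
def preprocess(examples, tokenizer):
    return tokenizer(examples["sentence"], padding="max_length", truncation=True)

calibration_dataset = quantizer.get_calibration_dataset(
    "glue",
    dataset_config_name="sst2",
    preprocess_function=partial(preprocess, tokenizer=tokenizer),
    num_samples=100,
    dataset_split="train",
)
calibration_config = AutoCalibrationConfig.minmax(calibration_dataset)
qconfig = AutoQuantizationConfig.avx512_vnni(is_static=True, per_channel=False)

# Fit calibration ranges, then write the statically quantized model to disk.
ranges = quantizer.fit(dataset=calibration_dataset, calibration_config=calibration_config)
quantizer.quantize(
    save_dir="distilbert_quantized",
    calibration_tensors_range=ranges,
    quantization_config=qconfig,
)
```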
A guide to deploying Hugging Face's DistilBERT model for serverless inference using Amazon SageMaker, including setup and deployment steps.
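A minimal sketch of such a deployment with the SageMaker Python SDK is shown below; the IAM role, container version pins, memory size, and concurrency limit are placeholder assumptions to adapt to your own account.

```python
# Hedged sketch: deploy a Hub model to a SageMaker serverless inference endpoint.
# Role, model id, version pins, and serverless settings are illustrative assumptions.
import sagemaker
from sagemaker.huggingface import HuggingFaceModel
from sagemaker.serverless import ServerlessInferenceConfig

role = sagemaker.get_execution_role()  # assumes running inside SageMaker Studio/notebook

# Point the Hugging Face inference container at a Hub model via environment variables.
hub = {
    "HF_MODEL_ID": "distilbert-base-uncased-finetuned-sst-2-english",  # example checkpoint
    "HF_TASK": "text-classification",
}

huggingface_model = HuggingFaceModel(
    env=hub,
    role=role,
    transformers_version="4.26",  # version pins are assumptions; use a supported combination
    pytorch_version="1.13",
    py_version="py39",
)

# Serverless config: memory size and max concurrency are illustrative values.
serverless_config = ServerlessInferenceConfig(
    memory_size_in_mb=4096,
    max_concurrency=10,
)

# Deploy to a serverless endpoint and send a test request.
predictor = huggingface_model.deploy(serverless_inference_config=serverless_config)
print(predictor.predict({"inputs": "Serverless inference with SageMaker is convenient."}))
```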