Machine Learning articles

9/15/2021 • EN

MLOps Community - System Design for RecSys & Search

A talk on system design principles for building production recommendation systems and search engines, presented at an MLOps Community meetup.

Machine Learning MLOps RecSys Search system design

Eugene Yan

8/3/2021 • EN

The History of Speech Recognition to the Year 2030

A forecast of speech recognition technology's evolution from 2010 to 2030, analyzing past progress and predicting future trends.

artificial intelligence Deep Learning Machine Learning Natural Language Processing speech recognition

Awni Hannun

8/2/2021 • EN

Amazon Science - Eugene Yan and the Art of Writing about Science

Profile of Amazon applied scientist Eugene Yan, focusing on his career in data science and his influential technical writing about machine learning.

Amazon career development Data Science Machine Learning Writing

Eugene Yan

8/1/2021 • EN

Bootstrapping Labels via ___ Supervision & Human-In-The-Loop

Explores methods like semi-supervised and active learning to create training labels when labeled datasets are unavailable, with industry examples.

Active Learning Labeling Machine Learning Semi Supervised Learning Weak Supervision

Eugene Yan

7/20/2021 • EN

Mailbag: How to Bootstrap Labels for Relevant Docs in Search

Explains how to bootstrap training labels for a semantic search system using initial lexical search and user click data instead of costly human annotation.

BM25 Information Retrieval Machine Learning Recall@k Semantic Search

Eugene Yan

7/19/2021 • EN

ICML 2021 Invited Speakers — ML for Science

Highlights ICML 2021 invited talks on applying machine learning to scientific domains like drug discovery, climate science, poverty alleviation, and neuroscience.

AI Research Drug Discovery ICML Machine Learning Scientific Computing

John Langford

7/13/2021 • EN

SF Big Analytics - System Design for RecSys & Search

A talk on system design for recommendation and search systems, covering architecture and production considerations.

Machine Learning production RecSys Search system design

Eugene Yan

7/11/2021 • EN

What are Diffusion Models?

An in-depth technical explanation of diffusion models, a class of generative AI models that create data by reversing a noise-adding process.

Deep Learning Diffusion Models Generative Models Machine Learning Neural Networks

Lilian Weng

7/9/2021 • EN

Introduction to Deep Learning

A comprehensive deep learning course overview with PyTorch tutorials, covering fundamentals, neural networks, and advanced topics like CNNs and GANs.

computer vision Deep Learning Machine Learning Neural Networks PyTorch

Sebastian Raschka

7/4/2021 • EN

Is GitHub a derivative work of GPL'd software?

Analyzes the legal implications of GitHub Copilot potentially being a derivative work of GPL-licensed code used in its training.

Copyright Derivative Work GPL Machine Learning Software Licensing

Drew DeVault

7/4/2021 • EN

Influencing without Authority for Data Scientists

A data scientist shares practical strategies and mindsets for influencing technical teams and driving change without formal authority.

AWS Data Science Kubernetes leadership Machine Learning

Eugene Yan

6/27/2021 • EN

System Design for Recommendations and Search

Explores system design patterns for industrial-scale recommendation and search engines, focusing on offline/online components and retrieval/ranking stages.

Approximate Nearest Neighbors Machine Learning recommendation systems Search Systems system design

Eugene Yan

6/13/2021 • EN

Patterns for Personalization in Recommendations and Search

Explores machine learning patterns like bandits, sequential, and graph-based models for personalizing recommendations and search results.

Contextual Bandits Machine Learning Personalization recommendation systems search algorithms

Eugene Yan

6/5/2021 • EN

Few-shot learning in practice with GPT-Neo

A guide to implementing few-shot learning using the GPT-Neo language model and Hugging Face's inference API for NLP tasks.

Few Shot Learning GPT-Neo Language Models Machine Learning Natural Language Processing

Philipp Schmid

6/2/2021 • EN

Towards Data Science - Author Spotlight with Eugene Yan

An interview with data scientist Eugene Yan discussing his career path from psychology to Amazon, favorite ML projects, and advice for aspiring data scientists.

Applied Scientist Career Data Analyst Data Science Machine Learning

Eugene Yan

5/10/2021 • EN

Introducing mltrace

Introduces mltrace, an open-source lineage and tracing tool for debugging and maintaining production machine learning pipelines.

Lineage Tracking Machine Learning MLOps mltrace Pipeline Tracing

Shreya Shankar

5/4/2021 • EN

Explainable AI Cheat Sheet

A high-level guide to tools and methods for understanding AI/ML models and their predictions, known as Explainable AI (XAI).

AI ethics Cheat Sheet Explainable AI Machine Learning Model Interpretability

Jay Alammar

5/2/2021 • EN

The Metagame of Applying Machine Learning

Explores the strategic 'metagame' of applying machine learning in industry, focusing on problem selection and business impact over pure technical knowledge.

Applied Science career advice Data Science Industry Applications Machine Learning

Eugene Yan

5/2/2021 • EN

Generalisability, prediction, and causation

Explores the distinction between using regression models for causal inference versus predictive inference, and the role of generalizability in prediction.

Causal Inference Data Science Machine Learning Predictive Modeling statistics

Thomas Lumley