Naive Bayes and Text Classification
Explores Naive Bayes classifiers for text classification, covering theory and applications like spam filtering and song lyric analysis.
A guide to performing nonlinear dimensionality reduction using RBF Kernel PCA, including theory, implementation, and examples.
An overview of predictive modeling, supervised machine learning, and the core workflow for pattern classification tasks.
An overview of predictive modeling, supervised machine learning, and pattern classification concepts, workflows, and applications.
A technical guide to Linear Discriminant Analysis (LDA) for dimensionality reduction and classification in machine learning, with comparisons to PCA.
A technical guide to Linear Discriminant Analysis (LDA) for dimensionality reduction and classification in machine learning, including a Python implementation.
Highlights of the scikit-learn 0.15 release, including performance improvements, new features, and deprecations.
Explains feature scaling and normalization in machine learning, comparing standardization and Min-Max scaling, with examples using scikit-learn.
A guide to feature scaling and normalization in machine learning, covering standardization, Min-Max scaling, and their implementation in scikit-learn.
A Python tutorial covering essential tools and techniques for machine learning, including data visualization, PCA, LDA, and classification.
A tutorial on using Python tools for machine learning, covering data loading, visualization, preprocessing, and classification with scikit-learn.
A blog post sharing the author's cover letter for an internship at iHub Research, focusing on their interest in automating hate speech detection using AI and NLP.
Explores how personas, data science, and k-means clustering can be used together to analyze user data and gain actionable business insights.
Announcing the four students accepted for Google Summer of Code 2024 to work on scikit-learn projects, including neural networks and performance improvements.
A technical guide to implementing Principal Component Analysis (PCA) for dimensionality reduction, comparing it with MDA and providing code examples.
A critique of the overuse of PCA in data science, arguing that it is not a universal solution for classification problems.
Introduces Stochastic Outlier Selection (SOS), an unsupervised machine learning algorithm for detecting outliers based on affinity between data points.
Explains the mathematical relationship between the tanh and logistic sigmoid functions, and why tanh is often preferred as an activation function in neural networks.
An overview of the scikit-learn 0.14 release, highlighting new features like AdaBoost and performance improvements in benchmarks.
Explores using Principal Component Analysis on t-shirt images to build a gender classification model, visualizing data as 'eigenshirts'.