Simulations and modes of convergence
Discusses why simulation summaries should focus on quantiles and robust statistics rather than moments when evaluating asymptotic approximations.
The author reflects on R's rise in programming language rankings and its unexpected adoption across diverse fields over 20 years.
Explains how to compute the Huber/White sandwich estimator incrementally in R's biglm package for large-scale linear regression.
Explores the surprising effectiveness and conservative nature of the Bonferroni correction for multiple hypothesis testing, even with many tests.
A guide for academics with math/physics backgrounds transitioning into data science, covering skills, learning paths, and practical advice.
A data analysis of a radio station's song rotation patterns using vector math and statistical methods to test anecdotal claims about repetitive playtimes.
Explores the statistical concept of 'design consistency' in survey sampling, comparing it to model consistency and discussing asymptotic theory.
Analyzes a classic probability problem involving dice rolls, its historical context with Newton and Pepys, and the mathematical intuition behind it.
Analyzes the pseudorandom number generator defined in NZ Flag Referendum law, comparing it to the Wichmann-Hill algorithm and noting a potential flaw.
Explores valid reasons for using simplified assumptions like 'spherical cows' in statistical modeling and theoretical work.
A technical critique of the Net Reclassification Index (NRI), a statistical measure for evaluating prediction model improvements, highlighting its surprising biases.
Critique of using Shapiro-Wilk normality tests on large, complex survey data like NHANES, explaining why it's statistically inappropriate.
A guide to getting started with Structural Equation Modeling (SEM) in R using the lavaan package, based on a user group presentation.
Explores different proofs of the Continuous Mapping Theorem in probability theory, discussing their merits and pedagogical value.
The article debunks common misinterpretations of the Dunning-Kruger effect by analyzing the original study's data and findings.
A tutorial introducing the ggplot2 package for data visualization in R, presented at a user group meeting.
A philosophical and technical exploration of the practical meaning of measurability in mathematical statistics, questioning its necessity for real-world data analysis.
Author's 2014 review: writing a data science book from scratch in Python and preparing for/starting a software engineering job at Google.
A technical guide to Dixon's Q test for identifying outliers in small datasets, including its method, application, and criticisms.
Explores the critical difference between frequentist confidence intervals and Bayesian credible regions, arguing why frequentism often fails scientific inquiry.