Is Seattle Really Seeing an Uptick In Cycling?
A data scientist analyzes Seattle's bicycle counter data using Python to determine if cycling is truly increasing or just affected by good weather.
Explores the statistical power of monotonicity vs. smoothness assumptions in modeling, analyzing their asymptotic and finite-sample impacts.
A practical introduction to the philosophical and practical differences between frequentist and Bayesian statistics, with Python examples.
A critique of a proposal to lower the p-value threshold for statistical significance from 0.05 to 0.005, arguing it addresses symptoms, not root causes.
Explores the concept of 'barren proxies' in causal inference, arguing that measurement reliability is more critical than the proxy's barrenness.
Explores non-transitivity in games like rock-paper-scissors, its history, and connections to statistics, evolution, and voting systems.
Argues for the importance of statistical theory in data science, using examples from medical research to show where abstract theory solved practical problems.
A critique of common pitfalls and unproductive patterns in statistics research presentations, aimed at improving academic discourse.
Explores the equivalence between causal graphs and counterfactual reasoning in statistics, simplifying the connection between two major causal inference frameworks.
Examines statistical challenges with the causal Markov and faithfulness properties, focusing on measurement error's impact on causal inference.
Explores the concept of 'error' in regression models, clarifying when it represents measurement error versus model prediction error.
A statistics professor details his hardware and software setup, including Mac laptops, R, LaTeX, and plans to learn JavaScript.
A summary of upcoming technical talks on statistical computing, rare DNA variant analysis, and handling large datasets with R and SQL.
A critique of a New York Times article's explanation of p-values, clarifying common statistical misinterpretations for a non-technical audience.
A presentation and tutorial on using the `plyr` package in R for data manipulation, summarization, and automated statistical analysis.
A tutorial introduction to using mixed models in R for statistical analysis, covering linear and generalized linear mixed models with code examples.
Introduces Rcmdr, a GUI for performing basic business statistics in R without coding, and explains its installation and usage.
A personal recap of the SciPy 2011 conference, highlighting keynotes on scientific software, data mining with Python, and trends in statistics and parallel computing.
A research group seeks a post-doc for the AzureBrain project, using Python for parallel computing and statistics on brain imaging/genetics data.
A developer reflects on a month of daily blogging, sharing traffic stats and popular posts about Python, Django, and web development.