R.I.P., Fritz Leisch
A personal tribute to Fritz Leisch, creator of Sweave, reflecting on his impact on reproducible research and the R ecosystem.
A personal tribute to statistician John Fox, recalling his mentorship and impact on the R community through tools like Rcmdr and car.
A technical analysis of Overture Maps' latest Places dataset, covering over 72 million global points of interest, including setup and tools used.
A blog post arguing that statistical inference is often used as a tool of rhetoric and persuasion, rather than pure objective science.
A data scientist shares workflow automation tools and custom settings for Positron, Raycast, and Espanso to streamline data analysis tasks.
Analysis of the most popular personal blogs on Hacker News in 2025, based on a tracking project that ranks domains by their performance on the platform.
Analyzing All The Places' open-source location data project, detailing the technical setup and process for downloading and examining millions of brand locations.
Analyzes data showing autonomous vehicles significantly reduce crashes and injuries compared to human drivers, based on Waymo's safety performance.
A technical analysis and comparison of various administrative boundary datasets, including OpenStreetMap, using Python, DuckDB, and QGIS.
Introduces dremioframe, a Python DataFrame library for querying Dremio with a pandas-like API, generating SQL under the hood.
Analyzing Business Insider's dataset on US data center locations, ownership, and resource consumption using Python, DuckDB, and QGIS.
Argues that AI's real challenge is not data scarcity but the vast amount of generated data that goes unanalyzed, and that this backlog is an opportunity for AI.
A blog archive listing posts about data visualization, statistical analysis, and data science using the R programming language.
A developer saves their company $4,500/month by replacing an expensive Oracle tool with a free R script for data analysis and visualization.
A lecture on the foundational statistical concept of orderings and ordinal data, exploring their analysis and complications in fields like health research.
The author discusses updates to gssrdoc, an R package that provides integrated help documentation for the General Social Survey (GSS) dataset.
A data-driven analysis of LLM performance on a simple retrieval task, highlighting the need for evidence-based AI testing.
Analyzing pedestrian fatality data using polar coordinate visualizations to reveal cyclical patterns in daily accident counts.
A technical exploration of the ICMM's global mining dataset, detailing the setup, tools, and process for data analysis using Python, DuckDB, and QGIS.
An analysis of Statistics Canada's Open Database of Buildings (ODB) dataset, covering data processing, tools used, and technical setup.