Defining Normal to See Abnormal
Explores the history of data science through early 20th-century rat diet experiments, drawing parallels to modern statistical methods.
A 2025 AI research review covering tabular machine learning, the societal impacts of AI scale, and open-source data-science tools.
Explores whether predictive statistical models require causal relationships to be useful, using examples from data sampling and real-world scenarios.
The author announces their new role as Probabl's CSO to accelerate development of the scikit-learn machine learning library and its ecosystem.
An infrastructure engineer explores AI Engineering, defining the role and its focus on using pre-trained models, prompt engineering, and practical application building.
A machine learning professor critiques the foundational concept of a 'data-generating distribution' and shares insights from teaching a truly distribution-free course.
Discusses handling class imbalance in predictive modeling, using medical and zebra analogies to explain adjusting for prior probabilities and error costs.
A keynote on trustworthy data visualization, exploring trust in an era of fake results, AI confabulation, and data infrastructure decay.
Explains the difference between AI and Machine Learning, with AI as the goal of intelligent systems and ML as a key approach to achieve it.
Introducing the {RKaggle} R package for downloading Kaggle datasets directly into the R console, covering installation and basic usage.
A tutorial on using the {fs} package in R for easier file path manipulation, extension management, and directory information retrieval.
Announces 9 new free and paid books added to the Big Book of R collection, covering data science, visualization, and package development.
Interview with Dr. Nick Feamster on network measurement, machine learning, and the Internet Equity Initiative's work on broadband access.
Announces 7 new free R programming books added to the Big Book of R collection, covering topics like machine learning, data science, and software engineering.
The Big Book of R adds 10 new books, including Spanish titles and English works on data science, statistics, and fantasy football analytics using R.
A guide for R users to learn basics of Python, HTML, CSS, JS, and C++ to enhance their data science and web development projects.
Positron is a new data science IDE from Posit that combines features from RStudio and VS Code, offering a specialized environment for R and Python.
Announces a major update to the Big Book of R website, including a migration to Quarto, a new Psychology chapter, and the addition of new R programming books.
Explains the key difference between AI models and algorithms, using linear regression and OLS as examples.
Discusses the spin-off of scikit-learn's open-source development from Inria into Probabl, a new mission-driven enterprise, with a focus on sustainable funding and growth.