Reflecting on a Year of Making Machine Learning Actually Useful
A machine learning engineer reflects on the gap between ML research and real-world production, emphasizing the critical importance of data over models.
Introduces the 'data exploration calculus', a theoretical model capturing the unique programming patterns used by data scientists and journalists for exploratory data analysis.
A psychology graduate shares his unconventional journey into data science, detailing his career transition and lessons learned to help others.
Highlights key themes from rstudio::conf 2020, including putting R in production, R Markdown advancements, parallel processing, and tidyverse programming.
Explores how applying design thinking principles can improve data science projects by focusing on user needs and storytelling.
A developer shares their experience participating in the free F# mentorship program, both as a mentee and a mentor, and encourages others to join.
Announces version 3.37 of the R 'survey' package, detailing new features for statistical analysis with complex survey data.
A summary of a meetup talk on advanced recommender systems, exploring techniques beyond baselines using graph and NLP methods.
A data scientist explores intellectual humility and reframing imposter syndrome as a learning alarm to improve professional well-being.
A researcher reviews their 2019 scientific work, focusing on computational statistics for brain imaging and data science.
A professor details the curriculum and practical challenges of teaching an undergraduate 'Data Science Practice' course, covering data prep, predictive models, and tools like R and keras.
A tutorial on probability and statistics concepts, from basics to generalized linear models, presented at PyData NYC with Python examples.
A review of Janelle Shane's AI humor book, discussing neural network limitations and the real-world impact of class imbalance in machine learning.
A data scientist's journey from dogmatic Bayesianism to a pragmatic, 'secular' use of Bayesian tools without requiring belief in the model's literal existence.
An analysis and English translation of Jacek Kaczmarski's poem 'The Statues', exploring the legacy of tyranny.
A critique of the overuse and devaluation of the titles 'Engineer' and 'Scientist' in modern IT, focusing on data science and engineering roles.
A case study on building and deploying a machine learning system for hospital bill estimation, reducing prediction errors by over 50%.
A critique of the Oxford-Munich Code of Conduct for Data Scientists, focusing on its technical recommendations on sampling and data retention.
An exploration of predictive analytics, its historical roots in human nature, and its modern implementation through data science and AI technologies.
Explains the theory behind linear regression models, a fundamental machine learning algorithm for predicting continuous numerical values.