Setting up GPU Data Science Environments for Hackathons
A guide to setting up a GPU-powered JupyterHub environment on AWS for a data science hackathon, including driver installation and configuration.
A crash course on the theory behind linear regression models, a fundamental machine learning algorithm for predicting numerical values.
Explores the psychological reasons behind heated debates in data science, like R vs. Python, and why they are often unproductive.
Announcing the second edition of 'Data Science from Scratch', updated for Python 3 with cleaner code, type hints, and a new deep learning chapter.
A review and tips for the OMSCS CS7646 Machine Learning for Trading course, covering the author's experience and key takeaways.
A data scientist shares a personal job search story to argue that landing a tech job is about skill alignment, not just mass-applying.
A data scientist discusses the importance of a business mindset, prioritization, and effective communication for creating real-world value in data science projects.
A data scientist explains their method for crowdsourcing and ranking 41 data science podcasts by analyzing Google search results.
Analysis of 2019 Stack Overflow survey data comparing global and US salaries for R vs. Python programmers and data professionals.
A case study on building a production ML system to predict patient hospitalization costs for Southeast Asia's largest healthcare group.
A guide to customizing ggplot graphs to match your company's visual branding, so your data visualizations stand out and have more impact.
A look back at a 2018 PyData talk on end-to-end GPU data science workflows using OmniSci and RAPIDS, highlighting concepts still relevant today.
Announcement for a lecture series on machine learning, covering topics like Weka, deep learning, algorithmic fairness, and sparse supervised learning.
A researcher's 2018 highlights: using machine learning for cognitive brain mapping, analyzing non-curated data, and contributing to scikit-learn development.
A guide on building a personal brand as a data scientist, covering path selection, blogging, and sharing knowledge within the community.
A data scientist shares strategies for managing and mitigating failure in data science projects, emphasizing risk analysis and realistic planning.
Introduces the 'namer' R package for automatically labeling unnamed R Markdown chunks to improve debugging and cache management.
Explores Microsoft Azure Cosmos DB's features and benefits for data scientists, focusing on its schemaless design and suitability for IoT and modern applications.
Explains four levels of customer targeting, from no segmentation to advanced recommendation systems, and their business applications.
A critique of traditional statistics education, arguing for a more data-driven, question-focused approach using modern tools.