Not Your Father's Cloud: Microsoft Azure HDInsights Explained
An explanation of Microsoft Azure HDInsight, a managed Apache Hadoop service for processing big data on Azure.
A tutorial on handling dates and times in R, covering essential classes like Date and POSIXct, formatting, calculations, and sequences.
RSiteCatalyst v1.3 adds regex search, Realtime API support, and configurable request timing for the Adobe Analytics R package.
Final tutorial on analyzing airline data with Hadoop using Hive for SQL queries and Pig for scripting, covering setup and basic analytics.
A developer's side project to analyze PyPI download logs, extracting insights about Python versions, installers, and operating systems used by package consumers.
A developer shares their journey learning Python, including recommended courses, books, and IDEs, and their decision to take a university course.
Exploring function pointers in IDL (Interactive Data Language) for refactoring legacy scientific code, with insights into the language's syntax and quirks.
Authors respond to critique of their computational linguistics paper on analyzing movie characters, discussing interdisciplinary research methods.
A comprehensive, curated list of Python programming resources for all skill levels, covering tutorials, libraries, frameworks, and best practices.
RSiteCatalyst 1.1 released with new API features, faster calls, and extended timeout for Adobe Analytics data in R.
Explains how Bayesian A/B testing improves online headline optimization, overcoming challenges of traditional frequentist methods for faster, more accurate results.
A critique of common pitfalls and unproductive patterns in statistics research presentations, aimed at improving academic discourse.
Explores the concept of 'error' in regression models, clarifying when it represents measurement error versus model prediction error.
A summary of upcoming technical talks on statistical computing, rare DNA variant analysis, and handling large datasets with R and SQL.
A guide to installing and using R on Amazon EC2 instances to overcome in-memory limitations for big data analysis.
A critique of a New York Times article's explanation of p-values, clarifying common statistical misinterpretations for a non-technical audience.
A presentation and tutorial on using the `plyr` package in R for data manipulation, summarization, and automated statistical analysis.
A tutorial on using R and the Google Analytics API to analyze and visualize '(not provided)' organic search data.
A tutorial video demonstrating how to execute SQL queries within the R programming language using the 'sqldf' package for data analysis.
A technical presentation on using R to create and analyze stochastic, age-structured matrix population models for ecological simulations.