Thomas Lumley 3/5/2018

Faster generalised linear models in largeish data

This technical article describes a way to speed up fitting generalised linear models (GLMs) on large datasets. It proposes fitting the model on a subsample to obtain a starting estimate, then taking one Newton-Raphson step over the full data. That step can be computed with a single database query, and the resulting estimator is asymptotically efficient. The approach aims to be faster than iterative methods such as `bigglm` in R, especially when the data resides in a database. The article includes a practical example: a logistic regression on a vehicle dataset.
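The two-stage idea can be sketched as follows. This is a minimal illustration under several assumptions, not the article's own code: it uses Python rather than R, simulated data with a single predictor, and an in-memory loop standing in for the database aggregation query. All function names, the subsample size of 2000, and the simulation parameters are hypothetical choices made for the example.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def score_and_information(rows, beta):
    """One pass over (x, y) rows, accumulating the score vector and the
    Fisher information for an intercept-plus-slope logistic model.
    These are plain sums, the kind a single aggregate query could return."""
    b0, b1 = beta
    u0 = u1 = i00 = i01 = i11 = 0.0
    for x, y in rows:
        mu = sigmoid(b0 + b1 * x)
        w = mu * (1.0 - mu)
        u0 += y - mu
        u1 += (y - mu) * x
        i00 += w
        i01 += w * x
        i11 += w * x * x
    return (u0, u1), (i00, i01, i11)


def newton_step(rows, beta):
    """One Newton-Raphson update: beta + I^{-1} U, with the 2x2 inverse written out."""
    (u0, u1), (i00, i01, i11) = score_and_information(rows, beta)
    det = i00 * i11 - i01 * i01
    d0 = (i11 * u0 - i01 * u1) / det
    d1 = (i00 * u1 - i01 * u0) / det
    return (beta[0] + d0, beta[1] + d1)


def fit_mle(rows, start=(0.0, 0.0), max_iter=25, tol=1e-10):
    """Iterate Newton-Raphson until the largest coefficient change is below tol."""
    beta = start
    for _ in range(max_iter):
        new = newton_step(rows, beta)
        if max(abs(new[0] - beta[0]), abs(new[1] - beta[1])) < tol:
            return new
        beta = new
    return beta


def one_step_fit(rows, subsample_size, seed=0):
    """Fit fully on a random subsample, then take a single step on all rows."""
    sub = random.Random(seed).sample(rows, subsample_size)
    start = fit_mle(sub)
    return newton_step(rows, start)


def simulate(n, b0=-0.5, b1=1.0, seed=1):
    """Simulated (x, y) pairs from a logistic model; illustration only."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = 1 if rng.random() < sigmoid(b0 + b1 * x) else 0
        rows.append((x, y))
    return rows


rows = simulate(20000)
print("one-step:", one_step_fit(rows, subsample_size=2000))
print("full MLE:", fit_mle(rows))
```

In this sketch only `newton_step` on the full data touches every row, and it does so exactly once. Because the quantities it needs are simple sums over rows, the same step could in principle be expressed as one `SUM(...)` aggregation in SQL, with the starting estimate passed in as constants.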
