Working With Messy Data Using Pandas in Python
A guide to cleaning and processing messy CSV data using Python's Pandas library, including reading files and assigning custom headers.
A tutorial on using pandas and regex to conditionally populate missing columns in a CSV file based on data from another column.
A tutorial on using Python's pandas library to clean CSV data and export it to JSON format for data layer integration.
Compares the runtime performance of pandas' crosstab, groupby, and pivot_table methods for data aggregation.
A technical guide on fixing timestamp corruption in CSV data using pandas and uploading the corrected data to OmniSci using pymapd.
A technical analysis of Stack Overflow's 2018 survey data, visualizing global developer response rates per capita using Python, pandas, and GeoPandas.
Analysis of the 2018 Stack Overflow Developer Survey results, ranking technologies developers worked with and want to work with.
A technical tutorial on using Python, pandas, and geospatial data to create a world map visualizing the origins of metal bands from a dataset.
A PyCon US 2018 talk on Python application monitoring basics, covering terminology, metrics, and integration using pandas.
A technical tutorial on using Python and pandas to process electricity data and load it into OmniSci (formerly MapD) for dashboard creation.
Explores implementing group-by operations from scratch in Python, comparing performance of Pandas, NumPy, and SciPy for data aggregation.
A technical guide on analyzing personal Google Location History data using Python, Pandas, and visualization libraries to map and gain insights from location data.
A tutorial on analyzing Seattle's Pronto CycleShare data using Python, Pandas, and the PyData stack for data science.
A tutorial on using Dask for out-of-core data analysis with a large OpenStreetMap dataset, demonstrating scalable Python data manipulation.
Using Python and unsupervised machine learning to analyze Seattle bicycle count data and uncover insights about commuters' work habits.