Accelerating ETL on KubeFlow with RAPIDS
A guide to using RAPIDS to accelerate ETL and data processing workflows within a KubeFlow environment by leveraging GPUs.
A technical guide on running RSQL for Redshift within an AWS Fargate container, including setup, configuration, and containerization steps.
Explains the differences between batch and streaming data processing, covering OLTP, OLAP, and ETL concepts for data engineers.
Analysis of an AWS serverless ETL pattern using EventBridge, Lambda, Fargate, and S3 to process CSV files into DynamoDB.
A former Application DBA shares advanced SQL and database optimization techniques for developers, focusing on performance and efficiency.
Explains why Apache Airflow jobs appear to run a day late due to its scheduling logic, contrasting it with cron jobs.
Explores a new feature in SQL Server 2019's SET STATISTICS IO output, revealing detailed I/O metrics for INSERT operations into target tables.
A technical deep dive into solving PostgreSQL disk space issues by optimizing a deduplication query, focusing on reducing sort key size.
Announcing a workshop on optimizing ETL processes for SQL Server and Azure SQL at the SQLBits 2019 conference.
A case study on using Elasticsearch as the primary data store for a large e-commerce platform's search ETL pipeline, replacing legacy Oracle systems.
A technical guide on fixing timestamp corruption in CSV data using pandas and uploading the corrected data to OmniSci using pymapd.
A technical guide on using R and PostgreSQL to load and manage large-scale Adobe Analytics Clickstream Data Feed into a relational database.
A tutorial on building data pipelines using Microsoft Azure Data Factory, covering ingestion, transformation, and orchestration.
A technical guide on loading detailed AWS billing data into a SQL Server data warehouse for advanced cost analysis.
Argues against using Oracle's automatic optimizer statistics collection in data warehouses, advocating for manual stats management as part of ETL processes.
SQL queries to analyze and identify performance bottlenecks in Oracle Data Integrator (ODI) batch jobs with many tasks.
Oracle's performance recommendations for OBIA 7.9.6, focusing on ETL improvements and hardware sizing.
A developer shares initial frustrations and technical details while upgrading Oracle BI Applications from version 7.9.5 to 7.9.6.
Explains Oracle Business Intelligence Applications (OBIA) and its relationship to OBIEE, addressing common confusion in the Oracle BI toolset.
A technical walkthrough of configuring and troubleshooting Mark Rittman's OBIEE repository for querying DAC metadata, including RPD modifications and connection issues.