Strata x Hadoop 2016 - How Lazada Ranks Products
A presentation on Lazada's machine learning framework for ranking products in catalog and search results to improve user experience.
Fixing Spark SQL's 'No input paths specified' error when reading JSON files by using the correct file:// or hdfs:// URI prefix.
Fixing a CDH installation failure on LXC/Proxmox caused by an erroneous SwapFree value in /proc/meminfo when swap is disabled.
Explains Lambda Architecture for Big Data, combining batch processing (Hadoop) and real-time stream processing (Spark, Storm) to handle large datasets.
A technical guide on using Julia to integrate data from Hadoop and Teradata Aster for visualization, demonstrating its role as a 'glue' language.
A data engineer shares five practical lessons and performance tips for working with Apache Hive, focusing on common pitfalls and optimizations.
Fixing MongoDB Connector for Hadoop authentication errors by granting the clusterManager role to the user.
A tutorial on connecting to Apache Hive using the open-source SQL Workbench tool via JDBC, covering driver setup and connection configuration.
An explanation of Microsoft Azure HDInsight, a managed Apache Hadoop service for processing big data on Azure.
The final tutorial in a series on analyzing airline data with Hadoop, using Hive for SQL queries and Pig for scripting, covering setup and basic analytics.
A tutorial on using Apache Hive to create tables and views from data loaded into a Hadoop cluster, continuing a multi-part series.
Tutorial on loading data into Hadoop's HDFS using the Hue File Browser interface and the Airline Dataset.
A tutorial on installing and configuring an 18-node Hadoop cluster on Amazon EC2 using Cloudera Manager.
A practical guide introducing Hadoop's ecosystem and setting up a proof-of-concept cluster on Amazon EC2 using Cloudera for big data processing.