Speaker Bio & Abstract

Speaker
Hazem Mahmoud, Data Solution Architect
Cloudera Foundation

Biography

Hazem Mahmoud is a Data Solution Architect at Cloudera Foundation, helping NGOs realize the impact of using data for social good. Previously, Hazem was a Solution Architect and Support Engineer with Cloudera, helping Cloudera's largest customer architect their Big Data solutions and address the challenges they encountered. Hazem has over 17 years of experience working in various industry sectors (finance, marketing, telco, technology, and non-profits). He holds a Master's in Management of Information Systems, a Bachelor's in Electrical Engineering, and numerous technical certifications.

Abstract

Using Open Source Tools for Processing and Analytics of Large Spatial Datasets

Cloudera Foundation works with several non-profit organizations to help them use their data to accomplish their mission of bringing about positive impact on the world. One of the grantees in that effort is AidData, a research lab at the College of William & Mary. Researchers at AidData developed a tool, GeoQuery, that is used by aid agencies (e.g., USAID), research organizations, and universities to perform sub-national impact evaluations of aid projects all over the world. AidData uses a highly curated set of geospatial datasets, along with various analytical methods (such as zonal statistics) and machine learning algorithms, to support these impact evaluation efforts. AidData, together with Cloudera Foundation data solution architects, has been migrating to a set of open source geospatial tools (GeoTrellis and RasterFrames) and to an open source analytics engine (Spark) that distributes the analytics over a cluster of compute and storage servers. This migration is still in progress.