Speaker Bio & Abstract
Public Affairs Centre
India
Biography
Asha Subramanian heads the Centre for Open Data Research at Public Affairs Centre, where she applies data science to promote data-empowered decisions towards good governance. She has a rich Information Technology industry background, with over two decades of program management and delivery experience. She holds a Ph.D. in Data Science from the International Institute of Information Technology, Bangalore, and a Master's in Statistics from the Indian Statistical Institute, Calcutta.

Abstract
Asha Subramanian, Pavan Kumar RR, Manikanta Vikkurthi, Mohanapreethi Attuluri | Public Affairs Centre, India

Governance data published by the Government of India typically contains geospatial locations, temporal details such as the reporting period, and contextual information. Examples include daily prices of essential commodities in cities and health indicators such as the infant mortality rate (IMR) and maternal mortality ratio (MMR) recorded at specific locations. The challenge is to make information from such disparate datasets across various sectors easily accessible, understandable and actionable, so that government is empowered to make data-enabled decisions. We introduce two knowledge integration products: SIDDHI (Semantic Innovation and Harmonisation of Sustainable Development Data in India) and DRSHTI (Data Integration and Insights powered by Semantics). SIDDHI builds a knowledge graph from the diverse datasets using geospatial ontologies (GeoNames.org), the UN Environment endorsed Sustainable Development Goals Interface Ontology (SDGIO) and custom ontologies built specifically for the Indian context. We use machine learning and information theory techniques to extract actionable knowledge from the raw datasets and create a knowledge graph organised around spatial, temporal and contextual themes. SIDDHI provides a powerful query and visualisation engine to retrieve and view comprehensive information from related datasets for a given spatial location.
DRSHTI exploits this knowledge graph to derive insights on related data across geospatial entities and associated temporal and contextual themes.
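
The idea of linking observations from different datasets through shared spatial, temporal and contextual themes can be illustrated with a minimal sketch. This is not the SIDDHI or DRSHTI implementation, whose internals the abstract does not describe. The predicate names (`fromDataset`, `locatedIn`, `reportedFor`, `measures`, `hasValue`), the function names and the sample rows are all hypothetical, and the values are made-up placeholders rather than real government data. A production system would presumably use an RDF store and the ontologies named above instead of plain Python tuples.

```python
from collections import defaultdict

# Hypothetical sample rows: (dataset, place, period, indicator, value).
# The values are placeholders for illustration, not real statistics.
ROWS = [
    ("commodity_prices", "Bengaluru", "2018-01", "rice_price_inr_per_kg", 42.0),
    ("health_indicators", "Bengaluru", "2018", "IMR", 10.0),
    ("commodity_prices", "Chennai", "2018-01", "rice_price_inr_per_kg", 40.0),
]


def build_graph(rows):
    """Turn each row into an observation node with one triple per theme."""
    triples = set()
    for i, (dataset, place, period, indicator, value) in enumerate(rows):
        obs = f"obs/{i}"  # hypothetical node identifier
        triples.update({
            (obs, "fromDataset", dataset),   # provenance
            (obs, "locatedIn", place),       # spatial theme
            (obs, "reportedFor", period),    # temporal theme
            (obs, "measures", indicator),    # contextual theme
            (obs, "hasValue", value),
        })
    return triples


def describe_location(triples, place):
    """Collect every observation linked to `place`, across all datasets."""
    by_subject = defaultdict(dict)
    for subject, predicate, obj in triples:
        by_subject[subject][predicate] = obj
    matches = [p for p in by_subject.values() if p.get("locatedIn") == place]
    return sorted(matches, key=lambda p: (p["fromDataset"], p["reportedFor"]))


graph = build_graph(ROWS)
for observation in describe_location(graph, "Bengaluru"):
    print(observation)
```

Because every observation carries its location as an explicit edge, a single lookup by place returns records from both the price dataset and the health dataset, which is the kind of cross-dataset retrieval the abstract attributes to the query engine.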