17 May 2014

Hyderabad-Based Startup Built Largest-Ever Big Data Electoral Repository

#The largest big data repository:
The real challenge was extracting voter information from 2.5 crore (25 million) PDF pages and translating it into English so that it could be fused with other sources. Technology was a big hurdle.


The infrastructure, built especially for the project, included a 64-node Hadoop cluster, PostgreSQL and servers that process a master file containing over 8 terabytes of data. Testing and validation was another big task. ‘First of a kind’ heuristic (machine learning) algorithms were developed to classify people based on name, geography and other attributes, which helps in identifying religion, caste and even ethnicity.

“Data from multiple sources like census, economic and social surveys were mapped to polling booths. Simultaneously, external and proprietary data sources had to be fused with individual voters’ data,” said Joshi.
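
To make the idea of "fusing" booth-level data with individual voter records a little more concrete, here is a minimal Python sketch of a join on a polling booth identifier. It is only an illustration, not the startup's actual pipeline: the field names (`booth_id`, `literacy_rate`, `income_band`), the function `fuse_voters_with_booths` and all sample values are made up for this example.

```python
# Hypothetical booth-level attributes, e.g. derived from census or survey data.
booth_stats = {
    "B001": {"literacy_rate": 0.71, "income_band": "low"},
    "B002": {"literacy_rate": 0.84, "income_band": "middle"},
}

# Hypothetical voter records, one per individual extracted from the rolls.
voters = [
    {"voter_id": "V1", "booth_id": "B001"},
    {"voter_id": "V2", "booth_id": "B002"},
    {"voter_id": "V3", "booth_id": "B999"},  # booth missing from booth_stats
]


def fuse_voters_with_booths(voters, booth_stats):
    """Attach booth-level attributes to each voter record via booth_id.

    Returns the fused records plus the IDs of voters whose booth had no
    match, so that gaps in the mapping can be checked during validation.
    """
    fused, unmatched = [], []
    for voter in voters:
        stats = booth_stats.get(voter["booth_id"])
        if stats is None:
            unmatched.append(voter["voter_id"])
            continue
        # Merge the two dictionaries into one combined record.
        fused.append({**voter, **stats})
    return fused, unmatched


fused, unmatched = fuse_voters_with_booths(voters, booth_stats)
```

Keeping the unmatched IDs separate rather than silently dropping them is a deliberate choice in this sketch, since the article names testing and validation as a major task in its own right.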



© 2010-2017 Biganalytics.me. All rights reserved.
