Hyderabad-Based Startup Built Largest-Ever Big Data Electoral Repository

I went through an email from a friend saying that they are building a Hadoop project to analyze voter data. In my view, this project is both academic and research-oriented.
The real challenge was extracting voter information from 2.5 crore (25 million) PDF pages and translating it into English so that it could be fused with other sources. The technology was a big hurdle.

Hadoop Project

The infrastructure, built especially for the project, included a 64-node Hadoop cluster, PostgreSQL, and servers that process a master file containing over 8 terabytes of data.

Testing and validation was another big task. ‘First-of-a-kind’ heuristic (machine learning) algorithms were developed to classify people based on name, geography, and other attributes, which helps identify religion, caste, and even ethnicity.

Data from Sources

“Data from multiple sources, such as the census and economic and social surveys, was mapped to polling booths. Simultaneously, external and proprietary data sources had to be fused with individual voters’ data,” said Joshi.
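
The fusion step described in the quote can be pictured as a join between booth-level aggregates and individual voter records. The following is only a minimal in-memory sketch in Python, not the project's actual pipeline, which ran on Hadoop and PostgreSQL. All field names (`booth_id`, `literacy_rate`, `households`, `voter_id`) and all values are hypothetical and exist purely for illustration.

```python
# Hypothetical booth-level aggregates, e.g. derived from census or survey data.
# Keys and values are made up for illustration only.
booth_stats = {
    "B001": {"literacy_rate": 0.71, "households": 420},
    "B002": {"literacy_rate": 0.64, "households": 385},
}

# Hypothetical voter records; every field name here is an assumption.
voters = [
    {"voter_id": "V1", "booth_id": "B001"},
    {"voter_id": "V2", "booth_id": "B002"},
    {"voter_id": "V3", "booth_id": "B001"},
    {"voter_id": "V4", "booth_id": "B999"},  # no booth-level data available
]


def fuse(voters, booth_stats):
    """Left-join each voter record with the statistics of its polling booth.

    Voters whose booth has no statistics are kept unchanged, so no record
    is silently dropped during fusion.
    """
    fused = []
    for voter in voters:
        stats = booth_stats.get(voter["booth_id"], {})
        fused.append({**voter, **stats})
    return fused


fused = fuse(voters, booth_stats)
for row in fused:
    print(row)
```

Keeping unmatched voters (a left join rather than an inner join) is a design choice made here so that gaps in the external sources stay visible instead of shrinking the voter list.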


