Posts

Showing posts from March, 2016

Featured Post

SQL Interview Success: Unlocking the Top 5 Frequently Asked Queries

Here are the top five SQL queries commonly asked in interviews. You can expect these in Data Analyst or Data Engineer interviews.

Top SQL Queries for Interviews

01. Joins

The commonly asked question gives you two tables and asks how many rows each join type returns, and what the result looks like.

Table1        Table2
------        ------
id            id
--            --
1             1
1             3
2             1
3             NULL

Output: an inner join on id returns 5 rows (each 1 in Table1 matches both 1s in Table2, 3 matches 3, and NULL never matches):

1   1
1   1
1   1
1   1
3   3

02. Substring and Concat

Here, we need to write an SQL query that upper-cases the first letter of each name and lower-cases the rest.

Table1
------
ename
-----
raJu
venKat
kRIshna

Solution:

SELECT CONCAT(UPPER(SUBSTRING(ename, 1, 1)), LOWER(SUBSTRING(ename, 2))) AS capitalized_name FROM Table1;

03. Case statement

SQL Query:

SELECT Code1, Code2, CASE WHEN Code1 = 'A' AND Code2 = 'AA' THEN "A" | "A
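A quick way to verify the join answer is to run it yourself. The sketch below uses Python's built-in sqlite3 in place of MySQL, so the capitalisation query uses SQLite's || and SUBSTR instead of CONCAT and SUBSTRING; table and column names follow the interview example.

```python
import sqlite3

# In-memory database with the two interview tables.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

cur.execute("CREATE TABLE Table1 (id INTEGER)")
cur.executemany("INSERT INTO Table1 VALUES (?)", [(1,), (1,), (2,), (3,)])
cur.execute("CREATE TABLE Table2 (id INTEGER)")
cur.executemany("INSERT INTO Table2 VALUES (?)", [(1,), (3,), (1,), (None,)])

# Inner join: the two 1s in Table1 each match the two 1s in Table2
# (2 * 2 = 4 rows), plus one 3-3 match = 5 rows. NULL never matches.
rows = cur.execute(
    "SELECT t1.id, t2.id FROM Table1 t1 INNER JOIN Table2 t2 ON t1.id = t2.id"
).fetchall()
print(len(rows))  # 5

# Question 02, rewritten for SQLite: || concatenates, SUBSTR slices.
cur.execute("CREATE TABLE Emp (ename TEXT)")
cur.executemany("INSERT INTO Emp VALUES (?)",
                [("raJu",), ("venKat",), ("kRIshna",)])
names = [r[0] for r in cur.execute(
    "SELECT UPPER(SUBSTR(ename, 1, 1)) || LOWER(SUBSTR(ename, 2)) FROM Emp"
)]
print(names)  # ['Raju', 'Venkat', 'Krishna']
```

The same logic ports directly back to MySQL by swapping || for CONCAT and SUBSTR for SUBSTRING.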

3 Top IT Skills Every New IT Professional Must Learn to Progress in a Software Career

What skills do new IT professionals and job seekers need to help an organisation transition to IT-as-a-Service? To lead their organisations to the cloud, IT professionals must focus on three fundamental areas:

Core Virtualisation Skill Sets

IT professionals must think and operate in the virtual world. They can no longer be tied to the old paradigm of physical assets dedicated to specific users or applications. They must think in terms of "services" riding on top of a fully virtualised infrastructure, and how applications will take advantage of shared server and storage resources. This requires comprehensive skills in both server and storage virtualisation technology, and enough experience as a practitioner to understand the intricacies and critical elements of managing virtual platforms.

Rules of Old IT and New IT: Cross-training Competency

Leaders of IT innovation cannot be completely siloed and hyper-focused. Although there will

Linux Certification: Must-Read Course Contents

The complete syllabus for the Linux certification course, which you need to know before you start preparing for the test.

List of Course Contents

- The Linux community and a career in open source
- Finding your way on a Linux system
- The power of the command line
- The Linux operating system
- Security and file permissions

Topic 1: The Linux Community and a Career in Open Source (weight: 7)

1.1 Linux Evolution and Popular Operating Systems (weight: 2)

Description: Knowledge of Linux development and major distributions.

Key Knowledge Areas: Open Source Philosophy; Distributions; Embedded Systems.

The following is a partial list of the used files, terms and utilities: Android; Debian, Ubuntu (LTS); CentOS, openSUSE, Red Hat; Linux Mint, Scientific Linux.

1.2 Major Open Source Applications (weight: 2)

Description: Awareness of major applications as well as their uses and development.

Key Knowledge Areas: Desktop Applications; Server Applications; Development Languages; Package Management

AWS Certified Developer: Eligibility Criteria

The complete eligibility criteria are as follows:

- One or more years of hands-on experience designing and maintaining an AWS-based application.
- In-depth knowledge of at least one high-level programming language.
- Understanding of core AWS services, uses, and basic architecture best practices.
- ...
- Proficiency in designing, developing, and deploying cloud-based solutions using AWS.
- Experience with developing and maintaining applications written for Amazon Simple Storage Service, Amazon DynamoDB, Amazon Simple Queue Service, Amazon Simple Notification Service, Amazon Simple Workflow Service, AWS Elastic Beanstalk, and AWS CloudFormation.

Related: AWS Basics for Software Engineer

Requirements for the Developer Exam

- Professional experience using AWS technology
- Hands-on experience programming with AWS APIs
- Understanding of AWS Security best practices
- Understanding of automation and AWS deployment tools
- Understanding of storage options and their underlying consistency models
- Excellent

Unix: How to Write a Shell Script Using the vi Editor

Photo credit: Stockphotos.io

When you log in to UNIX, you start in the home directory:

$/home:

Then you can change to your own directory:

$/home: cd jthomas
$/home/jthomas:

How to write your first script:

$/home/jthomas: vi test.sh

Here, you can write your script. The first line in the script is #!/bin/ksh, which denotes which shell you are going to use.

Example:

$ vi test.sh
#!/bin/ksh
###################################################
# Written By: Jason Thomas
# Purpose: This script was written to show users
#          how to develop their first script
###################################################
# Denotes a comment
root daemon bin sys adm uucp nobody lpd

How to run a script:

$ sh test.sh

Also read: The complete list of UNIX basic commands
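The create-then-run workflow above (write test.sh, then execute it with sh) can be exercised programmatically. This Python sketch writes a hypothetical script with the same shebang-and-comments structure to a temporary file and runs it with sh; the echo line is added here so the script produces visible output.

```python
import os
import subprocess
import tempfile

# A hypothetical first script, mirroring the article's layout:
# shebang, a comment header block, then commands.
script = """#!/bin/ksh
###################################################
# Written By: Jason Thomas
# Purpose: This script was written to show users
#          how to develop their first script
###################################################
# Denotes a comment
echo "Hello from my first script"
"""

# Write test.sh to a temporary directory, then run it with sh,
# exactly as the article does with `$ sh test.sh`.
with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "test.sh")
    with open(path, "w") as f:
        f.write(script)
    out = subprocess.run(["sh", path], capture_output=True, text=True).stdout

print(out.strip())  # Hello from my first script
```

Because the script is invoked via sh, the #!/bin/ksh shebang is treated as a comment; it only takes effect when the file is made executable and run directly.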

11 Top PIG Interview Questions

Here are the top PIG interview questions, useful for your projects and interviews.

1) What is PIG?

PIG is a platform for analyzing large data sets. It consists of a high-level language for expressing data analysis programs, coupled with infrastructure for evaluating those programs. PIG's infrastructure layer consists of a compiler that produces a sequence of MapReduce programs.

2) What is the difference between logical and physical plans?

Pig goes through several steps when a Pig Latin script is converted into MapReduce jobs. After performing basic parsing and semantic checking, it produces a logical plan. The logical plan describes the logical operators that Pig must execute. After this, Pig produces a physical plan. The physical plan describes the physical operators needed to execute the script.

3) Does 'ILLUSTRATE' run an MR job?

No, ILLUSTRATE does not run any MapReduce job; it pulls from internal sample data. On the console, ILLUSTRATE will

How to Understand AWS CloudFormation Easily

AWS CloudFormation is a service that helps you model and set up your Amazon Web Services resources, so you can spend less time managing those resources and more time focusing on the applications that run in AWS. You create a template that describes all the AWS resources you want (such as Amazon EC2 instances or Amazon RDS DB instances), and AWS CloudFormation provisions and configures those resources for you. You don't need to individually create and configure AWS resources and figure out what depends on what; AWS CloudFormation handles all of that.

Managing Infrastructure

For a scalable web application that also includes a back-end database, you might use an Auto Scaling group, an Elastic Load Balancing load balancer, and an Amazon Relational Database Service database instance. Normally, you might use each individual service to provision these resources, and after you create the resources, you would have to configure them to work together. All these tasks can a
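To make "a template that describes all the AWS resources you want" concrete, here is a minimal sketch of a CloudFormation template built as a Python dictionary and serialized to JSON. The top-level keys (AWSTemplateFormatVersion, Description, Resources) and the AWS::EC2::Instance resource type are standard CloudFormation; the logical name WebServer and the AMI ID are placeholders for illustration.

```python
import json

# A minimal, hypothetical CloudFormation template: one EC2 instance.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "A single EC2 instance managed by CloudFormation",
    "Resources": {
        # "WebServer" is the logical ID you choose; CloudFormation uses it
        # to track the resource and to wire up dependencies between resources.
        "WebServer": {
            "Type": "AWS::EC2::Instance",
            "Properties": {
                "InstanceType": "t2.micro",
                "ImageId": "ami-xxxxxxxx",  # placeholder AMI ID
            },
        }
    },
}

body = json.dumps(template, indent=2)
parsed = json.loads(body)
print(parsed["Resources"]["WebServer"]["Type"])  # AWS::EC2::Instance
```

In practice you would hand this JSON body to CloudFormation (for example via the console or the create-stack CLI command), and the service would provision the instance and track it as part of the stack.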

AWS EMR vs. Hadoop: 5 Top Differences

With Amazon Elastic MapReduce (Amazon EMR), you can analyze and process vast amounts of data. It distributes the computational work across a cluster of virtual servers running in the Amazon cloud, managed by Hadoop, an open-source framework.

Amazon EMR (Elastic MapReduce): The Unique Features

Amazon EMR has made enhancements to Hadoop and other open-source applications to work seamlessly with AWS. For instance, Hadoop clusters running on Amazon EMR use EC2 instances as virtual Linux servers for the master and slave nodes, Amazon S3 for bulk storage of input and output data, and CloudWatch to monitor cluster performance and raise alarms. You can also move data into and out of DynamoDB using Amazon EMR and Hive. All of this is orchestrated by Amazon EMR control software that launches and manages the Hadoop cluster. This is called an Amazon EMR cluster.

What does Hadoop do?

Hadoop uses a distributed processing architecture called MapReduce, in which a task maps to a set of servers f
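The MapReduce model that Hadoop (and therefore EMR) distributes across a cluster can be sketched in a few lines of single-process Python: a map phase emits key-value pairs, a shuffle groups them by key, and a reduce phase aggregates each group. The word-count task and input lines below are the classic teaching example, not anything EMR-specific.

```python
from collections import defaultdict
from itertools import chain

def map_phase(line):
    # Map: emit a (word, 1) pair for every word in the input line.
    return [(word, 1) for word in line.split()]

def shuffle(pairs):
    # Shuffle: group all values by key, as the framework does between phases.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: aggregate each key's values into a final count.
    return {key: sum(values) for key, values in groups.items()}

lines = [
    "hadoop maps tasks to servers",
    "servers run map and reduce tasks",
]
pairs = chain.from_iterable(map_phase(line) for line in lines)
counts = reduce_phase(shuffle(pairs))
print(counts["tasks"], counts["servers"])  # 2 2
```

On a real cluster, the map and reduce functions run in parallel on many nodes and the shuffle moves data over the network; the logic per record is the same.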

AWS Elastic Compute Cloud (EC2): Top Tutorial

Amazon Elastic Compute Cloud (Amazon EC2) is a web service that provides resizable compute capacity in the cloud. It is designed to make web-scale cloud computing easier for developers. Amazon EC2's simple web service interface allows you to obtain and configure capacity with minimal friction.

Photo Credit: Srini

What is Elastic Cloud: Functions of EC2

It provides you with complete control of your computing resources and lets you run on Amazon's proven computing environment. Amazon EC2 reduces the time required to obtain and boot new server instances to minutes, allowing you to quickly scale capacity, both up and down, as your computing requirements change. Amazon EC2 changes the economics of computing: you pay only for the capacity you actually use. Amazon EC2 provides developers the tools to build failure-resilient applications and isolate themselves from common failure scenarios.

Key Points for Interviews

Learn these basic points on EC2 for your next interview

Everything You Need To Know About Amazon Machine Image

Essentially a virtual machine image or snapshot: "An Amazon Machine Image (AMI) is a special type of pre-configured operating system and virtual application software which is used to create a virtual machine within the Amazon Elastic Compute Cloud (EC2). It serves as the basic unit of deployment for services delivered using EC2."

AWS supports the following virtual image types for import/export: VMware ESX VMDK images, VMware ESX OVA (export only), Citrix Xen VHD images, and Microsoft Hyper-V VHD images.

Life Cycle of an AMI

In summary, the AMI lifecycle works as follows: after you create and register an AMI, you can use it to launch new instances. (You can also launch instances from an AMI if the AMI owner grants you launch permissions.) You can copy an AMI within the same region or to different regions. When you are finished launching instances from an AMI, you can deregister it.

Why You Need to Master MySQL for Data Analytics Jobs

Before you can start analysing data, you actually have to have some data on hand. That means a database, preferably a relational one. If you had your sights set on a non-relational, NoSQL database solution, you might want to step back and catch your breath. NoSQL databases are unique because of their independence from the Structured Query Language (SQL) found in relational databases. Relational databases all use SQL as the domain-specific language for ad hoc queries, whereas non-relational databases have no such standard query language, so they can use whatever they want, including SQL. Non-relational databases also have their own APIs designed for maximum scalability and flexibility.

When Do You Need to Learn NoSQL Databases?

NoSQL databases are typically designed to excel in two specific areas: speed and scalability. But for the purposes of learning about data concepts and analysis, such super-powerful tools are pretty much overkill. In other words, you
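The "ad hoc queries" point is the heart of the argument for SQL in analytics: one declarative statement answers a question you didn't plan for when you loaded the data. A small sketch using Python's built-in sqlite3 (standing in for MySQL, with made-up sales rows) shows the pattern; the same GROUP BY query runs unchanged on MySQL.

```python
import sqlite3

# Hypothetical sales data, loaded into an in-memory relational table.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE sales (region TEXT, amount REAL)")
cur.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 100.0), ("west", 250.0), ("east", 50.0)],
)

# An ad hoc analytical question: total sales per region.
totals = dict(cur.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region"
).fetchall())
print(totals)  # {'east': 150.0, 'west': 250.0}
```

With a NoSQL store you would typically write application code (or learn a store-specific query API) to get the same aggregate, which is why SQL remains the default skill for analytics roles.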

AWS Cloud Computing Tutorial for Beginners

A complete tutorial for beginners on AWS, including AWS security features.

1. What you can do with AWS

- Store public or private data.
- Host a static website. These websites use client-side technologies (such as HTML, CSS, and JavaScript) to display content that doesn't change frequently. A static website doesn't require server-side technologies (such as PHP and ASP.NET).
- Host a dynamic website, or web app. These websites include classic three-tier applications, with web, application, and database tiers.
- Support students or online training programs.
- Process business and scientific data.
- Handle peak loads.

2. AWS (Amazon Web Services) features

AWS Management Console: A web interface. To get started, see Getting Started with the AWS Management Console.

AWS Command Line Interface (AWS CLI): Commands for a broad set of AWS products. To get started, see the AWS Command Line Interface User Guide.

Command Line Tools: Commands for individual AWS products. For more information

A Quick Guide to Amazon Aurora (Amazon RDS)

Amazon Aurora is a MySQL-compatible relational database management system (RDBMS) that combines the speed and availability of high-end commercial databases with the simplicity and cost-effectiveness of open-source databases. It provides up to 5x the performance of MySQL at one-tenth the cost of a commercial database. Amazon Aurora allows you to encrypt data at rest as well as in transit for your mission-critical workloads.

Key Points on Amazon Aurora

- Amazon Aurora is a relational database engine that combines the speed and reliability of high-end commercial databases with the simplicity and cost-effectiveness of open-source databases. It delivers up to five times the throughput of standard MySQL running on the same hardware.
- Amazon Aurora is designed to be compatible with MySQL 5.6, so existing MySQL applications and tools can run without requiring modification.
- Amazon Aurora joins MySQL, Oracle, Microsoft SQL Server, and PostgreSQL as the fifth database engine avail

How Hadoop Is Better for Legacy Data

Here is an interview question on legacy data. A lot of data lives on legacy systems, and you can use Hadoop to process that data for useful insights.

1. How should we be thinking about migrating data from legacy systems?

Treat legacy data as you would any other complex data type. HDFS acts as an active archive, enabling you to cost-effectively store data in any form for as long as you like and access it whenever you wish to explore it. And with the latest generation of data wrangling and ETL tools, you can transform, enrich, and blend that legacy data with other, newer data types to gain a unique perspective on what's happening across your business.

2. What are your thoughts on getting combined insights from the existing data warehouse and Hadoop?

Typically, one of the starter use cases for moving relational data off a warehouse and into Hadoop is active archiving. This is the opportunity to take data that might otherwise have gone to the archive and keep it av

JSON Material to Download Now

JSON, or JavaScript Object Notation, is a lightweight, text-based open standard designed for human-readable data interchange. The conventions used by JSON are familiar to programmers of languages including C, C++, Java, Python, and Perl.

Photo Credit: Srini

Key facts about JSON:

- JSON stands for JavaScript Object Notation.
- The format was specified by Douglas Crockford.
- It was designed for human-readable data interchange.
- JSON is derived from the JavaScript scripting language.
- The JSON filename extension is .json.
- The JSON Internet media type is application/json.
- The Uniform Type Identifier is public.json.

JSON Quick Guide to Download

JSON Quick Guide: download here
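Because JSON's conventions are shared across languages, every mainstream language ships a JSON library. A short round-trip with Python's standard json module shows the interchange in both directions; the record contents are made up for illustration.

```python
import json

# A sample record; keys and values are illustrative only.
record = {
    "format": "JSON",
    "specified_by": "Douglas Crockford",
    "extension": ".json",
    "media_type": "application/json",
}

text = json.dumps(record)   # Python dict -> JSON text (serialization)
parsed = json.loads(text)   # JSON text -> Python dict (parsing)

print(parsed["media_type"])  # application/json
```

The serialized text is what travels over the wire with the application/json media type; the receiving program parses it back into its own native data structures.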