

Showing posts from March, 2016

3 Valuable Skills New IT Professionals Should Have for IT as a Service

What skills does the new IT professional need to help the organisation transition to IT-as-a-Service? In order to lead their organisations to the cloud, IT professionals must focus on three fundamental areas:

Core Virtualisation Skill Sets. IT professionals must think and operate in the virtual world. No longer can they be tied to the old paradigm of physical assets dedicated to specific users or applications. They must think in terms of “services” riding on top of a fully virtualized infrastructure, and how applications will take advantage of shared resources with both servers and storage. This requires comprehensive skills in both server and storage virtualization technology, and enough experience as a practitioner to understand the intricacies and critical elements of managing virtual platforms.

Roles of Old IT and New IT:
Cross-training Competency. Leaders of IT innovation cannot be completely siloed and hyper-focused. Although there will still be a need for deep domain ex…

Linux Program and Certification from Linux Professional Institute

Linux Essentials Exam Objectives Topics:

The Linux community and a career in open source
Finding your way on a Linux system
The power of the command line
The Linux operating system
Security and file permissions

Topic 1: The Linux Community and a Career in Open Source (weight: 7)

1.1 Linux Evolution and Popular Operating Systems

Weight: 2

Description: Knowledge of Linux development and major distributions.

Key Knowledge Areas:

Open Source Philosophy
Embedded Systems
The following is a partial list of the files, terms, and utilities covered:

Debian, Ubuntu (LTS)
CentOS, openSUSE, Red Hat
Linux Mint, Scientific Linux
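To tell which of the distributions above you are actually running, most modern Linux systems ship an /etc/os-release file of KEY=value lines. The helper below is a minimal sketch (the parser and the sample text are illustrative, not from any exam material):

```python
# Hypothetical helper: identify a Linux distribution by parsing the
# KEY=value format of /etc/os-release.
def parse_os_release(text):
    """Parse KEY=value lines (values may be double-quoted) into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        info[key] = value.strip().strip('"')
    return info

# Example content as it might appear on an Ubuntu LTS machine:
sample = '''
NAME="Ubuntu"
VERSION_ID="14.04"
ID=ubuntu
PRETTY_NAME="Ubuntu 14.04.4 LTS"
'''

release = parse_os_release(sample)
print(release["NAME"], release["VERSION_ID"])
```

On a real system you would read the text with `open("/etc/os-release").read()` instead of the sample string.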

1.2 Major Open Source Applications

Weight: 2

Description: Awareness of major applications as well as their uses and development.

Key Knowledge Areas:

Desktop Applications
Server Applications
Development Languages
Package Management Tools and repositories
Terms and Utilities: LibreOffice, Thunderbird, Firefox, GIMP

AWS Certified Developer Exam and Eligibility

The complete eligibility criteria are as follows: one or more years of hands-on experience designing and maintaining an AWS-based application; in-depth knowledge of at least one high-level programming language; understanding of core AWS services, their uses, and basic architecture best practices.

Proficiency in designing, developing, and deploying cloud-based solutions using AWS. Experience with developing and maintaining applications written for Amazon Simple Storage Service, Amazon DynamoDB, Amazon Simple Queue Service, Amazon Simple Notification Service, Amazon Simple Workflow Service, AWS Elastic Beanstalk, and AWS CloudFormation.
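Hands-on experience with these services usually means working through an SDK. Below is a hedged sketch of storing an object in Amazon S3 with boto3 (the AWS SDK for Python); the bucket name, key prefix, and helper function are made-up examples, and real use requires AWS credentials to be configured:

```python
# Illustrative helper (not part of any AWS API): build a simple S3 object key.
def make_object_key(prefix, name):
    """Join a prefix and a file name into an S3-style key."""
    return "/".join(part.strip("/") for part in (prefix, name) if part)

def upload_text(bucket, key, body):
    # Import inside the function so the pure helper above stays usable
    # even on machines where boto3 is not installed.
    import boto3
    s3 = boto3.client("s3")
    s3.put_object(Bucket=bucket, Key=key, Body=body.encode("utf-8"))

if __name__ == "__main__":
    key = make_object_key("reports/2016", "march.txt")
    print(key)  # reports/2016/march.txt
    # Uncomment with real credentials and a bucket you own:
    # upload_text("my-example-bucket", key, "hello from the exam prep")
```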

Related: AWS Basics for Software Engineers

Resources for Developer Exam:
Professional experience using AWS technology
Hands-on experience programming with AWS APIs
Understanding of AWS security best practices
Understanding of automation and AWS deployment tools
Understanding of storage options and their underlying consistency models
Excellent understanding of at le…

How to Write Your First ksh Script in UNIX

When you log in to UNIX, you are first placed in your home directory:


Then you can issue:

$/home: cd jthomas

This takes you to your own directory:


How to write your first script:

$/home/jthomas: vi

Here, you can write your script.

The first line in the script is:

#!/bin/ksh (this line denotes which shell will interpret the script)


$ vi

#!/bin/ksh
###################################################
# Written By: Jason Thomas
# Purpose: This script was written to show users
#          how to develop their first script
###################################################
# Denotes a comment
daemon bin sys adm uucp nobody lpd

How to run the script: make it executable, then invoke it (assuming you saved it as myscript.ksh; the name here is just an example):

$/home/jthomas: chmod u+x myscript.ksh
$/home/jthomas: ./myscript.ksh

Also read: The complete list of UNIX basic commands

Top 11 Complex Hadoop PIG Interview Questions

1). What is PIG?
PIG is a platform for analyzing large data sets. It consists of a high-level language for expressing data analysis programs, coupled with infrastructure for evaluating those programs. PIG's infrastructure layer consists of a compiler that produces sequences of MapReduce programs.

2). What is the difference between logical and physical plans?
Pig undergoes some steps when a Pig Latin Script is converted into MapReduce jobs. After performing the basic parsing and semantic checking, it produces a logical plan. The logical plan describes the logical operators that have to be executed by Pig during execution. After this, Pig produces a physical plan. The physical plan describes the physical operators that are needed to execute the script.

3). Does ‘ILLUSTRATE’ run an MR job?
No, ILLUSTRATE does not run any MapReduce job; it works on a sample of the internal data instead. It shows the output of each stage of the script rather than the final output.
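The compilation described in question 2 always bottoms out in map and reduce phases. To make that model concrete, here is a minimal word-count sketch in plain Python (an illustration of the map/reduce idea only, not Pig or Hadoop code):

```python
from collections import defaultdict

# Illustration of the map/reduce model that Pig's compiler targets,
# shown as plain Python functions over an in-memory word-count job.
def map_phase(lines):
    """Map step: emit (word, 1) pairs for every word in every line."""
    for line in lines:
        for word in line.split():
            yield (word.lower(), 1)

def reduce_phase(pairs):
    """Reduce step: sum the counts per key."""
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

lines = ["Pig runs on Hadoop", "Hadoop runs MapReduce"]
print(reduce_phase(map_phase(lines)))
```

In real Hadoop the pairs emitted by the map phase are shuffled across the cluster so that all counts for one key reach the same reducer; here everything happens in one process.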

4). Is the keyword ‘DEFINE’ like…

How CloudFormation Can Speed Up Your AWS Migration

AWS CloudFormation is a service that helps you model and set up your Amazon Web Services resources so that you can spend less time managing those resources and more time focusing on your applications that run in AWS. You create a template that describes all the AWS resources that you want (like Amazon EC2 instances or Amazon RDS DB instances), and AWS CloudFormation takes care of provisioning and configuring those resources for you. You don't need to individually create and configure AWS resources and figure out what's dependent on what; AWS CloudFormation handles all of that.
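A template is just a structured document. The sketch below builds a minimal one-resource template as a Python dict and prints it as JSON; the AMI ID, logical name, and instance type are placeholder values, not a deployable configuration:

```python
import json

# A minimal CloudFormation template sketched as a Python dict.
# "WebServer" is an arbitrary logical name; the AMI ID is a placeholder.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Example: one EC2 instance managed as a stack",
    "Resources": {
        "WebServer": {
            "Type": "AWS::EC2::Instance",
            "Properties": {
                "ImageId": "ami-12345678",   # placeholder AMI ID
                "InstanceType": "t2.micro",
            },
        }
    },
}

print(json.dumps(template, indent=2))
```

You would hand the resulting JSON to CloudFormation (for example via the console or the AWS CLI) to create the stack.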

Simplify Infrastructure Management
For a scalable web application that also includes a back-end database, you might use an Auto Scaling group, an Elastic Load Balancing load balancer, and an Amazon Relational Database Service database instance. Normally, you might use each individual service to provision these resources. And after you create the resources, you would have to configure them to work together.…

The best 5 differences of AWS EMR and Hadoop

With Amazon Elastic MapReduce (Amazon EMR) you can analyze and process vast amounts of data. It does this by distributing the computational work across a cluster of virtual servers running in the Amazon cloud. The cluster is managed using an open-source framework called Hadoop.

Amazon EMR has made enhancements to Hadoop and other open-source applications to work seamlessly with AWS. For example, Hadoop clusters running on Amazon EMR use EC2 instances as virtual Linux servers for the master and slave nodes, Amazon S3 for bulk storage of input and output data, and CloudWatch to monitor cluster performance and raise alarms.

You can also move data into and out of DynamoDB using Amazon EMR and Hive. All of this is orchestrated by Amazon EMR control software that launches and manages the Hadoop cluster; the resulting managed cluster is called an Amazon EMR cluster.

What does Hadoop do...

Hadoop uses a distributed processing architecture called MapReduce in which a task is mapped to a set of servers for proce…

5 Things About AWS EC2 You Need to Focus On!

Amazon Elastic Compute Cloud (Amazon EC2) - is a web service that provides resizable compute capacity in the cloud. It is designed to make web-scale cloud computing easier for developers.
Amazon EC2’s simple web service interface allows you to obtain and configure capacity with minimal friction.

The basic functions of EC2:

It provides you with complete control of your computing resources and lets you run on Amazon's proven computing environment.
Amazon EC2 reduces the time required to obtain and boot new server instances to minutes, allowing you to quickly scale capacity, both up and down, as your computing requirements change.
Amazon EC2 changes the economics of computing by allowing you to pay only for capacity that you actually use.
Amazon EC2 provides developers the tools to build failure-resilient applications and isolate themselves from common failure scenarios.
Key Points for Interviews:

EC2 is the fundamental building block around which AWS is structured.
EC2 provides remote ope…

Everything You Need To Know About Amazon Machine Image

Essentially a virtual machine image or snapshot: “An Amazon Machine Image (AMI) is a special type of pre-configured operating system and virtual application software which is used to create a virtual machine within the Amazon Elastic Compute Cloud (EC2).

It serves as the basic unit of deployment for services delivered using EC2.” AWS supports the following virtual image types for import/export: VMware ESX VMDK images, VMware ESX OVA (export only), Citrix Xen VHD images and Microsoft Hyper-V VHD images.

The following diagram summarizes the AMI lifecycle. After you create and register an AMI, you can use it to launch new instances. (You can also launch instances from an AMI if the AMI owner grants you launch permissions.) You can copy an AMI to the same region or to different regions. When you are finished launching instances from an AMI, you can deregister the AMI.
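The lifecycle described above (create and register, then launch or copy, then deregister) can be sketched as a small state/transition table. This is an illustration of the lifecycle only, not an AWS API:

```python
# AMI lifecycle sketch: which actions are legal in which state.
TRANSITIONS = {
    "created": {"register"},
    "registered": {"launch", "copy", "deregister"},
    "deregistered": set(),
}

def apply_action(state, action):
    """Return the new state after an action, or raise on an illegal one."""
    if action not in TRANSITIONS.get(state, set()):
        raise ValueError("cannot %s an AMI that is %s" % (action, state))
    if action == "register":
        return "registered"
    if action == "deregister":
        return "deregistered"
    return state  # launch and copy leave the AMI registered

state = apply_action("created", "register")
state = apply_action(state, "launch")
print(state)  # registered
```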

Life Cycle of AMI

Why You Need to Master MySQL for Data Analytics Jobs

Before you can start analysing data, you actually need some data on hand. That means a database – preferably a relational one.
If you had your sights set on a non-relational, NoSQL database solution, you might want to step back and catch your breath. NoSQL databases are unique because of their independence from the Structured Query Language (SQL) found in relational databases. 

Relational databases all use SQL as the domain-specific language for ad hoc queries, whereas non-relational databases have no such standard query language, so they can use whatever they want – including SQL. Non-relational databases also have their own APIs designed for maximum scalability and flexibility.
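The "ad hoc query" point is easiest to see in code. The sketch below uses SQLite from Python's standard library as a stand-in for MySQL; the GROUP BY query would read the same on any SQL database. The table and rows are made up for illustration:

```python
import sqlite3

# SQLite (Python stdlib) standing in for any relational database:
# the ad hoc aggregate query below is plain, portable SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 100.0), ("west", 250.0), ("east", 50.0)],
)

# Total sales per region, asked on the spot with no predefined API.
rows = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
print(rows)  # [('east', 150.0), ('west', 250.0)]
```

A NoSQL store would typically answer the same question through its own API (a map-reduce job, an aggregation pipeline, and so on) rather than a standard query language.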

When Do You Need to Learn NoSQL Databases?

NoSQL databases are typically designed to excel in two specific areas: speed and scalability. But for the purposes of learning about data concepts and analysis, such super-powerful tools are pretty much overkill. In other words, you need to walk before y…

The top AWS Basics for Software Engineers

What can you do with AWS?

Store public or private data.
Host a static website. These websites use client-side technologies (such as HTML, CSS, and JavaScript) to display content that doesn't change frequently. A static website doesn't require server-side technologies (such as PHP and ASP.NET).
Host a dynamic website, or web app. These websites include classic three-tier applications, with web, application, and database tiers.
Support students or online training programs.
Process business and scientific data.
Handle peak loads.

What are the different AWS features?

AWS Management Console: a web interface. To get started, see Getting Started with the AWS Management Console.
AWS Command Line Interface (AWS CLI): commands for a broad set of AWS products. To get started, see the AWS Command Line Interface User Guide.
Command Line Tools: commands for individual AWS products. For more information, see Command Line Tools.
AWS Software Development Kits (SDKs): APIs that are specific to…

Local Time Support in Amazon Aurora - New Feature

Amazon Aurora is a MySQL-compatible relational database management system (RDBMS) that combines the speed and availability of high-end commercial databases with the simplicity and cost-effectiveness of open source databases. It provides up to 5X the performance of MySQL at one tenth the cost of a commercial database. Amazon Aurora allows you to encrypt data at rest as well as in transit for your mission-critical workloads.

A few points on Amazon Aurora

Amazon Aurora is a relational database engine that combines the speed and reliability of high-end commercial databases with the simplicity and cost-effectiveness of open source databases. It delivers up to five times the throughput of standard MySQL running on the same hardware.

Amazon Aurora is designed to be compatible with MySQL 5.6, so that existing MySQL applications and tools can run without requiring modification. Amazon Aurora joins MySQL, Oracle, Microsoft SQL Server, and PostgreSQL as the fifth database engine available to custo…

How Hadoop Is Best Suited for Large Legacy Data

I have selected a good interview on legacy data. As you all know, a lot of data lives on legacy systems. Hadoop is the mechanism you can use to process that data and get great business insights.

How should we be thinking about migrating data from legacy systems?
Treat legacy data as you would any other complex data type. HDFS acts as an active archive, enabling you to cost effectively store data in any form for as long as you like and access it when you wish to explore the data. And with the latest generation of data wrangling and ETL tools, you can transform, enrich, and blend that legacy data with other, newer data types to gain a unique perspective on what’s happening across your business.

What are your thoughts on getting combined insights from the existing data warehouse and Hadoop?
Typically one of the starter use cases for moving relational data off a warehouse and into Hadoop is active archiving. This is the opportunity to take data that might have otherwise gone to archive…

The awesome JSON Quick Guide for Legacy Programmers

JSON, or JavaScript Object Notation, is a lightweight, text-based open standard designed for human-readable data interchange. The conventions used by JSON are familiar to programmers of languages including C, C++, Java, Python, and Perl.

JSON stands for JavaScript Object Notation.
The format was specified by Douglas Crockford.
It was designed for human-readable data interchange.
JSON is derived from the JavaScript scripting language.
The JSON filename extension is .json.
The JSON Internet media type is application/json.
The Uniform Type Identifier is public.json.
JSON Quick Guide download here
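The points above in practice: Python's standard-library json module writes and reads the application/json format in a couple of lines. The record below is a made-up example:

```python
import json

# Round-trip a Python dict through JSON text using the stdlib json module.
record = {"name": "JSON", "creator": "Douglas Crockford", "extension": ".json"}

text = json.dumps(record)     # Python dict -> JSON text
restored = json.loads(text)   # JSON text -> Python dict
print(restored == record)     # True
```

Because JSON types (objects, arrays, strings, numbers, booleans, null) map directly onto common language types, equivalent one-step conversions exist in C++, Java, Perl, and the other languages mentioned above.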