
Posts

Showing posts from July, 2015

Networking in the IoT Age for Big Opportunities (1 of 3)

In most cases, this connection is made via electrical cables that carry the information in the form of electrical signals. Other types of connections are used, too. 

For example, computers can communicate via fiber optic cables at extremely high speeds by using impulses of light, and in a wireless network, computers communicate by using radio signals. The basic building blocks of a network:
Client computers
Server computers
Network interface (network port)
Cable: Computers in a network are usually physically connected to each other using cable.
Switches: You don't typically use network cable to connect computers directly to each other. Instead, each computer is connected by cable to a central switch, which connects to the rest of the network.
Wireless networks: In a wireless network, most cables and switches are unnecessary; radio transmitters and receivers take their place. Of course, the main advantage of wireless networking is its flexibility: no cables to run through walls or ceilings, and client computers located any…

Role of Networking in the age of Cloud Computing (Part 1 of 2)


Data Center (DC)-based services are emerging as a relevant source of network capacity demand for service providers and telecom operators. Cloud computing services, Content Distribution Networks (CDNs), and networked applications in general have a huge impact on the telecom operator infrastructure.

The cloud computing paradigm provides a new model for service delivery, where computing resources are provided on demand across the network. This elasticity permits the sharing of resources among users, reducing costs and maximizing utilization, while posing the challenge of building an efficient cloud-aware network.

More: Cloud Storage as a Service Basics (Part 2)

The computing resources can be provided on demand, depending on user requests. Such resources can be allocated on distinct servers within a data center, or across data centers distributed in the network. Under this new model, users access their assigned resources, as well as the applications and services, using the…

Cloud Storage as a Service Basics (2 of 3)

The really awesome point is cloud storage. Yes, you are storing data in the cloud, but there are a few things you need to understand about it. What is cloud storage? Cloud storage involves exactly what the name suggests: storing your data with a cloud service provider rather than on a local system. As with other cloud services, you access the data stored in the cloud via an Internet link.

Even though data is stored and accessed remotely, you can maintain data both locally and on the cloud as a measure of safety and redundancy. Cloud storage has a number of advantages over traditional data storage:

The benefits: If you store your data in the cloud, you can get at it from any location that has Internet access. This makes it especially appealing to road warriors. Workers don't need to use the same computer to access data, nor do they have to carry around physical storage devices. Also, if your organization has branch offices, they can all access the data from the cloud provider. The Basics: There …
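Since the data sits behind an Internet link, reading it boils down to an authenticated HTTP request. Below is a minimal Java sketch assuming a hypothetical endpoint (storage.example.com) and a placeholder access token; real providers each expose their own APIs and SDKs, so treat this as an illustration only.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Minimal sketch of reading a file from a cloud storage provider over an
// Internet link. The URL and token below are hypothetical placeholders,
// not any specific provider's API.
public class CloudStorageRead {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://storage.example.com/myfiles/report.txt"))
                .header("Authorization", "Bearer <access-token>") // placeholder credential
                .GET()
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println("HTTP status: " + response.statusCode());
        System.out.println(response.body());
    }
}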

Cloud Computing Middleware, an Approach: Virtualization

In order to run applications on a Cloud, one needs a flexible middleware that eases the development and the deployment process.

GridGain provides middleware that aims to let developers build and run applications on both public and private Clouds without any changes to the application code.

It is also possible to write dedicated applications based on the map/reduce programming model. Although GridGain provides mechanisms to seamlessly deploy applications on a grid or a Cloud, it does not support the deployment of the infrastructure itself. It does, however, provide protocols to discover running GridGain nodes and organize them into topologies (Local Grid, Global Grid, etc.) to run applications on only a subset of all nodes.

Elastic Grid infrastructure provides dynamic allocation, deployment, and management of Java applications through the Cloud. It also offers a Cloud virtualization layer that abstracts specific Cloud computing provider technology to isolate applications from specific implementati…

Big Data: Quiz 1, Top Hadoop Interview Questions

Q.1) How does Hadoop achieve scaling in terms of storage?
A. By increasing the hard disk capacity of the machine
B. By increasing the RAM capacity of the machine
C. By increasing both the hard disk and RAM capacity of the machine
D. By increasing the hard disk capacity of the machine and by adding more machines
Q.2) How is fault tolerance with respect to data achieved in Hadoop?
A. By breaking the data into smaller blocks and distributing these smaller blocks across several machines
B. By adding extra nodes.
C. By breaking the data into smaller blocks, copying each block several times, and distributing these replicas across several machines. By doing this, Hadoop ensures that even if a machine fails, a replica is present on some other machine (see the sketch after this quiz).
D. None of these
Q.3) Along which parameters does Hadoop scale up?
A. Storage only
B. Performance only
C. Both storage and performance
D. Storage, performance, and I/O bandwidth
Q.4) What is the scalability limit of Hadoop?
A. NameNode's RAM
B. NameNode's hard disk
C. Both Har…
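To make answer C of Q.2 concrete, here is a toy Java sketch (not Hadoop code) that splits a file into blocks and spreads each block's replicas across different machines. The block size and replication factor echo common HDFS defaults, and the round-robin placement is a simplification of HDFS's real placement policy.

import java.util.ArrayList;
import java.util.List;

// Toy sketch of block replication: split data into fixed-size blocks and
// place each block's replicas on different machines, so the loss of one
// machine still leaves a copy elsewhere.
public class ReplicaPlacement {
    static final long BLOCK_SIZE = 128L * 1024 * 1024; // bytes per block
    static final int REPLICATION = 3;                  // copies per block

    public static void main(String[] args) {
        long fileSize = 600L * 1024 * 1024;            // a 600 MB example file
        int machines = 10;
        int blocks = (int) ((fileSize + BLOCK_SIZE - 1) / BLOCK_SIZE);

        for (int b = 0; b < blocks; b++) {
            List<Integer> holders = new ArrayList<>();
            for (int r = 0; r < REPLICATION; r++) {
                holders.add((b + r) % machines);       // distinct machine per replica
            }
            System.out.println("block " + b + " -> machines " + holders);
        }
    }
}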

Different kinds of NoSQL Databases in the age of Big data

The list below gives you the kinds of NoSQL databases currently available in the market.
Sorted Ordered Column-Oriented Stores:
Google's Bigtable espouses a model where data is stored in a column-oriented way. This contrasts with the row-oriented format in an RDBMS. Column-oriented storage allows data to be stored efficiently: it avoids consuming space when storing nulls by simply not storing a column when a value doesn't exist for that column. Each unit of data can be thought of as a set of key/value pairs, where the unit itself is identified with the help of a primary identifier, often referred to as the primary key. Bigtable and its clones tend to call this primary key the row-key.
Example: The name column-family bucket stores the following values:
For row-key: 1
first_name: John
last_name: Doe
For row-key: 2
first_name: Jane
The location column-family stores the following:
For row-key: 1
zip_code: 10001
For row-key: 2
zip_code: 94303
The profile co…
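A minimal Java sketch of this data model (an illustration, not Bigtable's actual API): rows are keyed by row-key, each row holds column families, and each family holds only the columns that actually have values, so nulls cost nothing.

import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Illustrative model of a sorted, ordered column-oriented store:
// row-key -> column-family -> column -> value. TreeMap keeps row-keys
// sorted; columns without values are simply never stored.
public class ColumnFamilyModel {
    static final Map<String, Map<String, Map<String, String>>> table = new TreeMap<>();

    static void put(String rowKey, String family, String column, String value) {
        table.computeIfAbsent(rowKey, k -> new TreeMap<>())
             .computeIfAbsent(family, k -> new HashMap<>())
             .put(column, value);
    }

    public static void main(String[] args) {
        put("1", "name", "first_name", "John");
        put("1", "name", "last_name", "Doe");
        put("1", "location", "zip_code", "10001");
        put("2", "name", "first_name", "Jane"); // no last_name: nothing stored for it
        put("2", "location", "zip_code", "94303");
        System.out.println(table);
    }
}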

Internet of Things Basics (Part-7)

Connecting devices, getting raw data from multiple sources, and sending this data for analysis is a major concept in IoT. Devices are connected through protocols.
What is a protocol?

I want to share some information on advanced IP-based protocols. Read my previous post on IoT.

The role of IPv6: it is the most advanced of the Internet protocols, and one of its key functions is support for mobility.

We retain the position that IoT may well become the "killer app" for IPv6. Using IPv6, with its abundant address space, globally unique object (thing) identification and connectivity can be provided in a standardized manner without additional status or address (re)processing; hence its intrinsic advantage over IPv4 or other schemes. Jobs are growing in this field.

For the IoT as well as for other applications for smartphones and similar devices, there is a desire to support direct communication between mobile nodes (MNs) and far-end destinations, whether such far-ends are themselves a…
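As a tiny illustration of IPv6's 128-bit address space, the sketch below parses an address from the reserved documentation prefix (2001:db8::/32) with the standard Java library; it is a demonstration only, not IoT-specific code.

import java.net.Inet6Address;
import java.net.InetAddress;

// Parse an IPv6 literal (from the documentation prefix 2001:db8::/32)
// and confirm its 128-bit (16-byte) address, versus 32 bits for IPv4.
public class Ipv6Demo {
    public static void main(String[] args) throws Exception {
        InetAddress addr = InetAddress.getByName("2001:db8::1");
        System.out.println("IPv6? " + (addr instanceof Inet6Address));              // true
        System.out.println("Address length: " + addr.getAddress().length + " bytes"); // 16
    }
}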

Big Data: Hadoop File System Checking Utility: fsck

Hadoop provides a file system checking utility called "fsck".

Basically, it checks the health of all the files under a given path; run against '/' (root), it checks all the files in the file system.
bin/hadoop fsck /
- Checks the health of all the files in the file system
bin/hadoop fsck /test/
- Checks the health of the files under the given path
How to find which files are healthy:
- fsck prints a dot for each healthy file
- It prints a message for each file that is not healthy, as well as for under-replicated, over-replicated, mis-replicated, and corrupted blocks
By default, the fsck utility does nothing about under-replicated and over-replicated blocks; Hadoop itself heals those blocks.
How to handle corrupted files:
bin/hadoop fsck <path> -delete
- Deletes the corrupted files found under the path
bin/hadoop fsck <path> -move
- Moves the corrupted files to /lost+found
Other options we can use with fsck: -files, -blocks, -locations

Top SAP HANA IoT Must-Read Interview Questions (3 of 3)

Below is my third set of interview questions; in this lot I have given ten interview questions for your quick reference.

1) What is SAP HANA? SAP deployed SAP HANA as an integrated solution that combines software and hardware, which is frequently referred to as the SAP HANA appliance. As with SAP NetWeaver Business Warehouse Accelerator (SAP NetWeaver BW Accelerator), SAP partners with several hardware vendors to provide the infrastructure that is needed to run the SAP HANA software. Lenovo partnered with SAP to provide an integrated solution.
2) What is the memory-to-core ratio in SAP HANA? For in-memory computing appliances, such as SAP HANA, the amount of main memory is important. In-memory computing brings data that is kept on disk into main memory. This allows for much faster processing of the data because the CPU cores do not have to wait until the data is loaded from disk to memory, which means each CPU is better utilized.
SQLDBC: An SAP native database SDK that can be used to deve…

Big data: Learn Scala Programming Language for Scalability

What is Scala:

Scala's design has been influenced by many programming languages and ideas in programming language research. In fact, only a few features of Scala are genuinely new; most have been already applied in some form in other languages. Scala's innovations come primarily from how its constructs are put together.

At the surface level, Scala adopts a large part of the syntax of Java and C#, which in turn borrowed most of their syntactic conventions from C and C++. Expressions, statements, and blocks are mostly as in Java, as is the syntax of classes, packages and imports.

 Besides syntax, Scala adopts other elements of Java, such as its basic types, its class libraries, and its execution model.

Scala also owes much to other languages. Its uniform object model was pioneered by Smalltalk and taken up subsequently by Ruby. 

Its idea of universal nesting (almost every construct in Scala can be nested inside any other construct) is also present in Algol, Simula, and, more rec…

Big data: Information and Data Quality Basics (Part 1 of 3)

Information and data quality is a new line of service work for data-intensive companies. I have seen Data Quality teams not only in analytics projects but also in mainframe projects.

How incorrect data impacts us:

Information quality problems and their impact are all around us:
- A customer does not receive an order because of incorrect shipping information.
- Products are sold below cost because of wrong discount rates.
- A manufacturing line is stopped because parts were not ordered, the result of inaccurate inventory information.
- A well-known U.S. senator is stopped at an airport (twice) because his name is on a government "Do not fly" list.
- Many communities cannot run an election with results that people trust.
- Financial reform has created new legislation such as Sarbanes-Oxley.
What is information?
Information is not simply data, strings of numbers, lists of addresses, or test results stored in a computer.

Information is the product of business processes and is continuously used and …

Internet of Things Basics (Part 6)

What is the architecture of the Internet of Things? The three-layer DCM classification is more about the IoT value chain than its system architecture at run time.

I hope you enjoyed my previous post (Part 5) on IoT.

For system architecture, some have divided the IoT system into as many as nine layers, from bottom to top:
devices
connectivity
data collection
communication
device management
data rules
administration
applications
integration
Large companies such as IBM, Oracle, Microsoft, and others have comprehensive solutions, products, and services that cover almost the entire value chain.

Recommendation for you: Part-2 | Part-1

Broadly, IoT architecture can be classified into three layers:

Device Layer
Communication Layer
Management Layer
Device Layer: Devices or assets can be categorized into two groups: those that have inherent intelligence, such as electric meters or heating, ventilation, and air-conditioning (HVAC) controllers, and those that are inert and must be enabled to become smart devices (e.g.,…

Top Hive Interview Questions for a Quick Read (1 of 2)

Selected interview questions on Hive, a technology used in the Hadoop ecosystem.

1) What are the major activities in the Hadoop ecosystem?
Within the Hadoop ecosystem, HDFS can load and store massive quantities of data in an efficient and reliable manner. It can also serve that same data back up to client applications, such as MapReduce jobs, for processing and data analysis.
2) What is the role of Hive in the Hadoop ecosystem?
Hive, often considered the Hadoop data warehouse platform, got its start at Facebook as its analysts struggled to deal with the massive quantities of data produced by the social network. Requiring analysts to learn and write MapReduce jobs was neither productive nor practical.
3) What is Hive in Hadoop?
Facebook developed a data warehouse-like layer of abstraction that would be based on tables. The tables function merely as metadata, and the table schema is projected onto the data, instead of actually moving potentially massive sets of data. 
This new capabili…

Top SAP HANA IoT Must-Read Interview Questions (2 of 3)

1. How is parallel processing achieved in SAP HANA?
The phrase "divide and conquer" (derived from the Latin saying divide et impera) is typically used when a large problem is divided into a number of smaller, easier-to-solve problems. Regarding performance, processing huge amounts of data is a problem that can be solved by splitting the data into smaller chunks, which can be processed in parallel (a toy sketch follows at the end of this post).

2. How does data partitioning happen in SAP HANA?
Although servers that are available today can hold terabytes of data in memory and provide up to eight processors per server with up to 10 cores per processor, the amount of data that is stored in an in-memory database or the computing power that is needed to process such quantities of data might exceed the capacity of a single server. To accommodate the memory and computing power requirements that go beyond the limits of a single server, data can be divided into subsets and placed across a cluster of servers, which forms a dist…
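A toy Java sketch of the divide-and-conquer idea from Q1 (plain JDK fork/join, not SAP HANA code): a large summation is recursively split into chunks that are processed in parallel and then combined.

import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Divide and conquer with fork/join: split a big sum into halves until the
// chunks are small, compute them in parallel, then combine partial results.
public class ParallelSum extends RecursiveTask<Long> {
    private static final int THRESHOLD = 10_000;
    private final long[] data;
    private final int lo, hi;

    ParallelSum(long[] data, int lo, int hi) {
        this.data = data; this.lo = lo; this.hi = hi;
    }

    @Override
    protected Long compute() {
        if (hi - lo <= THRESHOLD) {            // small enough: solve directly
            long sum = 0;
            for (int i = lo; i < hi; i++) sum += data[i];
            return sum;
        }
        int mid = (lo + hi) >>> 1;             // otherwise: split in half
        ParallelSum left = new ParallelSum(data, lo, mid);
        ParallelSum right = new ParallelSum(data, mid, hi);
        left.fork();                           // run the left half asynchronously
        return right.compute() + left.join();  // combine the partial results
    }

    public static void main(String[] args) {
        long[] data = new long[1_000_000];
        java.util.Arrays.fill(data, 1L);
        long total = ForkJoinPool.commonPool().invoke(new ParallelSum(data, 0, data.length));
        System.out.println(total); // 1000000
    }
}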

Big Data: Top Hadoop Interview Questions (4 of 5)

1) What is a MapReduce program?
- You need to give the actual processing steps in the program
- You have to write the scripts and code

2) What is MapReduce?
- MapReduce is a data processing model
- It is a combination of two parts: Mappers and Reducers

3) What happens in the mapping phase?
It takes the input data and feeds each data element to the mapper.

4) What is the function of the Reducer?
The reducer processes all the outputs from the mappers and arrives at a final result.

5) What kind of input is required for MapReduce?
It should be structured in the form of (key, value) pairs; the word-count sketch at the end of this post shows such pairs in action.

6) What is HDFS?
HDFS is a file system designed for large-scale data processing under frameworks such as MapReduce.

7) Is HDFS like UNIX?
No, but HDFS commands work similarly to their UNIX counterparts.

8) What is a simple file listing command?
hadoop fs -ls

9) How do you copy data into the HDFS file system?

Copy a file into HDFS from the local system: hadoop fs -put <local-file> <hdfs-path>

10) What is the default working directory in HDFS?
/user/$USER
$USER ==> Your login user name
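To tie questions 2 through 5 together, here is the classic word-count program written against the standard Hadoop MapReduce Java API: the mapper emits (word, 1) pairs and the reducer sums the counts for each word.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// The classic MapReduce word count: mappers emit (word, 1) pairs,
// reducers sum the counts for each word.
public class WordCount {

    public static class TokenizerMapper
            extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, ONE);             // emit (word, 1)
            }
        }
    }

    public static class IntSumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) sum += val.get();
            context.write(key, new IntWritable(sum)); // emit (word, total)
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}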

Big Data: IBM InfoSphere BigInsights Basics

I am explaining here why you need IBM InfoSphere BigInsights. You all know what a file system is in Hadoop.
Hadoop is a distributed file system and data processing engine that is designed to handle extremely high volumes of data in any structure. In simpler terms, just imagine that you've got dozens, or even hundreds (or thousands!) of individual computers racked and networked together. Each computer (often referred to as a node in Hadoop-speak) has its own processors and a dozen or so 2 TB or 3 TB hard disk drives.
All of these nodes are running software that unifies them into a single cluster, where, instead of seeing the individual computers, you see an extremely large volume where you can store your data.

The beauty of this Hadoop system is that you can store anything in this space: millions of digital image scans of mortgage contracts, days and weeks of security camera footage, trillions of sensor-generated log records, or all of the operator transcription notes from a call center. T…

Big Data: Top Cloud Computing Interview Questions (1 of 4)

Below are frequently asked interview questions on cloud computing:
1) What is the difference between Cloud and Grid?
Grid:
- Information service
- Security service
- Data management
- Execution management
Cloud:
- Maintains up-to-date information on resources
- Creates VMs according to user requirements
- Application deployment
- User management

2) What are the different cloud standards?
- Interoperability standards
- Security standards
- Portability standards
- Governance and risk standards

3) What are the two different sub-systems in Cloud computing?
- Management subsystem
- Resource subsystem

4) What is cloud computing?
The promise of cloud computing is ubiquitous access to a broad set of applications and services, which are delivered over the network to multiple customers.

5) Why do we need a specialized network for cloud services?
The public Internet is the simplest choice for delivering cloud-based services. In this model, the cloud provider simply purchases Internet connectivity and its customers…

Internet of Things Basics (Part 5)

The Internet of Things spans both vertical and horizontal domains: applications of the Internet of Things (IoT) have spread across an enormously large number of industry sectors, and the development of vertical applications in these sectors is unbalanced.

It is very important to sort out those vertical applications and identify common underpinning technologies that can be used across the board, so that interconnecting, interrelating, and synergized grand integration and new creative, disruptive applications can be achieved.

One of the common characteristics of the Internet of Things is that objects in an IoT world have to be instrumented.

Why we need IoT: it represents a fundamental change in the way information is generated, from mostly manual input to massively machine-generated data produced without human intervention.

To achieve such 5A (anything, anywhere, anytime, anyway, anyhow) and 3I (instrumented, interconnected, and intelligent) capabilities, some common, horizontal, general-purpose technol…

Top SAP HANA IoT Must-Read Interview Questions (1 of 3)

Below is a list of interview questions asked in SAP HANA interviews; across the globe, these are the basic questions people ask.

Q.1) What is in-memory computing?
A1) In-memory computing is a technology that allows the processing of massive quantities of data in main memory to provide immediate results from analyses and transactions. The data that is processed is ideally real-time data (that is, data that is available for processing or analysis immediately after it is created).

2) How does in-memory computing work?
A2) Keep data in main memory to speed up data access. Minimize data movement by using the columnar storage concept, compression, and by performing calculations at the database level. Divide and conquer: use the multi-core architecture of modern processors and multi-processor servers (or even scale out into a distributed landscape) to grow beyond what a single server can supply. (A toy sketch of the columnar idea appears at the end of this post.)

3) What is the benefit of keeping data in memory?
A3) Data accessing from m…
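A toy Java sketch of the columnar storage concept from A2 (an illustration, not SAP HANA internals): summing one attribute over a column layout touches a single contiguous array, while the row layout must walk through every field of every record.

// Toy illustration of row layout versus column layout for the same table.
public class ColumnarVsRow {
    public static void main(String[] args) {
        // Row layout: each record holds all its fields together.
        Object[][] rows = {
            {1, "Alice", 1200L},
            {2, "Bob",    900L},
            {3, "Carol", 1500L},
        };

        // Column layout: one contiguous array per attribute.
        long[] amounts = {1200L, 900L, 1500L};

        // SUM(amount) over the row layout must walk whole records.
        long rowSum = 0;
        for (Object[] r : rows) rowSum += (Long) r[2];

        // SUM(amount) over the column layout reads only the amounts array,
        // which is compact, cache-friendly, and easy to compress.
        long colSum = 0;
        for (long a : amounts) colSum += a;

        System.out.println("row layout sum = " + rowSum);   // 3600
        System.out.println("column layout sum = " + colSum); // 3600
    }
}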

Big Data: Top NoSQL Interview Questions (2 of 5)

1) What is the most important characteristic of NoSQL?
High Availability

2) What are the different types of NoSQL databases?
Key-Value stores
Column Stores
Graph Stores
Document Stores

3) What is Oracle NoSQL Database?
Oracle NoSQL Database is a distributed key-value database designed to provide highly reliable, scalable, and available data storage across a configurable set of systems.

4) What is the storage engine used in Oracle NoSQL Database?
Oracle NoSQL Database uses Oracle Berkeley DB Java Edition as the underlying data storage engine.

5) How does Oracle NoSQL Database scale?
Oracle NoSQL Database is a shared-nothing system designed to run and scale on commodity hardware. Key-value pairs are hash-partitioned across server groups known as shards; at any point in time, a single key-value pair is always associated with a unique shard in the system (a toy routing sketch appears at the end of this post).

6) What are the unique features of Oracle NoSQL Database?
Oracle NoSQL Database leverages the high availability features in Berkeley DB in order to provide resiliency, fault tolerance, …
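A toy Java sketch of the hash-partitioning idea from Q5 (not Oracle NoSQL Database's actual algorithm): hashing a key deterministically picks one shard, so each key-value pair is always associated with exactly one shard.

// Route each key to exactly one shard by hashing it.
public class ShardRouter {
    static int shardFor(String key, int numShards) {
        // Math.floorMod keeps the result non-negative even for negative hashes.
        return Math.floorMod(key.hashCode(), numShards);
    }

    public static void main(String[] args) {
        int numShards = 4;
        for (String key : new String[]{"user:1001", "user:1002", "order:77"}) {
            System.out.println(key + " -> shard " + shardFor(key, numShards));
        }
    }
}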

10 Top Recently Asked NoSQL Database Interview Questions

1) Who was involved in developing NoSQL?

The ideas come largely from the Amazon (Dynamo) and Google (Bigtable) papers.

2) What is NoSQL?

NoSQL refers to non-relational databases, such as columnar databases. Using NoSQL, we can query data from non-relational databases.

3) What are the unique features of NoSQL databases?

- There is no concept of relationships between records
- They handle unstructured data
- They store data as individual records that have no relationships with each other

4) How are NoSQL databases faster than a traditional RDBMS?

- They store the database on multiple servers rather than storing the whole database on a single server
- By adding replicas on other servers, data can be retrieved quickly even if one of the servers crashes

5) What are other notable features of NoSQL?

- Many are open source
- Some are ACID compliant

6) What are the characteristics of a good NoSQL product?

High availability: fault tolerance when a single server goes down
Disaster recovery: for when a data center goes down, or more likely someone digs up a network cable just outside the data center
Support: Someon…