
Big Data: IBM InfoSphere BigInsights Basics

In this post I explain why you need IBM InfoSphere BigInsights. You probably already know what the file system in Hadoop is.

Hadoop is a distributed file system and data processing engine designed to handle extremely high volumes of data in any structure. In simpler terms, just imagine that you've got dozens, or even hundreds (or thousands!) of individual computers racked and networked together. Each computer (often referred to as a node in Hadoop-speak) has its own processors and a dozen or so 2TB or 3TB hard disk drives.
All of these nodes are running software that unifies them into a single cluster, where, instead of seeing the individual computers, you see an extremely large volume where you can store your data.
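
To get a feel for how large that single volume can be, here is a rough back-of-the-envelope sketch in Python. The function name and the example cluster size (100 nodes) are my own assumptions for illustration. The replication factor of 3 is the usual HDFS default, but your cluster may be configured differently, and real usable space will be lower once you leave room for temporary and system data.

```python
def cluster_capacity_tb(nodes, disks_per_node, disk_tb, replication=3):
    """Estimate (raw, usable) storage in TB for a hypothetical cluster.

    replication=3 is assumed here because it is the usual HDFS default;
    each block is stored that many times, so usable space is raw / replication.
    """
    raw = nodes * disks_per_node * disk_tb
    usable = raw / replication
    return raw, usable


# Hypothetical cluster: 100 nodes, each with a dozen 2TB drives.
raw, usable = cluster_capacity_tb(nodes=100, disks_per_node=12, disk_tb=2)
print(f"raw: {raw} TB, usable at 3x replication: {usable:.0f} TB")
```

So even a modest cluster of this kind gives you raw space in the petabyte range, which is why you stop thinking about individual computers and start thinking about one big volume.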

The beauty of this Hadoop system is that you can store anything in this space: millions of digital image scans of mortgage contracts, days and weeks of security camera footage, trillions of sensor-generated log records, or all of the operator transcription notes from a call center. This ingestion of data, without worrying about the data model, is actually a key tenet of the NoSQL movement.

IBM InfoSphere BigInsights

BigInsights features Apache Hadoop and its related open source projects as a core component. This is informally known as the IBM Distribution for Hadoop. IBM remains committed to the integrity of these open source projects, and will ensure 100 percent compatibility with them.

This fidelity to open source provides a number of benefits. For people who have developed code against other 100 percent open source–compatible distributions, their applications will also run on BigInsights, and vice versa. This open source compatibility has enabled IBM to amass over 100 partners, including dozens of software vendors, for BigInsights.

Simply put, if a software vendor's product uses the libraries and interfaces for open source Hadoop, it will work with BigInsights as well.

Components in IBM InfoSphere BigInsights

Hadoop (common utilities, HDFS, and the MapReduce framework)

Avro (data serialization)

Chukwa (monitoring large clustered systems)

Flume (data collection and aggregation)

HBase (real-time read and write database)

HCatalog (table and storage management)

Hive (data summarization and querying)

Lucene (text search)

Oozie (workflow and job orchestration)

Pig (programming and query language)

Sqoop (data transfer between Hadoop and databases)

ZooKeeper (process coordination)
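
If the MapReduce framework in the list above is new to you, the idea can be sketched in a few lines of plain Python. This is only a toy model that runs on one machine, not the Hadoop API: the function names and the sample log records are assumptions made up for illustration. In a real cluster the map and reduce steps run in parallel on many nodes, and the framework does the shuffle for you.

```python
from collections import defaultdict


def map_phase(record):
    """Map: emit a (word, 1) pair for every word in one input record."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key (the framework's job in Hadoop)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(records):
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


# Hypothetical sensor log records.
logs = ["sensor ok", "sensor fault", "sensor ok"]
counts = word_count(logs)
print(counts)
```

Tools such as Hive and Pig in the list sit on top of this model, so you can write a query or a script instead of coding the map and reduce steps by hand.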

