30 December 2015

What is so Trendy in Data Visualization and Reporting

Data Visualization: Data visualization describes any effort to help people understand the significance of data by placing it in a visual context. Patterns, trends, and correlations that might be missed in text-based data can be represented and identified with data visualization software. It is a graphical representation of numerical data, and it is one of the hottest skills in the market, commanding some of the highest salaries.

Types of data visualization

Visual Reporting: 
  1. Visual reporting uses charts and graphics to represent business performance, usually defined by metrics and time-series information.
  2. The best dashboards and scorecards enable users to drill down one or more levels to view more detailed information about a metric.
  3. A dashboard is a visual exception report that highlights anomalies in performance using visualization techniques.
Visual Analysis
  1. Visual analysis allows users to visually explore data, observe patterns, and discover new insights.
  2. Visual analysis offers a higher degree of data interactivity
  3. Users can visually filter, compare, and correlate the data at the speed of thought, incorporating forecasting, modeling, and statistical analysis.

Business Intelligence Dashboard
  1. A business intelligence dashboard is a data visualization tool that represents the current status of metrics and key performance indicators for an enterprise
  2. Dashboards combine and arrange numbers, metrics and sometimes performance scorecards on a single screen. They can be customized for a specific role and display metrics targeted for a single point of view or department
  3. Microsoft and Oracle are among the vendors of business intelligence dashboards. BI dashboards can also be created through other business applications, such as Excel. They are sometimes referred to as enterprise dashboards.
Performance Scorecard
  1. It is a graphical representation of the progress over time of some entity, such as an enterprise, an employee, or a business unit, working toward a specified goal.
  2. The important factors of performance scorecards are targets and key performance indicators (KPIs). KPIs are the metrics that are used to evaluate factors that are essential for the success of an organization
  3. The main difference between a business intelligence dashboard and a performance scorecard is that a business intelligence dashboard, like the dashboard of a car, indicates the status at a particular point in time. A performance scorecard displays the progress over time towards specific goals

26 December 2015

Complete Videos of IBM Watson IoT


Watson IoT is a set of capabilities that learn from, and infuse intelligence into, the physical world. Internet of Things-generated data is growing twice as fast as social and computer-generated data, and it is extremely varied, noisy, time-sensitive, and often confidential. You can quickly learn IBM Watson for IoT.
Complexity grows as billions of devices interact in a moving world. This presents a growing challenge that will test the limits of programmable computing. 

What is Cognitive IoT

  • Cognitive IoT is not explicitly programmed. It learns from experiences with the environment and interactions with people. 
  • It brings true machine learning to systems and processes so they can understand your goals, then integrate and analyze the relevant data to help you achieve them.

3 Best Self-Study Materials on Spark MLlib

Apache Spark is a fast and general-purpose cluster computing system. It provides high-level APIs in Java, Scala, and Python, and an optimized engine that supports general execution graphs. An execution graph describes the possible states of execution and the transitions between them. Spark also supports a set of higher-level tools, including Spark SQL for SQL and structured data processing, MLlib for machine learning, GraphX for graph processing, and Spark Streaming.

Spark Overview with Self-Study Material
Review of Spark Machine Learning Library (MLlib): MLlib is Spark's machine learning library, focusing on learning algorithms and utilities, including classification, regression, clustering, collaborative filtering, and dimensionality reduction, as well as underlying optimization primitives.
Why MLlib? It is built on Apache Spark, a fast and general engine for large-scale processing. Running times are reported to be up to 100x faster than Hadoop MapReduce in memory, or 10x faster on disk, and it supports writing applications in Java, Scala, or Python. A minimal example is sketched below.
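
As a quick taste of MLlib, here is a minimal Java sketch in the spirit of the official Spark 1.x examples: it loads LIBSVM-formatted data and trains a logistic regression classifier. The class name and input path are placeholders, and it assumes the spark-core and spark-mllib artifacts are on the classpath.

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.mllib.classification.LogisticRegressionModel;
    import org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS;
    import org.apache.spark.mllib.regression.LabeledPoint;
    import org.apache.spark.mllib.util.MLUtils;

    public class MLlibQuickStart {
        public static void main(String[] args) {
            SparkConf conf = new SparkConf().setAppName("MLlibQuickStart");
            JavaSparkContext sc = new JavaSparkContext(conf);

            // Load labeled points in LIBSVM format (the path is a placeholder).
            JavaRDD<LabeledPoint> data =
                MLUtils.loadLibSVMFile(sc.sc(), "data/sample_libsvm_data.txt").toJavaRDD();

            // Train a binary logistic regression model with L-BFGS.
            LogisticRegressionModel model =
                new LogisticRegressionWithLBFGS().setNumClasses(2).run(data.rdd());

            // Score one example to confirm the pipeline works end to end.
            LabeledPoint first = data.first();
            System.out.println("prediction = " + model.predict(first.features()));

            sc.stop();
        }
    }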


25 December 2015

5 Challenges Most People Face in the Internet of Things

Security: While security considerations are not new in the context of information technology, the attributes of many IoT implementations present new and unique security challenges. Addressing these challenges and ensuring security in IoT products and services must be a fundamental priority. Users need to trust that IoT devices and related data services are secure from vulnerabilities, especially as this technology becomes more pervasive and integrated into our daily lives. 

  • Poorly secured IoT devices and services can serve as potential entry points for cyber-attack and expose user data to theft by leaving data streams inadequately protected. 
  • The interconnected nature of IoT devices means that every poorly secured device that is connected online potentially affects the security and resilience of the Internet globally. 
  • This challenge is amplified by other considerations like the mass-scale deployment of homogeneous IoT devices, the ability of some devices to automatically connect to other devices, and the likelihood of fielding these devices in unsecured environments.

As a matter of principle, developers and users of IoT devices and systems have a collective obligation to ensure they do not expose users and the Internet itself to potential harm. Accordingly, a collaborative approach to security will be needed to develop effective and appropriate solutions to IoT security challenges that are well suited to the scale and complexity of the issues.
Privacy: The full potential of the Internet of Things depends on strategies that respect individual privacy choices across a broad spectrum of expectations. The data streams and user specificity afforded by IoT devices can unlock incredible and unique value to IoT users, but concerns about privacy and potential harms might hold back full adoption of the Internet of Things.

  • This means that privacy rights and respect for user privacy expectations are integral to ensuring user trust and confidence in the Internet, connected devices, and related services. 
  • Indeed, the Internet of Things is redefining the debate about privacy issues, as many implementations can dramatically change the ways personal data is collected, analyzed, used, and protected. 
  • For example, IoT amplifies concerns about the potential for increased surveillance and tracking, difficulty in being able to opt out of certain data collection, and the strength of aggregating IoT data streams to paint detailed digital portraits of users. While these are important challenges, they are not insurmountable.


In order to realize the opportunities, strategies will need to be developed to respect individual privacy choices across a broad spectrum of expectations, while still fostering innovation in new technology and services.

Interoperability / Standards: A fragmented environment of proprietary IoT technical implementations will inhibit value for users and industry. While full interoperability across products and services is not always feasible or necessary, purchasers may be hesitant to buy IoT products and services if there is integration inflexibility, high ownership complexity, and concern over vendor lock-in.

In addition, poorly designed and configured IoT devices may have negative consequences for the networking resources they connect to and the broader Internet. Appropriate standards, reference models, and best practices also will help curb the proliferation of devices that may act in ways disruptive to the Internet. The use of generic, open, and widely available standards as technical building blocks for IoT devices and services (such as the Internet Protocol) will support greater user benefits, innovation, and economic opportunity.

Legal, Regulatory and Rights: The use of IoT devices raises many new regulatory and legal questions as well as amplifies existing legal issues around the Internet. The questions are wide in scope, and the rapid rate of change in IoT technology frequently outpaces the ability of the associated policy, legal, and regulatory structures to adapt.

One set of issues surrounds cross border data flows, which occur when IoT devices collect data about people in one jurisdiction and transmit it to another jurisdiction with different data protection laws for processing.

Further, data collected by IoT devices is sometimes susceptible to misuse, potentially causing discriminatory outcomes for some users. Other legal issues with IoT devices include the conflict between law enforcement surveillance and civil rights; data retention and destruction policies; and legal liability for unintended uses, security breaches or privacy lapses.

While the legal and regulatory challenges are broad and complex in scope, adopting the guiding Internet Society principles of promoting a user’s ability to connect, speak, innovate, share, choose, and trust are core considerations for evolving IoT laws and regulations that enable user rights.

Emerging Economy and Development Issues: The Internet of Things holds significant promise for delivering social and economic benefits to emerging and developing economies. This includes areas such as sustainable agriculture, water quality and use, healthcare, industrialization, and environmental management, among others. As such, IoT holds promise as a tool in achieving the United Nations Sustainable Development Goals.

The broad scope of IoT challenges will not be unique to industrialized countries. Developing regions also will need to respond to realize the potential benefits of IoT. In addition, the unique needs and challenges of implementation in less-developed regions will need to be addressed, including infrastructure readiness, market and investment incentives, technical skill requirements, and policy resources.

24 December 2015

The 4 Most Asked Skills for Data Science Engineers

Data science is a combination of technical and general skills. As an analyst, you need to provide valuable information to clients. Below is a highly useful list.

Paradigms and practices: This involves data scientists acquiring a grounding in core concepts of data science, analytics and data management. Data scientists should easily grasp the data science life cycle, know their typical roles and responsibilities in every phase and be able to work in teams and with business domain experts and stakeholders. Also, they should learn a standard approach for establishing, managing and operationalizing data science projects in the business.

Algorithms and modeling: Here are the areas with which data scientists must become familiar: linear algebra, basic statistics, linear and logistic regression, data mining, predictive modeling, cluster analysis, association rules, market-basket analysis, decision trees, time-series analysis, forecasting, machine learning, Bayesian and Monte Carlo statistics, matrix operations, sampling, text analytics, summarization, classification, principal components analysis, experimental design, unsupervised learning, and constrained optimization.

Tools and platforms: Data scientists should master a basic group of modeling, development and visualization tools used on your data science projects, as well as the platforms used for storage, execution, integration and governance of big data in your organization. Depending on your environment, and the extent to which data scientists work with both structured and unstructured data, this may involve some combination of:
  • Data warehousing, Hadoop, stream computing, NoSQL and other platforms. 
  • It will probably also entail providing instruction in MapReduce, R and other new open-source development languages, in addition to SPSS, SAS and any other established tools.
Applications and outcomes: A major imperative for data scientists is to learn the chief business applications of data science in your organization, as well as ways to work best with subject-matter experts. 
  • In many companies, data science focuses on marketing, customer service, next-best offer and other customer-centric applications. 
  • Often, these applications require that data scientists know how to leverage customer data acquired from structured survey tools, sentiment analysis software, social media monitoring tools and other sources. 
  • Plus, every data scientist must understand the key business outcomes—such as maximizing customer lifetime value—that should be the focus of their modeling initiatives.

23 December 2015

What is the meaning of Agile

Agile is a time-boxed, iterative approach to software delivery that builds software incrementally from the start of the project, instead of trying to deliver it all at once near the end.

It works by breaking projects down into little bits of user functionality called user stories, prioritizing them, and then continuously delivering them in short, two-week cycles called iterations.


Agile scales like any other software delivery process: not that well.
Look, scaling is hard. There is no easy way to magically coordinate, communicate, and keep large groups of people all moving in the same direction toward the same cause. It's hard work.
The one thing Agile does bring to the conversation is this: instead of looking for ways to scale up your project, look for ways to scale things down.

16 December 2015

3 Major Architecture Components in QlikView

The QlikView Desktop is a Windows-based desktop tool that is used by business analysts and developers to create a data model and to lay out the graphical user interface (GUI or presentation layer) for QlikView apps.

It is within this environment where a developer will use a SQL-like scripting environment (augmented by ‘wizards’) to create the linkages (connection strings) to the source data and to transform the data (e.g. rename fields, apply expressions) so that it can be analyzed and used within the UI, as well as re-used by other QlikView files.

Related: QlikView + Tableau Jobs (search to see the skills needed)

The QlikView Desktop is also the environment where all user interface design and user experience is developed in a drag-and-drop paradigm: everything from graphs and tables containing slices of data to multi-tab architectures to application of color scheme templates and company logos is done here.


QLIKVIEW SERVER (QVS) - The QVS is a server-side product that contains the in-memory analytics engine and which handles all client/server communication between a QlikView client (i.e. desktop, IE plugin, AJAX or Mobile) and the server. It includes a management environment (QlikView Management Console) for providing administrator access to control all aspects of the server deployments (including security, clustering, distribution etc.) and also includes a web server to provide front-end access to the documents within.

The web server’s user portal is known as Access Point. (It’s important to note that while the QVS contains its own web server, one can also utilize Microsoft IIS (Internet Information Server) for this purpose, too.) The QVS handles client authorization against existing directory providers (e.g. Microsoft Active Directory, eDirectory) and also performs read and write to ACLs (access control lists) for QVW documents.

QLIKVIEW PUBLISHER - The QlikView Publisher is a server-side product that performs two main functions:
1) It is used to load data directly from data sources defined via connection strings in the source QVW files.
2) It is also used as a distribution service to reduce data and applications from source QVW files based on various rules (such as user authorization or data access privileges) and to distribute these newly-created documents to the appropriate QlikView Servers or as static PDF reports via email.
Data sources that can be readily accessed by QlikView include standard ODBC- or OLEDB-compliant databases, standard flat files such as Microsoft Excel, XML, etc. as well as systems such as SAP NetWeaver, Salesforce.com, and Informatica.

Related: QlikView Video Tutorials

QLIKVIEW IT PRO - QlikView's approach to BI allows for a self-service model for business users on the front end while maintaining strict data security and governance on the back end. Because of this approach, IT professionals, from enterprise architects to data analysts, can remain focused on their core competencies: data security, data and application provisioning, data governance and system maintenance. They no longer have to spend time writing and re-writing reports for business users.
In a typical QlikView deployment, IT professionals focus on:
  1. Managing data extracts and data and system security
  2. Creating and maintaining source QlikView files (QVWs and QVDs)
  3. Controlling data refresh and application distribution through QlikView Publisher
  4. Administering QlikView deployments via the QlikView Management Console (part of QVS)

09 December 2015

How To Master Life Cycle Of Scrum In Only One Day!

Scrum is an iterative, incremental framework for projects and product or application development. It structures development in cycles of work called Sprints.

These iterations are no more than one month each, and take place one after the other without pause. The Sprints are timeboxed – they end on a specific date whether the work has been completed or not, and are never extended. At the beginning of each Sprint, a cross-functional team selects items (customer requirements) from a prioritized list.

Related: Top rated jobs in Scrum

The team commits to complete the items by the end of the Sprint. During the Sprint, the chosen items do not change. Every day the team gathers briefly to inspect its progress, and adjust the next steps needed to complete the work remaining. At the end of the Sprint, the team reviews the Sprint with stakeholders, and demonstrates what it has built.

(Framework of Scrum)
People obtain feedback that can be incorporated in the next Sprint. Scrum emphasizes working product at the end of the Sprint that is really “done”; in the case of software, this means code that is integrated, fully tested and potentially shippable.

Related: Scrum vs Agile Key Differences

Key roles, artifacts, and events are summarized in Figure 1. A major theme in Scrum is “inspect and adapt.” Since development inevitably involves learning, innovation, and surprises, Scrum emphasizes taking a short step of development, inspecting both the resulting product and the efficacy of current practices, and then adapting the product goals and process practices. Repeat forever.


07 December 2015

The best answer for 'Efficient Workbook' in Tableau

There are several factors that define an “efficient” workbook. Some of these factors are technical and some are more user-focused. An efficient workbook is:

  • A workbook that takes advantage of the “principles of visual analysis” to effectively communicate the message of the author and the data, possibly by engaging the user in an interactive experience.
  • A workbook that responds in a timely fashion. This can be a somewhat subjective measure, but in general we would want the workbook to provide an initial display of information and to respond to user interactions within a few (< 5) seconds. 
  • The latest Tableau version is 9.1.2 as of this writing.
  • Differences between Tableau version 8 and version 9:
  1. Individual query times improved by 10x
  2. Dashboard query times improved by 9x
  3. Query Fusion improved times by 2x
  4. Query Caching improved times by 50x

05 December 2015

QlikView's Top Features Compared with Other Reporting Tools

One of QlikView's primary differentiators is the associative user experience it delivers. QlikView is the leading Business Discovery platform.
It enables users to explore data, make discoveries, and uncover insights that enable them to solve business problems in new ways. Business users conduct searches and interact with dynamic dashboards and analytics from any device. 
Users can gain unexpected business insights because QlikView:

Works the way the mind works. With QlikView, users can navigate and interact with data any way they want to — they are not limited to just following predefined drill paths or using preconfigured dashboards. Users ask and answer questions on their own and in groups and teams, forging new paths to insight and decision. 

With QlikView, discovery is flexible. Business users can see hidden trends and make discoveries like with no other BI platform on the market.

Delivers direct — and indirect — search. With Google-like search, users type relevant words or phrases, in any order, and get instant, associative results. With a global search bar, users can search across the entire data set in an application. With search boxes affiliated with individual list boxes, users can confine the search to just that list box. 

They can conduct both direct and indirect searches. For example, if a user wanted to identify a sales rep but can’t remember the sales rep’s name — just details about the rep, such as that he sells fish to customers in the Nordic region — the user can search on the sales rep list box for “Nordic” and “fish” to get the names of sales reps who meet those criteria.


Delivers answers as fast as users can think up questions.

  • A user can ask a question in QlikView in many different ways, such as lassoing data in charts, graphs, and maps, clicking on items in list boxes, manipulating sliders, and selecting dates in calendars. All the data in the entire application instantly filters itself around the user’s selections. 
  • The user can quickly and easily see relationships and find meaning in the data, for a quick path to insight. 
  • The user can continue to click on field values in the application, further filtering the data based on questions that come to mind.

Illuminates the power of gray

  • With QlikView, users can literally see relationships in the data. They can see not just which data is associated with the user’s selections — they can just as easily see which data is not associated. 
  • How? The user’s selections are highlighted in green. Field values related to the user’s selection are highlighted in white. Unrelated data is highlighted in gray.
  • For example, when a user clicks on a product category (say, bagels) and a region (e.g., Japan), QlikView instantly shows everything in the entire data set that is associated with these selections — as well as the data that is not associated. The result? New insights and unexpected discoveries. 
  • For example, the user might see that no bagels were sold in Japan in January or June, and begin an investigation into why.

04 December 2015

QlikView Server vs Publisher: Top Differences Really Useful to Your Project

QLIKVIEW SERVER


  • The QVS is a server-side product that contains the in-memory analytics engine and which handles all client/server communication between a QlikView client (i.e. desktop, IE plugin, AJAX or Mobile) and the server. 
  • It includes a management environment (QlikView Management Console) for providing administrator access to control all aspects of the server deployments (including security, clustering, distribution etc.) and also includes a web server to provide front-end access to the documents within.
  • The web server’s user portal is known as Access Point. (It’s important to note that while the QVS contains its own web server, one can also utilize Microsoft IIS (Internet Information Server) for this purpose, too.) 
  • The QVS handles client authorization against existing directory providers (e.g. Microsoft Active Directory, eDirectory) and also performs read and write to ACLs (access control lists) for QVW documents.

QLIKVIEW PUBLISHER


The QlikView Publisher is a server-side product that performs two main functions:
  • It is used to load data directly from data sources defined via connection strings in the source QVW files. 
  • It is also used as a distribution service to reduce data and applications from source QVW files based on various rules (such as user authorization or data access privileges) and to distribute these newly-created documents to the appropriate QlikView Servers or as static PDF reports via email.
Data sources that can be readily accessed by QlikView include standard ODBC- or OLEDB-compliant databases, standard flat files such as Microsoft Excel, XML, etc. as well as systems such as SAP NetWeaver, Salesforce.com, and Informatica.

02 December 2015

2 Scaling-Up and Scaling-Out QlikView Ideas That You Can Never Miss

In a scale-up architecture, a single server is used to serve the QlikView applications. In this case, as more throughput is required, bigger and/or faster hardware (e.g. with more RAM and/or CPU capacity) is added to the same server.

In a scale-out architecture, more servers are added when more throughput is needed to achieve the necessary performance. It is common to see the use of commodity servers in these types of architectures. As more throughput is required, new servers are added, creating a clustered QlikView environment. In these environments, QlikView Server supports load sharing of QlikView applications across multiple physical or logical computers. QlikView load balancing refers to the ability to distribute the load (i.e. end-user sessions) across the cluster in accordance with a predefined algorithm for selecting which node should take care of a certain session. QlikView Server version 11 supports three different load balancing algorithms.

Below is a brief definition of each load balancing scheme; a sketch of the selection logic follows the list. Please refer to the QlikView Scalability Overview Technology white paper for further details. 
  • Random: The default load balancing scheme. The user is sent to a random server, regardless of whether the QlikView application the user is looking for is already loaded on a QlikView Server. 
  • Loaded Document: If only one QlikView Server has the particular QlikView application loaded, the user is sent to that QlikView Server. If more than one QlikView Server, or none of them, has the application loaded, the user is sent to the QlikView Server with the largest amount of free RAM.
  • CPU with RAM Overload: The user is sent to the least busy QlikView Server.
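
To illustrate the flavor of the first two schemes (QlikView's actual implementation is proprietary, so this is only a hedged Java sketch of the rules described above), consider the following. The ServerNode type and all of its fields are hypothetical, invented for this illustration.

    import java.util.Comparator;
    import java.util.List;
    import java.util.Random;

    // Hypothetical view of a cluster node, for illustration only.
    record ServerNode(String name, long freeRamBytes, List<String> loadedDocs) { }

    class LoadBalancerSketch {
        private static final Random RNG = new Random();

        // "Random" scheme: pick any node, ignoring what is loaded where.
        static ServerNode randomScheme(List<ServerNode> cluster) {
            return cluster.get(RNG.nextInt(cluster.size()));
        }

        // "Loaded Document" scheme: if exactly one node already has the
        // document loaded, use it; otherwise fall back to the node with
        // the largest amount of free RAM.
        static ServerNode loadedDocumentScheme(List<ServerNode> cluster, String doc) {
            List<ServerNode> withDoc = cluster.stream()
                    .filter(n -> n.loadedDocs().contains(doc))
                    .toList();
            if (withDoc.size() == 1) {
                return withDoc.get(0);
            }
            return cluster.stream()
                    .max(Comparator.comparingLong(ServerNode::freeRamBytes))
                    .orElseThrow();
        }
    }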

Please note that this report does not go into detail on when to use, and how to tune, the different load balancing algorithms for best performance. The cluster test executions presented in this report were run in an environment configured with the better-performing scheme for the specific conditions of a particular test.

01 December 2015

Ultimate Answer for the Difference Between a Storage Node and a Compute Node

Compute Node: This is the computer or machine where your actual business logic will be executed.

Storage Node: This is the computer or machine where your file system resides to store the data being processed. In most cases, the compute node and the storage node are the same machine.

What are the restrictions on the Key and Value classes?

The key and value classes have to be serialized by the framework. To make them serializable, Hadoop provides the Writable interface. And, as in Java itself, the keys of a map must be comparable, so the key class has to implement one more interface, WritableComparable. A minimal sketch is shown below.
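
Here is a minimal Java sketch of a custom key type that satisfies both restrictions. The CustomerKey class and its customerId field are hypothetical, and it assumes the Hadoop client libraries are on the classpath; value classes need only implement Writable, since they are never sorted.

    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;
    import org.apache.hadoop.io.WritableComparable;

    // Keys must implement WritableComparable so the framework can serialize
    // them across the network and sort them during the shuffle phase.
    public class CustomerKey implements WritableComparable<CustomerKey> {
        private long customerId;

        public CustomerKey() { }                        // required no-arg constructor
        public CustomerKey(long customerId) { this.customerId = customerId; }

        @Override
        public void write(DataOutput out) throws IOException {
            out.writeLong(customerId);                  // serialize
        }

        @Override
        public void readFields(DataInput in) throws IOException {
            customerId = in.readLong();                 // deserialize
        }

        @Override
        public int compareTo(CustomerKey other) {       // ordering for the sort phase
            return Long.compare(customerId, other.customerId);
        }

        @Override
        public int hashCode() {                         // used by the default HashPartitioner
            return Long.hashCode(customerId);
        }
    }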
