Posts

Showing posts from June, 2015

Featured Post

SQL Interview Success: Unlocking the Top 5 Frequently Asked Queries

Here are the top five SQL queries commonly asked in interviews. You can expect these in Data Analyst or Data Engineer interviews.

Top SQL Queries for Interviews

01. Joins
A commonly asked question gives you two tables and asks how many rows each join type returns, and what the result looks like.

Table1      Table2
------      ------
id          id
--          --
1           1
1           3
2           1
3           NULL

Output
------
An inner join on id returns 5 rows. The result will be:
1    1
1    1
1    1
1    1
3    3

02. Substring and Concat
Here, we need to write an SQL query that upper-cases the first letter of each name and lower-cases the rest.

Table1
------
ename
-----
raJu
venKat
kRIshna

Solution:
SELECT CONCAT(UPPER(SUBSTRING(ename, 1, 1)), LOWER(SUBSTRING(ename, 2))) AS capitalized_name
FROM Table1;

03. Case statement
SQL Query:
SELECT Code1, Code2,
    CASE
        WHEN Code1 = 'A' AND Code2 = 'AA' THEN …
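As a quick check, the first two queries can be reproduced with SQLite from Python. This is an illustrative sketch, not part of the original post: the table names are made up, and SQLite uses || and SUBSTR in place of MySQL's CONCAT/SUBSTRING.

```python
import sqlite3

# Hypothetical sketch: reproduce the two interview queries with SQLite.
# Column names (id, ename) follow the excerpt; table names are illustrative.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# 01. Joins: count the rows an inner join returns.
cur.execute("CREATE TABLE t1 (id INTEGER)")
cur.execute("CREATE TABLE t2 (id INTEGER)")
cur.executemany("INSERT INTO t1 VALUES (?)", [(1,), (1,), (2,), (3,)])
cur.executemany("INSERT INTO t2 VALUES (?)", [(1,), (3,), (1,), (None,)])
rows = cur.execute(
    "SELECT t1.id, t2.id FROM t1 INNER JOIN t2 ON t1.id = t2.id"
).fetchall()
print(len(rows))  # 5 rows: four (1, 1) pairs and one (3, 3); NULL never matches

# 02. Substring and Concat: capitalize only the first letter.
cur.execute("CREATE TABLE names (ename TEXT)")
cur.executemany("INSERT INTO names VALUES (?)",
                [("raJu",), ("venKat",), ("kRIshna",)])
caps = [r[0] for r in cur.execute(
    "SELECT UPPER(SUBSTR(ename, 1, 1)) || LOWER(SUBSTR(ename, 2)) FROM names"
)]
print(caps)  # ['Raju', 'Venkat', 'Krishna']
```

Note the NULL row in Table2: an inner join silently drops it, which is exactly the detail interviewers probe with this question.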

Cloud Storage as a Service Basics (1 of 3)

Cloud storage is a model of networked enterprise storage where data is stored in virtualized pools of storage, generally hosted by third parties. Hosting companies operate large data centers, and customers who need their data hosted buy or lease storage capacity from them. The data center operators virtualize the resources according to customer requirements and expose them as storage pools, which the customers can use to store data. Physically, the resource may span multiple servers and multiple locations. The safety of the data depends upon the hosting companies and on the applications that leverage the cloud storage. Cloud storage is based on highly virtualized infrastructure and has the same characteristics as cloud computing in terms of agility, scalability, elasticity, and multi-tenancy. It is available both off-premises and on-premises. While it is difficult to declare a canonical definition of cloud storage architecture, object storage…

Big Data: Top Hadoop Interview Questions (2 of 5)

Frequently asked Hadoop interview questions.

1. What is Hadoop?
Hadoop is a framework that gives users the power of distributed computing.

2. What is the difference between SQL and Hadoop?
SQL works with structured data and is most suitable for legacy technologies. Hadoop is suitable for unstructured data and is well suited to modern technologies.

3. What is the Hadoop framework?
It is a distributed network of commodity servers (a cluster can contain multiple nodes, and each node is a commodity server).

4. What are the 4 properties of Hadoop?
Accessible - Hadoop runs on large clusters of commodity machines.
Robust - Commodity hardware fails often; Hadoop is designed to handle such failures gracefully.
Scalable - Hadoop scales linearly to handle larger data by adding more nodes to the cluster.
Simple - Hadoop allows users to quickly write efficient parallel code.

5. What kind of data does Hadoop need?
Traditional RDBMS having relational…
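The map-and-reduce flow that the Hadoop framework popularized can be sketched in plain Python. This is a word-count simulation for illustration only, not the Hadoop API:

```python
from collections import defaultdict
from itertools import chain

# Hypothetical sketch simulating Hadoop's MapReduce flow in plain Python.
documents = ["big data on hadoop", "hadoop handles big data"]

# Map: each record emits (key, value) pairs.
def mapper(doc):
    return [(word, 1) for word in doc.split()]

# Shuffle: group values by key (Hadoop does this between map and reduce).
groups = defaultdict(list)
for key, value in chain.from_iterable(map(mapper, documents)):
    groups[key].append(value)

# Reduce: combine each key's values into a final result.
counts = {key: sum(values) for key, values in groups.items()}
print(counts["hadoop"], counts["big"])  # 2 2
```

In real Hadoop the map and reduce steps run on different machines and the shuffle moves data over the network; the logic per key, however, is the same as above.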

Big Data: Top Hadoop Interview Questions (1 of 5)

Looking out for Hadoop interview questions that are frequently asked by employers? Here is the first list of Hadoop interview questions, which covers HDFS…

1. What is BIG DATA?
Big Data is an assortment of data so huge and complex that it becomes very tedious to capture, store, process, retrieve, and analyze with on-hand database management tools or traditional data processing techniques.

2. Can you give some examples of Big Data?
There are many real-life examples of Big Data! Facebook generates 500+ terabytes of data per day, NYSE (New York Stock Exchange) generates about 1 terabyte of new trade data per day, and a jet airline collects 10 terabytes of sensor data for every 30 minutes of flying time. All these are day-to-day examples of Big Data!

3. Can you give a detailed overview of the Big Data being generated by Facebook?
As of December 31, 2012, there were 1.06 billion monthly active users on Facebook and 680 million mobile users…

How to Talk to an Airtel Customer Care Executive Directly

Call 121, then select 1-6-9. Thanks for reading.

Top Skills You need for Automation Career

According to KPMG, process automation provides a means to integrate people in a software development organization with the development process and the tools supporting that development. For a successful career, you need the skills below.

What you will achieve by automation
By automating processes, you can boost your efficiency and help ensure standardized handling of repetitive workflow steps. The organization's benefit is that translation projects can be realized in a shorter time for less money.

Skill set you need:
Programming Languages (C++/Java/Scala), OOPs Concepts - MUST
Unix/Linux - MUST
Automation Development/Scripting Experience - MUST
XML/XPath - Optional
Perl/Python/Shell Scripting - Optional
SQL/Sybase/MongoDB - Optional
Web Services - SOAP or REST API - Optional

Additional Skills:
Puppet/Chef/CFEngine experience
Experience with system packaging tools, e.g. RPM
SQL database programming…

New Directions for Digital Products (1 of 2)

We have already passed through the Agriculture, Industrial, and Information ages. Now we are in the digitization age, and many companies are investing huge amounts of money in digitization.

Mphasis - is betting on the digitization of financial institutions
Tech Mahindra - started research on health care digitization
Infosys - focusing on automation and artificial intelligence
TCS - focusing on machine learning
WIPRO - focusing on Big Data and Hadoop

What is digitization? What do we mean by digital? Digital data is distinguished from analog data in that the datum is represented in discrete, discontinuous values, rather than the continuous, wavelike values of analog. Thus, the digitization of data refers to the conversion of information into binary code, allowing for more efficient transmission and storage of data. A key differentiator of our current age from prior human history is that, as of the last decade, we not only convert data to a digital format, but we also create data in a digital format. Thus, we now have…
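The "conversion of information into binary code" described above can be shown in a couple of lines of Python. This is an illustrative sketch using the ASCII codes of two characters:

```python
# Illustrative sketch: "digitizing" text means representing each character
# as discrete binary values (here, 8-bit ASCII codes).
text = "Hi"
bits = " ".join(format(ord(ch), "08b") for ch in text)
print(bits)  # 01001000 01101001  ('H' = 72, 'i' = 105)
```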

IBM PML vs. Google MapReduce: Why You Need to Read

IBM Parallel Machine Learning Toolbox (PML) is similar to Google's MapReduce programming model (Dean and Ghemawat, 2004) and the open-source Hadoop system: its goal is to provide Application Programming Interfaces (APIs) that enable programmers with no prior experience in parallel and distributed systems to implement parallel algorithms with relative ease.

Google MapReduce vs. IBM PML
Like MapReduce and Hadoop, PML supports associative-commutative computations as its primary parallelization mechanism.
Unlike MapReduce and Hadoop, PML fundamentally assumes that learning algorithms can be iterative in nature, requiring multiple passes over data. It provides:
The ability to maintain the state of each worker node between iterations, making it possible, for example, to partition and distribute data structures across workers.
Efficient distribution of data, including the ability of each worker to read a subset of the data, to sample the data, or to scan the entire data…
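The two styles contrasted above can be sketched in plain Python. This is a hypothetical simulation, not the PML or MapReduce APIs: first an associative-commutative combine of per-worker partials (the MapReduce-style single pass), then an iterative algorithm (a toy 1-D k-means) that keeps state between passes over the partitioned data, which is what PML adds.

```python
from functools import reduce

partitions = [[1.0, 2.0], [3.0, 4.0], [5.0]]   # data split across "workers"

# Single-pass style: each worker emits a partial (sum, count); the combine
# step is associative and commutative, so partials can merge in any order.
def partial(part):
    return (sum(part), len(part))

def combine(a, b):
    return (a[0] + b[0], a[1] + b[1])

total, count = reduce(combine, map(partial, partitions))
mean = total / count
print(mean)  # 3.0

# Iterative style: unlike one MapReduce pass, state (the centers) persists
# across iterations while each worker rescans its own data subset.
centers = [1.0, 5.0]                  # initial 1-D k-means centers (toy values)
for _ in range(10):
    sums = [0.0] * len(centers)
    counts = [0] * len(centers)
    for part in partitions:           # each worker scans its partition
        for x in part:
            i = min(range(len(centers)), key=lambda c: abs(x - centers[c]))
            sums[i] += x
            counts[i] += 1
    centers = [s / c if c else centers[i]
               for i, (s, c) in enumerate(zip(sums, counts))]
print(centers)  # [2.0, 4.5]
```

The point of the contrast: the first computation finishes in one pass, while the second must revisit the same partitions repeatedly, which is why PML keeps worker state alive between iterations instead of restarting a job per pass.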

Beginner's Tutorial on Cloud Resources Management (1 of 2)

Cloud resource management refers to the allocation of cloud resources, such as processor power, memory capacity, and network bandwidth.

Resource management in cloud computing
The resource management system is the system responsible for allocating cloud resources to users and applications. For any resource management system to be successful, it needs to flexibly utilize the available cloud resources while maintaining service isolation.

How resource management works
The resource management system is expected to operate under the predefined QoS requirements set by the customers. Resource management at cloud scale requires a rich set of resource management schemes, and the provider should offer an option for scalability.

Challenges
The big challenge for cloud service providers is managing physical and virtual resources according to user demands, in a way that provides resources to applications rapidly and dynamically. The big chall…

20 Best Videos to Learn Machine Learning Quickly

According to Coursera, machine learning is the science of getting computers to act without being explicitly programmed. In the past decade, machine learning has given us self-driving cars, practical speech recognition, effective web search, and a vastly improved understanding of the human genome.

1. Introduction, the Motivation and Applications of Machine Learning
2. An Application of Supervised Learning - Autonomous Driving
3. The Concept of Underfitting and Overfitting
4. Newton's Method
5. Discriminative Algorithms
6. Multinomial Event Model
7. Optimal Margin Classifier
8. Kernels
9. Bias/Variance Tradeoff
10. Uniform Convergence - The Case of Infinite H
11. Bayesian Statistics and Regularization
12. The Concept of Unsupervised Learning
13. Mixture of Gaussians
14. The Factor Analysis Model
15. Latent Semantic Indexing (LSI)
16. Applications of Reinforcement Learning
17. Generalization to the Conti…

RDBMS vs. NoSQL: Awesome Differences to Read Now

NoSQL and RDBMS (SQL) databases are different from each other. You may ask what the difference is. Below it is explained in a way that you can understand quickly.

💡 Traditional Database
A schema is required. All traditional data warehouses use an RDBMS to store data marts.
Databases understand the SQL language; it has a specific format and rules for interacting with traditional databases.
Less scalable. It has certain limitations, and it is expensive to make these databases scalable.
Data should be in a certain format, and it is stored in row format.

NoSQL Database
Growing internet usage and the increasing number of connected devices led to the invention of databases that can store any kind of data.
More: MongoDB 3.2 Fundamentals for Developers - Learn with Exercises

NoSQL Special Features
A schema is not required. The ability to handle multiple data types is the power of NoSQL.
NoSQL is well suited for analytical databases, since those should be flexible, scalable, and able to store any format…

Machine Learning Tutorial - Part 2

Machine learning is a branch of artificial intelligence. Using computing, you design systems that behave with AI features; from your end, you need to train them, and this process is called machine learning. Read my part 1 if you missed it.

The life cycle of machine learning
Acquisition - collect the data
Prepare - data cleaning and quality
Process - run machine learning tools
Report - present the results

Acquire Data
You can acquire data from many sources; it might be data held by your organization or open data from the Internet. There might be one data set, or there could be ten or more.

Cleaning of Data
You must accept that data will need to be cleaned and checked for quality before any processing can take place. These processes occur during the prepare phase.

Running Machine Learning Scripts
The processing phase is where the work gets done. The machine learning routines that you have created perform this phase.

Reporting
Finally, the…
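The four life-cycle stages above can be sketched as a minimal pipeline. The function names and toy data here are hypothetical, purely to show how the stages chain together, not a real API:

```python
# Hypothetical sketch of the four machine-learning life-cycle stages.
def acquire():
    # Acquisition: collect raw data (here, an inline sample with a gap).
    return [("2015-06-01", 10.0), ("2015-06-02", None), ("2015-06-03", 14.0)]

def prepare(rows):
    # Prepare: clean the data and check quality (drop missing values).
    return [(d, v) for d, v in rows if v is not None]

def process(rows):
    # Process: run the machine tool (a trivial "model": the mean).
    values = [v for _, v in rows]
    return sum(values) / len(values)

def report(result):
    # Report: present the results.
    return f"mean of cleaned series: {result:.1f}"

print(report(process(prepare(acquire()))))  # mean of cleaned series: 12.0
```

Real projects swap each toy function for something substantial (databases, cleaning rules, trained models, dashboards), but the flow from acquire through report stays the same.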

Machine Learning Quick Tutorial - Part 1

The following is a list of languages useful for machine learning. There's no such thing as one language being "better" than another; it's a case of picking the right tool for the job. Your resume gains value if you list any one of these languages.

Python
The Python language has increased in usage because it's easy to learn and easy to read. Python has good libraries such as scikit-learn, PyML, Jython, and PyBrain.

R
R is an open-source statistical programming language. The syntax is not the easiest to learn, but I do encourage you to have a look at it. It also has a large number of machine learning packages and visualization tools. The R-Java project allows Java programmers to access R functions from Java code.

Matlab
The Matlab language is used widely within academia for technical computing and algorithm creation. Like R, it also has a facility for plotting visualizations and graphs.

Scala
A new breed of languages is emerging that takes advantage…

Windows Azure Cloud Computing: Top Points You Need to Learn Now

Interestingly, Windows Azure is an open platform that will support both Microsoft and non-Microsoft languages and environments. Basically, Windows Azure is cloud computing. To build applications and services on Windows Azure, developers can use their existing Microsoft® Visual Studio® 2008 expertise.

What is Azure
Windows Azure is not grid computing, packaged software, or a standard hosting service. It is an integrated development, service hosting, and management environment maintained at Microsoft data centers. The environment includes a robust and efficient core of compute and simple storage capabilities, and support for a rich variety of development tools and protocols. Jon Brodkin of Network World quotes Tim O'Brien, senior director of Microsoft's Platform Strategy Group, as saying that Microsoft's Windows Azure and Amazon's Elastic Compute Cloud tackle two very different cloud computing technology problems today, but are destined to emulate each other over time…

The Story behind Mainframe to Cloud Real Journey

Mainframe to Cloud: Mainframe computing took off in the 1950s and gained much prominence throughout the 1960s. Corporations such as IBM (International Business Machines), Univac, DEC (Digital Equipment Corporation), and Control Data Corporation started developing powerful mainframe systems. These mainframe systems mainly carried out number-crunching for scientists and engineers, and the main programming language used was Fortran. Then, in the 1960s, the notion of database systems was conceived, and corporations developed database systems based on the network and hierarchical data models. The database applications at that time were written mainly in COBOL.

Cloud vs. Mainframe
In the 1970s, corporations such as DEC created the notion of mini-computers. An example is DEC's VAX machine. These machines were much smaller than the mainframe systems. Around that time, terminals were developed, so programmers did not have to go to computing centers and use punch cards…

30 High Paying Tech Jobs, $110,000 Plus Salary

There is a growing demand for software developers across the globe, and these 30 high-paying IT jobs are really worth a look.

PaaS, or "Platform as a Service," is a type of cloud computing technology. It hosts everything that a developer needs to write an app; these apps, once written, live on the PaaS cloud.

Cassandra is a free and open-source NoSQL database. It's a kind of database that can handle and store data of different types and sizes, and it's increasingly the go-to database for mobile and cloud applications. Several IT companies, including Apple and Netflix, use Cassandra.

MapReduce has been called "the heart of Hadoop." MapReduce is the method that allows Hadoop to store all kinds of data across many low-cost computer servers. To get meaningful data out of Hadoop, a programmer writes software programs (often in the popular language Java) for MapReduce.

30 High Paying IT Jobs
Cloudera is a company that ma…

MemSQL in Advanced Data Analytics

Why use a battery of "complicated" and "immature" tools like Kafka, ZooKeeper, and NoSQL databases to support low-latency big data applications when you can use a durable, consistent, SQL-compliant in-memory database? This is the question NewSQL in-memory database vendors MemSQL and VoltDB are posing to big-data developers who are trying to build real-time applications. MemSQL this week announced a two-way, high-performance MemSQL Spark Connector designed to complement the fast-growing Apache Spark in-memory analytics platform. "There's a lot of excitement about Spark, but many data scientists struggle with complexity and the high degree of expertise to work with related data pipelines," said Erik Frenkiel, CEO and cofounder of MemSQL, in a phone interview with InformationWeek. "As a database, MemSQL offers durability and transaction support, so it can simplify those real-time data pipelines, providing the ability to ingest data and qu…

How to Modernize Software Applications with AI (3 of 3)

Artificial intelligence is now changing the world; it is even called a synonym for automation. The new concept is that we can implement AI in the software development life cycle. How can we develop software applications with improved quality?

Software engineering is concerned with the planning, design, development, maintenance, and documentation of software systems. It is well known that developing high-quality software for real-world applications is complex. Such complexity manifests itself in the fact that software has a large number of parts with many interactions, and involves many stakeholders with different and sometimes conflicting objectives. Furthermore, software engineering is knowledge-intensive and often deals with imprecise, incomplete, and ambiguous requirements on which analysis, design, and implementation are based. Artificial intelligence (AI) techniques such as knowledge-based systems, neural networks, fuzzy logic, and data mining have been…