Posts

Showing posts with the label Spark

HBASE Vs. RDBMS Top Differences You can Unlock Now

In the big data context, HBase has many benefits over an RDBMS. The differences listed below help explain why HBase is popular on the Hadoop (big data) platform. Let us go through them quickly.
HBASE Vs. RDBMS Differences
Random access: HBase handles large amounts of data stored in a distributed, column-oriented format and is built for random reads and writes at that scale, while an RDBMS is a structured store that is not designed to provide that kind of random access over distributed data.
Database rules: An RDBMS strictly follows Codd's 12 rules, with fixed schemas and row-oriented storage, and also follows ACID properties. HBase follows BASE properties and does not offer complex queries out of the box. Secondary indexes, complex inner and outer joins, count, sum, sort, group, and page and table data are all easily accessible in an RDBMS.
Storage: An RDBMS such as MySQL or PostgreSQL is the usual solution for small to medium storage applications, whose size increases with concurrency and performance demands.  Codd'

Spark SQL Query how to write it in Ten steps

Spark SQL example: this post explains in ten steps how to write a SQL query in Spark. The example demonstrates how to use sqlContext.sql to create and load two tables and select rows from the tables into two DataFrames. The next steps use the DataFrame API to filter one of the tables for rows with salaries greater than 150,000 and show the resulting DataFrame. Then the two DataFrames are joined to create a third DataFrame. Finally, the new DataFrame is saved to a Hive table.
1. At the command line, copy the Hue sample_07 and sample_08 CSV files to HDFS:
   $ hdfs dfs -put HUE_HOME/apps/beeswax/data/sample_07.csv /user/hdfs
   $ hdfs dfs -put HUE_HOME/apps/beeswax/data/sample_08.csv /user/hdfs
   where HUE_HOME defaults to /opt/cloudera/parcels/CDH/lib/hue (parcel installation) or /usr/lib/hue (package installation).
2. Start spark-shell:
   $ spark-shell
3. Create Hive tables sample_07 and sample_08:
   scala> sqlContext.sql("CREATE TABLE sample_07 (code string
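The excerpt is cut off before the later steps, but the flow it describes (select into DataFrames, filter, join, save) might look roughly like the sketch below. It is written for spark-shell on Spark 1.x, where sqlContext is predefined, and it assumes that both tables already exist and that each has a code column and a numeric salary column. The names df_07, df_08, df_09, salary_07, salary_08, and the target table sample_09 are hypothetical and not taken from the post.

```scala
// Sketch for spark-shell (Spark 1.x), where sqlContext is already defined.
// Assumes Hive tables sample_07 and sample_08 exist and that both have
// `code` and numeric `salary` columns (the full schema is not shown above).

// Select rows from each table into a DataFrame.
val df_07 = sqlContext.sql("SELECT * FROM sample_07")
val df_08 = sqlContext.sql("SELECT * FROM sample_08")

// Keep only rows with a salary above 150,000 and display them.
df_07.filter(df_07("salary") > 150000).show()

// Join the two DataFrames on the shared `code` column.
val joined = df_07.join(df_08, df_07("code") === df_08("code"))

// Rename the salary columns so the result has no duplicate column names.
val df_09 = joined.select(df_07("code"), df_07("salary").as("salary_07"), df_08("salary").as("salary_08"))

// Save the joined DataFrame as a new Hive table (table name is hypothetical).
df_09.write.saveAsTable("sample_09")
```

Each statement is kept on a single line so it can be pasted into the shell one prompt at a time. Renaming the two salary columns before saving is a design choice to avoid a table with duplicate column names.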

SPARK is Replacement for MapReduce in Bigdata Real Analytics!

Apache Spark is among the Hadoop ecosystem technologies acting as catalysts for broader adoption of big data infrastructure. Now Looker, a vendor of business intelligence software, has announced support for Spark and other Hadoop technologies. The goal? To speed up access to the data that fuels business decision making.
SPARK Jobs
Hadoop's arrival on the scene 10 years ago may have started the big data revolution, but only recently did adoption of this technology begin spreading to a wider audience. Apache Spark is one of the catalysts for the growing adoption rates. Spark can be used as a replacement for MapReduce, a component of Hadoop implementations, to speed up in-memory processing and analytics of big data by up to 100x, according to the Apache Software Foundation. In today's business environment, in which real-time analytics is the goal and organizations don't want to wait for data warehouses and analysts to provide batch intelligence back to business u

Hot Skills: Spark Self Study Materials

Spark: With job postings up 120% year-over-year on Dice, demand for this open-source cluster-computing framework is broad-based. Government contractors and financial-services firms are just a few of the groups eager to find candidates with this skillset. 2015 Average Salary: $113,214. Related: SPARK Self Study Materials
Big Data and Cloud: As companies expand their tech infrastructures, they need cloud and Big Data services such as Azure (#2), Hive (#8), and Cassandra (#9) for data storage, analysis, and security. Big Data and cloud-related skills dominated the Highest-Paid Skills list on Dice’s salary survey for the second straight year. 2015 Average Salary: Big Data: $121,328; Azure: $110,207
Salesforce: This customer-service platform serves as the bedrock for many companies’ customer service departments. Demand for Salesforce professionals seems unlikely to decline anytime soon. Employers are even willing to offer telecommuting options to lure Salesforce talent. 2