Posts

Showing posts with the label Spark

Featured Post

How to Work With Tuples in Python

A tuple is one of Python's built-in data structures, alongside the list and the dictionary. The operations you can perform on it are shown here for your reference. Writing a tuple is easy: its values are separated by commas and enclosed in parentheses '()'. A tuple is immutable, which means you cannot replace its values once it is created.

#1. How to create a tuple

Code:
my_tuple=(1,2,3,4,5)
print(my_tuple)

Output:
(1, 2, 3, 4, 5)

#2. How to read tuple values

Code:
print(my_tuple[0])

Output:
1

#3. How to add two tuples

Code:
a=(1,6,7,8)
c=(3,4,5,6,7,8)
d=a+c
print(d)

Output:
(1, 6, 7, 8, 3, 4, 5, 6, 7, 8)

#4. How to count tuple values

Here count() does not return the number of values in the tuple; it returns how many times a given value occurs.

Code:
sample=(1, 6, 7, 8, 3, 4, 5, 6, 7, 8)
print(sample
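The tuple operations described in this excerpt can be gathered into one short sketch. It reuses the values from the excerpt; the names first, sixes, and mutated are introduced here only for illustration, and the immutability check is an addition that the excerpt describes but does not show as code.

```python
# Create a tuple and read a value by index (indexes start at 0).
my_tuple = (1, 2, 3, 4, 5)
first = my_tuple[0]

# Adding two tuples builds a new tuple; the originals are unchanged.
a = (1, 6, 7, 8)
c = (3, 4, 5, 6, 7, 8)
d = a + c

# count() reports how many times a given value occurs in the tuple.
sixes = d.count(6)

# Tuples are immutable: assigning to an index raises TypeError.
try:
    my_tuple[0] = 99
    mutated = True
except TypeError:
    mutated = False

print(first)
print(d)
print(sixes)
print(mutated)
```

Note that print() itself returns None, so the concatenated tuple is stored in d first and printed afterwards.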

How to Write a Spark SQL Query in Ten Steps

Spark SQL example

This post shows how to write a SQL query in Spark, explained in ten steps. The example uses sqlContext.sql to create and load two tables and to select rows from the tables into two DataFrames. The next steps use the DataFrame API to filter one of the tables for rows with salaries greater than 150,000 and show the resulting DataFrame. The two DataFrames are then joined to create a third DataFrame. Finally, the new DataFrame is saved to a Hive table.

1. At the command line, copy the Hue sample_07 and sample_08 CSV files to HDFS:

$ hdfs dfs -put HUE_HOME/apps/beeswax/data/sample_07.csv /user/hdfs
$ hdfs dfs -put HUE_HOME/apps/beeswax/data/sample_08.csv /user/hdfs

where HUE_HOME defaults to /opt/cloudera/parcels/CDH/lib/hue (parcel installation) or /usr/lib/hue (package installation).

2. Start spark-shell:

$ spark-shell

3. Create Hive tables sample_07 and sample_08:

scala> sqlContext.sql("CREATE TABLE sample_07 (code string
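The flow this excerpt outlines (create two tables, filter one for salaries above 150,000, join the two, save the result as a new table) can be tried without a cluster. The sketch below is not Spark code: it uses Python's built-in sqlite3 module as a stand-in for sqlContext.sql. The three-column schema, the rows, and the sample_joined table name are all assumptions made up for illustration, since the excerpt's CREATE TABLE statement is cut off.

```python
import sqlite3

# Stand-in for Spark SQL: an in-memory SQLite database (not a Hive metastore).
conn = sqlite3.connect(":memory:")

# Hypothetical schema; the real sample_07/sample_08 columns are not fully shown.
for table in ("sample_07", "sample_08"):
    conn.execute(
        f"CREATE TABLE {table} (code TEXT, description TEXT, salary INTEGER)"
    )

# Made-up rows, used only to exercise the queries.
conn.executemany("INSERT INTO sample_07 VALUES (?, ?, ?)", [
    ("A-1", "Role one", 160000),
    ("A-2", "Role two", 90000),
    ("A-3", "Role three", 175000),
])
conn.executemany("INSERT INTO sample_08 VALUES (?, ?, ?)", [
    ("A-1", "Role one", 165000),
    ("A-2", "Role two", 95000),
    ("A-3", "Role three", 170000),
])

# Filter one table for salaries greater than 150,000.
high_paid = conn.execute(
    "SELECT code, salary FROM sample_07 WHERE salary > 150000 ORDER BY code"
).fetchall()

# Join the two tables on code and save the result as a third table.
conn.execute(
    "CREATE TABLE sample_joined AS "
    "SELECT s07.code AS code, s07.salary AS salary_07, s08.salary AS salary_08 "
    "FROM sample_07 s07 JOIN sample_08 s08 ON s07.code = s08.code"
)
joined = conn.execute("SELECT * FROM sample_joined ORDER BY code").fetchall()

print(high_paid)
print(joined)
```

In Spark the filter and join would run through the DataFrame API on distributed data; the SQL shape of the queries is the part that carries over.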

Spark Is a Replacement for MapReduce in Big Data Real-Time Analytics

Apache Spark is among the Hadoop ecosystem technologies acting as catalysts for broader adoption of big data infrastructure. Now Looker, a vendor of business intelligence software, has announced support for Spark and other Hadoop technologies. The goal is to speed up access to the data that fuels business decision making.

Spark Jobs

Hadoop's arrival on the scene 10 years ago may have started the big data revolution, but only recently did adoption of this technology begin spreading to a wider audience. Apache Spark is one of the catalysts for the growing adoption rates. Spark can be used as a replacement for MapReduce, a component of Hadoop implementations, to speed up the processing and analytics of big data by up to 100x in memory, according to the Apache Software Foundation. In today's business environment, in which real-time analytics is the goal and organizations don't want to wait for data warehouses and analysts to provide batch intelligence back to business u

Hot Skills: Spark Self Study Materials

Spark: With job postings up 120% year-over-year on Dice, demand for this open-source cluster-computing framework is broad-based. Government contractors and financial-services firms are just a few of the groups eager to find candidates with this skillset. 2015 Average Salary: $113,214

Related: Spark Self Study Materials

Big Data and Cloud: As companies expand their tech infrastructures, they need cloud and Big Data services such as Azure (#2), Hive (#8), and Cassandra (#9) for data storage, analysis, and security. Big Data and cloud-related skills dominated the Highest-Paid Skills list on Dice’s salary survey for the second straight year. 2015 Average Salary: Big Data: $121,328; Azure: $110,207

Salesforce: This customer-service platform serves as the bedrock for many companies’ customer service departments. Demand for Salesforce professionals seems unlikely to decline anytime soon. Employers are even willing to offer telecommuting options to lure Salesforce talent. 2