11 October 2015

Key Architecture Components in Hive

Hive has five architectural components:
  • Shell: allows interactive queries, much like a MySQL shell connected to a database. Web and JDBC clients are also supported
  • Driver: manages session handles and the fetch and execute calls
  • Compiler: parses, plans, and optimizes queries
  • Execution engine: runs a DAG of stages (MapReduce, HDFS, or metadata operations)
  • Metastore: stores the schema, the location in HDFS, and the SerDe
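
To make the "DAG of stages" idea concrete, here is a minimal Python sketch of how an execution engine could order stages so that each one runs only after the stages it depends on. This is a toy model, not Hive's own code: the stage names and the `execution_order` helper are hypothetical and exist only for illustration.

```python
# Hypothetical stage graph for one query: each key is a stage, and each
# value is the set of stages that must finish before it can start.
stages = {
    "mapreduce_scan": set(),
    "mapreduce_aggregate": {"mapreduce_scan"},
    "hdfs_move_result": {"mapreduce_aggregate"},
    "metadata_update": {"hdfs_move_result"},
}

def execution_order(dag):
    """Return the stages in an order that respects every dependency."""
    done, order = set(), []
    while len(order) < len(dag):
        # A stage is ready once all of its dependencies have completed.
        ready = sorted(s for s, deps in dag.items()
                       if s not in done and deps <= done)
        if not ready:
            raise ValueError("cycle in stage graph")
        for stage in ready:
            done.add(stage)
            order.append(stage)
    return order

order = execution_order(stages)
print(order)
```

The stages in this example form a simple chain, so there is only one valid order. In a wider graph, independent stages become ready in the same round and could in principle run in parallel.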
Data model of Hive:
  • Tables
– Typed columns (int, float, string, date, boolean)
– Also collection types such as list and map (for JSON-like data)
  • Partitions
– e.g., to range-partition tables by date
  • Buckets
– Hash partitions within ranges (useful for sampling, join optimization)
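
The interplay of partitions and buckets can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the sample rows, the `NUM_BUCKETS` value, and the `bucket_for` helper are hypothetical, and the bucket is computed as a plain modulo of an integer key, which is only an approximation of what Hive does for other column types.

```python
from collections import defaultdict

NUM_BUCKETS = 4  # assumed bucket count for this illustration

def bucket_for(user_id, num_buckets=NUM_BUCKETS):
    """Toy hash partitioning: an integer key modulo the bucket count."""
    return user_id % num_buckets

# Hypothetical rows: (date partition, user_id, action)
rows = [
    ("2015-10-10", 7, "login"),
    ("2015-10-10", 12, "click"),
    ("2015-10-11", 7, "logout"),
    ("2015-10-11", 9, "click"),
]

# Rows are first grouped by the date partition, then by bucket within it.
layout = defaultdict(list)
for ds, user_id, action in rows:
    layout[(ds, bucket_for(user_id))].append((user_id, action))

for key in sorted(layout):
    print(key, layout[key])
```

Because a given `user_id` always lands in the same bucket number, reading a single bucket yields a consistent sample of users, and two tables bucketed the same way on the join key can be joined bucket by bucket.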

Hive Metastore
  • Database: namespace containing a set of tables
  • Holds table definitions (column types, physical layout)
  • Holds partition metadata
  • Uses the JPOX ORM (since renamed DataNucleus) for implementation; can be stored in Derby, MySQL, or many other relational databases
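
Since the metastore lives in an ordinary relational database, its role can be illustrated with a small Python sketch that uses an in-memory SQLite database. The table and column names below (`dbs`, `tbls`, `cols`, `parts`), the `describe` helper, and the SerDe name are all invented for this example and are not the real metastore schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, hypothetical schema: databases, tables, columns, partitions.
conn.executescript("""
CREATE TABLE dbs   (db_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE tbls  (tbl_id INTEGER PRIMARY KEY, db_id INTEGER,
                    name TEXT, location TEXT, serde TEXT);
CREATE TABLE cols  (tbl_id INTEGER, position INTEGER, name TEXT, type TEXT);
CREATE TABLE parts (tbl_id INTEGER, spec TEXT, location TEXT);
""")
conn.execute("INSERT INTO dbs VALUES (1, 'default')")
conn.execute(
    "INSERT INTO tbls VALUES (1, 1, 'page_views', "
    "'/home/hive/warehouse/page_views', 'hypothetical.TextSerDe')")
conn.executemany(
    "INSERT INTO cols VALUES (1, ?, ?, ?)",
    [(0, "user_id", "int"), (1, "action", "string")])
conn.execute(
    "INSERT INTO parts VALUES (1, 'ds=2015-10-11', "
    "'/home/hive/warehouse/page_views/ds=2015-10-11')")

def describe(db, table):
    """Look up a table's column names and types, in declared order."""
    return conn.execute(
        "SELECT c.name, c.type FROM cols c "
        "JOIN tbls t ON c.tbl_id = t.tbl_id "
        "JOIN dbs d ON t.db_id = d.db_id "
        "WHERE d.name = ? AND t.name = ? ORDER BY c.position",
        (db, table)).fetchall()

schema = describe("default", "page_views")
partitions = conn.execute(
    "SELECT spec, location FROM parts WHERE tbl_id = 1").fetchall()
print(schema)
print(partitions)
```

The point of the sketch is that the metastore holds only definitions and locations. The table data itself stays in HDFS.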
Physical Layout of Hive
  • Warehouse directory in HDFS
– e.g., /home/hive/warehouse
  • Tables stored in subdirectories of warehouse
– Partitions form subdirectories of tables; buckets are stored as files within them
  • Actual data stored in flat files
– Control-character-delimited text, or SequenceFiles
– With a custom SerDe, an arbitrary format can be used
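
As a rough illustration of this layout, the Python sketch below builds the directory for one partition and splits one line of delimited text into fields. The table name `page_views`, the partition key `ds`, and both helper functions are hypothetical. The sketch assumes partition directories named `key=value` and Ctrl-A (`\x01`) as the field delimiter, which is Hive's default for text files.

```python
import posixpath

WAREHOUSE = "/home/hive/warehouse"  # example warehouse directory from the text

def partition_dir(table, partition_spec, warehouse=WAREHOUSE):
    """Build the HDFS-style directory for one partition of a table."""
    parts = [f"{key}={value}" for key, value in partition_spec]
    return posixpath.join(warehouse, table, *parts)

def parse_row(line, delimiter="\x01"):
    """Split one line of control-character-delimited text into fields."""
    return line.rstrip("\n").split(delimiter)

path = partition_dir("page_views", [("ds", "2015-10-11")])
fields = parse_row("7\x01click\x01/index.html\n")
print(path)
print(fields)
```

A custom SerDe plays the role of `parse_row` here: it decides how the raw bytes of a file are turned into typed columns.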


© 2010-2017 Biganalytics.me. All rights reserved.
