Top Key Architecture Components in HIVE

Hadoop Hive has five architectural components:
  • Shell: allows interactive queries, like a MySQL shell connected to a database
– Also supports web and JDBC clients
  • Driver: session handles, fetch, execute
  • Compiler: parse, plan, optimize
  • Execution engine: DAG of stages (M/R, HDFS, or metadata)
  • Metastore: schema, location in HDFS, SerDe
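To make the hand-off between driver, compiler, and execution engine concrete, here is a toy sketch in Python. It is not Hive's API: the function names, the stage names, and the fixed three-stage plan are all assumptions made for illustration. It only shows the idea of a plan being a DAG of stages that the engine runs in dependency order.

```python
from graphlib import TopologicalSorter

# Toy model: every name here is hypothetical and is not part of Hive's API.

def compile_query(query):
    """Stand-in compiler: returns (stage kinds, stage dependencies)."""
    # A real compiler would parse, plan, and optimize `query`;
    # this stub returns the same fixed three-stage plan for any input.
    kinds = {"stage-1": "M/R", "stage-2": "HDFS", "stage-3": "metadata"}
    depends_on = {"stage-1": set(), "stage-2": {"stage-1"}, "stage-3": {"stage-2"}}
    return kinds, depends_on

def execute(kinds, depends_on):
    """Stand-in execution engine: visit stages in dependency order."""
    order = TopologicalSorter(depends_on).static_order()
    return [(stage, kinds[stage]) for stage in order]

def driver_run(query):
    """Stand-in driver: compile the query, then hand the plan to the engine."""
    kinds, depends_on = compile_query(query)
    return execute(kinds, depends_on)

result = driver_run("SELECT COUNT(*) FROM page_views")
```

Here `result` lists the stages in the order they would run: the M/R stage first, then the HDFS stage, then the metadata stage.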

Data Model of Hive:
  • Tables
– Typed columns (int, float, string, date, boolean)
– Also collection types: list (ARRAY) and map (MAP), for JSON-like data
  • Partitions
– e.g., to range-partition tables by date
  • Buckets
– Hash partitions within ranges (useful for sampling, join optimization)
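The bucketing idea can be sketched in a few lines of Python. This is a simplified illustration, not Hive's implementation: it assumes an integer key whose hash is the value itself, and the function name `bucket_for` is made up for this example. Hive's real hash depends on the column type and version.

```python
def bucket_for(key_hash, num_buckets):
    """Map a hash value to a bucket number in the range [0, num_buckets)."""
    # Clear the sign bit so that negative hashes still give a valid bucket.
    return (key_hash & 0x7FFFFFFF) % num_buckets

# Assumption for illustration: integer ids hash to themselves.
user_ids = [7, 12, 33, 40, 101]
buckets = {}
for uid in user_ids:
    buckets.setdefault(bucket_for(uid, 4), []).append(uid)

# Sampling: reading a single bucket touches only a fraction of the rows.
sample = buckets[0]
```

Because rows with the same key always land in the same bucket, two tables bucketed the same way on the join key can be joined bucket by bucket, which is the join optimization mentioned above.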

HIVE Metastore
  • Database: namespace containing a set of tables
  • Holds table definitions (column types, physical layout)
  • Partition data 
  • Uses JPOX ORM for implementation; can be stored in Derby, MySQL, many other relational databases
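As a rough picture of what the metastore keeps track of, the sketch below models the entries as plain Python dataclasses. The class and field names, the `page_views` table, and the `"text-serde"` label are hypothetical; the real metastore schema is richer and lives in a relational database.

```python
from dataclasses import dataclass, field

# Hypothetical model of metastore entries; not the real metastore schema.

@dataclass
class Table:
    name: str
    columns: dict                 # column name -> type name
    location: str                 # table directory in HDFS
    serde: str                    # label for the SerDe used to read rows
    partition_keys: list = field(default_factory=list)
    partitions: dict = field(default_factory=dict)  # partition value -> directory

@dataclass
class Database:
    name: str                     # namespace for a set of tables
    tables: dict = field(default_factory=dict)

db = Database("default")
db.tables["page_views"] = Table(
    name="page_views",
    columns={"user_id": "int", "url": "string"},
    location="/home/hive/warehouse/page_views",
    serde="text-serde",           # placeholder label, not a real class name
    partition_keys=["dt"],
)

# Registering a partition records where its data lives.
table = db.tables["page_views"]
table.partitions["2010-01-01"] = table.location + "/dt=2010-01-01"
```

The point is that the metastore stores only descriptions and locations; the rows themselves stay in files in HDFS.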

Physical Layout of HIVE
  • Warehouse directory in HDFS
– e.g., /home/hive/warehouse
  • Tables stored in subdirectories of warehouse
– Partitions form subdirectories of tables; buckets are stored as files within them
  • Actual data stored in flat files
– Control char-delimited text, or SequenceFiles
– With custom SerDe, can use arbitrary format
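The layout above can be sketched in Python as follows. The helper names and the `page_views` table are assumptions for illustration, and the warehouse path is the example from the text. The `col=value` directory naming and the Ctrl-A (`\x01`) field delimiter match Hive's defaults for partitioned text tables, but treat the code as a sketch rather than a description of how Hive reads files.

```python
import posixpath

WAREHOUSE = "/home/hive/warehouse"  # example warehouse directory from the text

def partition_dir(table, partition_spec):
    """Build the directory for one partition, e.g. <warehouse>/<table>/dt=2010-01-01."""
    parts = [f"{key}={value}" for key, value in partition_spec]
    return posixpath.join(WAREHOUSE, table, *parts)

def parse_row(line, delimiter="\x01"):
    """Split one line of a control-char-delimited text file into fields."""
    return line.rstrip("\n").split(delimiter)

path = partition_dir("page_views", [("dt", "2010-01-01")])
fields = parse_row("42\x01/index.html\x01true\n")
```

Every field comes back as a string here; turning `"42"` into an int or `"true"` into a boolean according to the table definition is the kind of work a SerDe does.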
