How does Apache Hive
Hive - introduction
The term "Big Data" refers to collections of large datasets characterized by huge volume, high velocity, and a variety of data that grows day by day. Traditional data management systems struggle to process big data. Hence, the Apache Software Foundation introduced a framework called Hadoop to solve big data management and processing challenges.
Hadoop is an open source framework for storing and processing big data in a distributed environment. It contains two modules: MapReduce and the Hadoop Distributed File System (HDFS).
MapReduce: It is a parallel programming model for processing large amounts of structured, semi-structured, and unstructured data on large clusters of commodity hardware.
HDFS: The Hadoop Distributed File System is part of the Hadoop framework and is used to store and process datasets. It provides a fault-tolerant file system that runs on commodity hardware.
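The MapReduce model described above can be illustrated with a minimal, single-process sketch. It is not Hadoop code: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical, and a real cluster would distribute the map and reduce calls across many machines.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group all emitted values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence on an in-memory input."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data big clusters", "data nodes"]))
```

Each phase only sees its own slice of the data, which is what allows the real framework to run many map and reduce tasks in parallel.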
The Hadoop ecosystem contains various sub-projects (tools), such as Sqoop, Pig, and Hive, that are used to help Hadoop modules.
Sqoop: It is used for importing and exporting data between HDFS and an RDBMS.
Pig: It is a procedural language platform used to script MapReduce operations.
Hive: It is a platform used to develop SQL-type scripts that perform MapReduce operations.
Note: There are several ways to execute MapReduce operations:
- The traditional approach, using a Java MapReduce program for structured, semi-structured, and unstructured data.
- The scripting approach for MapReduce to process structured and semi-structured data with the help of Pig.
- The Hive Query Language (HiveQL or HQL) for MapReduce to process structured data with the help of Hive.
What is Hive
Hive is a data warehouse infrastructure tool for processing structured data in Hadoop. It resides on top of Hadoop to summarize big data, and it makes querying and analysis easy.
Hive was initially developed by Facebook. Later the Apache Software Foundation took it up and developed it further as open source under the name Apache Hive. It is used by different companies. For example, Amazon uses it in Amazon Elastic MapReduce.
Hive is not
- A relational database
- A design for Online Transaction Processing (OLTP)
- A language for real-time queries and row-level updates
Features of Hive
- It stores schemas in a database and processes data in HDFS.
- It is designed for OLAP.
- It provides an SQL-type query language called HiveQL or HQL.
- It is familiar, fast, scalable, and extensible.
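The first feature, keeping the schema in a database while the data itself stays in files, can be sketched as follows. This is a toy illustration under stated assumptions: `METASTORE`, `RAW_LINES`, `read_table`, and the `employee` table are all hypothetical names, and the list of strings only stands in for a delimited text file in HDFS.

```python
# Hypothetical metastore entry: the schema lives apart from the data.
METASTORE = {
    "employee": {
        "columns": [("id", int), ("name", str), ("salary", float)],
        "delimiter": ",",
    }
}

# Stand-in for the lines of a plain text file stored in HDFS.
RAW_LINES = ["1,Asha,45000", "2,Ravi,40000.5"]


def read_table(table, lines):
    """Apply the stored schema to raw lines at read time."""
    meta = METASTORE[table]
    rows = []
    for line in lines:
        fields = line.split(meta["delimiter"])
        # Pair each raw field with its declared column name and type.
        row = {name: cast(value)
               for (name, cast), value in zip(meta["columns"], fields)}
        rows.append(row)
    return rows


if __name__ == "__main__":
    for row in read_table("employee", RAW_LINES):
        print(row)
```

The raw lines carry no type information of their own; the structure appears only when the stored schema is applied during the read.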
Architecture of Hive
The following component diagram shows the architecture of Hive:
This component diagram contains different units. The following table describes each unit:
|User interface||Hive is data warehouse infrastructure software that enables interaction between users and HDFS. The user interfaces that Hive supports are the Hive Web UI, the Hive command line, and Hive HDInsight (on Windows Server).|
|Metastore||Hive chooses a database server to store the schema or metadata of tables, databases, and columns in a table, along with their data types and HDFS mapping.|
|HiveQL process engine||HiveQL is similar to SQL for querying schema information in the metastore. It is one of the replacements for the traditional approach of writing a MapReduce program. Instead of writing a MapReduce program in Java, we can write a query for the MapReduce job and process it.|
|Execution engine||The Hive execution engine connects the HiveQL process engine and MapReduce. It processes the query and produces the same results as a MapReduce job. It uses the flavor of MapReduce.|
|HDFS or HBase||The Hadoop Distributed File System and HBase are the data storage techniques used to store data in the file system.|
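To make the idea of writing a query instead of a MapReduce program more concrete, the sketch below shows how a simple aggregation could be expressed as a map step and a reduce step. It is a simplified illustration, not what Hive actually generates; the table, the column names, and the functions `map_row`, `reduce_group`, and `run_group_by` are assumptions made for this example.

```python
from collections import defaultdict

# Illustrative HiveQL statement (table and column names are made up):
#   SELECT dept, SUM(salary) FROM employee GROUP BY dept;

ROWS = [
    {"dept": "eng", "salary": 100},
    {"dept": "ops", "salary": 70},
    {"dept": "eng", "salary": 120},
]


def map_row(row):
    """Map step: key each row by the GROUP BY column."""
    return row["dept"], row["salary"]


def reduce_group(values):
    """Reduce step: apply the aggregate function to one group."""
    return sum(values)


def run_group_by(rows):
    """Group mapped pairs by key, then reduce each group."""
    groups = defaultdict(list)
    for key, value in map(map_row, rows):
        groups[key].append(value)
    return {key: reduce_group(values) for key, values in groups.items()}


if __name__ == "__main__":
    print(run_group_by(ROWS))
```

The one-line query in the comment carries the same intent as the three functions, which is the convenience the HiveQL process engine is meant to provide.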
Working of Hive
The following diagram shows the workflow between Hive and Hadoop.
The following table shows how Hive interacts with the Hadoop framework:
|1||The Hive interface, such as the command line or web UI, sends the query to the driver (any database driver such as JDBC, ODBC, etc.) to be executed.|
|2||The driver takes the help of the query compiler, which parses the query to check the syntax and the query plan or requirements of the query.|
|3||The compiler sends a metadata request to the metastore (a database).|
|4||The metastore sends the metadata as a response to the compiler.|
|5||The compiler checks the requirements and sends the plan back to the driver. At this point, the parsing and compiling of the query is complete.|
|6||The driver sends the execution plan to the execution engine.|
|7||Internally, the execution of the job is a MapReduce job. The execution engine sends the job to the JobTracker, which resides on the name node, and the JobTracker assigns the job to the TaskTracker, which resides on the data node. Here, the query runs as a MapReduce job.|
|8||Meanwhile, during execution, the execution engine can perform metadata operations with the metastore.|
|9||The execution engine receives the results from the data nodes.|
|10||The execution engine sends the resulting values to the driver.|
|11||The driver sends the results to the Hive interfaces.|
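The interaction between driver, compiler, metastore, and execution engine can be mimicked with a toy, in-memory sketch. None of these classes correspond to real Hive APIs; they are hypothetical stand-ins that only show the order of the hand-offs, and the "query" is reduced to selecting one column from one table.

```python
class Metastore:
    """Holds table metadata (toy stand-in for the metastore database)."""

    def __init__(self, tables):
        self.tables = tables

    def get_metadata(self, table):
        return self.tables[table]


class Compiler:
    """Checks the request against metadata and returns a plan."""

    def __init__(self, metastore):
        self.metastore = metastore

    def compile(self, table, column):
        # Metadata request to the metastore and its response.
        columns = self.metastore.get_metadata(table)
        if column not in columns:
            raise ValueError(f"unknown column: {column}")
        # The plan that is handed back to the driver.
        return {"table": table, "column": column}


class ExecutionEngine:
    """Runs the plan; a real engine would submit a MapReduce job here."""

    def __init__(self, data):
        self.data = data

    def execute(self, plan):
        rows = self.data[plan["table"]]
        return [row[plan["column"]] for row in rows]


class Driver:
    """Receives the request, obtains a plan, and returns the results."""

    def __init__(self, compiler, engine):
        self.compiler = compiler
        self.engine = engine

    def run(self, table, column):
        plan = self.compiler.compile(table, column)
        return self.engine.execute(plan)


if __name__ == "__main__":
    metastore = Metastore({"employee": ["id", "name"]})
    engine = ExecutionEngine(
        {"employee": [{"id": 1, "name": "Asha"}, {"id": 2, "name": "Ravi"}]}
    )
    driver = Driver(Compiler(metastore), engine)
    print(driver.run("employee", "name"))
```

Keeping the metadata check in the compiler means an invalid request fails before any job is started, which mirrors the separation between planning and execution in the workflow.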