hive_performance_tuning-HDP3.1.0 PDF download


Date: 2021-07-21 07:17  Source: http://www.java1234.com  Author: reposted
Main content:

Optimizing an Apache Hive data warehouse
You can tune your data warehouse infrastructure, components, and client connection parameters to improve the
performance and relevance of business intelligence and other applications. Tuning Hive and background components
that support Hive processing is particularly important as your workload and database volume increase.
Increasingly, enterprises want to run SQL workloads that return faster results than batch processing can provide.
These enterprises often want data analytics applications to support interactive queries. Hive low-latency analytical
processing (LLAP) can improve the performance of interactive queries. A Hive interactive query that runs on the
Hortonworks Data Platform (HDP) meets low-latency, variably gauged benchmarks to which Hive LLAP responds in
15 seconds or fewer. LLAP enables application development and IT infrastructure to run queries that return real-time
or near-real-time results.
You can further enhance LLAP performance with real-time data by integrating the enterprise data warehouse (EDW)
with the Druid business intelligence engine.
When you query large-scale EDW data sets, you have to meet service-level agreement (SLA) benchmarks or other
performance expectations. Because how you tune your query processing environment depends on factors such as
system resources, depth of data analysis, and query latency requirements, you must become familiar with Hive
warehouse processing, prepare for tuning, and configure LLAP using parameters that meet your performance needs.
LLAP ports
You use port 10500 to make the JDBC connection through Beeline to query Hive through the HiveServer Interactive
host. The LLAP daemon uses several other ports.
List of port properties
• HiveServer Interactive (LLAP) port (10500)
• hive.server2.thrift.http.port (10501)
• hive.llap.daemon.rpc.port (0)
• hive.llap.daemon.web.port (15002)
• hive.llap.daemon.yarn.shuffle.port (15551)
• hive.llap.management.rpc.port (15004)
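As a rough illustration of how the ports listed above are used, the following Python sketch collects them in a lookup table and builds the JDBC URL that Beeline would be given for HiveServer Interactive. The helper names (`llap_jdbc_url`, `beeline_command`), the host name, and the `default` database are assumptions made for this example and do not come from the guide; only the port values and the use of port 10500 for the JDBC connection are taken from the text.

```python
# Port values copied from the list above; the dictionary itself is only a
# convenient lookup structure for this sketch.
LLAP_PORTS = {
    "HiveServer Interactive (LLAP)": 10500,
    "hive.server2.thrift.http.port": 10501,
    "hive.llap.daemon.rpc.port": 0,
    "hive.llap.daemon.web.port": 15002,
    "hive.llap.daemon.yarn.shuffle.port": 15551,
    "hive.llap.management.rpc.port": 15004,
}


def llap_jdbc_url(host, database="default"):
    """Build a HiveServer2 JDBC URL that targets the HiveServer Interactive port."""
    port = LLAP_PORTS["HiveServer Interactive (LLAP)"]
    return f"jdbc:hive2://{host}:{port}/{database}"


def beeline_command(host, database="default"):
    """Return a Beeline argument list; -u passes the JDBC URL to connect to."""
    return ["beeline", "-u", llap_jdbc_url(host, database)]


if __name__ == "__main__":
    # "llap-host.example.com" is a hypothetical host name.
    print(" ".join(beeline_command("llap-host.example.com")))
```

Keeping the port numbers in one table means a cluster that overrides a default only needs the table updated, not every place a URL is assembled.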
Preparations for tuning performance
Before you tune Apache Hive, you should follow best practices. These guidelines include how you configure the
cluster, store data, and write queries.
Best practices
• Set up your cluster to use Apache Tez or the Hive on Tez execution engine.
In HDP 3.x, the MapReduce execution engine is replaced by Tez.
• Disable user impersonation by setting Run as end user to false in Ambari, which is equivalent to setting
hive.server2.enable.doAs to false in hive-site.xml.
LLAP caches data for multiple queries and this capability does not support user impersonation.
• Add the Ranger security service to your cluster and dependent services.
• Set up LLAP to run interactive queries.
• Store data using the ORC file format.
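Some of the practices above map onto concrete settings and statements. The sketch below renders the execution engine and impersonation settings in the hive-site.xml property layout and builds a CREATE TABLE statement that stores data as ORC. The helper names, the table name, and the columns are hypothetical; in an Ambari-managed cluster these values would normally be changed through Ambari rather than by editing the file directly, so treat the output as an illustration of the target state only.

```python
import xml.etree.ElementTree as ET

# Both property names are existing Hive settings; the values follow the
# practices listed above (Tez as the engine, impersonation disabled).
BEST_PRACTICE_SETTINGS = {
    "hive.execution.engine": "tez",
    "hive.server2.enable.doAs": "false",
}


def hive_site_fragment(settings):
    """Render settings as <property> elements in the hive-site.xml layout."""
    root = ET.Element("configuration")
    for name, value in settings.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


def orc_table_ddl(table, columns):
    """Build a CREATE TABLE statement that stores the table as ORC."""
    cols = ", ".join(f"{name} {hive_type}" for name, hive_type in columns)
    return f"CREATE TABLE {table} ({cols}) STORED AS ORC"


if __name__ == "__main__":
    print(hive_site_fragment(BEST_PRACTICE_SETTINGS))
    # "sales" and its columns are hypothetical.
    print(orc_table_ddl("sales", [("id", "INT"), ("amount", "DOUBLE")]))
```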

 
