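Manually installing Impala 2.0 on Red Hat 6.4

Background

Hadoop was already installed through Cloudera Manager, but since I did not dare upgrade CM, I upgraded Impala on its own using yum. Impala 1.1 was already installed and is being upgraded to Impala 2.0 here, so some configuration steps are not covered; refer to the official documentation for those.

Set up a local mirror

Download the RPMs you will need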

wget -r -np -k 'http://archive.cloudera.com/impala/redhat/6/x86_64/impala/2.0.0/'

 

Set up nginx to serve the downloaded files

nginx configuration

server  {
                listen       9500;
                server_name www.lnmp.org;
                index index.html index.htm index.php;
                root  /home/kpi/impala/archive.cloudera.com;
                autoindex on;              # enable directory listing
                autoindex_exact_size off;  # show human-readable file sizes
                autoindex_localtime on;    # show file times in the server's local time
}

Configure the repo

Save the following as impala.repo in the /etc/yum.repos.d/ directory. The archive.cloudera.com address below can be replaced with the address of your local mirror.

[cloudera-impala]
name=Impala
baseurl=http://archive.cloudera.com/impala/redhat/6/x86_64/impala/2.0.0/
gpgkey = http://archive.cloudera.com/impala/redhat/6/x86_64/impala/RPM-GPG-KEY-cloudera    
gpgcheck = 1
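
If you point the repo at the local nginx mirror, the file can be generated rather than edited by hand. The following is a minimal sketch, not part of my original setup: write_impala_repo is a hypothetical helper and mirror.example.com:9500 is a placeholder for your own mirror host and port. It assumes the mirror keeps the same directory layout as archive.cloudera.com, and it leaves gpgkey pointing at archive.cloudera.com.

```shell
#!/bin/sh
# write_impala_repo is a hypothetical helper (not from any package).
# $1: base address of the mirror, $2: path of the repo file to write.
write_impala_repo() {
    mirror_base="$1"
    repo_file="$2"
    cat > "$repo_file" <<EOF
[cloudera-impala]
name=Impala
baseurl=${mirror_base}/impala/redhat/6/x86_64/impala/2.0.0/
gpgkey=http://archive.cloudera.com/impala/redhat/6/x86_64/impala/RPM-GPG-KEY-cloudera
gpgcheck=1
EOF
}

# Write to a scratch location first and inspect the result;
# mirror.example.com:9500 is a placeholder, not a real host.
out="${TMPDIR:-/tmp}/impala.repo"
write_impala_repo "http://mirror.example.com:9500" "$out"
cat "$out"
```

Once the output looks right, copy the file into /etc/yum.repos.d/.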

 

Installation

Install the services on every machine

yum install impala impala-server impala-shell

On one machine, install the startup scripts for the statestore and catalog services

yum install impala-state-store impala-catalog

Configuration

ln -s /etc/hive/conf/hive-site.xml /etc/impala/conf/
ln -s /etc/hive/conf/hive-env.sh /etc/impala/conf/
ln -s /etc/hadoop/conf/core-site.xml /etc/impala/conf/
cp /etc/hadoop/conf/hdfs-site.xml /etc/impala/conf/

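The client's hdfs-site.xml cannot be used as-is (which is why it is copied rather than symlinked): it is missing a setting, and impala-server will not start without it. If the server fails to start, check the Impala logs.

Add the following to hdfs-site.xml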

<property>
<name>dfs.client.file-block-storage-locations.timeout</name>
<value>3000</value>
</property>
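
Before starting impala-server, it can help to confirm that the copied file really contains the property. The sketch below is only an illustration: has_property is a hypothetical helper, and it only checks that the property name is present, not that the value is high enough.

```shell
#!/bin/sh
# has_property is a hypothetical helper: it succeeds when FILE ($1)
# contains a <name>NAME</name> element for the property NAME ($2).
has_property() {
    grep -qF "<name>$2</name>" "$1"
}

# Defaults to the path used in this article; pass another path as $1.
conf="${1:-/etc/impala/conf/hdfs-site.xml}"
prop="dfs.client.file-block-storage-locations.timeout"

if [ -f "$conf" ] && has_property "$conf" "$prop"; then
    echo "$prop is set in $conf"
else
    echo "$prop is missing from $conf (or the file does not exist)"
fi
```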

For the rest of the HDFS configuration, see the page below; pay attention to the Short-Circuit Reads option, which affects performance.
http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/impala_config_performance.html?scroll=config_performance

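Change the default Impala user to kpi

We do this only because we need to read data with the kpi account; using the default impala user works too. Either way, HDFS must be configured so that this user can read from HDFS directly.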

Edit the files /etc/init.d/impala-server, /etc/init.d/impala-state-store, and /etc/init.d/impala-catalog:
Find SVC_USER="impala" and change it to SVC_USER="kpi"
Find install -d -m 0755 -o impala -g impala /var/run/impala and change it to install -d -m 0755 -o kpi -g kpi /var/run/impala
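
The same two substitutions can be scripted with sed instead of editing three files by hand. This is a rough sketch under the assumption that the init scripts contain exactly the two lines quoted above; set_service_user is a hypothetical helper, and the demo runs on a scratch file, not on the real scripts.

```shell
#!/bin/sh
# set_service_user is a hypothetical helper: it rewrites the service user
# in an init script ($1) to USER ($2), keeping a .bak copy of the original.
set_service_user() {
    script="$1"
    user="$2"
    sed -i.bak \
        -e "s/SVC_USER=\"impala\"/SVC_USER=\"$user\"/" \
        -e "s/-o impala -g impala/-o $user -g $user/" \
        "$script"
}

# Demonstrate on a scratch file holding the two lines mentioned above.
demo=$(mktemp)
cat > "$demo" <<'EOF'
SVC_USER="impala"
install -d -m 0755 -o impala -g impala /var/run/impala
EOF
set_service_user "$demo" kpi
cat "$demo"
```

To apply it for real, call set_service_user on each of the three files under /etc/init.d/ and compare the result with the .bak copy before restarting anything.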

Create the log directory

mkdir -p /home/hadoop/log/impala/
chown kpi:kpi /home/hadoop/log/impala/

Edit Impala's defaults file, /etc/default/impala

IMPALA_CATALOG_SERVICE_HOST=catalog_hostname_or_IP
IMPALA_STATE_STORE_HOST=statestore_hostname_or_IP
IMPALA_STATE_STORE_PORT=24000
IMPALA_BACKEND_PORT=22000
IMPALA_LOG_DIR=/home/hadoop/log/impala
 
IMPALA_CATALOG_ARGS=" -log_dir=${IMPALA_LOG_DIR} "
IMPALA_STATE_STORE_ARGS=" -log_dir=${IMPALA_LOG_DIR} -state_store_port=${IMPALA_STATE_STORE_PORT}"
IMPALA_SERVER_ARGS=" \
    -mem_limit=4294967296 \
    -enable_webserver=true \
    -webserver_port=25000 \
    -beeswax_port=21000 \
    -hs2_port=21050 \
    -log_dir=${IMPALA_LOG_DIR} \
    -catalog_service_host=${IMPALA_CATALOG_SERVICE_HOST} \
    -state_store_port=${IMPALA_STATE_STORE_PORT} \
    -use_statestore \
    -state_store_host=${IMPALA_STATE_STORE_HOST} \
    -be_port=${IMPALA_BACKEND_PORT}"
 
ENABLE_CORE_DUMPS=false
 
# LIBHDFS_OPTS=-Djava.library.path=/usr/lib/impala/lib
# MYSQL_CONNECTOR_JAR=/usr/share/java/mysql-connector-java.jar
# IMPALA_BIN=/usr/lib/impala/sbin
# IMPALA_HOME=/usr/lib/impala
# HIVE_HOME=/usr/lib/hive
# HBASE_HOME=/usr/lib/hbase
# IMPALA_CONF_DIR=/etc/impala/conf
# HADOOP_CONF_DIR=/etc/hadoop/conf

 

Startup

service impala-state-store start
service impala-catalog start
service impala-server start

The error /etc/init.d/impala-state-store: line 35: /etc/default/hadoop: No such file or directory is harmless; it only means the Hadoop defaults file was not found.

 

Troubleshooting

On Red Hat 5.x, starting impala-shell reports a sasl error; you need to install python26-libs and sasl using pip.

The error ERROR: block location tracking is not properly enabled because
- dfs.client.file-block-storage-locations.timeout is too low. It should be at least 3000.

means the HDFS configuration is wrong: add the dfs.client.file-block-storage-locations.timeout setting to hdfs-site.xml, as shown in the configuration above.

Check the processes

Check whether all the processes have come up

ps auxf|grep state
ps auxf|grep catalog
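
The check can also be wrapped in a small loop that reports each daemon by name. This is a sketch under the assumption that the daemons show up with the process names statestored, catalogd and impalad; is_running is a hypothetical helper. On a machine that only runs impala-server, only impalad is expected to be up.

```shell
#!/bin/sh
# is_running is a hypothetical helper: it succeeds when a process whose
# command name is exactly NAME ($1) appears in the process list.
is_running() {
    ps -e -o comm= | grep -qx "$1"
}

# Assumed daemon names; adjust them if your packages use different ones.
for daemon in statestored catalogd impalad; do
    if is_running "$daemon"; then
        echo "$daemon: running"
    else
        echo "$daemon: NOT running"
    fi
done
```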

Install impala-lzo (optional)

Add cloudera-gplextras5.repo

[cloudera-gplextras5]
# Packages for Cloudera's GPLExtras, Version 5, on RedHat or CentOS 6 x86_64
name=Cloudera's GPLExtras, Version 5
baseurl=http://archive.cloudera.com/gplextras/redhat/6/x86_64/gplextras/4.3.0/
gpgkey = http://archive.cloudera.com/gplextras/redhat/6/x86_64/gplextras/RPM-GPG-KEY-cloudera    
gpgcheck = 1

Install

yum install impala-lzo-2.0.0

 

References

http://blog.javachen.com/2013/03/29/install-impala/
http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/impala_noncm_installation.html




fatkun