在在Hadoop集群环境中为集群环境中为MySQL安装配置安装配置Sqoop的教程的教程
Sqoop是一个用来将Hadoop和关系型数据库中的数据相互转移的工具,可以将一个关系型数据库(例如 : MySQL ,Oracle
,Postgres等)中的数据导进到Hadoop的HDFS中,也可以将HDFS的数据导进到关系型数据库中。
Sqoop中一大亮点就是可以通过hadoop的mapreduce把数据从关系型数据库中导入数据到HDFS。
一、安装一、安装sqoop
1、下载sqoop压缩包,并解压
压缩包分别是:sqoop-1.2.0-CDH3B4.tar.gz,hadoop-0.20.2-CDH3B4.tar.gz, Mysql JDBC驱动包mysql-connector-java-
5.1.10-bin.jar
[root@node1 ~]# ll
drwxr-xr-x 15 root root 4096 Feb 22 2011 hadoop-0.20.2-CDH3B4
-rw-r--r-- 1 root root 724225 Sep 15 06:46 mysql-connector-java-5.1.10-bin.jar
drwxr-xr-x 11 root root 4096 Feb 22 2011 sqoop-1.2.0-CDH3B4
2、将sqoop-1.2.0-CDH3B4拷贝到/home/hadoop目录下,并将Mysql JDBC驱动包和hadoop-0.20.2-CDH3B4下的hadoop-
core-0.20.2-CDH3B4.jar至sqoop-1.2.0-CDH3B4/lib下,最后修改一下属主。
[root@node1 ~]# cp mysql-connector-java-5.1.10-bin.jar sqoop-1.2.0-CDH3B4/lib
[root@node1 ~]# cp hadoop-0.20.2-CDH3B4/hadoop-core-0.20.2-CDH3B4.jar sqoop-1.2.0-CDH3B4/lib
[root@node1 ~]# chown -R hadoop:hadoop sqoop-1.2.0-CDH3B4
[root@node1 ~]# mv sqoop-1.2.0-CDH3B4 /home/hadoop
[root@node1 ~]# ll /home/hadoop
total 35748
-rw-rw-r-- 1 hadoop hadoop 343 Sep 15 05:13 derby.log
drwxr-xr-x 13 hadoop hadoop 4096 Sep 14 16:16 hadoop-0.20.2
drwxr-xr-x 9 hadoop hadoop 4096 Sep 14 20:21 hive-0.10.0
-rw-r--r-- 1 hadoop hadoop 36524032 Sep 14 20:20 hive-0.10.0.tar.gz
drwxr-xr-x 8 hadoop hadoop 4096 Sep 25 2012 jdk1.7
drwxr-xr-x 12 hadoop hadoop 4096 Sep 15 00:25 mahout-distribution-0.7
drwxrwxr-x 5 hadoop hadoop 4096 Sep 15 05:13 metastore_db
-rw-rw-r-- 1 hadoop hadoop 406 Sep 14 16:02 scp.sh
drwxr-xr-x 11 hadoop hadoop 4096 Feb 22 2011 sqoop-1.2.0-CDH3B4
drwxrwxr-x 3 hadoop hadoop 4096 Sep 14 16:17 temp
drwxrwxr-x 3 hadoop hadoop 4096 Sep 14 15:59 user
3、配置configure-sqoop,注释掉对于HBase和ZooKeeper的检查
[root@node1 bin]# pwd
/home/hadoop/sqoop-1.2.0-CDH3B4/bin
[root@node1 bin]# vi configure-sqoop
#!/bin/bash
#
# Licensed to Cloudera, Inc. under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
.
.
.
# Check: If we can't find our dependencies, give up here.
if [ ! -d "${HADOOP_HOME}" ]; then
echo "Error: $HADOOP_HOME does not exist!"
echo 'Please set $HADOOP_HOME to the root of your Hadoop installation.'
exit 1
fi
#if [ ! -d "${HBASE_HOME}" ]; then
# echo "Error: $HBASE_HOME does not exist!"
# echo 'Please set $HBASE_HOME to the root of your HBase installation.'
# exit 1
#fi
#if [ ! -d "${ZOOKEEPER_HOME}" ]; then
# echo "Error: $ZOOKEEPER_HOME does not exist!"