Introduction to Doris
Official site (Chinese): https://doris.apache.org/zh-CN/
The official site covers the basics, so I will not repeat them here.
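Doris is an MPP database with a clear advantage in moving away from Hadoop, and it is compatible with the MySQL protocol. It suits small and mid-sized companies that only have structured tables, and it is friendly to the so-called "SQL Boys" who mostly write SQL. It handles both stream and batch processing reasonably well, and the project ships official connectors for compute engines such as Spark and Flink, which can run on K8S, so it is also friendly to teams with limited development and operations capacity.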
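Because Doris does not depend on the Hadoop ecosystem, there is no need for the HDFS file system, the ZooKeeper coordination service, or Yarn resource scheduling. Doris can do simple computation by itself, and the Spark and Flink jobs needed for complex computation can run directly on K8S, which keeps a Doris-based data warehouse and data mart architecture quite simple. A single component covers lake ingestion and the ODS layer, the DW layer (DWD, the finest-grained detail layer, and DWM, the coarse-grained summary layer), and the ADS application layer. Once data is integrated into Doris, the whole process of structured data analysis and aggregation can be done in one place, and the intermediate layer of dimension tables can even be simplified, so the data architecture is also quite simple.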
Compilation
The main consideration when compiling is the choice between JDK 8 and JDK 11. Here the official prebuilt Doris binaries for JDK 1.8 are sufficient.
Official download page: https://doris.apache.org/zh-CN/download
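Linux virtual machine requirements
Doris's requirements are easier to meet than those of the neighboring StarRocks.
Hardware requirements
Development and test environment
Module | CPU | Memory | Disk | Network | Instances |
---|---|---|---|---|---|
Frontend | 8 cores+ | 8GB+ | SSD or SATA, 10GB+ * | Gigabit NIC | 1 |
Backend | 8 cores+ | 16GB+ | SSD or SATA, 50GB+ * | Gigabit NIC | 1-3 * |
Production environment
Module | CPU | Memory | Disk | Network | Instances (minimum) |
---|---|---|---|---|---|
Frontend | 16 cores+ | 64GB+ | SSD or RAID card, 100GB+ * | 10-Gigabit NIC | 1-5 * |
Backend | 16 cores+ | 64GB+ | SSD or SATA, 100GB+ * | 10-Gigabit NIC | 10-100 * |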
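Virtual machine resource allocation
For a dev environment the official manual recommends one 8C8G frontend node (FE) and one to three 8C16G backend nodes (BE). Since Doris scales out very easily, and to make starting and stopping convenient, I built a single virtual machine with 16C32G and a 300G disk and deployed both the FE and the BE on it. If the disk ever runs short, more BEs can easily be added later.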
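Software requirements
Linux operating system version requirements
Linux distribution | Version |
---|---|
CentOS | 7.1 or later |
Ubuntu | 16.04 or later |
Software dependencies
Software | Version |
---|---|
Java | 1.8 or later |
GCC | 4.8.2 or later |
Operating system installation requirements
Set the maximum number of open file handles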
vi /etc/security/limits.conf
* soft nofile 65536
* hard nofile 65536
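Whether the new limit is actually in effect can be checked from a fresh login shell. The sketch below is only an illustration: `nofile_ok` is a hypothetical helper name, and 65536 is the value from the limits.conf lines above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when CURRENT is at least REQUIRED.
# `ulimit -n` prints either a number or the word "unlimited".
nofile_ok() {
    local current="$1" required="$2"
    [ "$current" = "unlimited" ] && return 0
    [ "$current" -ge "$required" ]
}

# Usage, run as the user that will start Doris:
#   nofile_ok "$(ulimit -n)" 65536 && echo "nofile limit OK" || echo "nofile limit too low"
```

One caveat worth keeping in mind: as the comments in /etc/security/limits.conf state, wildcard (`*`) entries are not applied to root, so if Doris is started as root the limit may need an explicit `root` entry or a `ulimit -n` call in the starting shell.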
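Clock synchronization
Doris metadata requires the clock skew between machines to be below 5000 ms, so every machine in the cluster must keep its clock synchronized. Otherwise clock problems can lead to inconsistent metadata and service failures.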
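The 5000 ms rule can be expressed as a small comparison of two timestamps in epoch milliseconds. This is a rough sketch only: `skew_ok` is a hypothetical helper, and fetching the remote time over SSH (shown in the comment, with a placeholder host name) adds network latency to the measurement.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when two epoch-millisecond timestamps
# differ by less than MAX_MS (default 5000, the limit quoted above).
skew_ok() {
    local a="$1" b="$2" max="${3:-5000}"
    local diff=$(( a - b ))
    if [ "$diff" -lt 0 ]; then
        diff=$(( -diff ))
    fi
    [ "$diff" -lt "$max" ]
}

# Usage with GNU date ("other-host" is a placeholder):
#   skew_ok "$(date +%s%3N)" "$(ssh other-host date +%s%3N)" || echo "clock skew too large"
```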
Disable the swap partition
Linux swap causes serious performance problems for Doris, so it must be disabled before installation.
Linux file system
The ext4 file system is recommended; choose ext4 when installing the operating system.
Network requirements
Doris instances communicate with each other directly over the network. The table below lists all required ports.
Instance | Port name | Default port | Direction | Description |
---|---|---|---|---|
BE | be_port | 9060 | FE --> BE | thrift server port on the BE, receives requests from the FE |
BE | webserver_port | 8040 | BE <--> BE | http server port on the BE |
BE | heartbeat_service_port | 9050 | FE --> BE | heartbeat service port (thrift) on the BE, receives heartbeats from the FE |
BE | brpc_port | 8060 | FE <--> BE, BE <--> BE | brpc port on the BE, used for communication between BEs |
FE | http_port | 8030 | FE <--> FE, user <--> FE | http server port on the FE |
FE | rpc_port | 9020 | BE --> FE, FE <--> FE | thrift server port on the FE; must be configured identically on every FE |
FE | query_port | 9030 | user <--> FE | mysql server port on the FE |
FE | edit_log_port | 9010 | FE <--> FE | port used for bdbje communication between FEs |
Broker | broker_ipc_port | 8000 | FE --> Broker, BE --> Broker | thrift server on the Broker, receives requests |
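Before starting Doris it can be useful to see whether any of these default ports is already taken. The sketch below filters the output of `ss -ltnH` (listening TCP sockets, numeric, no header) for the ports in the table; `ports_in_use` and `doris_ports` are names made up for this illustration.

```shell
#!/usr/bin/env bash
# Default ports copied from the table above.
doris_ports="9060 8040 9050 8060 8030 9020 9030 9010 8000"

# Read `ss -ltnH` style lines on stdin and print every Doris default port
# that already has a listener. Column 4 is "Local Address:Port".
ports_in_use() {
    awk -v wanted="$doris_ports" '
        BEGIN { n = split(wanted, w, " "); for (i = 1; i <= n; i++) want[w[i]] = 1 }
        { m = split($4, part, ":"); port = part[m]; if (port in want) print port }
    ' | sort -n -u
}

# Usage:
#   ss -ltnH | ports_in_use
```

An empty result means none of the default ports is occupied; otherwise the matching setting in fe.conf or be.conf would need a different value.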
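IP binding
Because of multiple network cards, or virtual interfaces created by software such as docker, one host may have several different IPs. Doris currently cannot identify the usable IP automatically, so when the deployment host has more than one IP, the correct one must be forced through the priority_networks configuration item.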
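priority_networks takes CIDR ranges, so it helps to confirm which of the host's addresses a given range actually matches. The following pure-bash sketch does that check for IPv4 only; `ip_to_int` and `ip_in_cidr` are hypothetical helpers, and the 192.168.88.0/24 range in the usage comment is the one used later in this article.

```shell
#!/usr/bin/env bash
# Convert a dotted IPv4 address to a 32-bit integer.
ip_to_int() {
    local a b c d
    local IFS=.
    read -r a b c d <<< "$1"
    echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Succeed when IPv4 address $1 lies inside CIDR range $2 (e.g. 10.10.10.0/24).
ip_in_cidr() {
    local ip="$1" net="${2%/*}" bits="${2#*/}"
    local mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
    [ $(( $(ip_to_int "$ip") & mask )) -eq $(( $(ip_to_int "$net") & mask )) ]
}

# Usage (hostname -I lists the host's addresses; IPv6 entries are not handled here):
#   for ip in $(hostname -I); do
#       ip_in_cidr "$ip" 192.168.88.0/24 && echo "$ip matches"
#   done
```

Ideally exactly one address matches, in line with the fe.conf comment that at most one IP should match the list.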
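Virtual machine software choices
Following the official documentation, the virtual machine runs Ubuntu 20.04 with a single network card and a fixed IP. The remaining settings can be done step by step from the official documentation while installing Doris.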
Building the virtual machine
As usual, build an Ubuntu 20.04 virtual machine, following one of my earlier blog posts: https://lizhiyong.blog.csdn.net/article/details/126236516
Configure the network card
IPv6 needs to be disabled.
Install the necessary commands
sudo apt install net-tools
sudo apt-get install openssh-server
sudo apt-get install openssh-client
sudo apt install vim
apt install openjdk-8-jdk-headless
At this point MobaXterm can be used to connect to the virtual machine.
System settings
Check the Java version
zhiyong@zhiyong-doris:~$ java -version
Command 'java' not found, but can be installed with:
sudo apt install openjdk-11-jre-headless # version 11.0.16+8-0ubuntu1~20.04, or
sudo apt install default-jre # version 2:1.11-72
sudo apt install openjdk-13-jre-headless # version 13.0.7+5-0ubuntu1~20.04
sudo apt install openjdk-16-jre-headless # version 16.0.1+9-1~20.04
sudo apt install openjdk-17-jre-headless # version 17.0.4+8-1~20.04
sudo apt install openjdk-8-jre-headless # version 8u342-b07-0ubuntu1~20.04
zhiyong@zhiyong-doris:~$ sudo apt install openjdk-8-jre-headless
[sudo] zhiyong 的密码:
正在读取软件包列表... 完成
正在分析软件包的依赖关系树
正在读取状态信息... 完成
将会同时安装下列软件:
ca-certificates-java java-common
建议安装:
default-jre fonts-dejavu-extra fonts-ipafont-gothic fonts-ipafont-mincho fonts-wqy-microhei fonts-wqy-zenhei
下列【新】软件包将被安装:
ca-certificates-java java-common openjdk-8-jre-headless
升级了 0 个软件包,新安装了 3 个软件包,要卸载 0 个软件包,有 290 个软件包未被升级。
需要下载 28.3 MB 的归档。
解压缩后会消耗 104 MB 的额外空间。
您希望继续执行吗? [Y/n] y
zhiyong@zhiyong-doris:~$ java -version
openjdk version "1.8.0_342"
OpenJDK Runtime Environment (build 1.8.0_342-8u342-b07-0ubuntu1~20.04-b07)
OpenJDK 64-Bit Server VM (build 25.342-b07, mixed mode)
Java 1.8 is now installed.
Check the GCC version
zhiyong@zhiyong-doris:~$ gcc
Command 'gcc' not found, but can be installed with:
sudo apt install gcc
zhiyong@zhiyong-doris:~$ sudo apt install gcc
正在读取软件包列表... 完成
正在分析软件包的依赖关系树
正在读取状态信息... 完成
将会同时安装下列软件:
binutils binutils-common binutils-x86-64-linux-gnu cpp-9 gcc-9 gcc-9-base libasan5 libatomic1 libbinutils libc-dev-bin libc6 libc6-dbg libc6-dev
libcrypt-dev libctf-nobfd0 libctf0 libgcc-9-dev libitm1 liblsan0 libquadmath0 libtsan0 libubsan1 linux-libc-dev manpages-dev
建议安装:
binutils-doc gcc-9-locales gcc-multilib make autoconf automake libtool flex bison gcc-doc gcc-9-multilib gcc-9-doc glibc-doc
下列【新】软件包将被安装:
binutils binutils-common binutils-x86-64-linux-gnu gcc gcc-9 libasan5 libatomic1 libbinutils libc-dev-bin libc6-dev libcrypt-dev libctf-nobfd0 libctf0
libgcc-9-dev libitm1 liblsan0 libquadmath0 libtsan0 libubsan1 linux-libc-dev manpages-dev
下列软件包将被升级:
cpp-9 gcc-9-base libc6 libc6-dbg
升级了 4 个软件包,新安装了 21 个软件包,要卸载 0 个软件包,有 286 个软件包未被升级。
需要下载 55.9 MB 的归档。
解压缩后会消耗 74.1 MB 的额外空间。
您希望继续执行吗? [Y/n] y
zhiyong@zhiyong-doris:~$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/9/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none:hsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 9.4.0-1ubuntu1~20.04.1' --with-bugurl=file:///usr/share/doc/gcc-9/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --prefix=/usr --with-gcc-major-version-only --program-suffix=-9 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-plugin --enable-default-pie --with-system-zlib --with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none=/build/gcc-9-Av3uEd/gcc-9-9.4.0/debian/tmp-nvptx/usr,hsa --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 9.4.0 (Ubuntu 9.4.0-1ubuntu1~20.04.1)
GCC 9.4 is now installed.
Check the MySQL client version
zhiyong@zhiyong-doris:~$ mysql
Command 'mysql' not found, but can be installed with:
sudo apt install mysql-client-core-8.0 # version 8.0.30-0ubuntu0.20.04.2, or
sudo apt install mariadb-client-core-10.3 # version 1:10.3.34-0ubuntu0.20.04.1
zhiyong@zhiyong-doris:~$ sudo apt install mysql-client-core-8.0
正在读取软件包列表... 完成
正在分析软件包的依赖关系树
正在读取状态信息... 完成
下列【新】软件包将被安装:
mysql-client-core-8.0
升级了 0 个软件包,新安装了 1 个软件包,要卸载 0 个软件包,有 286 个软件包未被升级。
需要下载 4,521 kB 的归档。
解压缩后会消耗 67.3 MB 的额外空间。
获取:1 http://mirrors.aliyun.com/ubuntu focal-updates/main amd64 mysql-client-core-8.0 amd64 8.0.30-0ubuntu0.20.04.2 [4,521 kB]
已下载 4,521 kB,耗时 1秒 (4,289 kB/s)
正在选中未选择的软件包 mysql-client-core-8.0。
(正在读取数据库 ... 系统当前共安装有 154483 个文件和目录。)
准备解压 .../mysql-client-core-8.0_8.0.30-0ubuntu0.20.04.2_amd64.deb ...
正在解压 mysql-client-core-8.0 (8.0.30-0ubuntu0.20.04.2) ...
正在设置 mysql-client-core-8.0 (8.0.30-0ubuntu0.20.04.2) ...
正在处理用于 man-db (2.9.1-1) 的触发器 ...
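The MySQL client is now installed. It will be needed later, when deploying the BE, to register the BE with the FE.
Clock synchronization
A single-machine installation can skip this step.
Disable swap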
zhiyong@zhiyong-doris:~$ free -m
总计 已用 空闲 共享 缓冲/缓存 可用
内存: 32071 1316 28964 10 1790 30324
交换: 2047 0 2047
Swap is currently enabled, which would hurt Doris performance.
zhiyong@zhiyong-doris:~$ sudo su root
root@zhiyong-doris:/home/zhiyong# vim /etc/fstab
root@zhiyong-doris:/home/zhiyong# cat /etc/fstab
# /etc/fstab: static file system information.
#
# Use 'blkid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
# <file system> <mount point> <type> <options> <dump> <pass>
# / was on /dev/sda5 during installation
UUID=abc5a4f3-3280-421e-8e6d-4132ad1f2c1a / ext4 errors=remount-ro 0 1
# /boot/efi was on /dev/sda1 during installation
UUID=2D2A-8B96 /boot/efi vfat umask=0077 0 1
#/swapfile none swap sw 0 0
Comment out the /swapfile line as shown, then reboot.
zhiyong@zhiyong-doris:~$ free -m
总计 已用 空闲 共享 缓冲/缓存 可用
内存: 32071 1070 30314 2 685 30615
交换: 0 0 0
zhiyong@zhiyong-doris:~$ sudo swapon --show
[sudo] zhiyong 的密码:
zhiyong@zhiyong-doris:~$ sudo swapon --show
zhiyong@zhiyong-doris:~$
Swap is now off, and the base environment is ready.
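The manual fstab edit above can also be scripted. The sketch below only prints the modified content and leaves the file untouched, so the result can be reviewed before anything is overwritten; `comment_swap_entries` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Print the given fstab file with every active swap entry commented out.
# Lines that already start with '#' are left as they are.
comment_swap_entries() {
    sed -E '/^[[:space:]]*[^#[:space:]].*[[:space:]]swap[[:space:]]/ s/^/#/' "$1"
}

# Usage (review the output before replacing /etc/fstab):
#   comment_swap_entries /etc/fstab
```

For the running system, `sudo swapoff -a` turns swap off immediately without a reboot; the fstab change is what keeps it off after the next boot.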
Doris deployment
Now the actual Doris deployment begins.
zhiyong@zhiyong-doris:~$ sudo su root
root@zhiyong-doris:/home/zhiyong# cd /
root@zhiyong-doris:/# df -Th
文件系统 类型 容量 已用 可用 已用% 挂载点
udev devtmpfs 16G 0 16G 0% /dev
tmpfs tmpfs 3.2G 1.9M 3.2G 1% /run
/dev/sda5 ext4 294G 8.4G 271G 4% /
tmpfs tmpfs 16G 0 16G 0% /dev/shm
tmpfs tmpfs 5.0M 4.0K 5.0M 1% /run/lock
tmpfs tmpfs 16G 0 16G 0% /sys/fs/cgroup
/dev/loop0 squashfs 128K 128K 0 100% /snap/bare/5
/dev/loop1 squashfs 62M 62M 0 100% /snap/core20/1328
/dev/loop2 squashfs 66M 66M 0 100% /snap/gtk-common-themes/1519
/dev/loop3 squashfs 44M 44M 0 100% /snap/snapd/14978
/dev/loop4 squashfs 55M 55M 0 100% /snap/snap-store/558
/dev/loop5 squashfs 249M 249M 0 100% /snap/gnome-3-38-2004/99
/dev/sda1 vfat 511M 4.0K 511M 1% /boot/efi
tmpfs tmpfs 3.2G 20K 3.2G 1% /run/user/1000
root@zhiyong-doris:/# mkdir -p /export/software
root@zhiyong-doris:/# mkdir -p /export/server
root@zhiyong-doris:/# chmod -R 777 /export/
root@zhiyong-doris:/# cd /export/software/
root@zhiyong-doris:/export/software# ll
总用量 398792
drwxrwxrwx 2 root root 4096 8月 14 21:07 ./
drwxrwxrwx 4 root root 4096 8月 14 21:04 ../
-rw-rw-r-- 1 zhiyong zhiyong 408349973 8月 14 21:08 apache-doris-1.1.1-bin-x86.tar.gz
root@zhiyong-doris:/export/software# tar -zxvf apache-doris-1.1.1-bin-x86.tar.gz -C /export/server/
root@zhiyong-doris:/export/software# cd /export/server/
root@zhiyong-doris:/export/server# ll
总用量 12
drwxrwxrwx 3 root root 4096 8月 14 21:09 ./
drwxrwxrwx 4 root root 4096 8月 14 21:04 ../
drwxrwxr-x 7 1020 1020 4096 7月 25 15:11 apache-doris-1.1.1-bin-x86/
The installation package has now been uploaded and unpacked.
Environment variables
root@zhiyong-doris:/# vim /etc/profile.d/doris.sh
root@zhiyong-doris:/# cat /etc/profile.d/doris.sh
export DORIS_HOME=/export/server/apache-doris-1.1.1-bin-x86
export PATH=$PATH:$DORIS_HOME/fe/bin:$DORIS_HOME/be/bin
root@zhiyong-doris:/# source /etc/profile.d/doris.sh
FE deployment
Copy files
The files are already in the right location, so no copy is needed.
Configure the FE
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/fe/conf# pwd
/export/server/apache-doris-1.1.1-bin-x86/fe/conf
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/fe/conf# ll
总用量 12
drwxr-xr-x 2 1020 1020 4096 7月 25 15:11 ./
drwxr-xr-x 8 1020 1020 4096 7月 25 15:11 ../
-rw-rw-r-- 1 1020 1020 2889 6月 13 13:59 fe.conf
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/fe/conf# vim fe.conf
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/fe/conf# cat fe.conf
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
#####################################################################
## The uppercase properties are read and exported by bin/start_fe.sh.
## To see all Frontend configurations,
## see fe/src/org/apache/doris/common/Config.java
#####################################################################
# the output dir of stderr and stdout
LOG_DIR = ${DORIS_HOME}/log
DATE = `date +%Y%m%d-%H%M%S`
JAVA_OPTS="-Xmx8192m -XX:+UseMembar -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=7 -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+CMSClassUnloadingEnabled -XX:-CMSParallelRemarkEnabled -XX:CMSInitiatingOccupancyFraction=80 -XX:SoftRefLRUPolicyMSPerMB=0 -Xloggc:$DORIS_HOME/log/fe.gc.log.$DATE"
# For jdk 9+, this JAVA_OPTS will be used as default JVM options
JAVA_OPTS_FOR_JDK_9="-Xmx8192m -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=7 -XX:+CMSClassUnloadingEnabled -XX:-CMSParallelRemarkEnabled -XX:CMSInitiatingOccupancyFraction=80 -XX:SoftRefLRUPolicyMSPerMB=0 -Xlog:gc*:$DORIS_HOME/log/fe.gc.log.$DATE:time"
##
## the lowercase properties are read by main program.
##
# INFO, WARN, ERROR, FATAL
sys_log_level = INFO
# store metadata, must be created before start FE.
# Default value is ${DORIS_HOME}/doris-meta
# meta_dir = ${DORIS_HOME}/doris-meta
meta_dir = /dorisdata/doris-meta
http_port = 8030
rpc_port = 9020
query_port = 9030
edit_log_port = 9010
mysql_service_nio_enabled = true
# Choose one if there are more than one ip except loopback address.
# Note that there should at most one ip match this list.
# If no ip match this rule, will choose one randomly.
# use CIDR format, e.g. 10.10.10.0/24
# Default value is empty.
# priority_networks = 10.10.10.0/24;192.168.0.0/16
priority_networks = 192.168.88.0/24
# Advanced configurations
# log_roll_size_mb = 1024
# sys_log_dir = ${DORIS_HOME}/log
# sys_log_roll_num = 10
# sys_log_verbose_modules = org.apache.doris
# audit_log_dir = ${DORIS_HOME}/log
# audit_log_modules = slow_query, query
# audit_log_roll_num = 10
# meta_delay_toleration_second = 10
# qe_max_connection = 1024
# max_conn_per_user = 100
# qe_query_timeout_second = 300
# qe_slow_log_ms = 5000
Based on meta_dir = /dorisdata/doris-meta specified here, create the Doris metadata directory by hand:
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/fe/conf# mkdir -p /dorisdata/doris-meta
Doris now has a directory in which to store its metadata.
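To avoid a mismatch between the configured value and the directory that is created, the path can be read back from the configuration file instead of being typed twice. A minimal sketch, assuming simple `key = value` lines as in the fe.conf shown above; `conf_get` is a hypothetical helper and does not expand variables such as ${DORIS_HOME}.

```shell
#!/usr/bin/env bash
# Print the value of KEY from a "key = value" style config file.
# Commented-out lines are ignored; if the key appears twice, the last one wins.
conf_get() {
    local file="$1" key="$2"
    sed -n -E "s/^[[:space:]]*${key}[[:space:]]*=[[:space:]]*(.*[^[:space:]])[[:space:]]*$/\1/p" "$file" | tail -n 1
}

# Usage:
#   mkdir -p "$(conf_get "$DORIS_HOME/fe/conf/fe.conf" meta_dir)"
```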
Start the FE
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/fe/bin# pwd
/export/server/apache-doris-1.1.1-bin-x86/fe/bin
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/fe/bin# ll
总用量 20
drwxr-xr-x 2 1020 1020 4096 7月 25 15:11 ./
drwxr-xr-x 8 1020 1020 4096 7月 25 15:11 ../
-rwxrwxr-x 1 1020 1020 4455 7月 24 09:57 start_fe.sh*
-rwxrwxr-x 1 1020 1020 1568 3月 24 16:59 stop_fe.sh*
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/fe/bin# ./start_fe.sh --daemon
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/fe/bin# jps
3406 PaloFe
3791 Jps
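The FE has started normally: jps shows the PaloFe process.
BE deployment
Copy files
The files are already in the right location, so no copy is needed.
Configure the BE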
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/be/conf# pwd
/export/server/apache-doris-1.1.1-bin-x86/be/conf
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/be/conf# ll
总用量 16
drwxr-xr-x 2 1020 1020 4096 7月 25 15:11 ./
drwxr-xr-x 8 1020 1020 4096 7月 25 15:11 ../
-rw-r--r-- 1 1020 1020 2207 7月 24 09:57 be.conf
-rw-r--r-- 1 1020 1020 1529 3月 24 16:59 odbcinst.ini
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/be/conf# vim be.conf
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/be/conf# cat be.conf
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
PPROF_TMPDIR="$DORIS_HOME/log/"
# INFO, WARNING, ERROR, FATAL
sys_log_level = INFO
# ports for admin, web, heartbeat service
be_port = 9060
webserver_port = 8040
heartbeat_service_port = 9050
brpc_port = 8060
# Choose one if there are more than one ip except loopback address.
# Note that there should at most one ip match this list.
# If no ip match this rule, will choose one randomly.
# use CIDR format, e.g. 10.10.10.0/24
# Default value is empty.
# priority_networks = 10.10.10.0/24;192.168.0.0/16
priority_networks = 192.168.88.0/24
# data root path, separate by ';'
# you can specify the storage medium of each root path, HDD or SSD
# you can add capacity limit at the end of each root path, seperate by ','
# eg:
# storage_root_path = /home/disk1/doris.HDD,50;/home/disk2/doris.SSD,1;/home/disk2/doris
# /home/disk1/doris.HDD, capacity limit is 50GB, HDD;
# /home/disk2/doris.SSD, capacity limit is 1GB, SSD;
# /home/disk2/doris, capacity limit is disk capacity, HDD(default)
#
# you also can specify the properties by setting '<property>:<value>', seperate by ','
# property 'medium' has a higher priority than the extension of path
#
# Default value is ${DORIS_HOME}/storage, you should create it by hand.
# storage_root_path = ${DORIS_HOME}/storage
storage_root_path = /dorisdata/storage.SSD
# Advanced configurations
# sys_log_dir = ${DORIS_HOME}/log
# sys_log_roll_mode = SIZE-MB-1024
# sys_log_roll_num = 10
# sys_log_verbose_modules = *
# log_buffer_level = -1
# palo_cgroups
Following the modified storage_root_path = /dorisdata/storage.SSD, create the BE storage path by hand:
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/be/conf# mkdir -p /dorisdata/storage
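The path for storing data blocks is now configured.
BE webserver_port configuration
From the official site: if the BE is deployed inside a hadoop cluster, remember to adjust webserver_port = 8040 in be.conf to avoid a port conflict. My machine is a standalone Linux virtual machine, so the problem does not arise and this item does not need to change.
Add all BE nodes in the FE
The MySQL client is already installed, so: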
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/be/conf# mysql -h 192.168.88.21 -P 9030 -uroot
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 0
Server version: 5.7.37 Doris version 1.1.1-rc03-2dbd70bf9
Copyright (c) 2000, 2022, Oracle and/or its affiliates.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> show databases;
+--------------------+
| Database |
+--------------------+
| information_schema |
+--------------------+
1 row in set (0.02 sec)
mysql> use information_schema;
No connection. Trying to reconnect...
Connection id: 1
Current database: *** NONE ***
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> show tables;
+------------------------------+
| Tables_in_information_schema |
+------------------------------+
| tables |
| table_privileges |
| schema_privileges |
| user_privileges |
| referential_constraints |
| key_column_usage |
| routines |
| schemata |
| session_variables |
| global_variables |
| columns |
| character_sets |
| collations |
| table_constraints |
| engines |
| views |
| statistics |
| files |
| partitions |
+------------------------------+
19 rows in set (0.00 sec)
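The connection uses the FE settings configured earlier (query_port 9030); the default user is root with no password.
Although the client is MySQL 8, the Doris FE reports server version 5.7.37. This is the version string Doris announces for MySQL protocol compatibility rather than a real MySQL server, and the 8.0 client works with it.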
mysql> ALTER SYSTEM ADD BACKEND "192.168.88.21:9050";
Query OK, 0 rows affected (0.09 sec)
The BE has now been added to the FE.
Start the BE
zhiyong@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/be/bin$ pwd
/export/server/apache-doris-1.1.1-bin-x86/be/bin
zhiyong@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/be/bin$ sudo su root
[sudo] zhiyong 的密码:
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/be/bin# pwd
/export/server/apache-doris-1.1.1-bin-x86/be/bin
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/be/bin# ./start_be.sh --daemon
Check the BE status
After connecting to the FE with the MySQL client:
mysql> SHOW PROC '/backends';
+-----------+-----------------+---------------+---------------+---------------+--------+----------+----------+---------------+---------------+-------+----------------------+-----------------------+-----------+------------------+---------------+---------------+---------+----------------+--------------------------+--------------------------------------------------------------+---------+---------------------------------------------------------------------------------------------------------------+
| BackendId | Cluster | IP | HostName | HeartbeatPort | BePort | HttpPort | BrpcPort | LastStartTime | LastHeartbeat | Alive | SystemDecommissioned | ClusterDecommissioned | TabletNum | DataUsedCapacity | AvailCapacity | TotalCapacity | UsedPct | MaxDiskUsedPct | Tag | ErrMsg | Version | Status |
+-----------+-----------------+---------------+---------------+---------------+--------+----------+----------+---------------+---------------+-------+----------------------+-----------------------+-----------+------------------+---------------+---------------+---------+----------------+--------------------------+--------------------------------------------------------------+---------+---------------------------------------------------------------------------------------------------------------+
| 10002 | default_cluster | 192.168.88.21 | 192.168.88.21 | 9050 | -1 | -1 | -1 | NULL | NULL | false | false | false | 0 | 0.000 | 1.000 B | 0.000 | 0.00 % | 0.00 % | {"location" : "default"} | java.net.ConnectException: 拒绝连接 (Connection refused) | | {"lastSuccessReportTabletsTime":"N/A","lastStreamLoadTime":-1,"isQueryDisabled":false,"isLoadDisabled":false} |
+-----------+-----------------+---------------+---------------+---------------+--------+----------+----------+---------------+---------------+-------+----------------------+-----------------------+-----------+------------------+---------------+---------------+---------+----------------+--------------------------+--------------------------------------------------------------+---------+---------------------------------------------------------------------------------------------------------------+
1 row in set (0.22 sec)
The Alive column is false and ErrMsg reports Connection refused, so the BE failed to start.
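The wide table is hard to read in a terminal. With the mysql client's `--batch` option the same result comes back tab-separated with a header row, which a small filter can reduce to the backends that are not alive. This is a sketch under that assumption; `dead_backends` is a hypothetical helper, and the column names are the ones visible in the output above.

```shell
#!/usr/bin/env bash
# Read tab-separated SHOW PROC '/backends' output (header row first) on stdin
# and print "IP:HeartbeatPort<TAB>ErrMsg" for every backend whose Alive is not true.
dead_backends() {
    awk -F'\t' '
        NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
        $col["Alive"] != "true" { print $col["IP"] ":" $col["HeartbeatPort"] "\t" $col["ErrMsg"] }
    '
}

# Usage:
#   mysql -h 192.168.88.21 -P 9030 -uroot --batch -e "SHOW PROC '/backends'" | dead_backends
```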
Fixing the BE
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/be/log# pwd
/export/server/apache-doris-1.1.1-bin-x86/be/log
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/be/log# ll
总用量 12
drwxrwxr-x 2 1020 1020 4096 8月 14 21:49 ./
drwxr-xr-x 8 1020 1020 4096 7月 25 15:11 ../
-rw-r--r-- 1 root root 3592 8月 14 21:52 be.out
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/be/log# cat be.out
start time: 2022年 08月 14日 星期日 21:49:50 CST
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0814 21:49:50.916643 4846 env.cpp:46] Env init successfully.
W0814 21:49:50.916720 4846 options.cpp:65] path can not be canonicalized. may be not exist. path=/dorisdata/storage.SSD
W0814 21:49:50.916730 4846 options.cpp:142] failed to parse store path /dorisdata/storage.SSD, res=-203
W0814 21:49:50.916736 4846 options.cpp:146] fail to parse storage_root_path config. value=[/dorisdata/storage.SSD]
F0814 21:49:50.916743 4846 doris_main.cpp:357] parse config storage path failed, path=/dorisdata/storage.SSD
*** Check failure stack trace: ***
*** Aborted at 1660484990 (unix time) try "date -d @1660484990" if you are using GNU date ***
*** SIGABRT unkown detail explain (@0x12ee) received by PID 4846 (TID 0x7f671f1d42c0) from PID 4846; stack trace: ***
0# 0x000056125E31B768 in /export/server/apache-doris-1.1.1-bin-x86/be/lib/doris_be
1# 0x00007F671F23E090 in /lib/x86_64-linux-gnu/libc.so.6
2# raise at ../sysdeps/unix/sysv/linux/raise.c:51
3# abort at /build/glibc-SzIz7B/glibc-2.31/stdlib/abort.c:81
4# 0x000056125DFF1817 in /export/server/apache-doris-1.1.1-bin-x86/be/lib/doris_be
5# 0x0000561260435C4D in /export/server/apache-doris-1.1.1-bin-x86/be/lib/doris_be
6# google::LogMessage::SendToLog() in /export/server/apache-doris-1.1.1-bin-x86/be/lib/doris_be
7# google::LogMessage::Flush() in /export/server/apache-doris-1.1.1-bin-x86/be/lib/doris_be
8# google::LogMessageFatal::~LogMessageFatal() in /export/server/apache-doris-1.1.1-bin-x86/be/lib/doris_be
9# main in /export/server/apache-doris-1.1.1-bin-x86/be/lib/doris_be
10# __libc_start_main at ../csu/libc-start.c:342
11# _start in /export/server/apache-doris-1.1.1-bin-x86/be/lib/doris_be
start time: 2022年 08月 14日 星期日 21:52:28 CST
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0814 21:52:28.101946 5285 env.cpp:46] Env init successfully.
W0814 21:52:28.102023 5285 options.cpp:65] path can not be canonicalized. may be not exist. path=/dorisdata/storage.SSD
W0814 21:52:28.102032 5285 options.cpp:142] failed to parse store path /dorisdata/storage.SSD, res=-203
W0814 21:52:28.102037 5285 options.cpp:146] fail to parse storage_root_path config. value=[/dorisdata/storage.SSD]
F0814 21:52:28.102043 5285 doris_main.cpp:357] parse config storage path failed, path=/dorisdata/storage.SSD
*** Check failure stack trace: ***
*** Aborted at 1660485148 (unix time) try "date -d @1660485148" if you are using GNU date ***
*** SIGABRT unkown detail explain (@0x14a5) received by PID 5285 (TID 0x7fc79e7902c0) from PID 5285; stack trace: ***
0# 0x000055F6AA31B768 in /export/server/apache-doris-1.1.1-bin-x86/be/lib/doris_be
1# 0x00007FC79E7FA090 in /lib/x86_64-linux-gnu/libc.so.6
2# raise at ../sysdeps/unix/sysv/linux/raise.c:51
3# abort at /build/glibc-SzIz7B/glibc-2.31/stdlib/abort.c:81
4# 0x000055F6A9FF1817 in /export/server/apache-doris-1.1.1-bin-x86/be/lib/doris_be
5# 0x000055F6AC435C4D in /export/server/apache-doris-1.1.1-bin-x86/be/lib/doris_be
6# google::LogMessage::SendToLog() in /export/server/apache-doris-1.1.1-bin-x86/be/lib/doris_be
7# google::LogMessage::Flush() in /export/server/apache-doris-1.1.1-bin-x86/be/lib/doris_be
8# google::LogMessageFatal::~LogMessageFatal() in /export/server/apache-doris-1.1.1-bin-x86/be/lib/doris_be
9# main in /export/server/apache-doris-1.1.1-bin-x86/be/lib/doris_be
10# __libc_start_main at ../csu/libc-start.c:342
11# _start in /export/server/apache-doris-1.1.1-bin-x86/be/lib/doris_be
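The log shows two start attempts that both failed for the same reason: the path /dorisdata/storage.SSD does not exist. The directory created earlier was /dorisdata/storage, without the .SSD suffix, while the BE tries to resolve the configured path literally. Let's try another way of specifying the medium.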
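Judging from the be.out warnings, the BE in this version resolves each storage_root_path entry as written, suffix included, so the directory to create is whatever precedes the first comma. The sketch below extracts those directories from a storage_root_path value; `storage_dirs` is a hypothetical helper and assumes paths without spaces.

```shell
#!/usr/bin/env bash
# Print one directory per line for a storage_root_path value such as
# "/home/disk1/doris.HDD,50;/home/disk2/doris.SSD,1;/home/disk2/doris".
storage_dirs() {
    local entries entry
    IFS=';' read -r -a entries <<< "$1"
    for entry in "${entries[@]}"; do
        entry="${entry%%,*}"             # drop ",50" or ",medium:ssd" style properties
        entry="${entry//[[:space:]]/}"   # strip stray whitespace
        if [ -n "$entry" ]; then
            echo "$entry"
        fi
    done
}

# Usage:
#   storage_dirs "/dorisdata/storage.SSD" | while read -r dir; do mkdir -p "$dir"; done
```

With the original setting this would have created /dorisdata/storage.SSD rather than /dorisdata/storage.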
Reconfigure the BE storage path:
root@zhiyong-doris:/dorisdata/storage# cd /export/server/apache-doris-1.1.1-bin-x86/be/conf
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/be/conf# vim be.conf
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/be/conf# cat be.conf
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
PPROF_TMPDIR="$DORIS_HOME/log/"
# INFO, WARNING, ERROR, FATAL
sys_log_level = INFO
# ports for admin, web, heartbeat service
be_port = 9060
webserver_port = 8040
heartbeat_service_port = 9050
brpc_port = 8060
# Choose one if there are more than one ip except loopback address.
# Note that there should at most one ip match this list.
# If no ip match this rule, will choose one randomly.
# use CIDR format, e.g. 10.10.10.0/24
# Default value is empty.
# priority_networks = 10.10.10.0/24;192.168.0.0/16
# data root path, separate by ';'
# you can specify the storage medium of each root path, HDD or SSD
# you can add capacity limit at the end of each root path, seperate by ','
# eg:
# storage_root_path = /home/disk1/doris.HDD,50;/home/disk2/doris.SSD,1;/home/disk2/doris
# /home/disk1/doris.HDD, capacity limit is 50GB, HDD;
# /home/disk2/doris.SSD, capacity limit is 1GB, SSD;
# /home/disk2/doris, capacity limit is disk capacity, HDD(default)
#
# you also can specify the properties by setting '<property>:<value>', seperate by ','
# property 'medium' has a higher priority than the extension of path
#
# Default value is ${DORIS_HOME}/storage, you should create it by hand.
# storage_root_path = ${DORIS_HOME}/storage
#storage_root_path = /dorisdata/storage.SSD
storage_root_path=/dorisdata/storage,medium:ssd
# Advanced configurations
# sys_log_dir = ${DORIS_HOME}/log
# sys_log_roll_mode = SIZE-MB-1024
# sys_log_roll_num = 10
# sys_log_verbose_modules = *
# log_buffer_level = -1
# palo_cgroups
After a restart the BE still fails, with the following in be.out:
start time: 2022年 08月 14日 星期日 22:02:36 CST
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0814 22:02:36.638031 5724 env.cpp:46] Env init successfully.
*** Check failure stack trace: ***
@ 0x55ac28235c4d google::LogMessage::Fail()
@ 0x55ac28238189 google::LogMessage::SendToLog()
@ 0x55ac282357b6 google::LogMessage::Flush()
@ 0x55ac282387f9 google::LogMessageFatal::~LogMessageFatal()
@ 0x55ac25f04c4f main
@ 0x7fe7b402c083 __libc_start_main
@ 0x55ac26118fea _start
@ (nil) (unknown)
*** Aborted at 1660485756 (unix time) try "date -d @1660485756" if you are using GNU date ***
*** SIGABRT unkown detail explain (@0x165c) received by PID 5724 (TID 0x7fe7b3fe12c0) from PID 5724; stack trace: ***
0# 0x000055AC2611B768 in /export/server/apache-doris-1.1.1-bin-x86/be/lib/doris_be
1# 0x00007FE7B404B090 in /lib/x86_64-linux-gnu/libc.so.6
2# raise at ../sysdeps/unix/sysv/linux/raise.c:51
3# abort at /build/glibc-SzIz7B/glibc-2.31/stdlib/abort.c:81
4# 0x000055AC25DF2124 in /export/server/apache-doris-1.1.1-bin-x86/be/lib/doris_be
5# 0x000055AC28235C4D in /export/server/apache-doris-1.1.1-bin-x86/be/lib/doris_be
6# google::LogMessage::SendToLog() in /export/server/apache-doris-1.1.1-bin-x86/be/lib/doris_be
7# google::LogMessage::Flush() in /export/server/apache-doris-1.1.1-bin-x86/be/lib/doris_be
8# google::LogMessageFatal::~LogMessageFatal() in /export/server/apache-doris-1.1.1-bin-x86/be/lib/doris_be
9# main in /export/server/apache-doris-1.1.1-bin-x86/be/lib/doris_be
10# __libc_start_main at ../csu/libc-start.c:342
11# _start in /export/server/apache-doris-1.1.1-bin-x86/be/lib/doris_be
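Surprisingly, this change did resolve the unrecognized-path problem: the storage path warnings are gone from be.out. The BE still aborts, though, so look at the other log files: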
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/be/log# ll
总用量 28
drwxrwxr-x 2 1020 1020 4096 8月 14 22:02 ./
drwxr-xr-x 8 1020 1020 4096 7月 25 15:11 ../
lrwxrwxrwx 1 root root 27 8月 14 22:02 be.INFO -> be.INFO.log.20220814-220236
-rw-r--r-- 1 root root 4366 8月 14 22:25 be.INFO.log.20220814-220236
-rw-r--r-- 1 root root 7038 8月 14 22:25 be.out
lrwxrwxrwx 1 root root 30 8月 14 22:02 be.WARNING -> be.WARNING.log.20220814-220236
-rw-r--r-- 1 root root 1086 8月 14 22:25 be.WARNING.log.20220814-220236
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/be/log# cat be.WARNING.log.20220814-220236
E0814 22:02:36.710678 5724 storage_engine.cpp:426] File descriptor number is less than 60000. Please use (ulimit -n) to set a value equal or greater than 60000
W0814 22:02:36.710800 5724 storage_engine.cpp:188] check fd number failed, error: Internal error: file descriptors limit is too small
W0814 22:02:36.710812 5724 storage_engine.cpp:102] open engine failed, error: Internal error: file descriptors limit is too small
F0814 22:02:36.711380 5724 doris_main.cpp:405] fail to open StorageEngine, res=file descriptors limit is too small
E0814 22:25:04.640094 7357 storage_engine.cpp:426] File descriptor number is less than 60000. Please use (ulimit -n) to set a value equal or greater than 60000
W0814 22:25:04.640141 7357 storage_engine.cpp:188] check fd number failed, error: Internal error: file descriptors limit is too small
W0814 22:25:04.640148 7357 storage_engine.cpp:102] open engine failed, error: Internal error: file descriptors limit is too small
F0814 22:25:04.640600 7357 doris_main.cpp:405] fail to open StorageEngine, res=file descriptors limit is too small
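The log shows that the file descriptor limit is too low, so the BE refuses to start. The limit must be at least 60000 before the BE will start normally. Following the official documentation: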
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/be/log# vi /etc/security/limits.conf
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/be/log# cat /etc/security/limits.conf
# /etc/security/limits.conf
#
#Each line describes a limit for a user in the form:
#
#<domain> <type> <item> <value>
#
#Where:
#<domain> can be:
# - a user name
# - a group name, with @group syntax
# - the wildcard *, for default entry
# - the wildcard %, can be also used with %group syntax,
# for maxlogin limit
# - NOTE: group and wildcard limits are not applied to root.
# To apply a limit to the root user, <domain> must be
# the literal username root.
#
#<type> can have the two values:
# - "soft" for enforcing the soft limits
# - "hard" for enforcing hard limits
#
#<item> can be one of the following:
# - core - limits the core file size (KB)
# - data - max data size (KB)
# - fsize - maximum filesize (KB)
# - memlock - max locked-in-memory address space (KB)
# - nofile - max number of open file descriptors
# - rss - max resident set size (KB)
# - stack - max stack size (KB)
# - cpu - max CPU time (MIN)
# - nproc - max number of processes
# - as - address space limit (KB)
# - maxlogins - max number of logins for this user
# - maxsyslogins - max number of logins on the system
# - priority - the priority to run user process with
# - locks - max number of file locks the user can hold
# - sigpending - max number of pending signals
# - msgqueue - max memory used by POSIX message queues (bytes)
# - nice - max nice priority allowed to raise to values: [-20, 19]
# - rtprio - max realtime priority
# - chroot - change root to directory (Debian-specific)
#
#<domain> <type> <item> <value>
#
#* soft core 0
#root hard core 100000
#* hard rss 10000
#@student hard nproc 20
#@faculty soft nproc 20
#@faculty hard nproc 50
#ftp hard nproc 0
#ftp - chroot /ftp
#@student - maxlogins 4
* soft nofile 65536
* hard nofile 65536
# End of file
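Before restarting the BE, it may be worth confirming in a fresh login session that the new limit is actually in effect. The following is a minimal sketch: the threshold of 60000 comes from the BE log message, while the function name `nofile_ok` is made up for illustration.

```shell
#!/usr/bin/env bash
# Threshold taken from the BE log: "equal or greater than 60000".
REQUIRED_NOFILE=60000

# nofile_ok LIMIT: succeed when LIMIT meets the threshold.
# "unlimited" is accepted as sufficient.
nofile_ok() {
    local limit="$1"
    if [ "$limit" = "unlimited" ]; then
        return 0
    fi
    [ "$limit" -ge "$REQUIRED_NOFILE" ]
}

# Check the soft limit of the current shell, which is what a process
# started from this shell would inherit.
current="$(ulimit -n)"
if nofile_ok "$current"; then
    echo "open files limit ${current} is sufficient"
else
    echo "open files limit ${current} is below ${REQUIRED_NOFILE}"
fi
```

Running this as the user that will start the BE shows immediately whether the edit to limits.conf reached that user's session.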
After rebooting Ubuntu, the problem persists:
E0814 22:38:35.834836 4142 storage_engine.cpp:426] File descriptor number is less than 60000. Please use (ulimit -n) to set a value equal or greater than 60000
W0814 22:38:35.835243 4142 storage_engine.cpp:188] check fd number failed, error: Internal error: file descriptors limit is too small
W0814 22:38:35.835258 4142 storage_engine.cpp:102] open engine failed, error: Internal error: file descriptors limit is too small
F0814 22:38:35.835891 4142 doris_main.cpp:405] fail to open StorageEngine, res=file descriptors limit is too small
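The descriptor limit is clearly still too low, so the official documentation is not entirely reliable on this point.
Modify the limit configuration again: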
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/be/log# vim /etc/security/limits.conf
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/be/log# cat /etc/security/limits.conf
# /etc/security/limits.conf
#
#Each line describes a limit for a user in the form:
#
#<domain> <type> <item> <value>
#
#Where:
#<domain> can be:
# - a user name
# - a group name, with @group syntax
# - the wildcard *, for default entry
# - the wildcard %, can be also used with %group syntax,
# for maxlogin limit
# - NOTE: group and wildcard limits are not applied to root.
# To apply a limit to the root user, <domain> must be
# the literal username root.
#
#<type> can have the two values:
# - "soft" for enforcing the soft limits
# - "hard" for enforcing hard limits
#
#<item> can be one of the following:
# - core - limits the core file size (KB)
# - data - max data size (KB)
# - fsize - maximum filesize (KB)
# - memlock - max locked-in-memory address space (KB)
# - nofile - max number of open file descriptors
# - rss - max resident set size (KB)
# - stack - max stack size (KB)
# - cpu - max CPU time (MIN)
# - nproc - max number of processes
# - as - address space limit (KB)
# - maxlogins - max number of logins for this user
# - maxsyslogins - max number of logins on the system
# - priority - the priority to run user process with
# - locks - max number of file locks the user can hold
# - sigpending - max number of pending signals
# - msgqueue - max memory used by POSIX message queues (bytes)
# - nice - max nice priority allowed to raise to values: [-20, 19]
# - rtprio - max realtime priority
# - chroot - change root to directory (Debian-specific)
#
#<domain> <type> <item> <value>
#
#* soft core 0
#root hard core 100000
#* hard rss 10000
#@student hard nproc 20
#@faculty soft nproc 20
#@faculty hard nproc 50
#ftp hard nproc 0
#ftp - chroot /ftp
#@student - maxlogins 4
* soft nofile 204800
* hard nofile 204800
* soft nproc 204800
* hard nproc 204800
# End of file
Then modify another file:
root@zhiyong-doris:/# vim /etc/sysctl.conf
root@zhiyong-doris:/# cat /etc/sysctl.conf
#
# /etc/sysctl.conf - Configuration file for setting system variables
# See /etc/sysctl.d/ for additional system variables.
# See sysctl.conf (5) for information.
#
#kernel.domainname = example.com
# Uncomment the following to stop low-level messages on console
#kernel.printk = 3 4 1 3
##############################################################3
# Functions previously found in netbase
#
# Uncomment the next two lines to enable Spoof protection (reverse-path filter)
# Turn on Source Address Verification in all interfaces to
# prevent some spoofing attacks
#net.ipv4.conf.default.rp_filter=1
#net.ipv4.conf.all.rp_filter=1
# Uncomment the next line to enable TCP/IP SYN cookies
# See http://lwn.net/Articles/277146/
# Note: This may impact IPv6 TCP sessions too
#net.ipv4.tcp_syncookies=1
# Uncomment the next line to enable packet forwarding for IPv4
#net.ipv4.ip_forward=1
# Uncomment the next line to enable packet forwarding for IPv6
# Enabling this option disables Stateless Address Autoconfiguration
# based on Router Advertisements for this host
#net.ipv6.conf.all.forwarding=1
###################################################################
# Additional settings - these settings can improve the network
# security of the host and prevent against some network attacks
# including spoofing attacks and man in the middle attacks through
# redirection. Some network environments, however, require that these
# settings are disabled so review and enable them as needed.
#
# Do not accept ICMP redirects (prevent MITM attacks)
#net.ipv4.conf.all.accept_redirects = 0
#net.ipv6.conf.all.accept_redirects = 0
# _or_
# Accept ICMP redirects only for gateways listed in our default
# gateway list (enabled by default)
# net.ipv4.conf.all.secure_redirects = 1
#
# Do not send ICMP redirects (we are not a router)
#net.ipv4.conf.all.send_redirects = 0
#
# Do not accept IP source route packets (we are not a router)
#net.ipv4.conf.all.accept_source_route = 0
#net.ipv6.conf.all.accept_source_route = 0
#
# Log Martian Packets
#net.ipv4.conf.all.log_martians = 1
#
###################################################################
# Magic system request Key
# 0=disable, 1=enable all, >1 bitmask of sysrq functions
# See https://www.kernel.org/doc/html/latest/admin-guide/sysrq.html
# for what other values do
#kernel.sysrq=438
fs.file-max = 6553560
Check the open files limit:
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/be/log# ulimit -a | grep open
open files (-n) 1024
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/be/log# vim /etc/profile
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/be/log# cat /etc/profile
# /etc/profile: system-wide .profile file for the Bourne shell (sh(1))
# and Bourne compatible shells (bash(1), ksh(1), ash(1), ...).
if [ "${PS1-}" ]; then
if [ "${BASH-}" ] && [ "$BASH" != "/bin/sh" ]; then
# The file bash.bashrc already sets the default PS1.
# PS1='\h:\w\$ '
if [ -f /etc/bash.bashrc ]; then
. /etc/bash.bashrc
fi
else
if [ "`id -u`" -eq 0 ]; then
PS1='# '
else
PS1='$ '
fi
fi
fi
if [ -d /etc/profile.d ]; then
for i in /etc/profile.d/*.sh; do
if [ -r $i ]; then
. $i
fi
done
unset i
fi
ulimit -u 204800
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/be/log# source /etc/profile
zhiyong@zhiyong-doris:~$ ulimit -a | grep open
open files (-n) 204800
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/be/log# ulimit -a | grep open
open files (-n) 1024
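Clearly this configuration does not take effect for the root user on Ubuntu 20.04. That is consistent with the note in the limits.conf header comments that wildcard limits are not applied to root. Note also that the `ulimit -u` line added to /etc/profile sets the maximum number of user processes, not the number of open files (that would be `ulimit -n`). The workaround used here is to change the owner of the relevant files to a regular user: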
root@zhiyong-doris:/# chown -R zhiyong:zhiyong /export
root@zhiyong-doris:/# chown -R zhiyong:zhiyong /dorisdata
zhiyong@zhiyong-doris:~$ start_fe.sh --daemon
zhiyong@zhiyong-doris:~$ start_be.sh --daemon
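An alternative that was not tried in this walkthrough: according to the header comments of /etc/security/limits.conf, a limit applies to root only when the domain is the literal user name root. The sketch below only prints entries in that form (the helper name `nofile_entries` is hypothetical). Whether they would have been sufficient on this machine is untested.

```shell
#!/usr/bin/env bash
# nofile_entries USER VALUE: print soft and hard nofile entries that
# name the user explicitly instead of relying on the "*" wildcard.
nofile_entries() {
    local user="$1" value="$2"
    printf '%s soft nofile %s\n' "$user" "$value"
    printf '%s hard nofile %s\n' "$user" "$value"
}

# Entries for root, using the same value as the wildcard lines above.
nofile_entries root 204800
```

The printed lines would be appended to /etc/security/limits.conf, followed by a new login and a check with `ulimit -n`.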
After restarting, open this URL:
http://192.168.88.21:8040/api/health
The BE finally started successfully!
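The same check can be done from the command line instead of a browser. This is a small sketch assuming `curl` is installed. The helper names `be_health_url` and `check_be_health` are invented here, and the host and port are the ones used in this walkthrough.

```shell
#!/usr/bin/env bash
# be_health_url HOST PORT: build the BE health-check URL.
be_health_url() {
    printf 'http://%s:%s/api/health\n' "$1" "$2"
}

# check_be_health HOST PORT: fetch the health endpoint and print the
# response body; returns non-zero on connection failure, an HTTP error
# status, or a timeout of 5 seconds.
check_be_health() {
    curl -fsS --max-time 5 "$(be_health_url "$1" "$2")"
}

# Print the URL that would be queried for this deployment.
be_health_url 192.168.88.21 8040
```

Calling `check_be_health 192.168.88.21 8040` after `start_be.sh --daemon` gives a quick pass/fail signal through its exit status.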
zhiyong@zhiyong-doris:~$ mysql -h 192.168.88.21 -P 9030 -uroot
mysql> SHOW PROC '/backends';
+-----------+-----------------+---------------+---------------+---------------+--------+----------+----------+---------------------+---------------------+-------+----------------------+-----------------------+-----------+------------------+---------------+---------------+---------+----------------+--------------------------+--------+----------------------+-------------------------------------------------------------------------------------------------------------------------------+
| BackendId | Cluster | IP | HostName | HeartbeatPort | BePort | HttpPort | BrpcPort | LastStartTime | LastHeartbeat | Alive | SystemDecommissioned | ClusterDecommissioned | TabletNum | DataUsedCapacity | AvailCapacity | TotalCapacity | UsedPct | MaxDiskUsedPct | Tag | ErrMsg | Version | Status |
+-----------+-----------------+---------------+---------------+---------------+--------+----------+----------+---------------------+---------------------+-------+----------------------+-----------------------+-----------+------------------+---------------+---------------+---------+----------------+--------------------------+--------+----------------------+-------------------------------------------------------------------------------------------------------------------------------+
| 10002 | default_cluster | 192.168.88.21 | 192.168.88.21 | 9050 | 9060 | 8040 | 8060 | 2022-08-14 23:09:58 | 2022-08-14 23:15:58 | true | false | false | 0 | 0.000 | 268.832 GB | 293.797 GB | 8.50 % | 8.50 % | {"location" : "default"} | | 1.1.1-rc03-2dbd70bf9 | {"lastSuccessReportTabletsTime":"2022-08-14 23:15:22","lastStreamLoadTime":-1,"isQueryDisabled":false,"isLoadDisabled":false} |
+-----------+-----------------+---------------+---------------+---------------+--------+----------+----------+---------------------+---------------------+-------+----------------------+-----------------------+-----------+------------------+---------------+---------------+---------+----------------+--------------------------+--------+----------------------+-------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.22 sec)
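Alive is now true.
FS_Broker deployment
From the official documentation: Broker is deployed as a plugin, independently of Doris. To import data from a third-party storage system, the corresponding Broker must be deployed. An fs_broker that reads HDFS and object storage is provided by default. fs_broker is stateless, and deploying one Broker on every FE and BE node is recommended.
Copy the files
The files are already in the correct path, so no copy is needed.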
zhiyong@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86$ ll
总用量 120
drwxrwxr-x 7 zhiyong zhiyong 4096 7月 25 15:11 ./
drwxrwxrwx 3 zhiyong zhiyong 4096 8月 14 21:09 ../
drwxr-xr-x 5 zhiyong zhiyong 4096 7月 25 15:11 apache_hdfs_broker/
drwxr-xr-x 8 zhiyong zhiyong 4096 7月 25 15:11 be/
drwxr-xr-x 10 zhiyong zhiyong 4096 8月 14 21:20 fe/
-rw-rw-r-- 1 zhiyong zhiyong 86171 7月 25 15:11 LICENSE-dist.txt
drwxrwxr-x 2 zhiyong zhiyong 4096 7月 25 15:11 licenses/
-rw-rw-r-- 1 zhiyong zhiyong 1948 7月 25 15:11 NOTICE.txt
drwxr-xr-x 4 zhiyong zhiyong 4096 7月 25 15:11 udf/
The official package already ships with the HDFS broker by default.
Modify the Broker configuration
zhiyong@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/apache_hdfs_broker/conf$ pwd
/export/server/apache-doris-1.1.1-bin-x86/apache_hdfs_broker/conf
zhiyong@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/apache_hdfs_broker/conf$ ll
总用量 20
drwxr-xr-x 2 zhiyong zhiyong 4096 7月 25 15:11 ./
drwxr-xr-x 5 zhiyong zhiyong 4096 7月 25 15:11 ../
-rw-rw-r-- 1 zhiyong zhiyong 1543 3月 24 16:59 apache_hdfs_broker.conf
-rw-rw-r-- 1 zhiyong zhiyong 956 3月 24 16:59 hdfs-site.xml
-rw-rw-r-- 1 zhiyong zhiyong 1426 3月 24 16:59 log4j.properties
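hdfs-site.xml clearly needs to be modified so that the Broker can connect to HDFS, and the apache_hdfs_broker.conf configuration file needs changes as well. I am unlikely to have Doris pull data from HDFS by itself (at most I will load data with Spark or Flink), so this step is skipped.
Start the Broker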
zhiyong@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/apache_hdfs_broker/bin$ pwd
/export/server/apache-doris-1.1.1-bin-x86/apache_hdfs_broker/bin
zhiyong@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/apache_hdfs_broker/bin$ ll
总用量 16
drwxr-xr-x 2 zhiyong zhiyong 4096 7月 25 15:11 ./
drwxr-xr-x 5 zhiyong zhiyong 4096 7月 25 15:11 ../
-rwxrwxr-x 1 zhiyong zhiyong 2892 5月 8 12:06 start_broker.sh*
-rwxrwxr-x 1 zhiyong zhiyong 1603 3月 24 16:59 stop_broker.sh*
zhiyong@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/apache_hdfs_broker/bin$ ./start_broker.sh --daemon
It is used the same way as the other start scripts.
Add the Broker
Connect to the running FE with mysql-client and execute the following command:
ALTER SYSTEM ADD BROKER broker_name "broker_host1:broker_ipc_port1","broker_host2:broker_ipc_port2",...;
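Here broker_host is the IP of the node where the Broker runs, and broker_ipc_port is set in the Broker configuration file conf/apache_hdfs_broker.conf.
Check Broker status
Connect to any running FE with mysql-client and execute the following command to check the Broker status: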
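To make the statement concrete, here is a sketch that assembles it for one or more Broker addresses. The helper `add_broker_sql` and the Broker name `broker_demo` are hypothetical. Port 8000 is an assumed value and should be replaced with the broker_ipc_port actually set in conf/apache_hdfs_broker.conf.

```shell
#!/usr/bin/env bash
# add_broker_sql NAME HOST:PORT [HOST:PORT ...]
# Print an ALTER SYSTEM ADD BROKER statement for the given addresses.
add_broker_sql() {
    local name="$1"
    shift
    local list="" addr
    for addr in "$@"; do
        # Quote each address and join the entries with commas.
        list="${list:+${list},}\"${addr}\""
    done
    printf 'ALTER SYSTEM ADD BROKER %s %s;\n' "$name" "$list"
}

# broker_demo and port 8000 are assumed values for illustration.
add_broker_sql broker_demo "192.168.88.21:8000"
```

The output can be piped into the client used earlier, for example `add_broker_sql broker_demo "192.168.88.21:8000" | mysql -h 192.168.88.21 -P 9030 -uroot`.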
SHOW PROC "/brokers";
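Noted here for later use.
Supervisor daemon
In a production environment, all instances should be started under a process supervisor such as Supervisor, so that a process is brought back up automatically after it exits. To run under a supervisor in version 0.9.0 and earlier, each start_xx.sh script has to be edited to remove the trailing & symbol. From version 0.10.0 onward, simply invoking sh start_xx.sh is enough. The Doris 1.1 release installed here no longer needs this manual step, which is one of the benefits of a newer version.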
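As an illustration of what such a setup might look like, the sketch below prints a Supervisor program section that runs a start script in the foreground. The helper name `supervisor_program`, the program name `doris_fe`, and the fe/bin/start_fe.sh location are assumptions. `stopasgroup` and `killasgroup` are included on the assumption that the script spawns the actual server as a child process.

```shell
#!/usr/bin/env bash
# supervisor_program NAME SCRIPT USER
# Print a Supervisor [program:x] section that runs SCRIPT in the
# foreground (no --daemon) and restarts it when it exits.
supervisor_program() {
    local name="$1" script="$2" user="$3"
    cat <<EOF
[program:${name}]
command=sh ${script}
user=${user}
autostart=true
autorestart=true
stopasgroup=true
killasgroup=true
EOF
}

# Assumed script path under the install directory used in this post.
supervisor_program doris_fe \
    /export/server/apache-doris-1.1.1-bin-x86/fe/bin/start_fe.sh zhiyong
```

The output would go into a file under Supervisor's configuration directory, with a matching section for the BE.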
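Component integration
Import and export features are not my main focus. What I care about most is that Doris provides official connectors for Spark and Flink.
Spark connector
Spark Doris Connector supports reading data stored in Doris through Spark, and also supports writing data to Doris through Spark.
Repository: https://github.com/apache/incubator-doris-spark-connector
- Supports reading data from Doris.
- Supports batch and streaming writes of a Spark DataFrame to Doris.
- A Doris table can be mapped to a DataFrame or an RDD; DataFrame is recommended.
- Supports filtering data on the Doris side to reduce the amount of data transferred.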
Version compatibility
Connector | Spark | Doris | Java | Scala |
---|---|---|---|---|
2.3.4-2.11.xx | 2.x | 0.12+ | 8 | 2.11 |
3.1.2-2.12.xx | 3.x | 0.12.+ | 8 | 2.12 |
3.2.0-2.12.xx | 3.2.x | 0.12.+ | 8 | 2.12 |
Unfortunately, the latest Spark release at the time of writing, 3.3.0, is not yet supported.
Maven coordinates (GAV):
<dependency>
<groupId>org.apache.doris</groupId>
<artifactId>spark-doris-connector-3.1_2.12</artifactId>
<!--artifactId>spark-doris-connector-2.3_2.11</artifactId-->
<version>1.0.1</version>
</dependency>
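Both the Batch and Stream modes of Spark are supported, and the feature set is fairly complete. Since RDD and DataFrame are supported, SQL on top of them is supported as well. The official documentation covers the details thoroughly.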
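As a rough illustration of the SQL route, the sketch below prints Spark SQL that maps a Doris table to a temporary view through the connector's `doris` data source. The option names follow the connector documentation as I recall it and should be checked against the version in use. The helper `spark_doris_view_sql`, the view name, and the table `demo_db.demo_tbl` are hypothetical. 8030 is the FE http_port from the port table earlier in this post.

```shell
#!/usr/bin/env bash
# spark_doris_view_sql VIEW DB.TABLE FE_HOST:HTTP_PORT USER PASSWORD
# Print Spark SQL that maps a Doris table to a temporary view and
# reads a few rows from it.
spark_doris_view_sql() {
    cat <<EOF
CREATE TEMPORARY VIEW $1
USING doris
OPTIONS(
  "table.identifier"="$2",
  "fenodes"="$3",
  "user"="$4",
  "password"="$5"
);
SELECT * FROM $1 LIMIT 10;
EOF
}

# demo_db.demo_tbl is a hypothetical table; root with an empty password
# matches the mysql login used earlier in this post.
spark_doris_view_sql doris_view demo_db.demo_tbl 192.168.88.21:8030 root ""
```

The generated statements could be saved to a file and run with `spark-sql -f`, with the connector jar supplied through `--jars`.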
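Flink connector
Flink Doris Connector supports operating on data stored in Doris (read, insert, update, delete) through Flink.
Repository: https://github.com/apache/doris-flink-connector
- A Doris table can be mapped to a DataStream or a Table.
Notes:
- Update and delete are supported only on the Unique Key model.
- Deletion currently works automatically when data is ingested through Flink CDC. For other ingestion methods, deletion has to be implemented by the user. For how to use deletion with Flink CDC, see the last section of the official connector documentation.
Version compatibility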
Connector | Flink | Doris | Java | Scala |
---|---|---|---|---|
1.14_2.11-1.1.0 | 1.14.x | 1.0+ | 8 | 2.11 |
1.14_2.12-1.1.0 | 1.14.x | 1.0+ | 8 | 2.12 |
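Flink 1.14, a milestone release, is fully supported. The latest release at the time of writing, 1.15, is not supported yet.
Maven coordinates (GAV):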
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-java</artifactId>
<version>${flink.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-streaming-java_${scala.version}</artifactId>
<version>${flink.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-clients_${scala.version}</artifactId>
<version>${flink.version}</version>
<scope>provided</scope>
</dependency>
<!-- flink table -->
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-table-planner_${scala.version}</artifactId>
<version>${flink.version}</version>
<scope>provided</scope>
</dependency>
<!-- flink-doris-connector -->
<dependency>
<groupId>org.apache.doris</groupId>
<artifactId>flink-doris-connector-1.14_2.12</artifactId>
<version>1.1.0</version>
</dependency>
Both the DataStream and Table APIs of Flink are supported, so SQL on top of them works as well. The official documentation covers the details thoroughly.
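For the SQL route, a Flink table backed by Doris is declared with a `WITH` clause. The sketch below prints such a DDL. The option names follow the connector documentation as I recall it and should be verified for the connector version in use. The helper `flink_doris_table_sql`, the table names, and the two columns are hypothetical.

```shell
#!/usr/bin/env bash
# flink_doris_table_sql FLINK_TABLE DB.TABLE FE_HOST:HTTP_PORT USER PASSWORD
# Print a Flink SQL DDL that maps a Doris table to a Flink table.
flink_doris_table_sql() {
    cat <<EOF
CREATE TABLE $1 (
  id INT,
  name STRING
) WITH (
  'connector' = 'doris',
  'fenodes' = '$3',
  'table.identifier' = '$2',
  'username' = '$4',
  'password' = '$5'
);
EOF
}

# The columns id and name are placeholders for the real Doris schema.
flink_doris_table_sql doris_src demo_db.demo_tbl 192.168.88.21:8030 root ""
```

The column list has to match the schema of the real Doris table before the DDL is submitted through the Flink SQL client.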
Examples
The project provides demos for Java, C++, and Python, as well as for Spark, Flink, and Spring-JDBC, which is very friendly to newcomers.
Shutting down components
zhiyong@zhiyong-doris:~$ stop_be.sh
stop doris_be, and remove pid file.
zhiyong@zhiyong-doris:~$ stop_fe.sh
stop java, and remove pid file.
The stop commands are equally simple and direct.