Single-Node Deployment of Doris 1.1 on Ubuntu 20.04

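Introduction to Doris

Official site (Chinese): https://doris.apache.org/zh-CN/

Official documentation: https://doris.apache.org/zh-CN/docs/summary/basic-summary/

I won't repeat what is already covered there.

As an MPP database, Doris has the clear advantage of not depending on Hadoop, and it is compatible with the MySQL protocol. That makes it a good fit for small and mid-sized companies whose data is all structured tables, and friendly to people who mostly just write SQL (the "SQL Boys"). It is reasonably capable at both streaming and batch workloads, and the project provides official connectors for compute engines such as Spark and Flink, which can themselves run on K8S, so it is also friendly to teams with limited development and operations capacity.

Because it does not depend on the Hadoop ecosystem, there is no need for the HDFS file system, the ZooKeeper coordination service, or Yarn resource scheduling. Doris can handle simple computations on its own, and the Spark and Flink jobs needed for complex computations can run directly on K8S, so a data warehouse or data mart built on Doris has a very lean architecture. One component covers lake ingestion and the ODS layer, the DW layer (DWD, the finest-grained detail layer, and DWM, the coarser-grained summary layer), and the ADS application layer. Once data is integrated into Doris, the whole structured-data analysis and aggregation pipeline can be done in one place, and even the intermediate layers for dimension tables can be simplified.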

Compilation

See the official documentation: https://doris.apache.org/zh-CN/docs/install/source-install/compilation

The main difference when compiling is whether you build against JDK 8 or JDK 11. Here I simply use the official prebuilt Doris binaries for JDK 1.8.

Official download page: https://doris.apache.org/zh-CN/download

Doris 1.1 download link: https://dist.apache.org/repos/dist/release/doris/1.1/1.1.1-rc03/apache-doris-1.1.1-bin-x86.tar.gz

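Linux VM requirements

Doris's requirements are easier to meet than those of its neighbor StarRocks.

Hardware requirements

Development and test environment

| Module | CPU | Memory | Disk | Network | Instances |
|---|---|---|---|---|---|
| Frontend | 8 cores+ | 8 GB+ | SSD or SATA, 10 GB+ * | Gigabit NIC | 1 |
| Backend | 8 cores+ | 16 GB+ | SSD or SATA, 50 GB+ * | Gigabit NIC | 1-3 * |

Production environment

| Module | CPU | Memory | Disk | Network | Instances (minimum) |
|---|---|---|---|---|---|
| Frontend | 16 cores+ | 64 GB+ | SSD or RAID card, 100 GB+ * | 10-Gigabit NIC | 1-5 * |
| Backend | 16 cores+ | 64 GB+ | SSD or SATA, 100 GB+ * | 10-Gigabit NIC | 10-100 * |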
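VM resource allocation

According to the official manual, a dev environment should use one 8C8G frontend (FE) node and one to three 8C16G backend (BE) nodes. Since Doris scales out easily, and to keep starting and stopping simple, I built a single VM with 16 cores, 32 GB of RAM, and a 300 GB disk and deployed both the FE and the BE on it. If the disk ever runs short, it is easy to add more BE nodes later.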

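Software requirements

Linux operating system version requirements

| Linux distribution | Version |
|---|---|
| CentOS | 7.1 and above |
| Ubuntu | 16.04 and above |

Software requirements

| Software | Version |
|---|---|
| Java | 1.8 and above |
| GCC | 4.8.2 and above |

Operating system installation requirements

Set the maximum number of open file handles

vi /etc/security/limits.conf 
* soft nofile 65536
* hard nofile 65536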
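Editing limits.conf only affects sessions opened afterwards, so it is worth checking what the current shell actually got. The following is a minimal sketch, not something from the official docs: `nofile_ok` is a helper name I made up, and 65536 is simply the value from the lines above.

```shell
# Hypothetical helper: succeed if an open-file limit meets the required minimum.
#   $1 = required minimum
#   $2 = limit to test (optional; defaults to this shell's `ulimit -n`)
nofile_ok() {
    local required current
    required="$1"
    current="${2:-$(ulimit -n)}"
    # "unlimited" always satisfies the requirement.
    if [ "$current" = "unlimited" ]; then
        return 0
    fi
    [ "$current" -ge "$required" ]
}

# Report on the current shell.
if nofile_ok 65536; then
    echo "open-file limit OK: $(ulimit -n)"
else
    echo "open-file limit too low: $(ulimit -n) (want >= 65536)"
fi
```

If the check reports a low value right after editing the file, logging out and back in (or rebooting) is usually what is missing.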
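Clock synchronization

Doris metadata requires the clocks to agree to within 5000 ms, so every machine in the cluster must synchronize its clock. Otherwise clock drift can make the metadata inconsistent and cause service failures.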
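To make the 5000 ms requirement concrete, here is a small sketch that compares two timestamps in epoch milliseconds. The helper name `skew_ok` is my own, and how you collect the second timestamp (for example over SSH from another node) is an assumption left to the reader.

```shell
# Hypothetical helper: succeed if two epoch-millisecond timestamps differ
# by less than a maximum skew.
#   $1, $2 = timestamps in milliseconds
#   $3     = maximum allowed skew in ms (optional; defaults to 5000)
skew_ok() {
    local diff max
    max="${3:-5000}"
    diff=$(( $1 - $2 ))
    # Take the absolute value of the difference.
    if [ "$diff" -lt 0 ]; then
        diff=$(( -diff ))
    fi
    [ "$diff" -lt "$max" ]
}

# Usage idea with GNU date (other-node is a placeholder host name):
#   local_ms=$(date +%s%3N)
#   remote_ms=$(ssh other-node date +%s%3N)
#   skew_ok "$local_ms" "$remote_ms" || echo "clock skew too large"
```

The SSH round trip adds its own delay, so treat the result as a rough sanity check rather than a measurement.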

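Disable the swap partition

Linux swap causes serious performance problems for Doris, so it must be disabled before installation.

Linux file system

The ext4 file system is recommended; select ext4 when installing the operating system.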

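Network requirements

Doris instances communicate with each other directly over the network. The table below lists all required ports.

| Instance | Port name | Default port | Direction | Description |
|---|---|---|---|---|
| BE | be_port | 9060 | FE --> BE | Thrift server port on the BE; receives requests from the FE |
| BE | webserver_port | 8040 | BE <--> BE | HTTP server port on the BE |
| BE | heartbeat_service_port | 9050 | FE --> BE | Heartbeat service port (Thrift) on the BE; receives heartbeats from the FE |
| BE | brpc_port | 8060 | FE <--> BE, BE <--> BE | brpc port on the BE; used for communication between BEs |
| FE | http_port | 8030 | FE <--> FE, user <--> FE | HTTP server port on the FE |
| FE | rpc_port | 9020 | BE --> FE, FE <--> FE | Thrift server port on the FE; must be configured identically on every FE |
| FE | query_port | 9030 | user <--> FE | MySQL server port on the FE |
| FE | edit_log_port | 9010 | FE <--> FE | Port used for bdbje communication between FEs |
| Broker | broker_ipc_port | 8000 | FE --> Broker, BE --> Broker | Thrift server on the Broker; receives requests |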
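Before starting anything it can help to see whether one of these default ports is already taken on the host. The sketch below parses the output of `ss -ltn` from standard input; the function names and the `DORIS_PORTS` variable are my own, and the port list is just the defaults from the table.

```shell
# Default Doris ports from the table above.
DORIS_PORTS="9060 8040 9050 8060 8030 9020 9030 9010 8000"

# Read `ss -ltn` output on stdin and print each listening port once.
# Column 4 is "Local Address:Port"; the port is whatever follows the last colon.
listening_ports() {
    awk 'NR > 1 { n = split($4, parts, ":"); print parts[n] }' | sort -un
}

# Read `ss -ltn` output on stdin and print the Doris default ports
# that something is already listening on.
port_conflicts() {
    local taken port
    taken="$(listening_ports)"
    for port in $DORIS_PORTS; do
        if printf '%s\n' "$taken" | grep -qx "$port"; then
            echo "$port"
        fi
    done
}

# Usage on the target host:
#   ss -ltn | port_conflicts
```

An empty result means none of the default ports is in use; anything printed needs either the other service moved or the matching entry changed in fe.conf or be.conf.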
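IP binding

A host can have several IPs, either because it has multiple NICs or because software such as docker created virtual NICs. Doris currently cannot work out which IP is usable on its own, so when the deployment host has more than one IP you must use the priority_networks setting to force the correct one.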
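priority_networks takes CIDR ranges, so the question is really "does my host IP fall inside this range?". Here is a rough sketch of that check in plain shell arithmetic; `ip_to_int` and `ip_in_cidr` are hypothetical helper names, and this is only an illustration of CIDR matching, not how Doris itself selects the address.

```shell
# Print a dotted IPv4 address as a 32-bit integer.
# The function body runs in a subshell so the IFS change does not leak out.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Succeed if address $1 lies inside CIDR range $2 (for example 192.168.88.0/24).
ip_in_cidr() {
    local ip_int net_int bits mask
    ip_int=$(ip_to_int "$1")
    net_int=$(ip_to_int "${2%/*}")
    bits="${2#*/}"
    # Build the netmask: the top `bits` bits set, the rest cleared.
    mask=$(( (4294967295 << (32 - bits)) & 4294967295 ))
    [ $(( ip_int & mask )) -eq $(( net_int & mask )) ]
}

# Usage idea on Ubuntu: list this host's addresses and test each one.
#   for ip in $(hostname -I); do
#       ip_in_cidr "$ip" 192.168.88.0/24 && echo "$ip matches"
#   done
```

If more than one address matches, narrowing the range is the safer option, since the comments in the shipped config files say at most one IP should match.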

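VM software setup

Following the official docs, the VM runs Ubuntu 20.04 with a single NIC and a static IP. For the remaining settings, just follow the official docs step by step while installing Doris.

Building the VM

As usual, build an Ubuntu 20.04 VM; see one of my earlier posts: https://lizhiyong.blog.csdn.net/article/details/126236516

Configure the NIC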


IPv6 needs to be disabled.

Install the necessary commands

sudo apt install net-tools
sudo apt-get install openssh-server
sudo apt-get install openssh-client
sudo apt install vim
apt install openjdk-8-jdk-headless

At this point you can connect to the VM with MobaXterm.

System settings

Check the Java version

zhiyong@zhiyong-doris:~$ java -version

Command 'java' not found, but can be installed with:

sudo apt install openjdk-11-jre-headless  # version 11.0.16+8-0ubuntu1~20.04, or
sudo apt install default-jre              # version 2:1.11-72
sudo apt install openjdk-13-jre-headless  # version 13.0.7+5-0ubuntu1~20.04
sudo apt install openjdk-16-jre-headless  # version 16.0.1+9-1~20.04
sudo apt install openjdk-17-jre-headless  # version 17.0.4+8-1~20.04
sudo apt install openjdk-8-jre-headless   # version 8u342-b07-0ubuntu1~20.04

zhiyong@zhiyong-doris:~$ sudo apt install openjdk-8-jre-headless
[sudo] zhiyong 的密码:
正在读取软件包列表... 完成
正在分析软件包的依赖关系树
正在读取状态信息... 完成
将会同时安装下列软件:
  ca-certificates-java java-common
建议安装:
  default-jre fonts-dejavu-extra fonts-ipafont-gothic fonts-ipafont-mincho fonts-wqy-microhei fonts-wqy-zenhei
下列【新】软件包将被安装:
  ca-certificates-java java-common openjdk-8-jre-headless
升级了 0 个软件包,新安装了 3 个软件包,要卸载 0 个软件包,有 290 个软件包未被升级。
需要下载 28.3 MB 的归档。
解压缩后会消耗 104 MB 的额外空间。
您希望继续执行吗? [Y/n] y

zhiyong@zhiyong-doris:~$ java -version
openjdk version "1.8.0_342"
OpenJDK Runtime Environment (build 1.8.0_342-8u342-b07-0ubuntu1~20.04-b07)
OpenJDK 64-Bit Server VM (build 25.342-b07, mixed mode)

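Java 1.8 is now installed. Note that this package is the headless JRE; the jps command used later comes from the JDK package (openjdk-8-jdk-headless).

Check the GCC version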

zhiyong@zhiyong-doris:~$ gcc

Command 'gcc' not found, but can be installed with:

sudo apt install gcc

zhiyong@zhiyong-doris:~$ sudo apt install gcc
正在读取软件包列表... 完成
正在分析软件包的依赖关系树
正在读取状态信息... 完成
将会同时安装下列软件:
  binutils binutils-common binutils-x86-64-linux-gnu cpp-9 gcc-9 gcc-9-base libasan5 libatomic1 libbinutils libc-dev-bin libc6 libc6-dbg libc6-dev
  libcrypt-dev libctf-nobfd0 libctf0 libgcc-9-dev libitm1 liblsan0 libquadmath0 libtsan0 libubsan1 linux-libc-dev manpages-dev
建议安装:
  binutils-doc gcc-9-locales gcc-multilib make autoconf automake libtool flex bison gcc-doc gcc-9-multilib gcc-9-doc glibc-doc
下列【新】软件包将被安装:
  binutils binutils-common binutils-x86-64-linux-gnu gcc gcc-9 libasan5 libatomic1 libbinutils libc-dev-bin libc6-dev libcrypt-dev libctf-nobfd0 libctf0
  libgcc-9-dev libitm1 liblsan0 libquadmath0 libtsan0 libubsan1 linux-libc-dev manpages-dev
下列软件包将被升级:
  cpp-9 gcc-9-base libc6 libc6-dbg
升级了 4 个软件包,新安装了 21 个软件包,要卸载 0 个软件包,有 286 个软件包未被升级。
需要下载 55.9 MB 的归档。
解压缩后会消耗 74.1 MB 的额外空间。
您希望继续执行吗? [Y/n] y

zhiyong@zhiyong-doris:~$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/9/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none:hsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 9.4.0-1ubuntu1~20.04.1' --with-bugurl=file:///usr/share/doc/gcc-9/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --prefix=/usr --with-gcc-major-version-only --program-suffix=-9 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-plugin --enable-default-pie --with-system-zlib --with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none=/build/gcc-9-Av3uEd/gcc-9-9.4.0/debian/tmp-nvptx/usr,hsa --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 9.4.0 (Ubuntu 9.4.0-1ubuntu1~20.04.1)

GCC 9.4 is now installed.

Check the MySQL client version

zhiyong@zhiyong-doris:~$ mysql

Command 'mysql' not found, but can be installed with:

sudo apt install mysql-client-core-8.0     # version 8.0.30-0ubuntu0.20.04.2, or
sudo apt install mariadb-client-core-10.3  # version 1:10.3.34-0ubuntu0.20.04.1

zhiyong@zhiyong-doris:~$ sudo apt install mysql-client-core-8.0
正在读取软件包列表... 完成
正在分析软件包的依赖关系树
正在读取状态信息... 完成
下列【新】软件包将被安装:
  mysql-client-core-8.0
升级了 0 个软件包,新安装了 1 个软件包,要卸载 0 个软件包,有 286 个软件包未被升级。
需要下载 4,521 kB 的归档。
解压缩后会消耗 67.3 MB 的额外空间。
获取:1 http://mirrors.aliyun.com/ubuntu focal-updates/main amd64 mysql-client-core-8.0 amd64 8.0.30-0ubuntu0.20.04.2 [4,521 kB]
已下载 4,521 kB,耗时 1(4,289 kB/s)
正在选中未选择的软件包 mysql-client-core-8.0。
(正在读取数据库 ... 系统当前共安装有 154483 个文件和目录。)
准备解压 .../mysql-client-core-8.0_8.0.30-0ubuntu0.20.04.2_amd64.deb  ...
正在解压 mysql-client-core-8.0 (8.0.30-0ubuntu0.20.04.2) ...
正在设置 mysql-client-core-8.0 (8.0.30-0ubuntu0.20.04.2) ...
正在处理用于 man-db (2.9.1-1) 的触发器 ...

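The MySQL client is installed; it is needed later, when the BE is added to the FE.

Clock synchronization

A single-node install can skip this step.

Disable swap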

zhiyong@zhiyong-doris:~$ free -m
              总计         已用        空闲      共享    缓冲/缓存    可用
内存:       32071        1316       28964          10        1790       30324
交换:        2047           0        2047

Swap is currently enabled, which would hurt Doris's performance.

zhiyong@zhiyong-doris:~$ sudo su root
root@zhiyong-doris:/home/zhiyong# vim /etc/fstab
root@zhiyong-doris:/home/zhiyong# cat /etc/fstab
# /etc/fstab: static file system information.
#
# Use 'blkid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
# <file system> <mount point>   <type>  <options>       <dump>  <pass>
# / was on /dev/sda5 during installation
UUID=abc5a4f3-3280-421e-8e6d-4132ad1f2c1a /               ext4    errors=remount-ro 0       1
# /boot/efi was on /dev/sda1 during installation
UUID=2D2A-8B96  /boot/efi       vfat    umask=0077      0       1
#/swapfile                                 none            swap    sw              0       0

Comment out the /swapfile line as shown, then reboot.

zhiyong@zhiyong-doris:~$ free -m
              总计         已用        空闲      共享    缓冲/缓存    可用
内存:       32071        1070       30314           2         685       30615
交换:           0           0           0
zhiyong@zhiyong-doris:~$ sudo swapon --show
[sudo] zhiyong 的密码:
zhiyong@zhiyong-doris:~$ sudo swapon --show
zhiyong@zhiyong-doris:~$

Swap is now off, and the base environment is ready.
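The same check can be scripted by reading SwapTotal from /proc/meminfo, which is where `free` gets its numbers. This is a small sketch with helper names of my own choosing; the optional file argument exists only so the functions can be tried against sample data.

```shell
# Print SwapTotal in kB from a meminfo-format file (defaults to /proc/meminfo).
swap_total_kb() {
    awk '$1 == "SwapTotal:" { print $2 }' "${1:-/proc/meminfo}"
}

# Succeed if the file reports no swap at all.
swap_disabled() {
    [ "$(swap_total_kb "$@")" = "0" ]
}

# Report on this machine when /proc/meminfo is available.
if [ -r /proc/meminfo ]; then
    if swap_disabled; then
        echo "swap is off"
    else
        echo "swap is still on: $(swap_total_kb) kB"
    fi
fi
```

For a quick test without rebooting, `sudo swapoff -a` turns swap off for the running system, but the /etc/fstab edit is still what keeps it off after the next boot.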

Deploying Doris

Now for the actual Doris deployment.

zhiyong@zhiyong-doris:~$ sudo su root
root@zhiyong-doris:/home/zhiyong# cd /
root@zhiyong-doris:/# df -Th
文件系统       类型      容量  已用  可用 已用% 挂载点
udev           devtmpfs   16G     0   16G    0% /dev
tmpfs          tmpfs     3.2G  1.9M  3.2G    1% /run
/dev/sda5      ext4      294G  8.4G  271G    4% /
tmpfs          tmpfs      16G     0   16G    0% /dev/shm
tmpfs          tmpfs     5.0M  4.0K  5.0M    1% /run/lock
tmpfs          tmpfs      16G     0   16G    0% /sys/fs/cgroup
/dev/loop0     squashfs  128K  128K     0  100% /snap/bare/5
/dev/loop1     squashfs   62M   62M     0  100% /snap/core20/1328
/dev/loop2     squashfs   66M   66M     0  100% /snap/gtk-common-themes/1519
/dev/loop3     squashfs   44M   44M     0  100% /snap/snapd/14978
/dev/loop4     squashfs   55M   55M     0  100% /snap/snap-store/558
/dev/loop5     squashfs  249M  249M     0  100% /snap/gnome-3-38-2004/99
/dev/sda1      vfat      511M  4.0K  511M    1% /boot/efi
tmpfs          tmpfs     3.2G   20K  3.2G    1% /run/user/1000
root@zhiyong-doris:/# mkdir -p /export/software
root@zhiyong-doris:/# mkdir -p /export/server
root@zhiyong-doris:/# chmod -R 777 /export/
root@zhiyong-doris:/# cd /export/software/
root@zhiyong-doris:/export/software# ll
总用量 398792
drwxrwxrwx 2 root    root         4096 8月  14 21:07 ./
drwxrwxrwx 4 root    root         4096 8月  14 21:04 ../
-rw-rw-r-- 1 zhiyong zhiyong 408349973 8月  14 21:08 apache-doris-1.1.1-bin-x86.tar.gz
root@zhiyong-doris:/export/software# tar -zxvf apache-doris-1.1.1-bin-x86.tar.gz -C /export/server/
root@zhiyong-doris:/export/software# cd /export/server/
root@zhiyong-doris:/export/server# ll
总用量 12
drwxrwxrwx 3 root root 4096 8月  14 21:09 ./
drwxrwxrwx 4 root root 4096 8月  14 21:04 ../
drwxrwxr-x 7 1020 1020 4096 7月  25 15:11 apache-doris-1.1.1-bin-x86/

The package is now uploaded and unpacked.

Environment variables

root@zhiyong-doris:/# vim /etc/profile.d/doris.sh
root@zhiyong-doris:/# cat /etc/profile.d/doris.sh
export DORIS_HOME=/export/server/apache-doris-1.1.1-bin-x86
export PATH=$PATH:$DORIS_HOME/fe/bin:$DORIS_HOME/be/bin
root@zhiyong-doris:/# source /etc/profile.d/doris.sh

FE deployment

Copy files

The files are already in the right place, so no copy is needed.

Configure the FE

root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/fe/conf# pwd
/export/server/apache-doris-1.1.1-bin-x86/fe/conf
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/fe/conf# ll
总用量 12
drwxr-xr-x 2 1020 1020 4096 7月  25 15:11 ./
drwxr-xr-x 8 1020 1020 4096 7月  25 15:11 ../
-rw-rw-r-- 1 1020 1020 2889 6月  13 13:59 fe.conf
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/fe/conf# vim fe.conf
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/fe/conf# cat fe.conf
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements.  See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership.  The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License.  You may obtain a copy of the License at
#
#   http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied.  See the License for the
# specific language governing permissions and limitations
# under the License.

#####################################################################
## The uppercase properties are read and exported by bin/start_fe.sh.
## To see all Frontend configurations,
## see fe/src/org/apache/doris/common/Config.java
#####################################################################

# the output dir of stderr and stdout
LOG_DIR = ${DORIS_HOME}/log

DATE = `date +%Y%m%d-%H%M%S`
JAVA_OPTS="-Xmx8192m -XX:+UseMembar -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=7 -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+CMSClassUnloadingEnabled -XX:-CMSParallelRemarkEnabled -XX:CMSInitiatingOccupancyFraction=80 -XX:SoftRefLRUPolicyMSPerMB=0 -Xloggc:$DORIS_HOME/log/fe.gc.log.$DATE"

# For jdk 9+, this JAVA_OPTS will be used as default JVM options
JAVA_OPTS_FOR_JDK_9="-Xmx8192m -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=7 -XX:+CMSClassUnloadingEnabled -XX:-CMSParallelRemarkEnabled -XX:CMSInitiatingOccupancyFraction=80 -XX:SoftRefLRUPolicyMSPerMB=0 -Xlog:gc*:$DORIS_HOME/log/fe.gc.log.$DATE:time"

##
## the lowercase properties are read by main program.
##

# INFO, WARN, ERROR, FATAL
sys_log_level = INFO

# store metadata, must be created before start FE.
# Default value is ${DORIS_HOME}/doris-meta
# meta_dir = ${DORIS_HOME}/doris-meta
meta_dir = /dorisdata/doris-meta

http_port = 8030
rpc_port = 9020
query_port = 9030
edit_log_port = 9010
mysql_service_nio_enabled = true

# Choose one if there are more than one ip except loopback address.
# Note that there should at most one ip match this list.
# If no ip match this rule, will choose one randomly.
# use CIDR format, e.g. 10.10.10.0/24
# Default value is empty.
# priority_networks = 10.10.10.0/24;192.168.0.0/16
priority_networks = 192.168.88.0/24

# Advanced configurations
# log_roll_size_mb = 1024
# sys_log_dir = ${DORIS_HOME}/log
# sys_log_roll_num = 10
# sys_log_verbose_modules = org.apache.doris
# audit_log_dir = ${DORIS_HOME}/log
# audit_log_modules = slow_query, query
# audit_log_roll_num = 10
# meta_delay_toleration_second = 10
# qe_max_connection = 1024
# max_conn_per_user = 100
# qe_query_timeout_second = 300
# qe_slow_log_ms = 5000

Based on the meta_dir = /dorisdata/doris-meta set above, create the Doris metadata directory by hand:

root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/fe/conf# mkdir -p /dorisdata/doris-meta

Doris now has a directory in which to store its metadata.
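Typing the directory by hand is an easy place to make a mistake, so one option is to read the value straight out of the config file. The sketch below is a simple reader for `key = value` lines; `conf_get` is a hypothetical helper, and it only understands the plain uncommented form used in fe.conf and be.conf (no variable expansion).

```shell
# Print the value of the last uncommented "KEY = value" line in a config file.
#   $1 = config file, $2 = key name
conf_get() {
    local file key
    file="$1"
    key="$2"
    # Lines starting with "#" never match because the key must come first.
    sed -n "s/^[[:space:]]*${key}[[:space:]]*=[[:space:]]*\(.*[^[:space:]]\)[[:space:]]*\$/\1/p" "$file" | tail -n 1
}

# Usage idea, assuming DORIS_HOME is set as in this post:
#   meta_dir="$(conf_get "$DORIS_HOME/fe/conf/fe.conf" meta_dir)"
#   mkdir -p "$meta_dir"
```

Creating the directory from the value that was read back guarantees that the path on disk and the path in the config are the same string.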

Start the FE

root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/fe/bin# pwd
/export/server/apache-doris-1.1.1-bin-x86/fe/bin
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/fe/bin# ll
总用量 20
drwxr-xr-x 2 1020 1020 4096 7月  25 15:11 ./
drwxr-xr-x 8 1020 1020 4096 7月  25 15:11 ../
-rwxrwxr-x 1 1020 1020 4455 7月  24 09:57 start_fe.sh*
-rwxrwxr-x 1 1020 1020 1568 3月  24 16:59 stop_fe.sh*
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/fe/bin# ./start_fe.sh --daemon
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/fe/bin# jps
3406 PaloFe
3791 Jps

The FE has started normally.

BE deployment

Copy files

The files are already in the right place, so no copy is needed.

Configure the BE

root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/be/conf# pwd
/export/server/apache-doris-1.1.1-bin-x86/be/conf
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/be/conf# ll
总用量 16
drwxr-xr-x 2 1020 1020 4096 7月  25 15:11 ./
drwxr-xr-x 8 1020 1020 4096 7月  25 15:11 ../
-rw-r--r-- 1 1020 1020 2207 7月  24 09:57 be.conf
-rw-r--r-- 1 1020 1020 1529 3月  24 16:59 odbcinst.ini
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/be/conf# vim be.conf
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/be/conf# cat be.conf
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements.  See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership.  The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License.  You may obtain a copy of the License at
#
#   http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied.  See the License for the
# specific language governing permissions and limitations
# under the License.

PPROF_TMPDIR="$DORIS_HOME/log/"

# INFO, WARNING, ERROR, FATAL
sys_log_level = INFO

# ports for admin, web, heartbeat service
be_port = 9060
webserver_port = 8040
heartbeat_service_port = 9050
brpc_port = 8060

# Choose one if there are more than one ip except loopback address.
# Note that there should at most one ip match this list.
# If no ip match this rule, will choose one randomly.
# use CIDR format, e.g. 10.10.10.0/24
# Default value is empty.
# priority_networks = 10.10.10.0/24;192.168.0.0/16
priority_networks = 192.168.88.0/24

# data root path, separate by ';'
# you can specify the storage medium of each root path, HDD or SSD
# you can add capacity limit at the end of each root path, seperate by ','
# eg:
# storage_root_path = /home/disk1/doris.HDD,50;/home/disk2/doris.SSD,1;/home/disk2/doris
# /home/disk1/doris.HDD, capacity limit is 50GB, HDD;
# /home/disk2/doris.SSD, capacity limit is 1GB, SSD;
# /home/disk2/doris, capacity limit is disk capacity, HDD(default)
#
# you also can specify the properties by setting '<property>:<value>', seperate by ','
# property 'medium' has a higher priority than the extension of path
#
# Default value is ${DORIS_HOME}/storage, you should create it by hand.
# storage_root_path = ${DORIS_HOME}/storage
storage_root_path = /dorisdata/storage.SSD

# Advanced configurations
# sys_log_dir = ${DORIS_HOME}/log
# sys_log_roll_mode = SIZE-MB-1024
# sys_log_roll_num = 10
# sys_log_verbose_modules = *
# log_buffer_level = -1
# palo_cgroups

Based on the modified storage_root_path = /dorisdata/storage.SSD, create the BE storage directory by hand:

root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/be/conf# mkdir -p /dorisdata/storage

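The directory for storing data blocks is now set up.

BE webserver_port configuration

From the official site: if the BE is deployed inside a hadoop cluster, adjust webserver_port = 8040 in be.conf to avoid a port conflict. My VM is a standalone Linux machine, so the issue does not apply and the setting stays as it is.

Add all BE nodes in the FE

The MySQL client is already installed, so: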

root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/be/conf# mysql -h 192.168.88.21 -P 9030 -uroot
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 0
Server version: 5.7.37 Doris version 1.1.1-rc03-2dbd70bf9

Copyright (c) 2000, 2022, Oracle and/or its affiliates.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
+--------------------+
1 row in set (0.02 sec)

mysql> use information_schema;
No connection. Trying to reconnect...
Connection id:    1
Current database: *** NONE ***

Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> show tables;
+------------------------------+
| Tables_in_information_schema |
+------------------------------+
| tables                       |
| table_privileges             |
| schema_privileges            |
| user_privileges              |
| referential_constraints      |
| key_column_usage             |
| routines                     |
| schemata                     |
| session_variables            |
| global_variables             |
| columns                      |
| character_sets               |
| collations                   |
| table_constraints            |
| engines                      |
| views                        |
| statistics                   |
| files                        |
| partitions                   |
+------------------------------+
19 rows in set (0.00 sec)

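With the FE settings from earlier the connection just works; the default user is root with no password.

Although the client is MySQL 8, the Doris FE reports its server version as 5.7.37. As far as I can tell this is only the MySQL version string that Doris presents for protocol compatibility, not an actual MySQL server, so it looks old but is perfectly usable.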

mysql> ALTER SYSTEM ADD BACKEND "192.168.88.21:9050";
Query OK, 0 rows affected (0.09 sec)

The BE has now been added to the FE.
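The port in ADD BACKEND is the BE's heartbeat_service_port (9050 by default), not be_port, which is easy to mix up. A tiny sketch that builds the statement makes the default explicit; `add_backend_sql` is a name I invented, and the host and ports in the usage line are the ones from this post.

```shell
# Print an ALTER SYSTEM ADD BACKEND statement.
#   $1 = BE host or IP
#   $2 = BE heartbeat_service_port (optional; defaults to 9050)
add_backend_sql() {
    printf 'ALTER SYSTEM ADD BACKEND "%s:%s";\n' "$1" "${2:-9050}"
}

# Usage idea: pipe the statement into the MySQL client connected to the FE.
#   add_backend_sql 192.168.88.21 | mysql -h 192.168.88.21 -P 9030 -uroot
```

With several BE nodes the same function can be called in a loop over the host list.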

Start the BE

zhiyong@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/be/bin$ pwd
/export/server/apache-doris-1.1.1-bin-x86/be/bin
zhiyong@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/be/bin$ sudo su root
[sudo] zhiyong 的密码:
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/be/bin# pwd
/export/server/apache-doris-1.1.1-bin-x86/be/bin
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/be/bin# ./start_be.sh --daemon

Check BE status

After connecting to the FE with the MySQL client:

mysql> SHOW PROC '/backends';
+-----------+-----------------+---------------+---------------+---------------+--------+----------+----------+---------------+---------------+-------+----------------------+-----------------------+-----------+------------------+---------------+---------------+---------+----------------+--------------------------+--------------------------------------------------------------+---------+---------------------------------------------------------------------------------------------------------------+
| BackendId | Cluster         | IP            | HostName      | HeartbeatPort | BePort | HttpPort | BrpcPort | LastStartTime | LastHeartbeat | Alive | SystemDecommissioned | ClusterDecommissioned | TabletNum | DataUsedCapacity | AvailCapacity | TotalCapacity | UsedPct | MaxDiskUsedPct | Tag                      | ErrMsg                                                       | Version | Status                                                                                                        |
+-----------+-----------------+---------------+---------------+---------------+--------+----------+----------+---------------+---------------+-------+----------------------+-----------------------+-----------+------------------+---------------+---------------+---------+----------------+--------------------------+--------------------------------------------------------------+---------+---------------------------------------------------------------------------------------------------------------+
| 10002     | default_cluster | 192.168.88.21 | 192.168.88.21 | 9050          | -1     | -1       | -1       | NULL          | NULL          | false | false                | false                 | 0         | 0.000            | 1.000 B       | 0.000         | 0.00 %  | 0.00 %         | {"location" : "default"} | java.net.ConnectException: 拒绝连接 (Connection refused)     |         | {"lastSuccessReportTabletsTime":"N/A","lastStreamLoadTime":-1,"isQueryDisabled":false,"isLoadDisabled":false} |
+-----------+-----------------+---------------+---------------+---------------+--------+----------+----------+---------------+---------------+-------+----------------------+-----------------------+-----------+------------------+---------------+---------------+---------+----------------+--------------------------+--------------------------------------------------------------+---------+---------------------------------------------------------------------------------------------------------------+
1 row in set (0.22 sec)

The BE failed to start: Alive is false and ErrMsg shows Connection refused.
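The table is wide enough that the Alive column is easy to miss, so here is a sketch that filters it. It assumes the tab-separated output of `mysql -N -B` and the column order shown in this 1.1.1 output (IP in column 3, HeartbeatPort in column 5, Alive in column 11); other Doris versions may order the columns differently, and `dead_backends` is my own helper name.

```shell
# Read tab-separated rows of SHOW PROC '/backends' on stdin and print
# "IP:HeartbeatPort" for every backend whose Alive column is not "true".
# Column positions are an assumption taken from the 1.1.1 output in this post.
dead_backends() {
    awk -F '\t' '$11 != "true" { print $3 ":" $5 }'
}

# Usage idea against the FE from this post:
#   mysql -h 192.168.88.21 -P 9030 -uroot -N -B \
#       -e "SHOW PROC '/backends'" | dead_backends
```

No output means every backend reported alive.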

Fixing the BE

root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/be/log# pwd
/export/server/apache-doris-1.1.1-bin-x86/be/log
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/be/log# ll
总用量 12
drwxrwxr-x 2 1020 1020 4096 8月  14 21:49 ./
drwxr-xr-x 8 1020 1020 4096 7月  25 15:11 ../
-rw-r--r-- 1 root root 3592 8月  14 21:52 be.out
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/be/log# cat be.out
start time: 2022年 08月 14日 星期日 21:49:50 CST
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0814 21:49:50.916643  4846 env.cpp:46] Env init successfully.
W0814 21:49:50.916720  4846 options.cpp:65] path can not be canonicalized. may be not exist. path=/dorisdata/storage.SSD
W0814 21:49:50.916730  4846 options.cpp:142] failed to parse store path /dorisdata/storage.SSD, res=-203
W0814 21:49:50.916736  4846 options.cpp:146] fail to parse storage_root_path config. value=[/dorisdata/storage.SSD]
F0814 21:49:50.916743  4846 doris_main.cpp:357] parse config storage path failed, path=/dorisdata/storage.SSD
*** Check failure stack trace: ***
*** Aborted at 1660484990 (unix time) try "date -d @1660484990" if you are using GNU date ***
*** SIGABRT unkown detail explain (@0x12ee) received by PID 4846 (TID 0x7f671f1d42c0) from PID 4846; stack trace: ***
 0# 0x000056125E31B768 in /export/server/apache-doris-1.1.1-bin-x86/be/lib/doris_be
 1# 0x00007F671F23E090 in /lib/x86_64-linux-gnu/libc.so.6
 2# raise at ../sysdeps/unix/sysv/linux/raise.c:51
 3# abort at /build/glibc-SzIz7B/glibc-2.31/stdlib/abort.c:81
 4# 0x000056125DFF1817 in /export/server/apache-doris-1.1.1-bin-x86/be/lib/doris_be
 5# 0x0000561260435C4D in /export/server/apache-doris-1.1.1-bin-x86/be/lib/doris_be
 6# google::LogMessage::SendToLog() in /export/server/apache-doris-1.1.1-bin-x86/be/lib/doris_be
 7# google::LogMessage::Flush() in /export/server/apache-doris-1.1.1-bin-x86/be/lib/doris_be
 8# google::LogMessageFatal::~LogMessageFatal() in /export/server/apache-doris-1.1.1-bin-x86/be/lib/doris_be
 9# main in /export/server/apache-doris-1.1.1-bin-x86/be/lib/doris_be
10# __libc_start_main at ../csu/libc-start.c:342
11# _start in /export/server/apache-doris-1.1.1-bin-x86/be/lib/doris_be

start time: 2022年 08月 14日 星期日 21:52:28 CST
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0814 21:52:28.101946  5285 env.cpp:46] Env init successfully.
W0814 21:52:28.102023  5285 options.cpp:65] path can not be canonicalized. may be not exist. path=/dorisdata/storage.SSD
W0814 21:52:28.102032  5285 options.cpp:142] failed to parse store path /dorisdata/storage.SSD, res=-203
W0814 21:52:28.102037  5285 options.cpp:146] fail to parse storage_root_path config. value=[/dorisdata/storage.SSD]
F0814 21:52:28.102043  5285 doris_main.cpp:357] parse config storage path failed, path=/dorisdata/storage.SSD
*** Check failure stack trace: ***
*** Aborted at 1660485148 (unix time) try "date -d @1660485148" if you are using GNU date ***
*** SIGABRT unkown detail explain (@0x14a5) received by PID 5285 (TID 0x7fc79e7902c0) from PID 5285; stack trace: ***
 0# 0x000055F6AA31B768 in /export/server/apache-doris-1.1.1-bin-x86/be/lib/doris_be
 1# 0x00007FC79E7FA090 in /lib/x86_64-linux-gnu/libc.so.6
 2# raise at ../sysdeps/unix/sysv/linux/raise.c:51
 3# abort at /build/glibc-SzIz7B/glibc-2.31/stdlib/abort.c:81
 4# 0x000055F6A9FF1817 in /export/server/apache-doris-1.1.1-bin-x86/be/lib/doris_be
 5# 0x000055F6AC435C4D in /export/server/apache-doris-1.1.1-bin-x86/be/lib/doris_be
 6# google::LogMessage::SendToLog() in /export/server/apache-doris-1.1.1-bin-x86/be/lib/doris_be
 7# google::LogMessage::Flush() in /export/server/apache-doris-1.1.1-bin-x86/be/lib/doris_be
 8# google::LogMessageFatal::~LogMessageFatal() in /export/server/apache-doris-1.1.1-bin-x86/be/lib/doris_be
 9# main in /export/server/apache-doris-1.1.1-bin-x86/be/lib/doris_be
10# __libc_start_main at ../csu/libc-start.c:342
11# _start in /export/server/apache-doris-1.1.1-bin-x86/be/lib/doris_be


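The log shows two start attempts that failed for essentially the same reason: the path /dorisdata/storage.SSD does not exist. The directory I created earlier was /dorisdata/storage, without the .SSD suffix named in the configuration. Let's try a different way of specifying it.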
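Judging from the log, the BE treats the whole string before the comma, suffix included, as the directory name. A quick way to catch this kind of mismatch is to split the storage_root_path value and test each directory. This is a sketch under that assumption; the function names are mine, and the parsing only covers the `path[,options]` entries separated by `;` that the be.conf comments describe.

```shell
# Print each directory named in a storage_root_path value, one per line.
# Entries are separated by ";" and anything after the first "," is an option.
storage_paths() {
    printf '%s\n' "$1" | tr ';' '\n' |
        sed -e 's/,.*$//' -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' -e '/^$/d'
}

# Print the configured directories that do not exist on this machine.
missing_storage_paths() {
    local dir
    storage_paths "$1" | while IFS= read -r dir; do
        if [ ! -d "$dir" ]; then
            echo "$dir"
        fi
    done
}

# Usage idea with the value from this post:
#   missing_storage_paths "/dorisdata/storage.SSD"
```

Anything the second function prints has to be created under exactly that name before the BE is started.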

Set the BE storage path again:

root@zhiyong-doris:/dorisdata/storage# cd /export/server/apache-doris-1.1.1-bin-x86/be/conf
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/be/conf# vim be.conf
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/be/conf# cat be.conf
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements.  See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership.  The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License.  You may obtain a copy of the License at
#
#   http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied.  See the License for the
# specific language governing permissions and limitations
# under the License.

PPROF_TMPDIR="$DORIS_HOME/log/"

# INFO, WARNING, ERROR, FATAL
sys_log_level = INFO

# ports for admin, web, heartbeat service
be_port = 9060
webserver_port = 8040
heartbeat_service_port = 9050
brpc_port = 8060

# Choose one if there are more than one ip except loopback address.
# Note that there should at most one ip match this list.
# If no ip match this rule, will choose one randomly.
# use CIDR format, e.g. 10.10.10.0/24
# Default value is empty.
# priority_networks = 10.10.10.0/24;192.168.0.0/16

# data root path, separate by ';'
# you can specify the storage medium of each root path, HDD or SSD
# you can add capacity limit at the end of each root path, seperate by ','
# eg:
# storage_root_path = /home/disk1/doris.HDD,50;/home/disk2/doris.SSD,1;/home/disk2/doris
# /home/disk1/doris.HDD, capacity limit is 50GB, HDD;
# /home/disk2/doris.SSD, capacity limit is 1GB, SSD;
# /home/disk2/doris, capacity limit is disk capacity, HDD(default)
#
# you also can specify the properties by setting '<property>:<value>', seperate by ','
# property 'medium' has a higher priority than the extension of path
#
# Default value is ${DORIS_HOME}/storage, you should create it by hand.
# storage_root_path = ${DORIS_HOME}/storage
#storage_root_path = /dorisdata/storage.SSD
storage_root_path=/dorisdata/storage,medium:ssd

# Advanced configurations
# sys_log_dir = ${DORIS_HOME}/log
# sys_log_roll_mode = SIZE-MB-1024
# sys_log_roll_num = 10
# sys_log_verbose_modules = *
# log_buffer_level = -1
# palo_cgroups

The restart still fails, with the following log:

start time: 2022年 08月 14日 星期日 22:02:36 CST
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0814 22:02:36.638031  5724 env.cpp:46] Env init successfully.
*** Check failure stack trace: ***
    @     0x55ac28235c4d  google::LogMessage::Fail()
    @     0x55ac28238189  google::LogMessage::SendToLog()
    @     0x55ac282357b6  google::LogMessage::Flush()
    @     0x55ac282387f9  google::LogMessageFatal::~LogMessageFatal()
    @     0x55ac25f04c4f  main
    @     0x7fe7b402c083  __libc_start_main
    @     0x55ac26118fea  _start
    @              (nil)  (unknown)
*** Aborted at 1660485756 (unix time) try "date -d @1660485756" if you are using GNU date ***
*** SIGABRT unkown detail explain (@0x165c) received by PID 5724 (TID 0x7fe7b3fe12c0) from PID 5724; stack trace: ***
 0# 0x000055AC2611B768 in /export/server/apache-doris-1.1.1-bin-x86/be/lib/doris_be
 1# 0x00007FE7B404B090 in /lib/x86_64-linux-gnu/libc.so.6
 2# raise at ../sysdeps/unix/sysv/linux/raise.c:51
 3# abort at /build/glibc-SzIz7B/glibc-2.31/stdlib/abort.c:81
 4# 0x000055AC25DF2124 in /export/server/apache-doris-1.1.1-bin-x86/be/lib/doris_be
 5# 0x000055AC28235C4D in /export/server/apache-doris-1.1.1-bin-x86/be/lib/doris_be
 6# google::LogMessage::SendToLog() in /export/server/apache-doris-1.1.1-bin-x86/be/lib/doris_be
 7# google::LogMessage::Flush() in /export/server/apache-doris-1.1.1-bin-x86/be/lib/doris_be
 8# google::LogMessageFatal::~LogMessageFatal() in /export/server/apache-doris-1.1.1-bin-x86/be/lib/doris_be
 9# main in /export/server/apache-doris-1.1.1-bin-x86/be/lib/doris_be
10# __libc_start_main at ../csu/libc-start.c:342
11# _start in /export/server/apache-doris-1.1.1-bin-x86/be/lib/doris_be

Surprisingly, that was enough to fix the unrecognized-path problem. Next, check the BE logs:

root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/be/log# ll
总用量 28
drwxrwxr-x 2 1020 1020 4096 8月 14 22:02 ./
drwxr-xr-x 8 1020 1020 4096 7月 25 15:11 ../
lrwxrwxrwx 1 root root   27 8月 14 22:02 be.INFO -> be.INFO.log.20220814-220236
-rw-r--r-- 1 root root 4366 8月 14 22:25 be.INFO.log.20220814-220236
-rw-r--r-- 1 root root 7038 8月 14 22:25 be.out
lrwxrwxrwx 1 root root   30 8月 14 22:02 be.WARNING -> be.WARNING.log.20220814-220236
-rw-r--r-- 1 root root 1086 8月 14 22:25 be.WARNING.log.20220814-220236
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/be/log# cat be.WARNING.log.20220814-220236
E0814 22:02:36.710678  5724 storage_engine.cpp:426] File descriptor number is less than 60000. Please use (ulimit -n) to set a value equal or greater than 60000
W0814 22:02:36.710800  5724 storage_engine.cpp:188] check fd number failed, error: Internal error: file descriptors limit is too small
W0814 22:02:36.710812  5724 storage_engine.cpp:102] open engine failed, error: Internal error: file descriptors limit is too small
F0814 22:02:36.711380  5724 doris_main.cpp:405] fail to open StorageEngine, res=file descriptors limit is too small
E0814 22:25:04.640094  7357 storage_engine.cpp:426] File descriptor number is less than 60000. Please use (ulimit -n) to set a value equal or greater than 60000
W0814 22:25:04.640141  7357 storage_engine.cpp:188] check fd number failed, error: Internal error: file descriptors limit is too small
W0814 22:25:04.640148  7357 storage_engine.cpp:102] open engine failed, error: Internal error: file descriptors limit is too small
F0814 22:25:04.640600  7357 doris_main.cpp:405] fail to open StorageEngine, res=file descriptors limit is too small

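The log shows that the file descriptor limit is too low, so BE refuses to start. The limit must be set to 60000 or higher before BE will start normally. Following the official documentation: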

root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/be/log# vi /etc/security/limits.conf
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/be/log# cat /etc/security/limits.conf
# /etc/security/limits.conf
#
#Each line describes a limit for a user in the form:
#
#<domain>        <type>  <item>  <value>
#
#Where:
#<domain> can be:
#        - a user name
#        - a group name, with @group syntax
#        - the wildcard *, for default entry
#        - the wildcard %, can be also used with %group syntax,
#                 for maxlogin limit
#        - NOTE: group and wildcard limits are not applied to root.
#          To apply a limit to the root user, <domain> must be
#          the literal username root.
#
#<type> can have the two values:
#        - "soft" for enforcing the soft limits
#        - "hard" for enforcing hard limits
#
#<item> can be one of the following:
#        - core - limits the core file size (KB)
#        - data - max data size (KB)
#        - fsize - maximum filesize (KB)
#        - memlock - max locked-in-memory address space (KB)
#        - nofile - max number of open file descriptors
#        - rss - max resident set size (KB)
#        - stack - max stack size (KB)
#        - cpu - max CPU time (MIN)
#        - nproc - max number of processes
#        - as - address space limit (KB)
#        - maxlogins - max number of logins for this user
#        - maxsyslogins - max number of logins on the system
#        - priority - the priority to run user process with
#        - locks - max number of file locks the user can hold
#        - sigpending - max number of pending signals
#        - msgqueue - max memory used by POSIX message queues (bytes)
#        - nice - max nice priority allowed to raise to values: [-20, 19]
#        - rtprio - max realtime priority
#        - chroot - change root to directory (Debian-specific)
#
#<domain>      <type>  <item>         <value>
#

#*               soft    core            0
#root            hard    core            100000
#*               hard    rss             10000
#@student        hard    nproc           20
#@faculty        soft    nproc           20
#@faculty        hard    nproc           50
#ftp             hard    nproc           0
#ftp             -       chroot          /ftp
#@student        -       maxlogins       4
* soft nofile 65536
* hard nofile 65536

# End of file

After rebooting Ubuntu, the problem persists:

E0814 22:38:35.834836  4142 storage_engine.cpp:426] File descriptor number is less than 60000. Please use (ulimit -n) to set a value equal or greater than 60000
W0814 22:38:35.835243  4142 storage_engine.cpp:188] check fd number failed, error: Internal error: file descriptors limit is too small
W0814 22:38:35.835258  4142 storage_engine.cpp:102] open engine failed, error: Internal error: file descriptors limit is too small
F0814 22:38:35.835891  4142 doris_main.cpp:405] fail to open StorageEngine, res=file descriptors limit is too small

Clearly the file descriptor limit is still too low. Following the official documentation alone was not enough here.
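
Before editing more files, it can help to check which limit the shell that launches BE actually has, since a child process normally inherits the limits of the shell that starts it. The following is a minimal sketch, not part of the original steps: the helper name check_nofile is hypothetical, and the 60000 threshold is taken from the BE log message above.

```shell
# Minimum open-file limit that the BE log above asks for.
required=60000

# check_nofile LIMIT REQUIRED (hypothetical helper):
# succeeds when LIMIT is "unlimited" or at least REQUIRED.
check_nofile() {
  limit=$1
  need=$2
  if [ "$limit" = "unlimited" ]; then
    return 0
  fi
  [ "$limit" -ge "$need" ]
}

# Soft limit on open files for the current shell.
current=$(ulimit -n)

if check_nofile "$current" "$required"; then
  echo "ok: open files limit is $current"
else
  echo "too low: open files limit is $current, BE needs >= $required"
fi
```

If this reports a value that is too low, the edits to the configuration files have not reached the current session, whatever the files themselves contain.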

Edit the limits configuration again:

root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/be/log# vim /etc/security/limits.conf
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/be/log# cat /etc/security/limits.conf
# /etc/security/limits.conf
#
#Each line describes a limit for a user in the form:
#
#<domain>        <type>  <item>  <value>
#
#Where:
#<domain> can be:
#        - a user name
#        - a group name, with @group syntax
#        - the wildcard *, for default entry
#        - the wildcard %, can be also used with %group syntax,
#                 for maxlogin limit
#        - NOTE: group and wildcard limits are not applied to root.
#          To apply a limit to the root user, <domain> must be
#          the literal username root.
#
#<type> can have the two values:
#        - "soft" for enforcing the soft limits
#        - "hard" for enforcing hard limits
#
#<item> can be one of the following:
#        - core - limits the core file size (KB)
#        - data - max data size (KB)
#        - fsize - maximum filesize (KB)
#        - memlock - max locked-in-memory address space (KB)
#        - nofile - max number of open file descriptors
#        - rss - max resident set size (KB)
#        - stack - max stack size (KB)
#        - cpu - max CPU time (MIN)
#        - nproc - max number of processes
#        - as - address space limit (KB)
#        - maxlogins - max number of logins for this user
#        - maxsyslogins - max number of logins on the system
#        - priority - the priority to run user process with
#        - locks - max number of file locks the user can hold
#        - sigpending - max number of pending signals
#        - msgqueue - max memory used by POSIX message queues (bytes)
#        - nice - max nice priority allowed to raise to values: [-20, 19]
#        - rtprio - max realtime priority
#        - chroot - change root to directory (Debian-specific)
#
#<domain>      <type>  <item>         <value>
#

#*               soft    core            0
#root            hard    core            100000
#*               hard    rss             10000
#@student        hard    nproc           20
#@faculty        soft    nproc           20
#@faculty        hard    nproc           50
#ftp             hard    nproc           0
#ftp             -       chroot          /ftp
#@student        -       maxlogins       4
* soft nofile 204800
* hard nofile 204800
* soft nproc 204800
* hard nproc 204800

# End of file

Then edit one more file:

root@zhiyong-doris:/# vim /etc/sysctl.conf
root@zhiyong-doris:/# cat /etc/sysctl.conf
#
# /etc/sysctl.conf - Configuration file for setting system variables
# See /etc/sysctl.d/ for additional system variables.
# See sysctl.conf (5) for information.
#

#kernel.domainname = example.com

# Uncomment the following to stop low-level messages on console
#kernel.printk = 3 4 1 3

##############################################################3
# Functions previously found in netbase
#

# Uncomment the next two lines to enable Spoof protection (reverse-path filter)
# Turn on Source Address Verification in all interfaces to
# prevent some spoofing attacks
#net.ipv4.conf.default.rp_filter=1
#net.ipv4.conf.all.rp_filter=1

# Uncomment the next line to enable TCP/IP SYN cookies
# See http://lwn.net/Articles/277146/
# Note: This may impact IPv6 TCP sessions too
#net.ipv4.tcp_syncookies=1

# Uncomment the next line to enable packet forwarding for IPv4
#net.ipv4.ip_forward=1

# Uncomment the next line to enable packet forwarding for IPv6
#  Enabling this option disables Stateless Address Autoconfiguration
#  based on Router Advertisements for this host
#net.ipv6.conf.all.forwarding=1


###################################################################
# Additional settings - these settings can improve the network
# security of the host and prevent against some network attacks
# including spoofing attacks and man in the middle attacks through
# redirection. Some network environments, however, require that these
# settings are disabled so review and enable them as needed.
#
# Do not accept ICMP redirects (prevent MITM attacks)
#net.ipv4.conf.all.accept_redirects = 0
#net.ipv6.conf.all.accept_redirects = 0
# _or_
# Accept ICMP redirects only for gateways listed in our default
# gateway list (enabled by default)
# net.ipv4.conf.all.secure_redirects = 1
#
# Do not send ICMP redirects (we are not a router)
#net.ipv4.conf.all.send_redirects = 0
#
# Do not accept IP source route packets (we are not a router)
#net.ipv4.conf.all.accept_source_route = 0
#net.ipv6.conf.all.accept_source_route = 0
#
# Log Martian Packets
#net.ipv4.conf.all.log_martians = 1
#

###################################################################
# Magic system request Key
# 0=disable, 1=enable all, >1 bitmask of sysrq functions
# See https://www.kernel.org/doc/html/latest/admin-guide/sysrq.html
# for what other values do
#kernel.sysrq=438
fs.file-max = 6553560

Check the open files limit:

root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/be/log# ulimit -a | grep open
open files                      (-n) 1024
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/be/log# vim /etc/profile
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/be/log# cat /etc/profile
# /etc/profile: system-wide .profile file for the Bourne shell (sh(1))
# and Bourne compatible shells (bash(1), ksh(1), ash(1), ...).

if [ "${PS1-}" ]; then
  if [ "${BASH-}" ] && [ "$BASH" != "/bin/sh" ]; then
    # The file bash.bashrc already sets the default PS1.
    # PS1='\h:\w\$ '
    if [ -f /etc/bash.bashrc ]; then
      . /etc/bash.bashrc
    fi
  else
    if [ "`id -u`" -eq 0 ]; then
      PS1='# '
    else
      PS1='$ '
    fi
  fi
fi

if [ -d /etc/profile.d ]; then
  for i in /etc/profile.d/*.sh; do
    if [ -r $i ]; then
      . $i
    fi
  done
  unset i
fi

ulimit -u 204800
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/be/log# source /etc/profile
zhiyong@zhiyong-doris:~$ ulimit -a | grep open
open files                      (-n) 204800
root@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/be/log# ulimit -a | grep open
open files                      (-n) 1024

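Evidently the setting does not take effect for root on Ubuntu 20.04, while the regular user zhiyong does get 204800. This matches the NOTE in limits.conf itself: group and wildcard limits are not applied to root, and a line whose domain is the literal username root would be needed. Note also that the ulimit -u line added to /etc/profile sets the maximum number of user processes, not open files (that would be ulimit -n), so it does not change the value shown here. The workaround used in this walkthrough is to change the owner of the relevant files to a regular user and start Doris as that user: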

root@zhiyong-doris:/# chown -R zhiyong:zhiyong /export
root@zhiyong-doris:/# chown -R zhiyong:zhiyong /dorisdata
zhiyong@zhiyong-doris:~$ start_fe.sh --daemon
zhiyong@zhiyong-doris:~$ start_be.sh --daemon
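
The NOTE inside limits.conf (visible in the dumps above) says that wildcard limits are not applied to root and that the literal username root must be used as the domain. As an alternative to changing the file owner, the entries would look like the output of the sketch below. This was not tried in this walkthrough, so treat it as an assumption; nofile_lines is a hypothetical helper, and 204800 is simply the value already used above.

```shell
# nofile_lines USER VALUE (hypothetical helper):
# print the soft and hard nofile entries for one user in limits.conf syntax.
nofile_lines() {
  printf '%s soft nofile %s\n' "$1" "$2"
  printf '%s hard nofile %s\n' "$1" "$2"
}

# Print the entries for review; nothing is written to the system file here.
nofile_lines root 204800
```

The printed lines would go into /etc/security/limits.conf next to the wildcard entries, followed by a fresh login so that the new session picks them up.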

After restarting, open the BE health check URL:

http://192.168.88.21:8040/api/health


Startup finally succeeded!

zhiyong@zhiyong-doris:~$ mysql -h 192.168.88.21 -P 9030 -uroot

mysql> SHOW PROC '/backends';
+-----------+-----------------+---------------+---------------+---------------+--------+----------+----------+---------------------+---------------------+-------+----------------------+-----------------------+-----------+------------------+---------------+---------------+---------+----------------+--------------------------+--------+----------------------+-------------------------------------------------------------------------------------------------------------------------------+
| BackendId | Cluster         | IP            | HostName      | HeartbeatPort | BePort | HttpPort | BrpcPort | LastStartTime       | LastHeartbeat       | Alive | SystemDecommissioned | ClusterDecommissioned | TabletNum | DataUsedCapacity | AvailCapacity | TotalCapacity | UsedPct | MaxDiskUsedPct | Tag                      | ErrMsg | Version              | Status                                                                                                                        |
+-----------+-----------------+---------------+---------------+---------------+--------+----------+----------+---------------------+---------------------+-------+----------------------+-----------------------+-----------+------------------+---------------+---------------+---------+----------------+--------------------------+--------+----------------------+-------------------------------------------------------------------------------------------------------------------------------+
| 10002     | default_cluster | 192.168.88.21 | 192.168.88.21 | 9050          | 9060   | 8040     | 8060     | 2022-08-14 23:09:58 | 2022-08-14 23:15:58 | true  | false                | false                 | 0         | 0.000            | 268.832 GB    | 293.797 GB    | 8.50 %  | 8.50 %         | {"location" : "default"} |        | 1.1.1-rc03-2dbd70bf9 | {"lastSuccessReportTabletsTime":"2022-08-14 23:15:22","lastStreamLoadTime":-1,"isQueryDisabled":false,"isLoadDisabled":false} |
+-----------+-----------------+---------------+---------------+---------------+--------+----------+----------+---------------------+---------------------+-------+----------------------+-----------------------+-----------+------------------+---------------+---------------+---------+----------------+--------------------------+--------+----------------------+-------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.22 sec)

The backend now reports Alive=true.
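
With more than one BE, the wide table above becomes hard to read. A minimal sketch for pulling out just the IP and Alive columns follows; it assumes the mysql client's batch mode (-B), which prints tab-separated rows with a header line, and alive_flags is a hypothetical helper. The columns are located by header name rather than by position.

```shell
# alive_flags (hypothetical helper): read tab-separated SHOW PROC '/backends'
# output, header included, on stdin and print "<IP> <Alive>" per backend row.
alive_flags() {
  awk -F'\t' '
    NR == 1 {
      # Locate the IP and Alive columns by their header names.
      for (i = 1; i <= NF; i++) {
        if ($i == "IP") ip = i
        if ($i == "Alive") al = i
      }
      next
    }
    ip && al { print $ip, $al }
  '
}

# Usage against the FE from this walkthrough:
# mysql -h 192.168.88.21 -P 9030 -uroot -B -e "SHOW PROC '/backends'" | alive_flags
```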

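FS_Broker deployment

From the official documentation: Broker is deployed as a plugin, independently of Doris. To import data from a third-party storage system, the corresponding Broker must be deployed. An fs_broker that reads HDFS and object storage is provided by default. fs_broker is stateless, and deploying one Broker on every FE and BE node is recommended.

Copy the files

The files are already in the right location, so no copy is needed.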

zhiyong@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86$ ll
总用量 120
drwxrwxr-x  7 zhiyong zhiyong  4096 7月 25 15:11 ./
drwxrwxrwx  3 zhiyong zhiyong  4096 8月 14 21:09 ../
drwxr-xr-x  5 zhiyong zhiyong  4096 7月 25 15:11 apache_hdfs_broker/
drwxr-xr-x  8 zhiyong zhiyong  4096 7月 25 15:11 be/
drwxr-xr-x 10 zhiyong zhiyong  4096 8月 14 21:20 fe/
-rw-rw-r--  1 zhiyong zhiyong 86171 7月 25 15:11 LICENSE-dist.txt
drwxrwxr-x  2 zhiyong zhiyong  4096 7月 25 15:11 licenses/
-rw-rw-r--  1 zhiyong zhiyong  1948 7月 25 15:11 NOTICE.txt
drwxr-xr-x  4 zhiyong zhiyong  4096 7月 25 15:11 udf/

The official package already ships with the HDFS broker.

Modify the Broker configuration

zhiyong@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/apache_hdfs_broker/conf$ pwd
/export/server/apache-doris-1.1.1-bin-x86/apache_hdfs_broker/conf
zhiyong@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/apache_hdfs_broker/conf$ ll
总用量 20
drwxr-xr-x 2 zhiyong zhiyong 4096 7月 25 15:11 ./
drwxr-xr-x 5 zhiyong zhiyong 4096 7月 25 15:11 ../
-rw-rw-r-- 1 zhiyong zhiyong 1543 3月 24 16:59 apache_hdfs_broker.conf
-rw-rw-r-- 1 zhiyong zhiyong  956 3月 24 16:59 hdfs-site.xml
-rw-rw-r-- 1 zhiyong zhiyong 1426 3月 24 16:59 log4j.properties

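hdfs-site.xml has to be modified so that the Broker can connect to HDFS, and apache_hdfs_broker.conf needs changes as well. I am unlikely to have Doris pull data from HDFS by itself; at most I will load data with Spark or Flink, so this step is skipped.

Start the Broker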

zhiyong@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/apache_hdfs_broker/bin$ pwd
/export/server/apache-doris-1.1.1-bin-x86/apache_hdfs_broker/bin
zhiyong@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/apache_hdfs_broker/bin$ ll
总用量 16
drwxr-xr-x 2 zhiyong zhiyong 4096 7月 25 15:11 ./
drwxr-xr-x 5 zhiyong zhiyong 4096 7月 25 15:11 ../
-rwxrwxr-x 1 zhiyong zhiyong 2892 5月 8 12:06 start_broker.sh*
-rwxrwxr-x 1 zhiyong zhiyong 1603 3月 24 16:59 stop_broker.sh*
zhiyong@zhiyong-doris:/export/server/apache-doris-1.1.1-bin-x86/apache_hdfs_broker/bin$ ./start_broker.sh --daemon

It is used the same way as the other start scripts.

Add the Broker

Connect to the running FE with mysql-client and execute the following command:

ALTER SYSTEM ADD BROKER broker_name "broker_host1:broker_ipc_port1","broker_host2:broker_ipc_port2",...;

Here broker_host is the IP of the node where the Broker runs, and broker_ipc_port is defined in the Broker configuration file conf/apache_hdfs_broker.conf.
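
To avoid typing the port by hand, the value can be read from the configuration file and the statement assembled from it. A minimal sketch under stated assumptions: read_broker_port and add_broker_sql are hypothetical helpers, the config line is assumed to have the form broker_ipc_port=<number>, and the broker name broker_zhiyong in the usage comment is made up for illustration.

```shell
# read_broker_port FILE (hypothetical helper):
# print the first broker_ipc_port value found in FILE.
read_broker_port() {
  sed -n 's/^[[:space:]]*broker_ipc_port[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*$/\1/p' "$1" | head -n 1
}

# add_broker_sql NAME HOST PORT (hypothetical helper):
# print the ALTER SYSTEM statement for a single Broker.
add_broker_sql() {
  printf 'ALTER SYSTEM ADD BROKER %s "%s:%s";\n' "$1" "$2" "$3"
}

# Usage for this single-node setup (broker name is an assumption):
# conf=/export/server/apache-doris-1.1.1-bin-x86/apache_hdfs_broker/conf/apache_hdfs_broker.conf
# port=$(read_broker_port "$conf")
# add_broker_sql broker_zhiyong 192.168.88.21 "$port"
```

The printed statement can then be pasted into the mysql client session connected to the FE.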

Check the Broker status

Connect to any running FE with mysql-client and execute the following command to view the Broker status:

SHOW PROC "/brokers";

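Noted here for later use.

Supervisor daemon

In production, every instance should be started under a process supervisor such as Supervisor, so that a process is brought back up automatically after it exits. In version 0.9.0 and earlier, running under a supervisor required editing each start_xx.sh script and removing the trailing & character. Starting with version 0.10.0, calling sh start_xx.sh directly is enough. The Doris 1.1 installed here no longer needs this manual step, which is one of the perks of a newer version.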

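Component integration

Import and export are not my focus here. What I care about most is that Doris provides connectors for both Spark and Flink.

Spark connector

Official documentation: https://doris.apache.org/zh-CN/docs/ecosystem/spark-doris-connector/

Spark Doris Connector supports reading data stored in Doris through Spark, and also supports writing data to Doris through Spark.

Repository: https://github.com/apache/incubator-doris-spark-connector

  • Supports reading data from Doris
  • Supports batch and streaming writes of Spark DataFrames to Doris
  • A Doris table can be mapped to a DataFrame or an RDD; DataFrame is recommended
  • Supports filtering data on the Doris side to reduce the amount of data transferred.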

Version compatibility

Connector       Spark   Doris    Java   Scala
2.3.4-2.11.xx   2.x     0.12+    8      2.11
3.1.2-2.12.xx   3.x     0.12.+   8      2.12
3.2.0-2.12.xx   3.2.x   0.12.+   8      2.12

Unfortunately, Spark 3.3.0, the latest release at the time of writing, is not yet supported.

GAV:

<dependency>
  <groupId>org.apache.doris</groupId>
  <artifactId>spark-doris-connector-3.1_2.12</artifactId>
  <!--artifactId>spark-doris-connector-2.3_2.11</artifactId-->
  <version>1.0.1</version>
</dependency>

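Both the batch and streaming modes of Spark are supported, and the feature set is fairly complete. Since RDD and DataFrame are supported, SQL on top of them works as well. The official documentation covers the details.

Flink connector

Official documentation: https://doris.apache.org/zh-CN/docs/ecosystem/flink-doris-connector

Flink Doris Connector supports operating on (reading, inserting, updating, deleting) data stored in Doris through Flink.

Repository: https://github.com/apache/doris-flink-connector

  • A Doris table can be mapped to a DataStream or a Table

Note:

  1. Updates and deletes are supported only on the Unique Key model
  2. Deletion currently happens automatically only when data is ingested through Flink CDC; with any other ingestion method, deletion has to be implemented by hand. For how to use Flink CDC data deletion, see the last section of the official documentation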

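Version compatibility

Connector         Flink    Doris   Java   Scala
1.14_2.11-1.1.0   1.14.x   1.0+    8      2.11
1.14_2.12-1.1.0   1.14.x   1.0+    8      2.12

Flink 1.14, a milestone release, is fully supported. The latest release at the time of writing, 1.15, is not supported yet.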

GAV:

<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-java</artifactId>
    <version>${flink.version}</version>
    <scope>provided</scope>
</dependency>
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-streaming-java_${scala.version}</artifactId>
    <version>${flink.version}</version>
    <scope>provided</scope>
</dependency>
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-clients_${scala.version}</artifactId>
    <version>${flink.version}</version>
    <scope>provided</scope>
</dependency>
<!-- flink table -->
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-table-planner_${scala.version}</artifactId>
    <version>${flink.version}</version>
    <scope>provided</scope>
</dependency>

<!-- flink-doris-connector -->
<dependency>
  <groupId>org.apache.doris</groupId>
  <artifactId>flink-doris-connector-1.14_2.12</artifactId>
  <version>1.1.0</version>
</dependency>  

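Both the DataStream and Table APIs of Flink are supported, so SQL on top of them works too. The official documentation covers the details.

Examples

Official examples: https://github.com/apache/doris/tree/master/samples/doris-demo

The repository provides demos in Java, C++, and Python, as well as demos for Spark, Flink, and Spring-JDBC, which makes it very friendly to newcomers.

Shutting down the components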

zhiyong@zhiyong-doris:~$ stop_be.sh
stop doris_be, and remove pid file.
zhiyong@zhiyong-doris:~$ stop_fe.sh
stop java, and remove pid file.

The stop commands are just as simple and to the point.
