安装SmartbiMPP 单节点（rpm/deb包）

一、安装指南

1、系统环境要求

组件	要求
操作系统	只支持 64位的Linux系统建议： CentOS 6.6及以上 RedHat 6.6及以上 Ubuntu 22.04及以上
CPU	建议8核或以上
内存	建议32G以上
硬盘	建议固态硬盘或高速硬盘空间大小2048G

官方预构建的二进制文件通常针对x86_64进行编译，并利用SSE 4.2指令集，部署前需要检查环境是否支持SSE4.2指令（arm架构平台，不需要检测SSE4.2指令）

grep -q sse4_2 /proc/cpuinfo && echo "SSE 4.2 supported" || echo "SSE 4.2 not supported"

2、安装介质装备

SmartbiMPP 介质（RPM包）

介质名称
clickhouse-common-static-22.8.9.24.x86_64.rpm
clickhouse-server-22.8.9.24.x86_64.rpm
clickhouse-client-22.8.9.24.x86_64.rpm

SmartbiMPP 介质（deb包）

介质名称
clickhouse-common-static_22.8.9.24_amd64.deb
clickhouse-server_22.8.9.24_amd64.deb
clickhouse-client_22.8.9.24_amd64.deb

安装权限要求：

安装时用root用户运行或使用具有sudo权限的用户执行，推荐使用root用户安装。

3、设置系统环境

3.1 取消打开文件数限制

在/etc/security/limits.conf 文件的末尾加入以下内容(注意： * 不能省略)：

* soft nofile 65536
* hard nofile 65536
* soft nproc 131072
* hard nproc 131072

在/etc/sysctl.conf中，添加以下内容：

fs.inotify.max_user_watches=524288 
fs.inotify.max_user_instances=8192

保存后，执行以下命令让设置生效

sudo sysctl -p /etc/sysctl.conf

3.2 取消SELINUX（UBUNTU 系统不需要配置此步骤）

#临时关闭selinux
setenforce 0
#永久关闭Selinux，需要重启服务器生效
sed -i 's/=enforcing/=disabled/g'  /etc/selinux/config

3.3 关闭防火墙

1）CentOS 6.X/Redhat 6.X

service iptables stop
service ip6tables stop
chkconfig iptables off
chkconfig ip6tables off

2）CentOS 7.X/Redhat 7.X

systemctl stop firewalld 
systemctl disable firewalld 
systemctl status firewalld

3）Ubuntu系统

Ubuntu系统默认的防火墙是ufw，具体操作如下

ufw disable 
ufw status

3.4开启防火墙

如果需要开启防火墙，则需要开放smartbiMPP服务使用到的端口：

服务名	开放端口
smartbiMPP	9000,8123

Centos系统

开启端口：9000、8123

firewall-cmd --permanent --add-port=9000/tcp
firewall-cmd --permanent --add-port=8123/tcp

配置完以后重新加载firewalld，使配置生效

firewall-cmd --reload

查看防火墙的配置信息

firewall-cmd --list-all

Ubuntu系统

开启端口：9000、8123

ufw allow 8123/tcp
ufw allow 9000/tcp

配置完以后重新加载firewalld，使配置生效

ufw reload

查看防火墙的配置信息

ufw status numbered

温馨提示

建议关闭服务器虚拟内存，避免查询期间使用虚拟内存导致查询变慢和查询异常。

swapoff -a #临时关闭

4、安装SmartbiMPP

4.1 安装SmartbiMPP

RPM包安装：

rpm -ivh clickhouse-common-static-22.8.9.24.x86_64.rpm
rpm -ivh clickhouse-server-22.8.9.24.x86_64.rpm

注意：安装server服务时此处会出现输入密码提示：Enter password for default user:   
	选择一：此处可以设置密码，密码信息将会保存在/etc/clickhouse-server/users.d/default-password.xml文件中，后期修改密码可在此文件修改。
	选择二：跳过不设置密码，密码为空，没有生成/etc/clickhouse-server/users.d/default-password.xml文件，后期可以在/etc/clickhouse-server/users.xml文件中添加或修改密码。

rpm -ivh clickhouse-client-22.8.9.24.x86_64.rpm

DEB包安装

dpkg -i clickhouse-common-static_22.8.9.24_amd64.deb
dpkg -i clickhouse-server_22.8.9.24_amd64.deb

注意：安装server服务时此处会出现输入密码提示：Enter password for default user:   
	选择一：此处可以设置密码，密码信息将会保存在/etc/clickhouse-server/users.d/default-password.xml文件中，后期修改密码可在此文件修改。
	选择二：跳过不设置密码，密码为空，没有生成/etc/clickhouse-server/users.d/default-password.xml文件，后期可以在/etc/clickhouse-server/users.xml文件中添加或修改密码。

dpkg -i clickhouse-client_22.8.9.24_amd64.deb

5、配置修改说明

5.1 日志文件存放路径

修改的配置文件/etc/clickhouse-server/config.xml，可以配置mpp的日志存放路径。

如修改日志目录，启动时可能提示权限不足，需要为指定的日志目录授权

#chown -R clickhouse.clickhouse <新的日志目录>

5.2 修改默认端口

8123是JDBC驱动的连接端口。

9000是clickhouse-client的连接端口。

一般不建议修改端口配置。

5.3 修改数据目录

数据存储目录用来存储SMARTBIMPP的数据文件，可以设置其他目录用于存储SMARTBIMPP的数据文件，如果指定其他目录，服务启动时可能会提示目录权限不足，需要为指定的目录添加权限：

chown -R clickhouse.clickhouse <新数据目录>

根据实际情况（找一个磁盘空间比较大的存储路径），修改的配置文件/etc/clickhouse-server/config.xml，修改之后的内容如下：

5.4 开启远程访问权限

修改的配置文件/etc/clickhouse-server/config.xml，放开注释即可。IPV4地址和IPV6地址只能配置一个，否则会出现端口冲突的报错。

如果需要配置监听ipv4地址，则配置问<listen_host>0.0.0.0</listen_host>

如果需要配置监听ipv6地址，则配置问<listen_host>::</listen_host>

温馨提示

ipv4地址和ipv6地址只能配置一个。

不是所有环境都支持ipv6地址，建议优先选择配置监听ipv4地址。

5.5 修改SMARTBIMPP内存设置

22.8版本的clickhouse，修改内存使用的配置文件在/etc/clickhouse-server/config.xml

修改配置文件/etc/clickhouse-server/config.xml

<max_server_memory_usage>0</max_server_memory_usage>

<max_server_memory_usage_to_ram_ratio>0.9</max_server_memory_usage_to_ram_ratio>

温馨提示

max_server_memory_usage默认值为0，表示不限制内存使用。跟config.xml文件的max_server_memory_usage_to_ram_ratio参数一起使用。默认最大内存是服务器的物理内存的90%。

max_server_memory_usage的配置值不能大于服务器物理内存乘以max_server_memory_usage_to_ram_ratio的值。

max_server_memory_usage的配置值也不能太小，否则可能导致无法执行sql查询

5.6 其他配置修改：

修改配置文件/etc/clickhouse-server/user.xml

查询语句大小限制，默认的max_query_size最大是256 KiB，如果查询语句比较大，可以配置此项，下图配置为10G，可自行调整，但不超过mpp最大内存

<max_query_size>10737418240</max_query_size>

查询语法树中的最大元素数。默认值下为50000，如果超过，会引发异常。
<max_ast_elements>10000000</max_ast_elements>

<max_expanded_ast_elements>10000000</max_expanded_ast_elements>

5.7 修改用户密码

情况一：安装server服务时设置过密码

默认密码是在安装时已经设置的密码，密码会被加密处理后存放在/etc/clickhouse-server/users.d/default-password.xml配置文件中，若要修改修改密码，需要先将密码加密。然后替换默认密码

具体操作如下：

1）执行下面命令生成密码

echo -n "manager" | sha256sum | tr -d '-'

2）设置密码

vi /etc/clickhouse-server/users.d/default-password.xml  # 编辑该文件，将加密生成的密码替换之前的密码。

情况二：安装server服务时没有设置密码

没设置密码时，连接高速缓存库密码为空，修改/etc/clickhouse-server/users.xml文件可以修改密码。

方法一：设置明文密码

方法二：设置加密密码

echo -n "manager" | sha256sum | tr -d '-'

vi /etc/clickhouse-server/users.xml       #  编辑该文件，将加密生成的密码替换之前的密码。

温馨提示

1、建议使用方法二，设置加密的密码。

2、密码复杂度建议：密码长度大于8位+大小写字母+数字+特殊字符。

5.8、设置时区

修改的配置文件/etc/clickhouse-server/config.xml，设置时区为北京时间 Asia/Shanghai。

<timezone>Asia/Shanghai</timezone>

6 启动SMARTBIMPP

启动成功后，即可在smartbi中测试连接。

22.8版本的clickhouse，启动方式如下

clickhouse start

二、使用指南

1、运维操作

1）CentOS 6.X/Redhat 6.X

# 停止服务
clickhouse stop
# 启动服务
clickhouse start
# 查看服务进程
ps  -ef｜grep  clickhouse
# 设置开机自启
chkconfig clickhouse-server on

2）CentOS 7.X/Redhat 7.X

# 停止服务
clickhouse stop
# 启动服务
clickhouse start
# 查看服务进程
ps  -ef｜grep  clickhouse
# 设置开机自启
systemctl enable clickhouse-server

3）Ubuntu系统

# 停止服务
clickhouse stop
# 启动服务
clickhouse start
# 查看服务进程
ps  -ef｜grep  clickhouse
# 设置开机自启
systemctl enable clickhouse-server

2、上手教程

2.1 命令行登录

# 用clickhouse-client连接本机clickhouse-server服务器
clickhouse-client

# 用本机clickhouse-client连接远程clickhouse-server服务器
clickhouse-client --host 192.168.3.54 --port 9000 --database default --user default --password manager

2.2 建表语句

ENGINE：是表的引擎类型，最常用的MergeTree。还有一个Log引擎也是比较常用。MergeTree要求有一个日期字段，还有主键。Log没有这个限制。

CREATE TABLE code_province(
    state_province        String,
    province_name         String,
    create_date           date
) ENGINE =
MergeTree(create_date, (state_province), 8192);

------------------------------------------------------------------------
CREATE TABLE code_province(
    state_province        String,
    province_name         String,
    create_date           date
) ENGINE = Log;

create_date：是表的日期字段，一个表必须要有一个日期字段。

State_province：是表的主键，主键可以有多个字段，每个字段用逗号分隔。

8192：是索引粒度，用默认值8192即可。

2.3 数据加载

-----------导入CSV文件---------------


clickhouse-client --query "INSERT INTO default.code_province FORMAT CSV" < code_province.csv

3、数据备份

1、停止服务 clickhouse stop

2、备份SMARTBIMPP的整个数据目录即可。数据目录可在/etc/clickhouse-server/config.xml文件中查看。

4、常见问题

关于日志：

SmartbiMPP默认日志存放路径：/var/log/clickhouse-server/ 目录包含:

clickhouse-server.log ： SmartbiMPP运行日志

clickhouse-server.err.log：SmartbiMPP错误日志

遇到SmartbiMPP启动失败的时候，一般都可以根据日志来分析问题。

如果修改了日志路径，请根据实际配置路径（/etc/clickhouse-server/config.xml）查找错误日志路径。

注意事项

如果需要获取SmartbiMPP日志，请使用root用户或者具有sudo权限的用户操作

① 系统不支持ipv6：

日志报错情况如下图所示。手工修改/etc/clickhouse-server/config.xml的监听地址，屏蔽：：和：：1的IPV6监听配置。

② 启动失败：

如下图所示，如果使用service clickhouse-server start启动SmartbiMPP，提示UNKNOWN。此时SmartbiMPP启动失败。但是查看/var/log/clickhouse-server/clickhouse-server.err.log又没有日志输出。

原因：/var/log/clickhouse-server/目录的日志文件不是clickhouse用户权限，所以启动SmartbiMPP时，没权限写日志导致启动失败。

处理方法：给日志目录赋予clickhouse的权限

# chown -R clickhouse:clickhouse /var/log/clickhouse-server
# clickhouse start

5、调试方法

有时可能会出现SmartbiMPP启动失败，但是又没有日志的情况，可以使用如下命令启动SmartbiMPP，在前端获取启动日志分析问题。

使用指定配置文件启动SmartbiMPP
# sudo -u clickhouse clickhouse-server --config-file=/etc/clickhouse-server/config.xml

6、smartbi连接高速缓存库

高速缓存库在Smartbi中的配置方法如下：

（1）启动服务器，在浏览器输入Smartbi地址，进行登录；

（2）输入用户名密码，登录平台；

（3）在“系统导航栏”选择 数据连接，在“资源目录区”的高速缓存库更多操作，选择打开，或双击 高速缓存库；

（4）进入“高速缓存库”界面

（5）据实际使用的数据库，修改相应连接属性，连接属性详情请参见各数据库的连接详情；

（6）点击保存，保存配置。

详细说明请参考：高速缓存库

7、smartbi-mpp调优配置参考

安装完成后，如果需要进行相关的参数调优，可以参考以下示例进行配置

7.1 config.xml调优参考

调优配置项说明

配置项	默认值	建议值	说明
level	trace	error	配置日志级别
size	1000M	100M	配置日志文件大小
listen_host	-	0.0.0.0	限制来源主机的请求 <listen_host>::</listen_host> ：支持IPv6和IPv4网络 <listen_host>0.0.0.0</listen_host> ：不支持IPV6网络 ifconfig 查看是否有inet6表示是否支持IPV6
keep_alive_timeout	3	300	关闭连接之前等待传入请求的秒数，默认为3秒注：推荐300，解决大数据了插入时 broken write问题
max_concurrent_queries	100	300	最大并发处理的请求数(包含 select,insert 等) Maximum number of concurrent queries
query_log	-	-	配置存储数据库名，表名，分区字段，刷新频率和采集频率等相关信息建议增加TTL 只保留30天之内数据，减少系统空间占用 1）query_log 2）trace_log 3）query_thread_log 4）query_views_log 5）part_log <ttl>event_date + INTERVAL 30 DAY DELETE</ttl>
trace_log	-	-
query_thread_log	-	-
query_views_log	-	-
part_log	-	-
openSSL	-	-	SSL客户端/服务器配置
timezone	-	Asia/Shanghai	时区配置
database_atomic_delay_before_drop_table_sec	480	0	解决删除副本表立马重建会报错的问题
inactive_parts_to_throw_insert	-	0	当表中非活跃部分(未参与合并的部分)数量超过此值时，新插入操作会抛出异常	MergeTree引擎配置 <merge_tree> <inactive_parts_to_throw_insert>0</inactive_parts_to_throw_insert> <inactive_parts_to_delay_insert>0</inactive_parts_to_delay_insert> <non_replicated_deduplication_window>0</non_replicated_deduplication_window> <execute_merges_on_single_replica_time_threshold>0</execute_merges_on_single_replica_time_threshold> <try_fetch_recompressed_part_timeout>7200</try_fetch_recompressed_part_timeout> <replicated_fetches_http_connection_timeout>1</replicated_fetches_http_connection_timeout> <replicated_fetches_http_send_timeout>0</replicated_fetches_http_send_timeout> <replicated_fetches_http_receive_timeout>0</replicated_fetches_http_receive_timeout> <max_partitions_to_read>-1</max_partitions_to_read> <allow_nullable_key>FALSE</allow_nullable_key> <min_bytes_to_rebalance_partition_over_jbod>0</min_bytes_to_rebalance_partition_over_jbod> <min_marks_to_honor_max_concurrent_queries>0</min_marks_to_honor_max_concurrent_queries> <detach_old_local_parts_when_cloning_replica>TRUE</detach_old_local_parts_when_cloning_replica> <always_fetch_merged_part>TRUE</always_fetch_merged_part> <max_suspicious_broken_parts>10000</max_suspicious_broken_parts> <parts_to_throw_insert>300</parts_to_throw_insert> <parts_to_delay_insert>150</parts_to_delay_insert> <max_delay_to_insert>256</max_delay_to_insert> <max_parts_in_total>100000</max_parts_in_total> <replicated_deduplication_window>100</replicated_deduplication_window> <replicated_deduplication_window_seconds>604800</replicated_deduplication_window_seconds> <max_replicated_logs_to_keep>1000</max_replicated_logs_to_keep> <min_replicated_logs_to_keep>10</min_replicated_logs_to_keep> <prefer_fetch_merged_part_time_threshold>3600</prefer_fetch_merged_part_time_threshold> <prefer_fetch_merged_part_size_threshold>10737418240</prefer_fetch_merged_part_size_threshold> <max_files_to_modify_in_alter_columns>75</max_files_to_modify_in_alter_columns> <max_files_to_remove_in_alter_columns>50</max_files_to_remove_in_alter_columns> <replicated_max_ratio_of_wrong_parts>0.5</replicated_max_ratio_of_wrong_parts> <replicated_max_parallel_fetches_for_host>15</replicated_max_parallel_fetches_for_host> <replicated_can_become_leader>TRUE</replicated_can_become_leader> <zookeeper_session_expiration_check_period>1</zookeeper_session_expiration_check_period> <old_parts_lifetime>480</old_parts_lifetime> <max_bytes_to_merge_at_max_space_in_pool>161061273600</max_bytes_to_merge_at_max_space_in_pool> <max_bytes_to_merge_at_min_space_in_pool>1048576</max_bytes_to_merge_at_min_space_in_pool> <merge_max_block_size>8192</merge_max_block_size> <max_part_loading_threads>16</max_part_loading_threads> </merge_tree>
inactive_parts_to_delay_insert	-	0	当表中非活跃部分数量超过此值时，新插入操作会被延迟
non_replicated_deduplication_window	-	0	非复制表的去重窗口大小
execute_merges_on_single_replica_time_threshold	-	0	在单个副本上执行合并的时间阈值(秒)，超过此值后会在其他副本上执行
try_fetch_recompressed_part_timeout	-	7200	尝试从其他副本获取重新压缩部分的超时时间(秒)
replicated_fetches_http_connection_timeout	-	1	副本间通过HTTP获取数据时的连接超时时间(秒)
replicated_fetches_http_send_timeout	-	0	副本间通过HTTP发送数据的超时时间(秒)
replicated_fetches_http_receive_timeout	-	0	副本间通过HTTP接收数据的超时时间(秒)
max_partitions_to_read	-	-1	单次查询可读取的最大分区数 (默认-1=无限制)
allow_nullable_key	-	FALSE	是否允许主键或排序键使用Nullable类型
min_bytes_to_rebalance_partition_over_jbod	-	0	在JBOD(多磁盘)配置中，重新平衡分区所需的最小字节数
min_marks_to_honor_max_concurrent_queries	-	0	考虑最大并发查询限制所需的最小标记数
detach_old_local_parts_when_cloning_replica	-	TRUE	克隆副本时是否分离旧的本地部分
always_fetch_merged_part	-	TRUE	是否总是从其他副本获取合并后的部分
max_suspicious_broken_parts	-	10000	允许的最大可疑损坏部分数
parts_to_throw_insert	-	300	当表中未合并部分数量超过此值时，新插入操作会抛出异常
parts_to_delay_insert	-	150	当表中未合并部分数量超过此值时，新插入操作会被延迟
max_delay_to_insert	-	256	插入操作的最大延迟秒数
max_parts_in_total	-	100000	表中允许的最大分区数量上限
replicated_deduplication_window	-	100	复制表去重窗口大
replicated_deduplication_window_seconds	-	604800	复制表去重时间窗口(秒）
max_replicated_logs_to_keep	-	1000	保留的最大复制日志数
min_replicated_logs_to_keep	-	10	保留的最小复制日志数
prefer_fetch_merged_part_time_threshold	-	3600	超过此时间阈值(秒)后，优先获取合并后的部分
prefer_fetch_merged_part_size_threshold	-	10737418240	超过此大小阈值后，优先获取合并后的部分
max_files_to_modify_in_alter_columns	-	75	ALTER COLUMN操作中允许修改的最大文件数
max_files_to_remove_in_alter_columns	-	50	ALTER COLUMN操作中允许删除的最大文件数
replicated_max_ratio_of_wrong_parts	-	0.5	允许错误部分的最大比例
replicated_max_parallel_fetches_for_host	-	15	从单个主机并行获取的最大数量
replicated_can_become_leader	-	TRUE	该副本是否可以成为leader
zookeeper_session_expiration_check_period	-	1	ZooKeeper会话过期检查周期(秒)
old_parts_lifetime	-	480	旧部分保留时间（秒）
max_bytes_to_merge_at_max_space_in_pool	-	161061273600	合并任务最大字节数(默认150G)
max_bytes_to_merge_at_min_space_in_pool	-	1048576	合并任务最小字节数(默认1M)
merge_max_block_size	-	8192	合并操作时的最大块大小
max_part_loading_threads	-	16	加载部分的最大线程数

配置示例config.xml：调优的都配置示例在和<!-- smartbi end →之间

注意

调优参数的配置可以根据实际环境进行调整

<?xml version="1.0"?>
<clickhouse>
    <logger>
<!-- smartbi begin 日志级别从trace调整成error减少无用日志输出-->
        <level>error</level>
<!-- smartbi end  -->
        <log>/var/log/clickhouse-server/clickhouse-server.log</log>
        <errorlog>/var/log/clickhouse-server/clickhouse-server.err.log</errorlog>
<!-- smartbi begin 配置日志文件大小从1000M 修改成100M 防止日志过大无法下载和打开-->  
        <size>100M</size>
<!-- smartbi end  -->
        <count>10</count>
    </logger>
        <header>
            <name>Access-Control-Allow-Origin</name>
            <value>*</value>
        </header>
        <header>
            <name>Access-Control-Allow-Headers</name>
            <value>origin, x-requested-with</value>
        </header>
        <header>
            <name>Access-Control-Allow-Methods</name>
            <value>POST, GET, OPTIONS</value>
        </header>
        <header>
            <name>Access-Control-Max-Age</name>
            <value>86400</value>
        </header>
    <http_port>8123</http_port>
    <tcp_port>9000</tcp_port>
    <mysql_port>9004</mysql_port>
    <postgresql_port>9005</postgresql_port>
    <interserver_http_port>9009</interserver_http_port>
    <listen_host>0.0.0.0</listen_host>
    <max_connections>4096</max_connections>

<!-- smartbi begin 关闭连接之前等待传入请求的秒数， 默认为3秒，调整为300，解决大数据了插入时 broken write问题-->
    <keep_alive_timeout>3</keep_alive_timeout>
<!-- smartbi end  -->
    <grpc>
        <enable_ssl>false</enable_ssl>
        <ssl_cert_file>/path/to/ssl_cert_file</ssl_cert_file>
        <ssl_key_file>/path/to/ssl_key_file</ssl_key_file>
        <ssl_require_client_auth>false</ssl_require_client_auth>
        <ssl_ca_cert_file>/path/to/ssl_ca_cert_file</ssl_ca_cert_file>
        <transport_compression_type>none</transport_compression_type>
        <transport_compression_level>0</transport_compression_level>
        <max_send_message_size>-1</max_send_message_size>
        <max_receive_message_size>-1</max_receive_message_size>
        <verbose_logs>false</verbose_logs>
    </grpc>
    <openSSL>
        <server> 
            <certificateFile>/etc/clickhouse-server/server.crt</certificateFile>
            <privateKeyFile>/etc/clickhouse-server/server.key</privateKeyFile>
            <verificationMode>none</verificationMode>
            <loadDefaultCAFile>true</loadDefaultCAFile>
            <cacheSessions>true</cacheSessions>
            <disableProtocols>sslv2,sslv3</disableProtocols>
            <preferServerCiphers>true</preferServerCiphers>
        </server>
        <client>
            <loadDefaultCAFile>true</loadDefaultCAFile>
            <cacheSessions>true</cacheSessions>
            <disableProtocols>sslv2,sslv3</disableProtocols>
            <preferServerCiphers>true</preferServerCiphers>
            <invalidCertificateHandler>
                <name>RejectCertificateHandler</name>
            </invalidCertificateHandler>
        </client>
    </openSSL>

<!-- smartbi begin 最大并发处理的请求数(包含 select,insert 等)，默认100 调整到300 可以根据机器性能和并发情况再调大-->
    <max_concurrent_queries>300</max_concurrent_queries>
<!-- smartbi end  -->

    <max_server_memory_usage>0</max_server_memory_usage>
    <max_thread_pool_size>10000</max_thread_pool_size>
    <max_server_memory_usage_to_ram_ratio>0.9</max_server_memory_usage_to_ram_ratio>
    <total_memory_profiler_step>4194304</total_memory_profiler_step>
    <total_memory_tracker_sample_probability>0</total_memory_tracker_sample_probability>
    <uncompressed_cache_size>8589934592</uncompressed_cache_size>
    <mark_cache_size>5368709120</mark_cache_size>
    <mmap_cache_size>1000</mmap_cache_size>
    <compiled_expression_cache_size>134217728</compiled_expression_cache_size>
    <compiled_expression_cache_elements_size>10000</compiled_expression_cache_elements_size>
    <path>/var/lib/clickhouse/</path>
    <tmp_path>/var/lib/clickhouse/tmp/</tmp_path>
    <user_files_path>/var/lib/clickhouse/user_files/</user_files_path>
    <ldap_servers>
    </ldap_servers>
    <user_directories>
        <users_xml>
            <path>users.xml</path>
        </users_xml>
        <local_directory>
            <path>/var/lib/clickhouse/access/</path>
        </local_directory>
    </user_directories>
    <default_profile>default</default_profile>
    <custom_settings_prefixes></custom_settings_prefixes>
    <default_database>default</default_database>
<!-- smartbi begin 配置时区-->
     <timezone>Asia/Shanghai</timezone>
<!-- smartbi end  -->  
<!-- smartbi begin 解决删除副本表立马重建会报错的问题 ,默认是480s，调整为0-->
    <database_atomic_delay_before_drop_table_sec>0</database_atomic_delay_before_drop_table_sec>
<!-- smartbi end  -->
    <mlock_executable>true</mlock_executable>
    <remap_executable>false</remap_executable>
    <![CDATA[
         Uncomment below in order to use JDBC table engine and function.

         To install and run JDBC bridge in background:
         * [Debian/Ubuntu]
           export MVN_URL=https://repo1.maven.org/maven2/ru/yandex/clickhouse/clickhouse-jdbc-bridge
           export PKG_VER=$(curl -sL $MVN_URL/maven-metadata.xml | grep '<release>' | sed -e 's|.*>\(.*\)<.*|\1|')
           wget https://github.com/ClickHouse/clickhouse-jdbc-bridge/releases/download/v$PKG_VER/clickhouse-jdbc-bridge_$PKG_VER-1_all.deb
           apt install --no-install-recommends -f ./clickhouse-jdbc-bridge_$PKG_VER-1_all.deb
           clickhouse-jdbc-bridge &

         * [CentOS/RHEL]
           export MVN_URL=https://repo1.maven.org/maven2/ru/yandex/clickhouse/clickhouse-jdbc-bridge
           export PKG_VER=$(curl -sL $MVN_URL/maven-metadata.xml | grep '<release>' | sed -e 's|.*>\(.*\)<.*|\1|')
           wget https://github.com/ClickHouse/clickhouse-jdbc-bridge/releases/download/v$PKG_VER/clickhouse-jdbc-bridge-$PKG_VER-1.noarch.rpm
           yum localinstall -y clickhouse-jdbc-bridge-$PKG_VER-1.noarch.rpm
           clickhouse-jdbc-bridge &

         Please refer to https://github.com/ClickHouse/clickhouse-jdbc-bridge#usage for more information.
    ]]>
    <remote_servers>
    </remote_servers>

    <builtin_dictionaries_reload_interval>3600</builtin_dictionaries_reload_interval>
    <max_session_timeout>3600</max_session_timeout>
    <default_session_timeout>60</default_session_timeout>

    <query_log>
        <database>system</database>
        <table>query_log</table>
        <partition_by>toYYYYMM(event_date)</partition_by>
        <flush_interval_milliseconds>7500</flush_interval_milliseconds>
<!-- smartbi begin 建议增加TTL 只保留30天之内数据，减少系统空间占用-->
<ttl>event_date + INTERVAL 30 DAY DELETE</ttl>
<!-- smartbi end  -->
    </query_log>

    <trace_log>
        <database>system</database>
        <table>trace_log</table>
        <partition_by>toYYYYMM(event_date)</partition_by>
        <flush_interval_milliseconds>7500</flush_interval_milliseconds>
<!-- smartbi begin 建议增加TTL 只保留30天之内数据，减少系统空间占用-->
<ttl>event_date + INTERVAL 30 DAY DELETE</ttl>
<!-- smartbi end  -->
    </trace_log>

    <query_thread_log>
        <database>system</database>
        <table>query_thread_log</table>
        <partition_by>toYYYYMM(event_date)</partition_by>
        <flush_interval_milliseconds>7500</flush_interval_milliseconds>
<!-- smartbi begin 建议增加TTL 只保留30天之内数据，减少系统空间占用-->
<ttl>event_date + INTERVAL 30 DAY DELETE</ttl>
<!-- smartbi end  -->
    </query_thread_log>

    <query_views_log>
        <database>system</database>
        <table>query_views_log</table>
        <partition_by>toYYYYMM(event_date)</partition_by>
        <flush_interval_milliseconds>7500</flush_interval_milliseconds>
<!-- smartbi begin 建议增加TTL 只保留30天之内数据，减少系统空间占用-->
<ttl>event_date + INTERVAL 30 DAY DELETE</ttl>
<!-- smartbi end  -->
    </query_views_log>

    <part_log>
        <database>system</database>
        <table>part_log</table>
        <partition_by>toYYYYMM(event_date)</partition_by>
        <flush_interval_milliseconds>7500</flush_interval_milliseconds>
<!-- smartbi begin 建议增加TTL 只保留30天之内数据，减少系统空间占用-->
<ttl>event_date + INTERVAL 30 DAY DELETE</ttl>
<!-- smartbi end  -->
    </part_log>

    <metric_log>
        <database>system</database>
        <table>metric_log</table>
        <flush_interval_milliseconds>7500</flush_interval_milliseconds>
        <collect_interval_milliseconds>1000</collect_interval_milliseconds>
    </metric_log>
    <asynchronous_metric_log>
        <database>system</database>
        <table>asynchronous_metric_log</table>
        <flush_interval_milliseconds>7000</flush_interval_milliseconds>
    </asynchronous_metric_log>

    <opentelemetry_span_log>
        <engine>
            engine MergeTree
            partition by toYYYYMM(finish_date)
            order by (finish_date, finish_time_us, trace_id)
        </engine>
        <database>system</database>
        <table>opentelemetry_span_log</table>
        <flush_interval_milliseconds>7500</flush_interval_milliseconds>
    </opentelemetry_span_log>

    <crash_log>
        <database>system</database>
        <table>crash_log</table>

        <partition_by />
        <flush_interval_milliseconds>1000</flush_interval_milliseconds>
    </crash_log>

    <session_log>
        <database>system</database>
        <table>session_log</table>

        <partition_by>toYYYYMM(event_date)</partition_by>
        <flush_interval_milliseconds>7500</flush_interval_milliseconds>
    </session_log>

    <top_level_domains_lists>
    </top_level_domains_lists>
    <dictionaries_config>*_dictionary.xml</dictionaries_config>
    <user_defined_executable_functions_config>*_function.xml</user_defined_executable_functions_config>

    <encryption_codecs>
    </encryption_codecs>
    <distributed_ddl>
        <path>/clickhouse/task_queue/ddl</path>
    </distributed_ddl>

<!-- smartbi begin merge_tree调优配置-->
<merge_tree>
    <inactive_parts_to_throw_insert>0</inactive_parts_to_throw_insert>
    <inactive_parts_to_delay_insert>0</inactive_parts_to_delay_insert>
    <non_replicated_deduplication_window>0</non_replicated_deduplication_window>
    <execute_merges_on_single_replica_time_threshold>0</execute_merges_on_single_replica_time_threshold>
    <try_fetch_recompressed_part_timeout>7200</try_fetch_recompressed_part_timeout>
    <replicated_fetches_http_connection_timeout>1</replicated_fetches_http_connection_timeout>
    <replicated_fetches_http_send_timeout>0</replicated_fetches_http_send_timeout>
    <replicated_fetches_http_receive_timeout>0</replicated_fetches_http_receive_timeout>
    <max_partitions_to_read>-1</max_partitions_to_read>
    <allow_nullable_key>FALSE</allow_nullable_key>
    <min_bytes_to_rebalance_partition_over_jbod>0</min_bytes_to_rebalance_partition_over_jbod>
    <min_marks_to_honor_max_concurrent_queries>0</min_marks_to_honor_max_concurrent_queries>
    <detach_old_local_parts_when_cloning_replica>TRUE</detach_old_local_parts_when_cloning_replica>
    <always_fetch_merged_part>TRUE</always_fetch_merged_part>
    <max_suspicious_broken_parts>10000</max_suspicious_broken_parts>
    <parts_to_throw_insert>300</parts_to_throw_insert>
    <parts_to_delay_insert>150</parts_to_delay_insert>
    <max_delay_to_insert>256</max_delay_to_insert>
    <max_parts_in_total>100000</max_parts_in_total>
    <replicated_deduplication_window>100</replicated_deduplication_window>
    <replicated_deduplication_window_seconds>604800</replicated_deduplication_window_seconds>
    <max_replicated_logs_to_keep>1000</max_replicated_logs_to_keep>
    <min_replicated_logs_to_keep>10</min_replicated_logs_to_keep>
    <prefer_fetch_merged_part_time_threshold>3600</prefer_fetch_merged_part_time_threshold>
    <prefer_fetch_merged_part_size_threshold>10737418240</prefer_fetch_merged_part_size_threshold>
    <max_files_to_modify_in_alter_columns>75</max_files_to_modify_in_alter_columns>
    <max_files_to_remove_in_alter_columns>50</max_files_to_remove_in_alter_columns>
    <replicated_max_ratio_of_wrong_parts>0.5</replicated_max_ratio_of_wrong_parts>
    <replicated_max_parallel_fetches_for_host>15</replicated_max_parallel_fetches_for_host>
    <replicated_can_become_leader>TRUE</replicated_can_become_leader>
    <zookeeper_session_expiration_check_period>1</zookeeper_session_expiration_check_period>
    <old_parts_lifetime>480</old_parts_lifetime>
    <max_bytes_to_merge_at_max_space_in_pool>161061273600</max_bytes_to_merge_at_max_space_in_pool>
    <max_bytes_to_merge_at_min_space_in_pool>1048576</max_bytes_to_merge_at_min_space_in_pool>
    <merge_max_block_size>8192</merge_max_block_size>
    <max_part_loading_threads>16</max_part_loading_threads>
</merge_tree>
<!-- smartbi end  -->

    <graphite_rollup_example>
        <pattern>
            <regexp>click_cost</regexp>
            <function>any</function>
            <retention>
                <age>0</age>
                <precision>3600</precision>
            </retention>
            <retention>
                <age>86400</age>
                <precision>60</precision>
            </retention>
        </pattern>
        <default>
            <function>max</function>
            <retention>
                <age>0</age>
                <precision>60</precision>
            </retention>
            <retention>
                <age>3600</age>
                <precision>300</precision>
            </retention>
            <retention>
                <age>86400</age>
                <precision>3600</precision>
            </retention>
        </default>
    </graphite_rollup_example>

    <format_schema_path>/var/lib/clickhouse/format_schemas/</format_schema_path>
    <query_masking_rules>
        <rule>
            <name>hide encrypt/decrypt arguments</name>
            <regexp>((?:aes_)?(?:encrypt|decrypt)(?:_mysql)?)\s*\(\s*(?:'(?:\\'|.)+'|.*?)\s*\)</regexp>
            <replace>\1(???)</replace>
        </rule>
    </query_masking_rules>

    <send_crash_reports>
        <enabled>false</enabled>
        <anonymize>false</anonymize>
        <endpoint>https://6f33034cfe684dd7a3ab9875e57b1c8d@o388870.ingest.sentry.io/5226277</endpoint>
    </send_crash_reports>
</clickhouse>

7.2 user.xml调优参考

调优配置项说明

配置项	默认值	建议值	说明
background_pool_size	16	cpu 线程数的 2 倍	后台线程池的大小允许的前提下建议改成 cpu 线程数的 2 倍
background_schedule_pool_size	128	cpu 线程数的 2 倍	执行后台任务（复制表、Kafka 流、DNS 缓存更新）的线程数允许的前提下建议改成 cpu 线程数的 2 倍
background_distributed_schedule_pool_size	16	cpu 线程数的 2 倍	设置为分布式发送执行后台任务的线程数允许的前提下建议改成 cpu 线程数的 2 倍
background_fetches_pool_size	8	cpu 线程数的 2 倍	将用于从后台 *MergeTree 引擎表的另一个副本获取数据部分的最大线程数
background_common_pool_size	8	cpu 线程数的 2 倍	将用于在后台对 *MergeTree 引擎表执行各种操作（主要是垃圾回收）的最大线程数
max_threads	当前服务器CPU核数	-	设置单个查询所能使用的最大 cpu 个数
max_memory_usage	0	物理内存的*80%	单次 Query 占用内存最大值，默认为0,表示不限制该值可以设置的比较大，这样可以提升集群查询的上限，一般留操作系统20%。
max_bytes_before_external_group_by	0	max_memory_usage*50%	分组最大内存，当 group 使用内存超过阈值后会刷新到磁盘进行，，默认为0,表示不限制按照 max_memory_usage 的一半设置内存
max_bytes_before_external_sort	0	max_memory_usage*50%	排序最大内存，默认是0表示不限制，但是如果内存不够时直接抛错，设置了该值 order by 可以正常完成，但是速度慢按照 max_memory_usage 的一半设置内存
max_memory_usage_for_user	0	0	单个用户在运行查询时，限制使用的最大内存用量，默认值为0，即不做限制
max_partitions_per_insert_block	100	1000	在单次INSERT写入的时候，限制创建的最大分区个数，默认值为100个如果超出这个阈值数目，将会得到异常；
max_rows_to_group_by	0	0	在执行GROUP BY聚合查询的时候，限制去重后的聚合KEY的最大个数，默认值为0，即不做限制当超过阈值数量的时候，其处理方式由group_by_overflow_mode参数决定；
group_by_overflow_mode	throw	throw	当max_rows_to_group_by熔断规则触发的时候，有三种处理形式: throw抛出异常，此乃默认值； break立即停止查询，并返回当前部分的数据； any仅以当前已存在的聚合KEY，继续完成聚合查询；
max_query_size	262144	26214400	sql语句最大长度，默认是256K 建议设置为25M
max_ast_elements	50000	10000000	查询语法树中的最大元素数，默认值下为50000 建议设置为10000000
max_expanded_ast_elements	50000	10000000	查询语法树中的最大元素数。默认值下为50000 建议设置为10000000
http_receive_timeout	180	3600	接收超时时间，适量增大方式插入数据时出现 Broken pipe
http_send_timeout	180	3600	发送超时时间，适量增大方式插入数据时出现 Broken pipe

配置示例user.xml：调优的都配置示例在和<!-- smartbi end →之间

注意

调优参数的配置可以根据实际环境进行调整

<?xml version="1.0"?>
<clickhouse>
    <profiles>
        <default>
        <!-- smartbi begin 数据接收和发送时间，默认180s,适量增大方式插入数据时出现 Broken pipe-->
        <http_send_timeout>3600</http_send_timeout>
        <http_receive_timeout>3600</http_receive_timeout>
        <max_query_size>26214400</max_query_size>
        <max_ast_elements>10000000</max_ast_elements>
        <max_expanded_ast_elements>10000000</max_expanded_ast_elements>        <!-- 其他user.xml调优参数添加在此处即可 -->
        <!-- smartbi end  →
            <load_balancing>random</load_balancing>
        </default>
        <readonly>
            <readonly>1</readonly>
        </readonly>
    </profiles>
    <users><!--用户配置-->
        <default> <!--用户名-->
            <password></password> <!--置登录密码 免密码登入，支持明文、SHA256加密和double_sha1加密三种形式-->
            <networks>
                <ip>::/0</ip>
            </networks>
            <profile>default</profile>
            <quota>default</quota>
        </default>
    </users>
    <quotas><!--限制资源被过度使用，当使用的资源数量达到阈值时，那么正在进行的操作会被自动中断-->
        <default><!--自定义名称，全局唯一-->
            <interval><!--配置时间间隔，每个时间内的资源消耗限制-->
                <duration>3600</duration><!--表示累积的时间周期，单位是秒-->
                <queries>0</queries><!--表示在周期内允许执行的查询次数，0表示不限制-->
                <errors>0</errors><!--表示在周期内允许发生异常的次数，0表示不限制-->
                <result_rows>0</result_rows> <!--表示在周期内允许查询返回的结果行数，0表示不限制-->
                <read_rows>0</read_rows><!--表示在周期内在分布式查询中，允许远端节点读取的数据行数，0表示不限制。-->
                <execution_time>0</execution_time><!--表示周期内允许执行的查询时间，单位是秒，0表示不限制。-->
            </interval>
        </default>
    </quotas>
</clickhouse>

关注我们

服务支持

页面树结构

安装SmartbiMPP 单节点（rpm/deb包）

一、安装指南

1、系统环境要求

2、安装介质装备

3、设置系统环境

3.1 取消打开文件数限制

3.2 取消SELINUX（UBUNTU 系统不需要配置此步骤）

3.3 关闭防火墙

3.4开启防火墙

4、安装SmartbiMPP

4.1 安装SmartbiMPP

5、配置修改说明

5.1 日志文件存放路径

5.2 修改默认端口

5.3 修改数据目录

5.4 开启远程访问权限

5.5 修改SMARTBIMPP内存设置

5.6 其他配置修改：

5.7 修改用户密码

5.8、设置时区

6 启动SMARTBIMPP

二、使用指南

1、运维操作

2、上手教程

2.1 命令行登录

2.2 建表语句

2.3 数据加载

3、数据备份

4、常见问题

5、调试方法

6、smartbi连接高速缓存库

7、smartbi-mpp调优配置参考

7.1 config.xml调优参考