Get the execution status of the Elasticsearch SLM policy
GET /_slm/policy/my-backup-slm?human
Error message:
failed to create snapshot successfully, ****** out of ****** total shards failed","stack_trace":"SnapshotException[[my-backup-repository:my-backup-snapshots-******] failed to create snapshot successfully, ****** out of ****** total shards failed]\n\tat
org.elasticsearch.xpack.slm.SnapshotLifecycleTask$1.onResponse(SnapshotLifecycleTask.java:113)\n\tat org.elasticsearch.xpack.slm.SnapshotLifecycleTask$1.onResponse(SnapshotLifecycleTask.java:95)\n\tat
..........
IndexShardSnapshotFailedException[IllegalArgumentException[Unknown client name [secondary]. Existing client configs: default]]\n\t\tat
org.elasticsearch.snapshots.SnapshotShardFailure.<init>(SnapshotShardFailure.java:66)\n\t\tat
org.elasticsearch.snapshots.SnapshotShardFailure.<init>(SnapshotShardFailure.java:54)\n\t\tat
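The error message shows that some shards cannot resolve the GCS client named secondary, so the first step is to find the affected shard.
Get the snapshot stage
# First, trigger a snapshot manually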
POST _slm/policy/my-backup-slm/_execute
# Get the status of the running snapshot
GET _snapshot/_status
"my_index" : {
"shards_stats" : {
"initializing" : 0,
"started" : 0,
"finalizing" : 0,
"done" : 5,
"failed" : 1,
"total" : 6
},
"stats" : {
"incremental" : {
"file_count" : 0,
"size_in_bytes" : 0
},
"total" : {
"file_count" : 5,
"size_in_bytes" : 1040
},
"start_time_in_millis" : 1658916212121,
"time_in_millis" : 3523
},
"shards" : {
"0" : {
"stage" : "FAILURE",
"stats" : {
"incremental" : {
"file_count" : 0,
"size_in_bytes" : 0
},
"total" : {
"file_count" : 0,
"size_in_bytes" : 0
},
"start_time_in_millis" : 0,
"time_in_millis" : 1658916211139
},
"reason" : "IllegalArgumentException[Unknown client name [secondary]. Existing client configs: default]"
},
"1" : {
"stage" : "DONE",
......
},
"2" : {
"stage" : "DONE",
......
},
"3" : {
......
},
"4" : {
"stage" : "DONE",
......
},
"5" : {
"stage" : "DONE",
......
}
}
}
The output above shows that index "my_index" has 6 shards in total, and that shard 0 failed to snapshot. The reason given is: Unknown client name [secondary].
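For an index with many shards, scanning the status output by eye gets tedious. The following is a minimal sketch, assuming the response was requested with ?pretty (or copied from the Kibana console) so that each key sits on its own line as in the output above. failed_shards is a hypothetical helper name, not part of Elasticsearch.

```shell
# Hypothetical helper: print the shard numbers whose stage is FAILURE.
# Assumption: stdin is the pretty-printed _snapshot/_status JSON, one key per line.
failed_shards() {
  awk '
    # A line like   "0" : {   opens a shard entry; remember its number.
    /^[ \t]*"[0-9]+"[ \t]*:[ \t]*[{]/ { split($0, parts, "\""); shard = parts[2] }
    # A FAILURE stage belongs to the most recently opened shard entry.
    /"stage"[ \t]*:[ \t]*"FAILURE"/ { print shard }
  '
}

# Usage (address masked as in the rest of this post):
# curl -s "http://192.168.X.X:9200/_snapshot/_status?pretty" | failed_shards
```

The helper prints shard numbers only. If several indices are being snapshotted at once, the same number may appear more than once, so check the surrounding index name in the raw output.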
Check the IPs of the nodes that hold the shard
curl -XGET "http://192.168.X.X:9200/_cat/shards/my_index/?pretty"
# Some values are masked with *
.......
my_index 0 p STARTED 0 208b 192.168.X.X node_data_**
my_index 0 r STARTED 0 208b 192.168.X.X node_data_**
my_index 0 r STARTED 0 208b 192.168.X.X node_data_**
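From this output, get the IP address of the node that holds the primary copy of shard 0. Snapshots are taken from primary shards, so this is the node whose keystore is missing the secondary client.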
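Picking the primary out of the _cat/shards listing can also be scripted. This is a rough sketch, assuming the default column order shown above (index, shard, prirep, state, docs, store, ip, node) and a started shard. primary_of_shard is a hypothetical helper name.

```shell
# Hypothetical helper: print "ip node" for the primary copy of a given shard.
# $1 = shard number; stdin = output of _cat/shards for a single index.
# Assumption: ip and node are the last two columns, as in the listing above.
primary_of_shard() {
  awk -v shard="$1" '$2 == shard && $3 == "p" { print $(NF-1), $NF }'
}

# Usage (address masked as in the rest of this post):
# curl -s "http://192.168.X.X:9200/_cat/shards/my_index" | primary_of_shard 0
```

Relocating or unassigned shards print a different set of columns, so this sketch does not cover them.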
Regenerate the keystore
cd /usr/local/elasticsearch
# Re-import your JSON key file into the Elasticsearch keystore
./bin/elasticsearch-keystore add-file gcs.client.secondary.credentials_file /your-path/your-key.json
# Change the keystore owner to the user that runs the Elasticsearch process
chown elasticsearch:elasticsearch /usr/local/elasticsearch/config/elasticsearch.keystore
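Before moving on, it may be worth confirming that the entry really landed in the keystore on this node. The sketch below checks the output of elasticsearch-keystore list for the expected setting name. has_gcs_client is a hypothetical helper name, and the reload step in the comments assumes the node is left running rather than restarted.

```shell
# Hypothetical helper: succeed if the keystore listing on stdin contains
# the credentials entry for the named GCS client.
# $1 = client name, e.g. secondary
has_gcs_client() {
  grep -qx "gcs\.client\.$1\.credentials_file"
}

# Usage, run from /usr/local/elasticsearch on the affected node:
# ./bin/elasticsearch-keystore list | has_gcs_client secondary && echo "secondary client present"
#
# The GCS credentials_file setting is reloadable, so a running node can pick it up
# through the reload secure settings API instead of a restart:
# curl -XPOST "http://192.168.X.X:9200/_nodes/reload_secure_settings"
```

The keystore is per node, so the same check applies to every node that may hold a primary shard of the index.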
Check the repository
Verify that the snapshot repository is usable
POST /_snapshot/my-backup-repository/_verify