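Elasticsearch clusters

Elasticsearch is highly available and scalable: adding nodes increases both capacity and reliability. A node is a single Elasticsearch instance, and a cluster is one or more nodes that share the same cluster.name. When a node joins or leaves, the cluster detects the change and rebalances its data. One node is elected master to manage cluster-level changes; the master does not need to take part in document-level changes or searches.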

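Nodes

**Master node (master)**: maintains cluster-level information. Unless it also acts as a data node, it takes no part in search or indexing and only handles cluster-level changes such as nodes joining or leaving and indices being created or deleted. A cluster has exactly one elected master at any time, although several nodes may be master-eligible.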

node.master: true
node.data: false
node.ingest: false

**Data node (data)**: stores index data and serves the indexing and search operations on it, so it needs plenty of CPU, memory, and I/O.

node.master: false
node.data: true
node.ingest: false

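**Client node (client)**: never becomes master and holds no data. It acts as a coordinating node that load-balances searches: when a search request arrives, it routes the request to the relevant data nodes and then merges their results. Clusters with fewer than about 100 nodes generally do not need dedicated client nodes.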

node.master: false
node.data: false
node.ingest: false
search.remote.connect: false
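
The three configurations above differ only in which role flags are switched on. As a minimal sketch of how the flags map to roles, the hypothetical helper `node_roles` below reads a settings dictionary and reports the resulting roles. It assumes the 6.x behaviour that `node.master`, `node.data`, and `node.ingest` all default to true; the helper itself is illustrative and not part of Elasticsearch.

```python
def node_roles(settings):
    """Return the role labels implied by 6.x-style node.* settings (illustrative helper)."""
    # Assumption: each role flag defaults to true when it is not set explicitly.
    roles = [name for name in ("master", "data", "ingest")
             if settings.get("node." + name, True)]
    # A node with every role switched off still routes requests: coordinating only.
    return roles or ["coordinating_only"]


dedicated_master = {"node.master": True, "node.data": False, "node.ingest": False}
data_only = {"node.master": False, "node.data": True, "node.ingest": False}
client = {"node.master": False, "node.data": False, "node.ingest": False}

for name, cfg in [("master", dedicated_master), ("data", data_only), ("client", client)]:
    print(name, node_roles(cfg))
```

A node started with no role settings at all would report all three roles, which is why dedicated nodes have to switch the unwanted roles off explicitly.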

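Shards

Primary shards: each primary shard is a fully independent Lucene index, and one node can hold several primary shards. When an index is created, its data is split across several shards, and each shard is handled by the node that hosts it. Sharding exists for scaling, both horizontal and vertical. The default is 5 primary shards in the 6.x version used here, and the number cannot be changed after the index is created.

Replica shards: each primary shard can have 0 to n replica shards, and this number can be adjusted later. Replicas provide data redundancy and also serve searches.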
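
The reason the primary shard count is fixed comes from how a document is routed: the target shard is the hash of the routing value (by default the document id) modulo the number of primary shards. The sketch below illustrates that formula only. `zlib.crc32` is a stand-in hash chosen for this example, not the hash Elasticsearch uses internally, and the document ids are made up.

```python
import zlib


def route_to_shard(routing_key, number_of_primary_shards):
    """shard = hash(routing) % number_of_primary_shards, with a stand-in hash."""
    # Assumption: crc32 replaces Elasticsearch's internal hash for illustration.
    h = zlib.crc32(routing_key.encode("utf-8"))
    return h % number_of_primary_shards


doc_ids = ["user-1", "user-2", "user-3", "user-4"]  # hypothetical ids
with_5 = {d: route_to_shard(d, 5) for d in doc_ids}
with_6 = {d: route_to_shard(d, 6) for d in doc_ids}

# Documents whose shard would change if the primary count went from 5 to 6.
moved = [d for d in doc_ids if with_5[d] != with_6[d]]
print(with_5, with_6, moved)
```

Any document that lands in `moved` could no longer be found at its old location, which is why the count is frozen at index creation while the replica count stays adjustable.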

Cluster monitoring

Elasticsearch provides a set of cluster monitoring APIs whose paths start with _cluster.

Viewing cluster state

GET _cluster/state

{
    "cluster_name": "elasticsearch",
    "cluster_uuid": "QIXlJNJGRw2cUicL8u2lpQ",
    "version": 66,
    "state_uuid": "HMMIHgl2SGOe6AOC5oahVA",
    "master_node": "ZUkF5AgmR9WMMDHxPj4oMg",
    "blocks": {},
    "nodes": {},
    "metadata": { // index structure information
        "cluster_uuid": "QIXlJNJGRw2cUicL8u2lpQ",
        "templates": {},
        "indices": {},
        "index-graveyard": {
            "tombstones": []
        },
        "ingest": {
            "pipeline": []
        }
    },
    "routing_table": { // shard information for each index
        "indices": {}
    },
    "routing_nodes": {}, // per-node routing information
    "snapshots": {
        "snapshots": []
    },
    "snapshot_deletions": {
        "snapshot_deletions": []
    },
    "restore": {
        "snapshots": []
    }
}

You can request just one part of the cluster state, for example only the metadata:

GET _cluster/state/metadata
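
The filtered form of the request can be built programmatically. The sketch below is a minimal example using only the Python standard library. The helper names `cluster_state_path` and `get_json` are made up for this example, and the base URL `http://localhost:9200` in the commented usage line is an assumption about where the cluster listens.

```python
import json
import urllib.request


def cluster_state_path(metrics=None, indices=None):
    """Build the path for GET _cluster/state with optional metric and index filters."""
    path = "/_cluster/state"
    if metrics or indices:
        # An index filter needs a metric segment in front of it; _all selects every metric.
        path += "/" + (",".join(metrics) if metrics else "_all")
    if indices:
        path += "/" + ",".join(indices)
    return path


def get_json(base_url, path):
    """Send a GET request and decode the JSON response body."""
    with urllib.request.urlopen(base_url + path) as resp:
        return json.load(resp)


print(cluster_state_path(["metadata"]))
# Against a running cluster (assumed address):
# state = get_json("http://localhost:9200", cluster_state_path(["metadata"]))
```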

Checking cluster health

GET _cluster/health

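{
    "cluster_name": "elasticsearch",
    "status": "green", // cluster health: green means all primary and replica shards are available; yellow means all primaries are available but not all replicas; red means not all primaries are available
    "timed_out": false,
    "number_of_nodes": 1,
    "number_of_data_nodes": 1,
    "active_primary_shards": 15, // number of active primary shards
    "active_shards": 15, // total active shards, primaries and replicas
    "relocating_shards": 0, // shards being relocated
    "initializing_shards": 0, // shards being initialized
    "unassigned_shards": 0, // shards that could not be assigned, for example because there are too few nodes
    "delayed_unassigned_shards": 0, // shards whose allocation has been delayed
    "number_of_pending_tasks": 0, // cluster-level tasks waiting to run
    "number_of_in_flight_fetch": 0, // shard-info fetches currently in flight
    "task_max_waiting_in_queue_millis": 0, // longest time a task has waited in the queue
    "active_shards_percent_as_number": 100.0 // percentage of shards that are active
}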
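
The three status colours follow directly from the shard counts. As a small sketch of the rule described in the status comment, the hypothetical function below derives the colour from how many primaries and replicas exist and how many are active. The function name and parameters are made up for illustration.

```python
def health_status(primaries_total, primaries_active, replicas_total, replicas_active):
    """Derive green/yellow/red from shard counts (illustrative helper)."""
    if primaries_active < primaries_total:
        return "red"      # at least one primary shard is unavailable
    if replicas_active < replicas_total:
        return "yellow"   # all primaries are fine, some replicas are not
    return "green"        # every primary and replica is available


# Single node, 15 primaries, no replicas configured: everything is active.
print(health_status(15, 15, 0, 0))
# Single node, 15 primaries with one replica each: the replicas have nowhere to go.
print(health_status(15, 15, 15, 0))
```

The second call models a common single-node situation: replicas are never placed on the same node as their primary, so they stay unassigned and the cluster reports yellow.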

Health can also be checked for a single index:

GET _cluster/health/<index_name>

The _cat API shows the same information in tabular form:

GET _cat/health?v

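Cluster statistics

The response has two main parts. The first covers indices: shard counts, store size, memory usage, and so on. The second covers nodes: node counts, roles, operating system information, JVM version, memory usage, CPU, and installed plugins.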

GET _cluster/stats


{
    "_nodes": {
        "total": 1,
        "successful": 1,
        "failed": 0
    },
    "cluster_name": "elasticsearch",
    "cluster_uuid": "QIXlJNJGRw2cUicL8u2lpQ",
    "timestamp": 1653788963016,
    "status": "yellow",
    "indices": {}, // index statistics
    "nodes": {} // node statistics
}

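Node monitoring

The node monitoring APIs start with _nodes.

Node information

The node info API returns information about one or more nodes in the cluster.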

GET _nodes


{
    "_nodes": {
        "total": 1,
        "successful": 1,
        "failed": 0
    },
    "cluster_name": "elasticsearch",
    "nodes": {
        "ZUkF5AgmR9WMMDHxPj4oMg": {
            "name": "ZUkF5Ag",
            "transport_address": "192.168.1.220:9300",
            "host": "192.168.1.220",
            "ip": "192.168.1.220",
            "version": "6.8.23",
            "build_flavor": "default",
            "build_type": "rpm",
            "build_hash": "4f67856",
            "total_indexing_buffer": 207591833,
            "roles": [],
            "attributes": {},
            "settings": {},
            "os": {},
            "process": {},
            "jvm": {},
            "thread_pool": {},
            "transport": {},
            "http": {},
            "plugins": [],
            "modules": [],
            "ingest": {}
        }
    }
}

The _cat API can also list the nodes:

GET _cat/nodes?v

Node statistics

Returns statistics for one, several, or all nodes in the cluster.

GET _nodes/stats

{
    "_nodes": {
        "total": 1,
        "successful": 1,
        "failed": 0
    },
    "cluster_name": "elasticsearch",
    "nodes": {
        "ZUkF5AgmR9WMMDHxPj4oMg": {
            "timestamp": 1653789513649,
            "name": "ZUkF5Ag",
            "transport_address": "192.168.1.220:9300",
            "host": "192.168.1.220",
            "ip": "192.168.1.220:9300",
            "roles": [],
            "attributes": {},
            "indices": {}, // index statistics
            "os": {}, // operating system statistics
            "process": {}, // process statistics
            "jvm": {}, // JVM statistics
            "thread_pool": {}, // thread pool statistics
            "fs": {}, // file system information
            "transport": {}, // transport-layer network statistics
            "http": {}, // HTTP connection information
            "breakers": {}, // circuit breaker statistics
            "script": {}, // script statistics
            "discovery": {},
            "ingest": {},
            "adaptive_selection": {}
        }
    }
}

Moving shards within the cluster

The _cluster/reroute API manipulates shard allocation directly. For example, it can move a shard from one node to another, or assign an unassigned shard to a specific node.

POST /_cluster/reroute
{
    "commands" : [
        {
            "move" : {
                "index" : "test",
                "shard" : 0,
                "from_node" : "node1",
                "to_node" : "node2"
            }
        },
        {
            "allocate_replica" : {
                "index" : "test",
                "shard" : 1,
                "node" : "node3"
            }
        }
    ]
}

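Three commonly used commands:

move Moves a shard from one node to another; takes the index name and shard number
cancel Cancels the allocation of a shard. The node parameter names the node on which to cancel the ongoing allocation, and allow_primary permits cancelling the allocation of a primary shard
allocate_replica Allocates an unassigned replica shard to the given node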
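
Because the request body is just a list of command objects, it is easy to assemble in code. The sketch below builds the same body as the request above. The builder functions are hypothetical helpers written for this example, and the index and node names are the placeholders used in the request.

```python
import json


def move(index, shard, from_node, to_node):
    """Build a move command (illustrative helper)."""
    return {"move": {"index": index, "shard": shard,
                     "from_node": from_node, "to_node": to_node}}


def allocate_replica(index, shard, node):
    """Build an allocate_replica command."""
    return {"allocate_replica": {"index": index, "shard": shard, "node": node}}


def cancel(index, shard, node, allow_primary=False):
    """Build a cancel command; allow_primary must be set to cancel a primary."""
    return {"cancel": {"index": index, "shard": shard, "node": node,
                       "allow_primary": allow_primary}}


def reroute_body(*commands):
    """Wrap commands in the body expected by POST /_cluster/reroute."""
    return {"commands": list(commands)}


body = reroute_body(move("test", 0, "node1", "node2"),
                    allocate_replica("test", 1, "node3"))
print(json.dumps(body, indent=4))
```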

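Cluster and node settings

Updating settings

A settings update is either persistent, which survives a full cluster restart, or transient, which is lost on a full cluster restart.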

PUT /_cluster/settings
{
    "persistent" : {
        "indices.recovery.max_bytes_per_sec" : "50mb"
    }
}
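
When the same setting is defined in several places, transient values take precedence over persistent ones, which in turn take precedence over elasticsearch.yml and the built-in defaults. The sketch below models that lookup order. The function name and the sample values are made up for illustration.

```python
def effective_setting(key, transient, persistent, config_file, defaults):
    """Return the value that wins for key, checking sources in precedence order."""
    for source in (transient, persistent, config_file, defaults):
        if key in source:
            return source[key]
    raise KeyError(key)


key = "indices.recovery.max_bytes_per_sec"
defaults = {key: "40mb"}      # illustrative default value
config_file = {}              # nothing set in elasticsearch.yml
persistent = {key: "50mb"}    # set through PUT /_cluster/settings
transient = {}

print(effective_setting(key, transient, persistent, config_file, defaults))

transient = {key: "100mb"}    # a temporary override wins until a full restart
print(effective_setting(key, transient, persistent, config_file, defaults))
```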

Reading settings

GET /_cluster/settings?include_defaults=true

include_defaults=true also returns the settings that are still at their default values.

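Common settings

Master election, prefixed with discovery.zen

discovery.zen.ping_timeout How long a node waits for ping responses during master election, default 3s. On a slow network, elections can time out and be repeated; raising this value makes them more tolerant
discovery.zen.join_timeout Timeout for the join request a node sends to the master when joining the cluster. It is 20 times ping_timeout, so 60000ms by default
discovery.zen.minimum_master_nodes Guards against split brain, where network problems or an overloaded, slow master cause pings to the master to time out, some master-eligible nodes hold a new election, and the cluster ends up with more than one master, each effectively running its own independent cluster. An election only proceeds when at least this many master-eligible nodes are visible. Set it to the quorum of master-eligible nodes: quorum = (number of master-eligible nodes / 2) + 1, using integer division
discovery.zen.no_master_block Which operations are rejected while the cluster has no active master, either all or write. Default write
discovery.zen.ping.unicast.hosts Seed hosts for unicast discovery: the list of addresses a node contacts to find the rest of the cluster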
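
The quorum formula is easy to get wrong by one, so here is a short worked sketch. It computes the value for a few cluster sizes and checks why a quorum prevents two masters: after a network partition, at most one side can still see enough master-eligible nodes. The function names are made up for this example.

```python
def minimum_master_nodes(master_eligible):
    """quorum = (master-eligible nodes // 2) + 1"""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


def can_elect(visible_master_eligible, quorum):
    """A partition may elect a master only if it sees at least quorum nodes."""
    return visible_master_eligible >= quorum


for n in (1, 2, 3, 5):
    print(n, minimum_master_nodes(n))

# Three master-eligible nodes split 2 / 1 by a network partition.
quorum = minimum_master_nodes(3)
print(can_elect(2, quorum), can_elect(1, quorum))
```

With two master-eligible nodes the quorum is also 2, so losing either node leaves the cluster without a master; this is why three master-eligible nodes are the usual minimum for a fault-tolerant setup.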

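Fault detection, prefixed with discovery.zen.fd

discovery.zen.fd.ping_interval How often a node is pinged, default 1s
discovery.zen.fd.ping_retries How many ping failures or timeouts are tolerated before the node is considered failed, default 3
discovery.zen.fd.ping_timeout How long to wait for a ping response, default 30s. At runtime the master pings every node and every node pings the master; if the master is found unreachable, a new master election is triggered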
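
These three values together bound how long a dead node can go unnoticed. The sketch below is a rough estimate under a simplified model, which is an assumption of this example and not the exact fault-detection algorithm: one ping interval passes before the first failed ping, and every retry waits for the full timeout.

```python
def worst_case_detection_seconds(ping_interval=1.0, ping_retries=3, ping_timeout=30.0):
    """Rough upper bound on failure detection time under a simplified model."""
    # Assumption: wait one interval, then each of the retries times out fully.
    return ping_interval + ping_retries * ping_timeout


print(worst_case_detection_seconds())                   # with the defaults listed above
print(worst_case_detection_seconds(ping_timeout=10.0))  # a tighter, hypothetical timeout
```

Lowering ping_timeout shortens detection but makes the cluster more likely to drop nodes that are only slow, for example during long garbage collection pauses.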

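Shard allocation settings, prefixed with cluster.routing.allocation

cluster.routing.allocation.enable Enables or disables allocation for specific kinds of shards. Options:
- all allows allocation of all shards (default)
- primaries allows allocation of primary shards only
- new_primaries allows allocation only of primary shards of newly created indices
- none allows no shard allocation
cluster.routing.allocation.node_concurrent_recoveries How many shard recoveries may run concurrently on one node, default 2
cluster.routing.allocation.node_initial_primaries_recoveries How many unassigned primaries may be recovered in parallel on one node after a node restart, default 4
cluster.routing.allocation.same_shard.host Whether to prevent several copies of the same shard from being allocated to one host, judged by host name and host address. Default false; only relevant when several nodes run on the same machine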
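
The four values of cluster.routing.allocation.enable act as a filter on which shards may be allocated. The sketch below encodes that filter as a small decision function. The function and its parameters are hypothetical, written only to make the options concrete.

```python
def allocation_allowed(enable, is_primary, is_new_index):
    """Decide whether a shard may be allocated under the given enable value."""
    if enable == "all":
        return True
    if enable == "primaries":
        return is_primary
    if enable == "new_primaries":
        return is_primary and is_new_index
    if enable == "none":
        return False
    raise ValueError("unknown value: " + enable)


# A replica of an existing index under each setting.
for value in ("all", "primaries", "new_primaries", "none"):
    print(value, allocation_allowed(value, is_primary=False, is_new_index=False))
```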

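Shard rebalancing settings

cluster.routing.rebalance.enable Enables or disables rebalancing for specific kinds of shards. Options:
- all allows rebalancing of all shards (default)
- primaries allows rebalancing of primary shards only
- replicas allows rebalancing of replica shards only
- none allows no rebalancing
cluster.routing.allocation.allow_rebalance When rebalancing is allowed. Options:
- always rebalancing is always allowed
- indices_primaries_active only when all primary shards in the cluster are allocated
- indices_all_active only when all shards, primaries and replicas, are allocated (default)
cluster.routing.allocation.cluster_concurrent_rebalance How many shard rebalances may run concurrently across the cluster, default 2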

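Shard balancing heuristics, prefixed with cluster.routing.allocation.balance

cluster.routing.allocation.balance.index Weight factor for the number of shards per index allocated on a node, default 0.55
cluster.routing.allocation.balance.threshold Minimum weight difference that justifies a rebalancing operation, default 1. Raising it makes the cluster less aggressive about rebalancing
cluster.routing.allocation.balance.shard Weight factor for the total number of shards allocated on a node, default 0.45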
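
To see how the two weight factors and the threshold interact, the sketch below uses a simplified weighting model. It is an assumption made for illustration, not a copy of the allocator's implementation: a node's weight grows with how far its total shard count and its per-index shard count sit above the cluster averages, and a move is considered worthwhile when the weight gap between two nodes exceeds the threshold. All shard counts are made-up sample values.

```python
def node_weight(node_shards, avg_shards_per_node,
                node_index_shards, avg_index_shards_per_node,
                shard_balance=0.45, index_balance=0.55):
    """Simplified node weight for one index (illustrative model)."""
    total = shard_balance + index_balance
    theta_shard = shard_balance / total   # share given to the total shard count
    theta_index = index_balance / total   # share given to the per-index shard count
    return (theta_shard * (node_shards - avg_shards_per_node)
            + theta_index * (node_index_shards - avg_index_shards_per_node))


# Two nodes, 10 shards overall, 4 of them belonging to the index "test".
weight_a = node_weight(7, 5.0, 3, 2.0)   # node A is overloaded
weight_b = node_weight(3, 5.0, 1, 2.0)   # node B is underloaded
threshold = 1.0

print(weight_a, weight_b, (weight_a - weight_b) > threshold)
```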

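Disk-based allocation, prefixed with cluster.routing.allocation.disk

cluster.routing.allocation.disk.include_relocations Whether shards currently relocating to a node are counted when computing that node's disk usage, default true
cluster.routing.allocation.disk.watermark.low Low watermark, default 85%. Given as a percentage of disk used or as an absolute amount of free space, e.g. 85% or 1gb. Once a node exceeds it, no new shards are allocated to that node
cluster.routing.allocation.disk.watermark.high High watermark, default 90%. Once a node exceeds it, Elasticsearch tries to relocate shards away from that node
cluster.routing.allocation.disk.watermark.flood_stage Flood-stage watermark, default 95%. Once a node exceeds it, every index with a shard on that node is made read-only (index.blocks.read_only_allow_delete)
cluster.routing.allocation.disk.threshold_enabled Whether the disk allocation decider is enabled, default true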
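
The three watermarks form a ladder of increasingly drastic reactions. The sketch below classifies a node by its disk usage using the default percentages of 85, 90, and 95. The function and its return labels are invented for this example, and treating each threshold as inclusive is an assumption.

```python
def disk_action(used_percent, low=85.0, high=90.0, flood_stage=95.0):
    """Map a node's disk usage to the reaction of the disk allocation decider."""
    # Assumption: thresholds are treated as inclusive in this sketch.
    if used_percent >= flood_stage:
        return "read_only_block"        # indices with shards here become read-only
    if used_percent >= high:
        return "relocate_shards_away"   # shards are moved to other nodes
    if used_percent >= low:
        return "no_new_shards"          # node stops receiving new shards
    return "ok"


for used in (70.0, 86.5, 92.0, 97.0):
    print(used, disk_action(used))
```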

Complete configuration example

# Cluster name
cluster.name: my-application

# ------------------------------------ Node ------------------------------------
# Node name
node.name: node-1
# Not master-eligible
node.master: false
# Acts as a data node
node.data: true
# Custom node attribute, e.g. the rack the node sits in
#node.attr.rack: r1
#
# ----------------------------------- Paths ------------------------------------
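# Data directory
path.data: /data/elasticsearch/lib/elasticsearch
#
# Log directory
path.logs: /var/log/elasticsearch
# Temporary files
# path.work: /var/log/elasticsearch/tmp
# Plugin directory; defaults to the plugins directory under the Elasticsearch home
# path.plugins: /opt/elasticsearch/plugins

# Default number of primary shards per index, default 5
# Since 5.x, index-level settings are rejected in elasticsearch.yml; set them per index or in an index template
# index.number_of_shards: 5
# Default number of replicas per index, default 1
# index.number_of_replicas: 1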
#
# ----------------------------------- Memory -----------------------------------
# Set to true to lock the process memory: Elasticsearch slows down badly once the JVM starts swapping, and locking prevents swapping
bootstrap.memory_lock: true
bootstrap.system_call_filter: false
# ---------------------------------- Network -----------------------------------

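# By default only reachable from the local machine; 0.0.0.0 binds all interfaces and allows remote access
network.host: 0.0.0.0
# Port for node-to-node (transport) communication
transport.tcp.port: 9300
# HTTP port, default 9200
http.port: 9200
# Maximum size of an HTTP request body, default 100mb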
http.max_content_length: 100mb

# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when new node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
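# Ping timeout during master election, default 3s. On a slow network elections can time out and be repeated; raise this value if needed
discovery.zen.ping_timeout: 30s
# Guards against split brain (network problems or an overloaded, slow master cause pings to the master to time out, some master-eligible nodes hold a new election, and the cluster ends up with more than one master, each acting as an independent cluster). An election only proceeds when at least this many master-eligible nodes are visible; set it to the quorum, `quorum = (number of master-eligible nodes / 2) + 1`
discovery.zen.minimum_master_nodes: 2
# Fault-detection ping timeout, default 30s. The master pings every node and every node pings the master; an unreachable master triggers a new election
discovery.zen.fd.ping_timeout: 30s
# Seed hosts for unicast discovery: the addresses a node contacts to find the rest of the cluster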
discovery.zen.ping.unicast.hosts: ["0.0.0.0"]


# ---------------------------------- Gateway -----------------------------------


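# Recovery may start once at least 3 nodes have joined
gateway.recover_after_nodes: 3
# If the expected number of nodes has not joined, wait up to 5 minutes before starting shard recovery anyway
gateway.recover_after_time: 5m
# Number of nodes expected in the cluster; recovery starts immediately once all of them have joined
gateway.expected_nodes: 3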
#
# For more information, consult the gateway module documentation.
#
# ---------------------------------- Various -----------------------------------
#
# Require explicit names when deleting indices:
# Protects against deleting all indices by accident in production: when true, indices cannot be deleted through wildcards or _all and must be named explicitly
#action.destructive_requires_name: true

# ---------------------------------- http.cors -----------------------------------
http.cors.enabled: true

http.cors.allow-origin: "*"
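
The three gateway settings in the configuration above combine into one decision after a full cluster restart. The sketch below models that decision as understood from the setting descriptions: recovery starts at once when the expected number of nodes has joined, and otherwise only after the wait time has elapsed with at least the minimum number of nodes present. The function name and the sample numbers (5 expected, 3 minimum) are assumptions chosen to make the difference visible.

```python
def should_start_recovery(joined_nodes, waited_seconds,
                          expected_nodes=5, recover_after_nodes=3,
                          recover_after_time_seconds=300):
    """Decide whether shard recovery may begin after a full cluster restart."""
    if joined_nodes >= expected_nodes:
        return True   # everyone is back: no reason to wait
    # Otherwise wait out the timer, and still require the minimum node count.
    return (joined_nodes >= recover_after_nodes
            and waited_seconds >= recover_after_time_seconds)


print(should_start_recovery(joined_nodes=5, waited_seconds=10))    # all nodes joined
print(should_start_recovery(joined_nodes=4, waited_seconds=60))    # still waiting
print(should_start_recovery(joined_nodes=4, waited_seconds=300))   # timer elapsed
print(should_start_recovery(joined_nodes=2, waited_seconds=600))   # too few nodes
```

When expected_nodes equals recover_after_nodes, as in the configuration above, the timer never comes into play because both conditions are met at the same moment.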

bigdesk can be used to monitor an Elasticsearch cluster; it is available at https://github.com/hlstudio/bigdesk.git