Elasticsearch Cluster

1. Elasticsearch Cluster Node

Master-eligible node(Master node)

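- The node that holds the Cluster - Node - Shard mapping information. It handles lightweight cluster-wide operations such as creating or deleting indices and tracking which nodes belong to the cluster.

- To avoid a SPOF, the master is elected from the nodes configured with "node.master: true", and a new election takes place if the master fails.

- Set with "node.master: true"; the default value is true.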

Preventing split brain

To prevent split brain among the master nodes, the "discovery.zen.minimum_master_nodes" setting must be configured (default: 1).

Set it to "(number of master-eligible nodes (node.master: true) / 2) + 1", using integer division.

e.g.) With 4 master+data nodes and 5 data-only nodes, the value is (4/2)+1 = 3.
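
The quorum rule above is easy to get wrong by one, so here is a minimal Python sketch of the arithmetic. The function name minimum_master_nodes is only illustrative and is not part of any Elasticsearch client; it simply applies the (N / 2) + 1 formula with integer division.

```python
def minimum_master_nodes(master_eligible: int) -> int:
    """Return the quorum (N // 2) + 1 for N master-eligible nodes."""
    if master_eligible < 1:
        raise ValueError("a cluster needs at least one master-eligible node")
    # Integer division; data-only nodes are not counted in N.
    return master_eligible // 2 + 1


if __name__ == "__main__":
    # 4 master-eligible nodes plus 5 data-only nodes: only the 4 count.
    print(minimum_master_nodes(4))
    # A cluster with 3 master-eligible nodes.
    print(minimum_master_nodes(3))
```

With 3 master-eligible nodes the quorum is 2, so the cluster keeps a master when one node is lost, while a single isolated node cannot elect itself.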

elasticsearch.yml(master only)

node.master: true
node.data: false
node.ingest: false

Data node

- The node where data is actually stored; indexing and search operations run here.

- Set with "node.data: true"; the default value is true.

elasticsearch.yml(data only)

node.master: false
node.data: true
node.ingest: false

Ingest node

- Used when Logstash-style pre-processing is needed without running Logstash.

- Set with "node.ingest: true"; the default value is true.

elasticsearch.yml(Ingest only)

node.master: false
node.data: false
node.ingest: true

Coordinating node

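- Acts as a load balancer: it receives client requests, routes them to the nodes that hold the data, and merges the results.

- All three node settings are set to false.

elasticsearch.yml(coordination only)

node.master: false
node.data: false
node.ingest: false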
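
The four node types above differ only in the three boolean settings, which the following Python sketch makes explicit. The helper describe_node is a hypothetical name introduced for illustration; its defaults mirror the documented defaults (all three settings true).

```python
def describe_node(master: bool = True, data: bool = True, ingest: bool = True) -> str:
    """Name the node type implied by node.master / node.data / node.ingest."""
    flags = (("master", master), ("data", data), ("ingest", ingest))
    roles = [name for name, enabled in flags if enabled]
    if not roles:
        # No role enabled: the node only routes requests and merges results.
        return "coordinating only"
    if len(roles) == 1:
        return roles[0] + " only"
    return "+".join(roles)


if __name__ == "__main__":
    print(describe_node())                       # every default left at true
    print(describe_node(True, False, False))     # dedicated master
    print(describe_node(False, False, False))    # coordinating only
```

Note that a node with only node.master and node.data set to false still has the ingest role, because node.ingest defaults to true.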

2. Elasticsearch Cluster (Index Shard/Replica)

Index Shard

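- An index is created with 5 primary shards by default.

- The number of primary shards can only be set when the index is created and cannot be changed afterwards.

- Since Elasticsearch 5.0, index-level settings such as index.number_of_shards and index.number_of_replicas are rejected in elasticsearch.yml, so on a 6.x cluster they have to be supplied per index at creation time or through an index template.

elasticsearch.yml (before 5.0)

index.number_of_shards: 5

Index Replica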

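- In a cluster, each primary shard gets 1 replica by default (use 0 for a single-node setup, where a replica cannot be assigned).

elasticsearch.yml (before 5.0)

index.number_of_replicas: 1

* The shard and replica layout can be checked in Kibana.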
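
As a rough illustration of the shard and replica settings, the Python sketch below builds the JSON settings body that the create-index API accepts and computes how many shard copies result. The helper names are hypothetical, and nothing is sent to a cluster; the body would be used with a request such as PUT /<index-name>, where the index name is up to you.

```python
import json


def index_settings_body(shards: int = 5, replicas: int = 1) -> str:
    """Build the JSON body for creating an index with explicit shard counts."""
    if shards < 1 or replicas < 0:
        raise ValueError("shards must be >= 1 and replicas >= 0")
    body = {
        "settings": {
            "index": {
                "number_of_shards": shards,
                "number_of_replicas": replicas,
            }
        }
    }
    return json.dumps(body, indent=2)


def total_shard_copies(shards: int = 5, replicas: int = 1) -> int:
    """Each primary has `replicas` copies, so the total is P * (1 + R)."""
    return shards * (1 + replicas)


if __name__ == "__main__":
    print(index_settings_body())
    print(total_shard_copies())  # 5 primaries plus 5 replicas
```

With the defaults of 5 primaries and 1 replica, the cluster has to place 10 shard copies in total.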

3. Elasticsearch Cluster Test

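For the cluster test, a coordinator node load-balances requests across three Master/Data nodes.

With this layout, scaling out only requires adding Data nodes, and split brain among the Master nodes can be prevented (with three master-eligible nodes, discovery.zen.minimum_master_nodes should be (3/2)+1 = 2).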

hostname	role	public ip	Elasticsearch version	index directory
kafka-es	coordinator	192.168.10.190	6.3.1	/data/es
kafka-es-data01	master/data node	192.168.10.229	6.3.1	/data/es
kafka-es-data02	master/data node	192.168.10.231	6.3.1	/data/es
kafka-es-data03	master/data node	192.168.10.233	6.3.1	/data/es

To refer to the cluster nodes by hostname, add the following entries to the hosts file.

/etc/hosts

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.10.190 kafka-es
192.168.10.229 kafka-es-data01
192.168.10.231 kafka-es-data02
192.168.10.233 kafka-es-data03

Install OpenJDK 1.8 or later from the yum repository.

install openjdk

yum install -y java-1.8.0-openjdk-devel.x86_64

Download and install the Elasticsearch RPM package.

download/install elasticsearch 6.3.1

cd /opt
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.3.1.rpm
rpm -Uvh elasticsearch-6.3.1.rpm

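Edit elasticsearch.yml to configure the cluster and the basic settings.

On every node, cluster.name holds the name of the cluster (it defaults to "elasticsearch" if not set);

nodes that find each other through discovery and share the same cluster name form one cluster.

node.name must be unique, so it is filled in automatically from the ${HOSTNAME} variable.

path.data is the directory where the index data is physically stored.

To use the node as a coordinator, node.master and node.data are set to false so that no index data is physically stored on it (node.ingest is not set, so it keeps its default of true).

Finally, discovery.zen.ping.unicast.hosts lists the other nodes, so that this node contacts those hosts over unicast to join the cluster.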

/etc/elasticsearch/elasticsearch.yml(coordinator)

cluster.name: cy-cluster
node.name:  ${HOSTNAME}
network.host: 0.0.0.0
 
path.data: /data/es
path.logs: /var/log/elasticsearch
node.master: false
node.data: false
http.enabled: true
discovery.zen.ping.unicast.hosts: ["192.168.10.229","192.168.10.231","192.168.10.233"]

The Master/Data nodes use the same configuration as the coordinator node, except that node.master and node.data are set to true so that they take the master and data roles,

and "http.enabled: false" is set so that the REST API is not served on these nodes.

/etc/elasticsearch/elasticsearch.yml(Master/Data)

cluster.name: cy-cluster
node.name:  ${HOSTNAME}
network.host: 0.0.0.0
 
path.data: /data/es
path.logs: /var/log/elasticsearch
node.master: true
node.data: true
http.enabled: false
discovery.zen.ping.unicast.hosts: ["192.168.10.229","192.168.10.231","192.168.10.233"]

Enable the service in systemctl and start it.

systemctl

systemctl daemon-reload
systemctl enable elasticsearch.service
systemctl start elasticsearch

Checking the cluster state through the REST API shows 4 nodes in total, 3 of which are Data nodes.

cluster check

# curl localhost:9200/_cluster/health?pretty
{
  "cluster_name" : "cy-cluster",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 4,
  "number_of_data_nodes" : 3,
  "active_primary_shards" : 0,
  "active_shards" : 0,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}
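
The same check can be scripted. The Python sketch below parses a trimmed copy of the health response shown above and reports anything that differs from the expected node counts; the function name health_problems and the choice of fields to check are assumptions made for illustration, not a standard tool.

```python
import json

# Trimmed copy of the health response shown above.
SAMPLE_HEALTH = """
{
  "cluster_name": "cy-cluster",
  "status": "green",
  "number_of_nodes": 4,
  "number_of_data_nodes": 3,
  "unassigned_shards": 0
}
"""


def health_problems(text, expected_nodes, expected_data_nodes):
    """Return a list of problems found; an empty list means all checks passed."""
    health = json.loads(text)
    problems = []
    if health["status"] != "green":
        problems.append("status is " + health["status"])
    if health["number_of_nodes"] != expected_nodes:
        problems.append("expected %d nodes, found %d"
                        % (expected_nodes, health["number_of_nodes"]))
    if health["number_of_data_nodes"] != expected_data_nodes:
        problems.append("expected %d data nodes, found %d"
                        % (expected_data_nodes, health["number_of_data_nodes"]))
    if health["unassigned_shards"] > 0:
        problems.append("%d unassigned shards" % health["unassigned_shards"])
    return problems


if __name__ == "__main__":
    print(health_problems(SAMPLE_HEALTH, expected_nodes=4, expected_data_nodes=3))
```

In practice the text would come from the curl command above rather than from the embedded sample.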

Querying the _cat/master API with curl in the same way shows which node has been elected as Master.

# curl localhost:9200/_cat/master?v
id                     host          ip            node
ReqqThDCSimxGwgPZSRiHg 172.16.194.23 172.16.194.23 kafka-es-data03.novalocal
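
If the elected master needs to be read by a script, the tabular _cat output can be split on whitespace. This is a minimal Python sketch under the assumption that no column value contains spaces; parse_cat_table is a hypothetical helper name, and the sample text is the output shown above.

```python
SAMPLE_CAT_MASTER = """\
id                     host          ip            node
ReqqThDCSimxGwgPZSRiHg 172.16.194.23 172.16.194.23 kafka-es-data03.novalocal
"""


def parse_cat_table(text):
    """Turn whitespace-separated _cat output (with ?v header) into dicts."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return []
    header = lines[0].split()
    # Pair each header name with the value in the same column.
    return [dict(zip(header, line.split())) for line in lines[1:]]


if __name__ == "__main__":
    for row in parse_cat_table(SAMPLE_CAT_MASTER):
        print(row["node"], row["ip"])
```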

Kibana is likewise downloaded and installed from an RPM file (coordinator node).

install kibana

cd /opt
wget https://artifacts.elastic.co/downloads/kibana/kibana-6.3.2-x86_64.rpm
yum localinstall kibana-6.3.2-x86_64.rpm -y

Kibana is configured as follows to bind to 0.0.0.0:5601 and to reach the cluster through the REST API of the coordinator node (coordinator node).

/etc/kibana/kibana.yml (coordinator)

server.port: 5601
server.host: "0.0.0.0"
elasticsearch.url: "http://localhost:9200"

Enable the service in systemctl and start it (coordinator node).

systemctl

systemctl daemon-reload
systemctl enable kibana.service
systemctl start kibana

After opening Kibana and enabling the Monitoring feature, the overall state of the cluster can be checked.

Next, Logstash is also downloaded and installed from an RPM file (coordinator node).

download/install logstash (coordinator)

cd /opt
wget https://artifacts.elastic.co/downloads/logstash/logstash-6.3.1.rpm
yum localinstall logstash-6.3.1.rpm -y

For a simple test, add a pipeline that forwards whatever arrives on TCP port 1919 to Elasticsearch (coordinator node).

/etc/logstash/conf.d/test.conf

input {
    tcp {
        port => 1919
    }
}

output {
    elasticsearch {
        hosts => ["localhost:9200"]
    }
}

Enable the service in systemctl and start it (coordinator node).

systemctl

systemctl daemon-reload
systemctl enable logstash.service
systemctl start logstash

Send a short string to port 1919 with nc (coordinator node).

tcp test

echo "test" | nc localhost 1919
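
Where nc is not installed, the same test can be approximated from Python. The sketch below sends one newline-terminated line over TCP, which is what the echo and nc pipeline does; send_line is a hypothetical helper name, and the example assumes the Logstash tcp input from test.conf is listening on localhost:1919.

```python
import socket


def send_line(host: str, port: int, message: str, timeout: float = 5.0) -> int:
    """Send one newline-terminated line over TCP and return the bytes sent."""
    payload = (message + "\n").encode("utf-8")
    # create_connection resolves the host and connects; the context manager
    # closes the socket once the payload has been handed to the kernel.
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(payload)
    return len(payload)


if __name__ == "__main__":
    # Assumes the Logstash tcp input defined in test.conf is running locally.
    print(send_line("localhost", 1919, "test"))
```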

The string sent above can now be seen in Kibana.

In addition, the 5 primary shards are distributed across the 3 nodes, and each has 1 replica shard.

If the elasticsearch service on the kafka-es-data01 node is stopped for a while, the cluster status changes to Yellow, and for each primary shard that was on kafka-es-data01 its replica shard is promoted to primary.

When the service is started again, the shards are redistributed across the nodes as before.
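
The failover behaviour described above can be modelled with a few lines of Python. This is a toy model only, not Elasticsearch's allocation algorithm: the shard layout is invented for illustration, and a real cluster may also move replicas onto the remaining nodes after a delay.

```python
# Hypothetical layout: 5 primaries with 1 replica each across three nodes.
LAYOUT = {
    0: {"primary": "kafka-es-data01", "replica": "kafka-es-data02"},
    1: {"primary": "kafka-es-data02", "replica": "kafka-es-data03"},
    2: {"primary": "kafka-es-data03", "replica": "kafka-es-data01"},
    3: {"primary": "kafka-es-data01", "replica": "kafka-es-data03"},
    4: {"primary": "kafka-es-data02", "replica": "kafka-es-data01"},
}


def fail_node(shards, failed):
    """Return (new_layout, status) after the node `failed` leaves the cluster."""
    result = {}
    status = "green"
    for number, copy in shards.items():
        primary, replica = copy["primary"], copy["replica"]
        if primary == failed:
            # The surviving replica is promoted; the shard now has no replica.
            primary, replica = replica, None
        elif replica == failed:
            replica = None
        if primary is None:
            status = "red"       # no copy of this shard is left
        elif replica is None and status != "red":
            status = "yellow"    # primary is available, replica is missing
        result[number] = {"primary": primary, "replica": replica}
    return result, status


if __name__ == "__main__":
    after, status = fail_node(LAYOUT, "kafka-es-data01")
    print(status)
    for number, copy in sorted(after.items()):
        print(number, copy)
```

Yellow here means every primary is still assigned but at least one replica is not, which matches the state observed when a single data node is stopped.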

Ref: https://www.slideshare.net/snuffkin/elasticsearch-as-a-distributed-system

Ref: https://www.elastic.co/guide/en/elasticsearch/reference/6.3/modules-node.html#master-node

Ref: http://behonestar.tistory.com/49

Ref: http://kimjmin.net/2018/01/2018-01-build-es-cluster-1/

