Deploying a Ceph Cluster on Ubuntu 14.04

2017-04-18

Platform: Ubuntu 14.04.

Cluster node layout:

Node             IP             OSD    MON             RGW  MDS
ceph-admin-node  192.168.1.253  -      -               -    -
ceph-node1       192.168.1.252  osd.0  mon.ceph-node1  -    -
ceph-node2       192.168.1.251  osd.1  mon.ceph-node2  -    Yes
ceph-node3       192.168.1.250  osd.2  mon.ceph-node3  Yes  -

1. install
export CEPH_DEPLOY_REPO_URL=https://mirrors.aliyun.com/ceph/debian-jewel
export CEPH_DEPLOY_GPG_URL=https://mirrors.aliyun.com/ceph/keys/release.asc
wget -q -O- 'https://mirrors.aliyun.com/ceph/keys/release.asc' | sudo apt-key add -
echo deb https://mirrors.aliyun.com/ceph/debian-jewel/ $(lsb_release -sc) main | sudo tee /etc/apt/sources.list.d/ceph.list
sudo apt-get update && sudo apt-get install ceph-deploy

2. ceph node setup
2.1 ntp install
sudo apt-get install ntp

2.2 ssh server install
sudo apt-get install openssh-server

2.3 ceph deploy user create
sudo useradd -d /home/ceph-admin-node -m ceph-admin-node
sudo passwd ceph-admin-node
Enter new UNIX password:123456
sudo gpasswd -a ceph-admin-node sudo
echo "ceph-admin-node ALL = (root) NOPASSWD:ALL" | sudo tee /etc/sudoers.d/ceph-admin-node
sudo chmod 0440 /etc/sudoers.d/ceph-admin-node

add hostname in /etc/hosts

cat /etc/hosts
192.168.1.253 ceph-admin-node
192.168.1.252 ceph-node1
192.168.1.251 ceph-node2
192.168.1.250 ceph-node3

add hostname/username in ~/.ssh/config

ceph-admin-node@ceph-admin-node:~/my-cluster$ cat ~/.ssh/config
Host ceph-admin-node
Hostname ceph-admin-node
User ceph-admin-node
Host ceph-node1
Hostname ceph-node1
User ceph-node1
Host ceph-node2
Hostname ceph-node2
User ceph-node2
Host ceph-node3
Hostname ceph-node3
User ceph-node3
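
The stanzas above all follow one pattern, because this guide logs in to each node as a user named after the host. A minimal sketch that generates them is shown below; the function name gen_ssh_config is made up for illustration, and the output is only printed so it can be reviewed before being appended to ~/.ssh/config.

```shell
#!/bin/sh
# Print one ~/.ssh/config stanza per node name given as an argument.
# Assumption (taken from this guide): Host, Hostname and User are identical.
gen_ssh_config() {
    for node in "$@"; do
        printf 'Host %s\n    Hostname %s\n    User %s\n' "$node" "$node" "$node"
    done
}

# Preview the stanzas; append ">> ~/.ssh/config" once they look right.
gen_ssh_config ceph-admin-node ceph-node1 ceph-node2 ceph-node3
```

The indentation is optional for ssh, but it makes each Host block easier to read.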

ceph-node1

ceph-admin-node@ceph-admin-node:~/my-cluster$ ssh ceph-node1
sudo useradd -d /home/ceph-node1 -m ceph-node1
sudo passwd ceph-node1
Enter new UNIX password:123456
sudo gpasswd -a ceph-node1 sudo
username=ceph-node1
echo "${username} ALL = (root) NOPASSWD:ALL" | sudo tee /etc/sudoers.d/${username}
sudo chmod 0440 /etc/sudoers.d/${username}

ceph-node2

ceph-admin-node@ceph-admin-node:~/my-cluster$ ssh ceph-node2
sudo useradd -d /home/ceph-node2 -m ceph-node2
sudo passwd ceph-node2
Enter new UNIX password:123456
sudo gpasswd -a ceph-node2 sudo
username=ceph-node2
echo "${username} ALL = (root) NOPASSWD:ALL" | sudo tee /etc/sudoers.d/${username}
sudo chmod 0440 /etc/sudoers.d/${username}

ceph-node3

ceph-admin-node@ceph-admin-node:~/my-cluster$ ssh ceph-node3
sudo useradd -d /home/ceph-node3 -m ceph-node3
sudo passwd ceph-node3
Enter new UNIX password:123456
sudo gpasswd -a ceph-node3 sudo
username=ceph-node3
echo "${username} ALL = (root) NOPASSWD:ALL" | sudo tee /etc/sudoers.d/${username}
sudo chmod 0440 /etc/sudoers.d/${username}
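
The same five commands are repeated on every node with only the user name changing. The sketch below prints them for each node so they can be reviewed and pasted; the helper names sudoers_line and print_user_setup are hypothetical, and nothing is executed. Note that the sudoers entry must use plain ASCII double quotes: typographic quotes copied from a web page would end up inside the file.

```shell
#!/bin/sh
# Print the sudoers entry that gives a deploy user passwordless sudo.
sudoers_line() {
    printf '%s ALL = (root) NOPASSWD:ALL\n' "$1"
}

# Print (not run) the per-node user setup commands used in this guide.
print_user_setup() {
    user=$1
    echo "sudo useradd -d /home/$user -m $user"
    echo "sudo passwd $user"
    echo "sudo gpasswd -a $user sudo"
    echo "echo \"$(sudoers_line "$user")\" | sudo tee /etc/sudoers.d/$user"
    echo "sudo chmod 0440 /etc/sudoers.d/$user"
}

for node in ceph-node1 ceph-node2 ceph-node3; do
    print_user_setup "$node"
done
```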

2.4 ssh password-less

ceph-admin-node@ceph-admin-node:~$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/home/ceph-admin-node/.ssh/id_rsa):
Created directory '/home/ceph-admin-node/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/ceph-admin-node/.ssh/id_rsa.
Your public key has been saved in /home/ceph-admin-node/.ssh/id_rsa.pub.
The key fingerprint is:

copy the SSH public key to each node

ssh-copy-id ceph-node1@ceph-node1
ssh-copy-id ceph-node2@ceph-node2
ssh-copy-id ceph-node3@ceph-node3

2.5 ports required open

Ceph Monitors communicate using port 6789 by default. Ceph OSDs communicate in a port range of 6800:7300 by default.
sudo ufw enable
sudo ufw default deny
sudo ufw allow 6789
sudo ufw allow 22
sudo ufw allow proto tcp from any to any port 6800:7300
sudo ufw allow proto udp from any to any port 6800:7300
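
One possible variation on the rules above, as a hedged sketch: it opens SSH before the firewall is switched on, so a remote session is less likely to be cut off, and it opens only TCP, since Ceph monitors and OSDs talk over TCP. The function name print_ufw_rules is made up for illustration, and the commands are printed rather than executed.

```shell
#!/bin/sh
# Print ufw commands in an order that allows SSH and the Ceph ports
# before the default-deny policy is enabled. Nothing is run here.
print_ufw_rules() {
    echo "sudo ufw allow 22/tcp"          # SSH, used by ceph-deploy
    echo "sudo ufw allow 6789/tcp"        # Ceph monitors
    echo "sudo ufw allow 6800:7300/tcp"   # Ceph OSDs
    echo "sudo ufw default deny"
    echo "sudo ufw enable"
}

print_ufw_rules
```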

3. storage cluster

mkdir my-cluster
cd my-cluster
ceph-admin-node@ceph-admin-node:~/my-cluster$ ceph-deploy new ceph-node1

3.1 set the number of replicas to 2

add the following 3 lines to my-cluster/ceph.conf
osd pool default size = 2
osd max object name len = 256
osd max object namespace len = 64
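
Editing the file by hand works fine; as a sketch, the same three lines can also be appended by a small helper that skips keys already present, so it is safe to re-run. The function names add_conf_line and apply_osd_settings are hypothetical, and the sketch assumes the ceph.conf generated by ceph-deploy new contains only a [global] section, so appended lines fall inside it. The demo runs against a scratch file, not the real ceph.conf.

```shell
#!/bin/sh
# Append "key = value" to a conf file unless the key is already set.
add_conf_line() {
    conf=$1 key=$2 value=$3
    grep -q "^$key[[:space:]]*=" "$conf" 2>/dev/null || echo "$key = $value" >> "$conf"
}

# The three settings from this step.
apply_osd_settings() {
    add_conf_line "$1" "osd pool default size" 2
    add_conf_line "$1" "osd max object name len" 256
    add_conf_line "$1" "osd max object namespace len" 64
}

# Demo on a scratch file; point it at my-cluster/ceph.conf for real use.
demo_conf=$(mktemp)
printf '[global]\n' > "$demo_conf"
apply_osd_settings "$demo_conf"
apply_osd_settings "$demo_conf"   # second run adds nothing
cat "$demo_conf"
rm -f "$demo_conf"
```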

3.2 ceph install (install the Ceph packages)

ceph-admin-node@ceph-admin-node:~/my-cluster$ ceph-deploy install ceph-admin-node ceph-node1 ceph-node2 ceph-node3

3.3 initialize the ceph monitor and gather the keys

ceph-admin-node@ceph-admin-node:~/my-cluster$ ceph-deploy mon create-initial

3.4 add three OSDs

Each OSD uses two virtual disks: /dev/vdb (8 GB) for OSD data and /dev/vdc (6 GB) for the OSD journal, both formatted as XFS.
OSD creation generally has two phases: prepare and activate.

3.4.1 OSD creation: prepare phase

ceph-admin-node@ceph-admin-node:~/my-cluster$ ceph-deploy osd prepare --fs-type xfs ceph-node1:/dev/vdb:/dev/vdc
ceph-admin-node@ceph-admin-node:~/my-cluster$ ceph-deploy osd prepare --fs-type xfs ceph-node2:/dev/vdb:/dev/vdc
ceph-admin-node@ceph-admin-node:~/my-cluster$ ceph-deploy osd prepare --fs-type xfs ceph-node3:/dev/vdb:/dev/vdc

3.4.2 OSD creation: activate phase

ceph-admin-node@ceph-admin-node:~/my-cluster$ ceph-deploy osd activate ceph-node1:/dev/vdb1:/dev/vdc1
ceph-admin-node@ceph-admin-node:~/my-cluster$ ceph-deploy osd activate ceph-node2:/dev/vdb1:/dev/vdc1
ceph-admin-node@ceph-admin-node:~/my-cluster$ ceph-deploy osd activate ceph-node3:/dev/vdb1:/dev/vdc1

Push the configuration file and the client.admin key to each Ceph node
ceph-admin-node@ceph-admin-node:~/my-cluster$ ceph-deploy admin ceph-admin-node ceph-node1 ceph-node2 ceph-node3

3.5 Ensure that you have the correct permissions for the ceph.client.admin.keyring.

ceph-admin-node@ceph-admin-node:~/my-cluster$ sudo chmod +r /etc/ceph/ceph.client.admin.keyring
ceph-node1@ceph-node1:~$ sudo chmod +r /etc/ceph/ceph.client.admin.keyring
ceph-node2@ceph-node2:~$ sudo chmod +r /etc/ceph/ceph.client.admin.keyring
ceph-node3@ceph-node3:~$ sudo chmod +r /etc/ceph/ceph.client.admin.keyring

3.6 Check your cluster’s health.

ceph-admin-node@ceph-admin-node:~/my-cluster$ ceph -s
cluster e435a411-765d-4b1d-9685-e70b3cb09115
health HEALTH_WARN
clock skew detected on mon.ceph-node1
Monitor clock skew detected
monmap e3: 3 mons at {ceph-node1=192.168.1.252:6789/0,ceph-node2=192.168.1.251:6789/0,ceph-node3=192.168.1.250:6789/0}
election epoch 10, quorum 0,1,2 ceph-node3,ceph-node2,ceph-node1
osdmap e28: 3 osds: 3 up, 3 in
flags sortbitwise
pgmap v222: 112 pgs, 7 pools, 1636 bytes data, 171 objects
109 MB used, 24433 MB / 24542 MB avail
112 active+clean

Check how the OSDs are distributed across the nodes with ceph osd tree:
ceph-admin-node@ceph-admin-node:~/my-cluster$ ceph osd tree
ID WEIGHT  TYPE NAME           UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 0.02339 root default
-2 0.00780     host ceph-node1
 0 0.00780         osd.0            up  1.00000          1.00000
-3 0.00780     host ceph-node2
 1 0.00780         osd.1            up  1.00000          1.00000
-4 0.00780     host ceph-node3
 2 0.00780         osd.2            up  1.00000          1.00000

3.7 add an OSD on ceph-node1

Already done for all nodes in the steps above; skipped here.

3.8 add a Metadata Server(MDS) in order to use CephFS

ceph-admin-node@ceph-admin-node:~/my-cluster$ ceph-deploy mds create ceph-node2

3.9 add an RGW instance

ceph-admin-node@ceph-admin-node:~/my-cluster$ ceph-deploy rgw create ceph-node3

3.10 Add two Ceph Monitors

ceph-admin-node@ceph-admin-node:~/my-cluster$ ceph-deploy mon add ceph-node2
ceph-admin-node@ceph-admin-node:~/my-cluster$ ceph-deploy mon add ceph-node3
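After the new mon nodes are added, the monitors synchronize data with each other, form a quorum, and elect a leader. The result of the election can be seen with the following command:
ceph-admin-node@ceph-admin-node:~/my-cluster$ ceph quorum_status --format json-pretty
With multiple mon nodes, the monitors' clocks must be kept in sync, which is usually done with NTP. The "clock skew detected" warning in the ceph -s output of step 3.6 is what appears when they are not.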

3.11 storing/retrieving object data

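The cluster is now basically set up. Ceph is at its core an object storage system, so let's test storing an object. Storing and retrieving object data are both done from the client's point of view.
Storing object data requires two things:
<1>. A name for the object, that is, an object name.
<2>. A pool to hold the object.
The Ceph client fetches the latest cluster map, then uses the CRUSH algorithm to compute the mapping from object to placement group, and from there which OSDs that placement group maps to for storing the object's data.
Create a test file (what is uploaded is a file; inside the Ceph cluster it is stored as an object)
ceph-admin-node@ceph-admin-node:~/my-cluster$ echo {Test-data} > testfile.txt
Create a pool named data (a freshly built cluster has only one pool, named rbd)
ceph-admin-node@ceph-admin-node:~/my-cluster$ rados mkpool data
successfully created pool data
Upload a file to the pool named data in the Ceph cluster (the space figures below were measured with initrd.img as the uploaded file)
rados put {object-name} {file-path} --pool={pool-name}
ceph-admin-node@ceph-admin-node:~/my-cluster$ rados put pool-data-object-1 testfile.txt --pool=data
List the objects in the data pool
ceph-admin-node@ceph-admin-node:~/my-cluster$ rados -p data ls
pool-data-object-1
The file initrd.img is about 21 MB. Across the rados put operation, the space used by the cluster went from 106 MB to 152 MB under the 2-replica policy. 152 - 106 - 21*2 = 4 MB: what consumed the extra 4 MB?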
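
The arithmetic behind that question can be written as a small shell sketch. The function name unexplained_mb is made up for illustration; the inputs are the figures quoted above, in MB.

```shell
#!/bin/sh
# Space growth not accounted for by the replicas, in MB:
# (used after - used before) - file size * replica count
unexplained_mb() {
    before=$1 after=$2 file_mb=$3 replicas=$4
    echo $(( after - before - file_mb * replicas ))
}

unexplained_mb 106 152 21 2   # prints 4
```

Because the file size is only approximate ("about 21 MB"), part of the remainder may simply be rounding.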

Find where in the cluster the object for the uploaded file is located
ceph osd map {pool-name} {object-name}
ceph-admin-node@ceph-admin-node:~/my-cluster$ ceph osd map data pool-data-object-1
ceph-node1@ceph-node1:~$ ceph osd map data pool-data-object-1
osdmap e28 pool 'data' (6) object 'pool-data-object-1' -> pg 6.618d9d99 (6.1) -> up ([2,0], p2) acting ([2,0], p2)
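
The acting set at the end of that line lists the OSDs holding the object, primary first: here osd.2 and osd.0, which matches the 2-replica setting. As a sketch, the set can be pulled out of the output with sed; the function name acting_osds is hypothetical, and the pattern assumes the line format shown above.

```shell
#!/bin/sh
# Extract the acting OSD ids from a "ceph osd map" output line on stdin.
acting_osds() {
    sed -n 's/.*acting (\[\([0-9,]*\)].*/\1/p'
}

# The output line from this guide, used as sample input.
line="osdmap e28 pool 'data' (6) object 'pool-data-object-1' -> pg 6.618d9d99 (6.1) -> up ([2,0], p2) acting ([2,0], p2)"
echo "$line" | acting_osds   # prints 2,0
```

On a live cluster the same function could be fed from ceph osd map data pool-data-object-1 instead of the sample line.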
With the test done, delete the object pool-data-object-1 and see whether the cluster's used space returns to 106 MB.
rados rm {object-name} --pool={pool-name}
ceph-admin-node@ceph-admin-node:~/my-cluster$ rados rm pool-data-object-1 --pool=data
After the test, ceph -s shows 112 MB used, 6 MB more than the original 106 MB. Where did that space go?

4. Question

4.1 ceph-admin-node@ceph-admin-node:~/my-cluster$ ceph health

HEALTH_ERR 64 pgs are stuck inactive for more than 300 seconds; 64 pgs stuck inactive

ERROR: osd init failed: (36) File name too long

vi ceph.conf

osd max object name len = 256
osd max object namespace len = 64

[Problem] [ceph-node2][WARNIN] ceph_disk.main.Error: Error: No cluster conf found in /etc/ceph with fsid 190aecc2-9b36-436e-a0de-658857152894
[Solution] ceph-node2@ceph-node2:/var/local/osd1$ sudo rm -fr *
ceph-node3@ceph-node3:/var/local/osd2$ sudo rm -fr *
ceph-admin-node@ceph-admin-node:~/my-cluster$ ceph-deploy osd prepare ceph-node2:/var/local/osd1 ceph-node3:/var/local/osd2
ceph-admin-node@ceph-admin-node:~/my-cluster$ ceph-deploy osd activate ceph-node2:/var/local/osd1 ceph-node3:/var/local/osd2

[Problem] sudo rbd map foo --name client.admin
rbd: sysfs write failed
RBD image feature set mismatch. You can disable features unsupported by the kernel with "rbd feature disable".
In some cases useful info is found in syslog - try "dmesg | tail" or so.
rbd: map failed: (6) No such device or address
[Solution]
rbd feature disable foo fast-diff
rbd feature disable foo deep-flatten
rbd feature disable foo object-map
rbd feature disable foo exclusive-lock
sudo rbd map foo --name client.admin

[Problem] 2017-04-16 11:55:36.541615 7f1c8058e700 0 -- :/2328386680 >> 192.168.1.252:6789/0 pipe(0x7f1c74015410 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7f1c74004d80).fault
2017-04-16 11:55:39.544059 7f1c8048d700 0 -- :/2328386680 >> 192.168.1.252:6789/0 pipe(0x7f1c7400c600 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7f1c7400e8b0).fault
2017-04-16 11:55:42.547404 7f1c8058e700 0 -- :/2328386680 >> 192.168.1.252:6789/0 pipe(0x7f1c74015410 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7f1c74004d80).fault
[Solution] Check whether 192.168.1.252 is among the IPs listed for mon_host in ceph.conf.
If it is, the problem is that the mon service on that node failed to start.

[Problem] ceph-node1@ceph-node1:~$ sudo ceph-disk activate /dev/vdb1
mount_activate: Failed to activate
ceph-disk: Error: ceph osd create failed: Command '/usr/bin/ceph' returned non-zero exit status 1: 2017-04-16 12:15:14.780249 7f33d4971700 -1 auth: unable to find a keyring on /var/lib/ceph/bootstrap-osd/ceph.keyring: (2) No such file or directory
2017-04-16 12:15:14.780358 7f33d4971700 -1 monclient(hunting): ERROR: missing keyring, cannot use cephx for authentication
2017-04-16 12:15:14.780394 7f33d4971700 0 librados: client.bootstrap-osd initialization error (2) No such file or directory
Error connecting to cluster: ObjectNotFound
[Solution] This indicates that the monitor node previously had a Ceph environment deployed on it. The fix is simple: before running the command, delete the {cluster}.client.admin.keyring file under /etc/ceph/ on that monitor node (the cluster name is ceph in the default configuration). I first deleted /etc/ceph/ceph.client.admin.keyring, then restarted the mon service, and finally retried the failing command sudo ceph-disk activate /dev/vdb1. It is best to deploy everything from the admin node with ceph-deploy.
For a detailed analysis, see: www.cppblog.com/runsisi/archive/2014/08/28/208168.html

[Problem] ceph-node1@ceph-node1:~$ ceph -s
2017-04-16 14:01:58.311148 7ff779cbd700 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin: (2) No such file or directory
2017-04-16 14:01:58.312722 7ff779cbd700 -1 monclient(hunting): ERROR: missing keyring, cannot use cephx for authentication
2017-04-16 14:01:58.313643 7ff779cbd700 0 librados: client.admin initialization error (2) No such file or directory
Error connecting to cluster: ObjectNotFound
[Solution] ceph-node1@ceph-node1:~$ sudo chmod +r /etc/ceph/ceph.client.admin.keyring

[Problem] How to view the current values of all configuration options of an OSD
[Solution] sudo ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok config show

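[Problem] How to change ceph.conf and have the change take effect immediately
[Solution] Use the ceph tell command to inject the parameter into the running cluster, then make the same change in ceph.conf and sync the file to all cluster nodes, for example when changing osd_pool_default_size from 2 to 3.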
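
The original post does not show the command it ran. As a hedged sketch of what such an injection generally looks like, the helper below builds a ceph tell ... injectargs command line and prints it without running it. The function name injectargs_cmd is made up for illustration, and the daemon target mon.* is an assumption: which daemons need the option, and whether it takes effect at runtime, depends on the option and the Ceph release.

```shell
#!/bin/sh
# Build (but do not run) a "ceph tell <target> injectargs" command line.
# target: daemon selector such as osd.0 or mon.* (assumed, see above)
# option: option name with underscores; value: the new value
injectargs_cmd() {
    target=$1 option=$2 value=$3
    printf "ceph tell '%s' injectargs '--%s %s'\n" "$target" "$option" "$value"
}

injectargs_cmd 'mon.*' osd_pool_default_size 3
```

Note that osd_pool_default_size only applies to pools created afterwards; an existing pool keeps its size until it is changed per pool with ceph osd pool set {pool-name} size 3.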
