Sharing

Wednesday, September 7, 2011

Setup Ceph Cluster



Create the three partitions (as LVM logical volumes)
pjack@ubuntu1104-64-5:/etc/ceph$ sudo lvcreate -L 50G ubuntu1104-64-5
  Logical volume "lvol0" created
pjack@ubuntu1104-64-5:/etc/ceph$ sudo lvcreate -L 50G ubuntu1104-64-5
  Logical volume "lvol1" created
pjack@ubuntu1104-64-5:/etc/ceph$ sudo lvcreate -L 50G ubuntu1104-64-5
  Logical volume "lvol2" created
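
The three lvcreate calls above let LVM pick the names lvol0 to lvol2. A minimal sketch of the same step as a loop with explicit names follows. The `run` helper and the `DRY_RUN` switch are hypothetical additions so the commands can be reviewed before they touch the volume group; the volume group name and the 50G size come from the transcript.

```sh
#!/bin/sh
# Sketch: create three 50G logical volumes with explicit names.
# VG is the volume group used in this post; DRY_RUN=1 (the default here)
# only prints the commands instead of running them.
VG="ubuntu1104-64-5"
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

create_lvs() {
    for i in 0 1 2; do
        run sudo lvcreate -L 50G -n "lvol$i" "$VG"
    done
}

create_lvs
```

Running it with DRY_RUN=0 would execute the three lvcreate commands for real.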



Format them as ext3, ext4, and btrfs respectively

The first one is ext3
pjack@ubuntu1104-64-5:/etc/ceph$ sudo mkfs -t ext3 /dev/ubuntu1104-64-5/lvol0
mke2fs 1.41.14 (22-Dec-2010)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
3276800 inodes, 13107200 blocks
655360 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=4294967296
400 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
        4096000, 7962624, 11239424

Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 23 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.

The second one is ext4
pjack@ubuntu1104-64-5:/etc/ceph$ sudo mkfs -t ext4 /dev/ubuntu1104-64-5/lvol1
mke2fs 1.41.14 (22-Dec-2010)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
3276800 inodes, 13107200 blocks
655360 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=4294967296
400 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
        4096000, 7962624, 11239424

Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 31 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.


The third one is btrfs, which is also the setup the official documentation recommends
pjack@ubuntu1104-64-5:/etc/ceph$ sudo mkfs -t btrfs /dev/ubuntu1104-64-5/lvol2

WARNING! - Btrfs Btrfs v0.19 IS EXPERIMENTAL
WARNING! - see http://btrfs.wiki.kernel.org before using

fs created label (null) on /dev/ubuntu1104-64-5/lvol2
        nodesize 4096 leafsize 4096 sectorsize 4096 size 50.00GB
Btrfs Btrfs v0.19


# Mount all three
pjack@ubuntu1104-64-5:/mnt$ sudo mount /dev/mapper/ubuntu1104--64--5-lvol0 /mnt/lvol0
pjack@ubuntu1104-64-5:/mnt$ sudo mount /dev/mapper/ubuntu1104--64--5-lvol1 /mnt/lvol1
pjack@ubuntu1104-64-5:/mnt$ sudo mount /dev/mapper/ubuntu1104--64--5-lvol2 /mnt/lvol2
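
The mkfs commands use /dev/ubuntu1104-64-5/lvolN while the mount commands use /dev/mapper/ubuntu1104--64--5-lvolN. Both refer to the same volumes: device-mapper joins the volume group and volume names with a single hyphen and doubles any hyphen inside either name. A small sketch of that mapping (the function name `mapper_path` is made up for illustration):

```sh
#!/bin/sh
# mapper_path VG LV: print the /dev/mapper path for a logical volume.
# Hyphens inside the VG or LV name are doubled; one hyphen joins the two.
mapper_path() {
    vg=$(printf '%s' "$1" | sed 's/-/--/g')
    lv=$(printf '%s' "$2" | sed 's/-/--/g')
    printf '/dev/mapper/%s-%s\n' "$vg" "$lv"
}

mapper_path ubuntu1104-64-5 lvol0   # prints /dev/mapper/ubuntu1104--64--5-lvol0
```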

# Check the result
pjack@ubuntu1104-64-5:/mnt$ df
Filesystem                               1K-blocks      Used Available Use% Mounted on
/dev/mapper/ubuntu1104--64--5-lvol0       51606140    184268  48800432   1% /mnt/lvol0
/dev/mapper/ubuntu1104--64--5-lvol1       51606140    184136  48800564   1% /mnt/lvol1
/dev/mapper/ubuntu1104--64--5-lvol2       52428800        56  50302976   1% /mnt/lvol2

# Check the filesystem type of each volume
pjack@ubuntu1104-64-5:/lib/modules/2.6.38-8-server/kernel/fs$ mount -l
/dev/mapper/ubuntu1104--64--5-lvol0 on /mnt/lvol0 type ext3 (rw)
/dev/mapper/ubuntu1104--64--5-lvol1 on /mnt/lvol1 type ext4 (rw)
/dev/mapper/ubuntu1104--64--5-lvol2 on /mnt/lvol2 type btrfs (rw)

However, Ceph has a few requirements for ext4:

  1. user_xattr
  2. noatime
  3. nodiratime
  4. disable the ext journal

The ext4 partition must be mounted with -o user_xattr or else mkcephfs will fail. Also using noatime,nodiratime boosts performance at no cost. When using ext4, you should disable the ext4 journal, because Ceph does its own journalling. This will boost performance.       

Data Mode
=========
There are 3 different data modes:

* writeback mode
In data=writeback mode, ext4 does not journal data at all. This mode provides a similar level of journaling as that of XFS, JFS, and ReiserFS in its default mode - metadata journaling. A crash+recovery can cause incorrect data to appear in files which were written shortly before the crash. This mode will typically provide the best ext4 performance.

* ordered mode
In data=ordered mode, ext4 only officially journals metadata, but it logically groups metadata information related to data changes with the data blocks into a single unit called a transaction. When it's time to write the new metadata out to disk, the associated data blocks are written first. In general, this mode performs slightly slower than writeback but significantly faster than journal mode.

* journal mode
data=journal mode provides full data and metadata journaling. All new data is written to the journal first, and then to its final location.
In the event of a crash, the journal can be replayed, bringing both data and
metadata into a consistent state. This mode is the slowest except when data
needs to be read from and written to disk at the same time where it outperforms all other modes. Currently ext4 does not have delayed allocation support if this data journalling mode is selected.

After changing the mount options, check the result again
pjack@ubuntu1104-64-5:/lib/modules/2.6.38-8-server/kernel/fs$ mount -l
/dev/mapper/ubuntu1104--64--5-lvol0 on /mnt/lvol0 type ext3 (rw)
/dev/mapper/ubuntu1104--64--5-lvol2 on /mnt/lvol2 type btrfs (rw)
/dev/mapper/ubuntu1104--64--5-lvol1 on /mnt/lvol1 type ext4 (rw,noatime,nodiratime,user_xattr,data=writeback)
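
The command that changed the mount options is not recorded above. The sketch below shows one way it could have been done (an assumption, kept as a comment) and then checks a line of `mount -l` output for the options Ceph asks for. The helper name `has_mount_opt` is hypothetical.

```sh
#!/bin/sh
# Assumed way to apply the options (not taken from the original notes):
#   sudo umount /mnt/lvol1
#   sudo mount -o noatime,nodiratime,user_xattr,data=writeback \
#        /dev/mapper/ubuntu1104--64--5-lvol1 /mnt/lvol1

# has_mount_opt LINE OPT: succeed if one line of `mount -l` output lists OPT.
has_mount_opt() {
    line="$1"
    opt="$2"
    # Extract the comma-separated option list between the parentheses.
    opts=$(printf '%s\n' "$line" | sed -n 's/.*(\(.*\)).*/\1/p')
    case ",$opts," in
        *",$opt,"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Sample line copied from the transcript above.
sample='/dev/mapper/ubuntu1104--64--5-lvol1 on /mnt/lvol1 type ext4 (rw,noatime,nodiratime,user_xattr,data=writeback)'

for opt in user_xattr noatime nodiratime; do
    if has_mount_opt "$sample" "$opt"; then
        echo "ok: $opt"
    else
        echo "missing: $opt"
    fi
done
```

On a live system the sample line would be replaced by the matching line from `mount -l`.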


To make later steps easier, generate an SSH key on every server and import it into the other servers
pjack@ubuntu1104-64-5:/etc/ceph$ sudo ssh-keygen -d
Generating public/private dsa key pair.
Enter file in which to save the key (/root/.ssh/id_dsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_dsa.
Your public key has been saved in /root/.ssh/id_dsa.pub.
The key fingerprint is:
82:8a:85:37:a2:17:f2:41:4f:e8:96:d0:a6:1b:c9:6c root@ubuntu1104-64-5
The key's randomart image is:
+--[ DSA 1024]----+
|                 |
| . .             |
|. = .            |
|oO + .           |
|BEX o . S        |
|o@ =   .         |
|+ +              |
| .               |
|                 |
+-----------------+

root@ubuntu1104-64-5:~$ ssh-copy-id -i /root/.ssh/id_dsa.pub root@172.16.33.6
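
The key above is a 1024-bit DSA key created with `ssh-keygen -d`. Current OpenSSH releases no longer accept DSA keys by default, so the sketch below assumes an ed25519 key instead; that choice, the `run` helper, and the `DRY_RUN` switch are additions for illustration, while the peer addresses come from the /etc/hosts file in this post. If the key type changes, the `-i` path in fetch_config has to change with it.

```sh
#!/bin/sh
# Sketch: generate one key pair for root and push it to the other servers.
KEY="/root/.ssh/id_ed25519"          # assumed key path, not from the post
PEERS="172.16.33.6 172.16.33.7"      # the other two servers
DRY_RUN="${DRY_RUN:-1}"              # 1 = only print the commands

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

setup_keys() {
    # Create the key pair (empty passphrase) only if it does not exist yet.
    [ -f "$KEY" ] || run ssh-keygen -t ed25519 -N "" -f "$KEY"
    for peer in $PEERS; do
        run ssh-copy-id -i "$KEY.pub" "root@$peer"
    done
}

setup_keys
```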


Next, copy sample.ceph.conf and sample.fetch_config into /etc/ceph
and adjust the settings; most of them need no changes

[global]
        ; enable secure authentication
        auth supported = cephx

        ; allow ourselves to open a lot of files
        max open files = 131072

        ; set log file
        log file = /var/log/ceph/$name.log
        ; log_to_syslog = true        ; uncomment this line to log to syslog

        ; set up pid files
        pid file = /var/run/ceph/$name.pid

        ; If you want to run a IPv6 cluster, set this to true. Dual-stack isn't possible
        ;ms bind ipv6 = true


        keyring = /etc/ceph/keyring.admin

[mon]
        mon data = /data/$name
[mon.alpha]
        host = ubuntu1104-64-5
        mon addr = 172.16.33.5:6789
[mds]
        ; where the mds keeps its secret encryption keys
        keyring = /data/keyring.$name

        ; mds logging to debug issues.
        ;debug ms = 1
        ;debug mds = 20

[mds.alpha]
        host = ubuntu1104-64-5
[osd]
        ; This is where the btrfs volume will be mounted.
        osd data = /data/$name
        keyring = /etc/ceph/keyring.$name

        osd journal = /data/$name/journal
        osd journal size = 1000 ; journal size, in megabytes

[osd.0]
        host = ubuntu1104-64-5
        btrfs devs = /dev/mapper/ubuntu1104--64--5-lvol2

[osd.1]
        host = ubuntu1104-64-5
        btrfs devs = /dev/mapper/ubuntu1104--64--5-lvol0

[osd.2]
        host = ubuntu1104-64-6
        btrfs devs = /dev/mapper/ubuntu1104--64--6-lvol0

[osd.3]
        host = ubuntu1104-64-6
        btrfs devs = /dev/mapper/ubuntu1104--64--6-lvol1


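#!/bin/sh
# /etc/ceph/fetch_config: copy the master ceph.conf from the first server
# to the path passed in as $1
conf="$1"
scp -i /root/.ssh/id_dsa root@172.16.33.5:/etc/ceph/ceph.conf "$conf"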

Because ceph.conf refers to hostnames rather than IP addresses, /etc/hosts has to be set up as well; otherwise the scripts will run into problems later

127.0.0.1       localhost
172.16.33.5     ubuntu1104-64-5
172.16.33.6     ubuntu1104-64-6
172.16.33.7     ubuntu1104-64-7
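
Since a missing /etc/hosts entry only shows up later as a script failure, it can help to check up front that every `host =` value in ceph.conf has an entry. This is a minimal sketch; the function names are made up, and it only understands the simple `host = name` lines and plain hosts-file format shown in this post.

```sh
#!/bin/sh
# list_conf_hosts FILE: print the unique values of "host = ..." lines.
list_conf_hosts() {
    sed -n 's/^[[:space:]]*host[[:space:]]*=[[:space:]]*\([^[:space:];]*\).*/\1/p' "$1" | sort -u
}

# host_in_hosts_file NAME FILE: succeed if NAME appears as a hostname or alias.
host_in_hosts_file() {
    awk -v h="$1" '
        /^[[:space:]]*#/ { next }             # skip comment lines
        {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break          # stop at a trailing comment
                if ($i == h) found = 1
            }
        }
        END { exit found ? 0 : 1 }
    ' "$2"
}

# check_hosts CONF HOSTS: report each host; return 1 if any is missing.
check_hosts() {
    rc=0
    for h in $(list_conf_hosts "$1"); do
        if host_in_hosts_file "$h" "$2"; then
            echo "ok: $h"
        else
            echo "missing: $h"
            rc=1
        fi
    done
    return $rc
}

# Usage with the paths from this post:
#   check_hosts /etc/ceph/ceph.conf /etc/hosts
```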



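I also found what looks like a problem in Ceph's script. I went in and changed the order of one of its lines; otherwise it seems to override the settings in ceph.conf
-------- (omitted) ------------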
[ -z "$conf" ] && [ -n "$dir" ] && conf="$dir/conf"

# Add this line
[ -z "$conf" ] && [ -z "$dir" ] && conf=$default_conf
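
To see what these two lines decide, the sketch below wraps them in a function and tries the three combinations. It is an illustration only, not the actual Ceph script; the function name `resolve_conf` is hypothetical, and the value of `default_conf` is assumed from the "here 0 /etc/ceph/ceph.conf" line in the mkcephfs output.

```sh
#!/bin/sh
default_conf="/etc/ceph/ceph.conf"   # assumed default path

# resolve_conf CONF DIR: print the config path chosen by the two tests.
resolve_conf() {
    conf="$1"
    dir="$2"
    # No explicit conf, but a working dir was given: use the copy in that dir.
    [ -z "$conf" ] && [ -n "$dir" ] && conf="$dir/conf"
    # Neither was given: fall back to the default.
    [ -z "$conf" ] && [ -z "$dir" ] && conf="$default_conf"
    echo "$conf"
}

resolve_conf "" ""                     # prints /etc/ceph/ceph.conf
resolve_conf "" "/tmp/work"            # prints /tmp/work/conf
resolve_conf "/root/my.conf" "/tmp/work"   # prints /root/my.conf
```

An explicitly given conf path always wins, which is the behavior the added line is meant to preserve.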



After this long list of preparatory steps, the Ceph filesystem can finally be built

root@ubuntu1104-64-5:/tmp$ /sbin/mkcephfs -a --mkbtrfs
here 0 /etc/ceph/ceph.conf
[/etc/ceph/fetch_config /tmp/fetched.ceph.conf.13131]
ceph.conf                                                         100% 4455     4.4KB/s   00:00
temp dir is /tmp/mkcephfs.pt2DlXEHkB
here 0 /tmp/fetched.ceph.conf.13131
preparing monmap in /tmp/mkcephfs.pt2DlXEHkB/monmap
/usr/bin/monmaptool --create --clobber --add alpha 172.16.33.5:6789 --print /tmp/mkcephfs.pt2DlXEHkB/monmap
/usr/bin/monmaptool: monmap file /tmp/mkcephfs.pt2DlXEHkB/monmap
/usr/bin/monmaptool: generated fsid 9cc6b2d5-1eba-50b2-bd43-7b3807ce301b
epoch 1
fsid 9cc6b2d5-1eba-50b2-bd43-7b3807ce301b
last_changed 2011-09-07 18:18:00.219236
created 2011-09-07 18:18:00.219236
0: 172.16.33.5:6789/0 mon.alpha
/usr/bin/monmaptool: writing epoch 1 to /tmp/mkcephfs.pt2DlXEHkB/monmap (1 monitors)

=== osd.0 ===
here 0 /tmp/mkcephfs.pt2DlXEHkB/conf
umount: /data/osd.0: not mounted
umount: /dev/mapper/ubuntu1104--64--5-lvol2: not mounted

WARNING! - Btrfs Btrfs v0.19 IS EXPERIMENTAL
WARNING! - see http://btrfs.wiki.kernel.org before using

fs created label (null) on /dev/mapper/ubuntu1104--64--5-lvol2
        nodesize 4096 leafsize 4096 sectorsize 4096 size 50.00GB
Btrfs Btrfs v0.19
Scanning for Btrfs filesystems
here 0 /tmp/mkcephfs.pt2DlXEHkB/conf
2011-09-07 18:18:00.995785 7fb8f3dfc760 created object store /data/osd.0 journal /data/osd.0/journal for osd0 fsid 9cc6b2d5-1eba-50b2-bd43-7b3807ce301b
creating private key for osd.0 keyring /etc/ceph/keyring.osd.0
creating /etc/ceph/keyring.osd.0

=== osd.1 ===
here 0 /tmp/mkcephfs.pt2DlXEHkB/conf
umount: /data/osd.1: not mounted
umount: /dev/mapper/ubuntu1104--64--5-lvol0: not mounted

WARNING! - Btrfs Btrfs v0.19 IS EXPERIMENTAL
WARNING! - see http://btrfs.wiki.kernel.org before using

fs created label (null) on /dev/mapper/ubuntu1104--64--5-lvol0
        nodesize 4096 leafsize 4096 sectorsize 4096 size 50.00GB
Btrfs Btrfs v0.19
Scanning for Btrfs filesystems
here 0 /tmp/mkcephfs.pt2DlXEHkB/conf
2011-09-07 18:18:01.689284 7fa826bb2760 created object store /data/osd.1 journal /data/osd.1/journal for osd1 fsid 9cc6b2d5-1eba-50b2-bd43-7b3807ce301b
creating private key for osd.1 keyring /etc/ceph/keyring.osd.1
creating /etc/ceph/keyring.osd.1

=== osd.2 ===
pushing conf and monmap to ubuntu1104-64-6:/tmp/mkfs.ceph.13131
umount: /data/osd.2: not mounted
umount: /dev/mapper/ubuntu1104--64--6-lvol0: not mounted

WARNING! - Btrfs Btrfs v0.19 IS EXPERIMENTAL
WARNING! - see http://btrfs.wiki.kernel.org before using

fs created label (null) on /dev/mapper/ubuntu1104--64--6-lvol0
        nodesize 4096 leafsize 4096 sectorsize 4096 size 50.00GB
Btrfs Btrfs v0.19
Scanning for Btrfs filesystems
2011-09-07 18:18:03.823692 7fdc3c04c760 created object store /data/osd.2 journal /data/osd.2/journal for osd2 fsid 9cc6b2d5-1eba-50b2-bd43-7b3807ce301b
creating private key for osd.2 keyring /etc/ceph/keyring.osd.2
creating /etc/ceph/keyring.osd.2
collecting osd.2 key


=== osd.3 ===
pushing conf and monmap to ubuntu1104-64-6:/tmp/mkfs.ceph.13131
umount: /data/osd.3: not mounted
umount: /dev/mapper/ubuntu1104--64--6-lvol1: not mounted

WARNING! - Btrfs Btrfs v0.19 IS EXPERIMENTAL
WARNING! - see http://btrfs.wiki.kernel.org before using

fs created label (null) on /dev/mapper/ubuntu1104--64--6-lvol1
        nodesize 4096 leafsize 4096 sectorsize 4096 size 50.00GB
Btrfs Btrfs v0.19
Scanning for Btrfs filesystems
 ** WARNING: Ceph is still under development.  Any feedback can be directed  **
 **          at ceph-devel@vger.kernel.org or http://ceph.newdream.net/.     **
2011-09-07 18:18:06.293806 7f91b9b63760 created object store /data/osd.3 journal /data/osd.3/journal for osd3 fsid 9cc6b2d5-1eba-50b2-bd43-7b3807ce301b
creating private key for osd.3 keyring /etc/ceph/keyring.osd.3
creating /etc/ceph/keyring.osd.3
collecting osd.3 key

=== mds.alpha ===
here 0 /tmp/mkcephfs.pt2DlXEHkB/conf
creating private key for mds.alpha keyring /data/keyring.mds.alpha
creating /data/keyring.mds.alpha
here 0 /tmp/mkcephfs.pt2DlXEHkB/conf
Building generic osdmap
 highest numbered osd in /tmp/mkcephfs.pt2DlXEHkB/conf is osd.3
 num osd = 4
/usr/bin/osdmaptool: osdmap file '/tmp/mkcephfs.pt2DlXEHkB/osdmap'
/usr/bin/osdmaptool: writing epoch 1 to /tmp/mkcephfs.pt2DlXEHkB/osdmap
Generating admin key at /tmp/mkcephfs.pt2DlXEHkB/keyring.admin
creating /tmp/mkcephfs.pt2DlXEHkB/keyring.admin
Building initial monitor keyring
added entity mds.alpha auth auth(auid = 18446744073709551615 key=AQDeRGdOQBGPLhAAA3owSiBl0H4ozL4dy0H7Rg== with 0 caps)
added entity osd.0 auth auth(auid = 18446744073709551615 key=AQDZRGdOKLL3ABAAlU4NM3xNTe+m/dUXEvKCRw== with 0 caps)
added entity osd.1 auth auth(auid = 18446744073709551615 key=AQDZRGdOAJ1MKhAAY69HXzl8QLxZ3/MCHP2Cnw== with 0 caps)
added entity osd.2 auth auth(auid = 18446744073709551615 key=AQDbRGdOeEtQMhAAHhbE8EuTxqpobHIUR0SCdg== with 0 caps)
added entity osd.3 auth auth(auid = 18446744073709551615 key=AQDeRGdOYImzEhAAHQkcZtR4E8npHlgpAT8NpQ== with 0 caps)
=== mon.alpha ===
here 0 /tmp/mkcephfs.pt2DlXEHkB/conf
/usr/bin/cmon: created monfs at /data/mon.alpha for mon.alpha
placing client.admin keyring in /etc/ceph/keyring.admin

Take a look at how the keys were set up
# Check the state of Server 1 (ubuntu1104-64-5)
root@ubuntu1104-64-5:/etc/ceph$ ll
drwxr-xr-x  2 root root 4096 2011-09-07 18:16 ./
drwxr-xr-x 86 root root 4096 2011-09-07 18:18 ../
-rw-r--r--  1 root root 4455 2011-09-07 17:31 ceph.conf
-rwxr-xr-x  1 root root  392 2011-09-07 11:32 fetch_config*
-rw-------  1 root root   92 2011-09-07 18:18 keyring.admin
-rw-------  1 root root   85 2011-09-07 18:18 keyring.osd.0
-rw-------  1 root root   85 2011-09-07 18:18 keyring.osd.1

root@ubuntu1104-64-5:/etc/ceph$ cauthtool -l keyring.admin
[client.admin]
        key = AQDeRGdOMNL3MhAAuzvelwICjpYhLIk7IMcX2g==
        auid = 18446744073709551615

# Check the state of Server 2 (ubuntu1104-64-6)
root@ubuntu1104-64-6:/etc/ceph$ ll
drwxr-xr-x  2 root root 4096 2011-09-07 18:16 ./
drwxr-xr-x 86 root root 4096 2011-09-07 18:18 ../
-rwxr-xr-x  1 root root  392 2011-09-07 11:56 fetch*
-rw-------  1 root root   85 2011-09-07 18:18 keyring.osd.2
-rw-------  1 root root   85 2011-09-07 18:18 keyring.osd.3


The keys on every node appear to be in place, so let's bring the service up!

root@ubuntu1104-64-5:/tmp$ service ceph -a start
[/etc/ceph/fetch_config /tmp/fetched.ceph.conf.16083]
ceph.conf                                                         100% 4455     4.4KB/s   00:00
=== mon.alpha ===
Starting Ceph mon.alpha on ubuntu1104-64-5...
starting mon.alpha rank 0 at 172.16.33.5:6789/0 mon_data /data/mon.alpha fsid 9cc6b2d5-1eba-50b2-bd43-7b3807ce301b
=== mds.alpha ===
Starting Ceph mds.alpha on ubuntu1104-64-5...
starting mds.alpha at 0.0.0.0:6800/16268
=== osd.0 ===
Mounting Btrfs on ubuntu1104-64-5:/data/osd.0
Scanning for Btrfs filesystems
Starting Ceph osd.0 on ubuntu1104-64-5...
starting osd0 at 0.0.0.0:6801/16371 osd_data /data/osd.0 /data/osd.0/journal
=== osd.1 ===
Mounting Btrfs on ubuntu1104-64-5:/data/osd.1
Scanning for Btrfs filesystems
Starting Ceph osd.1 on ubuntu1104-64-5...
starting osd1 at 0.0.0.0:6804/16464 osd_data /data/osd.1 /data/osd.1/journal
=== osd.2 ===
Mounting Btrfs on ubuntu1104-64-6:/data/osd.2
Scanning for Btrfs filesystems
Starting Ceph osd.2 on ubuntu1104-64-6...
starting osd2 at 0.0.0.0:6800/14475 osd_data /data/osd.2 /data/osd.2/journal
=== osd.3 ===
Mounting Btrfs on ubuntu1104-64-6:/data/osd.3
Scanning for Btrfs filesystems
Starting Ceph osd.3 on ubuntu1104-64-6...
starting osd3 at 0.0.0.0:6803/14676 osd_data /data/osd.3 /data/osd.3/journal


Check the overall status and the authentication list
root@ubuntu1104-64-5:/etc/ceph$ ceph -s
2011-09-07 18:29:24.305413    pg v160: 792 pgs: 792 active+clean; 24 KB data, 112 MB used, 191 GB / 200 GB avail
2011-09-07 18:29:24.307445   mds e4: 1/1/1 up {0=alpha=up:active}
# There are 4 OSDs; all 4 are up and have joined the storage pool
2011-09-07 18:29:24.307483   osd e7: 4 osds: 4 up, 4 in   
2011-09-07 18:29:24.307539   log 2011-09-07 18:29:20.760469 osd3 172.16.33.6:6803/14676 130 : [INF] 1.8c scrub ok
2011-09-07 18:29:24.307617   mon e1: 1 mons at {alpha=172.16.33.5:6789/0}
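
The "4 osds: 4 up, 4 in" line is the one to watch. A minimal sketch that checks it automatically follows; the function name `all_osds_up_in` is made up, and the parsing assumes the `ceph -s` output format shown in this post.

```sh
#!/bin/sh
# all_osds_up_in: read `ceph -s` output on stdin and succeed only if the
# "N osds: U up, I in" line reports the same number for all three counts.
all_osds_up_in() {
    awk '
        / osds: / {
            for (i = 2; i <= NF; i++) {
                if ($i == "osds:") total = $(i - 1)
                if ($i == "up,")   up    = $(i - 1)
                if ($i == "in")    inn   = $(i - 1)
            }
            seen = 1
        }
        END { exit (seen && total == up && total == inn) ? 0 : 1 }
    '
}

# Sample line copied from the transcript above.
sample='2011-09-07 18:29:24.307483   osd e7: 4 osds: 4 up, 4 in'
if printf '%s\n' "$sample" | all_osds_up_in; then
    echo "all OSDs are up and in"
else
    echo "some OSDs are down or out"
fi
```

On a running cluster the check would be `ceph -s | all_osds_up_in`.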

root@ubuntu1104-64-5:/etc/ceph$ ceph auth list
2011-09-07 18:29:41.564151 mon <- [auth,list]
2011-09-07 18:29:41.564718 mon0 -> 'installed auth entries:
mon.
        key: AQDeRGdOiEk2NBAAVHVGzaeOFcgSmbZZ2xPu+w==
mds.alpha
        key: AQDeRGdOQBGPLhAAA3owSiBl0H4ozL4dy0H7Rg==
        caps: [mds] allow
        caps: [mon] allow rwx
        caps: [osd] allow *
osd.0
        key: AQDZRGdOKLL3ABAAlU4NM3xNTe+m/dUXEvKCRw==
        caps: [mon] allow rwx
        caps: [osd] allow *
osd.1
        key: AQDZRGdOAJ1MKhAAY69HXzl8QLxZ3/MCHP2Cnw==
        caps: [mon] allow rwx
        caps: [osd] allow *
osd.2
        key: AQDbRGdOeEtQMhAAHhbE8EuTxqpobHIUR0SCdg==
        caps: [mon] allow rwx
        caps: [osd] allow *
osd.3
        key: AQDeRGdOYImzEhAAHQkcZtR4E8npHlgpAT8NpQ==
        caps: [mon] allow rwx
        caps: [osd] allow *
client.admin
        key: AQDeRGdOMNL3MhAAuzvelwICjpYhLIk7IMcX2g==
        caps: [mds] allow
        caps: [mon] allow *
        caps: [osd] allow *
' (0)


Mount Ceph, starting with the kernel client. For some reason it kept failing with a "No such device" message; I could not resolve it and gave up on this approach
root@ubuntu1104-64-5:/etc/ceph$ mount -t ceph 172.16.33.5:6789:/ /mnt/ceph -v -o name=admin,secret=AQDeRGdOMNL3MhAAuzvelwICjpYhLIk7IMcX2g==
parsing options: rw,name=admin,secret=AQDeRGdOMNL3MhAAuzvelwICjpYhLIk7IMcX2g==
error adding secret to kernel, key name client.admin: No such device.
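
The cause of this error was not tracked down here. One thing that may be worth checking (an assumption, not something verified in these notes) is whether the running kernel has Ceph client support registered at all. The sketch below looks for the `ceph` filesystem type in /proc/filesystems; the function name `kernel_has_fs` is hypothetical.

```sh
#!/bin/sh
# kernel_has_fs NAME [FILE]: succeed if filesystem type NAME is listed in
# FILE (default /proc/filesystems), where the name is the last field.
kernel_has_fs() {
    fs="$1"
    file="${2:-/proc/filesystems}"
    [ -r "$file" ] || return 1
    awk -v fs="$fs" '$NF == fs { found = 1 } END { exit found ? 0 : 1 }' "$file"
}

if kernel_has_fs ceph; then
    echo "ceph filesystem type is registered"
else
    echo "ceph filesystem type is not registered; 'sudo modprobe ceph' may help"
fi
```

Even if the module loads, an older kernel might still reject the secret= option, so this only rules out one possibility.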

Switching to cfuse worked without any problem. Strange. Maybe mount.ceph needs to be updated?
root@ubuntu1104-64-5:/etc/ceph$ cfuse -m 172.16.33.5:6789 /mnt/ceph
 ** WARNING: Ceph is still under development.  Any feedback can be directed  **
 **          at ceph-devel@vger.kernel.org or http://ceph.newdream.net/.     **
cfuse[3506]: starting ceph client
cfuse[3506]: starting fuse
root@ubuntu1104-64-5:/etc/ceph$ df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/mapper/ubuntu1104--64--5-root
                      47328184   5929696  38994344  14% /
none                  12358244       220  12358024   1% /dev
none                  12366300         0  12366300   0% /dev/shm
none                  12366300        60  12366240   1% /var/run
none                  12366300         0  12366300   0% /var/lock
/dev/sda1               233191     45262    175488  21% /boot
/dev/mapper/ubuntu1104--64--5-lvol2
                      52428800     30244  50275500   1% /data/osd.0
/dev/mapper/ubuntu1104--64--5-lvol0
                      52428800     31272  50274416   1% /data/osd.1
cfuse                209715200   8616960 201098240   5% /mnt/ceph

How to add a single OSD on its own

See http://ceph.newdream.net/wiki/OSD_cluster_expansion/contraction
In practice there is a prerequisite: /etc/ceph/keyring.admin must first be copied to the new machine, otherwise these commands cannot be run
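
The transcript below starts at mkfs.btrfs and already has /tmp/monmap in place. A sketch of the preparation that is not shown could look like this, run on the new OSD host. It assumes the admin keyring lives on 172.16.33.5 and that the monmap was fetched with `ceph mon getmap`; the `run` helper and `DRY_RUN` switch are hypothetical so the commands can be reviewed first.

```sh
#!/bin/sh
ADMIN_HOST="172.16.33.5"        # server that holds keyring.admin
OSD_DATA="/data/osd.3"          # data directory for the new OSD
DRY_RUN="${DRY_RUN:-1}"         # 1 = only print the commands

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

prep_new_osd_host() {
    # The ceph CLI on this host needs the admin key to talk to the monitor.
    run scp "root@$ADMIN_HOST:/etc/ceph/keyring.admin" /etc/ceph/keyring.admin
    # Fetch the current monitor map for use with cosd --mkfs --monmap.
    run ceph mon getmap -o /tmp/monmap
    # Make sure the mount point for the new OSD exists.
    run mkdir -p "$OSD_DATA"
}

prep_new_osd_host
```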

root@ubuntu1104-64-6:/etc/ceph$ mkfs.btrfs /dev/mapper/ubuntu1104--64--6-lvol1
WARNING! - Btrfs Btrfs v0.19 IS EXPERIMENTAL
WARNING! - see http://btrfs.wiki.kernel.org before using
fs created label (null) on /dev/mapper/ubuntu1104--64--6-lvol1
        nodesize 4096 leafsize 4096 sectorsize 4096 size 50.00GB
Btrfs Btrfs v0.19

root@ubuntu1104-64-6:/etc/ceph$ mount /dev/mapper/ubuntu1104--64--6-lvol1 /data/osd.3

root@ubuntu1104-64-6:/etc/ceph$ cosd -i 3 --mkfs --monmap /tmp/monmap --mkkey
 ** WARNING: Ceph is still under development.  Any feedback can be directed  **
 **          at ceph-devel@vger.kernel.org or http://ceph.newdream.net/.     **
2011-09-07 15:17:17.756554 7ff1ca3a4760 created object store /data/osd.3 journal /data/osd.3/journal for osd3 fsid d9dbbbfc-12ec-7d89-49cd-c91d6c598715
2011-09-07 15:17:17.756987 7ff1ca3a4760 created new key in keyring /etc/ceph/keyring.osd.3


root@ubuntu1104-64-6:/etc/ceph$ ceph auth add osd.3 osd 'allow *' mon 'allow rwx' -i /etc/ceph/keyring.osd.3
2011-09-07 15:17:55.574961 7f7d4bcc5740 read 85 bytes from /etc/ceph/keyring.osd.3
2011-09-07 15:17:55.578720 mon <- [auth,add,osd.3,osd,allow *,mon,allow rwx]
2011-09-07 15:17:55.790027 mon0 -> 'added key for osd.3' (0)


root@ubuntu1104-64-6:/etc/ceph$ ceph osd setmaxosd 4
2011-09-07 15:19:50.847283 mon <- [osd,setmaxosd,4]
2011-09-07 15:19:51.210703 mon0 -> 'set new max_osd = 4' (0)

root@ubuntu1104-64-6:/etc/ceph$ service ceph start osd.3
[/etc/ceph/fetch_config /tmp/fetched.ceph.conf.9423]
ceph.conf                                                         100% 4454     4.4KB/s   00:00
=== osd.3 ===
Mounting Btrfs on ubuntu1104-64-6:/data/osd.3
Scanning for Btrfs filesystems
Starting Ceph osd.3 on ubuntu1104-64-6...
starting osd3 at 0.0.0.0:6803/9532 osd_data /data/osd.3 /data/osd.3/journal

root@ubuntu1104-64-6:/etc/ceph$ ceph -s
2011-09-07 15:21:52.259399    pg v112: 594 pgs: 594 active+clean; 24 KB data, 65856 KB used, 191 GB / 200 GB avail
2011-09-07 15:21:52.260887   mds e4: 1/1/1 up {0=alpha=up:active}
2011-09-07 15:21:52.260924   osd e7: 4 osds: 4 up, 4 in
2011-09-07 15:21:52.260979   log 2011-09-07 15:21:48.274685 osd2 172.16.33.6:6800/9040 127 : [INF] 1.1p2 scrub ok
2011-09-07 15:21:52.261057   mon e1: 1 mons at {alpha=172.16.33.5:6789/0}

