Sharing

Saturday, February 25, 2012

30 days with the cloud


http://www.pcworld.com/businesscenter/article/243128-2/30_days_with_the_cloud.html

This author is pretty interesting: he spends 30 days on a topic and keeps a journal about it, and he has already written quite a few journals in this format; this time the topic he picked is the cloud. I think the series is well worth reading, because you can see what an ordinary user is thinking and what problems he runs into. Big or small, those problems are genuinely what troubles people, and isn't solving those troubles exactly why we engineers exist? :) Besides, this is not an average user either, so the points he questions and the things he tries out go fairly deep. Take a look!

Tuesday, February 14, 2012

Network traffic monitoring with ntop


Installing ntop on Ubuntu is very convenient: just use apt-get. Along the way it asks you to enter a password for the admin account.

wistor@wistor-003:~$ sudo apt-get install ntop
[sudo] password for wistor:
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following extra packages will be installed:
  javascript-common libdbi1 libjs-mochikit librrd4 ntop-data python-mako python-markupsafe ttf-dejavu ttf-dejavu-extra
  wwwconfig-common
Suggested packages:
  graphviz gsfonts geoip-database-contrib python-beaker python-mako-doc mysql-client postgresql-client apache2
The following NEW packages will be installed:
  javascript-common libdbi1 libjs-mochikit librrd4 ntop ntop-data python-mako python-markupsafe ttf-dejavu
  ttf-dejavu-extra wwwconfig-common
0 upgraded, 11 newly installed, 0 to remove and 1 not upgraded.
Need to get 5,864 kB of archives.
After this operation, 16.8 MB of additional disk space will be used.
Do you want to continue [Y/n]? y
Get:1 http://tw.archive.ubuntu.com/ubuntu/ oneiric/universe wwwconfig-common all 0.2.2 [18.0 kB]
Get:2 http://tw.archive.ubuntu.com/ubuntu/ oneiric/universe javascript-common all 8 [4,208 B]
Get:3 http://tw.archive.ubuntu.com/ubuntu/ oneiric/main libdbi1 amd64 0.8.4-5.1 [28.5 kB]
Get:4 http://tw.archive.ubuntu.com/ubuntu/ oneiric/universe libjs-mochikit all 1.4.2-3fakesync1 [376 kB]
...
Fetched 5,864 kB in 2s (2,346 kB/s)
Preconfiguring packages ...
Selecting previously deselected package wwwconfig-common.
(Reading database ... 64249 files and directories currently installed.)
Unpacking wwwconfig-common (from .../wwwconfig-common_0.2.2_all.deb) ...
Selecting previously deselected package javascript-common.
Unpacking javascript-common (from .../javascript-common_8_all.deb) ...
Selecting previously deselected package libdbi1.
Unpacking libdbi1 (from .../libdbi1_0.8.4-5.1_amd64.deb) ...
Selecting previously deselected package libjs-mochikit.
Unpacking libjs-mochikit (from .../libjs-mochikit_1.4.2-3fakesync1_all.deb) ...
Selecting previously deselected package librrd4.
Unpacking librrd4 (from .../librrd4_1.4.3-3.1ubuntu2_amd64.deb) ...
Selecting previously deselected package ntop-data.
Unpacking ntop-data (from .../ntop-data_3%3a4.0.3+dfsg1-3build1_all.deb) ...
Selecting previously deselected package python-markupsafe.
Unpacking python-markupsafe (from .../python-markupsafe_0.12-2build1_amd64.deb) ...
Selecting previously deselected package python-mako.
Unpacking python-mako (from .../python-mako_0.4.1-2_all.deb) ...
Selecting previously deselected package ntop.
Unpacking ntop (from .../ntop_3%3a4.0.3+dfsg1-3build1_amd64.deb) ...
Selecting previously deselected package ttf-dejavu-extra.
Unpacking ttf-dejavu-extra (from .../ttf-dejavu-extra_2.33-1ubuntu1_amd64.deb) ...
Selecting previously deselected package ttf-dejavu.
Unpacking ttf-dejavu (from .../ttf-dejavu_2.33-1ubuntu1_amd64.deb) ...
Processing triggers for man-db ...
Processing triggers for ureadahead ...
ureadahead will be reprofiled on next reboot
Processing triggers for fontconfig ...
Setting up wwwconfig-common (0.2.2) ...
Setting up javascript-common (8) ...
Setting up libdbi1 (0.8.4-5.1) ...
Setting up libjs-mochikit (1.4.2-3fakesync1) ...
Setting up librrd4 (1.4.3-3.1ubuntu2) ...
Setting up ntop-data (3:4.0.3+dfsg1-3build1) ...
Setting up python-markupsafe (0.12-2build1) ...
Setting up python-mako (0.4.1-2) ...
Setting up ntop (3:4.0.3+dfsg1-3build1) ...
Adding system user: ntop.
Warning: The home dir /var/lib/ntop you specified already exists.
Adding system user `ntop' (UID 106) ...
Adding new group `ntop' (GID 114) ...
Adding new user `ntop' (UID 106) with group `ntop' ...
The home directory `/var/lib/ntop' already exists.  Not copying from `/etc/skel'.
adduser: Warning: The home directory `/var/lib/ntop' does not belong to the user you are currently creating.
Wed Feb 15 14:31:59 2012  NOTE: Interface merge enabled by default
Wed Feb 15 14:31:59 2012  Initializing gdbm databases
Wed Feb 15 14:31:59 2012  Setting administrator password...
Wed Feb 15 14:31:59 2012  Admin user password has been set
Wed Feb 15 14:31:59 2012  Admin password set...
Starting network top daemon: Wed Feb 15 14:32:00 2012  NOTE: Interface merge enabled by default
Wed Feb 15 14:32:00 2012  Initializing gdbm databases
ntop
Setting up ttf-dejavu-extra (2.33-1ubuntu1) ...
Setting up ttf-dejavu (2.33-1ubuntu1) ...
Processing triggers for libc-bin ...
ldconfig deferred processing now taking place


Once installed, the default port number is 3000, so browse to http://<server-ip>:3000 to see the results.
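
Incidentally, the port is not fixed: when ntop is launched by hand, -w sets the web server port and -i selects the interface. A hedged example (flags as in the ntop 4.x installed above; double-check with ntop --help):

sudo ntop -i eth0 -w 4000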



But I immediately noticed that by default only eth0 is there. To switch to a different NIC, you have to edit the config file /var/lib/ntop/init.cfg:

root@wistor-007:/var/lib/ntop$ cat /var/lib/ntop/init.cfg
USER="ntop"
INTERFACES="eth0,eth1"

root@wistor-006:~$ sudo /etc/init.d/ntop restart
Stopping network top daemon: ntop
Starting network top daemon: Wed Feb 15 14:39:35 2012  NOTE: Interface merge enabled by default
Wed Feb 15 14:39:35 2012  Initializing gdbm databases
ntop


After the restart, go back into the page and eth1 shows up as well.



Next, remember to turn on the NetFlow plugin, and then you can switch between eth0 and eth1 smoothly.








Monday, February 6, 2012

NTP configuration notes

http://linux.vbird.org/linux_server/0440ntp.php

Commonly used NTP servers in Taiwan:


  • tick.stdtime.gov.tw
  • tock.stdtime.gov.tw
  • time.stdtime.gov.tw
  • clock.stdtime.gov.tw
  • watch.stdtime.gov.tw

NTP uses UDP port 123 as its service port.
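
Since it is UDP port 123, remember to let it through any packet filter in between; with iptables, for instance, something along these lines (a sketch only, adapt to your own chains and policy):

sudo iptables -A INPUT -p udp --dport 123 -j ACCEPT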

NTP has the concept of strata, which spreads the load and also determines how accurate the time is.

Stratum 1 (primary) time servers: http://support.ntp.org/bin/view/Servers/StratumOneTimeServers
Stratum 2 (secondary) time servers: http://support.ntp.org/bin/view/Servers/StratumTwoTimeServers

Configuration

restrict [your_IP] mask [netmask_IP] [parameter]


The main values that can go in the parameter slot are these:

http://linux.die.net/man/5/ntp.conf
http://linux.die.net/man/5/ntp_acc

http://support.ntp.org/bin/view/Support/AccessRestrictions

ignore -- refuse NTP connections of every kind;
nomodify -- clients may not use ntpdc or ntpq to modify the server's time parameters, but they can still synchronize time against this host;
noquery -- clients may not use ntpq, ntpdc, etc. to query the server (note this blocks status queries only; plain time synchronization still works);
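
Putting the syntax and the parameters together, a line that lets a LAN sync time off you while blocking status queries and modification might look like this (subnet purely illustrative):

restrict 192.168.1.0 mask 255.255.255.0 nomodify notrap noquery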

I'm not sure whether it's a typo on 鳥哥's part or just the age of the article, but the official site actually refers to ntpdc and ntpq, not ntpd. I pulled up the man pages while I was at it:

http://linux.die.net/man/8/ntpdc
http://linux.die.net/man/8/ntpq

pjack@ubuntu:~$ ntpdc -p
     remote           local      st poll reach  delay   offset    disp
=======================================================================
=europium.canoni 192.168.0.3      2   64    3 0.30498  0.007932 1.98438
=211.79.171.150  192.168.0.3      3   64    3 0.05931  0.040888 1.98438
=220-133-13-3.HI 192.168.0.3      2   64    3 0.02657  0.004869 1.98438
=59-124-196-84.H 192.168.0.3      2   64    3 0.02979  0.008247 1.98438
=tock.stdtime.go 192.168.0.3      2   64    3 0.01994  0.017888 1.98438

pjack@ubuntu:~$ ntpdc -l
client    europium.canonical.com
client    211.79.171.150
client    220-133-13-3.HINET-IP.hinet.net
client    59-124-196-84.HINET-IP.hinet.net
client    tock.stdtime.gov.tw

pjack@ubuntu:~$ ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 59-124-196-83.H 59.124.196.87    2 u   16   64    7   30.366    4.155   2.271
 220-133-13-3.HI 192.43.244.18    2 u   14   64    7   26.719    2.564   2.793
 tock.stdtime.go 59.124.196.87    2 u   13   64    7   17.852    6.259   1.371
 59-124-196-84.H 59.124.196.87    2 u   14   64    7   58.868  -17.627  10.498
 europium.canoni 193.79.237.14    2 u   11   64    7  302.951   -2.471   2.228

pjack@ubuntu:~$ ntpq -pn
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 59.124.196.83   59.124.196.87    2 u   37   64    3   30.188    2.385   0.910
 220.133.13.3    192.43.244.18    2 u   36   64    3   26.420    0.641   1.528
 220.130.158.71  59.124.196.87    2 u   34   64    3   21.191    6.301   1.981
 59.124.196.84   59.124.196.87    2 u   35   64    3   72.039  -25.230  20.354
 91.189.94.4     193.79.237.14    2 u   33   64    3  302.555   -3.432   2.041


pjack@ubuntu:~$ more /etc/ntp.conf

# Specify one or more NTP servers.

server 0.ubuntu.pool.ntp.org
server 1.ubuntu.pool.ntp.org
server 2.ubuntu.pool.ntp.org
server 3.ubuntu.pool.ntp.org

# Use Ubuntu's ntp server as a fallback.
server ntp.ubuntu.com

# add the ntp servers to synchronize against

# add the local clock as a fallback, at stratum 10 (15 is the maximum)
server 127.127.1.0
fudge 127.127.1.0 stratum 10

# Access control configuration; see /usr/share/doc/ntp-doc/html/accopt.html for
# details.  The web page < http://support.ntp.org/bin/view/Support/AccessRestrictions>

# lock everything down by default; this can also be written as
# restrict default ignore
restrict -4 default kod notrap nomodify nopeer noquery
restrict -6 default kod notrap nomodify nopeer noquery

# allow clients on the same LAN, but without modify rights
restrict 192.168.100.0 mask 255.255.255.0 nomodify notrap

# allow ourselves
# Local users may interrogate the ntp server more closely.
restrict 127.0.0.1
restrict ::1

pjack@ubuntu:~$ sudo netstat -tlunp | grep ntp
[sudo] password for pjack:
udp        0      0 192.168.0.3:123         0.0.0.0:*                           1947/ntpd
udp        0      0 127.0.0.1:123           0.0.0.0:*                           1947/ntpd
udp        0      0 0.0.0.0:123             0.0.0.0:*                           1947/ntpd
udp6       0      0 ::1:123                 :::*                                1947/ntpd
udp6       0      0 fe80::20c:29ff:fe78:123 :::*                                1947/ntpd
udp6       0      0 :::123                  :::*                                1947/ntpd

pjack@ubuntu:~$ ntptime
ntp_gettime() returns code 0 (OK)
  time d2da6bc3.34d94000  Mon, Feb  6 2012  7:09:55.206, (.206440),
  maximum error 239875 us, estimated error 6508 us, TAI offset 0
ntp_adjtime() returns code 0 (OK)
  modes 0x0 (),
  offset 0.000 us, frequency 18.986 ppm, interval 1 s,
  maximum error 239875 us, estimated error 6508 us,
  status 0x1 (PLL),
  time constant 6, precision 1.000 us, tolerance 500 ppm,
pjack@ubuntu:~$ ntptrace
localhost: stratum 3, offset 0.000000, synch distance 0.017913
220.130.158.71: timed out, nothing received
***Request timed out



ntpq -p lists our NTP daemon's upstream servers and their status. The columns mean:

remote: the IP or hostname of the upstream NTP host. Note the symbol at the far left:
a '*' marks the upstream server currently in use for synchronization;
a '+' marks a server that is also connected and is a candidate to provide the next time update.
refid: the address of the upstream NTP host that this peer itself references
st: the stratum of the peer
when: how many seconds ago the last synchronization exchange took place
poll: how many seconds until the next update
reach: an octal bitmask of the last eight polls of the upstream server (377 means all eight succeeded), not a plain request counter
delay: network round-trip delay to the peer, in milliseconds
offset: the measured time offset against the peer, in milliseconds
jitter: the spread among recent offset samples, in milliseconds (smaller means a more stable measurement)



pjack@ubuntu:~$ sudo service ntp stop
 * Stopping NTP server ntpd
pjack@ubuntu:~$ sudo ntpdate tick.stdtime.gov.tw
 6 Feb 07:28:08 ntpdate[2132]: adjust time server 59.124.196.83 offset -0.001831 sec
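
ntpd and ntpdate both bind UDP port 123, which is why the daemon must be stopped before ntpdate can run; once the clock has been stepped, start it again. A hedged sketch (output from memory, the exact wording may differ):

pjack@ubuntu:~$ sudo service ntp start
 * Starting NTP server ntpd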



Addendum:
The ntp.org site:
http://www.ntp.org/
The Wikipedia page, which covers the algorithm and the clock strata:
http://en.wikipedia.org/wiki/Network_Time_Protocol#Clock_strata

The clearest configuration tutorial in English:

http://www.brennan.id.au/09-Network_Time_Protocol.html


A slightly more advanced one, also in English:

http://www.akadia.com/services/ntp_synchronize.html


Friday, February 3, 2012

Evolution of the Storage Brain notes (2)


Chapter 4. A Journey to the Center of the Storage Brain

The Past: Refrigerators and Cards
  • Skip
The Middle Ages: The Age of RAID Controllers

    RAID controllers combined the basic functionality of disk controllers with the ability to group drives together for added performance and reliability
    • Communication between the underlying disks and the attached host computer
    • Protection against disk failure through the creation of one or more RAID groups
    • Dual-controller systems, which could be used for added performance, availability and automated failover
    The Modern Age: Storage Array Controllers
    • The storage industry became very volatile during the 90s
    • Modern-day networked storage was born. 
    • EMC and NetApp capitalized on increased storage intelligence and grew at record paces
    Enterprise Array Controllers: A More Intelligent Brain

    As the storage industry moved from the late 90s into the mid 00s, the market stabilized into a smaller albeit more mature set of suppliers

    Table: Data Management Tasks Addressed by Modern Arrays

    Performance
    • Storage arrays are required to quickly process a multitude of data I/O requests arriving simultaneously from hundreds (or thousands) of desktops and servers
    Resiliency
    • Automated RAID Rebuilds
    • Data Integrity Checks
    • "Phone Home" Alerts
    • Environmental Monitoring
    • Non-disruptive Software Updates
    Virtualization
    • Transparent Volume/LUN resizing
    • Thin Provisioning
    • Data Cloning
    • Data Compression
    • Data Deduplication

    Chapter 5.  Without Memory, You Don’t Have a Brain

    Human vs. PC memory:
    • Sensory memory ↔ buffers and registers
    • Short-term memory ↔ cache, Random Access Memory (RAM), flash and solid-state disks (SSDs)
    • Long-term memory ↔ hard disk drives

    Era and memory type:
    • 40s-50s: Cathode Ray Tubes
    • 50s-60s: Magnetic Core Memory
    • 70s-present: Random Access Memory (RAM) integrated circuits

    NetApp added some interesting algorithm-level intelligence to its PAM cache:
    • Priority-based caching
    • Non-redundant data caching
    • Predictive caching
    • Immediate caching
    • Metadata caching
    Here are the trends we'll see in the industry's efforts to reach this goal:
    • Application servers and workstations will cache more and more of their own data
    • Storage networking switches will cache more and more data as it travels through their path
    • Storage systems will cache more and more front-end data
    • SSDs will become the first line of defense for large amounts of data that can’t be stored in cache
    • Hybrid disk drives will be the final step
    Chapter 6. The Storage Nervous System

    The differences between NAS, SAN, and iSCSI: after reading this many explanations, I still think 鳥哥's diagram is the best.




    NAS Communications Protocols
    • NFS: the Network File System protocol, used by UNIX-based clients or servers
    • CIFS: the Common Internet File System, another network-based protocol used for file access communications between Microsoft Windows clients and servers
    SAN Communications Protocols
    • Fibre Channel Protocol (FCP): typically runs over specialized Fibre Channel cabling, Fibre Channel host bus adapters (HBAs) and Fibre Channel switches operating between the SAN and its hosts
    • iSCSI: the Internet Small Computer Systems Interface, "a transport protocol that provides for the SCSI protocol to be carried over a TCP-based IP network"
    • FCoE: Fibre Channel over Ethernet, which allows Fibre Channel storage traffic to be sent over Ethernet networks

      What is a packet collision in a network?

      • A network collision occurs when more than one device attempts to send a packet on a network segment at the same time
      • Collisions are resolved using carrier sense multiple access with collision detection in which the competing packets are discarded and re-sent one at a time
        • This becomes a source of inefficiency in the network
      • Collision domains are found in a hub environment
      • Collision domains are also found in wireless networks such as Wi-Fi.
      • Modern wired networks use a network switch to eliminate collisions.

      Early Heritage Evolved into Separate Paths & Growing Confusion


      10 Megabit per second (Mbps) Ethernet was the norm, with 100 Mbps Ethernet just emerging.
      SANs came out of the gate with fiber optic cables and a protocol that could move data at 1 Gigabit per second (Gbps), a ten-fold improvement over the fastest Ethernet-based NAS transport
      • anyone with a “need for speed” simply had to use SAN storage
        • Databases, transaction processing, analytics, and similar applications fell into this category
      • NAS, although less costly and easier to implement than SAN, was usually relegated to “slower” applications
        • User files, images and Web content
      New Realities Point to SAN/NAS Convergence and Unification 


      10-Gigabit Ethernet (10-GbE or 10 Gbps) is common. 100-Gigabit Ethernet (100-GbE or 100 Gbps) devices have also been demonstrated.  Fibre Channel networks operating at 4-Gbps are common today, with 8-Gbps having also been newly delivered.


      • Unifying block and file data transport onto the same network fabric
      • Unifying SAN and NAS data storage functionality onto the same storage system.



      Advancing Intelligence for Internal Communications

      SAN Virtual Storage
      • add a second logical abstraction layer to the physical disk drives
      • did not map LUNs to physical drives, it mapped LUNs to the logical blocks stored on these drives
      NAS Virtual Storage
      • NetApp had created an intelligent internal communications network specifically designed for storage systems. The WAFL file system provided a communications network based on logical file-based objects



      The next breakthroughs in storage protocols and communication will be object-based storage, SONET, and RDMA



      Thursday, February 2, 2012

      Homomorphic Encryption and Unhosted

      Privacy in the cloud matters a great deal. How can the service provider solve your problem without you revealing too much real data to them? That is the answer everyone is now working hard to find. It feels like treating a patient remotely: the doctor cures your illness without knowing who you are or what you look like. Or like hiring someone to ship goods: they only have to deliver them, and you never have to tell them what is being shipped.

      These two pieces have some information, though honestly I was still somewhat lost after reading them:

      http://www.openfoundry.org/tw/foss-forum/8599-tech-of-protecting-privacy-on-the-cloud-homomorphic-encryption-and-unhosted

      http://ckhung0.blogspot.com/2011/09/homomorphic-encryption-and-unhosted.html

      It is easier to first read the explanation below of what Homomorphic Encryption is; going back to the articles afterwards, they should make sense.

      Homomorphic Encryption
      Generally speaking, to anyone who does not hold the key, a ciphertext is a uniformly distributed random string on which no meaningful computation can be performed. In certain special situations, however, the encrypting party wants to be able to compute directly on the ciphertext, without the encryption key, and produce an effect that corresponds to the plaintext.
      For example, the * operation on ciphertexts corresponding to the + operation on plaintexts:
      If c = a + b, then E(c) = E(a) * E(b)
      An encryption scheme E() with this property is called homomorphic encryption.
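
      To make the property concrete, here is a toy run of Paillier, a real additively homomorphic scheme, with deliberately tiny primes and fixed "random" nonces. This is my own illustration, not something from the articles above, and it needs Python 3.8 or later for the modular-inverse form of pow():

      pjack@ubuntu:~$ cat paillier_toy.py
      from math import gcd
      p, q = 5, 7                                   # toy primes; real keys use ~1024-bit primes
      n = p * q; n2 = n * n; g = n + 1              # Paillier public key (n, g)
      lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)  # lcm(p-1, q-1), the private key
      L = lambda x: (x - 1) // n
      mu = pow(L(pow(g, lam, n2)), -1, n)           # decryption constant (Python 3.8+)
      E = lambda m, r: (pow(g, m, n2) * pow(r, n, n2)) % n2   # encrypt m with nonce r
      D = lambda c: (L(pow(c, lam, n2)) * mu) % n             # decrypt
      a, b = 9, 16
      print(D(E(a, 2) * E(b, 3) % n2))              # multiplying ciphertexts adds plaintexts
      pjack@ubuntu:~$ python3 paillier_toy.py
      25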




      From the original article: in cloud computing, the "computing" does not necessarily have to happen in the "cloud".



      Still, the idea of the Unhosted application feels somehow off to me. Reading the original article literally, it seems to imply that the "computation" all moves to local execution, which rather contradicts the cloud concept. The article also mentions:


      (1) your own data lives in the cloud (2) everyone shares one set of software that comes from the cloud


      But the cloud is not only that. What matters even more is that the provider supplies massive hardware and computing power for everyone to share, so that not every household needs a high-performance machine. In the future everyone will hold a phone or a pod, and even a ten-year-old model will still be able to use these services, because most of the computation happens in the cloud and the data lives there too. So I think Unhosted applications / Unhosted accounts can only be used in a small, perhaps very small, range of cases; or else they still have to be paired with Homomorphic Encryption rather than exist on their own.

      I also dug up a few explanations in English:

      http://i4bi.org/?p=344
      http://nezerbahn.wordpress.com/2011/01/02/2011-is-the-year-of-unhosted/
      http://www.readwriteweb.com/cloud/2010/12/unhosted.php?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+readwriteweb+%28ReadWriteWeb%29

      An explanation of what an Unhosted application is:
      http://lwn.net/Articles/424822/

      Unhosted is a new project attempting to break the monopoly that SaaS providers have over users' data by separating applications from data
      http://unhosted.org/index.html


      Having read all of this, I think the first two links' interpretation of Unhosted is not entirely right. From start to finish, the most important concept in Unhosted is simply that the "data" and the "program" are managed separately, so that we can control and protect the data better; it does not say computation must move to local execution. In fact, a shadow of this division of labor has already appeared in Amazon EC2 & EBS. The data flows like this:
                                                   Data                                 Result
                        Storage Provider  ----->  Compute Provider  --------> User
      or alternatively:

                                                   Data            Data                               Result
                        Storage Provider  ----->  User -----> Compute Provider  --------> User


      Programs that can run entirely locally are generally still the small ones; large applications are unlikely to be done this way. So I see the Unhosted concept as a first step that helps us build more security mechanisms, while ultimate confidentiality still has to rely on a suitable encryption algorithm.




      Ceph with ext4

      Ceph uses btrfs by default, so on the first initialization you can have the storage prepared for you via a flag:

      mkcephfs -a -c /etc/ceph/ceph.conf --mkbtrfs
      

      Switching to ext4, however, is not as painless: you first have to prepare an ext4 partition on every osd yourself, and mount it at the correct path.
      According to the description on the website:
      The ext4 partition must be mounted with -o user_xattr or else mkcephfs will fail. Also using noatime,nodiratime boosts performance at no cost. When using ext4, you should disable the ext4 journal

      root@wistor-dev-7:~$ mke2fs -t ext4 /dev/mapper/ubuntu64--33--7-lvol0
      mke2fs 1.41.14 (22-Dec-2010)
      Filesystem label=
      OS type: Linux
      Block size=4096 (log=2)
      Fragment size=4096 (log=2)
      Stride=0 blocks, Stripe width=0 blocks
      13107200 inodes, 52428800 blocks
      2621440 blocks (5.00%) reserved for the super user
      First data block=0
      Maximum filesystem blocks=4294967296
      1600 block groups
      32768 blocks per group, 32768 fragments per group
      8192 inodes per group
      Superblock backups stored on blocks:
              32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
              4096000, 7962624, 11239424, 20480000, 23887872
      
      Writing inode tables: done
      Creating journal (32768 blocks): done
      Writing superblocks and filesystem accounting information: done
      
      This filesystem will be automatically checked every 28 mounts or
      180 days, whichever comes first.  Use tune2fs -c or -i to override.
      
      root@wistor-dev-7:~$ tune2fs -o journal_data_writeback /dev/mapper/ubuntu64--33--7-lvol0
      tune2fs 1.41.14 (22-Dec-2010)
      
      root@wistor-dev-7:~$ tune2fs -O ^has_journal /dev/mapper/ubuntu64--33--7-lvol0
      tune2fs 1.41.14 (22-Dec-2010)
      
      root@wistor-dev-7:~$ e2fsck -f /dev/mapper/ubuntu64--33--7-lvol0
      e2fsck 1.41.14 (22-Dec-2010)
      Pass 1: Checking inodes, blocks, and sizes
      Pass 2: Checking directory structure
      Pass 3: Checking directory connectivity
      Pass 4: Checking reference counts
      Pass 5: Checking group summary information
      /dev/mapper/ubuntu64--33--7-lvol0: 11/13107200 files (0.0% non-contiguous), 837781/52428800 blocks
      
      

      To have this partition mounted automatically at every boot, edit /etc/fstab:

      root@wistor-dev-7:~$ cat /etc/fstab
      proc            /proc           proc    nodev,noexec,nosuid 0       0
      /dev/mapper/ubuntu64--33--7-root /               ext4    errors=remount-ro 0       1
      
      # /boot was on /dev/sda1 during installation
      UUID=bf0a72da-7ca7-4960-9a7e-f90298b95609 /boot           ext2    defaults        0       2
      
      /dev/mapper/ubuntu64--33--7-swap_1 none            swap    sw              0       0
      
      # add this line
      /dev/mapper/ubuntu64--33--7-lvol0 /srv/osd.2  ext4 errors=remount-ro,data=writeback,noatime,nodiratime,user_xattr  0  1
      
      
      root@wistor-dev-7:~$ mount -a
      root@wistor-dev-7:~$ mount
      /dev/mapper/ubuntu64--33--7-lvol0 on /srv/osd.2 type ext4 (rw,noatime,nodiratime,errors=remount-ro,data=writeback,user_xattr)
      
      






      Then edit /etc/ceph/ceph.conf:

      [osd]
              ; This is where the btrfs volume will be mounted.
              osd data = /srv/osd.$id
      
      [osd.0]
              host = wistor-dev-5
      
      #        remove the line that used to specify the btrfs devs
      #        btrfs devs = /dev/mapper/ubuntu1104--64--5-lvol0
      
      [osd.1]
              host = wistor-dev-6
      
      #        remove the line that used to specify the btrfs devs
      #        btrfs devs = /dev/mapper/wistor--dev--6-lvol0
      
      [osd.2]
              host = wistor-dev-7
      
      #        remove the line that used to specify the btrfs devs
      #        btrfs devs = /dev/mapper/ubuntu64--33--7-lvol0
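
      With the ext4 partitions mounted on every osd and the btrfs devs lines removed, re-initialization should be just the same mkcephfs run as at the top of this post, minus the --mkbtrfs flag (a hedged guess based on that command; check mkcephfs --help for your version):

      mkcephfs -a -c /etc/ceph/ceph.conf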
      
      


      Wednesday, February 1, 2012

      Evolution of the Storage Brain notes (1)

      Chapter 1. And Then There Was Disk


      Disk drives first came to market in the late 50s and 60s, and have relied on the following core components:

      • Read/write heads that use electrical impulses to store and retrieve magnetically recorded bits of data
      • Magnetically coated disk platters that spin and house these bits 
      • Mechanical actuator arms that move the heads back and forth across the spinning disk platters, forming concentric ‘tracks’ of recorded data
      The Past: Disk Drives Prior to 1985

      The Early Days of Disk Drive Communications
      • Bus-and-Tag Systems
        • via two copper-wire cables: Bus and Tag, Bus for data, Tag for communication protocols
      • SMD Disk Drives
        • Control Data Corporation first shipped its minicomputer with storage module device (SMD) disk drives in late 1973
        • Used much smaller 'A' and 'B' flat cables to transfer control instructions (from the A cable) and data (from the B cable)
        • The disk controller shrank down to a single board which was inserted into the system‘s CPU card cage
      The Middle Ages: Disk Drives in the 80s -90s


      Disk drives in the late 80s and 90s went through a number of significant transformations that allowed them to be widely used in the emerging open systems world of servers and personal computers. These included:

      • 19” disk drives => less expensive 5.25” (and, eventually, 3.5”) drives.
      • Advances associated with redundant arrays of independent disks (RAID) technology. 
      • The development and widespread adoption of the Small Computer Systems Interface (SCSI). 

      Disk drives produced today fall into four categories, depending on their cable connections
      • SCSI
      • Fibre Channel
      • Serial ATA (SATA)
      • Serial Attached SCSI (SAS)
      SCSI used a single data cable to present its Common Command Set (CCS) interface with built-in intelligence.


      Today: Disk Drives, Pork Bellies and Price Tags

      Are Disk Drives in Our Future?

      Today‘s latest battle cry is that solid state disks (SSDs) will completely replace magnetic disk storage

      Research into the area of higher capacities for magnetic disk drives

      • Perpendicular Magnetic Recording (PMR)
        • Multi-layered storage: what was originally a flat plane becomes 3D
      • Patterned Media Recording
      • Heat-Activated Magnetic Recording
        • relies on first heating the media so that it can store smaller bits of data per square inch
        • PMR appears to be winning the short-term race
      • Nanostorage
      Chapter 2. “Oh, @#$%!” 

      Address protection in two separate chapters
      • Protecting against disk drive failure
        • users can continue to access data previously stored on failed disks
      • Protecting against data loss or corruption
        • moves more deeply into storage software intelligence

      The Past: Protecting SLEDs

      When a drive crashed, data was recovered from tape and restored back to a new disk


      The Middle Ages: RAID in the 80s

      A 1987 paper called "A Case for Redundant Arrays of Inexpensive Disks" was published, and RAID was born. The paper set out to:

      • provide greater efficiency and faster I/O performance
      • successfully survive a failure of any one disk drive
      • describe five different RAID methods (RAID 1 through RAID 5)


      RAID Technique / Description:
      No Parity (RAID 0)
      • increase I/O performance by striping (or logically distributing) data across several disk drives
      • offered no protection against failed disk drives
      Mirroring (RAID 1)
      • data is mirrored onto a second set of disks
      • exacts a high capacity penalty
      Fixed Parity (RAID 3 / RAID 4)
      • both use parity calculations (sometimes known as checksum) to perform error-checking
      • recovery of missing data from failed drives
      Striped Parity (RAID 5)
      • parity is striped (or logically distributed) across all disks in the RAID set in an attempt to boost RAID read/write performance
      Multiple Parity (RAID 6)
      • using multiple iterations of fixed or striped parity on a group of drives, which allows for multiple drive failures without data loss


      The Future: Smarter, Self-Healing Disk Drives

      S.M.A.R.T. technology

      Chapter 3. Virus? What Virus? 

      Approaches to Data Loss or Corruption


      Approach / Description:
      • Data replication: mirroring critical data to an alternate location
      • Data backup: restoring data that may have been accidentally deleted, or an earlier data version


      The Past: The Tale of the Tape


      Today’s Backup Tapes


      Decades of “format wars” ensued amongst vendors fighting for market share. Sample formats from this era included:



      • Quarter-Inch Cartridge (QIC)
      • 4mm or 8mm Tape
      • Digital Linear Tape (DLT)
      • Advanced Intelligent Tape (AIT)
      • Linear Tape Open (LTO)


      LTO-4 has become the reigning tape format today.

      The Middle Ages: Early Tape Backup Automation


      Tape-Based Innovations:
      Interleaving
      • improved tape backup speeds by allowing backups to be written to multiple tapes concurrently
      • writing to several tapes in parallel
      Synthetic Backup
      • required just a single full backup and used an intermediate database to track and map the location of the continuous incremental backups performed to tape thereafter
      Reclamation
      • also pioneered by Tivoli Storage Manager (TSM)
      • the tape reclamation process solved a problem created by synthetic tape backups
      Disk-Based Innovations:
      Disk Staging
      • data stored on optical media that could be moved by a staging manager to magnetic disk drives
      • could be used to send the data to tape without affecting production workloads
      D2D2T
      • disk-to-disk-to-tape
      D2D
      • disk-to-disk without tape


      The Modern Age: Emerging D2D Efficiencies

       A Snapshot is Not a Backup


      NetApp explains this space-saving functionality as follows
        • We are able to create a snapshot in constant time because we have a map of the blocks that are allocated on disk. A snapshot is really just a copy of the block map rather than the actual disk blocks



        Use of local snapshots alone, however, still exposes the data to other corruption risks, such as

        • Widespread data corruption of the primary data set
        • Hardware failure impacting the data stored within
        How SnapVault Works:
        • The SnapVault “primary” system needing data protection.
        • The SnapVault “secondary” system where backup data is stored.
        1. SnapVault initially stores one "full" backup of the primary system's data set on the secondary.
        2. Thereafter it builds on NetApp Snapshot efficiencies by quickly transmitting only the changed blocks found in the most recent snapshot of the primary system.


        The Future of Data Protection


        • The new gold standard: Annual off-site archival of data to tape
        • Tape backup will become a service in lieu of local tape libraries.
        • D2D will be managed by the storage array itself.
        • Say goodbye to VTLs