Tuesday, September 18, 2018

A utility to dump block devices data in Linux

It is often necessary to read raw disk blocks, for example while investigating data corruption or a corrupted magic block.
xxd is a simple and lightweight utility to dump a device.

# xxd /dev/vdb|less
00000000: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000010: 0000 0000 0000 0000 0000 0000 0000 0000  ................
(… identical all-zero lines continue up to offset 0x200 …)
00000200: 4c41 4245 4c4f 4e45 0100 0000 0000 0000  LABELONE........
00000210: ddf0 c3e3 2000 0000 4c56 4d32 2030 3031  .... ...LVM2 001
00000220: 5352 7661 7344 6162 7964 4d36 6569 6a64  SRvasDabydM6eijd
00000230: 6833 4338 3462 5532 536c 444b 564c 7943  h3C84bU2SlDKVLyC
00000240: 0000 0000 2000 0000 0000 1000 0000 0000  .... ...........
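xxd can also seek directly to an offset instead of dumping the whole device. A minimal sketch, using a scratch image file in place of a real block device (disk.img and its contents are assumptions for illustration):

```shell
# Create a small image file to stand in for a block device.
dd if=/dev/zero of=disk.img bs=512 count=4 status=none
# Write an LVM-style label at byte offset 512, like the one above.
printf 'LABELONE' | dd of=disk.img bs=1 seek=512 conv=notrunc status=none

# -s seeks to a byte offset, -l limits the dump length; the same
# flags work on a real device such as /dev/vdb.
xxd -s 0x200 -l 16 disk.img
```

This prints only the one 16-byte line at offset 0x200, which is far quicker than paging through gigabytes of zeros.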

Another utility is debugfs, which can inspect a disk that holds a valid ext2/3/4 file system.
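A minimal debugfs sketch, again using a scratch image file instead of a real device (fs.img and its size are assumptions; mke2fs needs -F to operate on a regular file):

```shell
# Build a small ext2 image to inspect.
dd if=/dev/zero of=fs.img bs=1024 count=1024 status=none
mke2fs -q -F fs.img

# -R runs a single debugfs request; 'stats' prints superblock
# information (inode counts, block counts, UUID, and so on).
debugfs -R stats fs.img
```

On a real system the same command would be run against the device node, e.g. debugfs -R stats /dev/vdb, provided the device carries an ext file system.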

Written with StackEdit.

Friday, September 14, 2018

General Script to run Linux Shell Commands

#!/bin/bash

#for i in $(cat meta.osd.ip)
for i in {0..24}
do
    # Uncomment the command to run for each item:
    #sudo ceph osd purge $i --yes-i-really-mean-it
    #ssh -q -o "StrictHostKeyChecking no" $i sudo reboot
    echo "$i"
done
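The same pattern works when iterating over hosts read from a file. A runnable sketch (hosts.txt and the node names are placeholders standing in for meta.osd.ip):

```shell
# Sample host list standing in for meta.osd.ip.
printf 'node1\nnode2\nnode3\n' > hosts.txt

# Iterate over the hosts; echo stands in for the real
# ssh or ceph command run per host.
for i in $(cat hosts.txt)
do
    echo "would run on $i"
done
```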


Thursday, September 13, 2018

OSD on Debian Jessie : No cluster conf found

The problem is in the ceph-disk code.
On the destination node, ceph-disk prepare prints the following log:

# ceph-disk -v prepare /dev/vdb
command: Running command: 
/usr/bin/ceph-osd --cluster=None --show-config-value=fsid

The value of cluster is None, which is incorrect; it must be ceph.

# /usr/bin/ceph-osd --cluster=None --show-config-value=fsid
00000000-0000-0000-0000-000000000000
# /usr/bin/ceph-osd --cluster=ceph --show-config-value=fsid
68eabcd3-a4fd-4c80-9e9c-56577d841234

The code change involved the following in /usr/sbin/ceph-disk:

prepare_parser.add_argument(
    '--cluster',
    metavar='NAME',
    default='ceph',  # added the parameter `default`
    help='cluster name to assign this disk to',
)

The corresponding diff:
<         metavar='NAME',
---
>         metavar='NAME', default='ceph',


Wednesday, September 5, 2018

Write CRUSH rule for a Cluster

CRUSH rules are described as follows:

        {
            "rule_id": 1,
            "rule_name": "replicated_ruleset_hdd",
            "ruleset": 1,
            "type": 1,
            "min_size": 1,
            "max_size": 10,
            "steps": [
                {
                    "op": "take",
                    "item": -481,
                    "item_name": "hdd"
                },
                {
                    "op": "chooseleaf_firstn",
                    "num": 0,
                    "type": "jbod"
                },
                {
                    "op": "emit"
                }
            ]
        },

Visualize a CRUSH rule as a traversal of the cluster tree; the ‘steps’ section defines the traversal. The rule above starts at the root ‘hdd’ and selects num_replicas buckets of type ‘jbod’. It then picks one leaf (an OSD) from each chosen bucket.
The leaves are the items of type 0 in the CRUSH types list.
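In the decompiled (text) form of the CRUSH map, the same rule would look roughly like this (a sketch reconstructed from the JSON above; field names follow the standard CRUSH map text syntax):

```
rule replicated_ruleset_hdd {
	ruleset 1
	type replicated
	min_size 1
	max_size 10
	step take hdd
	step chooseleaf firstn 0 type jbod
	step emit
}
```

Here firstn 0 means “use the pool’s replica count”, which is why num is 0 in the JSON.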

Get the current CRUSH rules

$ sudo ceph osd getcrushmap -o /tmp/crush.org

The above command fetches the compiled CRUSH map. To make changes, we must first decompile it.

$ sudo crushtool -d /tmp/crush.org -o crush.org.d

Make the required changes in crush.org.d and recompile it (the recompiled map can later be installed with sudo ceph osd setcrushmap -i crush.new.c):
$ sudo crushtool -c crush.org.d -o crush.new.c

Test the new rule

# Find out incorrect mappings from the new rule
$ sudo crushtool -i <compiled crush file> --test --show-bad-mappings

# Find out behavior of a random placement
$ sudo crushtool --test -i /tmp/crush.org --show-utilization --rule 3 --num-rep=3 --simulate

# Find out behavior of the new CRUSH rule placement
$ sudo crushtool --test -i crush.new.c --show-utilization --rule 3 --num-rep=3
Sample output:
  device 1334:           stored : 4      expected : 2.26049
  device 1335:           stored : 2      expected : 2.26049
  device 1336:           stored : 3      expected : 2.26049
  device 1337:           stored : 2      expected : 2.26049
  device 1338:           stored : 1      expected : 2.26049
  device 1339:           stored : 2      expected : 2.26049
  device 1340:           stored : 2      expected : 2.26049
