Retrieving SMART status from a SDA disk attached to a MegaRAID card

Today I realised that manually checking the smart status of a disk required a bit more.

[root@box ~]# smartctl -a -d megaraid,0 /dev/sda
smartctl 5.43 2016-09-28 r4347 [x86_64-linux-2.6.32-696.16.1.el6.x86_64] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net

Vendor:               SEAGATE
Product:              ST3146356SS
Revision:             HS10
User Capacity:        146,815,733,760 bytes [146 GB]
Logical block size:   512 bytes
Logical Unit id:      -----
Serial number:        ----
Device type:          disk
Transport protocol:   SAS
Local Time is:        Thu Mar 15 05:18:57 2018 CDT
Device supports SMART and is Enabled
Temperature Warning Disabled or Not Supported
SMART Health Status: OK

Current Drive Temperature:     32 C
Drive Trip Temperature:        68 C
Elements in grown defect list: 15
Vendor (Seagate) cache information
  Blocks sent to initiator = 3694557980
  Blocks received from initiator = 4259977977
  Blocks read from cache and sent to initiator = 2859908284
  Number of read and write commands whose size <= segment size = 1099899109
  Number of read and write commands whose size > segment size = 0
Vendor (Seagate/Hitachi) factory information
  number of hours powered up = 65098.07
  number of minutes until next internal SMART test = 23

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:   105645673        6         0  105645679   105645679      65781.538           0
write:         0        0        38        38         45      48511.618           7
verify: 48452245        7         0  48452252   48452259      43540.092           7

Non-medium error count:       48

SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background long   Completed                  16       1                 - [-   -    -]
# 2  Background short  Completed                  16       0                 - [-   -    -]

In order to retrieve this detail you need to use -d megaraid,n where n is the disk id number. Try 0, 1, 2, 3, etc. Or use the megaraidCLI to get a list of all the disks. I dunno I thought it was worth mentioning at least. It always pays to check this if customer is having weird I/O troubles. Quite a lot of detail is provided about errors the disk encounters. So looking here, even if SMART OK. Gives you an idea if any test failing for disk.

Converting a QEMU qcow2 cloud server image to an native disk img and putting on physical disk

Got this question at work a lot. Thought I’d finally get around to putting it down since it’s came up for me. I’ve got a virtual machine using virtio passthrough for my pcie, and I found actually that disk access via the qcow2 is pretty naff.

sudo apt-get install qemu-kvm

qemu-img convert windows10cloudimage.qcow2 -O raw diskimage.img

dd if=/path/to/windos10cloudimage.qcow2 of=/dev/sdc2

Please note in my case the physical partition I’d made was sdc2. I’d actually resized another 5TB disk I have in my system using gparted. Just so I can attach a physical partition with libvirtd. Evidently though libvirtd-manager doesn’t allow this business so I have to edit the xlm file in /etc/qemu/windows10.xml .

 

root@adam:/etc/libvirt/qemu# virsh  define /etc/libvirt/qemu/win10-uefi.xml 
Domain win10-uefi defined from /etc/libvirt/qemu/win10-uefi.xml
root@adam:/etc/libvirt/qemu# virt-manager

yeah baby!


You could alternatively do it all in one like below, though you may desire a copy of the img file as well as putting it to the disk.

qemu-img convert windows10.qcow2 -O raw /dev/sdc

Help! I can’t login to my cloud-server even though I’ve reset my root password

The most common cause of this is the permit root login is set to no, although there might be other causes, like a really broken sshd_config, instead of just one variable. The procedure for looking into this is pretty much the same regardless of the breakage that has occurred. Here is what you need to do:

Here’s the full procedure:

1) Put server into rescue mode.
2) Login to cloud-server on SSH port, please note rescue mode gives you a new temporary root password allowing you to reset the password for SSH on the ‘original disk’.
3) once logged in mount the /dev/xvdb devices, this may be /dev/xvdb1 or /dev/xvdb2 but is usually /dev/xvdb1 and chroot (change root to the ‘original disk’)

# Mount old disk
mnt /dev/xvdb1 /mnt

# Change to the ‘old disk’
chroot /mnt

# Set the new password for root on the old disk:

passwd
# enter the new password when prompted

and specifically ensure that /etc/ssh/sshd_config has this line:

PermitRootLogin no

changed to:

PermitRootLogin yes

Your developer or sysad won’t be able to login until you reset the root password here, and if you do not know the username to su to root from, it is absolutely critical to perform this work, otherwise you won’t be able to access the server.

Also, once you have allowed the root login, and changed the password to something you recognise you will be able to exit rescue mode thru the control panel and login to the machine as normal.

For more detail about how to do this (although all the steps are here pretty much, please see):

https://support.rackspace.com/how-to/rackspace-cloud-essentials-rescue-mode-on-linux-cloud-servers/

I hope this helps you folks out some,

DISASTER RECOVERY! Exporting a Broken Cloud-server image VHD from Rackspace and attempting to recover data

Thanks to my colleague Marcin for thie guestmount tools protip.

I wrote a previous guide which explains how to download/export a Cloud server image VHD from Rackspace Cloud, which is failing to build. It might allow you to perform data recovery, even if the image can’t be booted. Which I’m guessing someone is going to run into sooner or later, and will be pleased to see this article, it will at least give you a best shot at reading the VHD and recovering it, since as you might know already, just because boot or kernel is broken, doesn’t mean that the data isn’t there!

# A better article to use if you want to download via commandline
https://community.rackspace.com/products/f/25/t/3583

# My article doing this thru a web-browser which might be useful too for some customers
https://community.rackspace.com/products/f/25/t/7089

Once the image gets downloaded to your new cloud instance you can use ‘libguestfs-tools’ package (same name on Ubuntu and CentOS) which contains tools necessary for mounting .vhd image files.

The command would be (read-only mode):

guestmount -a {image-name}.vhd -i --ro {mount-point}