Sunday, December 21, 2014

BeagleBone Bugs

I've been fighting with the BeagleBone (white) for the last couple weeks, and wanted to point out a couple issues I've had.

  • First, the board behaves differently depending on how you power it!  If you power it from the USB connector, some parts of it don't run at full speed (at least the PRU/DMA), which completely breaks projects like LEDscape (it doubles the length of the waveform generated, which causes random flickering on the LEDs).  The only thing I could find that might shed some light on it is in this code, line 272.  Ridiculous.

  • I had a lot of problems with the SD card reader.  Mostly, I keep getting kernel panics from mmcqd, but I also had multiple corrupted SD cards, and even one die.  I'm not sure if the corruption and failure are related to the kernel issues, but it has made the beaglebone basically unusable for me.  Some mailing lists suggested the kernel panics were fixed in newer versions of linux (3.12+?  if I find the link I'll update this), but unfortunately no kernels past 3.8 support the beaglebone cape manager (device tree overlays), so you can't really use it with any peripherals (or LEDscape).

  • On that note, it sounds like newer versions of linux do support peripherals, but they can only be configured at run-time.  This would be fine, but documentation is scarce (i.e., I'm not going to waste time figuring it out).  (A bit more info here.)

  • I was intrigued by the armhf builds, and even used the 14.04 Ubuntu build briefly, but didn't see a way to easily add the RNDIS usbnet (usb0) drivers to it, which made it much less useful for me.  (And even if I could, I couldn't get the dtos working anyway...)

In short, they really need to release a new official linux build for the beaglebone which is stable and standardizes the way capes are managed.

Update:  Because of these issues I ended up buying a BeagleBone Black, mainly to bypass the need for an external SD card.  It does seem much more stable and faster (I was getting random lockups on the white), but even the first thing I tried to do, backing up the onboard eMMC, was a hassle.

While there is a nifty tool for extracting the eMMC contents, I had a lot of problems getting it to work:

  • You have to power it off the barrel jack and only hold the S2 button for a few seconds while plugging it in (neither of which was documented).

  • It only writes a 2GB file!  I verified this was not a format or filesystem issue (I tested on multiple SD cards and filesystems, e.g. EXT2, EXT4, and FAT32, all of which could handle 4GB files just fine with the stock kernel), nor was it a ulimit issue.  It seems the OS used for the extractor image doesn't have large file support (LFS) built in, or the tools were not compiled with _FILE_OFFSET_BITS=64.  Since no one else has complained about this problem, it makes me think it's something I'm doing, but I'm stumped...  I tried both dd and cat; dd complained about "File too large" when it reached 2GB (I don't recall the cat error message).
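In case it helps anyone, one workaround I considered (a sketch, not something I actually ran on the extractor image) is piping dd through split so no single output file crosses the 2GB mark. The mechanism is easy to verify on a scratch file; the real run would read the eMMC block device instead of disk.img:

```shell
# Demonstrate the split mechanism on a scratch file; the real run would
# read the eMMC device (e.g. /dev/mmcblk0 -- that path is an assumption).
cd "$(mktemp -d)"
dd if=/dev/zero of=disk.img bs=1M count=5 2>/dev/null

# backup: no single piece exceeds the size limit (2M here; use 1024m for FAT)
dd if=disk.img bs=1M 2>/dev/null | split -b 2M - backup.img.

# restore: concatenate the pieces back together
cat backup.img.* > restored.img
cmp -s disk.img restored.img && echo "round-trip OK"
```

Restoring is just the `cat` line in reverse direction (`cat backup.img.* | dd of=/dev/mmcblk0 bs=16M`), again with the device path as an assumption.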
The temporary fix was to just copy one of the stock images to the SD card, boot to it, and manually dd the eMMC over to a file.  This seems simple enough, but if you want to be able to read the images from Windows it's a bit tricky, since the FAT partition is FAT16 and only 96MB, and you can only increase the size of the ext4 partition (which has no free space).  You'd think you could just create a 3rd partition with a FAT32 format, but Windows is ridiculously stupid, and for some reason doesn't read more than one partition on removable media (or so it seems; there are driver hacks to do more, but I didn't think it was worth it).

Thus the fix is to move the ext4 partition to the end of the disk, replace the small FAT16 with a large FAT32 partition at the beginning of the disk, mark it bootable, and copy the contents back over.  It requires some fdisk aerobics, but the rough process is below, and it ended up working great.  After that you just boot to the sdcard and run something like "dd if=/dev/mmcblk1 of=/boot/uboot/images/BeagleBone.img bs=16M", which will make the image show up in the images folder when you put the sdcard back into Windows.

Note this is done on a 32GB disk, which already had the stock image written to it, and is being manipulated from a BeagleBone running off the eMMC.  This is for reference only; you should backup all data before doing anything with fdisk!

root@beaglebone:~# mkdir stock-boot
root@beaglebone:~# cp -r /media/9F8D-4438/ stock-boot
root@beaglebone:~# umount /dev/mmcblk1*
umount: /dev/mmcblk1: not mounted
root@beaglebone:~# fdisk /dev/mmcblk1

Command (m for help): p

Disk /dev/mmcblk1: 31.9 GB, 31914983424 bytes
4 heads, 16 sectors/track, 973968 cylinders, total 62333952 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

        Device Boot      Start         End      Blocks   Id  System
/dev/mmcblk1p1   *        2048      198655       98304    e  W95 FAT16 (LBA)
/dev/mmcblk1p2          198656     3481599     1641472   83  Linux

Command (m for help): n
Partition type:
   p   primary (2 primary, 0 extended, 2 free)
   e   extended
Select (default p): p
Partition number (1-4, default 3): 4
First sector (3481600-62333951, default 3481600): 58139648
Last sector, +sectors or +size{K,M,G} (58139648-62333951, default 62333951):
Using default value 62333951

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.
root@beaglebone:~# dd if=/dev/mmcblk1p2 of=/dev/mmcblk1p4 bs=16M
100+1 records in
100+1 records out
1680867328 bytes (1.7 GB) copied, 241.851 s, 7.0 MB/s
root@beaglebone:~# umount /mnt
root@beaglebone:~# fdisk /dev/mmcblk1

Command (m for help): d
Partition number (1-4): 1

Command (m for help): d
Partition number (1-4): 2

Command (m for help): d
Selected partition 4

Command (m for help):
Command (m for help): n
Partition type:
   p   primary (0 primary, 0 extended, 4 free)
   e   extended
Select (default p): p
Partition number (1-4, default 1): 1
First sector (2048-62333951, default 2048):
Using default value 2048
Last sector, +sectors or +size{K,M,G} (2048-62333951, default 62333951): 58139647

Command (m for help): n
Partition type:
   p   primary (1 primary, 0 extended, 3 free)
   e   extended
Select (default p): p
Partition number (1-4, default 2):
Using default value 2
First sector (58139648-62333951, default 58139648):
Using default value 58139648
Last sector, +sectors or +size{K,M,G} (58139648-62333951, default 62333951):
Using default value 62333951

Command (m for help): p

Disk /dev/mmcblk1: 31.9 GB, 31914983424 bytes
4 heads, 16 sectors/track, 973968 cylinders, total 62333952 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

        Device Boot      Start         End      Blocks   Id  System
/dev/mmcblk1p1            2048    58139647    29068800   83  Linux
/dev/mmcblk1p2        58139648    62333951     2097152   83  Linux

Command (m for help): t
Partition number (1-4): 1
Hex code (type L to list codes): c
Changed system type of partition 1 to c (W95 FAT32 (LBA))

Command (m for help): a
Partition number (1-4): 1

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.

WARNING: If you have created or modified any DOS 6.x
partitions, please see the fdisk manual page for additional
information.
Syncing disks.
root@beaglebone:~# resize2fs /dev/mmcblk1p2
resize2fs 1.42.5 (29-Jul-2012)
Resizing the filesystem on /dev/mmcblk1p2 to 524288 (4k) blocks.
The filesystem on /dev/mmcblk1p2 is now 524288 blocks long.
root@beaglebone:~# mkfs.vfat /dev/mmcblk1p1
mkfs.vfat 3.0.13 (30 Jun 2012)
root@beaglebone:~# mount /dev/mmcblk1p1 /mnt
root@beaglebone:~# cp -r stock-boot/* /mnt

Update 2: I also noticed that while configuring a static IP address through /etc/network/interfaces works on boot, it gets lost any time you disconnect the ethernet cable, even if you plug it back in! The solution is to remove wicd (apt-get remove wicd), or to configure wicd via wicd-ncurses.
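(For reference, the static stanza I'm talking about looks roughly like the following; the interface name and addresses are placeholders, not my actual values:)

```
# /etc/network/interfaces (excerpt) -- eth0 and addresses are placeholders
auto eth0
iface eth0 inet static
```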
Update 3: I also had this issue (Element 14 image specific).

Sunday, December 30, 2012

Matlab 2011b and Xilinx 13.4 on Ubuntu Server

Just a couple quick notes on some issues I ran in to installing Xilinx and Matlab on Ubuntu server:
  • Matlab needs the package "default-jre" or else the installer fails with some cryptic error message.

  • You need to run "sudo ln -s /lib/x86_64-linux-gnu/ /lib64/" after you install Matlab on a 64 bit machine to avoid the launch error "./matlab: 1: /usr/local/MATLAB/R2011b/bin/util/ /lib64/ not found".

  • Xilinx doesn't like newer versions of glibc (segfaults and memory dumps), so I had problems running it on Ubuntu 12.04 and 12.10. I reverted to 11.10 and it worked fine.

  • Xilinx installed fine, and even seemed to run okay, but had weird GUI issues (e.g., no "Open Project" option in the file menu of XPS) without the full ubuntu-desktop package installed. It probably doesn't need every package, but I didn't care enough to figure out the exact dependencies that were causing the problem.

  • I think the missing package (which Simulink also complains about) is "libglu1-mesa", so installing just that should work rather than installing the whole ubuntu-desktop package.

  • I got the error "/usr/local/Xilinx/14.4/ISE_DS/ISE/sysgen/util/sysgen: 82: /usr/local/Xilinx/14.4/ISE_DS/ISE/sysgen/util/sysgen: Syntax error: "(" unexpected", which was resolved by editing that sysgen run script to use bash("#!/bin/bash") rather than sh (or changing sh to link to bash rather than dash in /bin).

  • I wanted to run sysgen from a build script, but it had trouble passing the -r argument to matlab (actually, it would run one command, but not exit, since it is actually using the -r switch itself...).  To keep the Xilinx Blockset in matlab without running sysgen directly, first run sysgen, then do a "savepath" in matlab.  Now when you run matlab (assuming you have run "source /usr/local/Xilinx/14.4/ISE_DS/ ") you can get to all the sysgen stuff. 

  • I also needed to do this.

Wednesday, December 19, 2012

Ubuntu 12.04 (LTS), Xen 4.1, and DRBD

I recently set up a couple high-end servers as a small "cluster" to replace an old Xen VM server. The idea was to use DRBD (Distributed Replicated Block Device) to mirror all of the storage between servers for both data redundancy and resource flexibility. This allows separate VMs to run on both servers, taking full advantage of available CPU and memory, while still being able to be quickly and easily migrated between servers in the event of failure, or just to optimize performance and resource usage. DRBD allows this to be done without some sort of iSCSI or NAS typically used in larger clusters, with the added benefit of being decentralized and not suffering from a network latency overhead. Additionally, the targeted VMs have vastly different performance requirements, ranging from simple web servers to high-performance research workstations, so having two servers allowed me to economically provision one as "low-end" for simple web/file server VMs that just needed high availability, and one as "high-end" for the compute and memory intensive VMs.

Both servers use Supermicro's brand new, very, very nice X9DAX-7TF motherboard, which supports dual-socket Intel E5-2600 series CPUs, along with dual 10GbE ports, USB 3.0, and a ton of other neat goodies. Unfortunately this board is oversized, and, since I wanted both rackmount and tower configurations along with full-size PCIe cards, the only chassis that really worked was the somewhat overpriced Supermicro SC747. The "low-end" server has dual E5-2620s (6-core, 2GHz) and 64GB of RAM, and the "high-end" server has dual E5-2687Ws (8-core, 3.1GHz, 3.8GHz turbo-boost) and 96GB of RAM. Since redundancy, uptime, and protection against data loss were top priorities, both servers have RAID 6 arrays using OCZ Vertex 4 SSDs (along with a couple RevoDrives on the high-end server), which (as mentioned) are mirrored between servers using DRBD.

Since this blog is just as much notes to myself as anything, I'll start with a couple of ridiculously minor, annoying issues that I ran into but that aren't really applicable to most setups.

  •  OCZ's 2.5" to 3.5" drive adapter DOES NOT MEET 3.5" drive specifications! This means that it will not fit in to 3.5" hot swap drive bays since the port is misaligned both horizontally and vertically :/. It is also pretty hard to find a drive adapter that meets spec, since the 2.5" drive has to be almost flush against the cage, so most that do meet spec are outrageously expensive (such as the Icy Dock drive converter). Since I used iStarUSA's 5.25" bay to 6x2.5" drive adapter for most of the SSDs, I only needed two 2.5" to 3.5" adapters, so a quick trip to the machine shop to mill down the bevels and screw holes of the OCZ adapters fixed the problem (it is a bit cludgy, since you can't put all the screws in, and it is hard to fasten the drive to the adapter and the adapter to the hot swap cage, but it worked well for an immediate/free solution).

  • The onboard LSI 2208 controller did not have JBOD enabled by default. This was a huge pain, since I wasn't able to use any drives without configuring a proprietary RAID array (not to mention I wanted to use data from existing disks). It turns out you can install the MegaCli tool (here) by following the instructions (just a simple "alien" command to convert the .rpms, then "dpkg -i"), then run "sudo MegaCli -AdpSetProp -EnableJBOD -val 1 -aALL" to enable it (the -aALL means enable it on all present LSI controllers... make sure to run it with sudo, or as root, since if you don't it gives cryptic error messages).

  •  On a side note, I was also a bit dismayed with the build quality of the SC747 chassis and X9DAX-7TF motherboards. It is certainly a nice case, but not up to par with what I would expect for $900. One of the power supplies doesn't line up well with the connector, so it is a pain to get back in. The hot-swap cages are flimsy and don't slide in and out easily. One of the motherboard ethernet ports doesn't clip in the cable, and the BIOS splash screen has a misspelling. The boot time is also ridiculously long; I realize there are a ton of peripherals for it to initialize, but it is pushing a whopping 2 minutes before it even gets to grub, which is a huge annoyance when you are constantly rebooting to debug/configure the bad LSI controller. (Though don't take this the wrong way, I'm still pretty happy with the cases/motherboards, it just seems that for the money they should have been better.)

  • I had a lot of problems getting the servers connected to the network for a really odd/surprising reason. Our network assigns affinity groups (VLANs) based on MAC address. I had the MAC addresses setup and registered correctly, but for some reason the ports were still going up and down randomly and hopping between VLANs. It turns out the built in NICs send frames from two different MAC addresses! I still haven't looked in to why (perhaps it is because of the VMDq, virtual machine device queue, or IOV functionality), but as soon as I made sure all of the MACs were properly registered it worked fine. Weird.

Now on to the good stuff =). So one of the main points of this setup is to have super high data redundancy and uptime. This is accomplished by using RAID 6 arrays which are mirrored across the two servers using DRBD, allowing up to 5 drives to fail, or an entire server to fail, and still have no data loss or downtime. (If each server has 2 drives fail then we are still running at 100% because of RAID 6, the next drive that fails takes down a server, but since everything is on the other server we can just move the VMs over.) For added reliability, the servers will also be placed in separate rooms (though still with a direct 10G ethernet connection over existing wiring), to provide some added location redundancy (against theft, A/C leaks, small fires, etc.). I'm working on an offsite backup too, but that's a pain to administer, and may be a bit overkill for our current needs.

 The Xen setup stuff is mostly covered here, but I have a couple points to add, mainly with regard to migration:

  • I found the command "dd if=/dev/myoldvolume bs=4M | ssh -c arcfour mynewserver dd of=/dev/mynewvolume bs=4M" useful for copying logical volumes to the new server (run it on the old server). It is pretty straightforward; the "bs" argument speeds up the disk access, and the "arcfour" encryption is much faster (though less secure) than the default. You can probably use netcat to get better performance, but it seems a bit less reliable to me, and this ran at ~80MB/s, which was plenty (and may have been a network bottleneck more than anything). 

  • If you are migrating pygrub VMs, make sure to update the path to pygrub (for me it was replacing "/usr/bin/pygrub" with "/usr/lib/xen-4.1/bin/pygrub" in the .cfg), along with, of course, all the appropriate logical volume and volume group adjustments for the drives. 

  • Both RealVNC and TightVNC had issues displaying the Xen HVMs, but UltraVNC worked great.

  • You need to turn hibernation off in Windows 7/8 HVM virtual machines using "powercfg -h off" in an administrator command prompt (cmd).

  • When I ran HVM hosts I noticed an error stating the qemu keymaps couldn't be found. This is quickly fixed by symlinking qemu using "ln -s /usr/share/qemu-linaro/ /usr/share/qemu/".

  • For some reason the xen-tools /usr/lib/xen-tools/debian.d/20-setup-apt script is hardcoded to add debian mirrors to the VM's /etc/apt/sources.list.  My quick fix was to just add a couple lines before the security block (just before the chroot) to overwrite the entire sources list to match dom0's, with the appropriate version:
    # fix for ubuntu (since above security breaks ubuntu dists...)
    case ${mirror} in
      *ubuntu*)
        sed "s/`xt-guess-suite-and-mirror --suite`/${dist}/g" /etc/apt/sources.list > ${prefix}/etc/apt/sources.list
        ;;
    esac

  • I noticed terrible block device performance in my VMs (at least for continuous reads). This is largely attributed to the default read-ahead not being set correctly in the domUs. Essentially the default read-ahead value is 128 sectors, or 64kB, in linux; however, in RAID arrays this should be multiplied by the number of data disks (since each is read simultaneously). You can check the settings by running either "blockdev --report" or "cat /sys/block/blkdev/queue/read_ahead_kb". Similarly, to set it, you can run either "blockdev --setra XX blkdev" or "echo XX > /sys/block/blkdev/queue/read_ahead_kb" (note this has to be done as root, not with sudo, since the > won't be run as root). I haven't found a good way to automatically do this on VM creation yet, but for now I have just stuck it in the /etc/rc.local file (though apparently this may not work if you run a full gnome desktop in the environment, since it will reset it). You may also be able to do it with sysctl or /etc/sysctl.conf, but it isn't clear to me how. After setting the correct read-ahead in my domUs I noticed a 3x increase in speed. On a similar note, it seems that it is important to tune the ext filesystem parameters stride and stripe-width for RAID arrays.
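To put numbers on the read-ahead and stripe math: assuming (purely for illustration) an 8-disk RAID 6 with 64kB chunks, i.e. 6 data disks, and a 4kB filesystem block size, the arithmetic works out like this:

```shell
# Assumed example geometry: 8-disk RAID 6 (6 data disks), 64kB chunks, 4kB fs blocks.
SECTOR_BYTES=512
DEFAULT_RA_SECTORS=128      # kernel default read-ahead: 128 sectors = 64kB
DATA_DISKS=6
CHUNK_KB=64
FS_BLOCK_KB=4

RA_KB=$(( DEFAULT_RA_SECTORS * SECTOR_BYTES * DATA_DISKS / 1024 ))
STRIDE=$(( CHUNK_KB / FS_BLOCK_KB ))      # mkfs.ext4 -E stride
STRIPE_WIDTH=$(( STRIDE * DATA_DISKS ))   # mkfs.ext4 -E stripe-width

echo "$RA_KB $STRIDE $STRIPE_WIDTH"       # prints: 384 16 96
# then, as root in the domU (device name is an assumption):
#   echo $RA_KB > /sys/block/xvda/queue/read_ahead_kb
```

The same stride/stripe-width numbers would go into mkfs.ext4 as "-E stride=16,stripe-width=96" for this example geometry.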

  • To enable IOMMU, VT-d, and SR-IOV (which my setup supports), I had to modify /etc/default/grub by adding "pci_pt_e820_access=on" to the existing options line, and adding a line that says GRUB_CMDLINE_XEN="iommu=1 vtd=1 dom0_mem=4096M,max:4096M dom0_max_vcpus=4".  (Don't forget to run update-grub after this.)  Note that this also caps the dom0 memory and CPUs for performance reasons (you should probably also modify /etc/xen/xl.conf and xend-config.sxp to prevent autoballooning as well).  I also added "xm sched-credit -d Domain-0 -w 1024" to /etc/rc.local for better IO performance with the intensive RAID 6 being processed by dom0.  Update:  Actually it looks like iommu and vtd are enabled by default in Xen 4.x, but it doesn't hurt to add them.  It also appears that there is an issue with dom0 memory, which requires you to run "xm mem-set Domain-0 4096" after boot to reclaim your memory (mine was 2012 without it); just put it in rc.local.
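Pieced together, the relevant chunk of /etc/default/grub looked something like the following (which variable gets pci_pt_e820_access is from memory, so treat it as an assumption):

```
# /etc/default/grub (excerpt); run update-grub afterwards
GRUB_CMDLINE_LINUX_DEFAULT="pci_pt_e820_access=on"
GRUB_CMDLINE_XEN="iommu=1 vtd=1 dom0_mem=4096M,max:4096M dom0_max_vcpus=4"
```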

  • There is some weird bug with openvswitch (at least in the kernel update to 3.2.0-35-generic) where the new openvswitch module didn't get compiled/loaded when I updated kernels.  You can have it recompiled by removing and reinstalling the dkms package: "sudo apt-get remove openvswitch-datapath-dkms; sudo apt-get install openvswitch-datapath-dkms xcp-guest-templates xcp-networkd xcp-xapi" (the remove takes out more than just the openvswitch module, hence reinstalling the other packages).  Alternatively, install the source, then compile and install the module using: "sudo apt-get install openvswitch-datapath-source && sudo m-a a-i openvswitch-datapath".  Also, since I compiled and installed DRBD, I had to update its kernel module manually ("sudo m-a a-i drbd8" if the module source is already installed).

I looked at some cluster management software, like Pacemaker and Ganeti, to take care of VM management and volume replication, but they all seemed like they had a lot of overhead and were overkill for what I needed (plus had a somewhat steep learning curve). I could have replicated the entire LVM volume group (or RAID device) over DRBD, but this wouldn't have given me the flexibility I wanted (since I don't necessarily want *every* LV replicated, and I have some volumes on the RevoDrive that I may want replicated), so I ended up using DRBD on a per-volume basis. Since I wasn't about to manually setup every VM to use DRBD, I ended up writing some quick and dirty scripts to automatically take any VM and set it up to be replicated to the other server. This is actually a bit trickier than it sounds, since it involves:

  1. Writing resource files to /etc/drbd.d/ for every VM (made more complicated by VMs with more than one volume, such as their swap). 

  2. Modifying Xen's .cfg file in /etc/xen to point to the appropriate /dev/drbd_ devices and setting the device type to "drbd", if necessary. 

  3. Creating the appropriately sized logical volumes on the other server.

  4. Copying the configs to the other server. 

  5. And, finally, performing a number of administrative tasks, such as creating the DRBD meta data and forcing the primary. 

While I'm not going to detail every line of code, and it is too much to post inline, you can find the scripts here. THESE ARE PROVIDED WITH NO WARRANTY. I recommend you read every line of the code to figure out what it is doing. In fact, you should probably use them just as reference and never actually use them yourself unless it is on a sandbox with no real data, as THEY COULD CAUSE PERMANENT DATA LOSS. That being said, I did try to do a number of sanity checks and make them fairly robust (used on HVMs, pygrub, etc.). I do feel a bit like I was probably re-inventing the wheel though, so if anyone knows of a good light-weight Xen/DRBD volume manager I would love to hear about it.

I ran in to a number of not so obvious issues with DRBD while I was building and testing these tools:

  • DRBD 8.3 does not support user defined device names, which makes setting up Xen config files much, much, more difficult. It also does not support online switching of replication protocols. For performance reasons I wanted my synchronization to be asynchronous (Protocol A), but I also wanted to support Xen live migration of VMs, which requires "dual-primary" mode, that is only supported by fully synchronous replication (Protocol C). Ubuntu 12.04 (and 12.10 for that matter) do not have DRBD 8.4 in their repositories, and LinBit, the developer of DRBD, only has a repository for paying support customers. This means that I had to compile DRBD 8.4 from scratch, but this worked just fine following the instructions in the DRBD documentation.  (Update:  It looks like some of my VMs on old kernels had (seemingly intermittent) problems with live migration, giving a kernel BUG in spinlock.c.  This problem was reportedly fixed after kernel 2.6.38.)

  • DRBD does not do a good job of documenting how to use user-defined block device names. I was very, very dismayed to find that even with a user-defined device name, you still require a device number (i.e., "minor number"). So the correct syntax in the resource file is "device /dev/drbd_mydefinedname minor myminor#;". It is also not clear that every resource has to have its own port (though I guess it makes sense in retrospect...). This means that my scripts had to take over the unfortunate task of keeping track of minor numbers and port numbers :/, which greatly complicated things and made device creation much more prone to error (you have to make sure the minor numbers stay in sync across servers and line up with the appropriate volumes). Because of this I ended up creating a separate resource for every volume (i.e., both swap and root on most VMs), just so that the port and minor number would stay in sync. I still only used one resource file per VM though, which helps keep things organized. 
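To make the bookkeeping concrete, here's a minimal, hypothetical resource file of the sort my scripts generate; every name, the minor number, and the port are placeholders, and both hosts have to agree on them:

```
# /etc/drbd.d/myvm.res -- hypothetical; names, minor, and port are placeholders
resource myvm-disk {
    net { protocol A; }
    device    /dev/drbd_myvm-disk minor 5;
    disk      /dev/vg0/myvm-disk;
    meta-disk internal;
    on server1 { address; }
    on server2 { address; }
}
```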

  • The resync-rate is a bit confusing, and I'm still not sure how to set it. Despite the fact that I am using a direct 10GbE connection and RAID arrays that can do more than 1GB/s reads/writes, I was only syncing at ~100MB/s. I tried setting the "resync-rate 300M;" property in the global config under disk, but it seemed to have no effect on the initial sync rate. I found older references to "syncer", but it seems to be deprecated in 8.4. I'm still looking around, but I guess it's not that big of a deal. 
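One thing I've since read (unverified on my setup) is that 8.4's dynamic resync controller overrides a fixed rate unless you disable it, so something like this in the disk section may be what's needed:

```
disk {
    c-plan-ahead 0;     # disable the dynamic resync controller...
    resync-rate 300M;   # ...so the fixed rate actually applies
}
```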

  • Be careful NOT to delete any DRBD .res files without making sure that the resource has been stopped with "drbdadm stop myresource".  I did this a couple times during testing/development, and it makes it impossible to stop the resource without recreating the .res file (or rebooting).  It seems like the whole .res system could be greatly improved.  If DRBD had internal management of ports and minor numbers, then perhaps all the resource/volume specific information could be moved into the metadata block, and everything could be managed through command line tools (more like mdadm).

  • /proc/drbd doesn't list the resource names, just the minor number, which is very, very annoying. It makes it impossible to tell the status of volumes, since you have to cross-reference with the actual resource config files in /etc/drbd.d/.  Or you could, you know, just use "drbd-overview" like you are supposed to.

  • When you use the "drbd:" type device in your Xen .cfg, you don't need the full device path, just the drbd resource name, i.e. 'drbd:myvm-disk,xvda2,w'

After I got everything set up correctly, live migration went without a hitch on para-virtualized linux VMs, which was pretty awesome =). Notably, defining the volumes with "drbd:" in the Xen .cfg file will automatically take care of everything if you are running Protocol C (it will even promote volumes to primary when you first try to run the VM). If you aren't running Protocol C, then you will have to manually (and temporarily) change the protocol and dual-primary mode on any volumes the VM is using, by doing "sudo drbdadm net-options --protocol=C --allow-two-primaries". Of course after the migration you will have to change everything back (if you want) using "drbdadm net-options --protocol=A --allow-two-primaries=no".  I did try to live migrate a Windows 7 HVM host; it worked fine initially, but then the VM randomly rebooted and said it had recovered from a blue-screen. I'll have to play with it some more to see what the issue was.
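The protocol shuffle is easy to script. Here's an echo-only sketch (resource, VM, and host names are placeholders) that just prints the drbdadm/xm sequence so you can sanity-check it before wiring it up for real:

```shell
# Echo-only sketch of the live-migration dance: flip the resource to
# protocol C + dual-primary, migrate, then flip it back to protocol A.
RUN=echo   # set RUN= (empty) to actually execute the commands

migrate_vm() {
    vm=$1; res=$2; dest=$3
    $RUN drbdadm net-options --protocol=C --allow-two-primaries "$res"
    $RUN xm migrate --live "$vm" "$dest"
    $RUN drbdadm net-options --protocol=A --allow-two-primaries=no "$res"
}

migrate_vm myvm myvm-disk otherserver   # prints the three commands in order
```

Obviously this only covers the single-resource case; with separate root and swap resources you'd loop over both.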

I'm still looking for a good way to be alerted of any failures; it looks like mon is probably the best solution, but I haven't had a chance to set it up yet. Of course, installing sendmail and configuring the email in /etc/mdadm.conf will send you notifications of important mdadm RAID events (notably, you need a fully qualified domain name for sendmail to work; this can be set in the /etc/hosts file like " localhost myhost"). Unfortunately DRBD doesn't have similar functionality.
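The mdadm side is just a MAILADDR line (the address below is a placeholder; on Ubuntu the file may live at /etc/mdadm/mdadm.conf instead):

```
# /etc/mdadm.conf (or /etc/mdadm/mdadm.conf) -- address is a placeholder
MAILADDR admin@example.com
```

You can verify delivery with "mdadm --monitor --scan --oneshot --test", which sends a test message for each array.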

Anyway, that's about it. If you want a small, reliable, high-availability cluster that is very resilient to data loss, I would highly recommend Xen+DRBD so far. RAID 6 may be a little overkill with the DRBD replication, but I would rather avoid the hassle of recovering from a server going down (and instead just hot-swap bad drives), and given the total cost, the extra two drives to bump from RAID 5 to RAID 6 were just a small percent. For smaller budgets RAID 5 or 1 would still offer a ton of protection ;).

Sunday, May 20, 2012

Xen on Ubuntu 12.04 Notes

The recently released Ubuntu 12.04 has great support for Xen, simply use "sudo apt-get install xcp-xapi" to install it. I did run in to just a couple issues though:

1.  There is an installer option to use "openvswitch" rather than typical Xen network bridging.  I initially chose openvswitch so I could play with it, but decided to temporarily revert back to traditional Xen scripts until everything was setup.  To revert all you have to do is add back "(network-script network-bridge)" to /etc/xen/xend-config.sxp.  This doesn't remove or even disable openvswitch, it just has Xen automatically create the bridges for you.

2.  While Xen was added to grub, it didn't become the default on the bootup menu.  To fix this you could change /etc/default/grub to point to the correct list entry, but this could cause issues down the line as you update kernels.  Instead, I renamed the "20_linux_xen" to be "06_linux_xen" in /etc/grub.d/, then ran "sudo update-grub".  Per bjmuccolloch, below, the best way to do this is "dpkg-divert --divert /etc/grub.d/08_linux_xen --rename /etc/grub.d/20_linux_xen".  This will put the Xen options at the top of the grub list and make the latest xen/linux kernel pair the default.  Be sure to make it "06" and not just "6", otherwise it won't work...

3.  While there appears to be a new xapi xe toolset that can be used to manage/create VMs (and comes with the xcp-xapi package we already installed), I'm more familiar with the existing xen-tools package.  This is still available in apt, so just do a simple "sudo apt-get install xen-tools" to get it.

Oddly, by default it doesn't allow you to use xen-create-image to make Ubuntu 12.04 or even 11.10 images.  To fix this simply create a new symlink in the /usr/lib/xen-tools/ directory by doing something like "sudo ln -s debian.d oneiric.d; sudo ln -s debian.d precise.d;".

I had some console issues using xen-tools (which probably aren't a problem if you use the xe tools).  To fix this you just have to add:

echo "start on stopped rc RUNLEVEL=[2345]
stop on runlevel [!2345]
respawn
exec /sbin/getty -8 38400 ${serial_device}" > ${prefix}/etc/init/${serial_device}.conf

to /usr/lib/xen-tools/debian.d/30-disable-gettys.  This will make VMs spawn a tty on the console device.

Of course, as usual, you probably want to setup your defaults in /etc/xen-tools/xen-tools.conf

4.  Finally, I had some issues with the Xen bridge not picking up a custom MAC from the /etc/network/interfaces file.  This is important, since the bridge and the Dom0 interface have to share a MAC address on my network (since that is how VLANs are assigned).  The cause of this looked to be that the init.d network script was being run after some of the xen init.d scripts, but I could be wrong.  The simple fix is to add "ifconfig peth0 hw ether xx:xx:xx:xx:xx:xx" to the end of /etc/init.d/xendomains, before the "rc_exit".  This is a bit cludgy, but it worked.

That's about it, everything else worked great!  Maybe I'll do an update after I get a chance to play with the xe tools or openvswitch.

Tuesday, February 16, 2010

Compiling vnStat on iPhone

vnStat is a neat little bandwidth monitoring cli utility for Linux and FreeBSD.

Actually the hardest part about this is just getting the toolchain running, after that it is a couple modifications to the Makefiles.

So there is quite a bit of documentation out there on how to get the toolchain running. The easiest way I found was on the device itself, using these instructions.

After you do that, you just need to change the top of the Makefile to:


LDFLAGS= -multiply_defined suppress \
-L$(Sysroot)/usr/lib \
-Wall \
-Werror \
-march=armv6 \
-mcpu=arm1176jzf-s

CFLAGS= -I$(Sysroot)/usr/include

# bin and man dirs for Linux
BIN = $(DESTDIR)/usr/bin
SBIN = $(DESTDIR)/usr/sbin
MAN = $(DESTDIR)/usr/share/man

# bin and man dirs for *BSD
BIN_BSD = $(DESTDIR)/usr/local/bin
SBIN_BSD = $(DESTDIR)/usr/local/sbin
MAN_BSD = $(DESTDIR)/usr/local/man

default: vnstat


This worked for vnStat 1.10 on an iPhone 3GS 3.0.1.

You also need to sign the files using ldid:

ldid -S src/vnstat
ldid -S src/vnstatd

Or you can just put them in the src/Makefile so it does it automatically for you (remember that make recipe lines must be indented with a tab):

all: vnstat vnstatd vnstati

vnstat: $(OBJS)
	$(CC) $(LDFLAGS) $(OBJS) $(LDLIBS) -o vnstat
	ldid -S vnstat

vnstatd: $(DOBJS)
	$(CC) $(LDFLAGS) $(DOBJS) $(LDLIBS) -o vnstatd
	ldid -S vnstatd

The make install didn't work well for me so I just run it from my /var/root/vnstat directory. You also want to make your db directory and copy the rc file over:

mkdir ~/.vnstat
cp cfg/vnstat.config ~/.vnstatrc

Then modify the ~/.vnstatrc to set the DatabaseDir correctly:

# location of the database directory
DatabaseDir "/var/root/.vnstat"

Sunday, February 14, 2010

Apple Late Model iPhone 3GS Clarifications

A couple things I've learned over the past couple days:

iRecovery reports the iBoot version wrong on all the 3GSs I have tried. It always reports 636.66, when in fact the versions were 359.3.2 and 359.3.

The 4th and 5th digits of the serial number are the manufacture week. According to theiphonewiki iPhones made after week 40 are not vulnerable to the 24kpwn (untethered) exploit. This isn't quite true, as I have at least one week 41 that was vulnerable to 24kpwn.

I have yet to find a good way to put 3GSs in DFU mode, or more importantly, get them out of DFU mode, which is the only way I have seen to reliably tell which iBoot version they have. I have tried holding the power and home button for 30 seconds, even a minute, or just the power, or just the home button for that long. iRecovery won't even detect it.

To build iRecovery you need a couple packages, namely readline and libusb. On the Mac this apparently requires DarwinPorts and then libusb, but I never actually got it to build on OS X. For Ubuntu you just need to apt-get install libusb-dev and libreadline5-dev (although I also installed libusb-1.0-0-dev before I saw libusb-dev, just in case you need both). Also, while building on Ubuntu, I had to add "#include &lt;signal.h&gt;" to irecovery.c, or else it couldn't find SIGINT.
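The include fix can be done in one line with sed; here's a sketch against a scratch stand-in for irecovery.c (the demo file is made up; on the real tree you'd run the sed against irecovery.c):

```shell
# Minimal stand-in for irecovery.c (the real file has more includes)
printf '#include <stdio.h>\nint main(void){return 0;}\n' > irecovery_demo.c
# Prepend signal.h so SIGINT is declared
sed -i '1i #include <signal.h>' irecovery_demo.c
head -n 1 irecovery_demo.c
```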

Also, from what I can tell, even if you have your SHSH saved, you can never downgrade your baseband, so if you accidentally upgrade to 3.1.3 you lose your ability to unlock.

Finally, switching back and forth from blackra1n and redsn0w has worked fine so far.

Update: Switching from redsn0w to blackra1n caused iTunes not to be able to sync, giving error message "iTunes cannot read the contents of the iPhone xxxx. Go to the Summary tab in iPhone preferences and click Restore to restore this iPhone to factory settings." The fix is to delete /private/var/mobile/Media/iTunes_Control/iTunes/iTunesDB and any files under /private/var/mobile/Media/iTunes_Control/Music.
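The cleanup amounts to something like the following, shown here against a scratch directory (on the phone the prefix is /private/var/mobile/Media, and you'd run this over SSH):

```shell
MEDIA=Media_demo   # stand-in for /private/var/mobile/Media on the phone
# Recreate the relevant layout for the demo
mkdir -p "$MEDIA/iTunes_Control/iTunes" "$MEDIA/iTunes_Control/Music/F00"
touch "$MEDIA/iTunes_Control/iTunes/iTunesDB" "$MEDIA/iTunes_Control/Music/F00/track.m4a"
# The actual fix: delete iTunesDB and everything under Music
rm -f "$MEDIA/iTunes_Control/iTunes/iTunesDB"
rm -rf "$MEDIA"/iTunes_Control/Music/*
ls "$MEDIA/iTunes_Control/Music"
```

iTunes rebuilds the database on the next sync.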

Monday, November 9, 2009

A Couple Issues Jailbreaking iPhone OS 3.1.2

I ran into three issues jailbreaking iPhone 3GS OS 3.1.2 with PwnageTool 3.1.4:

1) Error 1064 when iTunes tried to deploy a custom image
2) No cellular signal after jailbreak
3) Cydia immediately crashing when you try to open it


The 1064 error (issue 1) was solved simply by pwning the phone again. The hardest part was getting the phone out of DFU mode. A couple of the published "hold home and power for 8 seconds, then hold home and power for 8 seconds, then again, but keep holding the home button for 20 seconds" methods didn't work, but iRecovery worked perfectly on the first try. After downloading and extracting it, run:

./iRecovery -s
#after the prompt appears enter:

setenv auto-boot true

Note: In Windows XP this needs to be run in compatibility mode and as administrator...

Then just run the PwnageTool again and tell it the phone hasn't been pwned. This only happened to one of the phones I was messing with, so it could have been a user error.


The missing cellular signal (issue 2) was just because I told PwnageTool to activate the phone even though I had a legitimate cell plan... Disabling the activation feature made everything work fine again.


I traced the Cydia crash (issue 3) back to using the expert mode of PwnageTool and telling it to install all of the outdated pre-downloaded packages that come with PwnageTool. The fix is either to not preinstall any packages, or to update the packages before they are installed.

The reason that simply hitting the refresh button under Cydia Settings->Download Packages doesn't work is that it uses a different repository than Cydia ends up using on the iPhone. So you have to go to "Manage sources" and remove the existing "" repository (click it and hit delete), then add "". Now go back to "Download packages", select the 3.7 repository, sort by Status, highlight all of the packages that have a new version available (shift+click), click add to queue, and wait for them to download. Now you can add all the packages you want via "Select packages" and create an image that won't cause Cydia to immediately crash.


Another tidbit: iTunes asked me if I wanted to update my carrier settings. I was a bit skeptical after having just jailbroken the phone, but all it does is update some APN information and such, not the baseband or firmware or anything. So it is fine to say yes, AFAIK.

Also, I did notice the lowered WiFi signal after installing blackra1n, but a network settings reset seemed to fix it.