Saturday 11 April 2009

Java: Bad CPU id in executable

Here is a weird one:

Recently I have been doing a lot of Java work, and many of the packages I use need JAVA_HOME set. So normally I just add this line to my /etc/profile:


JAVA_HOME=/System/Library/Frameworks/JavaVM.framework/Versions/1.6.0/Home; export JAVA_HOME


Works just fine on my MacBook and my recent 2009 Mac Mini. However, I have an older 2006 Intel Mac Mini, the older Core Duo type rather than the newer Core 2 Duo. It had been out of action for a few months due to a failed hard drive, but recently I revived it by buying a suitable drive from PCWorld and attacking the machine with a putty knife (upgrading the RAM whilst I was in there).

Everything seemed to be working fine. I re-installed the OS and took the opportunity, whilst I had a virgin system, to document and automate the installation of my EC2 management scripts. Everything swam along fine until I actually tried to use the scripts, whereupon I got the dreaded message "java: Bad CPU Id in executable". WTF....

Googling the error did not help, so it was back to the drawing board and time to take a microscope to the machine.

I checked the target directories and they were fine: in the "Versions" directory under the Java framework I could see 1.4.2, 1.5.0 and 1.6.0. So what was happening?
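For reference, a quick look from the terminal shows the same thing (listing trimmed to just the version directories):

$ ls /System/Library/Frameworks/JavaVM.framework/Versions/
1.4.2    1.5.0    1.6.0    ...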

The first clue is that the Java Preferences applet does not list the 1.6.0 version as available. Hmm, why not?

The second clue is that switching to version 1.5.0 in the JAVA_HOME statement makes it all work.
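That is, pointing the /etc/profile line at 1.5.0 instead:

JAVA_HOME=/System/Library/Frameworks/JavaVM.framework/Versions/1.5.0/Home; export JAVA_HOME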

The answer is that Apple never released a 32-bit version of Java 1.6, but to rub salt into the wounds they load a perfectly non-functional 64-bit version onto all the 32-bit-only boxes. Ouch...
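You can confirm the architecture mismatch yourself with the file command; on a Core Duo box the 1.6.0 java binary reports only an x86_64 slice, which the 32-bit CPU cannot execute (output abbreviated here, exact wording may vary):

$ file /System/Library/Frameworks/JavaVM.framework/Versions/1.6.0/Home/bin/java
java: Mach-O 64-bit executable x86_64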

Come on Apple, there is no reason to limit Java 1.6 to 64-bit only, and even if there is, why on earth load a non-functional 64-bit framework onto 32-bit machines?

Shakes head in amazement.......

Tuesday 10 March 2009

EBS, Flash drives for EC2 machines

Recently I had a bizarre situation. I had been running a load of stuff through Hadoop on one of our EC2 clusters and the job had failed, but it failed in such a way that if I could save away the data (and the logs, for diagnostic purposes) and restart the cluster with some changed parameters, I would be able to recover the data and carry on. Having already invested about $60 in machine time and hours of head scratching on the failed run, I thought this was a good idea.

So, no problem: just use Hadoop's distcp to move the data up to s3n://bucketname. Hmmm, that did not work out. OK, how about hadoop dfs -cp of all the data to a local directory, then s3cmd to move it up to S3 storage? Hmm, that did not work out either. Something odd is going on here.
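For the record, these were roughly the two approaches that failed (the bucket name and paths are placeholders):

# attempt 1: copy straight from HDFS to the S3 native filesystem
hadoop distcp /user/root/output s3n://bucketname/output
# attempt 2: copy to a local directory, then push it up with s3cmd
hadoop dfs -cp /user/root/output file:///mnt/rescue
s3cmd sync /mnt/rescue/ s3://bucketname/rescue/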

Then it dawned on me that some of the files in the run were very large, as were the intermediate products: around 10G on average, and there were lots of them. Now, S3 has a limit of 2G for objects stored in its file system. I did not want to use Hadoop's s3:// non-native filesystem, as it is difficult to verify that all the data has arrived safely when nothing else will read it. So I had a brainwave: successively mount and unmount an EBS volume on each machine, save away the relevant bits and pieces, then move on to the next box.

All of our base EC2 images have xfs support wired into the system, as well as /mnt/ebs as a mountpoint for an attached EBS volume. So it was simply a case of using ElasticFox to attach the EBS volume to the instance, issuing a "mount /mnt/ebs", and then a "umount /mnt/ebs" once I was done.
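The per-machine cycle looked something like this (volume and instance IDs are placeholders, and the device name assumes the image's fstab maps /mnt/ebs to /dev/sdh; the attach/detach steps can equally be done from ElasticFox):

# attach the EBS volume to the current instance
ec2-attach-volume vol-xxxxxxxx -i i-xxxxxxxx -d /dev/sdh
# mount it, pull the data and logs off the box, then release it
mount /mnt/ebs
hadoop dfs -copyToLocal /user/root/output /mnt/ebs/
cp -r $HADOOP_HOME/logs /mnt/ebs/logs-$(hostname)
umount /mnt/ebs
ec2-detach-volume vol-xxxxxxxx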

So effectively using an EBS volume as a virtual flash-drive.

Wednesday 21 January 2009

Fixing External monitors on the Aspire One and Fedora 10

I was recently forced to spend some time using my Aspire One as a primary development machine, having left my MacBook PSU at work and being too lazy to go all the way back in to retrieve it.

So I thought, that's OK: I have this old monitor kicking around here, and a spare keyboard and mouse, so just bang them into my Aspire One and off we go. But.......

Fedora 10 uses the new configless X server and the new Intel GEM-enabled driver. A configless X server will happily run X without all the tedious fiddling with /etc/X11/xorg.conf; in fact, if you take a peek in that directory you will see that there is no xorg.conf at all.

So when I plugged my external monitor in, I was encouraged to see it display everything up to the login screen. I logged in and was presented with a tiny 640x480 desktop on both monitors.

OK, no problem: pop up the Display Properties dialog, set the sizes of the screens back to their proper setup, 1024x600 for the built-in LCD and 1024x768 for the external monitor. Press Apply and..... and.... and.... Nothing happened... WHAT!!!!!

OK, it turns out that there is one vital piece of configuration missing in a configless X server boot, and that's the Virtual screen size. You need a Virtual screen size big enough to contain both monitor surfaces in your chosen layout to run dual head (internal + external monitor); for example, 1024x600 next to 1024x768 needs at least 2048x768.

I tried everything: creating a minimal xorg.conf that just added that line, using sample xorg.confs from closely specced machines. Most attempts simply did not work, and those that did resulted in DRI being disabled, which produced syrupy screens on the external monitor.

So the solution is to create an xorg.conf that is exactly what the autoconfig would produce, plus the extra line. 

So here are the steps:

1. Reboot the Aspire and press RETURN when the initial bootloader screen comes up. 

2. Press "a", add a space and then "3" to the end of the kernel boot line, and press RETURN to boot the machine.

3. The machine will boot to a command line instead of the usual graphical login screen. 

4. Log in as root.

5. Run the following:

$ cd /root
$ Xorg -configure

The system will flash the screen and produce a file called xorg.conf.new in the current directory. This file contains all of the details that the configless startup would use.

$ cp xorg.conf.new /etc/X11/xorg.conf

Now locate the section near the end of that file that looks like the one below, and add the line indicated:

SubSection "Display"
    Viewport   0 0
    Depth      24
    Virtual    2048 2048          # Add this line in
EndSubSection

6. Reboot the machine and you will now be able to reconfigure the monitors to support 1024x600 internal and 1024x768 external.
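Once you are back up you can sanity-check the new maximum, and if you prefer the command line, drive the layout with xrandr (LVDS and VGA are the output names the Intel driver gives on my machine; yours may differ):

$ xrandr -q | head -1
Screen 0: minimum 320 x 200, current 1024 x 600, maximum 2048 x 2048
$ xrandr --output VGA --mode 1024x768 --right-of LVDS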

Note: Because of limitations in the driver, if you use a value of more than 2048 in either dimension, DRI will be disabled and the screen will become very slow again. This unfortunately limits the external display to 1024 pixels wide alongside the internal panel. If you want it to go wider, turn off the internal LCD panel in Display Properties. This is being fixed in the next version of the driver, and this whole procedure may become redundant by then, as it will also support configless virtual sizes.