The big upgrade

October 28th, 2007

I had been planning to upgrade my main PC for some time, and last week I finally ordered the necessary parts. I replaced the mainboard (MSI K7N2G-ILSR -> GigaByte G33-DS3), the CPU (AMD Athlon XP 2600+ -> Intel Core 2 Duo E6750) and the RAM (2×256MB + 1×512MB Kingston HyperX DDR333 -> 2×1GB Corsair PC2-6400). Of course, this also meant switching from a 32-bit system to a 64-bit one.

The hardware changes took me approximately two hours (since I cleaned the whole tower and had to help my sister build her own PC from my old hardware ;)). Then I started re-partitioning one of my two main hard disks (which were part of a RAID-1 array) and installed Gentoo on it (this time using XFS as the file system and running a fully unstable system, ACCEPT_KEYWORDS="~amd64"). Then came the first reboot: while the kernel config looked OK, the damn thing didn’t boot because the root fs could not be mounted. I was about to blame XFS and switch back to ext3 when I realized it wasn’t a file system problem but a hard disk problem. And indeed, booting from the livecd again revealed a lot of ‘Medium errors’ when trying to work with the disk, and the `badblocks` tool showed 45 bad blocks… Great. :) I guess the bad blocks didn’t appear overnight but had been there for some time, and I didn’t notice them because of the RAID setup or something. “Luckily” this drive was pretty new (Oct 2006), so it should still be under warranty and I think I have a good chance of getting a replacement.

Anyway, as a replacement usually takes time and I don’t want to risk data loss, I ordered two 500GB drives yesterday… hopefully they are a bit faster (they should be: they’re both SATA-II and said to be pretty fast, and with the new board I can finally benefit from SATA-II). Hopefully they’ll arrive on Monday or Tuesday so I can complete the hardware side of my upgrade.

So, after the second Gentoo install on the second harddisk (with some quirks to not destroy any data while still preparing for a nice partitioning scheme) I’m almost back.

Yesterday I solved the two outstanding software problems. The first: my board has a nice Intel graphics chip which should have worked out of the box. Indeed, 2D graphics worked well, but 3D did not work at all. While I was able to enable DRI etc. in xorg.conf, any tool which used OpenGL made X segfault; even glxinfo caused that. It turned out that mesa-7.0.1 did not support the chip on G33-based boards yet, but luckily the patches are already in upstream git and will be part of mesa-7.0.2, so I just patched mesa locally, filed bug 197273 and hope for the best. :)

The one remaining issue was the TV card. I own a Pinnacle PCTV Stereo card which has always needed some strange hacks. When building the first kernel for this machine I chose to include support for bt87x (this card has a bt878 chip, so this is the right driver), which was the wrong choice. :) While I got a picture in tvtime and was even able to switch channels, I did not get any audio output. Instead, there was only a short noise when starting tvtime or switching channels. Then I remembered the hack I used on the old install: I had to build bttv as a module, unload it after boot once udev had autoloaded it, and load it again; then audio suddenly worked. Sadly this didn’t work on this install. I did not get any TV picture at all when using bttv as a module. After some hours of debugging it turned out that the bttv module no longer loaded the proper tuner and audio modules, and once I had discovered that, it was pretty “easy” to put together the commands for working TV with audio:
modprobe -r bttv tuner tvaudio
modprobe msp3400 # the audio decoder I need on this card
modprobe tuner
modprobe bttv
The important thing is that msp3400 needs to be loaded first as this is the driver which works for this card. If bttv is loaded first (which seems to happen when using in-kernel bttv) bttv (or some of its submodules, maybe tvaudio) captures the audio “port” and msp3400 is unable to get access to it…
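
To avoid retyping this after every boot, the sequence could be wrapped in a small shell function. This is only a sketch under assumptions: the function name reload_tv_modules and the MODPROBE override (handy for a dry run with MODPROBE=echo) are made up for illustration, and the real commands need root.

```shell
# Reload the TV modules so that msp3400 is loaded before bttv.
# MODPROBE may be overridden (e.g. MODPROBE=echo) to print the calls
# instead of running them; the override is a hypothetical convenience.
reload_tv_modules() {
    modprobe_cmd="${MODPROBE:-modprobe}"
    # Unload first; ignore errors if a module is not currently loaded.
    $modprobe_cmd -r bttv tuner tvaudio || true
    # msp3400 is the audio decoder this card needs; it has to come first.
    $modprobe_cmd msp3400 || return 1
    $modprobe_cmd tuner || return 1
    $modprobe_cmd bttv
}
```

Calling reload_tv_modules as root (for example from a local start script) performs the same four steps in the same order.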

Oh yes, and today I discovered another issue related to a 64-bit system: proprietary crap again :) I wanted to use flash… and well, the standard procedure seems to be to use a 32-bit browser (as there is no 64-bit flash plugin). I tried that but it sucks. There is no easy way (yes I know, there are ways) to make firefox use a decent GTK+ theme, and additionally it was rather slow. Luckily someone (hello impulze!) reminded me of nspluginwrapper. I think it’s still considered an ugly hack, but it works. I just emerged it and now I’m able to use flash, whee!

aufs - another union fs

July 24th, 2007

If you are ever in need of stacking multiple file systems (or directories) on top of each other, aufs might be an option. I’m currently using it to implement a system similar to HDGuard: normal users should be able to use the system, but after a reboot all changes should be reverted. Using aufs I can easily do that by combining the real root device (read-only) with a tmpfs (read-write). In theory it’s pretty easy, but if you want to change the root fs you have to use an initrd/initramfs… and I’m currently fiddling with some problems there. :)
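
A minimal sketch of how such a stack could be assembled, not the initramfs I am actually fiddling with. The paths /mnt/rw, /mnt/ro and /mnt/union and the helper name aufs_branches are assumptions for illustration; the branch list uses the aufs "br:" mount option, with the writable branch listed first.

```shell
# Build the aufs branch option: a writable branch on top of a read-only one.
# usage: aufs_branches RW_DIR RO_DIR
aufs_branches() {
    rw_dir="$1"
    ro_dir="$2"
    printf 'br:%s=rw:%s=ro\n' "$rw_dir" "$ro_dir"
}

# Hypothetical use as root, with the real root already mounted
# read-only on /mnt/ro:
#   mount -t tmpfs tmpfs /mnt/rw
#   mount -t aufs -o "$(aufs_branches /mnt/rw /mnt/ro)" none /mnt/union
```

Everything written below /mnt/union then lands in the tmpfs and is gone after a reboot, while the read-only branch stays untouched.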

An ebuild for aufs is in my overlay (stolen from sunrise, but updated and improved/fixed).

Today I finally got both iwlwifi (the free Intel Linux wireless drivers for the 3945 chipset) and s2ram (suspend to RAM) working on my HP NX7400 notebook. iwlwifi had always failed for me, stating the hardware rfkill switch was still on. Today I retried with ipw3945 (the binary drivers which had worked for me in the past), as someone in the #ipw2100 IRC channel suggested. As it failed with the same message, I began to worry whether the hardware was still OK. Then I remembered that there is an option for enabling/disabling WLAN in the BIOS setup. I checked and it was enabled, so it was not the cause of the problem. But there was another WLAN-related option: “LAN/WLAN switching”. Don’t ask me what that means, but it was enabled (so it can’t be the kill switch, or WLAN would have worked while that option was enabled). I then switched that option to disabled and voilà, it worked (well, the device was called wlan0_rename, which was not nice, but playing with udev rules a bit solved that).

The other thing I did was a BIOS update, without any specific intention. I had version F.08 before and upgraded to F.0B. HP is one of those companies which think it is funny to provide BIOS updates as .exe files. The best thing is: the .exe file is just a self-extracting archive (sadly `zip` doesn’t want to uncompress it), so if they had just used .zip or some other format for those files (including the ISO images), non-Windows users would not need to get to a Windows PC… The update went smoothly and I decided to try s2ram again. And wow, it worked! Previously I had the problem that the LCD backlight wasn’t switched on again on resume, but this works now. Either the BIOS update or the update to the recently released xf86-video-i810-2.1.0 driver (or both :p) solved that. This only works in X btw; on a text console the backlight still remains off. Maybe I should play with some of the s2ram options, but I’m too lazy. ;)

There still doesn’t seem to be a clear statement on whether the latest nvidia-driver-100* series fixes, or is supposed to fix, the “beryl black window bug” (well, not only beryl, but compiz as well of course). An nvidia engineer posted on nvnews.net that it is not supposed to be fixed yet and will be fixed in a later release instead. At the same time there are user comments which state that they no longer experience the problem… So I decided to give those drivers a try (they are not yet in the official gentoo tree, but there is already an ebuild for them in bug 175674). Performance seemed to have dropped a bit, but I was not sure whether it was the driver’s fault or just the usual variation in performance when using compiz (“compcom”) live ebuilds… Anyway, there is a pretty interesting thread at nvnews.net where someone talks about some Xorg (and other) options which are supposed to improve performance for beryl etc. a bit: http://www.nvnews.net/vbulletin/showpost.php?p=1276594&postcount=10 And indeed, it made a big difference for me; it’s much faster than before (I only applied the suggestions for xorg.conf, nothing else).

ftp setup done

April 22nd, 2007

And an easier one: the structural changes to my proftpd setup are done. I moved away from mod_sql here as well and replaced it with files (AuthUserFile). It is strange, though, that there is an extra USE flag for that on gentoo, which was a bit annoying to find out… (this feature does not have any external dependencies and disabling it does not reduce compile time that much, so there is no reason to keep the flag, imo).

mail setup done

April 21st, 2007

With some help from darix I finally managed to get postfix with virtual aliases from text files + dovecot (instead of courier-imap) with virtual users working! \o/

I always thought I was using the IDLE extension for IMAP in Thunderbird, but that does not seem to have been the case, because courier-imap did not support it (maybe I just misconfigured it, who knows). With dovecot I get notifications about new mails immediately... When I send a mail to myself, the mail appears even before Thunderbird is able to close the ‘Sending mail’ window! And dovecot seems to be much faster when copying/deleting/moving emails as well. :)

Tomorrow I’ll do the web part which is going to be more fun I think, but today I certainly learned a lot about postfix. I was backed up by some howtos though, namely http://www.howtoforge.com/linux_postfix_virtual_hosting and the dovecot wiki

And btw I also installed DSPAM (which replaces the previous memory hog amavis + clamav + spamassassin…) using http://gentoo-wiki.com/HOWTO_Spam_Filtering_with_DSPAM_and_Postfix. I don’t know if it is working yet, as darix gave me some postfix rules which already prevent a lot of spam senders from successfully connecting:

smtpd_sender_restrictions = reject_unknown_sender_domain
smtpd_client_restrictions = permit_sasl_authenticated, permit_mynetworks, reject_unknown_client
smtpd_helo_restrictions = permit_mynetworks, reject_invalid_hostname

changing server setup

April 21st, 2007

Yesterday I wanted to set up mailman… and have still not been able to get it working. This server is completely messed up. I’m running SysCP (patched for lighty support) and now I realize that it’s overkill for me. I have 7 “customers” in SysCP and 5 of them are mine… I have absolutely nothing against SysCP, it’s a nice tool for webhosting, but for me it has just become annoying. The lighty config file with my custom changes grows and grows, and at some point it will probably be longer than the one generated automatically by SysCP… that doesn’t make sense. The only reason I kept using SysCP was the mail setup (postfix + courier-imap using mysql), because I hardly had any knowledge about it. I don’t claim to have much more now, but I think I will try to replace courier-imap with dovecot and start a postfix config from scratch. Additionally I’ll kill lighty-1.4 and replace it with lighty-1.5 with externally spawned fastcgi servers.

So, if any services on this server do not work in the next hours/days, don’t be surprised. :)

Maybe I can get a working mailman setup then…

ipv6 with ayiya/sixxs

April 16th, 2007

I have had a SixXS account since 2005 and already had it working, but at some point it stopped working, or I was too lazy to set it up again (maybe it was after the Debian -> Gentoo switch). :) My server had ipv6 connectivity through SixXS all the time; static routes seem to be easier to set up.

Recently I wanted to set it up again for my workstation but was not able to get it to work. With some help from jokey I found out that the biggest problem was my version of aiccu, the tool which is supposed to automatically set up the ipv6 config for SixXS. After updating to the latest unstable version (2007.01.25, which is, thanks to jokey, stable by now) I was able to ping my PoP using ipv6, which was at least some kind of success. :)

After that I found out that aiccu fails to set up a working default route and that this was the reason why I could not reach any other ipv6 hosts. I fixed that by adding the appropriate route manually:

ip route add 2000::/3 dev tun0

Now I’ve got a working ipv6 connection again. :)
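
Since the route has to be re-added whenever the tunnel comes up, a small guard that only adds it when it is missing might be handy. This is a sketch under assumptions: the function name ensure_v6_route and the IP override variable are made up for illustration, tun0 is simply the device name from the command above, and the real call needs root.

```shell
# Add the 2000::/3 route via the tunnel device unless it already exists.
# usage: ensure_v6_route [DEVICE]   (DEVICE defaults to tun0)
# IP may be overridden to substitute another command for "ip" (hypothetical).
ensure_v6_route() {
    dev="${1:-tun0}"
    ip_cmd="${IP:-ip}"
    # "ip -6 route show PREFIX dev DEV" prints nothing if no such route exists.
    if [ -z "$($ip_cmd -6 route show 2000::/3 dev "$dev")" ]; then
        $ip_cmd -6 route add 2000::/3 dev "$dev"
    fi
}
```

Running ensure_v6_route after aiccu has brought the tunnel up has the same effect as the manual command, but can be repeated without an error about an existing route.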

For a project I recently needed a way to access a server behind a NAT from another server on the internet. This is no problem for ssh, but keeping the forward alive is not that easy, especially since the server behind the NAT is only on a dial-in account, which means its public IP changes every 24h and all connections get dropped when this happens.

I tried writing a small shell script for respawning ssh when it exited but it really did not work as reliably as I expected.

Luckily someone recently mentioned a tool which was written exactly for this purpose: autossh

Now I’m spawning it like this on the server behind the NAT:

autossh -M0 -o ServerAliveCountMax=2 -o ServerAliveInterval=15 \
  -o ExitOnForwardFailure=yes myserver -N -R5001:127.0.0.1:2800

And it’s even in portage :)
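
For more than one forward, the same call could be wrapped in a small function. This is a sketch, not what I am running: the function name reverse_tunnel and the AUTOSSH override (for printing the command with AUTOSSH=echo) are assumptions. The options are the ones from the command above: -M0 turns off autossh’s own monitoring port, so a dead connection is detected through ssh’s ServerAlive checks (2 missed probes at 15 second intervals), and ExitOnForwardFailure makes ssh exit when the remote port cannot be bound, so that autossh retries.

```shell
# Open a reverse tunnel that autossh restarts whenever ssh exits.
# usage: reverse_tunnel HOST REMOTE_PORT LOCAL_PORT
# AUTOSSH may be overridden (e.g. AUTOSSH=echo) to inspect the command.
reverse_tunnel() {
    host="$1"
    remote_port="$2"
    local_port="$3"
    ${AUTOSSH:-autossh} -M0 \
        -o ServerAliveCountMax=2 \
        -o ServerAliveInterval=15 \
        -o ExitOnForwardFailure=yes \
        -N -R "${remote_port}:127.0.0.1:${local_port}" "$host"
}
```

reverse_tunnel myserver 5001 2800 is then equivalent to the command shown above.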

It seems that someone finally found a working workaround for the annoying beryl black window bug with proprietary nvidia drivers. Translating the description there into command-line arguments results in

  beryl --replace --use-tfp --force-aiglx --xgl-binding settings &
and that’s what I’m currently using. tvtime still runs smoothly (in contrast to the other workaround, which was much slower for me). I currently have a lot of windows open (thunderbird, firefox, more than 40 terminals (xfce-Terminal), tvbrowser, tvtime, scite…) and so far I have not been hit by the black window bug. Performance is a bit slow when moving the terminals around, but I guess that’s related to the transparency the terminals use. Without that workaround I would never have gotten 40 terminals spawned simultaneously :)

I already knew that strange problems can occur with Java GUIs under compositing managers like beryl… but I’m not too happy that it has now hit me with TV-Browser 2.5 as well. Don’t ask me what exactly this changes, but it helps: run export AWT_TOOLKIT=MToolkit, then start TV-Browser as usual, and that’s it.

I picked all of this up from the bug report about this problem.
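
To avoid setting the variable for the whole session, it could be limited to a single program. A minimal sketch, assuming a helper called with_mtoolkit (a made-up name) and that the TV-Browser launcher is called tvbrowser on your system:

```shell
# Run a single command with AWT_TOOLKIT=MToolkit set only for that process.
# usage: with_mtoolkit COMMAND [ARGS...]
with_mtoolkit() {
    env AWT_TOOLKIT=MToolkit "$@"
}
```

with_mtoolkit tvbrowser then starts TV-Browser with the workaround while other Java programs keep the default toolkit.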