Mountains

Sunday, October 24, 2010

Too Much Thrift...

I have become a little obsessed with the random crashing on the Thinkpad (a T42). Having a computer that hangs every 20-120 minutes is really weird and annoying.

Hangs without logfile notes are evil. That leaves me to troubleshoot by trial and error, with Google in the middle. This also makes the problem more intriguing than average.

Troubleshooting Timeline.
0) Discovered random hangs. Did a complete reinstall of Xubuntu 10.10 from CD. Googled a lot. Found a ThinkWiki article that mentioned conflicts between BIOS power management (pm) and Linux pm.
1) Disabled bios PM: this helped a lot.
2) Installed laptop-mode-tools to try to push things a little longer. Crash reappears!
 -Maybe crash is related to some specific aspect of powersaving?
3) Fell back to basic powersaving (whatever that is... more later). I thought it was fixed until the computer hung at 10% battery. Maybe xfce4-power-manager can do something to cause a crash too, but it's less aggressive? Maybe some other aspect of pm gets wrapped around the axle in panic-powersave?
4) Google more...


How does linux power management work?
Have you ever wanted to know how linux manages power on your computer?

You want to know, trust me:
- If you're a programmer, you'd love to see the power, and want to spend every waking moment of the next week trying to harness it.
- If you're a more normal human being, you need to see the horrorshow mess that is going on under the hood.

In short, the community has taken a lot of jabs at the problem, so now we are left with a really pissed off bull running around town with lots of spears stuck in it, so it's bleeding and shitting all over the place and destroying everything in its path. GO TEAM!


Linux power management has gone through a lot of iterations, and there is a mish-mash of solutions to the power management issue. From what I learned today... and links below (no authority guaranteed here or there, mind you!):

  • speedstep-centrino became acpi-cpufreq
  • acpi-cpufreq got rolled into the Ubuntu kernel (2.6.31 and later)... so no module
  • something about Ubuntu reverting to an acpi-cpufreq module under some vague rumored condition
  • cpufreqd and powernowd are now userspace daemons: that is something you can control w/o loading modules from the command line....
  • powernowd is the preferred method of control, though cpufreqd could still be in the wild.
  • hal, although deprecated, could still be messing with your pm settings.
  • udisks-daemon/udisks can exhibit some control over hard drive power usage. Provides disk interface to freedesktop.org dbus interface.
  • apmd (advanced power management daemon) was used to control power on older machines, but was supplanted by acpid. (also deprecated)
  • acpid (Advanced Configuration and Power Interface daemon): Modern machines have ACPI, and this package executes commands based on changes in the computer's hardware state (temperature, power, buttons, etc.)
    • Power status can be found in /proc/acpi
    • acpid executes commands based on events, using what it finds in /etc/acpi/ and /etc/acpi/events. Notably, the ac and battery commands are in /etc/acpi/events.
  • sleepd can put the machine to sleep... seems very basic. not part of standard ubuntu install
  • xfce4-governor-plugin used HAL to change the CPU speed, but is now deprecated along with hal
  • pm-utils appears to be the current architecture for basic power control. I am not sure if it is part of the standard Ubuntu installation. Maybe just on laptops? There are some scripts in /usr/sbin/ that determine this.
    • pm-utils 1.4.x that ships with Xubuntu 10.10 is incompatible with laptop-mode-tools. It says it right on the freedesktop.org webpage!
      • apt-get removes pm-utils when it installs laptop-mode-tools.
  • laptop-mode-tools is a set of scripts that provides fairly aggressive power management, making attempts to control almost every aspect of a machine. It modulates daemons, kernel laptop mode, and other hardware settings (hdparm) to squeeze power savings out of a machine
    • curiously, apt-get recommends installing pm-utils, even though the packages hate each other now.
  • upowerd/upower monitors power devices  and posts changes on the freedesktop.org dbus interface. Running upower -m will show the messages when the computer is plugged/unplugged. Things that listen on dbus might respond...
  • xfce4-power-manager monitors and controls power behaviour through dbus signals
  • gnome-power-manager: a C-based power manager that uses upower, dbus, and libnotify. Ironic quote:
"Power management is an essential job on portable computers, and becoming more important on todays high-powered desktops. It uses many complex (and sometimes experimental) parts of the system - each of which are slightly different, and may contain quirks to work around. The power management policy could be influenced and tweaked by an huge number of options, and each new laptop model brings more possibilities and options. This should all work in the background without even being noticed by the user." (emphasis theirs!)
  •  hdparm is used to set the power management of hard drives
  • sdparm is the bastard cousin of hdparm, used for things that pretend to be, or really are, scsi. (for those poor bastards who still have it).
  • Network I/O can be optimized through
    • iwpriv (for older wireless)
    • /sys/bus/pci/drivers/iwl*/...(for newer wireless)
    • ethtool (for ethernet)
  • xrandr: controls active video outputs
  • x.org dpms module controls the lcd on/off state
  • lcd backlight control: mysterious! xbacklight is listed...
  • The virtual memory subsystem tries to optimize things when there is a limited power situation, with variables set in /proc/sys/vm/
  • cpufrequtils: provides some scripts that allow users to easily control the cpu speed. (possible to do this by writing values to /proc and /sys...)
  • uswsusp allows suspend and resume by writing system state to disk
  • noflushd: buffers hard drive writes to allow hard drives to spin down as long as possible
  • rovclock, radeontool: control radeon card based clock speed and lcd brightness
    • "Dynamic Clocks" option in the xorg.conf file also changes gpu speed.
    • kms seems to have replaced xorg.conf
      • unless you've disabled kms, then you need xorg.conf
      • how kms interacts with power management and cpu speed is mysterious and a popular source of hangs that are posted in discussion boards
      • This is the best info I have on using /sys/* to manage radeon: http://wiki.archlinux.org
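
Several of the items above end up as plain files under /sys, so a quick way to see what the CPU side is doing is to read them directly. Here is a minimal sketch, assuming the usual cpufreq layout under /sys/devices/system/cpu/cpuN/cpufreq/ (the function name and the choice of files are mine, and not every kernel or driver exposes all of them):

```python
from pathlib import Path

# Assumed default location of the cpufreq sysfs tree; layout varies by kernel.
CPU_ROOT = Path("/sys/devices/system/cpu")


def read_cpufreq(cpu_root=CPU_ROOT):
    """Return {cpu_name: {setting: value}} for every CPU exposing cpufreq."""
    wanted = (
        "scaling_governor",
        "scaling_cur_freq",
        "scaling_min_freq",
        "scaling_max_freq",
    )
    report = {}
    for cpufreq_dir in sorted(Path(cpu_root).glob("cpu[0-9]*/cpufreq")):
        settings = {}
        for name in wanted:
            try:
                settings[name] = (cpufreq_dir / name).read_text().strip()
            except OSError:
                # File missing or unreadable on this kernel/driver: skip it.
                continue
        report[cpufreq_dir.parent.name] = settings
    return report


if __name__ == "__main__":
    for cpu, settings in read_cpufreq().items():
        print(cpu, settings)
```

This only reads. Changing the governor means writing to the same files as root, which is exactly the kind of fiddling that might trigger the hang, so I'd snapshot first and change one thing at a time.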

It appears that there are three general pathways for power management. ACPI finds out that the power state has changed, then dbus, hal, and sysfs get updated. From there, the various daemons and power managers take over, firing off scripts that control the sundry power settings on a modern machine.
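
The sysfs leg of that chain is the easiest one to poke at by hand. A minimal sketch, assuming the kernel exposes power sources under /sys/class/power_supply with the usual type, online, status, and capacity files (the function names are my own, and device names like AC or BAT0 differ between machines):

```python
from pathlib import Path

# Assumed default location; each subdirectory is one power source.
POWER_SUPPLY_ROOT = Path("/sys/class/power_supply")


def power_sources(root=POWER_SUPPLY_ROOT):
    """Return {device_name: {attribute: value}} for each power source found."""
    sources = {}
    root = Path(root)
    if not root.is_dir():
        return sources
    for device in sorted(root.iterdir()):
        info = {}
        for attribute in ("type", "online", "status", "capacity"):
            try:
                info[attribute] = (device / attribute).read_text().strip()
            except OSError:
                # Batteries have no "online", adapters have no "capacity".
                continue
        sources[device.name] = info
    return sources


def on_ac_power(root=POWER_SUPPLY_ROOT):
    """True/False if a mains adapter reports its state, None if none is found."""
    mains = [s for s in power_sources(root).values() if s.get("type") == "Mains"]
    if not mains:
        return None
    return any(s.get("online") == "1" for s in mains)


if __name__ == "__main__":
    print("on AC:", on_ac_power())
    for name, info in power_sources().items():
        print(name, info)
```

Running it before and after pulling the plug shows whether the kernel noticed the change at all, independent of whatever the daemons then do with it.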

It is probably important to disable the xfce/gnome/kde power managers too...

So, finding the source of the crash in my computer entails fiddling with each subsystem until one generates a crash. Fun times.
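
Since the hangs leave nothing in the logs, one way to make the fiddling less blind is to keep my own log: a heartbeat that gets forced to disk every few seconds, so the last line before a freeze at least says when it happened and what state things were in. This is a sketch of an idea, not something I can promise will catch anything; the file name, interval, and the collect hook are all made up for illustration:

```python
import os
import time
from datetime import datetime


def write_heartbeat(log_path, note=""):
    """Append one timestamped line and force it to disk before returning."""
    line = f"{datetime.now().isoformat(timespec='seconds')} {note}\n"
    with open(log_path, "a") as log:
        log.write(line)
        log.flush()
        # fsync so the line is handed to the disk and not left in a buffer.
        os.fsync(log.fileno())
    return line


def run(log_path, interval_seconds=10, beats=None, collect=lambda: ""):
    """Write a heartbeat every interval; beats=None means run until killed."""
    count = 0
    while beats is None or count < beats:
        write_heartbeat(log_path, collect())
        count += 1
        if beats is None or count < beats:
            time.sleep(interval_seconds)


if __name__ == "__main__":
    # Hypothetical file name; collect could return governor, AC state, etc.
    run("heartbeat.log", interval_seconds=10)
```

The catch is that forcing a write every few seconds keeps the disk awake, which itself changes the power-saving behaviour under test, so a long interval (or a USB stick as the log target) might be the saner choice.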

Fun things to read about Power Management:

1 comment:

  1. So, Like, is this a common problem or is it, I wonder peculiar to the stinkpad of your vintage?
