Phantom pain with windfarm on diskless iMac G5

Benjamin Herrenschmidt benh at kernel.crashing.org
Thu Jan 5 10:41:29 EST 2006


On Wed, 2006-01-04 at 15:20 +0100, Markus Demleitner wrote:
> Hi,
> 
> I tried 2.6.15 on my diskless iMac G5 clients today, resulting in
> 747 emulation mode (vrooom...!).  It turns out windfarm was
> querying the hard disk temperature sensor, which usually is mounted
> on the mounting bracket Apple uses.  We made the mistake of removing
> these (from about 30 machines:-(), which in turn made
> windfarm_lm75_sensor.c:wf_lm75_get return ffff, which translates into
> about 255 degrees celsius.  No wonder windfarm pumped like there's no
> tomorrow.
> 
> I've "fixed" this by returning some fixed low temperature if I see
> ffff in wf_lm75_get for now, but I *guess* it would be nice to have
> some way to detect the absence of the sensor (and tell it from a
> simple failure).  However, the OF device trees still list the sensor
> and even the hard disk itself even on the diskless machines.

The problem is to differentiate between a diskless machine and a
defective sensor. In the latter case, you _want_ to pump the fans.
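
For reference, here is a rough sketch of where the "about 255 degrees" figure comes from. It assumes the raw 16-bit reading is shifted left by 8 and then interpreted as a 16.16 fixed-point Celsius value; that matches the number quoted above but is an assumption about the driver, not a copy of it. The function names are made up for illustration.

```c
#include <stdint.h>

/*
 * Assumption: the 16-bit register value is shifted left by 8 and the
 * result is read as 16.16 fixed-point degrees Celsius.
 */
static int32_t raw_to_fixed(uint16_t raw)
{
	return (int32_t)raw << 8;
}

/* Convert a 16.16 fixed-point value to floating-point degrees. */
static double fixed_to_celsius(int32_t fixed)
{
	return fixed / 65536.0;
}

/*
 * Hypothetical helper: an all-ones reading is what the bus returned on
 * the machines with the sensor removed. By itself it cannot tell a
 * missing sensor from a broken one.
 */
static int reading_is_all_ones(uint16_t raw)
{
	return raw == 0xffff;
}
```

Under that assumption 0xffff becomes 0x00ffff00, just under 256 degrees, while a reading of 0x1900 comes out as 25 degrees.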

> Even if there were a way to detect the absence of the sensor, there's
> still the problem that windfarm_pm81.c insists on having a hd temp
> sensor to work, so a fix would probably require spoiling that
> wonderful
> 	if (sensor_cpu_power && sensor_cpu_temp && sensor_hd_temp)
> in there and replacing it with something like
>   if (sensor_cpu_power && sensor_cpu_temp && (!machine_has_hd()
> 		|| sensor_hd_temp))
> where I have no idea how to implement machine_has_hd().  A further
> similar hack would spoil wf_smu_sys_fans_tick, and ugliness prevails.
> 
> In short: Am I doomed to hack the kernels of my diskless clients to
> eternity (or retrofit the sensors)?  Or is there a sane way to treat
> that kind of problem?

Hrm... That isn't trivial as I don't see a clean way to detect that the
HD is not there from windfarm without doing gross hacks, unless we can
somewhat rely on the device-tree there...

What we could do is:

 - Make pm81 start the control loops regardless of the presence of the
sensor, and have the control loop itself set the disk fan to an
arbitrary low value if the sensor is not there. If the sensor kicks in
"later" (because lm75 loads later), it will automatically start using
the full control loop. That is easy.

 - In lm75 itself, in case of failure, add a little hack that tests if
the disk is present by looking in the device-tree, provided again that
there is a node for it that can be detected... If not, then return an
arbitrarily low temperature instead of a failure.
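
A minimal userspace sketch of the two ideas above, to make the logic concrete. Every name, constant and the toy control loop here is hypothetical and not taken from windfarm; the device-tree check is reduced to a boolean argument, and as noted in the quoted mail it may not be reliable, since the tree lists the disk even on diskless machines.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* All names and values below are hypothetical placeholders. */
#define HD_FAN_LOW_RPM	1200		/* arbitrary low disk fan speed */
#define HD_FAN_MAX_RPM	4000
#define FAKE_LOW_TEMP	(30 << 16)	/* 30 C in 16.16 fixed point */
#define SENSOR_FAILED	(-1)

/* Toy stand-in for the real loop: 100 RPM per degree above 30 C, clamped. */
static int hd_control_loop(int32_t temp)
{
	int rpm = HD_FAN_LOW_RPM + (temp / 65536 - 30) * 100;

	if (rpm < HD_FAN_LOW_RPM)
		rpm = HD_FAN_LOW_RPM;
	if (rpm > HD_FAN_MAX_RPM)
		rpm = HD_FAN_MAX_RPM;
	return rpm;
}

/*
 * First idea: the tick always runs. With no sensor registered yet
 * (hd_temp == NULL) the disk fan sits at a low value; once a sensor
 * shows up, the full control loop takes over.
 */
static int hd_fan_tick(const int32_t *hd_temp)
{
	if (hd_temp == NULL)
		return HD_FAN_LOW_RPM;
	return hd_control_loop(*hd_temp);
}

/*
 * Second idea: on a failed read, only report a failure if the device
 * tree says a disk is present; otherwise return a low temperature.
 */
static int lm75_get(bool read_ok, int32_t temp, bool disk_in_tree,
		    int32_t *value)
{
	if (read_ok) {
		*value = temp;
		return 0;
	}
	if (!disk_in_tree) {
		*value = FAKE_LOW_TEMP;
		return 0;
	}
	return SENSOR_FAILED;	/* real failure: caller should pump the fans */
}
```

The point of keeping the failure path in lm75_get is that a genuinely defective sensor on a machine with a disk still ends up with the fans running hard.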

Either that or a module/kernel command line option... The latter is
easier but less "neat" :)

Ben.
 




More information about the Linuxppc64-dev mailing list