February 14, 2012, 01:18 PM
Dead Things

Originally Posted by Mathus
I just don't understand how putting a few CPU cycles towards GPU Folding makes that much of an impact. I would think that the PPD produced from the GPUs would be more than the lost PPD from the CPU. Isn't the CPU utilization from GPU folding like 3-5% for NVIDIA cards? If this is the case then I can't see it being better to only run CPUs instead of CPU and GPU in the same system.

Has somebody actually tested this? Perhaps somebody can clarify the situation for me and Linus a little as I am sure he is wondering the same thing.
Good question - and yes, it's been tried and tested numerous times by me and many others here who can attest to the same thing. The issue stems from the "S" part of "SMP" - that is, symmetry. Even though a GPU may take only a small proportion of cycles away from a single core in a multi-threaded machine, it wreaks havoc on processing symmetry, such that many cycles are displaced and have to be re-assigned. These cycles are appended to the end of other cycles already queued in the same processing thread, in accordance with a round-robin schedule. Your kernel scheduler takes care of this in a manner that is seamless for most usage profiles, but it hurts performance significantly when dealing with a well-optimized multi-threaded process that is highly sensitive to asymmetry, like FAH.

The problem here is that every time this happens, all of the other processing threads have to stop and wait for the extra appended cycles on the one thread to complete. So while the GPU process may occupy merely 5% of the cycles of a thread, its net effect on the SMP process is much greater. I've seen boxes wherein the net time the FahCore has spent waiting for cycles to catch up has been in excess of 60%. That means out of every 100 minutes of processing time, just 40 were spent doing actual work.
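
To put rough numbers on the waiting described above, here is a minimal sketch of a single synchronized step. It is a toy model, not how the FahCore or the kernel scheduler actually works: the function name `barrier_step`, the lock-step assumption and all the millisecond values are made up for illustration. It does not try to reproduce the 60% figure, which would depend on scheduler behaviour this model ignores.

```python
def barrier_step(n_threads, work_ms, stall_ms):
    """Model one synchronized step of an SMP job (hypothetical toy model).

    Every thread needs work_ms of CPU time, and exactly one thread also
    loses stall_ms to another process (for example a GPU client).
    Returns the shares of total machine time that were stolen, idle and useful.
    """
    step_ms = work_ms + stall_ms                   # the step ends when the slowest thread finishes
    capacity = n_threads * step_ms                 # thread-milliseconds available in the step
    stolen = stall_ms / capacity                   # share used by the other process
    idle = (n_threads - 1) * stall_ms / capacity   # share the other threads spent waiting
    useful = n_threads * work_ms / capacity        # share spent on actual work
    return stolen, idle, useful


# Made-up example: 4 threads, 95 ms of work each, 5 ms stolen from one thread.
stolen, idle, useful = barrier_step(n_threads=4, work_ms=95, stall_ms=5)
print(f"stolen {stolen:.2%}, idle {idle:.2%}, useful {useful:.2%}")
```

With these numbers the other process uses 1.25% of the machine, but a further 3.75% is lost to threads sitting idle, and that idle share grows with the thread count.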

The faster the machine (and thus, the more dependent upon quick return bonuses, or QRBs), the more negative this effect is. In other words, while GPU folding does the same thing to a Q6600 as it does to quad Opteron 6174s, for example, the net effect in terms of PPD is far smaller on a Q6600 than on the quad-Opty box, because the QRBs make up a much larger share of the faster machine's SMP PPD.
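
A rough sketch of why the bonus magnifies any slowdown. It assumes the bonus formula as commonly quoted from the Folding@home points FAQ, final points = base points × max(1, sqrt(k × deadline / elapsed)); treat that as an assumption and check it against the current FAQ. The function name `ppd` and the base points, `k` and deadline values are invented for the example and do not belong to any real project.

```python
import math


def ppd(base_points, k, deadline_days, elapsed_days):
    """Points per day for one work unit under the assumed QRB formula."""
    bonus = max(1.0, math.sqrt(k * deadline_days / elapsed_days))
    return base_points * bonus / elapsed_days


# Hypothetical work unit: 1000 base points, k = 2, 8-day deadline.
fast = ppd(1000, 2.0, 8.0, elapsed_days=1.0)   # returned in 1 day
slow = ppd(1000, 2.0, 8.0, elapsed_days=4.0)   # same unit, 4x slower
print(fast, slow, fast / slow)

# Same comparison with the bonus switched off (k = 0).
print(ppd(1000, 0.0, 8.0, 1.0) / ppd(1000, 0.0, 8.0, 4.0))
```

Under this formula a 4x slowdown costs 8x the PPD while the bonus is active, against 4x without it, so PPD falls off as elapsed time to the power of 1.5 rather than linearly. The loss in absolute PPD is therefore largest on the machines that were returning work fastest.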

...But most folders don't really care about the "why" of this kind of stuff - mainly the "how" of maximizing the PPD of the hardware available to them, hence the recommendation to segregate GPU and CPU folding machines.

edit - @ the OP, my recommendation would be - if possible - to choose a relatively high-profile member of the NCIX folding team from the GVRD who could maybe even make an appearance on the episode. That way, new members will see the guy from the episode on the team and might be more inclined to ask him questions rather than inundate you with them.

Last edited by Dead Things; February 14, 2012 at 01:27 PM.