Hardware Canucks > HARDWARE CANUCKS COMMUNITY > HardwareCanucks F@H Team

    
  #1 (permalink)  
Old February 10, 2009, 06:10 PM
Alwaysrun's Avatar
Allstar
F@H
 
Join Date: Sep 2008
Location: Qualicum Beach BC
Posts: 800

Help NANS detected on gpu

I've been getting this error a few times since I installed the SMP console client, and it's a pain because it pauses for 24 hours saying my EUE limit has been reached. Any ideas how to fix this? If I shut down the client and restart it, sometimes it starts OK and sometimes it gives me the same error. I put my 260 back on stock settings but I still get these random UNSTABLE_MACHINE errors. :help:

[01:53:03] Entering M.D.
[01:53:09] Working on Protein
[01:53:10] Client config found, loading data.
[01:53:10] Starting GUI Server
[01:53:10] mdrun_gpu returned
[01:53:10] NANs detected on GPU
[01:53:10]
[01:53:10] Folding@home Core Shutdown: UNSTABLE_MACHINE
[01:53:13] CoreStatus = 7A (122)
[01:53:13] Sending work to server
[01:53:13] Project: 5768 (Run 9, Clone 54, Gen 145)
[01:53:13] - Error: Could not get length of results file work/wuresults_04.dat
[01:53:13] - Error: Could not read unit 04 file. Removing from queue.
[01:53:13] - Preparing to get new work unit...
  #2 (permalink)  
Old February 10, 2009, 06:43 PM
LCB001's Avatar
Folding Captain
 
Join Date: Feb 2008
Location: Aylmer QC.
Posts: 1,774

Try deleting the work folder, queue.dat, unitinfo and FahCore_11 for the GPU. Reboot and restart the client; it should download replacements. Sometimes it gets in a loop and keeps trying the same WU. If that doesn't work you might have to reinstall...
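
For anyone who would rather script the cleanup, here is a minimal Python sketch of the step described above. It moves the named items into a backup folder instead of deleting them, which is a swapped-in, more cautious variant of the advice. The function name, the backup folder name, and the wildcard patterns (used so that any file extension is matched) are assumptions for illustration, not anything the client provides. Stop the client before running something like this.

```python
import shutil
import time
from pathlib import Path

# Names taken from the post above. The trailing wildcards are an assumption
# so that files with an extension are matched as well.
PATTERNS = ["work", "queue.dat", "unitinfo*", "FahCore_11*"]


def set_aside_client_files(client_dir, backup_root=None):
    """Move the GPU client's work files into a backup folder.

    Moving instead of deleting keeps the files around in case they are
    wanted later. Returns the names of the items that were moved.
    """
    client_dir = Path(client_dir)
    if backup_root is None:
        # Hypothetical naming scheme: a timestamped folder inside client_dir.
        backup_root = client_dir / ("backup_" + time.strftime("%Y%m%d_%H%M%S"))
    backup_root = Path(backup_root)

    moved = []
    for pattern in PATTERNS:
        for path in sorted(client_dir.glob(pattern)):
            backup_root.mkdir(parents=True, exist_ok=True)
            shutil.move(str(path), str(backup_root / path.name))
            moved.append(path.name)
    return moved
```

Calling `set_aside_client_files` with the folder your GPU client runs from should leave everything else (such as your settings) in place, so the client can fetch fresh copies on the next start.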
__________________
Folding For Team 54196

  #3 (permalink)  
Old February 10, 2009, 06:53 PM
3.0charlie's Avatar
3.0 "I kill SR2's" Charlie
F@H
 
Join Date: May 2007
Location: Laval, QC
Posts: 9,613

Make sure your OC was stable and temps were fine.

Which drivers are you running?
__________________
Hydro-Quebec is salivating...
  #4 (permalink)  
Old February 10, 2009, 07:20 PM
Alwaysrun's Avatar
Allstar
F@H
 
Join Date: Sep 2008
Location: Qualicum Beach BC
Posts: 800

ATM charlie I have both my CPU and GPU running stock, and temps at load are excellent for both: CPU 44C, GPU 68C.

I'm using 181.20 from December but I see they came out with 181.22 on the 22nd of January. Maybe update those? Strange that this only started happening once I began using the SMP client, though.

LCB I'll do as you suggested and see if a sticky wicket is gumming up the works here. BTW I haven't got around to lowering the CPU usage in the SMP client yet. I dunno if that may be causing some issues, but I did unlock the cores as you suggested earlier and I noticed the GPU works a bit faster now that it can utilize those unused clock cycles. (Seems it was losing the fight for CPU power against the SMP client hogging it before.)

Thanks gents.
  #5 (permalink)  
Old February 10, 2009, 07:58 PM
chrisk's Avatar
Folding Captain
 
Join Date: Jul 2008
Location: GTA, Ontario
Posts: 7,401

I am wondering if it's a machine ID issue... Make sure that in the advanced tabs for the GPU and the CPU clients you have different machine IDs selected (i.e. GPU ID set to 1, CPU set to 3, etc.) or they can conflict with each other. Do that first, and if the numbers were the same, change them, and then delete the files as stated by LCB001.
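
If you would rather check the IDs from the config files than from the GUI, something like the Python sketch below could do it. It rests on an assumption that each client folder has a `client.cfg` with a `machineid=<n>` line under a `[settings]` section; look at your own file first, since that layout is not confirmed anywhere in this thread. The function names are made up for illustration.

```python
import configparser


def read_machine_id(cfg_text):
    """Return the machine ID from client.cfg-style text, or None if absent.

    Assumption: the ID is stored as `machineid=<n>` under `[settings]`.
    """
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(cfg_text)
    return parser.get("settings", "machineid", fallback=None)


def find_conflicts(configs):
    """Map each machine ID shared by several clients to those client names.

    `configs` maps a label you choose (e.g. "SMP", "GPU") to config text.
    """
    by_id = {}
    for name, text in configs.items():
        machine_id = read_machine_id(text)
        if machine_id is not None:
            by_id.setdefault(machine_id, []).append(name)
    return {mid: names for mid, names in by_id.items() if len(names) > 1}


if __name__ == "__main__":
    # Inline sample text stands in for the contents of two client.cfg files.
    smp_cfg = "[settings]\nmachineid=1\n"
    gpu_cfg = "[settings]\nmachineid=1\n"
    print(find_conflicts({"SMP": smp_cfg, "GPU": gpu_cfg}))
```

An empty result means no two clients share an ID; anything else lists the clients that need a different number.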
__________________
Fold for team #54196
  #6 (permalink)  
Old February 10, 2009, 08:10 PM
LCB001's Avatar
Folding Captain
 
Join Date: Feb 2008
Location: Aylmer QC.
Posts: 1,774

Quote:
Originally Posted by Alwaysrun View Post
ATM charlie I have both my CPU and GPU running stock, temps at load are excellent for both. CPU 44C GPU 68C

I'm using 181.20 from December but I see they came out with 181.22 on the 22nd of January. Maybe update those? But strange this just started happening since I started using the SMP client.

LCB I'll do as you suggested and see if a sticky wicket is gumming up the works here. btw I haven't got around yet to lowering the CPU usage yet in the SMP client. I dunno if that may be causing some issues, but I did unlock the cores as you suggested earlier and I noticed the GPU works a bit faster now it can utilize those unused clock cycles. (seems it was losing the fight for CPU power against the SMP hogging it before)

Thanks gents.
Did you up the priority of the GPU client? That will eliminate the fight for CPU cycles...
__________________
Folding For Team 54196

  #7 (permalink)  
Old February 10, 2009, 10:50 PM
Alwaysrun's Avatar
Allstar
F@H
 
Join Date: Sep 2008
Location: Qualicum Beach BC
Posts: 800

Quote:
Originally Posted by chriskwarren View Post
I am wondering if its a machine ID issue...
SMP is ID 1 and GPU is ID 2

Quote:
Originally Posted by LCB001 View Post
Did you up the priority of the GPU client, that will eliminate the fight for CPU cycles...
Yes LCB, the Core priority is set to "slightly higher" in the GPU client. Or are you talking about the slider? It's at 100%.

I'm just going to uninstall and reinstall the GPU client I guess. In the morning I'm going to go through all the SMP advanced options and actually set the Core usage to 98% like you suggested LCB.

Hope it helps; having to babysit this is a pain.
  #8 (permalink)  
Old February 10, 2009, 11:12 PM
LCB001's Avatar
Folding Captain
 
Join Date: Feb 2008
Location: Aylmer QC.
Posts: 1,774

Quote:
Originally Posted by Alwaysrun View Post
SMP is ID 1 and GPU is ID 2



Yes LCB the Core priority is set to "slightly higher" in the GPU client, or are you talking about the slider? it's at 100%.

I'm just going to uninstall and reinstall the gpu client I guess. In the morning I'm going to go through all the SMP advanced options and actually set the Core usage to 98% like you suggested LCB.

Hope it helps, having to babysit this is a pain.
Sometimes it takes deleting those files several times to get rid of a bad WU, if that's what's causing this. A fresh reinstall will usually fix it though. You might want to check if it's trying to redo the same WU each time; if it is, it will usually clear up after a few deleting cycles.
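
One way to check whether the same WU keeps coming back is to tally the failures in the client log. The Python sketch below is only a rough take on that idea. It assumes the log looks like the excerpt in the first post, where the `Project: ... (Run, Clone, Gen)` line for the failed unit is the first one logged after the `UNSTABLE_MACHINE` shutdown message. The function name is hypothetical.

```python
import re
from collections import Counter

# Matches lines such as:
# [01:53:13] Project: 5768 (Run 9, Clone 54, Gen 145)
PROJECT_RE = re.compile(
    r"Project:\s*(\d+)\s*\(Run\s*(\d+),\s*Clone\s*(\d+),\s*Gen\s*(\d+)\)"
)
FAILURE_MARKER = "Core Shutdown: UNSTABLE_MACHINE"


def count_failed_units(log_lines):
    """Count UNSTABLE_MACHINE shutdowns per (project, run, clone, gen).

    Assumption: the first Project line after a shutdown message belongs
    to the unit that failed, as in the excerpt from the first post.
    """
    failures = Counter()
    pending = False
    for line in log_lines:
        if FAILURE_MARKER in line:
            pending = True
            continue
        match = PROJECT_RE.search(line)
        if match and pending:
            failures[tuple(int(g) for g in match.groups())] += 1
            pending = False
    return failures
```

Feed it the lines of your client log (for example via `open(path).readlines()`, where `path` points at whatever log file your client writes). Any entry with a count above 1 is a unit that failed more than once, which would point to the loop described above.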

Folding wasn't always as stable as it is now. When Stanford starts fiddling with core revisions and client changes it can get really annoying and cause problems for days until you figure out how to stabilize your system... that's part of the FUN, just ask 3.0charlie, sswilson and some of the other Oldtimers...
__________________
Folding For Team 54196

  #9 (permalink)  
Old February 11, 2009, 07:40 AM
Alwaysrun's Avatar
Allstar
F@H
 
Join Date: Sep 2008
Location: Qualicum Beach BC
Posts: 800

Fun... heh. Well, I uninstalled the GPU client and reinstalled last night before I went to bed. Seems it did two WUs, then it took a crap again and stayed idle all through the night. Sucks losing 6 hours to downtime over this NANs whatever problem. I've shut off the SMP console and will try folding today with just the GPU to see if I can isolate this problem. I'll make special note of which project is causing this, if there is just one.

*Edit: Well, after a lengthy read over at the Stanford folding forums, it appears that many people are getting this error and it's been happening for the last 3 weeks. It seems to be a few specific projects in the 57xx range. Nvidia cards are getting these errors, and driver version and OS don't seem to be the problem, as users with different setups are experiencing the same error. People have scrubbed their work folder and other files, and even complete registry cleaning and reinstalling the client does not fix the problem. My initial thought that my new install of SMP was the culprit has been disproved, as many people without SMP are getting this error as well.

I'm suspecting certain project servers are issuing bad WUs repeatedly. Vijay made an announcement about this issue and they are hard at work to resolve this. They have a thread on the official forums to report these errors and I found people with my exact setup getting this error so I didn't bother adding mine to the list.
Folding Forum • View topic - 57xx - NV GPUs failing all the time @ all projects

I just don't know what to do. Babysitting the client after every completion is impossible, so I guess I'll just have to bear with it until the Pande group figures this out.

Last edited by Alwaysrun; February 11, 2009 at 08:35 AM.
  #10 (permalink)  
Old February 11, 2009, 10:53 AM
chrisk's Avatar
Folding Captain
 
Join Date: Jul 2008
Location: GTA, Ontario
Posts: 7,401

That sucks but thanks for letting us know in case it happens to us as well.
__________________
Fold for team #54196