Message boards : Number crunching : No WU's being sent

Author Message
Spatzthecat
Joined: 26 Nov 09
Posts: 32
Credit: 1,133,103,944
RAC: 2,449,667
Level
Met
Message 52031 - Posted: 6 Jun 2019 | 18:31:56 UTC

As my computers finish WUs, new units are NOT being sent.
Is there a problem? The server status suggests over 200 long WUs are ready to send.

rod4x4
Joined: 4 Aug 14
Posts: 72
Credit: 1,488,647,769
RAC: 2,203,234
Level
Met
Message 52035 - Posted: 7 Jun 2019 | 0:23:00 UTC
Last modified: 7 Jun 2019 | 0:29:22 UTC

Same here. It appears to have happened due to the development of the new ACEMD3 app.

My Win machines (Pascal GPU) with 3xx drivers can download OK; my Win machines (Pascal GPU) with 4xx drivers cannot.

I have downgraded one Win machine from a 4xx driver to a 3xx driver, and it can now download tasks.

Hopefully only a temporary glitch.

mmonnin
Joined: 2 Jul 16
Posts: 255
Credit: 647,700,389
RAC: 79
Level
Lys
Message 52036 - Posted: 7 Jun 2019 | 2:10:26 UTC

App 2.04 should support cuda10, if the scheduler collaborates.
Expect hiccups...

T

Toni
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 9 Dec 08
Posts: 800
Credit: 4,294,282
RAC: 1
Level
Ala
Message 52037 - Posted: 7 Jun 2019 | 6:47:59 UTC - in response to Message 52036.

Ouch. It must be because of the conditions for offering cuda100 vs cuda80. I promote "new" hosts to cuda100 but there is no such app for acemd2 (there is for acemd3).
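For illustration only, the failure described above can be modelled in a few lines of Python. This is a rough sketch, not GPUGRID's scheduler: the function names, the driver cut-off of 410, and the table of available builds are all assumptions made for the example. It shows how a host that is strictly "promoted" to cuda100 ends up with no work for an app that only has a cuda80 build, and how a fallback to cuda80 would avoid that.

```python
# Hypothetical model of the plan-class problem; every name and number
# here is an illustrative assumption, not the project's real configuration.

# Plan classes for which each application has a build (assumed).
APP_VERSIONS = {
    "acemd2": {"cuda80"},   # legacy app: no cuda100 build
    "acemd3": {"cuda100"},  # new app: cuda100 build exists
}

CUDA100_MIN_DRIVER = 410  # assumed cut-off for treating a host as "new"


def promoted_plan_class(driver_major: int) -> str:
    """Offer cuda100 to hosts with a new enough driver, otherwise cuda80."""
    return "cuda100" if driver_major >= CUDA100_MIN_DRIVER else "cuda80"


def can_get_work(app: str, driver_major: int) -> bool:
    """Strict promotion: the host is considered for one plan class only."""
    return promoted_plan_class(driver_major) in APP_VERSIONS[app]


def can_get_work_with_fallback(app: str, driver_major: int) -> bool:
    """Fallback: a promoted host may still take a cuda80 build if one exists."""
    candidates = {promoted_plan_class(driver_major), "cuda80"}
    return bool(candidates & APP_VERSIONS[app])


if __name__ == "__main__":
    for driver in (391, 418):
        print(driver, can_get_work("acemd2", driver),
              can_get_work_with_fallback("acemd2", driver))
```

Under these assumptions a 3xx host still gets acemd2 work while a 4xx host does not, which matches what was reported earlier in the thread.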

Will try to find a solution.

t

Retvari Zoltan
Joined: 20 Jan 09
Posts: 2031
Credit: 14,705,115,669
RAC: 1,365,887
Level
Trp
Message 52038 - Posted: 7 Jun 2019 | 7:54:56 UTC - in response to Message 52037.

That's easy: create a CUDA10 Windows app. :)

Toni
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 9 Dec 08
Posts: 800
Credit: 4,294,282
RAC: 1
Level
Ala
Message 52039 - Posted: 7 Jun 2019 | 8:00:52 UTC - in response to Message 52037.

Please check if solved

Toni
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 9 Dec 08
Posts: 800
Credit: 4,294,282
RAC: 1
Level
Ala
Message 52040 - Posted: 7 Jun 2019 | 8:01:45 UTC - in response to Message 52038.

We'll probably do it, but for acemd3. In the meantime, acemd2 should not stop working.

[AF>Occitania]franky82
Joined: 2 May 08
Posts: 2
Credit: 696,033,348
RAC: 1,098,695
Level
Lys
Message 52041 - Posted: 7 Jun 2019 | 8:02:22 UTC

I see 844 tasks on the server status page, but nothing for the GPUs in my Windows 10 computers!?

Toni
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 9 Dec 08
Posts: 800
Credit: 4,294,282
RAC: 1
Level
Ala
Message 52042 - Posted: 7 Jun 2019 | 8:11:15 UTC

Please check if solved

[AF>Occitania]franky82
Joined: 2 May 08
Posts: 2
Credit: 696,033,348
RAC: 1,098,695
Level
Lys
Message 52043 - Posted: 7 Jun 2019 | 8:46:22 UTC

Yes, new tasks ready to crunch!

Retvari Zoltan
Joined: 20 Jan 09
Posts: 2031
Credit: 14,705,115,669
RAC: 1,365,887
Level
Trp
Message 52044 - Posted: 7 Jun 2019 | 9:01:22 UTC - in response to Message 52043.
Last modified: 7 Jun 2019 | 9:02:43 UTC

My CUDA10/Windows 10 host received 2 CUDA8.0 tasks at 10:08 CEST (08:08 UTC).

Toni
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 9 Dec 08
Posts: 800
Credit: 4,294,282
RAC: 1
Level
Ala
Message 52045 - Posted: 7 Jun 2019 | 9:18:35 UTC - in response to Message 52044.

:)

The "legacy" apps will stay cuda 80 only.

Spatzthecat
Joined: 26 Nov 09
Posts: 32
Credit: 1,133,103,944
RAC: 2,449,667
Level
Met
Message 52046 - Posted: 7 Jun 2019 | 9:40:46 UTC

All hosts now with tasks.
Thanks everyone :)

Bedrich Hajek
Joined: 28 Mar 09
Posts: 363
Credit: 4,722,569,239
RAC: 974,026
Level
Arg
Message 52047 - Posted: 7 Jun 2019 | 10:18:18 UTC

It looks like cuda 6.5 is no longer available for Kepler cards, with Windows 7 and 3xx.xx drivers.


http://www.gpugrid.net/results.php?hostid=494023


Is this by accident or intent?


Toni
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 9 Dec 08
Posts: 800
Credit: 4,294,282
RAC: 1
Level
Ala
Message 52048 - Posted: 7 Jun 2019 | 11:06:07 UTC - in response to Message 52047.
Last modified: 7 Jun 2019 | 11:12:26 UTC

Might be fixed. Please check.

Diplomat
Joined: 1 Sep 10
Posts: 8
Credit: 204,328,650
RAC: 8
Level
Leu
Message 52049 - Posted: 7 Jun 2019 | 13:35:25 UTC
Last modified: 7 Jun 2019 | 14:07:21 UTC

Hi guys, I have a Linux system with an Nvidia Pascal graphics card, driver 418.56.
I couldn't get any new tasks when trying to connect this week.

---

Update:
Apparently in Ubuntu 18.04 you have to install CUDA separately, in addition to the drivers.

I was able to install nvcc from the default repository, and after a restart I started to get tasks: "New version of ACEMD 2.04 (cuda100)".

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Nov__3_21:07:56_CDT_2017
Cuda compilation tools, release 9.1, V9.1.85


Hope to get some long runs too

---
Update2

It didn't last long: I got only 6 tasks and then silence again. It's really confusing to figure out whether something is wrong on the user side or the project is just being picky :)

kksplace
Joined: 4 Mar 18
Posts: 34
Credit: 59,327,675
RAC: 395,689
Level
Thr
Message 52050 - Posted: 7 Jun 2019 | 14:17:50 UTC - in response to Message 52049.
Last modified: 7 Jun 2019 | 14:19:41 UTC

It's not you -- the tasks you received are 'test tasks' testing a new version of the Linux application. The regular/long work units right now are only for Windows hosts until the new application is deployed. (The new app is for Linux hosts at the moment.)

http://www.gpugrid.net/forum_thread.php?id=4935

Diplomat
Joined: 1 Sep 10
Posts: 8
Credit: 204,328,650
RAC: 8
Level
Leu
Message 52052 - Posted: 7 Jun 2019 | 16:42:41 UTC - in response to Message 52031.

OK, thanks for the clarification!

Bedrich Hajek
Joined: 28 Mar 09
Posts: 363
Credit: 4,722,569,239
RAC: 974,026
Level
Arg
Message 52055 - Posted: 7 Jun 2019 | 21:45:26 UTC - in response to Message 52048.

Might be fixed. Please check.



It's not fixed.


http://www.gpugrid.net/result.php?resultid=20990388


Toni
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 9 Dec 08
Posts: 800
Credit: 4,294,282
RAC: 1
Level
Ala
Message 52056 - Posted: 8 Jun 2019 | 9:28:44 UTC - in response to Message 52055.
Last modified: 8 Jun 2019 | 9:59:44 UTC

Made yet another attempt. Is there a reason for not upgrading the driver? Supporting old versions is really a problem.

mmonnin
Joined: 2 Jul 16
Posts: 255
Credit: 647,700,389
RAC: 79
Level
Lys
Message 52057 - Posted: 8 Jun 2019 | 11:02:09 UTC

391 isn't that old.
Sometimes drivers get slower as they get bigger.

biodoc
Joined: 26 Aug 08
Posts: 158
Credit: 1,405,900,597
RAC: 137,721
Level
Met
Message 52058 - Posted: 8 Jun 2019 | 12:16:05 UTC

A GTX 690 is compute capability 3.0. Driver versions 391.xx are CUDA 9 capable. Upgrading the driver to 4xx.xx makes the host CUDA 10 capable.

Maybe I'm wrong, but I think CUDA 8, 9 and 10 will work on CC 3.0 cards like the GTX 690 as long as you enable CC 3.x support in the makefile.

https://en.wikipedia.org/wiki/CUDA

CUDA SDK 6.5: support for compute capability 1.1 – 5.x (Tesla, Fermi, Kepler, Maxwell). Last version with support for compute capability 1.x (Tesla).
CUDA SDK 7.0 – 7.5: support for compute capability 2.0 – 5.x (Fermi, Kepler, Maxwell).
CUDA SDK 8.0: support for compute capability 2.0 – 6.x (Fermi, Kepler, Maxwell, Pascal). Last version with support for compute capability 2.x (Fermi).
CUDA SDK 9.0 – 9.2: support for compute capability 3.0 – 7.2 (Kepler, Maxwell, Pascal, Volta).
CUDA SDK 10.0 – 10.1: support for compute capability 3.0 – 7.5 (Kepler, Maxwell, Pascal, Volta, Turing).
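
As a quick way to read the list above, here is a small Python lookup built from those same ranges. It is only a sketch: the table is transcribed from the quoted Wikipedia summary, an "x" minor version is treated as 9, and the function name `sdks_supporting` is made up for the example.

```python
from typing import List, Tuple

# (SDK label, lowest supported compute capability, highest supported),
# transcribed from the summary quoted above; "5.x" is stored as (5, 9).
SDK_SUPPORT = [
    ("6.5", (1, 1), (5, 9)),
    ("7.0-7.5", (2, 0), (5, 9)),
    ("8.0", (2, 0), (6, 9)),
    ("9.0-9.2", (3, 0), (7, 2)),
    ("10.0-10.1", (3, 0), (7, 5)),
]


def sdks_supporting(cc: Tuple[int, int]) -> List[str]:
    """Return the SDK labels whose range contains compute capability cc."""
    return [label for label, low, high in SDK_SUPPORT if low <= cc <= high]


if __name__ == "__main__":
    print("GTX 690 (CC 3.0):", sdks_supporting((3, 0)))
    print("Fermi (CC 2.1):  ", sdks_supporting((2, 1)))
    print("Turing (CC 7.5): ", sdks_supporting((7, 5)))
```

By this table a CC 3.0 card such as the GTX 690 falls inside every listed SDK range, which is the point being made about Kepler cards.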

Bedrich Hajek
Joined: 28 Mar 09
Posts: 363
Credit: 4,722,569,239
RAC: 974,026
Level
Arg
Message 52059 - Posted: 8 Jun 2019 | 12:56:42 UTC
Last modified: 8 Jun 2019 | 13:06:21 UTC

Now I am getting this on a Windows 10 machine with Maxwell cards:


06/08/2019 8:51:33 AM | GPUGRID | Requesting new tasks for CPU and NVIDIA GPU
06/08/2019 8:51:34 AM | GPUGRID | Scheduler request completed: got 0 new tasks
06/08/2019 8:51:34 AM | GPUGRID | No tasks sent
06/08/2019 8:51:34 AM | GPUGRID | No tasks are available for Long runs (8-12 hours on fastest card)

This was while long runs were still available on the server status page:

Long runs (8-12 hours on fastest card) 24 2,468 2.78 (0.44 - 19.86) 663

But my Kepler card Windows 7 machine just received CUDA 6.5 WUs.

Toni
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 9 Dec 08
Posts: 800
Credit: 4,294,282
RAC: 1
Level
Ala
Message 52060 - Posted: 8 Jun 2019 | 13:49:05 UTC - in response to Message 52059.
Last modified: 8 Jun 2019 | 14:00:07 UTC

I had raised the minimum driver version required for cuda80 in order to avoid the previous problem. Now I reverted the change. Still, this level of ad-hocness is not sustainable; the next app may need updated drivers.
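
To make the side effect of such a change concrete, here is a minimal Python sketch of a minimum-driver gate. Everything in it is hypothetical: the host names, their driver versions, and both threshold values are invented for the example and are not the values used by the project.

```python
from typing import Dict, List, Tuple


def parse_driver(version: str) -> Tuple[int, int]:
    """Turn a driver string such as '391.35' into a comparable (major, minor)."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def eligible_hosts(hosts: Dict[str, str], min_driver: str) -> List[str]:
    """Names of hosts whose driver is at least min_driver, sorted by name."""
    floor = parse_driver(min_driver)
    return sorted(name for name, drv in hosts.items()
                  if parse_driver(drv) >= floor)


# Invented example hosts and driver versions.
HOSTS = {
    "kepler-win7": "391.35",
    "maxwell-win10": "388.13",
    "pascal-win10": "418.81",
}

if __name__ == "__main__":
    print("original floor:", eligible_hosts(HOSTS, "367.48"))
    print("raised floor:  ", eligible_hosts(HOSTS, "390.00"))
```

With the invented numbers, raising the floor silently drops one host that was working before, which is the kind of regression reported in this thread.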

Bedrich Hajek
Joined: 28 Mar 09
Posts: 363
Credit: 4,722,569,239
RAC: 974,026
Level
Arg
Message 52061 - Posted: 8 Jun 2019 | 14:57:16 UTC - in response to Message 52060.

It seems to be working well right now. Both my machines received WUs.

Just tell us what we need to upgrade, and which cards will and will not be supported.


Zalster
Joined: 26 Feb 14
Posts: 175
Credit: 4,013,368,076
RAC: 10,143
Level
Arg
Message 52064 - Posted: 8 Jun 2019 | 17:03:34 UTC - in response to Message 52061.

Yes please. I have 2 machines I converted back to Windows after the Linux issue, but I haven't updated the drivers as they were working. If there is a minimum driver version that you require, please let us know and we can make that happen. On my Linux machines I run the latest drivers, as my other project requires CUDA 10.1 to run on them.

mmonnin
Joined: 2 Jul 16
Posts: 255
Credit: 647,700,389
RAC: 79
Level
Lys
Message 52065 - Posted: 8 Jun 2019 | 17:42:31 UTC - in response to Message 52060.

What is special about 4xx drivers?

Zalster
Joined: 26 Feb 14
Posts: 175
Credit: 4,013,368,076
RAC: 10,143
Level
Arg
Message 52068 - Posted: 8 Jun 2019 | 18:51:56 UTC - in response to Message 52065.
Last modified: 8 Jun 2019 | 18:53:23 UTC

The 418 driver has CUDA 10.1 as well as support for Turing cards.

mmonnin
Joined: 2 Jul 16
Posts: 255
Credit: 647,700,389
RAC: 79
Level
Lys
Message 52069 - Posted: 8 Jun 2019 | 20:32:46 UTC

That's nice, but it doesn't change anything for a CUDA80 app or for older cards like the 690 mentioned above.

Unless the new version uses a CUDA10-specific command, older drivers should still be supported. It's like calling something an AVX512 app and limiting it to CPUs with AVX512 even though only SSE2 is used.

It's not like people are asking for pre-Kepler drivers to be supported. Just last year's.

Keith Myers
Joined: 13 Dec 17
Posts: 257
Credit: 234,046,463
RAC: 85
Level
Leu
Message 52070 - Posted: 8 Jun 2019 | 21:11:30 UTC - in response to Message 52069.

The 390 series drivers are Legacy drivers now. Minimal support. But the 390 series is still the default repository version offered in many distros.

This Nvidia document shows the changes from CUDA backwards compatibility to CUDA forwards compatibility. It also shows CUDA level compatibility with the various generations of GeForce cards.

https://docs.nvidia.com/deploy/cuda-compatibility/
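
For anyone who wants to check their own host against that kind of table, a rough Python sketch follows. The minimum Linux driver versions in it are recalled from NVIDIA's CUDA toolkit release notes and should be verified against the document linked above before relying on them; Windows minimums differ. The function names are invented for the example.

```python
from typing import Optional, Tuple

# Minimum Linux driver per CUDA toolkit, as recalled from NVIDIA's
# release notes (an assumption to verify; Windows values are different).
MIN_LINUX_DRIVER = {
    "8.0": (375, 26),
    "9.0": (384, 81),
    "9.1": (390, 46),
    "9.2": (396, 26),
    "10.0": (410, 48),
    "10.1": (418, 39),
}


def parse_driver(version: str) -> Tuple[int, int]:
    """Turn a driver string such as '418.56' into a comparable (major, minor)."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def newest_cuda_for(driver: str) -> Optional[str]:
    """Newest listed toolkit the driver can run, or None if it is too old."""
    have = parse_driver(driver)
    usable = [tk for tk, need in MIN_LINUX_DRIVER.items() if have >= need]
    if not usable:
        return None
    return max(usable, key=lambda tk: tuple(int(p) for p in tk.split(".")))


if __name__ == "__main__":
    for drv in ("418.56", "390.116", "340.107"):
        print(drv, "->", newest_cuda_for(drv))
```

On these figures the Linux driver 418.56 mentioned earlier in the thread is new enough for CUDA 10.1, while a 390-series driver tops out at CUDA 9.1.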

I agree it may be difficult to maintain support for the older hardware in the future.
