Message boards : Number crunching : Short Run Error rates

Zalster
Joined: 26 Feb 14
Posts: 177
Credit: 4,121,030,726
RAC: 843,988
Message 51620 - Posted: 11 Mar 2019 | 21:51:07 UTC
Last modified: 11 Mar 2019 | 21:51:33 UTC

Has anyone else noticed the high failure rate across all short runs? I was looking and noticed that a few resends to my machines completed correctly, so I tried to figure out why the previous machines errored out. Couldn't find a common reason other than BOINC version. I hope there is a way for the scientists to look at these and see what the reason might have been. Since we can't access those error logs, only they can tell us whether it's the work unit that errored or something with the computers.

Z
____________

Jimbocous
Joined: 2 Mar 19
Posts: 3
Credit: 3,432,175
RAC: 0
Message 51677 - Posted: 30 Mar 2019 | 18:55:01 UTC

Not sure if this is the same issue, but I also ran into this:
http://www.gpugrid.net/result.php?resultid=20751166
It runs for about 2 hrs, gets into the high-30% complete range, then crashes. It restarts, gets to about the same point, and crashes again. I finally just aborted it, as it looked as though it would have looped forever.
Looking through the logs, I can't see anything that would indicate I have an issue here.
Jim
____________

Beyond
Joined: 23 Nov 08
Posts: 1104
Credit: 6,101,732,079
RAC: 3
Message 51683 - Posted: 31 Mar 2019 | 23:38:56 UTC - in response to Message 51620.

"Couldn't find a common reason other than BOINC version."

WU errors here (and elsewhere) are common. Be assured that BOINC version is not the reason.

mmonnin
Joined: 2 Jul 16
Posts: 265
Credit: 647,845,139
RAC: 966
Message 51706 - Posted: 19 Apr 2019 | 22:06:14 UTC
Last modified: 19 Apr 2019 | 22:06:39 UTC

Here's one. Every user who received it errored out within a couple of seconds.
https://www.gpugrid.net/workunit.php?wuid=16425136
