C# - Parallel.For performance


This code is from the Microsoft article http://msdn.microsoft.com/en-us/library/dd460703.aspx, with small changes:

    const int size = 10000000;
    int[] nums = new int[size];
    Parallel.For(0, size, i => { nums[i] = 1; });
    long total = 0;

    Parallel.For<long>(
        0, size, () => 0,
        (j, loop, subtotal) =>
        {
            // Per-thread partial sum; no locking needed here.
            return subtotal + nums[j];
        },
        // Merge each thread's partial sum into the shared total.
        (x) => Interlocked.Add(ref total, x)
    );

    if (total != size)
    {
        Console.WriteLine("Error");
    }

The non-parallel version of the loop is:

    for (int i = 0; i < size; ++i)
    {
        total += nums[i];
    }

When I measure the loop execution time using the Stopwatch class, I see that the parallel version is 10-20% slower. Testing was done on Windows 7 64-bit, Intel i5-2400 CPU, 4 cores, 4 GB RAM. Of course, in Release configuration.

In my real program I am trying to compute an image histogram, and the parallel version runs 10 times slower. Can this kind of computation, where every loop invocation is very fast, be successfully parallelized with the TPL?

Edit.

I finally managed to shave more than 50% off the histogram calculation time with Parallel.For by dividing the whole image into a number of chunks. Every loop body invocation now handles a whole chunk, not a single pixel.
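The chunking idea described in the edit could look roughly like the sketch below. It is not the original poster's code: the class name `ChunkedHistogram`, the `Compute` method, the 8-bit grayscale `byte[] pixels` input, and the default chunk size are all assumptions made for illustration. Each loop body invocation counts one whole chunk into a per-thread histogram, and the per-thread histograms are merged under a lock at the end.

```csharp
using System;
using System.Threading.Tasks;

public static class ChunkedHistogram
{
    // Hypothetical helper: builds a 256-bin histogram of 8-bit pixel values.
    // The chunkSize default is an arbitrary assumption; tune it by measuring.
    public static int[] Compute(byte[] pixels, int chunkSize = 64 * 1024)
    {
        if (pixels == null) throw new ArgumentNullException(nameof(pixels));
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));

        int[] histogram = new int[256];
        // Number of chunks, rounding up so the last partial chunk is included.
        int chunkCount = pixels.Length / chunkSize
                         + (pixels.Length % chunkSize == 0 ? 0 : 1);

        Parallel.For<int[]>(
            0, chunkCount,
            () => new int[256],              // per-thread local histogram
            (chunk, loop, local) =>
            {
                // One delegate call processes a whole chunk of pixels.
                int start = chunk * chunkSize;
                int end = start + Math.Min(chunkSize, pixels.Length - start);
                for (int i = start; i < end; ++i)
                {
                    local[pixels[i]]++;
                }
                return local;
            },
            local =>
            {
                // Merge once per thread, so lock contention stays low.
                lock (histogram)
                {
                    for (int b = 0; b < 256; ++b)
                    {
                        histogram[b] += local[b];
                    }
                }
            });

        return histogram;
    }
}
```

A call such as `int[] h = ChunkedHistogram.Compute(pixels);` would then return the counts. Whether this beats a plain loop still depends on image size and core count, so it is worth timing both.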

Because Parallel.For should be used for things that are a little heavy, not for summing simple numbers! The use of the delegate (j, loop, subtotal) => alone is probably more than enough to add 10-20% to the time. And we aren't even speaking of the threading overhead. It would be interesting to see a benchmark against a delegate summer in a for loop, and to see not only the "real world" time but also the CPU time.
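One way to amortize the per-element delegate cost mentioned above is to hand each delegate call a whole index range instead of a single index. The sketch below uses `Partitioner.Create(int, int)` together with `Parallel.ForEach`. The class name `RangeSum` and the method `Sum` are hypothetical, and this was not part of the measurements in this answer.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public static class RangeSum
{
    // Hypothetical helper: sums an int array with one delegate call per range.
    public static long Sum(int[] nums)
    {
        if (nums == null) throw new ArgumentNullException(nameof(nums));
        // Partitioner.Create rejects an empty range, so handle it up front.
        if (nums.Length == 0) return 0;

        long total = 0;

        Parallel.ForEach(
            Partitioner.Create(0, nums.Length),   // yields Tuple<int, int> ranges
            () => 0L,                             // per-thread subtotal
            (range, loop, subtotal) =>
            {
                // Plain inner loop: no delegate call per element.
                for (int i = range.Item1; i < range.Item2; ++i)
                {
                    subtotal += nums[i];
                }
                return subtotal;
            },
            // Merge each thread's subtotal into the shared total.
            subtotal => Interlocked.Add(ref total, subtotal));

        return total;
    }
}
```

With this shape the delegate is invoked once per range rather than ten million times, so the inner loop is much closer to the plain `for` version. For a sum this cheap, memory bandwidth may still limit the speedup.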

I have added a comparison with a "simple" delegate that does the same thing as the Parallel.For<> delegate.

Mmmh... Now I have some numbers at 32 bits, on my PC (an AMD six-core):

    32 bits
    parallel: ticks:      74581, total processtime:    2496016
    base    : ticks:      90395, total processtime:     312002
    func    : ticks:     147037, total processtime:     468003

The parallel version is a little faster in wall-clock time, but 8x slower in processor time :-)

But at 64 bits:

    64 bits
    parallel: ticks:     104326, total processtime:    2652017
    base    : ticks:      51664, total processtime:     156001
    func    : ticks:      77861, total processtime:     312002

The modified code:

    // Needs: using System; using System.Diagnostics;
    //        using System.Threading; using System.Threading.Tasks;
    Console.WriteLine("{0} bits", IntPtr.Size == 4 ? 32 : 64);

    var cp = Process.GetCurrentProcess();
    cp.PriorityClass = ProcessPriorityClass.High;

    const int size = 10000000;
    int[] nums = new int[size];
    Parallel.For(0, size, i => { nums[i] = 1; });

    GC.Collect();
    GC.WaitForPendingFinalizers();

    long total = 0;

    // 1) Parallel.For with a per-thread subtotal.
    {
        TimeSpan start = cp.TotalProcessorTime;
        Stopwatch sw = Stopwatch.StartNew();

        Parallel.For<long>(
            0, size, () => 0,
            (j, loop, subtotal) =>
            {
                return subtotal + nums[j];
            },
            (x) => Interlocked.Add(ref total, x)
        );

        sw.Stop();
        TimeSpan end = cp.TotalProcessorTime;

        Console.WriteLine("Parallel: Ticks: {0,10}, Total ProcessTime: {1,10}", sw.ElapsedTicks, (end - start).Ticks);
    }

    if (total != size)
    {
        Console.WriteLine("Error");
    }

    GC.Collect();
    GC.WaitForPendingFinalizers();

    total = 0;

    // 2) Plain sequential loop as the baseline.
    {
        TimeSpan start = cp.TotalProcessorTime;
        Stopwatch sw = Stopwatch.StartNew();

        for (int i = 0; i < size; ++i)
        {
            total += nums[i];
        }

        sw.Stop();
        TimeSpan end = cp.TotalProcessorTime;

        Console.WriteLine("Base    : Ticks: {0,10}, Total ProcessTime: {1,10}", sw.ElapsedTicks, (end - start).Ticks);
    }

    if (total != size)
    {
        Console.WriteLine("Error");
    }

    GC.Collect();
    GC.WaitForPendingFinalizers();

    total = 0;

    // 3) Sequential loop that calls a delegate per element,
    //    to isolate the delegate cost from the threading cost.
    Func<int, int, long, long> adder = (j, loop, subtotal) =>
    {
        return subtotal + nums[j];
    };

    {
        TimeSpan start = cp.TotalProcessorTime;
        Stopwatch sw = Stopwatch.StartNew();

        for (int i = 0; i < size; ++i)
        {
            total = adder(i, 0, total);
        }

        sw.Stop();
        TimeSpan end = cp.TotalProcessorTime;

        Console.WriteLine("Func    : Ticks: {0,10}, Total ProcessTime: {1,10}", sw.ElapsedTicks, (end - start).Ticks);
    }

    if (total != size)
    {
        Console.WriteLine("Error");
    }
