@@ -60,14 +60,15 @@ js_distance <- function(mat) {

Here is a re-implementation of `js_distance` using Rcpp. Note that this
doesn't yet take advantage of parallel processing, but still yields an
- approximately 35x speedup over the original R version.
+ approximately 50x speedup over the original R version on a 2.6GHz Haswell
+ MacBook Pro.

Abstractly, a Distance function will take two vectors in R<sup>J</sup> and
return a value in R<sup>+</sup>. In this implementation, we don't support
arbitrary distance metrics, i.e., the JSD code computes the values from
within the parallel kernel.

- Our distance function `kl_divergence` is defined below and takes three
+ Our distance function `kl_divergence` is defined below and takes three
parameters: iterators to the beginning and end of vector 1 and an iterator to
the beginning of vector 2 (the end position of vector2 is implied by the end
position of vector1).
@@ -168,9 +169,9 @@ parallel code is almost identical to the serial code. The main difference is
that the outer loop starts with the `begin` index passed to the worker
function rather than 0.

- Parallelizing in this case has big payoff: we observe performance of about 6x
- the serial version on a machine with 4 cores (8 with hyperthreading). Here is
- the definition of the `JsDistance` function object:
+ Parallelizing in this case has a big payoff: we observe performance of about
+ 5.5x the serial version on a 2.6GHz Haswell MacBook Pro with 4 cores (8 with
+ hyperthreading). Here is the definition of the `JsDistance` function object:

{% highlight cpp %}
// [[Rcpp::depends(RcppParallel)]]
@@ -274,10 +275,9 @@ res[,1:4]
1 js_distance(m) 3 35.560 323.273
</pre>

- The serial Rcpp version yields a more than 50x speedup over straight R code.
- On a machine with 4 cores (8 with hyperthreading) the parallel Rcpp version
- provides another 5.5x speedup, amounting to a total gain of over 300x
- compared to the original R version.
+ The serial Rcpp version yields a more than 50x speedup over straight R code.
+ The parallel Rcpp version provides another 5.5x speedup, amounting to a total
+ gain of over 300x compared to the original R version.

Note that performance gains will typically be 30-50% less on Windows systems
as a result of less sophisticated thread scheduling (RcppParallel does not