correct me if i’m wrong, but NVIDIA’s bandwidthTest program included in the CUDA SDK takes single timing measurements and reports them. if there’s any noise in the measurements, these single reports may be misleading. i wrote my own bandwidth test program that takes 100 measurements and spews out the resulting data for analysis. with my program at least, there is plenty of noise if you measure unpinned rather than pinned transfers.

below are some plots to show it. each plot shows the percentage of transfers that completed in <= the amount of time on the x-axis. so the closer a line is to being purely vertical, the less variation there is in the transfer times for that device. the farther left a line is, the faster the transfer times for that device. the plots compare transfer times on my two systems, maul and vader, to transfer times on a friend’s system. note that i’ve restricted the plots so the y-axis only goes to 0.99 (99%) so that outliers don’t squish the x-axis.
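
for anyone who wants to build the same kind of curve from their own timings, here is a minimal python sketch of the idea. to be clear, this is not the R code mentioned in this post; the function names `ecdf` and `clip_at` and the sample numbers are made up for illustration. it sorts the transfer times, pairs each one with the fraction of transfers that finished at or before it, and then drops everything above the 0.99 mark.

```python
def ecdf(times):
    """Sort the transfer times and pair each with its cumulative fraction.

    Point i (0-based) of the sorted list gets the value (i + 1) / n. If several
    times are tied, the last of the tied points carries the true fraction of
    transfers that completed in <= that time.
    """
    xs = sorted(times)
    n = len(xs)
    fractions = [(i + 1) / n for i in range(n)]
    return xs, fractions


def clip_at(xs, fractions, cap=0.99):
    """Keep only the points whose cumulative fraction is <= cap.

    This mirrors restricting the y-axis to 0.99 so that a few slow outliers
    do not dominate the x-axis range.
    """
    kept = [(x, f) for x, f in zip(xs, fractions) if f <= cap]
    return [x for x, _ in kept], [f for _, f in kept]


if __name__ == "__main__":
    # hypothetical transfer times in milliseconds, not real measurements
    sample = [2.0, 1.0, 4.0, 3.0]
    xs, fractions = ecdf(sample)
    xs, fractions = clip_at(xs, fractions)
    for x, f in zip(xs, fractions):
        print(x, f)
```

plotting the clipped `xs` against `fractions` as a step line, one line per device, should give the kind of curve described above: steeper means less variation, farther left means faster.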

in this plot, the green and orange lines are all over each other, but in the next one…
…they separate neatly (up till 90% or so).
if you want my measurement code or the R code to analyze its results, shoot me an email or comment.