MATLAB large array (1e6 values) computation speed
I am trying to optimize MATLAB code for a statistics calculation on a large array of data (1e6 values). I have tried several methods: loops or the fun functions (cellfun, arrayfun, bsxfun), and diff or basic math. I need to calculate the accumulation of a set of data and its standard deviation.
I cannot get it running in under 24 seconds. Is there a way to improve the code without using additional toolboxes?
Here is what I have tried so far:
clear
close
mydata = rand(1e5, 1)/5e6;
m = 1000;
n = length(mydata) - m;
pkpk = nan(m, 1);
Std = nan(m, 1);   % capitalized so the variable does not shadow the built-in std()
mymat = nan(1, n);

%%%%%%%%%%%%%%%%%%%%%%%%%%
% peak2peak is part of the Signal Processing Toolbox;
% max() - min() can be used instead
%%%%%%%%%%%%%%%%%%%%%%%%%%
tic
for x = 1 : m
    mymat = diff((reshape(mydata(1:x*floor(n/x)), x, floor(n/x)))');
    pkpk(x) = peak2peak(mymat(:));
    Std(x) = sqrt(sum(sum((mymat - mean(mymat(:))).^2)) / numel(mymat));
end
time1 = toc;

%%%%%%%%%%%%%%%%%%%%%%%%%%
tic
for x = 1 : m
    mymat = bsxfun(@minus, mydata(x+1 : x+n), mydata(1:n))';   % edit here: transpose
    pkpk(x) = peak2peak(mymat(:));   % max - min
    Std(x) = sqrt(sum(sum((mymat - mean(mymat(:))).^2)) / numel(mymat));   % std
end
time2 = toc;

%%%%%%%%%%%%%%%%%%%%%%%%%%
tic
for x = 1 : m
    mymat = mydata(x+1 : x+n) - mydata(1:n);
    pkpk(x) = peak2peak(mymat(:));   % max - min
    Std(x) = sqrt(sum(sum((mymat - mean(mymat(:))).^2)) / numel(mymat));   % std
end
time3 = toc;

%%%%%%%%%%%%%%%%%%%%%%%%%%
tic
for x = 1 : m
    Std(x) = std(reshape(diff(reshape(mydata(1:x*floor(n/x)), x, floor(n/x))'), floor(n/x)' * x - x, 1));
    pkpk(x) = peak2peak(reshape(diff(reshape(mydata(1:x*floor(n/x)), x, floor(n/x))'), floor(n/x)' * x - x, 1));
end
time4 = toc;

%%%%%%%%%%%%%%%%%%%%%%%%%%
tic
for x = 1 : m
    pkpk(x) = peak2peak(mydata(x+1 : x+n) - mydata(1:n));
    Std(x) = std(mydata(x+1 : x+n) - mydata(1:n));
end
time5 = toc;

%%%%%%%%%%%%%%%%%%%%%%%%%%
tic
pkpk = cellfun(@(x) peak2peak(reshape(diff(reshape(mydata(1:x*floor(n/x)), x, floor(n/x))'), floor(n/x)' * x - x, 1)), num2cell(1:m));
Std = cellfun(@(x) std(reshape(diff(reshape(mydata(1:x*floor(n/x)), x, floor(n/x))'), floor(n/x)' * x - x, 1)), num2cell(1:m));
time6 = toc;

%%%%%%%%%%%%%%%%%%%%%%%%%%
tic
pkpk = cellfun(@(x) peak2peak(mydata(x:n+x-1) - mydata(1:n)), num2cell(1:m));
Std = cellfun(@(x) std(mydata(x:n+x-1) - mydata(1:n)), num2cell(1:m));
time7 = toc;

%%%%%%%%%%%%%%%%%%%%%%%%%%
tic
Std = cellfun(@(x) std(mydata(x+1 : x+n) - mydata(1:n)), num2cell(1:m));
pkpk = cellfun(@(x) max(mydata(x+1 : x+n) - mydata(1:n)) - min(mydata(x+1 : x+n) - mydata(1:n)), num2cell(1:m));
time8 = toc;

%%%%%%%%%%%%%%%%%%%%%%%%%%
tic
Std = arrayfun(@(x) std(mydata(x+1 : x+n) - mydata(1:n)), (1:m));
pkpk = arrayfun(@(x) peak2peak(mydata(x+1 : x+n) - mydata(1:n)), (1:m));
time9 = toc;
%%%%%%%%%%%%%%%%%%%%%%%%%%
And here are the time results (in seconds):
time1: 24.47
time2: 23.56
time3: 25.20
time4: 45.44
time5: 42.99
time6: 46.27
time7: 43.62
time8: 62.49
time9: 41.69
Thank you!
I took your second solution (the fastest in your benchmark) and made some modifications.
A performance improvement can be achieved if you stop accessing mydata(1:n) in every loop iteration and instead assign it to an array before the loop, like this:
tic
mydata1ton = mydata(1:n);   % extracted once, outside the loop
for x = 1 : m
    mymat = bsxfun(@minus, mydata(x+1 : x+n), mydata1ton);
    pkpk(x) = peak2peak(mymat(:));   % max - min
    Std(x) = sqrt(sum(sum((mymat - mean(mymat(:))).^2)) / numel(mymat));   % std (named Std so it does not shadow the built-in std)
end
clear mydata1ton;
time2 = toc
Time before:
time2: 20.5618
Time after:
time2: 14.2260
Another modification: sum(sum(...)) can be changed to sum(...), because mymat is a vector, so the outer sum is only summing a single value.
Time after:
time2: 11.6573
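To see why the outer sum is redundant, here is a minimal check. It assumes a small made-up column vector v as a stand-in for mymat; the values are not from the original post.

```matlab
% v is a hypothetical stand-in for mymat (a column vector of differences).
v = [0.5; -1.25; 2.0; 0.75];

% Squared deviations from the mean; still a column vector.
sq = (v - mean(v)).^2;

% The inner sum already collapses the vector to a scalar, so the outer
% sum just returns that scalar unchanged.
a = sum(sum(sq));
b = sum(sq);

% Standard deviation normalized by numel(v), as in the loop above.
% This matches std(v, 1), not the default std(v), which divides by N-1.
s = sqrt(b / numel(v));
```

Note that this shortcut only holds while mymat is a vector; for a true matrix, a single sum would return one value per column.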
By the way, numel(mymat) can be replaced by n, but I didn't notice any performance improvement from that.
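Putting the modifications above together, the loop might look like the sketch below. A few things here are assumptions rather than part of the answer: the array sizes are reduced so it runs quickly, max() - min() replaces peak2peak (as the question suggests, to avoid the Signal Processing Toolbox), plain subtraction replaces bsxfun (both operands have the same size, as in the question's third variant), and the names base, d and Std are illustrative.

```matlab
% Reduced sizes compared to the post, only so the sketch runs quickly.
mydata = rand(2e4, 1) / 5e6;
m = 200;
n = length(mydata) - m;

pkpk = nan(m, 1);
Std  = nan(m, 1);      % capitalized to avoid shadowing the built-in std()

base = mydata(1:n);    % hoisted: extracted once instead of in every iteration
for x = 1 : m
    d = mydata(x+1 : x+n) - base;               % lag-x differences, n-by-1
    pkpk(x) = max(d) - min(d);                  % toolbox-free peak-to-peak
    Std(x) = sqrt(sum((d - mean(d)).^2) / n);   % single sum, n instead of numel
end
```

Timings will depend on the machine and MATLAB version, so it is worth wrapping the loop in tic/toc and comparing against the versions above.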