Approach #1
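%// val: does each element of a row of N occur in a row of M; ind: column of its first occurrence
[val,ind] = max(bsxfun(@eq,permute(M,[4 2 1 3]),permute(N,[2 3 4 1])),[],2)
%// A match needs every element to be found, at strictly increasing columns
matches = squeeze(all(diff(ind,1)>0,1).*all(val,1))
out1 = any(matches,2) %// Solution - 1
out2 = sum(matches,1) %// Solution - 2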
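To make the two outputs concrete, here is a minimal sketch that runs the same expressions on toy inputs. The matrices below are made up for illustration and are not from the original question; semicolons are added only to suppress the intermediate output.

```matlab
% Toy inputs (hypothetical, chosen for illustration)
M = [1 2 3 4 5;
     5 4 3 2 1;
     2 9 4 7 5;
     7 8 9 6 0];
N = [1 3 5;
     2 4 5;
     5 3 1];

% Same expressions as in the first approach
[val,ind] = max(bsxfun(@eq,permute(M,[4 2 1 3]),permute(N,[2 3 4 1])),[],2);
matches = squeeze(all(diff(ind,1)>0,1).*all(val,1)); % size(M,1)-by-size(N,1)

out1 = any(matches,2); % per row of M: does it contain any row of N, in order?
out2 = sum(matches,1); % per row of N: how many rows of M contain it, in order?

disp(matches) % [1 1 0; 0 0 1; 0 1 0; 0 0 0]
disp(out1.')  % 1 1 1 0
disp(out2)    % 1 2 1
```

Note that max returns the position of the first occurrence of each value, so a row of M with repeated values is tested using first occurrences only.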
Approach #2
Another approach that avoids permuting N and might be better for a longish N:
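%// Only M is permuted here; the comparison result is size(N,1) x size(N,2) x size(M,1) x size(M,2)
[val,ind] = max(bsxfun(@eq,N,permute(M,[3 4 1 2])),[],4)
%// matches is size(N,1) x size(M,1), i.e. transposed relative to Approach #1
matches = squeeze(all(diff(ind,[],2)>0,2).*all(val,2))
out1 = any(matches,1) %// Solution - 1
out2 = sum(matches,2) %// Solution - 2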
Approach #3
Memory-efficient approach for large data sizes:
out1 = false(size(M,1),1); %// Storage for Solution - 1
out2 = zeros(size(N,1),1); %// Storage for Solution - 2
for k=1:size(N,1)
    %// Process one row of N at a time to keep the intermediate arrays small
    [val3,ind3] = max(bsxfun(@eq,N(k,:),permute(M,[1 3 2])),[],3);
    matches = all(diff(ind3,[],2)>0,2).*all(val3,2);
    out1 = or(out1,matches);
    out2(k) = sum(matches);
end
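One way to gain confidence in the loop version is to compare it against the fully vectorized formulation that permutes only M, using small random inputs. The sketch below does that; the sizes, the value range and the names ref1/ref2 are assumptions made for illustration, not part of the original answer.

```matlab
% Small random inputs (sizes and value range are arbitrary assumptions)
M = randi(6, 200, 5);
N = randi(6, 30, 3);

% Vectorized reference: matches_ref is size(N,1)-by-size(M,1)
[val,ind] = max(bsxfun(@eq,N,permute(M,[3 4 1 2])),[],4);
matches_ref = squeeze(all(diff(ind,[],2)>0,2).*all(val,2));
ref1 = any(matches_ref,1).'; % column vector, one entry per row of M
ref2 = sum(matches_ref,2);   % column vector, one entry per row of N

% Loop version: one row of N per iteration
out1 = false(size(M,1),1);
out2 = zeros(size(N,1),1);
for k = 1:size(N,1)
    [val3,ind3] = max(bsxfun(@eq,N(k,:),permute(M,[1 3 2])),[],3);
    matches = all(diff(ind3,[],2)>0,2).*all(val3,2);
    out1 = or(out1,matches);
    out2(k) = sum(matches);
end

% Both formulations should agree exactly
assert(isequal(out1, ref1));
assert(isequal(out2, ref2));
```

The loop trades a single large 4-D intermediate for size(N,1) small 3-D ones, which is where the memory saving comes from.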
Approach #4
Memory-efficient approach for the GPU:
%// Copy the inputs to GPU memory
gM = gpuArray(M);
gN = gpuArray(N);
gout1 = false(size(gM,1),1,'gpuArray'); %// GPU Storage for Solution - 1
gout2 = zeros(size(gN,1),1,'gpuArray'); %// GPU Storage for Solution - 2
for k=1:size(gN,1)
    [val3,ind3] = max(bsxfun(@eq,gN(k,:),permute(gM,[1 3 2])),[],3);
    matches = all(diff(ind3,[],2)>0,2).*all(val3,2);
    gout1 = or(gout1,matches);
    gout2(k) = sum(matches);
end
%// Bring the results back to host memory
out1 = gather(gout1); %// Solution - 1
out2 = gather(gout2); %// Solution - 2
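Now, this GPU approach has blown away all the other approaches. It was run with M : 320000X5 and N : 2100X3 (the same as your input sizes), both filled with random integers. On a GTX 750 Ti, it took just 13.867873 seconds!! So, if you have a GPU with sufficient memory, this might be your winning approach too.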
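For anyone who wants to reproduce this kind of measurement, a rough timing harness could look like the sketch below. The reduced row counts, the value range of the random integers and the useGPU switch are all assumptions made for illustration; the answer does not state how its inputs were generated beyond "random integers", and timings will differ by hardware.

```matlab
% Benchmark sketch (assumed setup, not the original author's script)
nM = 20000;  % the answer used 320000 rows
nN = 100;    % the answer used 2100 rows
maxval = 20; % value range is a guess
M = randi(maxval, nM, 5);
N = randi(maxval, nN, 3);

useGPU = false; % set to true if a supported GPU and gpuArray are available
if useGPU
    M = gpuArray(M);
    N = gpuArray(N);
    out1 = false(nM,1,'gpuArray');
    out2 = zeros(nN,1,'gpuArray');
else
    out1 = false(nM,1);
    out2 = zeros(nN,1);
end

tic
for k = 1:nN
    [val3,ind3] = max(bsxfun(@eq,N(k,:),permute(M,[1 3 2])),[],3);
    matches = all(diff(ind3,[],2)>0,2).*all(val3,2);
    out1 = or(out1,matches);
    out2(k) = sum(matches);
end
if useGPU
    % gather copies the results back to host memory before the timer stops
    out1 = gather(out1);
    out2 = gather(out2);
end
elapsed = toc;
fprintf('Elapsed: %.6f seconds\n', elapsed);
```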
Approach #5
Extremely memory-efficient approach for the GPU:
gM = gpuArray(M);
gN = gpuArray(N);
gout1 = false(size(gM,1),1,'gpuArray'); %// GPU Storage for Solution - 1
gout2 = zeros(size(gN,1),1,'gpuArray'); %// GPU Storage for Solution - 2
for k=1:size(gN,1)
    %// Permute only the single row of N, leaving gM in its original layout
    [val2,ind2] = max(bsxfun(@eq,gM,permute(gN(k,:),[1 3 2])),[],2);
    matches = all(diff(ind2,[],3)>0,3).*all(val2,3);
    gout1 = or(gout1,matches);
    gout2(k) = sum(matches);
end
out1 = gather(gout1); %// Solution - 1
out2 = gather(gout2); %// Solution - 2