Let the cluster centers be $C = \{c_1, \ldots, c_k\}$ and let a bag be $\mathbf{B}_i = \{b_{i1}, \ldots, b_{im}\}$, where each $b_{ij}$ is an instance.
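To make the notation concrete, here is a minimal sketch that stores a bag as an `(m, d)` array and the centers as a `(k, d)` array, then labels each instance with its nearest center. The function name `assign_to_centers` and the nearest-center rule are assumptions made for illustration; the original code receives `labels` from elsewhere and may compute them differently.

```python
import numpy as np

def assign_to_centers(bag, centers):
    """Return, for each instance in `bag`, the index of its nearest center."""
    # bag: (m, d), centers: (k, d) -> pairwise Euclidean distances of shape (m, k)
    dists = np.linalg.norm(bag[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)

centers = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 5.0]])  # C with k = 3
bag = np.array([[0.5, -0.2], [9.0, 11.0], [0.1, 0.3]])         # B_i with m = 3
labels = assign_to_centers(bag, centers)
print(labels.tolist())  # [0, 1, 0]
```

No instance is closest to the third center in this toy bag, which is the "empty class" case that the mappings handle by zero filling.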
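1. Shortest-distance mapping
The mapping is defined as
$$h_i = \max_{j} \exp\left(-\frac{\|b_{ij} - c_i\|}{\sigma^2}\right)$$
where $j$ ranges over the instances of the bag, and the subscript on $h_i$ and $c_i$ ranges over the $k$ cluster centers (it is not the bag index).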
The final mapping vector of $\mathbf{B}_i$ is
$$H(\mathbf{B}_i) = [h_1, \ldots, h_k]$$
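import numpy as np

# Diverse-density-style mapping: one entry per cluster center.
# Method-body fragment: `centers`, `idx` and `dis_euclidean` come from the surrounding class/module.
instances = self.bags[idx, 0][:, :self.dimensions - 1]  # first (dimensions - 1) columns of the bag
ret_vec = np.ones(self.k_m)
for index, item in enumerate(centers):
    # Distance from this center to its nearest instance in the bag
    ret_vec[index] = np.linalg.norm(item - instances, axis=1).min()
# exp(-min distance) equals the max over instances of exp(-distance); sigma^2 is taken as 1 here
ret_vec = np.exp(-ret_vec)
# dis_euclidean(v, 0) is presumably the Euclidean norm of v, so this scales to unit length
return ret_vec / dis_euclidean(ret_vec, np.zeros_like(ret_vec))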
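The same mapping can be sketched as a standalone function with an explicit $\sigma$. The name `min_distance_mapping` and the `sigma` and `normalize` parameters are hypothetical and not part of the original class; the final unit-length scaling uses `np.linalg.norm` in place of the author's `dis_euclidean` helper, on the assumption that the two are equivalent.

```python
import numpy as np

def min_distance_mapping(bag, centers, sigma=1.0, normalize=True):
    """Map a bag (m, d) to a length-k vector, one entry per center."""
    # dists[t, j] = ||b_ij - c_t|| for center t and instance j
    dists = np.linalg.norm(centers[:, None, :] - bag[None, :, :], axis=2)
    # exp(-x) is decreasing, so max_j exp(-d_j / sigma^2) = exp(-min_j d_j / sigma^2)
    h = np.exp(-dists.min(axis=1) / sigma**2)
    if normalize:
        norm = np.linalg.norm(h)
        if norm > 0:
            h = h / norm  # scale to unit Euclidean length
    return h

centers = np.array([[0.0, 0.0], [3.0, 4.0]])
bag = np.array([[0.0, 0.0], [3.0, 0.0]])
h = min_distance_mapping(bag, centers, normalize=False)
print(h)  # first entry is exp(0) = 1, second is exp(-4)
```

In this example the first center coincides with an instance, so its entry is 1; the nearest instance to the second center is at distance 4, giving `exp(-4)`.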
2. De-centering mapping
The instances in $\mathbf{B}_i$ are partitioned by $C$ into $K$ classes, $K \le k$. Each class is a set $S_i = \{s_1, \ldots, s_l\}$, and the mapping is
$$h_i = \sum_{a=1}^{l} (s_a - c_i)$$
$h_i$ is a vector with the same dimension as an instance. Finally, $\mathbf{B}_i$ is mapped to $H(\mathbf{B}_i) = [h_1, \ldots, h_k]$. If no instance is assigned to $c_i$, the corresponding $h_i$ is filled with zeros.
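import numpy as np

# Method-body fragment: `centers`, `labels` (cluster index of each instance) and `idx` come from the caller.
instances = self.bags[idx, 0][:, :self.dimensions - 1]
ret_vec = np.zeros((len(centers), self.dimensions - 1))
for ins, label in zip(instances, labels):
    # Accumulate the residual of each instance with respect to its own center
    ret_vec[label] += ins - centers[label]
# Flatten to length k_m * (dimensions - 1). The original np.resize(ret_vec, self.k_m * self.dimensions-1)
# requested k_m * dimensions - 1 elements, so for k_m > 1 np.resize padded the vector with repeated values.
ret_vec = ret_vec.ravel()
# Signed square root, then scale to unit Euclidean length
ret_vec = np.sign(ret_vec) * np.sqrt(np.abs(ret_vec))
return ret_vec / dis_euclidean(ret_vec, np.zeros_like(ret_vec))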
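A self-contained sketch of the same idea may make the shapes easier to follow. The function name `residual_mapping` is hypothetical, `labels` is assumed to hold one center index per instance, and `np.linalg.norm` stands in for the author's `dis_euclidean` helper.

```python
import numpy as np

def residual_mapping(bag, centers, labels):
    """Sum the residuals (instance - center) per cluster, then normalize."""
    k, d = centers.shape
    h = np.zeros((k, d))
    # np.add.at accumulates correctly even when the same label occurs many times
    np.add.at(h, labels, bag - centers[labels])
    vec = h.ravel()                                # length k * d
    vec = np.sign(vec) * np.sqrt(np.abs(vec))      # signed square root
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec         # unit length unless all zeros

centers = np.array([[0.0, 0.0], [10.0, 10.0]])
bag = np.array([[1.0, 0.0], [0.0, 1.0], [12.0, 10.0]])
labels = np.array([0, 0, 1])
vec = residual_mapping(bag, centers, labels)
print(vec)
```

Here the residual sums are `(1, 1)` for the first center and `(2, 0)` for the second, so the flattened vector before normalization is `[1, 1, sqrt(2), 0]`, which has Euclidean norm 2.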
3. Mean mapping
The instances in $\mathbf{B}_i$ are partitioned by $C$ into $K$ classes, $K \le k$. Each class is a set $S_i = \{s_1, \ldots, s_l\}$, and the mapping is
$$h_i = \frac{1}{|S_i|} \sum_{a=1}^{l} s_a$$
$h_i$ is a vector with the same dimension as an instance. Finally, $\mathbf{B}_i$ is mapped to $H(\mathbf{B}_i) = [h_1, \ldots, h_k]$. If no instance is assigned to $c_i$, the corresponding $h_i$ is filled with zeros.
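import numpy as np

# Method-body fragment: `labels` (cluster index of each instance) and `idx` come from the caller.
instances = self.bags[idx, 0][:, :self.dimensions - 1]
ret_vec = np.zeros((self.k_m, self.dimensions - 1))
for ins, label in zip(instances, labels):
    ret_vec[label] += ins  # per-cluster sum of instances
# Divide each non-empty cluster's sum by its instance count; empty clusters stay zero
unique, count = np.unique(labels, return_counts=True)
for key, n in zip(unique, count):
    ret_vec[key] /= n
# Flatten to length k_m * (dimensions - 1). The original np.resize(ret_vec, self.k_m * self.dimensions-1)
# requested k_m * dimensions - 1 elements, so for k_m > 1 np.resize padded the vector with repeated values.
ret_vec = ret_vec.ravel()
# Signed square root, then scale to unit Euclidean length
ret_vec = np.sign(ret_vec) * np.sqrt(np.abs(ret_vec))
return ret_vec / dis_euclidean(ret_vec, np.zeros_like(ret_vec))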
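The per-cluster averaging step, including the zero fill for empty clusters, can be sketched on its own. The name `mean_mapping` is hypothetical, and this version returns the unnormalized `(k, d)` matrix; the signed square root and unit-length scaling from the snippet above are left out to keep the example short.

```python
import numpy as np

def mean_mapping(bag, labels, k):
    """Return a (k, d) matrix whose row t is the mean of the instances assigned to center t."""
    d = bag.shape[1]
    sums = np.zeros((k, d))
    np.add.at(sums, labels, bag)                 # per-cluster sums
    counts = np.bincount(labels, minlength=k)    # |S_t| for every center, 0 if empty
    h = np.zeros((k, d))
    nonempty = counts > 0
    h[nonempty] = sums[nonempty] / counts[nonempty, None]
    return h                                     # empty clusters remain all zeros

bag = np.array([[1.0, 0.0], [3.0, 2.0], [12.0, 10.0]])
labels = np.array([0, 0, 1])
h = mean_mapping(bag, labels, k=3)
print(h.tolist())  # [[2.0, 1.0], [12.0, 10.0], [0.0, 0.0]]
```

Calling `h.ravel()` afterwards gives the flat vector of length `k * d` used as the bag representation.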
4. Proportion mapping
The instances in $\mathbf{B}_i$ are partitioned by $C$ into $K$ classes, $K \le k$. Each class is a set $S_i = \{s_1, \ldots, s_l\}$, and the mapping is
$$h_i = \frac{|S_i|}{|\mathbf{B}_i|}$$
Finally, $\mathbf{B}_i$ is mapped to $H(\mathbf{B}_i) = [h_1, \ldots, h_k]$. If no instance is assigned to $c_i$, the corresponding $h_i$ is set to zero.
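import numpy as np

# Method-body fragment: `labels` (cluster index of each instance) and `idx` come from the caller.
ret_vec = np.zeros(self.k_m)
bag_size = len(self.bags_size[idx])  # |B_i|; assumes bags_size[idx] has one entry per instance
unique, count = np.unique(labels, return_counts=True)
for key, n in zip(unique, count):
    ret_vec[key] = n / bag_size  # fraction of the bag assigned to center `key`
return ret_vec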
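As a compact alternative, the proportions can be computed with `np.bincount`. This is a sketch under the assumption that `labels` contains one non-negative integer per instance of a non-empty bag; the name `proportion_mapping` is hypothetical.

```python
import numpy as np

def proportion_mapping(labels, k):
    """Fraction of the bag's instances assigned to each of the k centers."""
    counts = np.bincount(labels, minlength=k)  # |S_t| for every center, 0 if empty
    return counts / len(labels)                # divide by |B_i|

labels = np.array([0, 0, 2, 0])
h = proportion_mapping(labels, k=3)
print(h.tolist())  # [0.75, 0.0, 0.25]
```

The entries sum to 1 by construction, so no extra normalization is applied in this mapping.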
Test results
1. Shortest-distance mapping
bag-level classify result:
confusion:
[[282 2]
[ 2 650]]
precision: 0.9969325153374233
recall: 0.9969325153374233
f1-score: 0.9969325153374233
accuracy: 0.9957264957264957
start training single-instance model----------------
model trainig complete!
Finally instance result:
confusion:
[[ 337 3]
[ 1 9019]]
precision: 0.9996674794945688
recall: 0.9998891352549889
f1-score: 0.9997782950892363
accuracy: 0.9995726495726496
class-time 27.534499883651733
2. De-centering mapping
bag-level classify result:
confusion:
[[290 2]
[ 17 627]]
precision: 0.9968203497615262
recall: 0.9736024844720497
f1-score: 0.9850746268656716
accuracy: 0.9797008547008547
start training single-instance model----------------
model trainig complete!
Finally instance result:
confusion:
[[ 342 4]
[ 7 9007]]
precision: 0.9995560981023194
recall: 0.9992234302196583
f1-score: 0.9993897364771152
accuracy: 0.9988247863247863
class-time 31.642945766448975
3. Mean mapping
bag-level classify result:
confusion:
[[307 4]
[ 10 615]]
precision: 0.9935379644588045
recall: 0.984
f1-score: 0.9887459807073955
accuracy: 0.9850427350427351
start training single-instance model----------------
model trainig complete!
Finally instance result:
confusion:
[[ 368 5]
[ 1 8986]]
precision: 0.9994438883327772
recall: 0.999888728162902
f1-score: 0.9996662587607075
accuracy: 0.9993589743589744
class-time 32.479421615600586
4. Proportion mapping
bag-level classify result:
confusion:
[[272 25]
[ 9 630]]
precision: 0.9618320610687023
recall: 0.9859154929577465
f1-score: 0.973724884080371
accuracy: 0.9636752136752137
start training single-instance model----------------
model trainig complete!
Finally instance result:
confusion:
[[ 333 28]
[ 1 8998]]
precision: 0.9968978506536672
recall: 0.999888876541838
f1-score: 0.9983911234396671
accuracy: 0.9969017094017094
class-time 27.800466299057007
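The reported precision, recall, F1 and accuracy can be re-derived from the confusion matrices. The sketch below assumes that rows are true classes, columns are predicted classes, and class 1 is the positive class; that reading reproduces the logged numbers. The function name `binary_metrics` is hypothetical.

```python
def binary_metrics(confusion):
    """Compute precision, recall, F1 and accuracy from a 2x2 confusion matrix."""
    # Assumed layout: [[TN, FP], [FN, TP]] (rows = true class, columns = predicted class)
    (tn, fp), (fn, tp) = confusion
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tn + fp + fn + tp)
    return precision, recall, f1, accuracy

# Bag-level confusion matrix logged for the proportion mapping
p, r, f1, acc = binary_metrics([[272, 25], [9, 630]])
print(round(p, 4), round(r, 4), round(f1, 4), round(acc, 4))  # 0.9618 0.9859 0.9737 0.9637
```

The same function applied to the other logged matrices gives a quick way to compare the four mappings on equal terms.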