生物信息学与算法NLP1 生物信息学

隐马尔可夫模型(HMM)

2018-09-28  本文已影响70人  Thinkando

马尔可夫链

image.png
image.png

隐马尔可夫

用隐马尔可夫模型建模

python 简单实现

import numpy as np

seq="CGAAAAAATCG"
# nocoding ,coding
states = ('N', 'C')
observations = ('A', 'C', 'G','T')
# 状态转移矩阵A
transition_probability = {
    'N': {'N': 0.8, 'C': 0.2},
    'C': {'N': 0.4, 'C': 0.6},
}
# 观测概率矩阵B
emission_probability = {
    'N': {'A': 0.2, 'C': 0.3, 'G': 0.3,'T':0.2},
    'C': {'A': 0.4, 'C': 0.2, 'G': 0.2,'T':0.2},
}

array =np.zeros((len(states),len(seq)+1))

array[0,0]=0.8
array[1,0]=0.2
for i in range(1,array.shape[1]):
    if array[0,i-1]>array[1,i-1]:
        array[0,i]=array[0,i-1]*emission_probability['N'][seq[i-1]]
        array[1,i]=array[0,i-1]*emission_probability['C'][seq[i-1]]
    else:
        array[0, i] = array[1,i-1] * emission_probability['N'][seq[i - 1]]
        array[1, i] = array[1,i-1] * emission_probability['C'][seq[i - 1]]
    print(array)
list=[]
for i in range(1,array.shape[1]):
    if array[0,i]>array[1,i]:
        list.append('N')
    else:
        list.append('C')

print(seq)
print("".join(list))
[[8.000000e-01 2.400000e-01 7.200000e-02 1.440000e-02 5.760000e-03
  2.304000e-03 9.216000e-04 3.686400e-04 1.474560e-04 5.898240e-05
  1.769472e-05 5.308416e-06]
 [2.000000e-01 1.600000e-01 4.800000e-02 2.880000e-02 1.152000e-02
  4.608000e-03 1.843200e-03 7.372800e-04 2.949120e-04 5.898240e-05
  1.179648e-05 3.538944e-06]]
CGAAAAAATCG
NNCCCCCCCNN
上一篇下一篇

猜你喜欢

热点阅读