基于频率分析的Vigenere破解
Vigenere密码简介:
信息代码:X=(x1,x2,…,xd)∈(Z/26)d
密钥:K=(k1,k2,…,kd) ∈(Z/26)d
加密函数:E(x)=X+K=( x1+k1,x2+k2,…,xd+kd);
解密函数: D(x)=X-K=( x1-k1,x2-k2,…,xd-kd);
其实,vigenere密码本身也是一种凯撒挪移密码,但是比起凯撒密码,这个密码存在比较难确定的两个因素:区域长度d的选定和密钥K向量;但是,这种密码并非不可破解,而且破解思路也比较简单(如果不深究其中深刻的数学原理),接下来给出破解思路和代码实现:
一些公用代码:
求模函数:
public class OperationUtil {
/**
* @param start
* @param end exclude
* @param result
* @return
*/
public static int mod(int start,int end,int result) {
int res;
int difference=end-start;
int resultFromStart=result-start;
res=start+resultFromStart%difference;
while(res<start) {
res+=difference;
}
return res;
}
//把文本text分成""+text[0]+text[0+keyLength]...+,""+text[1]+text[1+keyLength]+...,..如
//此分组
public static ArrayList<String> groupText(String text, int keyLength) {
ArrayList<String> groups = new ArrayList<String>();
for (int i = 0; i < keyLength; ++i) {
StringBuilder tempGroup = new StringBuilder();
for (int j = 0; i + j * keyLength < text.length(); ++j) {
tempGroup.append(text.charAt(i + j * keyLength));
}
groups.add(tempGroup.toString());
}
return groups;
}
/**
* 计算密文字母频率向量和统计字母使用频率向量左旋g的内积
* @param frequencies
* 密文字母频率
* @param w
* 统计字母使用频率
* @param g
* 左旋步长
* @param subTextLength
* 区域段密文长度
* @return 拟重合指数
*/
public static double transvection(Map<Character, Integer> frequencies, double[] w, int g, int subTextLength) {
double MG = 0.0D;
for (char ch = 'A'; ch <= 'Z'; ++ch) {
// int的除法
double fi = frequencies.get(ch) * 1.0 / subTextLength;
int index = OperationUtil.mod(0, 26, ch - 'A' - g);
MG += fi * w[index];
}
return MG;
}
}
吻合指数计算方法[图片来源《密码学--加密演算法》]
基于频率分析的Vigenere破解/**
* 计算吻合指数
* @param wordMap 字符频率映射表
* @param textLength groupText产生每个分组的长度
* @return
*/
public static double calculateCoincidence(Map<Character, Integer> wordMap, int textLength) {
double IC = 0.0D;
double denominator = Math.pow((double) textLength, 2);
for (int k = 0; k < 26; ++k) {
double frequency = (double) wordMap.get((char) (k + 65));
IC += frequency * (frequency - 1);
}
IC /= denominator;
return IC;
}
/**
* 统计每个字符出现次数
* @param text groupText分组文本
* @return
*/
public static Map<Character, Integer> countEveryWord(String text) {
Map<Character, Integer> map = new HashMap<>();
for (int i = 0; i < 26; ++i) {
map.put((char) (i + 65), 0);
}
for (int i = 0; i < text.length(); ++i) {
char current = text.charAt(i);
int frenquency = map.get(current) + 1;
map.put(current, frenquency);
}
return map;
}
一、求出密钥的长度(区域长度d)
大致思路:在猜测最大密钥长度范围内找出平均重合指数最大的密钥长度
/**
* Friedman解密
* @param text 密文
* @param guess 猜测密钥最大长度
* @return 密钥长度
*/
public static int Friedman(String text,int guess) {
// 重合指数
double[] Ic = null;
// 平均重合指数
double avgIc = 0;
// 密文分组
ArrayList<String> groups=null;
double max=0;
int keyLength=0;
// 可能密钥长度
int possibleLength = 1;
while (possibleLength<=guess) {
Ic = new double[possibleLength];
avgIc = 0;
// 1 先根据密钥长度分组
groups = groupText(text, possibleLength);
// 2 再计算每一组的重合指数
for (int i = 0; i < possibleLength; ++i) {
String subCipher = groups.get(i); // 子串
Map<Character, Integer> frequencies = countEveryWord(subCipher);
Ic[i] = calculateCoincidence(frequencies, subCipher.length());
}
for (int i = 0; i < possibleLength; ++i) {
avgIc += Ic[i];
}
avgIc /= (double) possibleLength;
//求最大重合指数
if (avgIc >= max) {
max=avgIc;
keyLength=possibleLength;
}
possibleLength++;
}
return keyLength;
}
二、确定密钥
大致思路:因为字母移位周期为26,所以就相当于在26中寻找密文频率比例和字母统计频率内积值最大的就是密钥
/**
* 重合指数获取密钥
* @param keyLength 密钥长度
* @param text 密文
* @return 密钥
*/
public static int[] decryptCipher(int keyLength, String text) {
int[] key = new int[keyLength];
//字母常用频率分布表
double[] probability = new double[] { 0.082, 0.015, 0.028, 0.043, 0.127, 0.022, 0.02, 0.061, 0.07, 0.002, 0.008,
0.04, 0.024, 0.067, 0.075, 0.019, 0.001, 0.06, 0.063, 0.091, 0.028, 0.01, 0.023, 0.001, 0.02, 0.001 };
ArrayList<String> groups = groupText(text, keyLength);
for (int i = 0; i < keyLength; ++i) {
double MG;
double maxMG=0;
for (int g=0;g<26;++g) {
MG = 0;
String subCipher = groups.get(i);
Map<Character, Integer> frequencies = countEveryWord(subCipher);
MG = transvection(frequencies, probability, g, subCipher.length());
if (MG >= maxMG) {
maxMG=MG;
key[i] = g;
}
}
}
return key;
}
三、根据密钥得出明文
public static String decrypt(String text,int[]keys) {
StringBuilder plainText=new StringBuilder();
for (int i=0;i<text.length();++i) {
plainText.append((char)OperationUtil.mod(65, 91, text.charAt(i)-keys[i%keys.length]));
}
return plainText.toString();
}
四、测试文本
OCWYIKOOONIWUGPMXWKTZDWG
TSSAYJZWYEMDLBNQAAAVSUWD
VBRFLAUPLOOUBFGQHGCSCMGZ
LATOEDCSDEIDPBHTMUOVPIEK
IFPIMFNOAMVLPQFXEJSMXMPG
KCCAYKWFZPYUAVTELWHRHMWK
BBVGTGUVTEFJLODFEFKVPXSG
RSORVGTAJBSAUHZRZALKWUOW
HGEDEFNSWMRCIWCPAAAVOGPD
NFPKTDBALSISURLNPSJYEATC
UCEESOHHDARKHWOTIKBROQRD
FMZGHGUCEBVGWCDQXGPBGQWL
PBDAYLOOQDMUHBDQGMYWEUIK
密钥:7,14,11,12,4,18
真实密钥:7,14,11,12,4,10
最后一个误差可以通过考虑单词特征纠错