最长公共子字符串算法 JAVA实现
什么是最长公共子字符串算法? 举一个例子就清楚了
比如我们有两个字符串:
Please, peter go swimming!
I’m peter goliswi
那么该算法应该输出’peter go’. 最长公共子字符串算法通过suffix trees算法 (时间复杂度O(n),但是实现极其复杂) 可以获得效率很高的实现。但是在本帖中我们要使用效率稍次的‘动态编程’思想来实现该算法。动态编程,顾名思义, 就是重用前一步已经计算出来的信息。要理解这种实现,我们首先需要填写一个二维的整数型数组,假如我们用i表示水平方向的字符串(Please, peter go swimming!),以j表示垂直方向的字符串(I’m peter goliswi) 如图:
public static String longestSubstring(String str1, String str2) {StringBuilder sb = new StringBuilder();if (str1 == null || str1.isEmpty() || str2 == null || str2.isEmpty()) return "";// ignore casestr1 = str1.toLowerCase();str2 = str2.toLowerCase();// java initializes them already with 0int[][] num = new int[str1.length()][str2.length()];int maxlen = 0;int lastSubsBegin = 0;for (int i = 0; i < str1.length(); i++) {for (int j = 0; j < str2.length(); j++) { if (str1.charAt(i) == str2.charAt(j)) { if ((i == 0) || (j == 0)) num[i][j] = 1; else num[i][j] = 1 + num[i - 1][j - 1]; if (num[i][j] > maxlen) { maxlen = num[i][j]; // generate substring from str1 => i int thisSubsBegin = i - num[i][j] + 1; if (lastSubsBegin == thisSubsBegin) { //if the current LCS is the same as the last time this block ran sb.append(str1.charAt(i)); } else { //this block resets the string builder if a different LCS is found lastSubsBegin = thisSubsBegin; sb = new StringBuilder(); sb.append(str1.substring(lastSubsBegin, i + 1)); } }}}}return sb.toString();}