串之朴素算法和kmp算法(java实现)_随笔

串之朴素算法和kmp算法(java实现)

文章目录

- 串
- BF算法(朴素算法)
- KMP算法

串

其实串也是一种数据结构，一般都是由顺序结构(数组)实现串，因为一般对于串的 *** 作是查找和修改，也可以使用链式存储(节点)实现串，为了节省空间可以在一个节点中存储多个字符

java中有个String类封装了很多关于串的方法，可以去看一看
串比较重要的算法就是在一个主串中，找到匹配串的在主串中的起头的索引

BF算法(朴素算法)

主串和匹配串的字符不匹配时：得出一个规律，主串回溯 i-j+1, 匹配串从头开始
主串和匹配串的字符匹配时：继续比较下一个字符，结束条件是 i 和主串长度相同或者 j 和匹配串长度相同
一般是求出匹配串在主串的开头位置
时间复杂度为O(m*n)

public class BF {

    public static void main(String[] args) {
        System.out.println(bf("wtklzh","klz"));
    }

    public static int bf(String str1, String str2){

        for (int i = 0, j = 0; i < str1.length(); i++) {
            if(str1.charAt(i) == str2.charAt(j)){
                j++;
            }else{
                i = i-j+1;
                j = 0;
            }
            if(j == str2.length()){
                return i - j+1;
            }
        }
        return -1;
    }
}

KMP算法

三个外国人搞出来的，所以就以三个人名字的首字母命名的算法，时间复杂度为O(m+n)

kmp算法分为两部分：

求next数组，比较难以理解
kmp匹配

优点：

i (主串索引)不用回溯了
j(匹配串索引)不用回到头开始匹配
时间复杂度为O(m+n)

注意:

通常，写kmp算法的时候，索引0是废弃的，但是也可以不废弃

手算匹配串的next[]数组:

前缀: 包含首字符不包含末位字符的串
后缀: 包含末位字符不包含首位字符串的串

匹配串索引： 0 1 2 3 4 5

匹配串的值： a b a b a c

getNext()的next[]数组 : 0 0 1 2 3 0 包含当前索引字符的串的重复的数量

getNext2()的next数组： 0 0 1 1 2 0 不包含当前索引字符的串的重复数量，这个算法是求废弃索引0的匹配串的数组，不然next[2] = 1解释不了

过程:看着代码走一遍流程，索引0没有废弃
默认索引0处的值为0，也就是a对应的next[0] = 0
b的子串是a,j这时为0， next[1] = 0;
a往前找，这时j = 0, i = 2,这个时候arr[i] == arr[j] 都是a,j++，j这时是1，next[2] = 1
b往前找，这时j = 1, i = 3,j>0 但是 arr[3]  == arr[1],都是b, j++ 这时j是2,next[3] = 2
a往前找，这时j = 2, i = 4, j>0 但是arr[4]  == arr[2],都是a, j++，这事j是3,next[4] = 3
c往前找，这时j = 3, i = 5, j>0 arr[5] != arr[3]、c!=b，将j = next[j-1]=next[2]=1，这时j是2
		j>0, arr[5] != arr[2]、c!=a,将j = next[j-1] = next[1] = 0, 这时j是1
		j>0, arr[5] != arr[0]、c!=a,将j = next[j-1] = next[0] = 0，这时j是0
		j>0不成立，arr[5] != arr[0],将next[i] = j、next[5] = 0

原理:

匹配串索引： 0 1 2 3 4 5

匹配串的值： a b a b a c

getNext()的next[]数组 : 0 0 1 2 3 0 包含当前索引字符的串的重复的数量

参考：

废弃索引的next[] 数组讲解

https://www.bilibili.com/video/BV1uJ411s7br/?spm_id_from=autoNext

代码

import java.util.Arrays;

public class KMP {
    public static int[] getNext(String arr){
        //初始化一个数组，长度是匹配串的长度
        int[] next = new int[arr.length()];
        next[0] = 0;
        for (int i = 1,j = 0; i < arr.length(); i++) {
            while (j>0 && arr.charAt(i) != arr.charAt(j)){
                j = next[j - 1];
            }
            if(arr.charAt(i) == arr.charAt(j)){
                j++;
            }
            next[i] = j;
        }

        System.out.println(Arrays.toString(next));
        return next;
    }

    //这个算法是求废弃索引0的匹配串的数组，不然next[2] = 1解释不了
    public static int[] getNext2(String arr){
        //初始化一个数组，长度是匹配串的长度
        int[] next = new int[arr.length()];
        next[0] = 0;
        next[1] = 0;
        int i = 1, j = 0;
        while(i < arr.length()-1){
            if(j == 0 || arr.charAt(i) == arr.charAt(j)){
                next[++i] = ++j;
            }else {
                j = next[j];
            }
        }
        System.out.println(Arrays.toString(next));
        return next;
    }

    public static int kmp(String str1, String str2){
        //获得匹配数组的next数组
        int[] next = getNext(str2);
        //循环主串，kmp的主串不用回溯
        for (int i = 0, j = 0; i < str1.length(); i++) {
            //不匹配
            while (j > 0 && str1.charAt(i) != str2.charAt(j)){
                //kmp中，匹配串不在回到头部，根据next数组记录的值，来决定匹配串的位置
                j = next[j-1];
            }
            //匹配
            if(str1.charAt(i) == str2.charAt(j)){
                j++;
            }
            //匹配串匹配完成结束程序
            if(j == str2.length()){
                return i-j+1;
            }
        }
        return -1;
    }


    public static void main(String[] args) {

        System.out.println(kmp("wtklzh","klz"));
        System.out.println(kmp("aaaabc","aababac"));
    }
}

欢迎分享，转载请注明来源：内存溢出

原文地址: http://outofmemory.cn/zaji/5692967.html

串之朴素算法和kmp算法(java实现)

发表评论

评论列表（0条）