Implementing a Custom Sensitive-Word Filter in Java


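I recently got a requirement to add a sensitive-word management module. The management side is the usual CRUD, with the sensitive words stored in our own database. On top of that, we filter text against this custom word list. Without further ado, here is the code.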

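1. Utility class

This is only the simplest version: it loads the sensitive words and replaces each occurrence in the text. Extend it to fit your own business requirements.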

package com.zylc.bixiang.business.keywords.web;

import com.zylc.bixiang.business.keywords.domain.repository.BXKeyWordsMapper;
import com.zylc.bixiang.business.keywords.domain.vo.BXKeyWordsVO;
import org.springframework.beans.factory.annotation.Autowired;

import java.util.HashMap;
import java.util.List;
import java.util.Map;


// Note: @Autowired is only honored when Spring manages this class as a bean.
public class SensitiveWordFilter {
    @Autowired
    BXKeyWordsMapper bxKeyWordsMapper;

    // Configuration; replaceAll and keyWordsVOS are filled in by InitializationWork()
    private StringBuilder replaceAll;
    private String encoding = "UTF-8";
    private String replceStr = "*";
    private int replceSize = 500;

    private List<BXKeyWordsVO> keyWordsVOS;

    // Masks every occurrence of every loaded keyword in str.
    // InitializationWork() must be called before this method.
    public String filterInfo(String str) {
        StringBuilder buffer = new StringBuilder(str);
        // Maps a match's start index to its end index (exclusive)
        Map<Integer, Integer> hash = new HashMap<>(keyWordsVOS.size());
        for (BXKeyWordsVO keyWord : keyWordsVOS) {
            String temp = keyWord.getValue();
            // Skip empty entries: indexOf("") always matches and would never advance
            if (temp == null || temp.isEmpty()) {
                continue;
            }
            int findIndexSize = 0;
            int start;
            while ((start = buffer.indexOf(temp, findIndexSize)) > -1) {
                // Continue the search after the match just found
                findIndexSize = start + temp.length();
                // End index already recorded for this start position, if any
                Integer mapStart = hash.get(start);
                // Keep the longest match that begins at this position
                if (mapStart == null || findIndexSize > mapStart) {
                    hash.put(start, findIndexSize);
                }
            }
        }
        for (Map.Entry<Integer, Integer> entry : hash.entrySet()) {
            int startIndex = entry.getKey();
            int endIndex = entry.getValue();
            // The replacement has the same length as the match, so the other recorded
            // indices stay valid. A match longer than replceSize would make substring throw.
            buffer.replace(startIndex, endIndex, replaceAll.substring(0, endIndex - startIndex));
        }
        return buffer.toString();
    }

    
    public void InitializationWork() {
        replaceAll = new StringBuilder(replceSize);
        for (int x = 0; x < replceSize; x++) {
            replaceAll.append(replceStr);
        }
        // Load the word list
        keyWordsVOS = bxKeyWordsMapper.findAllKeywords();
    }

    public StringBuilder getReplaceAll() {
        return replaceAll;
    }

    public void setReplaceAll(StringBuilder replaceAll) {
        this.replaceAll = replaceAll;
    }

    public String getReplceStr() {
        return replceStr;
    }

    public void setReplceStr(String replceStr) {
        this.replceStr = replceStr;
    }

    public int getReplceSize() {
        return replceSize;
    }

    public void setReplceSize(int replceSize) {
        this.replceSize = replceSize;
    }

    public void setEncoding(String encoding) {
        this.encoding = encoding;
    }
}
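
To try the idea without Spring or a database, here is a minimal standalone sketch of the same masking technique. It is not the class above: `SimpleWordMasker` and `mask` are hypothetical names, the word list is a plain `List<String>`, and it marks matched characters in a boolean array instead of keeping a start-to-end map.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical standalone variant of the filter above, for experimentation only.
public class SimpleWordMasker {
    private final List<String> words;
    private final char maskChar;

    public SimpleWordMasker(List<String> words, char maskChar) {
        this.words = words;
        this.maskChar = maskChar;
    }

    /** Replaces every occurrence of every word with maskChar; the text length is unchanged. */
    public String mask(String text) {
        // masked[i] is true when character i belongs to at least one match
        boolean[] masked = new boolean[text.length()];
        for (String word : words) {
            if (word == null || word.isEmpty()) {
                continue; // an empty word would match everywhere and never advance
            }
            int from = 0;
            int start;
            while ((start = text.indexOf(word, from)) > -1) {
                Arrays.fill(masked, start, start + word.length(), true);
                from = start + word.length(); // continue searching after this match
            }
        }
        StringBuilder out = new StringBuilder(text);
        for (int i = 0; i < masked.length; i++) {
            if (masked[i]) {
                out.setCharAt(i, maskChar);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        SimpleWordMasker masker = new SimpleWordMasker(Arrays.asList("bad", "badword"), '*');
        // prints "a ******* and a *** day"
        System.out.println(masker.mask("a badword and a bad day"));
    }
}
```

Because every match is overwritten in place with characters of the same count, overlapping keywords such as "bad" and "badword" need no special handling and no index shifts during replacement.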

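3. Our business assigns a sensitivity level to each word, and I need every sensitive word together with its level. That is why the word list initialized above is a collection of objects.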
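
As a rough illustration of why the level is kept next to each word, the sketch below looks up the highest level among the words found in a text. Everything here is an assumption for demonstration: `LeveledWord` is a hypothetical stand-in for `BXKeyWordsVO`, the `getLevel()` accessor and the integer scale (higher means more sensitive, 0 means no match) are invented, and the real VO may look different.

```java
import java.util.Arrays;
import java.util.List;

public class LevelLookupDemo {
    /** Hypothetical stand-in for BXKeyWordsVO: a word plus its sensitivity level. */
    public static final class LeveledWord {
        private final String value;
        private final int level;

        public LeveledWord(String value, int level) {
            this.value = value;
            this.level = level;
        }

        public String getValue() { return value; }

        public int getLevel() { return level; }
    }

    /** Returns the highest level among the words found in text, or 0 when none match. */
    public static int highestLevel(String text, List<LeveledWord> words) {
        int highest = 0;
        for (LeveledWord word : words) {
            String value = word.getValue();
            if (value != null && !value.isEmpty() && text.contains(value)) {
                highest = Math.max(highest, word.getLevel());
            }
        }
        return highest;
    }

    public static void main(String[] args) {
        List<LeveledWord> words = Arrays.asList(
                new LeveledWord("spam", 1),
                new LeveledWord("scam", 3));
        System.out.println(highestLevel("this is spam", words));        // prints 1
        System.out.println(highestLevel("spam and a scam", words));     // prints 3
        System.out.println(highestLevel("nothing to see here", words)); // prints 0
    }
}
```

A caller could use the returned level to decide, for example, whether to mask the text or reject it outright; that policy is outside what the original post describes.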

 

 

Feel free to share; when reposting, please credit the source: 内存溢出

Original article: http://outofmemory.cn/zaji/4829172.html
