Java PDFBox


@[TOC](PDFBox (2.0.25): Modifying Text Content)

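I could not find anything online about modifying text content with recent PDFBox versions, and the examples written for older versions are incompatible (they have no effect). After several weeks of searching I finally found one that works:
the article "PDFBOX替换文本" (PDFBox text replacement, written for PDFBox 2.0.24). I had come across the same approach on my own but could not get the character encoding to work; after consulting that article I finally got it right 😊. It was not easy.
I am recording it here for future reference.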

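// Imports the test needs (PDFBox 2.0.x; JUnit 4 and Spring's ResourceUtils are assumed)
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.apache.pdfbox.cos.COSArray;
import org.apache.pdfbox.cos.COSName;
import org.apache.pdfbox.cos.COSString;
import org.apache.pdfbox.pdfparser.PDFStreamParser;
import org.apache.pdfbox.pdfwriter.ContentStreamWriter;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.common.PDStream;
import org.apache.pdfbox.pdmodel.font.PDFont;
import org.apache.pdfbox.pdmodel.font.PDType0Font;
import org.junit.Test;
import org.springframework.util.ResourceUtils;

@Test
	public void pdf() throws Exception {
		File file = ResourceUtils.getFile("classpath:5.pdf");
		PDDocument pd = PDDocument.load(file);
		// Fonts already seen in the content streams, keyed by resource name
		Map<COSName, PDFont> oldfont = new HashMap<>();
		COSName fontName = null;
		// Replacement font file that contains the required glyphs (backslashes must be escaped)
		PDType0Font targetfont = PDType0Font.load(pd, new File("C:\\Windows\\Fonts\\simfang.ttf"));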
		for (PDPage page : pd.getPages()) {
			PDFStreamParser pdfsp = new PDFStreamParser(page);
			pdfsp.parse();
			List<Object> tokens = pdfsp.getTokens();
            for (int j = 0; j < tokens.size(); j++) {
                // Current token of the content stream
                Object next = tokens.get(j);
                if (next instanceof COSName) {
                    // Track the name only if it resolves to a font; other names
                    // (XObjects, graphics states, ...) must not overwrite fontName
                    PDFont font = page.getResources().getFont((COSName) next);
                    if (font != null) {
                        fontName = (COSName) next;
                        oldfont.putIfAbsent(fontName, font);
                    }
                } else if (next instanceof COSString) {
                    COSString previous = (COSString)next;
                    try(InputStream in = new ByteArrayInputStream(previous.getBytes())){
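                        // Decode the string bytes to Unicode with the font that was active for it
                        PDFont font = oldfont.get(fontName);
                        StringBuilder sb = new StringBuilder();
                        while (in.available() > 0) {
                            String text = font.toUnicode(font.readCode(in));
                            if (text != null) {
                                sb.append(text);
                            }
                        }
                        // Demo edit: append "例" ("example"), then reset the COSString
                        // to the text re-encoded with the target font
                        sb.append("例");
                        System.out.println("--Tj----" + sb);
                        previous.setValue(targetfont.encode(sb.toString()));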
                    }
                }else if(next  instanceof COSArray) {
                    // A TJ array mixes string fragments with kerning numbers;
                    // collect the bytes of all string fragments in order
                    byte[] pstring = {};
                    COSArray previous = (COSArray) next;
                    for (int k = 0; k < previous.size(); k++) {
                        Object arrElement = previous.getObject(k);
                        if (arrElement instanceof COSString) {
                            // Append this fragment's raw bytes to what has been collected so far
                            byte[] thisbyte = ((COSString) arrElement).getBytes();
                            byte[] temp = new byte[pstring.length + thisbyte.length];
                            System.arraycopy(pstring, 0, temp, 0, pstring.length);
                            System.arraycopy(thisbyte, 0, temp, pstring.length, thisbyte.length);
                            pstring = temp;
                        }
                    }
                    // Rewrite only arrays that actually contained string fragments
                    if (pstring.length > 0) {
                        try (InputStream in = new ByteArrayInputStream(pstring)) {
                            PDFont font = oldfont.get(fontName);
                            StringBuilder sb = new StringBuilder();
                            while (in.available() > 0) {
                                String text = font.toUnicode(font.readCode(in));
                                if (text != null) {
                                    sb.append(text);
                                }
                            }
                            sb.append("例"); // demo edit, same as in the Tj branch
                            System.out.println("TJ----" + sb);
                            // Collapse the array to one string encoded with the target font;
                            // the kerning numbers are dropped
                            previous.clear();
                            previous.add(new COSString(targetfont.encode(sb.toString())));
                        }
                    }
                }
            }
            // Write the modified tokens into a new content stream
            PDStream updatedStream = new PDStream(pd);
            try (OutputStream out = updatedStream.createOutputStream()) {
                ContentStreamWriter tokenWriter = new ContentStreamWriter(out);
                tokenWriter.writeTokens(tokens);
            }
            // Point every font name seen so far at the replacement font
            oldfont.forEach((k, v) -> page.getResources().put(k, targetfont));
            page.setContents(updatedStream);
		}
		pd.save("d:/1.pdf");
		pd.close();
	}
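
The TJ branch above joins the string fragments of an array by copying byte arrays repeatedly. As a minimal sketch of the same idea using only the JDK, the fragments can be collected with a `ByteArrayOutputStream`. The class `TjFragments` and the method `mergeStrings` are hypothetical names for illustration, and plain `byte[]` and `Integer` values stand in for the `COSString` and number elements of the array.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

public class TjFragments {
    /**
     * Concatenates every byte[] element of a TJ-style array in order and
     * skips the numeric kerning adjustments between the fragments.
     */
    public static byte[] mergeStrings(List<Object> tjArray) {
        ByteArrayOutputStream merged = new ByteArrayOutputStream();
        for (Object element : tjArray) {
            if (element instanceof byte[]) {
                byte[] fragment = (byte[]) element;
                merged.write(fragment, 0, fragment.length);
            }
        }
        return merged.toByteArray();
    }

    public static void main(String[] args) {
        // Models [(He) -20 (llo)] TJ: two string fragments around one kerning number
        List<Object> tjArray = Arrays.asList(
                "He".getBytes(StandardCharsets.US_ASCII),
                -20,
                "llo".getBytes(StandardCharsets.US_ASCII));
        byte[] merged = mergeStrings(tjArray);
        System.out.println(new String(merged, StandardCharsets.US_ASCII)); // prints "Hello"
    }
}
```

In the real listing the merged bytes are font-specific character codes, so they still have to be decoded with the font's `readCode` and `toUnicode` rather than with a charset as in this toy example.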
	

Yesterday's code had a problem; it has since been fixed.
Before modification: [image]

After modification: [image]
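
One fragile spot in the listing is that it reacts to token types alone: any name token is a font candidate and any string token is treated as shown text. Content streams are written in postfix order (operands first, then the operator), so a stricter variant would wait for the operator and only then look back at its operands. The following is a simplified model of that idea, not PDFBox code: `Name` and `Op` are hypothetical stand-ins for the parser's name and operator tokens, and plain `String` values stand in for string operands. In PDFBox 2.0 the operator tokens are `org.apache.pdfbox.contentstream.operator.Operator` objects, whose `getName()` would take the place of `Op.value` here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class OperatorScan {
    /** Hypothetical stand-in for a name operand such as /F1 or /GS1. */
    static final class Name {
        final String value;
        Name(String value) { this.value = value; }
    }

    /** Hypothetical stand-in for an operator such as Tf, Tj or gs. */
    static final class Op {
        final String value;
        Op(String value) { this.value = value; }
    }

    /** Returns "font:text" for each string shown by Tj, using the font set by the latest Tf. */
    public static List<String> shownText(List<Object> tokens) {
        List<String> shown = new ArrayList<>();
        String currentFont = null;
        for (int i = 0; i < tokens.size(); i++) {
            Object token = tokens.get(i);
            if (!(token instanceof Op)) {
                continue; // operands are interpreted only once their operator is reached
            }
            String op = ((Op) token).value;
            if (op.equals("Tf") && i >= 2 && tokens.get(i - 2) instanceof Name) {
                // Tf operands: font name, then font size
                currentFont = ((Name) tokens.get(i - 2)).value;
            } else if (op.equals("Tj") && i >= 1 && tokens.get(i - 1) instanceof String) {
                shown.add(currentFont + ":" + tokens.get(i - 1));
            }
        }
        return shown;
    }

    public static void main(String[] args) {
        // Models: /GS1 gs BT /F1 12 Tf (Hello) Tj ET
        List<Object> tokens = Arrays.asList(
                new Name("GS1"), new Op("gs"), new Op("BT"),
                new Name("F1"), 12, new Op("Tf"),
                "Hello", new Op("Tj"), new Op("ET"));
        System.out.println(shownText(tokens)); // prints [F1:Hello]
    }
}
```

In this model the `/GS1` name never becomes the current font because it is the operand of `gs`, not of `Tf`.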

Sharing is welcome; when reposting, please credit the source: 内存溢出

Original article: https://outofmemory.cn/langs/719778.html
