objective-c – Getting the text position while parsing a PDF with Quartz 2D


Another question about PDF parsing...
I have just read section 5.3.1, "Text-Positioning Operators", of the PDF Reference, version 1.7, and I am a little confused.

I wrote some code to get the transformation matrix and the initial text position.

    // Marked-content operators
    CGPDFOperatorTableSetCallback(table, "MP", &op_MP);   // Define marked-content point
    CGPDFOperatorTableSetCallback(table, "DP", &op_DP);   // Define marked-content point with property list
    CGPDFOperatorTableSetCallback(table, "BMC", &op_BMC); // Begin marked-content sequence
    CGPDFOperatorTableSetCallback(table, "BDC", &op_BDC); // Begin marked-content sequence with property list
    CGPDFOperatorTableSetCallback(table, "EMC", &op_EMC); // End marked-content sequence

    // Text state operators
    CGPDFOperatorTableSetCallback(table, "Tc", &op_Tc);
    CGPDFOperatorTableSetCallback(table, "Tw", &op_Tw);
    CGPDFOperatorTableSetCallback(table, "Tz", &op_Tz);
    CGPDFOperatorTableSetCallback(table, "TL", &op_TL);
    CGPDFOperatorTableSetCallback(table, "Tf", &op_Tf);
    CGPDFOperatorTableSetCallback(table, "Tr", &op_Tr);
    CGPDFOperatorTableSetCallback(table, "Ts", &op_Ts);

    // Text-showing operators
    CGPDFOperatorTableSetCallback(table, "TJ", &op_TJ);
    CGPDFOperatorTableSetCallback(table, "Tj", &op_Tj);
    CGPDFOperatorTableSetCallback(table, "'", &op_apostrof);
    CGPDFOperatorTableSetCallback(table, "\"", &op_double_apostrof);

    // Text-positioning operators
    CGPDFOperatorTableSetCallback(table, "Td", &op_Td);
    CGPDFOperatorTableSetCallback(table, "TD", &op_TD);
    CGPDFOperatorTableSetCallback(table, "Tm", &op_Tm);
    CGPDFOperatorTableSetCallback(table, "T*", &op_T);

    // Text object operators
    CGPDFOperatorTableSetCallback(table, "BT", &op_BT);   // Begin text object
    CGPDFOperatorTableSetCallback(table, "ET", &op_ET);   // End text object

So this is the output after launching the application:

    2010-09-02 15:09:23.041 testSearch[8251:207] op_BT begin
    Integer value: 0
    2010-09-02 15:09:23.043 testSearch[8251:207] op_BT end
    2010-09-02 15:09:23.043 testSearch[8251:207] op_Tf begin
    Integer value: 1
    2010-09-02 15:09:23.044 testSearch[8251:207] op_Tf end
    2010-09-02 15:09:23.044 testSearch[8251:207] op_Tm begin
    float value: 557.364197
    2010-09-02 15:09:23.045 testSearch[8251:207] op_Tm end
    2010-09-02 15:09:23.045 testSearch[8251:207] op_TJ begin
    2010-09-02 15:09:23.046 testSearch[8251:207] Array string value [0]: F
    2010-09-02 15:09:23.046 testSearch[8251:207] Array integer value [1]: 94985208
    2010-09-02 15:09:23.047 testSearch[8251:207] Array string value [2]: r
    2010-09-02 15:09:23.047 testSearch[8251:207] Array integer value [3]: 94985208
    2010-09-02 15:09:23.048 testSearch[8251:207] Array string value [4]: o
    2010-09-02 15:09:23.048 testSearch[8251:207] Array integer value [5]: 94985208
    2010-09-02 15:09:23.049 testSearch[8251:207] Array string value [6]: m s
    2010-09-02 15:09:23.049 testSearch[8251:207] Array integer value [7]: 94985208
    2010-09-02 15:09:23.049 testSearch[8251:207] Array string value [8]: a
    2010-09-02 15:09:23.050 testSearch[8251:207] Array integer value [9]: 94985208
    2010-09-02 15:09:23.050 testSearch[8251:207] Array string value [10]: m
    2010-09-02 15:09:23.051 testSearch[8251:207] Array integer value [11]: 94985208
    2010-09-02 15:09:23.051 testSearch[8251:207] Array string value [12]: p
    2010-09-02 15:09:23.052 testSearch[8251:207] Array integer value [13]: 94985208
    2010-09-02 15:09:23.053 testSearch[8251:207] Array string value [14]: l
    2010-09-02 15:09:23.054 testSearch[8251:207] Array integer value [15]: 94985208
    2010-09-02 15:09:23.055 testSearch[8251:207] Array string value [16]: e t
    2010-09-02 15:09:23.055 testSearch[8251:207] Array integer value [17]: 94985208
    2010-09-02 15:09:23.057 testSearch[8251:207] Array string value [18]: o r
    2010-09-02 15:09:23.057 testSearch[8251:207] Array integer value [19]: 94985208
    2010-09-02 15:09:23.058 testSearch[8251:207] Array string value [20]: e
    2010-09-02 15:09:23.058 testSearch[8251:207] Array integer value [21]: 94985208
    2010-09-02 15:09:23.059 testSearch[8251:207] Array string value [22]: s
    2010-09-02 15:09:23.059 testSearch[8251:207] Array integer value [23]: 94985208
    2010-09-02 15:09:23.060 testSearch[8251:207] Array string value [24]: u
    2010-09-02 15:09:23.061 testSearch[8251:207] Array integer value [25]: 94985208
    2010-09-02 15:09:23.061 testSearch[8251:207] Array string value [26]: l
    2010-09-02 15:09:23.062 testSearch[8251:207] Array integer value [27]: 94985208
    2010-09-02 15:09:23.062 testSearch[8251:207] Array string value [28]: t
    2010-09-02 15:09:23.063 testSearch[8251:207] op_TJ end
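
The log shows only one float for Tm, although the operator takes six operands (a b c d e f). A scanner such as Quartz's CGPDFScanner hands operands back from a stack, so the operand written last in the content stream is popped first; a callback that pops once would see only the last operand. The sketch below simulates that order in plain C. `OperandStack`, `stack_push`, `stack_pop` and `pop_matrix` are hypothetical stand-ins for illustration, not Quartz API, and every operand value except 557.364197 is made up.

```c
#include <stddef.h>

/* Hypothetical stand-in for a scanner's operand stack (not a Quartz type). */
typedef struct {
    double values[16];
    size_t count;
} OperandStack;

/* Pushes one operand; returns 1 on success, 0 if the stack is full. */
int stack_push(OperandStack *s, double v) {
    if (s->count >= sizeof s->values / sizeof s->values[0]) return 0;
    s->values[s->count++] = v;
    return 1;
}

/* Pops the most recently pushed operand; returns 1 on success, 0 if empty. */
int stack_pop(OperandStack *s, double *out) {
    if (s->count == 0) return 0;
    *out = s->values[--s->count];
    return 1;
}

/* Pops the six Tm operands so that m[0..5] = a b c d e f.
   The stack yields f first, so the array is filled from the back.
   Returns 1 on success, 0 if fewer than six operands are available. */
int pop_matrix(OperandStack *s, double m[6]) {
    if (s->count < 6) return 0;
    for (int i = 5; i >= 0; i--) {
        if (!stack_pop(s, &m[i])) return 0;
    }
    return 1;
}
```

In a real op_Tm callback the same pattern would apply with six calls to the scanner's pop function, storing the results in reverse order.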

If anyone is familiar with text matrices and the text-positioning operators, an explanation of how all these operators work would be great.

How do I calculate the text position (or the glyph position?) using Tm (the transformation matrix and the other data)?
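
As a rough illustration of what section 5.3.1 describes, the sketch below models the text matrices in plain C. BT resets the text matrix and the text line matrix to the identity, Tm replaces both with its six operands, and Td offsets the line matrix by (tx, ty). The origin of the next glyph is then the image of (0, 0) under the text matrix followed by the current transformation matrix (CTM). The `Matrix` and `TextState` types and all function names are hypothetical, not part of Quartz, and the example values other than 557.364197 are made up.

```c
/* PDF matrix [a b c d e f]; points are row vectors: [x y 1] * M. */
typedef struct { double a, b, c, d, e, f; } Matrix;

/* Text matrix (tm) and text line matrix (tlm) of one text object. */
typedef struct { Matrix tm, tlm; } TextState;

static const Matrix kIdentity = {1, 0, 0, 1, 0, 0};

/* Returns m * n, i.e. apply m first, then n. */
Matrix matrix_concat(Matrix m, Matrix n) {
    Matrix r;
    r.a = m.a * n.a + m.b * n.c;
    r.b = m.a * n.b + m.b * n.d;
    r.c = m.c * n.a + m.d * n.c;
    r.d = m.c * n.b + m.d * n.d;
    r.e = m.e * n.a + m.f * n.c + n.e;
    r.f = m.e * n.b + m.f * n.d + n.f;
    return r;
}

/* BT: both matrices start as the identity. */
void text_begin(TextState *ts) {
    ts->tm = kIdentity;
    ts->tlm = kIdentity;
}

/* Tm: replace (not concatenate) both matrices with the operands. */
void text_set_matrix(TextState *ts, Matrix m) {
    ts->tm = m;
    ts->tlm = m;
}

/* Td: move to the next line, offset (tx, ty) from the current line start. */
void text_move(TextState *ts, double tx, double ty) {
    Matrix t = {1, 0, 0, 1, tx, ty};
    ts->tlm = matrix_concat(t, ts->tlm);
    ts->tm = ts->tlm;
}

/* Where the text-space origin lands after the text matrix and then the CTM. */
void text_origin(const TextState *ts, Matrix ctm, double *x, double *y) {
    Matrix m = matrix_concat(ts->tm, ctm);
    *x = m.e;
    *y = m.f;
}
```

For example, after `text_set_matrix` with {1, 0, 0, 1, 72, 557.364197} and `text_move(&ts, 10, -12)`, `text_origin` with an identity CTM gives approximately (82, 545.364197). The sketch ignores text rise and the font-size scaling that the full text rendering matrix also includes.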

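Solution

@Koteg: Hi! Did you finally succeed? For Tm, I am able to get all six values, but now I do not see how to get the position of a word within a line...
I have an idea: in the Tj case, just take the spacing between letters (it is the same for every step) and, together with Tm, derive the position of a word.
The TJ case is much more complicated: for each element of the array you have to take the horizontal translation value and subtract it from the Tm matrix, and searching for a word in that array will be harder than with Tj.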
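
The TJ idea above can be sketched in plain C using the glyph displacement formula from the PDF Reference: a string glyph advances the text position by (w0 × Tfs + Tc + Tw) × Th, where Tw applies only to the single-byte space character, and a numeric array element moves it by −(number / 1000) × Tfs × Th. The `TJElement` and `TextParams` types, `tj_offsets`, and the fixed-width lookup `fixed_width_500` are hypothetical; a real parser would read glyph widths from the font dictionary, and this sketch assumes a simple single-byte font with no zero bytes in the strings.

```c
#include <stddef.h>

/* One element of a TJ array: a shown string or a numeric adjustment. */
typedef struct {
    int is_number;       /* 1: numeric adjustment, 0: string */
    double adjustment;   /* thousandths of a text-space unit (if is_number) */
    const char *text;    /* string bytes (if !is_number) */
} TJElement;

/* Text state parameters that affect horizontal advance. */
typedef struct {
    double font_size;    /* Tfs, second operand of Tf */
    double char_spacing; /* Tc */
    double word_spacing; /* Tw */
    double h_scale;      /* Th = Tz / 100 */
} TextParams;

/* Glyph width lookup in thousandths of a text-space unit. */
typedef double (*GlyphWidthFn)(unsigned char code);

/* Placeholder lookup for the example: every glyph is 500 units wide. */
double fixed_width_500(unsigned char code) {
    (void)code;
    return 500.0;
}

/* Stores in offsets[i] the horizontal text-space offset at which element i
   starts, measured from the start of the TJ; returns the total advance. */
double tj_offsets(const TJElement *els, size_t count, const TextParams *p,
                  GlyphWidthFn width, double *offsets) {
    double x = 0.0;
    for (size_t i = 0; i < count; i++) {
        offsets[i] = x;
        if (els[i].is_number) {
            /* A positive number moves the next glyph to the left. */
            x -= els[i].adjustment / 1000.0 * p->font_size * p->h_scale;
        } else {
            for (const char *c = els[i].text; *c != '\0'; c++) {
                double w0 = width((unsigned char)*c) / 1000.0;
                double tx = w0 * p->font_size + p->char_spacing;
                if (*c == ' ') tx += p->word_spacing;
                x += tx * p->h_scale;
            }
        }
    }
    return x;
}
```

With a font size of 10, no extra spacing, and the array ["F", -20, "r"], the offsets come out as 0, 5 and 5.2, and the total advance is 10.2. An offset dx maps to user space as (e + dx × a, f + dx × b) for a text matrix [a b c d e f], which is one way to locate a word once its starting element is known.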

By the way, for others:

    // Walk the TJ array, assuming strings sit at even indices and numeric
    // adjustments at odd indices (the PDF format does not guarantee this).
    size_t count = CGPDFArrayGetCount(array);
    for (size_t n = 0; n < count; n += 2) {
        CGPDFStringRef string;
        success = CGPDFArrayGetString(array, n, &string);
        if (success) {
            // The copied text string is owned by the caller and must be released.
            NSString *data = (NSString *)CGPDFStringCopyTextString(string);
            NSLog(@"array data : %@", data);
            [searcher.currentData appendFormat:@"%@", data];
            [data release];
        }
        CGPDFReal real;
        success = CGPDFArrayGetNumber(array, n + 1, &real);
        if (success) {
            NSLog(@"array real : %f", real);
        }
    }

Thanks


Source: http://outofmemory.cn/web/1084596.html
