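I have just read section 5.3.1, "Text-Positioning Operators", of the PDF Reference, version 1.7, and I am a bit confused.
I wrote some code to get the transformation matrix and the initial text position.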
    // Marked-content operators
    CGPDFOperatorTableSetCallback(table, "MP", &op_MP);   // Define marked-content point
    CGPDFOperatorTableSetCallback(table, "DP", &op_DP);   // Define marked-content point with property list
    CGPDFOperatorTableSetCallback(table, "BMC", &op_BMC); // Begin marked-content sequence
    CGPDFOperatorTableSetCallback(table, "BDC", &op_BDC); // Begin marked-content sequence with property list
    CGPDFOperatorTableSetCallback(table, "EMC", &op_EMC); // End marked-content sequence

    // Text state operators
    CGPDFOperatorTableSetCallback(table, "Tc", &op_Tc);
    CGPDFOperatorTableSetCallback(table, "Tw", &op_Tw);
    CGPDFOperatorTableSetCallback(table, "Tz", &op_Tz);
    CGPDFOperatorTableSetCallback(table, "TL", &op_TL);
    CGPDFOperatorTableSetCallback(table, "Tf", &op_Tf);
    CGPDFOperatorTableSetCallback(table, "Tr", &op_Tr);
    CGPDFOperatorTableSetCallback(table, "Ts", &op_Ts);

    // Text showing operators
    CGPDFOperatorTableSetCallback(table, "TJ", &op_TJ);
    CGPDFOperatorTableSetCallback(table, "Tj", &op_Tj);
    CGPDFOperatorTableSetCallback(table, "'", &op_apostrof);
    CGPDFOperatorTableSetCallback(table, "\"", &op_double_apostrof);

    // Text positioning operators
    CGPDFOperatorTableSetCallback(table, "Td", &op_Td);
    CGPDFOperatorTableSetCallback(table, "TD", &op_TD);
    CGPDFOperatorTableSetCallback(table, "Tm", &op_Tm);
    CGPDFOperatorTableSetCallback(table, "T*", &op_T);

    // Text object operators
    CGPDFOperatorTableSetCallback(table, "BT", &op_BT);   // Begin text object
    CGPDFOperatorTableSetCallback(table, "ET", &op_ET);   // End text object
Here is the output after launching the application:
    2010-09-02 15:09:23.041 testSearch[8251:207] op_BT begin
    Integer value: 0
    2010-09-02 15:09:23.043 testSearch[8251:207] op_BT end
    2010-09-02 15:09:23.043 testSearch[8251:207] op_Tf begin
    Integer value: 1
    2010-09-02 15:09:23.044 testSearch[8251:207] op_Tf end
    2010-09-02 15:09:23.044 testSearch[8251:207] op_Tm begin
    float value: 557.364197
    2010-09-02 15:09:23.045 testSearch[8251:207] op_Tm end
    2010-09-02 15:09:23.045 testSearch[8251:207] op_TJ begin
    2010-09-02 15:09:23.046 testSearch[8251:207] Array string value [0]: F
    2010-09-02 15:09:23.046 testSearch[8251:207] Array integer value [1]: 94985208
    2010-09-02 15:09:23.047 testSearch[8251:207] Array string value [2]: r
    2010-09-02 15:09:23.047 testSearch[8251:207] Array integer value [3]: 94985208
    2010-09-02 15:09:23.048 testSearch[8251:207] Array string value [4]: o
    2010-09-02 15:09:23.048 testSearch[8251:207] Array integer value [5]: 94985208
    2010-09-02 15:09:23.049 testSearch[8251:207] Array string value [6]: m s
    2010-09-02 15:09:23.049 testSearch[8251:207] Array integer value [7]: 94985208
    2010-09-02 15:09:23.049 testSearch[8251:207] Array string value [8]: a
    2010-09-02 15:09:23.050 testSearch[8251:207] Array integer value [9]: 94985208
    2010-09-02 15:09:23.050 testSearch[8251:207] Array string value [10]: m
    2010-09-02 15:09:23.051 testSearch[8251:207] Array integer value [11]: 94985208
    2010-09-02 15:09:23.051 testSearch[8251:207] Array string value [12]: p
    2010-09-02 15:09:23.052 testSearch[8251:207] Array integer value [13]: 94985208
    2010-09-02 15:09:23.053 testSearch[8251:207] Array string value [14]: l
    2010-09-02 15:09:23.054 testSearch[8251:207] Array integer value [15]: 94985208
    2010-09-02 15:09:23.055 testSearch[8251:207] Array string value [16]: e t
    2010-09-02 15:09:23.055 testSearch[8251:207] Array integer value [17]: 94985208
    2010-09-02 15:09:23.057 testSearch[8251:207] Array string value [18]: o r
    2010-09-02 15:09:23.057 testSearch[8251:207] Array integer value [19]: 94985208
    2010-09-02 15:09:23.058 testSearch[8251:207] Array string value [20]: e
    2010-09-02 15:09:23.058 testSearch[8251:207] Array integer value [21]: 94985208
    2010-09-02 15:09:23.059 testSearch[8251:207] Array string value [22]: s
    2010-09-02 15:09:23.059 testSearch[8251:207] Array integer value [23]: 94985208
    2010-09-02 15:09:23.060 testSearch[8251:207] Array string value [24]: u
    2010-09-02 15:09:23.061 testSearch[8251:207] Array integer value [25]: 94985208
    2010-09-02 15:09:23.061 testSearch[8251:207] Array string value [26]: l
    2010-09-02 15:09:23.062 testSearch[8251:207] Array integer value [27]: 94985208
    2010-09-02 15:09:23.062 testSearch[8251:207] Array string value [28]: t
    2010-09-02 15:09:23.063 testSearch[8251:207] op_TJ end
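If anyone is familiar with the text matrix and the text-positioning operators, an explanation of how all these operators work would be great.
How do I compute the position of the text (or of a glyph?) using Tm (the transformation matrix and the other data)?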
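Solution

@Koteg: Hi! Did you finally get it working? For Tm, I am able to get all six values, but now I do not see how to get the position of a word within a line. I have an idea: in the Tj case, just take the spacing between letters (it is the same for every glyph) together with Tm, and derive the position of a word from those.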
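For reference, this is how the six Tm operands and the per-glyph advance fit together, summarized from section 5.3 of the PDF Reference 1.7 for horizontal writing mode. Here $w_0$ is the glyph width in text-space units, $T_j$ is a TJ number (zero for plain Tj), $T_w$ applies only to the single-byte character code 32, and $T_h$ is the Tz value divided by 100. The symbols follow the spec; treat it as a reading aid, not a complete treatment.

```latex
% "a b c d e f Tm" sets the text matrix and the text line matrix:
T_m = T_{lm} =
\begin{bmatrix} a & b & 0 \\ c & d & 0 \\ e & f & 1 \end{bmatrix}

% Text rendering matrix (text space to device space):
T_{rm} =
\begin{bmatrix} T_{fs}\,T_h & 0 & 0 \\ 0 & T_{fs} & 0 \\ 0 & T_{rise} & 1 \end{bmatrix}
\times T_m \times CTM

% Horizontal displacement after showing one glyph, and the resulting update of T_m:
t_x = \left( \left( w_0 - \frac{T_j}{1000} \right) T_{fs} + T_c + T_w \right) T_h
\qquad
T_m \leftarrow
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ t_x & 0 & 1 \end{bmatrix}
\times T_m
```

With zero text rise, the origin of the next glyph is the row vector $[0\ 0\ 1]$ multiplied by $T_m \times CTM$, which is simply $(e, f)$ transformed by the CTM.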
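In the TJ case it is considerably more complicated: for each element of the array you have to take the horizontal translation value and subtract it from the Tm matrix, and searching for a word in that array will be harder than with Tj.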
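The TJ bookkeeping described above could be sketched in plain C roughly as follows. This is only an illustration under stated assumptions: the types `TextMatrix`, `TextState`, `TJElement`, `Point` and the functions `advance` and `tj_glyph_origins` are names made up here, not part of Quartz, and `glyph_width` is a placeholder returning a fixed width, whereas real code would look widths up in the font's Widths array. Text rise, vertical writing mode and the CTM are ignored, so the positions are in the space defined by Tm.

```c
#include <assert.h>
#include <stddef.h>

/* Text matrix as set by "a b c d e f Tm" (row-vector convention). */
typedef struct { double a, b, c, d, e, f; } TextMatrix;

/* Text-state values that influence the horizontal advance. */
typedef struct {
    double font_size;    /* Tfs, from Tf */
    double char_spacing; /* Tc */
    double word_spacing; /* Tw, applied only to single-byte code 32 */
    double h_scale;      /* Th = Tz / 100 */
} TextState;

/* One TJ array element: a glyph string, or a number when text is NULL. */
typedef struct {
    const char *text;
    double adjust;       /* thousandths of a text-space unit */
} TJElement;

typedef struct { double x, y; } Point;

/* HYPOTHETICAL width lookup: every glyph is 500/1000 of a unit wide. */
static double glyph_width(unsigned char code) { (void)code; return 0.5; }

/* Tm = [1 0 0; 0 1 0; tx 0 1] x Tm, i.e. move along the text x axis. */
static void advance(TextMatrix *tm, double tx) {
    tm->e += tx * tm->a;
    tm->f += tx * tm->b;
}

/* Walks a TJ array, records the origin of each glyph (at most max_out
 * entries are stored) and returns the total number of glyphs seen.
 * tm is updated in place, as the operator would do. */
static size_t tj_glyph_origins(TextMatrix *tm, const TextState *ts,
                               const TJElement *elems, size_t n_elems,
                               Point *out, size_t max_out)
{
    assert(tm && ts && elems && out);
    size_t count = 0;
    for (size_t i = 0; i < n_elems; i++) {
        if (elems[i].text == NULL) {
            /* A number is subtracted from the current position. */
            advance(tm, -elems[i].adjust / 1000.0 * ts->font_size * ts->h_scale);
            continue;
        }
        for (const unsigned char *p = (const unsigned char *)elems[i].text; *p; p++) {
            if (count < max_out) {
                out[count].x = tm->e;
                out[count].y = tm->f;
            }
            count++;
            double tw = (*p == ' ') ? ts->word_spacing : 0.0;
            advance(tm, (glyph_width(*p) * ts->font_size + ts->char_spacing + tw) * ts->h_scale);
        }
    }
    return count;
}
```

For example, with Tm = [1 0 0 1 100 700], a font size of 10 and the array [(AB) -200 (C)], this places A at x = 100, B at x = 105 and C at x = 112. Recording an origin per glyph, rather than per array element, is what makes a later word search possible: a word's position is the origin of its first glyph.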
By the way, for everyone else:
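    // Assumes `array` is the CGPDFArrayRef popped in the TJ callback, and that
    // `success` (bool) and `searcher` are declared by the surrounding code.
    size_t count = CGPDFArrayGetCount(array);
    for (size_t n = 0; n < count; n += 2) {
        // Even index: expected to be a string of glyphs.
        CGPDFStringRef string;
        success = CGPDFArrayGetString(array, n, &string);
        if (success) {
            NSString *data = (NSString *)CGPDFStringCopyTextString(string);
            NSLog(@"array data : %@", data);
            [searcher.currentData appendFormat:@"%@", data];
            [data release];
        }

        // Odd index: expected to be a position adjustment.
        if (n + 1 < count) {
            CGPDFReal real;
            success = CGPDFArrayGetNumber(array, n + 1, &real);
            if (success) {
                NSLog(@"array real : %f", real);
            }
        }
    }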
Thanks.