Getting specific content from HTML and printing it to a txt file in Perl


I have an HTML file that contains paper IDs and paper titles, and I want to print these IDs and titles in order. Here are the HTML file and a sample of the output I want.

<Meta content="MSHTML 6.00.2900.2802" name=GENERATOR></head><BODY bgcolor=#ffffff leftmargin=0 topmargin=0 rightmargin=0 marginwIDth="0" marginheight="0"><div class=conf><A class=confname href="http://ehw.jpl.nasa.gov/events/ahs2006/">1st Conference on Adaptive HarDWare and Systems (AHS-2006)</A></div><div class=menuoc>OpenConf Conference Management System</div><div class=menu><div class=menuitem><A href="http://www.eng.bahcesehir.edu.tr/openconf/chair/">Chair Home</A></div><div class=menuitem><A href="http://www.eng.bahcesehir.edu.tr/openconf/chair/signout.PHP">Sign Out</A></div><div class=menufiller>Logged in as: ahs2006&nbsp;</div></div><div class=mainbody><BR><P class=header>Assign RevIEws</P><FORM action=/openconf/chair/assign_revIEws.PHP method=post><DL><DT><P><B>Select Paper(s):</B></P><DD><P>[ Paper ID - Title (# RevIEwers) ]</P><DD><SELECT multiple size=10 name=papers[]> <OPTION value=2>&nbsp;&nbsp;2 - Switchable Glass: A possible medium for Evolvable HarDWare (4)</OPTION> <OPTION value=3>&nbsp;&nbsp;3 - An EfficIEnt Multi-Objective Evolutionary Algorithm for Combinational Circuit Design (3)</OPTION> <OPTION value=4>&nbsp;&nbsp;4 - A Background Mismatch Calibration for Capacitive Digital-to-Analog Converters (3)</OPTION> <OPTION value=5>&nbsp;&nbsp;5 - Designing Electronic Circuits by Means of Gene Expression Programming (3)</OPTION> <OPTION value=6>&nbsp;&nbsp;6 - Coherence Based Fault Detection And Error Correction (3)</OPTION> <OPTION value=7>&nbsp;&nbsp;7 - Wormhole Routing with Virtual Channels using Dynamic Rate Control for Network-on... (2)</OPTION> <OPTION value=8>&nbsp;&nbsp;8 - Noise Analysis of Phase Locked Loops (3)</OPTION> <OPTION value=9>&nbsp;&nbsp;9 - Design and Analysis of a Second Order Phase Locked Loops (PLLs) (2)</OPTION> <OPTION value=10>10 - SW-HW Co-design and fault tolerant implementation for the LRID Wireless communication... 
(3)</OPTION> <OPTION value=11>11 - Adaptive PID Controller Using Parameter Optimization Algorithm (2)</OPTION> <OPTION value=12>12 - A Novel Self-organizing HybrID Network Protocol (2)</OPTION> <OPTION value=13>13 - An Adaptive FPGA-Based Mechatronic Control System Supporting Partial Reconfiguration... (3)</OPTION> <OPTION value=14>14 - Generalized disjunction Decomposition for the Evolution of Programmable Logic Array... (3)</OPTION> <OPTION value=15>15 - Woofer-Tweeter Adaptive Optics System (1)</OPTION> <OPTION value=16>16 - A Re-Programmable Platform for Dynamic Burn-in Test of Xilinx VirtexII 3000 FPGA... (3)</OPTION> <OPTION value=17>17 - Using harDWare-based particle swarm method for dynamic optimization of adaptive ... (2)</OPTION> <OPTION value=18>18 - HarDWare/software coevolution of genome programs and cellular processors (2)</OPTION> <OPTION value=19>19 - Systolic Array Based Adaptive Beamformer Modelling in SystemC Environment (2)</OPTION> <OPTION value=20>20 - A Reconfigurable HarDWare Design Using FPGA (2)</OPTION> <OPTION value=21>21 - An FPGA Implemented Processor Architecture with Adaptive Resolution (2)</OPTION> <OPTION value=22>22 - Evolving HarDWare with Self-reconfigurable connectivity in Xilinx FPGAs (2)</OPTION> <OPTION value=23>23 - Particle Swarm Optimization with discrete Recombination: An Online Optimizer for... (2)</OPTION> <OPTION value=24>24 - Towards the Integration of Drive Control Loop Electronics of the jpl/Boeing gyroscope... (2)</OPTION> <OPTION value=25>25 - An Incremental Evolutionary Strategy for the Design of FIR Filters targeting Real... (2)</OPTION> <OPTION value=26>26 - Adaptive Micro-Antenna on Silicon Substrate (3)</OPTION> <OPTION value=27>27 - Towards Fluent Sensor Networks: A scalable and Robust Self-Deployment Approach (3)</OPTION> <OPTION value=28>28 - Comparison of Fuzzy-C Means,Hard C-Means and Differential Evolution Algorithm in... 
(2)</OPTION> <OPTION value=29>29 - Evolutionary Design of Digital Circuits: Where Are Current limits? (2)</OPTION> <OPTION value=30>30 - GEZGİN &amp; GEZGİN-2: Adaptive Real-Time Image Processing Subsystems for Earth Observing... (3)</OPTION> <OPTION value=31>31 - A Multi-objective genetic Algorithm for On-chip Real-time Adaptation of a Multi-... (2)</OPTION> <OPTION value=32>32 - An EfficIEnt Technique for Preventing Single Event disruptions in Synchronous and... (1)</OPTION> <OPTION value=33>33 - Architecture of a Dynamically Reconfigurable NoC for Adaptive Reconfigurable MPSoC (0)</OPTION> <OPTION value=34>34 - Embedded Reconfigurable Array Fabrics for EfficIEnt Implementation of Image Compression... (1)</OPTION> <OPTION value=35>35 - Routing in Wireless Sensor Networks Using Ant Colony Optimization (2)</OPTION> <OPTION value=36>36 - Simulation of Multifunctional Combinational Modules Controlled by Vdd (3)</OPTION> <OPTION value=37>37 - Reconfigurable Parallel Computing Architecture for On-Board Data Processing (2)</OPTION> <OPTION value=38>38 - On comparison of Variable Length Representations by Means of Unconstrained Evolution... (3)</OPTION> <OPTION value=39>39 - VLSI Implementation of LMS Equaliser with Adaptive Length Selection for Wireless... (0)</OPTION> <OPTION value=41>41 - A scalable Reconfigurable Analog to Digital Converter Architecture targeting Low... (0)</OPTION> <OPTION value=42>42 - linear Prediction with Differential Evolution Algorithm (2)</OPTION> <OPTION value=43>43 - genetic Algorithm based Engine for Domain-Specific Reconfigurable Arrays (0)</OPTION> <OPTION value=44>44 - Non-Uniform Search Domain based genetic Algorithm for the Synthesis and Continuous... 
(2)</OPTION> <OPTION value=45>45 - Design Concepts for a Dynamically Reconfigurable Wireless Sensor Node (2)</OPTION> <OPTION value=46>46 - On-Board Partial Run-Time Reconfiguration for Pico-Satellite Constellations (2)</OPTION> <OPTION value=47>47 - A Framework of Evolvable and Reconfigurable Sensor Networks for Aerospace –based... (0)</OPTION> <OPTION value=48>48 - Analytical Modelling of Power Attenuation under Parameter Fluctuations with Applications... (2)</OPTION> <OPTION value=49>49 - A New State Space Representation Method for Adaptive Log Domain Systems (2)</OPTION> <OPTION value=50>50 - Swarm Based Incremental Learning for Combinational Circuit Evolution (2)</OPTION> <OPTION value=51>51 - Gene Regulation Mechanisms introduced in the E valuation Criteria for a HarDWare... (2)</OPTION> <OPTION value=52>52 - automatic HybrID genetic Algorithm Based Printed Circuit Board inspection (2)</OPTION> <OPTION value=53>53 - Population based FPGA solution to Mastermind game (2)</OPTION> <OPTION value=54>54 - A Large Scale Adaptable MultiplIEr for Cryptographic Applications (2)</OPTION> <OPTION value=55>55 - A Self-Tuning Analog Proportional-Integral-Derivative (PID) Controller (2)</OPTION> <OPTION value=56>56 - Self-Configurable Neural Network Processor for Adaptable FIR Filters (3)</OPTION> <OPTION value=57>57 - On-Chip Evolution Using a Soft Processor Core ApplIEd to Image Recognition (2)</OPTION> <OPTION value=58>58 - A Novel Adaptive Viterbi Algorithm and Its Implementation (2)</OPTION> <OPTION value=59>59 - An EfficIEnt HarDWare Architecture for H.264 Adaptive Deblocking Filter (2)</OPTION> <OPTION value=60>60 - A Low-Complexity Self-Calibrating Adaptive Quadrature Receiver (2)</OPTION> <OPTION value=61>61 - A Honeycomb Development Architecture for Robust Fault-Tolerant Design (2)</OPTION> <OPTION value=62>62 - Sate-Space based Analytical Modelling for Real-Time Fault Recovery and Self-Repair... 
(2)</OPTION> <OPTION value=63>63 - StrategIEs to On- line Failure Recovery in Self- Adaptive Systems based on Dynamic... (2)</OPTION> <OPTION value=64>64 - A Platform for Digital Intrinsic HarDWare Evolution (2)</OPTION> <OPTION value=65>65 - Face Recognition Using a Gabor Filter Bank Approach (2)</OPTION> <OPTION value=66>66 - Protecting Fingerprint Data using Watermarking (2)</OPTION> <OPTION value=67>67 - DeBUG Support for System-on-Chips,ConsIDerations for Reconfigurable and HybrID ... (2)</OPTION> <OPTION value=68>68 - Novel Techniques for Ensuring Secure Communications for distributed Low Power Devices (2)</OPTION> <OPTION value=69>69 - A Modular Framework for the Evolution of Circuits on Configurable Transistor Array... (2)</OPTION> <OPTION value=70>70 - Power Driven Reconfigurable Complex Continuous Wavelet transform Processor (2)</OPTION> <OPTION value=71>71 - A Tuning Technique for Switched-Capacitor Circuits (0)</OPTION> <OPTION value=72>72 - An automatic Technique to Synthesize System-on-a-Chip to Adapt to Changing Environments (2)</OPTION> <OPTION value=73>73 - Picosatellite Constellations for Remote Sensing in LEO (2)</OPTION> <OPTION value=74>74 - Evolvable HarDWare ApplIEd to Nanotechnology (1)</OPTION> <OPTION value=75>75 - Gate-level Morphogenetic Evolvable HarDWare for Scalability and Adaptation on FPGAs (2)</OPTION> <OPTION value=76>76 - Synthesis of MOS Analog Circuits by Evolutionary Methods (2)</OPTION> <OPTION value=77>77 - An Adaptive HDL Design Methodology for Hard IP and Soft IP Co-Protection (2)</OPTION> <OPTION value=78>78 - FSM and HSM watermarking: A Tutorial (3)</OPTION> <OPTION value=79>79 - Physics-based Model applIEd to Evolvable HarDWare (2)</OPTION> <OPTION value=80>80 - A Generic On-Chip DeBUGger for Wireless Sensor Networks (goCDWSN) (2)</OPTION> <OPTION value=81>81 - The Gannet Service-based SoC: A Service-level Reconfigurable Architecture (2)</OPTION> <OPTION value=82>82 - A FPGA simulation using asexual genetic algorithms 
for integrated self-repair (2)</OPTION> <OPTION value=83>83 - USING THE “CELOXICA” FPGA BOARD AND THE MACHINE LEARNING ALGORITHM “LEM3‮.. (2)</OPTION> <OPTION value=84>84 - A Comparing Design of Satellite Attitude Control System Based on Reaction Wheel (0)</OPTION></SELECT> <P></P><DT><P><B>Select RevIEwer(s):</B></P><DD><P><SPAN class=note>Tip: Click on ID,name,or RevIEws on the line below to re-sort this List (page will reload)</SPAN></P><DD><P>[ RevIEwer ID - <A href="http://www.eng.bahcesehir.edu.tr/openconf/chair/assign_revIEws.PHP?s=name">name</A> (# <A href="http://www.eng.bahcesehir.edu.tr/openconf/chair/assign_revIEws.PHP?s=revIEws">RevIEws</A>) ]</P><DD><SELECT multiple size=10 name=revIEwers[]> <OPTION value=4>&nbsp;&nbsp;4 - [PC] Nizamettin Aydin (0)</OPTION> <OPTION value=5>&nbsp;&nbsp;5 - [PC] Yalcin Cekic (0)</OPTION> <OPTION value=6>&nbsp;&nbsp;6 - [PC] DIDIEr Keymeulen (1)</OPTION> <OPTION value=7>&nbsp;&nbsp;7 - [PC] Emin Anarim (0)</OPTION> <OPTION value=8>&nbsp;&nbsp;8 - [PC] Murat Askar (0)</OPTION> <OPTION value=9>&nbsp;&nbsp;9 - [PC] Peter Athanas (3)</OPTION> <OPTION value=10>10 - [PC] Juergen Becker (3)</OPTION> <OPTION value=11>11 - [PC] Neil Bergmann (3)</OPTION> <OPTION value=12>12 - [PC] John Choma (2)</OPTION> <OPTION value=13>13 - [PC] Carlos A. Coello Coello (3)</OPTION> <OPTION value=14>14 - [PC] Sorin Cristoloveanu (1)</OPTION> <OPTION value=15>15 - [PC] Antonio Di Nola (1)</OPTION> <OPTION value=16>16 - [PC] Wai-Chi Fang (3)</OPTION> <OPTION value=17>17 - [PC] F. 
Joel Ferguson (0)</OPTION> <OPTION value=18>18 - [PC] Dario Floreano (1)</OPTION> <OPTION value=19>19 - [PC] Manfred Glesner (3)</OPTION> <OPTION value=20>20 - [PC] Maya Gokhale (3)</OPTION> <OPTION value=21>21 - [PC] Pauline Haddow (3)</OPTION> <OPTION value=22>22 - [PC] Ilker Hamzaoglu (1)</OPTION> <OPTION value=23>23 - [PC] Tetsuya Higuchi (2)</OPTION> <OPTION value=24>24 - [PC] DanIEl Howard (3)</OPTION> <OPTION value=25>25 - [PC] lishan Kang (3)</OPTION> <OPTION value=26>26 - [PC] Haluk Konuk (3)</OPTION> <OPTION value=27>27 - [PC] John Koza (3)</OPTION> <OPTION value=28>28 - [PC] Jason Lahn (1)</OPTION> <OPTION value=29>29 - [PC] Bernard Manderick (3)</OPTION> <OPTION value=30>30 - [PC] Trent McConaghy (2)</OPTION> <OPTION value=31>31 - [PC] Bob McKay (1)</OPTION> <OPTION value=32>32 - [PC] Brian Meadows (3)</OPTION> <OPTION value=33>33 - [PC] Karlheinz MeIEr (2)</OPTION> <OPTION value=34>34 - [PC] Mohammad Mojarradi (2)</OPTION> <OPTION value=35>35 - [PC] J. M. Moreno (2)</OPTION> <OPTION value=36>36 - [PC] Masahiro Murakawa (3)</OPTION> <OPTION value=37>37 - [PC] Alex Orailoglu (0)</OPTION> <OPTION value=38>38 - [PC] Christos Papachristou (3)</OPTION> <OPTION value=39>39 - [PC] Marek A. 
Perkowski (1)</OPTION> <OPTION value=40>40 - [PC] Viktor Prasanna (3)</OPTION> <OPTION value=41>41 - [PC] Justinian Rosca (3)</OPTION> <OPTION value=42>42 - [PC] Eduardo Sanchez (3)</OPTION> <OPTION value=43>43 - [PC] Radu Secareanu (2)</OPTION> <OPTION value=44>44 - [PC] Sakir Sezer (3)</OPTION> <OPTION value=45>45 - [PC] Hajime Shibata (3)</OPTION> <OPTION value=46>46 - [PC] Horia-Nicolai Teodorescu (3)</OPTION> <OPTION value=47>47 - [PC] Jim Torresen (3)</OPTION> <OPTION value=48>48 - [PC] Andy Tyrrell (3)</OPTION> <OPTION value=49>49 - [PC] Sezer Goren Ugurdag (0)</OPTION> <OPTION value=50>50 - [PC] Ranga Vemuri (3)</OPTION> <OPTION value=51>51 - [PC] Tanya Vladimirova (3)</OPTION> <OPTION value=52>52 - [PC] svetlana Yanushkevich (3)</OPTION> <OPTION value=53>53 - [PC] Xin Yao (3)</OPTION> <OPTION value=54>54 - [PC] Nukhet Yetis (0)</OPTION> <OPTION value=55>55 - [PC] Sanyou Zeng (3)</OPTION> <OPTION value=56>56 - [PC] Nazeeh aranki (3)</OPTION> <OPTION value=57>57 - [PC] Hugo deGaris (3)</OPTION> <OPTION value=58>58 - [PC] Erik Dirkx (3)</OPTION> <OPTION value=59>59 - [PC] Ahmet Erdogan (2)</OPTION> <OPTION value=60>60 - [PC] Sharon Graves (2)</OPTION> <OPTION value=61>61 - [PC] DavID Gwaltney (2)</OPTION> <OPTION value=62>62 - [PC] ALister Hamilton (2)</OPTION> <OPTION value=63>63 - [PC] Alan Hunsberger (3)</OPTION> <OPTION value=64>64 - [PC] Srinivas Katkoori (2)</OPTION> <OPTION value=65>65 - [PC] Semion Kizhner (3)</OPTION> <OPTION value=66>66 - [PC] Gregory Larchev (2)</OPTION> <OPTION value=67>67 - [PC] Derek linden (1)</OPTION> <OPTION value=68>68 - [PC] Klaus McDonald-MaIEr (1)</OPTION> <OPTION value=69>69 - [PC] Julian Miller (1)</OPTION> <OPTION value=70>70 - [PC] Lukas Sekanina (3)</OPTION> <OPTION value=71>71 - [PC] Raphael Some (3)</OPTION> <OPTION value=72>72 - [PC] Adrian Stoica (3)</OPTION> <OPTION value=73>73 - [PC] Gianluca Tempesti (1)</OPTION> <OPTION value=74>74 - [PC] Anil Thakoor (2)</OPTION> <OPTION value=75>75 - [PC] Gunnar tufte 
(3)</OPTION> <OPTION value=76>76 - [PC] Tina Yu (2)</OPTION> <OPTION value=77>77 - [PC] Rolf Drechsler (3)</OPTION> <OPTION value=78>78 - [PC] Rajesh galivanche (3)</OPTION> <OPTION value=79>79 - [PC] Paul Hasler (2)</OPTION> <OPTION value=80>80 - [PC] Kalmanje S Krishnakumar (0)</OPTION> <OPTION value=81>81 - [PC] Osman Nuri Ucan (0)</OPTION> <OPTION value=82>82 - [PC] H Fatih Ugurdag (0)</OPTION></SELECT> <P></P><DT><input type=submit value="Assign RevIEws" name=submit> </DT></DL></FORM><P></P></div><!-- mainbody --><P>&nbsp;</P><div class=footerborder></div><!-- DO NOT REMOVE THIS copYRIGHT NOTICE --><P><div class=powered>Powered by <A href="http://www.openconf.org/" target=_blank>OpenConf</A><!--1.22--><BR>copyright ©2002-2005 <A href="http://www.zakongroup.com/technology/" target=_blank>Zakon Group LLC</A></div><!-- DO NOT REMOVE THIS copYRIGHT NOTICE --></BODY></HTML>

The txt file I want to create with Perl is:

2 - Switchable Glass: A possible medium for Evolvable HarDWare (4)
3 - An EfficIEnt Multi-Objective Evolutionary Algorithm for Combinational Circuit Design (3)
4 - A Background Mismatch Calibration for Capacitive Digital-to-Analog Converters (3)
5 - Designing Electronic Circuits by Means of Gene Expression Programming (3)
6 - Coherence Based Fault Detection And Error Correction (3)
7 - Wormhole Routing with Virtual Channels using Dynamic Rate Control for Network-on... (2)
8 - Noise Analysis of Phase Locked Loops (3)
9 - Design and Analysis of a Second Order Phase Locked Loops (PLLs) (2)
10 - SW-HW Co-design and fault tolerant implementation for the LRID Wireless communication... (3)
11 - Adaptive PID Controller Using Parameter Optimization Algorithm (2)
12 - A Novel Self-organizing HybrID Network Protocol (2)
13 - An Adaptive FPGA-Based Mechatronic Control System Supporting Partial Reconfiguration... (3)
14 - Generalized disjunction Decomposition for the Evolution of Programmable Logic Array... (3)
15 - Woofer-Tweeter Adaptive Optics System (1)
16 - A Re-Programmable Platform for Dynamic Burn-in Test of Xilinx VirtexII 3000 FPGA... (3)
17 - Using harDWare-based particle swarm method for dynamic optimization of adaptive ... (2)
18 - HarDWare/software coevolution of genome programs and cellular processors (2)
19 - Systolic Array Based Adaptive Beamformer Modelling in SystemC Environment (2)
20 - A Reconfigurable HarDWare Design Using FPGA (2)
21 - An FPGA Implemented Processor Architecture with Adaptive Resolution (2)
22 - Evolving HarDWare with Self-reconfigurable connectivity in Xilinx FPGAs (2)
23 - Particle Swarm Optimization with discrete Recombination: An Online Optimizer for... (2)
24 - Towards the Integration of Drive Control Loop Electronics of the jpl/Boeing gyroscope... (2)
25 - An Incremental Evolutionary Strategy for the Design of FIR Filters targeting Real... (2)
26 - Adaptive Micro-Antenna on Silicon Substrate (3)
27 - Towards Fluent Sensor Networks: A scalable and Robust Self-Deployment Approach (3)
28 - Comparison of Fuzzy-C Means,Hard C-Means and Differential Evolution Algorithm in... (2)
29 - Evolutionary Design of Digital Circuits: Where Are Current limits? (2)

and so on.

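So far I have written the code below, but I can't work out why it doesn't work. It prints nothing to the screen or to the text file. Any help would be appreciated. Thanks!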

use strict;
use warnings;
use HTML::TreeBuilder;

my $tree = HTML::TreeBuilder->new_from_content(
    do { local $/; <DATA> }
);

open(my $fh,'>','outputs.txt');
my $i = 2;
for ( $tree->look_down( 'name' => 'papers' ) ) {
    my $papers = $_->look_down( 'OPTION value' => 'i' )->as_trimmed_text;
    # my $comment  = $_->look_down( 'class' => 'content' )->as_trimmed_text;
    # my $name     = $_->look_down( '_tag'  => 'h3' )->as_trimmed_text;
    # $name =~ s/^Re:\s*//;
    # $name =~ s/\s*$location\s*$//;
    print "Paper: $papers\n";
    print $fh "Paper: $papers\n";
    $i++;
}

Solution: You are overcomplicating this with look_down, which matches elements by their attributes. Simply find() the <option> elements.

use feature 'say';    # say() needs this (or "use v5.10;")

foreach my $papers ( $tree->look_down( 'name' => 'papers[]' ) ) {
    foreach my $option ( $papers->find( 'option' ) ) {
        say $option->as_trimmed_text;
    }
}

Also note that the name attribute of the <select> is papers[], not papers. The [] is part of the name.
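Putting the pieces together, a complete script might look like the sketch below. It is only an illustration under a few assumptions: the helper name paper_lines, the script name extract_papers.pl, and the input file name page.html are made up for this example, the input is assumed to be UTF-8, and the extra_chars option of as_trimmed_text (used here to strip the leading &nbsp; padding) is assumed to be available in the installed HTML::Element.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical helper: return the text of every <option> that sits
# inside <select name="papers[]">, one list element per paper.
sub paper_lines {
    my ($html) = @_;
    my $tree = HTML::TreeBuilder->new_from_content($html);
    my @lines;

    # The [] is part of the attribute value, so it is matched literally.
    for my $select ( $tree->look_down( _tag => 'select', name => 'papers[]' ) ) {
        for my $option ( $select->find('option') ) {
            # &nbsp; is decoded to \xA0, which as_trimmed_text does not
            # strip unless it is listed in extra_chars.
            push @lines, $option->as_trimmed_text( extra_chars => '\xA0' );
        }
    }

    $tree->delete;    # release the parse tree
    return @lines;
}

# Command-line use (file names are placeholders):
#   perl extract_papers.pl page.html outputs.txt
if ( @ARGV == 2 ) {
    my ( $in, $out ) = @ARGV;

    # Assumption: the saved page is UTF-8; adjust the layer if it is not.
    open my $src, '<:encoding(UTF-8)', $in or die "Cannot read $in: $!";
    my $html = do { local $/; <$src> };
    close $src;

    open my $fh, '>:encoding(UTF-8)', $out or die "Cannot write $out: $!";
    binmode STDOUT, ':encoding(UTF-8)';

    for my $line ( paper_lines($html) ) {
        print "$line\n";          # to the screen
        print {$fh} "$line\n";    # to the text file
    }
    close $fh or die "Cannot close $out: $!";
}
```

Restricting the search to the papers[] select keeps the reviewer list, which uses the same "ID - name (count)" layout in a second <select>, out of the output. Checking the return value of open also makes a wrong path fail loudly, whereas the unchecked open in the question would fail silently.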


Original article: http://outofmemory.cn/web/1079600.html
