.\" Automatically generated by Pod::Man version 1.02 .\" Sun Sep 24 20:36:03 2000 .\" .\" Standard preamble: .\" ====================================================================== .de Sh \" Subsection heading .br .if t .Sp .ne 5 .PP \fB\\$1\fR .PP .. .de Sp \" Vertical space (when we can't use .PP) .if t .sp .5v .if n .sp .. .de Ip \" List item .br .ie \\n(.$>=3 .ne \\$3 .el .ne 3 .IP "\\$1" \\$2 .. .de Vb \" Begin verbatim text .ft CW .nf .ne \\$1 .. .de Ve \" End verbatim text .ft R .fi .. .\" Set up some character translations and predefined strings. \*(-- will .\" give an unbreakable dash, \*(PI will give pi, \*(L" will give a left .\" double quote, and \*(R" will give a right double quote. | will give a .\" real vertical bar. \*(C+ will give a nicer C++. Capital omega is used .\" to do unbreakable dashes and therefore won't be available. \*(C` and .\" \*(C' expand to `' in nroff, nothing in troff, for use with C<> .tr \(*W-|\(bv\*(Tr .ds C+ C\v'-.1v'\h'-1p'\s-2+\h'-1p'+\s0\v'.1v'\h'-1p' .ie n \{\ . ds -- \(*W- . ds PI pi . if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch . if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch . ds L" "" . ds R" "" . ds C` ` . ds C' ' 'br\} .el\{\ . ds -- \|\(em\| . ds PI \(*p . ds L" `` . ds R" '' 'br\} .\" .\" If the F register is turned on, we'll generate index entries on stderr .\" for titles (.TH), headers (.SH), subsections (.Sh), items (.Ip), and .\" index entries marked with X<> in POD. Of course, you'll have to process .\" the output yourself in some meaningful fashion. .if \nF \{\ . de IX . tm Index:\\$1\t\\n%\t"\\$2" . . . nr % 0 . rr F .\} .\" .\" For nroff, turn off justification. Always turn off hyphenation; it .\" makes way too many mistakes in technical documents. .hy 0 .if n .na .\" .\" Accent mark definitions (@(#)ms.acc 1.5 88/02/08 SMI; from UCB 4.2). .\" Fear. Run. Save yourself. No user-serviceable parts. .bd B 3 . 
\" fudge factors for nroff and troff .if n \{\ . ds #H 0 . ds #V .8m . ds #F .3m . ds #[ \f1 . ds #] \fP .\} .if t \{\ . ds #H ((1u-(\\\\n(.fu%2u))*.13m) . ds #V .6m . ds #F 0 . ds #[ \& . ds #] \& .\} . \" simple accents for nroff and troff .if n \{\ . ds ' \& . ds ` \& . ds ^ \& . ds , \& . ds ~ ~ . ds / .\} .if t \{\ . ds ' \\k:\h'-(\\n(.wu*8/10-\*(#H)'\'\h"|\\n:u" . ds ` \\k:\h'-(\\n(.wu*8/10-\*(#H)'\`\h'|\\n:u' . ds ^ \\k:\h'-(\\n(.wu*10/11-\*(#H)'^\h'|\\n:u' . ds , \\k:\h'-(\\n(.wu*8/10)',\h'|\\n:u' . ds ~ \\k:\h'-(\\n(.wu-\*(#H-.1m)'~\h'|\\n:u' . ds / \\k:\h'-(\\n(.wu*8/10-\*(#H)'\z\(sl\h'|\\n:u' .\} . \" troff and (daisy-wheel) nroff accents .ds : \\k:\h'-(\\n(.wu*8/10-\*(#H+.1m+\*(#F)'\v'-\*(#V'\z.\h'.2m+\*(#F'.\h'|\\n:u'\v'\*(#V' .ds 8 \h'\*(#H'\(*b\h'-\*(#H' .ds o \\k:\h'-(\\n(.wu+\w'\(de'u-\*(#H)/2u'\v'-.3n'\*(#[\z\(de\v'.3n'\h'|\\n:u'\*(#] .ds d- \h'\*(#H'\(pd\h'-\w'~'u'\v'-.25m'\f2\(hy\fP\v'.25m'\h'-\*(#H' .ds D- D\\k:\h'-\w'D'u'\v'-.11m'\z\(hy\v'.11m'\h'|\\n:u' .ds th \*(#[\v'.3m'\s+1I\s-1\v'-.3m'\h'-(\w'I'u*2/3)'\s-1o\s+1\*(#] .ds Th \*(#[\s+2I\s-2\h'-\w'I'u*3/5'\v'-.3m'o\v'.3m'\*(#] .ds ae a\h'-(\w'a'u*4/10)'e .ds Ae A\h'-(\w'A'u*4/10)'E . \" corrections for vroff .if v .ds ~ \\k:\h'-(\\n(.wu*9/10-\*(#H)'\s-2\u~\d\s+2\h'|\\n:u' .if v .ds ^ \\k:\h'-(\\n(.wu*10/11-\*(#H)'\v'-.4m'^\v'.4m'\h'|\\n:u' . \" for low resolution devices (crt and lpr) .if \n(.H>23 .if \n(.V>19 \ \{\ . ds : e . ds 8 ss . ds o a . ds d- d\h'-1'\(ga . ds D- D\h'-1'\(hy . ds th \o'bp' . ds Th \o'LP' . ds ae ae . ds Ae AE .\} .rm #[ #] #H #V #F C .\" ====================================================================== .\" .IX Title "HTML2LATEX 1" .TH HTML2LATEX 1 "perl v5.6.0" "2000-09-24" "User Contributed Perl Documentation" .UC .SH "NAME" html2latex \- \s-1HTML\s0 to latex converter. .SH "SYNOPSIS" .IX Header "SYNOPSIS" html2latex [\s-1OPTION\s0]... \s-1URLS\s0... 
.SH "DESCRIPTION"
.IX Header "DESCRIPTION"
html2latex uses \s-1HTML:\s0:TreeBuilder to parse an \s-1HTML\s0 file and then
converts the \s-1HTML:\s0:Element into a Latex file.
Each \s-1URL\s0 has its \&.*html extension stripped.
If you use a \s-1URL\s0, the files taken from the Internet are stored in your ~/.html2latex directory.
If pictures are included, they are converted to \&.PNG, which can only be used with pdflatex.
As an added bonus, there is an option to automatically create a \s-1PDF\s0 from the Latex file (using pdflatex).
.SH "REQUIRES"
.IX Header "REQUIRES"
If your html2latex is not working correctly, you may be missing some of the packages it needs.
html2latex requires \s-1HTML:\s0:TreeBuilder and possibly \s-1LWP:\s0:Simple and \s-1URI\s0.
If you do not have any of these, try typing \fBperl \-MCPAN \-e shell\fR at the command line.
This will bring up a shell for \s-1CPAN\s0 (the Comprehensive Perl Archive Network).
Then, as root, type \fBinstall \s-1HTML:\s0:TreeBuilder\fR.
This should work like magic.
.SH "URLS"
.IX Header "URLS"
In your list of URLs, any filename given after a \s-1URL\s0 continues to use the most recent \s-1HOST\s0 given.
Also, files default to index.html, regardless of what the server thinks.
So, if you type:
.PP
\&\f(CW\*(C`html2latex http://slashdot.org foo.html http://linuxtoday.net bar.html\*(C'\fR
.PP
html2latex will try to grab http://slashdot.org/index.html, http://slashdot.org/foo.html,
http://linuxtoday.net/index.html, and http://linuxtoday.net/bar.html.
.SH "OPTIONS"
.IX Header "OPTIONS"
Command-line options are secondary to document-specified options.
So, if your \s-1HTML\s0 file has border=1, a border will be printed regardless of the \fB\-\-border\fR option.
They do, however, override options given in the configuration file.
If you want to change things more permanently, try changing the config file, html2latex.xml.
For information on that file, see the \s-1CONFIGURATION\s0 \s-1FILE\s0 section of the \s-1HTML:\s0:Latex manpage.
.Ip "\fB\-h \-? \-\-help\fR" 4
.IX Item "-h -? --help"
Print brief help and usage information.
.Ip "\fB\-\-latex2pdf \-\-pdf \-p\fR" 4
.IX Item "--latex2pdf --pdf -p"
Automatically create a \s-1PDF\s0 named \s-1FILE\s0.pdf along with the latex file.
This may fail and print a number of cryptic errors.
.Ip "\fB\-i \-\-image \-\-image_scale=SCALE\fR" 4
.IX Item "-i --image --image_scale=SCALE"
Set the scale for images in the latex file.
This is useful because some images in \s-1HTML\s0 are much too big to fit on a page.
Default is 1.0.
\&\s-1SCALE\s0 can be any non-zero positive floating point number; large numbers are not recommended.
.Ip "\fB\-f \-\-font \-\-font_size=SIZE\fR" 4
.IX Item "-f --font --font_size=SIZE"
Set the default font size.
Can be 10, 11, or 12.
Do not try anything else: html2latex will not check the value, but the latex file will not compile (at least I think not).
Default is 12.
.Ip "\fB\-d \-\-debug\fR" 4
.IX Item "-d --debug"
Set the level of debugging info to print.
The more times this option is used, the higher the level.
Default is 0, and you cannot lower that.
Right now, 0 prints nothing, 1 prints fun code-tracking info, and 2 prints lots of
data-structure information, so don't use it unless you're serious.
.Ip "\fB\-\-border \-\-table \-\-table_border\fR" 4
.IX Item "--border --table --table_border"
Turn table borders on.
Default is off.
Also, \fB\-\-noborder\fR or \fB\-\-notable\fR will explicitly turn table borders off.
.Ip "\fB\-\-class \-\-document \-\-document_class=CLASS\fR" 4
.IX Item "--class --document --document_class=CLASS"
Set the documentclass to use.
Any valid latex document class is accepted.
Examples are \fBreport\fR, \fBbook\fR, and \fBarticle\fR.
\&\fBarticle\fR is the default.
If an invalid document class is used, the output latex file will not compile.
.Ip "\fB\-\-package=PACKAGE\fR" 4
.IX Item "--package=PACKAGE"
html2latex will create a latex file using any packages that you specify.
\&\s-1PACKAGE\s0 will be added to the list of packages to put in the file.
html2latex will not make sure the packages are valid, but if they aren't, the latex file won't compile.
.Ip "\fB\-\-head=HEAD\fR" 4
.IX Item "--head=HEAD"
Latex allows you to add options in the preamble of the form
\&\edocumentclass[\s-1OPTIONS\s0]{article}.
Each \s-1HEAD\s0 you give is added to that option list.
For instance, you could use \f(CW\*(C`\-\-head=twocolumn\*(C'\fR to add the 'twocolumn' feature of Latex.
Since font sizes are already added, don't add them yourself.
See \f(CW\*(C`\-\-font\*(C'\fR.
.Ip "\fB\-\-mbox \-m\fR" 4
.IX Item "--mbox -m"
With either of these, html2latex will put a tex \embox around all of the tables it creates.
I do not know why, but with a lot of tables (especially nested ones), the tex and pdf output will work better.
So, if you do not like your output with tables, try this.
.Ip "\fB\-\-paragraph \-\-par \-P\fR" 4
.IX Item "--paragraph --par -P"
Use HTML-style paragraphs.
This is the default, so use \-\-noparagraph or \-\-nopar or \-P! to switch back to Latex-style paragraphs.
.Ip "\fB\-\-cache \-\-local\fR" 4
.IX Item "--cache --local"
.Ip "\fB\-\-log \-l \s-1LOGFILE\s0\fR" 4
.IX Item "--log -l LOGFILE"
Print all messages to \s-1LOGFILE\s0 instead of \s-1STDERR\s0.
.Ip "\fB\-\-conf \-C \s-1CONFFILE\s0\fR" 4
.IX Item "--conf -C CONFFILE"
Change the configuration file to \s-1CONFFILE\s0.
For more information on this file, see the \s-1HTML:\s0:Latex manpage.
.SH "Development"
.IX Header "Development"
Development is being carried out by Peter Thatcher (peterthatcher@asu.edu) and Stan Seibert (volsung@asu.edu).
The homepage is http://html2latex.sourceforge.net.
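.SH "EXAMPLES"
.IX Header "EXAMPLES"
The host carry-over rule described under \s-1URLS\s0 can be sketched in a few lines of Python.
This is an illustration only, not the Perl code that html2latex actually runs.
The function name expand_urls is hypothetical, and the treatment of paths ending in a slash
and of filenames given before any host are assumptions.
.PP
.Vb 26
```python
from urllib.parse import urlsplit


def expand_urls(args):
    """Resolve a mixed list of URLs and bare filenames (hypothetical helper)."""
    host = None  # most recent scheme://host seen on the command line
    resolved = []
    for arg in args:
        parts = urlsplit(arg)
        if parts.scheme and parts.netloc:
            # A full URL: remember its host for the filenames that follow.
            host = parts.scheme + "://" + parts.netloc
            path = parts.path
            if path == "":
                path = "/index.html"  # files default to index.html
            elif path.endswith("/"):
                path = path + "index.html"  # assumption: same default here
            resolved.append(host + path)
        elif host is not None:
            # A bare filename: reuse the latest host given.
            resolved.append(host + "/" + arg.lstrip("/"))
        else:
            # Assumption: with no host seen yet, treat it as a local file.
            resolved.append(arg)
    return resolved
```
.Ve
.PP
Called with the four arguments from the \s-1URLS\s0 section, the sketch returns the four addresses listed there.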