Three out of four responders have misunderstood my question in different ways, so I’m persuaded that I did a bad job of stating it, and I’d like to have another go.
On one hand, I have a web site with several dozen pages. On the other, I have people who are effectively my clients (at least two of the four so far) who tell me that their clients find it difficult to use information from web sites and insist on a page-oriented format, customarily a PDF.
To provide that, I can convert each page to a PDF with Acrobat, or with a free PDF driver like CutePDF, and then combine the PDFs with Acrobat. But this approach has a serious disadvantage: the pages are heavily interlinked, and the links would continue to point to their original targets. From the reader’s point of view, there would be all these links that obviously were supposed to point to other parts of the document, and they’d point to the same material on some web site instead. Dumb!
So, I want a tool that lets me combine the HTML documents – resolving the links from inter-document targets to intra-document targets, among other things – and then convert to PDF. If the tool is really nice it can automatically add front matter, page separators, and a table of contents (although I’ll probably have to add the page numbers by hand after the conversion).
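To make the link-resolution step concrete, here is a rough Python sketch of what I mean. It is not an existing tool, just an illustration under some assumptions: the pages all live in one flat directory, the links use double-quoted href attributes, and the names page_anchor, retarget and combine are made up for this example. It uses regular expressions rather than a real HTML parser, so it would trip over unusual markup. Whether the PDF converter honors the page-break style is also an assumption.

```python
import re
from urllib.parse import urlsplit

HREF = re.compile(r'href="([^"]*)"')
ID = re.compile(r'(?<![\w-])id="([^"]*)"')


def page_anchor(filename):
    """Build an id for a page, e.g. 'setup.html' -> 'page-setup-html'."""
    return "page-" + re.sub(r"[^A-Za-z0-9]+", "-", filename).strip("-")


def retarget(html, current, pages):
    """Point links to other pages in the set at anchors inside the combined document."""
    def fix(match):
        parts = urlsplit(match.group(1))
        if parts.scheme or parts.netloc:
            return match.group(0)  # external link (http:, mailto:, ...): leave alone
        target = parts.path or current  # bare '#frag' means the current page
        if target not in pages:
            return match.group(0)  # not one of our pages: leave alone
        anchor = page_anchor(target)
        if parts.fragment:
            anchor += "--" + parts.fragment
        return f'href="#{anchor}"'
    return HREF.sub(fix, html)


def combine(pages):
    """Merge {filename: body_html} into one HTML document, one section per page."""
    sections = []
    for name, body in pages.items():
        prefix = page_anchor(name)
        # Prefix every id so that ids from different pages cannot collide.
        body = ID.sub(lambda m: f'id="{prefix}--{m.group(1)}"', body)
        body = retarget(body, name, pages)
        sections.append(
            f'<div id="{prefix}" style="page-break-before: always">\n{body}\n</div>'
        )
    return "<html><body>\n" + "\n".join(sections) + "\n</body></html>"


if __name__ == "__main__":
    site = {
        "index.html": '<p><a href="setup.html#install">Install</a></p>',
        "setup.html": '<h2 id="install">Install</h2><p><a href="index.html">Home</a></p>',
    }
    print(combine(site))
```

The combined file could then go through the same HTML-to-PDF conversion as a single page would. Front matter and a table of contents would be extra sections added at the top of the combined document in the same way.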
Returning briefly to the suggestions:
The people making the request are asking specifically for a PDF. Thus the whole point is to produce a page-oriented rendition of the web site. Some other page-oriented format would probably be OK, although I’d have to clear it with them. Another interactive format, e.g., CHM, would be completely off point.
I don’t get to tell them that they don’t really want what they’re asking for because a web site cannot be (faithfully) represented in a serial document. That’s true, but irrelevant. This is one of those cases where the customer is always right; the customer’s customer is right squared!
Maybe some overriding constraint compels my clients’ clients to use a serial format, and it just hasn’t been explained to me. Maybe they’re asking for this because they put their heads on backwards when they get out of bed. It doesn’t matter.
I’m looking for an HTML tool, not a software component that I could use to create my own tool. My “client” does not have time to wait while I engage in a bout of software development, nor does my boss pay me to do that. In any case the response didn’t seem to imply that PDFLib can retarget links and do the other things required to solve this problem. I looked at its web site briefly, and got the impression that it would just let me do those things myself by manipulating PDF rather than HTML. It’s not clear what the percentage is in that… especially if it involves implementing the tool on a server, which would not be its natural environment.
By the way, the problem is now solved to the extent it can be, because today is my last day in this job. Everything I can do for my “clients” is done. I encountered a similar problem once before, though, so I foresee encountering one again in the future, and I’d like to be ready when I do. On top of that, the problem is technically interesting.