My question covers a lot of territory, but I think this is the best forum to start with, since it's going to involve purchasing some new hardware and software.
A few weeks ago, my neighbor's apartment burned down, nearly taking mine along with it. When I got home from work in the middle of the night, at first it looked like my apartment had vanished. It made me think: What if I had lost all the research I've done over a lifetime, much of it in the form of letters, documents and books not even stored in a computer? In fact, I have more than two dozen boxes full of papers that do little more than take up space.
So I decided it's time to digitize them...I think that's the right word; I want to essentially scan everything into a computer, but I need some guidance.
For starters, there are at least three ways to transform a paper document into a computer file that I'm aware of...
1) Scan it as an image.
2) Scan it as a PDF file.
3) Use optical character recognition (OCR) software to scan it as text.
In most cases, I'd prefer to scan it as text, not just for the smaller file size but because it would be a much more useful format; I could search it, edit the text, etc.
However, I won't be able to use OCR software on documents that are faded, smudged, etc. Also, there are many images I'll want to scan.
So let's start with documents that can't be read with OCR software. Should I scan them as images or PDF files? (If I scan them as images, which format should I use - TIFF, GIF, JPEG, etc.?) I've never worked with PDF files before, but I think you can scan a document directly into PDF format, right? If so, I assume the result is a smaller file than an image would be.
Can I search for text in PDF files? In other words, if I search my computer for the word "jaguar," will it search inside PDF files or only text files?
Do images come out nice in PDF files, or would it be better to scan them into an image format, like JPG?
Next question - What hardware and software should I buy?
I recently upgraded to a new MacBook Pro running Lion. I think I can get a good scanner for $150, maybe even less than $100 - right? Are there any particular models you recommend?
Do I need to buy special software to create PDF files?
What OCR software do you recommend?
Also, most of my documents can be scanned with a regular scanner. (I think the term is flatbed scanner.) However, I have some thicker books that are going to be harder to scan. I've seen ads for little handheld scanners. Do they work well, and can you recommend any particular models?
Sorry to cram so many questions into one post. I think my project is pretty simple, really, or it will be after I figure out a few things - like PDF vs images vs OCR. Merely purchasing a scanner will probably be 90% of the project, especially if I can find one that comes with OCR software pre-installed.