How to Auto Download & Format Data from the Web?

Hey guys,

I am currently designing a national business directory site, similar to YP or Angie’s List, which will list local businesses of all kinds and allow users to search, browse, review, and rate businesses in their area. While most of the businesses will be added by the users themselves (registering their own businesses on the site), I do not want to launch the directory completely empty of listings. So I figured I would pick a few locations and list the businesses in those areas myself, which leads to my issue: it is going to take months to create a page for each business, even for just one small area. I can upload a CSV file containing the information for each business to save time, but finding all of the info and then formatting it correctly in a CSV document will still take forever.

So my question is: is there a program that can recognize the data on a webpage and auto-populate the necessary fields in my database, so that I don’t have to copy/paste and format thousands of rows of data? For example, if I want to create a list of all plumbers in Washington DC using the information available on YP.com, can I somehow set up a program like Excel to automatically pull each business’s name, address, phone number, hours of operation, etc. into the appropriate fields? If this sort of software is available anywhere, I haven’t been able to find it. If it is not available, does anyone have any ideas as to how I could go about developing software like this myself?
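
To make the goal concrete, here is a rough sketch in Python (standard library only) of the kind of thing I am picturing: take the HTML of a listings page, pick out each field, and write rows to CSV. Everything specific in it is an assumption for illustration. The sample HTML, the class names (listing, name, address, phone, hours), the businesses, and the helper names ListingParser and html_to_csv are all made up and are not taken from YP.com or any real site, whose markup will certainly differ.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical markup: these class names and businesses are invented
# for illustration and do not come from any real directory site.
SAMPLE_HTML = """
<div class="listing">
  <span class="name">Acme Plumbing</span>
  <span class="address">123 Example St, Washington, DC</span>
  <span class="phone">202-555-0100</span>
  <span class="hours">Mon-Fri 8am-5pm</span>
</div>
<div class="listing">
  <span class="name">Capitol Pipe Repair</span>
  <span class="address">456 Sample Ave, Washington, DC</span>
  <span class="phone">202-555-0101</span>
  <span class="hours">Mon-Sat 9am-6pm</span>
</div>
"""

# Columns of the CSV file, matching the assumed span class names.
FIELDS = ["name", "address", "phone", "hours"]


class ListingParser(HTMLParser):
    """Collects one dict per <div class="listing"> block."""

    def __init__(self):
        super().__init__()
        self.rows = []        # finished listings
        self._current = None  # listing being filled in, if any
        self._field = None    # field whose text is being read, if any

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "div" and "listing" in classes:
            self._current = dict.fromkeys(FIELDS, "")
        elif tag == "span" and self._current is not None:
            # Remember which field this span holds (None if unrecognized).
            self._field = next((c for c in classes if c in FIELDS), None)

    def handle_data(self, data):
        if self._current is not None and self._field:
            self._current[self._field] += data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None
        elif tag == "div" and self._current is not None:
            self.rows.append(self._current)
            self._current = None


def html_to_csv(html):
    """Return CSV text with a header row and one row per listing."""
    parser = ListingParser()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(parser.rows)
    return out.getvalue()


if __name__ == "__main__":
    print(html_to_csv(SAMPLE_HTML))
```

The csv module takes care of quoting, so an address that contains commas still ends up in a single column. What I don't know is how to get from a toy like this to something that works across thousands of real pages.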

Any help or suggestions would be really appreciated!