Importing custom databases into Drupal

I have in front of me a fairly complicated db structure from a custom CMS that I’ve been charged with importing into Drupal. I’m just getting started with Drupal’s node structure, so I’m a bit worried about how I’m going to retain the content hierarchy and also preserve the cross-reference tables (for example, vendors can belong to multiple categories and categories have multiple vendors, so a cross-reference table is used to map the many-to-many relationship). I’m pretty familiar with how to do this if I’m writing code from scratch, but I’m at a bit of a loss as to how to do it with Drupal.

Can anyone recommend any tutorials on doing this sort of thing? I’m going through the Google hits I’ve found for this, but the articles it’s pulling up are uneven. The Feeds module keeps getting mentioned, but since I’m reading in a database (or more accurately a set of CSV files I’ve converted back into a database in order to analyze their contents) it isn’t going to be much use.

Use the Entity Reference module to create fields that can reference other entities in the system. This is how you could easily assign vendors to categories, or to any other entity for that matter. If you are using Drupal 8, entity reference functionality is included in core; if not, then you need the module.

Entity Reference requires the Entity module, which also appears to be scheduled for Drupal 8 core - which is nice, but we have a firm launch prior to a stable release of Drupal 8 so the site must be done in 7. Any good tutorials on those modules?

The Entity Reference module relies on Entity API, which has a production release. I’m not sure what you mean by entity module, considering there is no such thing. In Drupal 7 the entity functions are just a series of procedural functions inside a single include file; entity is not a module. In Drupal 8 this has all been cleaned up, but in 7 it is quite a mess, to say the least, lacking some very basic CRUD functions like save. The entity_api module provides a powerful API for manipulating entities and fills some of those basic CRUD gaps. It also provides ORM-like features for dealing with relationships and for defining entity properties, field relationships and data types. That is not to say you couldn’t get by without Entity API or Entity Reference, but using them will save a lot of headache and allow integration with other modules. That integration is a major advantage of doing things the Drupal way in the first place, that is, using entities and fields to build site components and content entry templates.

In regards to importing the data, your best bet will be to build something yourself or leverage the Feeds module. From what you are saying, it sounds like either way you will be writing custom code. The Feeds module does do quite a bit of the heavy lifting, but custom integrations will probably still be required. If you were to use Feeds to move the data over, I would suggest massaging the data into something that Feeds can import in batches. That would probably take the form of either XML or CSV file(s). There are other modules that integrate with Feeds and allow XML parsing and whatnot if you go the XML route. One of those modules is QueryPath parser, though a standard XPath parser (https://drupal.org/project/feeds_xpathparser) also exists. Feeds Tamper (https://drupal.org/project/feeds_tamper) has some useful utilities as well, for things like stripping certain characters and expanding multi-value fields. There is also Image Grabber (https://drupal.org/project/feeds_imagegrabber), which can be used to import images, or as a reference for writing your own integrations since it is a relatively simple module. Those are some of the useful Feeds modules off the top of my head.
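As a rough illustration of that massaging step, here is a minimal Python sketch that flattens a many-to-many cross-reference table into one CSV row per vendor, with the categories packed into a single delimited cell. The table and column names (vendors, categories, vendor_categories), the CSV headers and the "|" separator are all assumptions made up for the example, not anything taken from the actual database. The idea is that a multi-value cell like this could then be split on the Drupal side, for instance with the multi-value expansion Feeds Tamper offers.

```python
import csv
import io
import sqlite3


def export_vendors_csv(conn, out, separator="|"):
    """Write one CSV row per vendor, with its categories joined into one cell."""
    rows = conn.execute(
        """
        SELECT v.id, v.name, c.name
        FROM vendors v
        LEFT JOIN vendor_categories vc ON vc.vendor_id = v.id
        LEFT JOIN categories c ON c.id = vc.category_id
        ORDER BY v.id, c.name
        """
    )
    # Group the joined rows by vendor; vendors without categories are kept.
    grouped = {}
    for vendor_id, vendor_name, category_name in rows:
        entry = grouped.setdefault(vendor_id, {"name": vendor_name, "categories": []})
        if category_name is not None:
            entry["categories"].append(category_name)

    writer = csv.writer(out)
    writer.writerow(["legacy_id", "title", "categories"])  # hypothetical headers
    for vendor_id, entry in grouped.items():
        writer.writerow([vendor_id, entry["name"], separator.join(entry["categories"])])


def demo():
    """Build a tiny in-memory stand-in for the legacy database and export it."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE vendors (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE vendor_categories (vendor_id INTEGER, category_id INTEGER);
        INSERT INTO vendors VALUES (1, 'Acme'), (2, 'Globex');
        INSERT INTO categories VALUES (10, 'Hardware'), (11, 'Software');
        INSERT INTO vendor_categories VALUES (1, 10), (1, 11), (2, 11);
        """
    )
    out = io.StringIO()
    export_vendors_csv(conn, out)
    return out.getvalue()


if __name__ == "__main__":
    print(demo())
```

Keeping the legacy id in its own column is deliberate: it gives the importer something stable to match on if the import has to be re-run.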

Coming from an OOP background, you are probably going to be disgusted by Drupal 7 and its contributed modules, but I will say it right now: it is something you’re going to have to get over if you’re going to work with Drupal. It is a lot of procedural code until you hit 8. 8 is a game changer, but 7 is mostly procedural.

When I was testing out Feeds some time back, I created a plain Jane local environment with just the Feeds modules enabled. Generally speaking, the fewer modules you have enabled, the less likely you are to run into problems and the faster the site. So when it comes to researching these things, it is nice to have isolated local environments for the different major parts of a project, such as importing data.

The absolute first thing I would do, though, is model the existing data as content types, taxonomy, and fields. That will then lay the groundwork for what will be required to translate the existing data into Drupal entities (nodes, terms, etc).
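One way to keep that modeling exercise honest is to write the mapping down as data and check it against the legacy schema. The sketch below is only an illustration in Python: every table, column, bundle and field name in it is hypothetical, and the layout of the mapping is just one possible convention, not a Drupal format.

```python
# Hypothetical mapping from legacy tables to Drupal 7 structures.
# Each entry names the target (entity type, bundle or field) and says
# which Drupal field each legacy column should end up in.
LEGACY_TO_DRUPAL = {
    "vendors": {
        "drupal": ("node", "vendor"),
        "fields": {"name": "title", "description": "body", "website": "field_website"},
    },
    "categories": {
        "drupal": ("taxonomy_term", "vendor_categories"),
        "fields": {"name": "name", "parent_id": "parent"},
    },
    # The cross-reference table becomes a multi-value reference field on vendors.
    "vendor_categories": {
        "drupal": ("field", "field_vendor_categories"),
        "fields": {"vendor_id": "entity", "category_id": "value"},
    },
}


def unmapped_columns(legacy_schema, mapping):
    """Return 'table.column' names that no mapping entry accounts for."""
    missing = []
    for table, columns in legacy_schema.items():
        mapped = mapping.get(table, {}).get("fields", {})
        for column in columns:
            if column not in mapped:
                missing.append(f"{table}.{column}")
    return missing


if __name__ == "__main__":
    # Hypothetical legacy schema: table name -> list of column names.
    schema = {
        "vendors": ["name", "description", "website", "phone"],
        "categories": ["name", "parent_id"],
        "vendor_categories": ["vendor_id", "category_id"],
    }
    print(unmapped_columns(schema, LEGACY_TO_DRUPAL))
```

Anything the check reports is a column that still needs a decision: a new field, a merge into an existing one, or a deliberate drop.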

I know what you mean about coming from the code world and wondering how to achieve things with Drupal. I was in the same boat about 6 years ago and it was painful at the time but Drupal is better now than it was back then and there is a lot more information too. Since Drupal 8 is still an alpha, I’m going to suggest concentrating on Drupal 7. I have Drupal 8 running as a test bed and it is packed with lots of out of the box goodness but I don’t expect to use it in production for quite some time.

For relationships and cross referencing, you need Taxonomy. You can create lists of references and then you can expose them to your content and tag it appropriately. Taxonomy is native to Drupal so no need for additional modules there.

I expect that at some point, in order to render indexes of your cross-referenced information, you’ll use the Views module. You don’t have to, but since you are new to the world of Drupal I would suggest that you do. For Views you need both CTools and Views. You only need CTools itself enabled (none of the modules bundled with CTools need to be enabled). Once you’ve got Views installed you’ll be able to create “Views” as standalone pages, feeds or blocks. Views can be extremely complex, but basically a view is a listing of content that you can filter and display pretty much any way you like. I use Views for blog, news, article, etc. indexes. I also use it for catalogues, and you can set specific filters or expose the filters. So if you want to have a roster of people, you can use Views to list them out and then expose a filter to set some sort of demographic criteria. If you want to do a catalogue with title, description, thumbnail, feature sheet (PDF), etc., Views is your best friend.

I usually use Feeds to import the data into nodes. You will want to play with it a bit to ensure that it imports the content into the correct content type. As I recall, by default everything is imported as an article, and I don’t usually use articles.

I’ve used Menu Import for importing a structural hierarchy. It’s a bit of a pain in the @ss to set up and get the import working correctly but once you get around it and have it set up correctly, it will create the menu and will save quite a bit of time as opposed to linking manually.
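If the legacy hierarchy is stored as rows with a parent id, a small script can turn it into an indented outline before it goes anywhere near an importer. This Python sketch assumes a hypothetical list of (id, parent_id, title) tuples and uses one hyphen per level of depth. That indentation style is an assumption on my part, so check the module’s documentation for the exact file format it expects and adjust the indent argument to match.

```python
def build_outline(pages, indent="-"):
    """Turn (id, parent_id, title) rows into an indented text outline.

    parent_id is None for top-level items. Rows whose parent never
    appears in the data are unreachable from the top and are left out.
    """
    # Index children by parent so each level can be walked in input order.
    children = {}
    for page_id, parent_id, title in pages:
        children.setdefault(parent_id, []).append((page_id, title))

    lines = []

    def walk(parent_id, depth):
        for page_id, title in children.get(parent_id, []):
            lines.append(f"{indent * depth}{title}")
            walk(page_id, depth + 1)

    walk(None, 0)
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical legacy rows: (id, parent_id, title).
    legacy_pages = [
        (1, None, "Products"),
        (2, 1, "Widgets"),
        (3, 1, "Gadgets"),
        (4, 2, "Blue widgets"),
        (5, None, "About"),
    ]
    print(build_outline(legacy_pages))
```

Sorting the rows by the legacy weight or sort-order column before calling the function would preserve the original menu ordering, if the old CMS stored one.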

It can be difficult coming from a world of code into Drupal because there is a fairly substantial learning curve and it can feel like you are moving backwards, but once you get past the growing pains it is definitely worth the effort. For learning the Drupal way of doing things I would recommend Using Drupal 2nd Edition (http://shop.oreilly.com/product/0636920010890.do). It’s pretty basic stuff but full of insight about how to go about common activities. For digging deeper and learning the API, as well as the best “Theming” primer I’ve read, I would recommend Pro Drupal Development 7 (http://www.amazon.ca/dp/1430228385), a fairly in-depth book about the guts of Drupal.

Andrew