How to develop a database that will potentially grow very large

If I am building a system with a database that has the potential to grow as big as LinkedIn's or Facebook's, do I need to take any special steps during development? How would I accommodate that much data? Will it ever require data distribution, and can that be done once the database is already built? When should we start worrying about the storage capacity of our database?

I’d say that unless you are very sure your data will grow that much, that fast, you should not worry about it. At those data volumes you have to solve very specific problems that you normally do not encounter.

  • Make sure your database is normalized - preferably to 3NF.

  • Make sure you have Referential Integrity.

  • Make sure all tables have a Primary Key - preferably an AutoNumber.

  • Use indexes where they make sense.

  • Design your database with the assumption that it will be as big as Facebook some day, i.e. get rid of assumptions like “We will never have more than 5 types of blah…”

Those tips will get you started…
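To make the tips above a bit more concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (member, post_type, post) are made up for illustration and are not from anyone's actual system. It shows an integer primary key on every table, foreign keys for referential integrity, an index on a column that is likely to be filtered on, and a lookup table in place of a hard-coded "we will only ever have 5 types" assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite does not enforce foreign keys unless this is switched on per connection.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical schema: all names here are illustrative assumptions.
conn.executescript("""
CREATE TABLE member (
    member_id INTEGER PRIMARY KEY,      -- auto-assigned integer key
    email     TEXT NOT NULL UNIQUE
);
CREATE TABLE post_type (                -- lookup table instead of a fixed set of types
    post_type_id INTEGER PRIMARY KEY,
    name         TEXT NOT NULL UNIQUE
);
CREATE TABLE post (
    post_id      INTEGER PRIMARY KEY,
    member_id    INTEGER NOT NULL REFERENCES member(member_id),
    post_type_id INTEGER NOT NULL REFERENCES post_type(post_type_id),
    body         TEXT NOT NULL
);
CREATE INDEX idx_post_member ON post(member_id);  -- supports "all posts by a member"
""")

conn.execute("INSERT INTO member (email) VALUES (?)", ("a@example.com",))
conn.execute("INSERT INTO post_type (name) VALUES ('status'), ('photo')")
conn.execute(
    "INSERT INTO post (member_id, post_type_id, body) VALUES (1, 1, 'hello')"
)


def fk_rejected():
    """Return True if a post pointing at a non-existent member is refused."""
    try:
        conn.execute(
            "INSERT INTO post (member_id, post_type_id, body) "
            "VALUES (999, 1, 'orphan')"
        )
    except sqlite3.IntegrityError:
        return True
    return False
```

With this layout, adding a sixth or sixtieth post type is just another row in post_type rather than a schema change, and the database itself refuses orphaned rows. Other database engines have their own syntax for auto-numbering and foreign keys, so treat this as a sketch of the idea only.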

A common mistake is “predicting future requirements”. From what I know, you can store millions of records in a traditional database, and it will most likely be fine for many years of your application's life. By the time you actually run into this problem, your system will already be very successful, and you will probably have bigger issues than scaling your database, issues that require code refactoring. You can always easily convert from a traditional DB to a big DB like MongoDB. Since you will be converting real data, you’ll be able to create a very good design for MongoDB. Trying to design for big data using phantom data or guesses about the future will probably lead to many headaches. I always say, “Make it work first, then make it better soon.”

That only makes sense if you have an OLTP application. In the OLAP world this is something you do not want to do.

Please don’t. Primary key is not always necessary, and in many cases you want a primary key which is more sophisticated than just an AutoNumber. As always in software development there are trade-offs you have to consider.
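As one possible example of a key that is "more sophisticated than an AutoNumber", here is a small sketch with a composite primary key on a link table, again using Python's sqlite3 module. The membership table and its columns are hypothetical, chosen only to illustrate the trade-off.

```python
import sqlite3

db = sqlite3.connect(":memory:")

# Hypothetical link table: the pair (member_id, group_id) is the natural key,
# so no separate auto-number column is needed.
db.execute("""
CREATE TABLE membership (
    member_id INTEGER NOT NULL,
    group_id  INTEGER NOT NULL,
    joined_on TEXT    NOT NULL,
    PRIMARY KEY (member_id, group_id)
)
""")
db.execute("INSERT INTO membership VALUES (1, 10, '2024-01-01')")


def duplicate_rejected():
    """Return True if inserting the same (member, group) pair twice is refused."""
    try:
        db.execute("INSERT INTO membership VALUES (1, 10, '2024-02-01')")
    except sqlite3.IntegrityError:
        return True
    return False
```

Here the key itself prevents duplicate memberships, which a surrogate AutoNumber alone would not do without an extra unique constraint. The cost is that anything referencing this table has to carry both columns, which is the kind of trade-off meant above.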

Facebook, Google, and other big players have problems that occur only when the amount of data becomes extremely large. In smaller applications it is in fact very detrimental to have storage that is designed for those extraordinary amounts of data. So please, do not design for Big Data unless you know for sure that you will be there pretty soon.

And it’s pretty hard to have a need for an OLAP when you don’t have any data!!

There you always assume people are starting off with an OLTP system unless stated otherwise…

Big Data?

You’re way out in left field man…

You don’t have to have a “Big Data” situation to PLAN PROPERLY.

I’ve seen train-wreck after train-wreck of data disasters because people don’t take the time to plan things up front.

This has nothing to do with Big Data, and if you really believe that a database will simply scale in this day and age, that shows how naive you are.

(Physical limitations are rarely the problem. P*ss poor planning on the Logical Design is almost always the problem!!)

totally agree
