Should all Images be One Type?

Yes, that helps and seems in line with my own thinking.

I appreciate the other competing views, but I have a long way to go before the number of files in a directory or anything else becomes an issue!

Thanks for the nice words and well-wishes. I’ll need them!!

Debbie

P.S. I am reviewing my upload.php script now and hope to post it online later for a “code review”. If you wanna really help me out, please keep an eye out for it and let me know if I internalized everyone’s suggestions.

I support ultra’s idea, not necessarily for performance, but to help avoid naming collisions. It’s a lot easier to avoid collisions with 10 files than it is with 100,000 files.

Also for the original topic: no, there is no benefit. In fact, there are several reasons to use multiple types (GIF, JPG, and PNG all have different storage methods, and those methods make them better suited for certain types of images).

There’s an easier way to do that than with directories: sha1_file, for example. Naming every file with its result makes sure that A) you have no name conflicts and B) you have no duplicate files. One would spend more time making subdirectory after subdirectory when there are much easier and faster solutions.
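A minimal sketch of that idea, assuming the upload has already been validated. The helper name store_by_hash and its parameters are made up for illustration and do not come from anyone's script in this thread:

```php
<?php
// Hypothetical helper: stores a file under a name derived from its contents.
function store_by_hash(string $source, string $dir, string $ext): string
{
    $hash = sha1_file($source);       // 40 hex characters, computed from the file contents
    if ($hash === false) {
        throw new RuntimeException("Cannot read $source");
    }
    $target = rtrim($dir, '/') . '/' . $hash . '.' . $ext;
    if (!file_exists($target)) {      // same contents => same name, so only one copy is kept
        copy($source, $target);       // for a real upload, move_uploaded_file() would go here
    }
    return $target;
}
```

Two uploads with identical contents end up at the same path, which is what gives both the "no name conflicts" and the "no duplicate files" behaviour described above.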

sha1_file just does a sha1 hash. Those can have naming conflicts (or there are only 16^40 things that can be hashed in the world =p)

Making the subfolders is pretty easy and automated:


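// $uploads is the base upload directory, ending in a slash.
$folder_name = $uploads . date('Y/m/d');
if (!is_dir($folder_name)) {
    mkdir($folder_name, 0755, true); // true = create the year and month folders as needed
}

// copy file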

It’s a hash of the file contents, not of the file name.

Those can have naming conflicts

Uh huh…if you get a conflict then it is likely the SAME FILE.

Rewind…

So if you upload a photo of yourself, and I upload a photo of myself - obviously very different - those two photo files could never create a “collision”, since they clearly have different bit configurations?

Secondly, if your answer above is, “No two image files that are different can have the same hash” then I would like to know if you have any easy ways to handle different people possibly uploading the same picture? (Maybe there are a lot of people who like Brad Pitt and so they use his image as their profile image. How can I allow that to happen without getting all complicated?!)

Debbie

The two files in question would never return the same hash. Ever. Changing one bit in a file changes the hash.

Secondly…How can I allow that to happen without getting all complicated?!)

You don’t have to do anything, just reference the same file in the database. They’ll share the same file without causing issues.
You’ll only have to watch out if you delete, by making sure no one else references that file.

It is akin to hardlinks in the file systems. One file with multiple aliases/names attached to it.
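A rough sketch of that "check before you delete" step. The array below is only a stand-in for the database table, and the user ids, file name, and the function release_avatar are all made up for illustration:

```php
<?php
// Hypothetical in-memory stand-in for a database table: user id => stored file name.
$avatars = [
    'user_a' => 'abc123.jpg',
    'user_b' => 'abc123.jpg',   // same picture, so both rows point at the same file
];

// Removes a user's reference and reports whether the file itself can now be deleted.
function release_avatar(array &$avatars, string $user): bool
{
    $file = $avatars[$user] ?? null;
    unset($avatars[$user]);
    if ($file === null) {
        return false;                          // this user had no picture
    }
    return !in_array($file, $avatars, true);   // true only when nobody else references it
}
```

The caller would only unlink() the file when the function returns true, which is the same rule a file system applies to hardlinks.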

Likely, but not guaranteed. =p

I like watching out for those 1 in a billion chances (or 1 in 42,949,672,960,000,000,000,000,000 chances… my argument still stands. =p)

Also, larger files take longer to hash, so it can be slower. SHA1 is O(n), so the larger n is, the longer it takes. My method is O(1), so it’s pretty much constant. :wink:

Been observing this thread and wondering why not something simple…

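$path = '/path/to/images/'; // placeholder: path to your images, sans image name
$ext = '.jpg';              // placeholder: extension of the image, including the dot
$does_it = 0;
while ($does_it == 0) {     // == compares; a single = would assign 0 and skip the loop
	$new_name = time() . $ext;
	if (!file_exists($path . $new_name)) {
		$does_it = 1;
	}
}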

Until you start accessing those directories. ext3/4 do have a real directory limit (~32000 for ext3). Unlike files, too many directories have a real negative impact on performance. You might want to look that up.

Hi Debbie,

I will look out for your upload script and if I have anything valuable to add I will.

ext4 supports volumes up to 1 exabyte (1 exabyte = 1,048,576 terabytes) and files up to 16 terabytes. It supports up to 64,000 subdirectories in a directory. Again, lots of room to grow into, and you can either store your thumbnails in one directory or divide them into some logical directory structure, say by month. The ext4 limits give you lots of room before you will have to move to something else.

Running RAID 10 on ext4 and the kernel patches to move large scale deletions/updates to memory have all but wiped out the problems ext3 suffered from. The inadequacies of ext3 are why ext4 was embarked upon.

Regards,
Steve

If you do a subfolder for every year, month, and day, the largest number of folders would be 31 (until you have your site for 32 years).

Not sure what you mean.
What I had in mind is folder structure like this
uploads/2012/02/12 <- all uploads made on Feb. 12 2012 go here.

That’s what I mean.

So, the largest number of folders in any given directory would be 31 (31 “day” folders in one month). So, until you have 32 year folders, 31 is the maximum number, which is pretty tiny.
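For reference, the year/month/day layout discussed here could be wrapped up roughly like this. The function name dated_upload_dir and its parameters are assumptions for the sake of the example:

```php
<?php
// Hypothetical helper: builds <base>/YYYY/MM/DD for a timestamp and creates it if missing.
function dated_upload_dir(string $base, int $timestamp): string
{
    $dir = rtrim($base, '/') . '/' . date('Y/m/d', $timestamp);
    if (!is_dir($dir)) {
        mkdir($dir, 0755, true);   // third argument creates the year and month folders too
    }
    return $dir;
}
```

Calling it with time() on every upload means no folder ever holds more than 12 month folders or 31 day folders, and only the top level grows, by one folder per year.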