Browser Caching Questions

Is there a way to prevent images (and content) from being cached in a user’s browser?

(I am asking this from the website’s point-of-view and not any settings that a user would set on his/her computer.)

Thanks,

Debbie

You can tell the browser not to cache pages if you want. Here is a discussion about it:

Why would you want to stop caching…and why is this in Web Security?

Perhaps it follows on from this discussion: http://www.sitepoint.com/forums/showthread.php?868495-IE8-and-Forcing-Cache-Flush

I am working on a project that deals with Online Testing.

My client pays current and retired school teachers to write Test Questions/Answers for a Test-Bank. Any breach of that content would obviously be catastrophic!!

The Test Writers log into my client’s web application over a secure connection, and write the Test Materials, which are saved on the server. Once the Test Questions/Answers have been approved, the Test Writer will never be able to see them again.

However, the fear is that a person writing Test Questions/Answers might be working on their Home Computer or a School Computer, and that either their own kids, or their students could easily get access to the Test Materials if the browser were caching (i.e. saving) the Test Pages and Test Images.

Questions/Answers are usually in “Text” format, but there are many cases where they could be in an “Image” format as well (e.g. Mathematical Formulas, Illustrations, Maps, Diagrams, etc).

So, if the Test Writer’s computer or browser is somehow saving permanent copies of the Web Pages or Web Images, then that would be a data breach.

Now, I understand the concept that anything you see on the Internet needs to be downloaded onto your computer for viewing - unless it is streaming audio/video. However, I believe there may be ways to make sure that your Operating System or Browser isn’t caching hundreds of pages that could be easily retrieved.

For example, not too long ago on Windows you could surf around your C: drive and find copies of old Web Pages and Web Images quite easily. (In fact, this is why I refuse to do Online Banking, because I did a quick test on my old Windows XP machine, and found images of all the cancelled checks?!) :eek:

The debate at work is really centered more around the belief - which I believe is false - that if we use Internet Explorer that all of these issues go away?! :rolleyes:

I’m no expert, but my understanding is that IF anything can be done to stop caching/saving of Web Pages and Web Images, then you should be able to do it for any Browser and Operating System.

(Note, performance isn’t an issue here, so if the Web page/Images have to be reloaded every time they are accessed, no big deal.)

BTW, my client assumes that the Test Writers are trustworthy people who are not copying/saving content on their own. (Obviously nothing can prevent that!)

What they are more concerned about is a non-technically-savvy schoolteacher hanging him/herself!!

Hope that makes more sense?!

Debbie

DD,

That sounds more like a History question. If I remember correctly, there are ways to eliminate - or prevent - pages from being logged to a browser’s history but I’ll let you search for them. Because it appears to be a browser-centric question, you will likely have to research each one.

Regards,

DK

Does anyone else agree with this?

Debbie

If by history, they mean the browser file cache, yes. If they mean the list of recently visited websites, then no.

It’s a cache issue. Browsers cache HTML, CSS, JavaScript, images, etc. in a folder on your computer so they don’t have to download them a second time when you refresh the page. If the files have changed, they fetch them again, depending on the Last-Modified headers, etc.
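To make the "unless the files have changed" part a bit more concrete, here is a rough Python sketch of the check a server might do when the browser asks "has this changed since I cached it?". The function name `status_for` and the example date are made up for illustration, and it assumes `last_modified` is a timezone-aware datetime. Real servers do more than this (ETags, for one), so treat it as a sketch only.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime


def status_for(if_modified_since, last_modified):
    """Return 304 if the browser's cached copy is still current, else 200."""
    if if_modified_since is None:
        return 200  # browser has no cached copy, send the whole file
    try:
        cached_at = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unreadable header, play it safe and resend
    if cached_at.tzinfo is None:
        # Header carried no usable zone; HTTP dates are meant to be GMT
        cached_at = cached_at.replace(tzinfo=timezone.utc)
    # HTTP dates only have one-second resolution
    return 304 if last_modified.replace(microsecond=0) <= cached_at else 200


# Made-up example: a file last changed on 1 Aug 2012 at noon UTC
changed = datetime(2012, 8, 1, 12, 0, 0, tzinfo=timezone.utc)
header = format_datetime(changed, usegmt=True)

print(status_for(None, changed))                         # first visit
print(status_for(header, changed))                       # file unchanged
print(status_for(header, changed + timedelta(hours=1)))  # file edited since
```

A 304 means "use the copy you already have", which is exactly the stored copy the original question is worried about.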

As others have suggested, you can send headers to tell the browser not to cache the files.

Does anyone else agree with this?

Debbie

Yes.

The original question was about the content storage, if I understand correctly.
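There is a small technical problem: webmasters commonly use the “no-cache” directive, but the browser is still able to store the data. (“no-cache” only makes the browser revalidate before reusing its copy; it is “no-store” that asks it not to keep a copy at all.)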

I was asking about either Images or Content being stored on the end user’s computer, and thus causing a security risk.

Debbie

:rofl:

If there is any browser that caches stuff for way too long, it’s IE. I’ve had instances where just clearing the history and restarting the browser wasn’t enough; I had to restart the computer completely to get rid of IE’s cache. This was in ye olden IE6 days, mind you, but still.

Anyway, sending Cache-Control headers with the values no-cache, no-store and private (see http://palizine.plynt.com/issues/2008Jul/cache-control-attributes/) for both the document as well as for all the images should do the trick.
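For what it’s worth, here is a minimal Python (WSGI) sketch of sending those values. The names `NO_STORE_HEADERS` and `app` and the page body are hypothetical, and the `Pragma` and `Expires` lines are an extra that is often sent for older HTTP/1.0 caches, not something from the linked article. The same headers would need to go on the image responses too, however your stack serves them.

```python
# Hypothetical helper: the Cache-Control values suggested above, plus two
# legacy headers (Pragma, Expires) often added for old HTTP/1.0 caches.
NO_STORE_HEADERS = [
    ("Cache-Control", "no-cache, no-store, private"),
    ("Pragma", "no-cache"),
    ("Expires", "0"),  # an invalid date, which caches treat as already expired
]


def app(environ, start_response):
    """Minimal WSGI app that serves one page with caching disabled."""
    body = b"<p>Test question goes here</p>"
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ] + NO_STORE_HEADERS
    start_response("200 OK", headers)
    return [body]
```

To try it locally, something like `wsgiref.simple_server.make_server("localhost", 8000, app).serve_forever()` from the standard library should do, and the browser’s developer tools will show the response headers.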

And if at all possible, serve over SSL. Most browsers are hesitant to cache anything they download over an SSL connection.

I agree.

Anyway, sending Cache-Control headers with the values no-cache, no-store and private (see http://palizine.plynt.com/issues/2008Jul/cache-control-attributes/) for both the document as well as for all the images should do the trick.

But isn’t that easy to over-ride?

I mean, couldn’t a browser ignore all of that?

And couldn’t how a user has his/her browser set up, also over-ride those?

And, couldn’t a user manually over-ride those?

And if at all possible, serve over SSL. Most browsers are hesitant to cache anything they download over an SSL connection.

Now that is a very good point!! :tup: (I think that is something they could do and would support.)


BTW, backing up for a moment, does my original post make sense about what they are concerned about?

And from a business perspective, just how effective can they be in trying to prevent test material from leaking out? (I guess I see this as analogous to banking online from home - or even worse from work or a public computer - and wanting to make sure that nothing you did or saw ended up stored in an unsecured location. For example, copies of cancelled checks, copies of bank statements, any content that might include personal or banking information, etc.)

Thanks,

Debbie

Those are actually exact opposite situations. With online banking etc. you are at the computer where the files are cached, so you can tell the browser to clear the cache and no one can prevent it. In the situation you are looking at, it is the server trying to prevent caching, and the person sitting in front of the browser has the ability, if they know how, to override whatever you do and cache the files anyway. As with anything sent to the browser, the person using the browser has control.

Possible, yes. Easy, no.

Yes, those headers are only suggestions. A browser can just ignore all of them if it wants.

I can’t remember ever seeing settings in a browser for this.

An advanced user could side-jack his own connection (if it’s not SSL) and change those values, or write a browser extension to change them, yes, but that is by no means within the realm of possibilities of the average internet user.

Also, with regards to the images, if you don’t show them back to the user after they’ve uploaded them they will not be stored in the cache, because they’ve never been downloaded. The bigger problem here is that the image is still on the computer, because that’s where it’s uploaded from.

This sounds to me like a situation where you can give it your very best to prevent caching, but you can never 100% it.

I’m not sure what you are saying here, and think you may be misunderstanding me about images.

While there could be someone who uploads an image for the test, I don’t believe that would be common, since I am pretty sure all of the graphic design stuff happens in house.

However, the person who writes the test questions and answers, or the person who approves what the author wrote, would need to see the image, thus it would be downloaded.

For example, let’s say there is a math question about a triangle, and along with the test question and answer, there is an image of a right triangle with the length of one side and the two acute angles also noted in the image.

The fear is that the author, or the lead who approves things, might be doing so from school or at home, and if an unauthorized person got hold of the cached webpages/images, then that image of a right triangle might clue them in on how the test is written, or on what a possible question might be.

It could be something even more specific, like a custom map of the U.S. showing certain features (e.g. former territory borders, natural resources, etc.) that could be used on several tests, and again, by having a copy of that image, people could use it to cheat or at least be really prepared for the tests.

I know you have to let viewers of the webpage download the image to view it, but the fear is that the image persists on the user’s computer long after they have accessed the image.

For example, an author has permissions on the server side to view a question while it is being authored, but as soon as the lead approves the question and answer as “valid”, then that author - who is usually a hired consultant - will never have access to the question again, because for what they are being paid to do, they have no need to go back and see it.

So if Jane Doe writes a test question and answer with an image to support things, then after her test question/answer/image are approved, she shouldn’t be seeing it on her computer, which could be at work, at school - if she is a teacher by day - or at home with possibly tons of kids!!

If the content of the test question or answer, or the image, gets stored on ANYONE’S computer, whether through webpages being stored in temporary locations, or image caching, or something the browser does to temporarily store things, or something the operating system does to temporarily store things, then that Test Question/Answer/Image has in essence been “breached”…

Hope that makes sense.

Not sure how that changes your response on the image part?!

Thanks,

Debbie

I’m going to be honest here. I think you are overthinking the scenarios and making this a lot more complicated than it needs to be. Simplicity is a beautiful thing. The fact of the matter is, having a single image would not compromise a test, and I say that having taken quite a few tests myself. Having part of an image for a math problem is not going to give anything away. Now, unless the answer itself was an image? Maybe. Also, you brought up a teacher at home with a bunch of kids…who probably won’t even be taking the test in the first place? And if they were, I’m sure their mother, being a teacher and all, will understand the “keep the answers a secret” type thing. As for school computers, a student having prolonged unsupervised access to a teacher’s account is not unheard of, but snooping in the browser’s cache is not an activity I would see them doing. Looking into the cache is not something people think about.

A few comments…

1.) I am just researching things for work, and relaying what the concerns are.

2.) As a technical person, I am trying to get to the “truth” about what can and cannot be done with the technology (e.g. browser caching, etc.)

3.) These tests go out to MILLIONS of students, so to imply “I’m over-thinking this” or “It’s not that big of a risk” would be wrong!!

ScallioXTX’s link and suggestion to use HTTPS seem like the best advice so far, but I welcome more thoughts.

Thanks,

Debbie

Yup, I thought the test creators needed to upload images, but apparently they don’t. Not sure where those images come from then, but I’m sure you’ve got something for that :slight_smile: