Parsing content from server file(s), selecting and inserting into page HTML (not DOM)

I have a problem for which JavaScript seems to be the most reasonable solution. While I’ve been programming for over 20 years (C, C++, Korn/Bourne Shell, traces of Perl, …, and lately (X)HTML with CSS) most of what I know of JavaScript is that the O’Reilly Rhino Book is big and heavy. Okay, I’ve read parts of it and I know I have to absorb the model of computation, the value/object subtleties, and so forth. What I’m looking for is what language parts/tools I should be studying to complete the task at hand. The gaps mostly have to do with I/O, which is typical when you are in an unfamiliar environment.

I’ll describe that task in some detail so you won’t go chasing quickie solutions that can’t do the job.

I’m now managing a web site which is embedding ads from a Big E-tailer. (If you’re thinking big, you’ve got the right one.) We have ad blocks for a large number of books and a variety of swag (totes, banners, pins, mugs) and we’d like random selection of ad blocks, as below. Since each ad block results in a request to Big E-tailer’s servers, and since our ‘library’ of ad blocks may run into the hundreds, we don’t want the code for a block to be interpreted unless we choose it for display. The site is hosted by a service that doesn’t support server-side execution, but does allow us to store url-accessible files (via http, not ftp).

In addition we’d also like to have a lot of control over how the blocks are selected, so we would like to be able to store parameters both with the ad blocks and in the display spaces. (Example to follow.) My feeling is that it would be fastest and least stressful all around if the ad blocks’ HTML could be stored, with the control parameters, in a single file on the file system we see in our hosting company’s server environment. We might have a second file, also there, with parameters for the various display spaces.

What follows is an example of how the thing might work. Note that I can program my way around the block and store that block in a data structure; my algebra is strong, my calculus adequate, and my prob-and-stat sufficient to this task. I lack only the selection of JavaScript facilities (or so I think).

What I envision is something like this: An ad display space in the web page (perhaps in the coding for a sidebar) contains a JS program (or the URL thereof). Parameters for that space (one of many) say "three NewBooks, two ClassicFiction, two ClassicNonFic, one Swag, one to three selected randomly from NewBooks, ClassicFiction, ClassicNonFic, and Swag."

This yields nine to eleven slots to be filled for this display space, each needing an ad from the specified category. The order of the slots is now randomized. (I -know- there is a random number function in JS.)
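The slot-building step above could be sketched roughly as follows. This is only an illustration: the `sidebarSpec` object, its field names, and the helper functions are made up for the example, and the shuffle is a standard Fisher-Yates shuffle driven by `Math.random`.

```javascript
// Hypothetical description of one display space: fixed counts per category,
// plus a range of extra slots drawn at random from the listed categories.
const sidebarSpec = {
  fixed: { NewBooks: 3, ClassicFiction: 2, ClassicNonFic: 2, Swag: 1 },
  extra: { min: 1, max: 3, from: ["NewBooks", "ClassicFiction", "ClassicNonFic", "Swag"] },
};

// Random integer in [lo, hi], inclusive. Math.random() returns a value in [0, 1).
function randomInt(lo, hi, rand = Math.random) {
  return lo + Math.floor(rand() * (hi - lo + 1));
}

// Fisher-Yates shuffle, in place.
function shuffle(items, rand = Math.random) {
  for (let i = items.length - 1; i > 0; i--) {
    const j = randomInt(0, i, rand);
    [items[i], items[j]] = [items[j], items[i]];
  }
  return items;
}

// Expand the spec into a list of category names, one per slot, in random order.
function buildSlots(spec, rand = Math.random) {
  const slots = [];
  for (const [category, count] of Object.entries(spec.fixed)) {
    for (let n = 0; n < count; n++) slots.push(category);
  }
  const extras = randomInt(spec.extra.min, spec.extra.max, rand);
  for (let n = 0; n < extras; n++) {
    slots.push(spec.extra.from[randomInt(0, spec.extra.from.length - 1, rand)]);
  }
  return shuffle(slots, rand);
}
```

Calling `buildSlots(sidebarSpec)` returns an array of nine to eleven category names; each entry is then a slot waiting for an ad from that category.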

The list of ad blocks (and associated parameters) resides in the server file. Perhaps each category is read into an array, and each array element has a string with the ad block HTML and display control parameters.

For each ad slot in the display space, an ad is selected as follows: First a weighted likelihood is assigned for each available ad, based on the parameters and the slot’s location (eg. from top to bottom in the display space). Then, based on the weighting and more “random” numbers, the ad is selected and its text is “injected” into the page. Finally, the ad is removed from consideration for re-display on that page (unless there are not enough left).
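A minimal sketch of that weighted pick-and-remove step, under some assumptions: each ad is a plain object, and `weightFor(ad, slotIndex)` is a hypothetical caller-supplied function that turns an ad's parameters and the slot position into a non-negative weight. Neither name comes from any library.

```javascript
// Return an index chosen with probability proportional to weights[i],
// or -1 if no weight is positive.
function pickWeightedIndex(weights, rand = Math.random) {
  const total = weights.reduce((sum, w) => sum + w, 0);
  if (total <= 0) return -1;
  let threshold = rand() * total;
  let lastPositive = -1;
  for (let i = 0; i < weights.length; i++) {
    if (weights[i] <= 0) continue;
    lastPositive = i;
    threshold -= weights[i];
    if (threshold < 0) return i;
  }
  return lastPositive; // only reached through floating-point rounding
}

// Choose an ad for one slot and remove it from the pool so it is not repeated.
// Returns null when the pool is empty.
function takeAd(pool, slotIndex, weightFor, rand = Math.random) {
  if (pool.length === 0) return null;
  const weights = pool.map((ad) => weightFor(ad, slotIndex));
  let i = pickWeightedIndex(weights, rand);
  if (i < 0) i = Math.floor(rand() * pool.length); // all weights zero: uniform pick
  return pool.splice(i, 1)[0];
}
```

When `takeAd` returns null the category has run out; the caller could then refill the pool from a copy of the full list, which covers the "not enough left" case.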

Since we have many times more ads than we will use on any given page display, it seems best to keep them as text rather than as Object Model Objects until we know that the particular ad will be needed. This is doubly true because each one results in a request to the e-tailer’s servers and too many requests from one browser can make those servers send generic ads rather than the particular product ads. (And of course, making unneeded requests is Not A Nice Thing To Do.)
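As a rough illustration of the text-first approach: a string held in a JavaScript variable is inert, and the browser only parses it (and only then issues any requests it implies) at the moment it is handed to the page. The sketch below uses the standard `insertAdjacentHTML` method; the function name and the element id in the usage note are placeholders.

```javascript
// Insert the chosen ads' HTML text at the end of a container element.
// Ads that are never passed here stay plain strings and trigger no requests.
function injectAds(container, chosenAdsHtml) {
  for (const html of chosenAdsHtml) {
    container.insertAdjacentHTML("beforeend", html);
  }
}
```

In a page this might be called as `injectAds(document.getElementById("sidebar-ads"), chosen)`. One pitfall worth checking against the E-tailer's fragments: `<script>` elements inserted through `innerHTML` or `insertAdjacentHTML` are parsed but not executed, so a fragment that depends on a script tag would need different handling than one built from an iframe, image, or link.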

The simplest way to maintain the ad content files is to take the code fragments provided by the E-tailer (via web page interface) and paste them into a text file, with parameter information. It will help if any formatting can be described by comments in the file. We can upload and download it at need, and we’ll probably be editing it two or three times a week. This is a rote operation that can be learned, so we can avoid additional programming interfaces. (And the only files we can write on the server are Excel files produced by the hosting service’s forms interface.)

So where can I use advice? Mostly with the tools and suggested practices for reading the hosting-server files, structuring their content to be interpreted by the program (variable numbers of parameters and defaulted parameters, keyword and numeric, would be helpful; so would a comment convention in the file), and mechanisms for injecting HTML and the subtleties and pitfalls associated with it. Since I have the Rhino Book, pointers to chapter and section would be just fine, if that’s best for you.
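To make the file-structure question concrete, here is one possible plain-text layout and a small parser for it. The format is invented for this example: lines starting with `#` are comments, a line like `[ad category=NewBooks weight=2 featured]` starts a new ad with `key=value` parameters (bare keywords become `true`), and everything up to the next header is that ad's pasted HTML. Missing parameters fall back to defaults.

```javascript
// Parse the hypothetical ad-file format described above into an array of
// objects, each with its parameters and an `html` string.
function parseAdFile(text, defaults = { category: "Uncategorized", weight: 1 }) {
  const ads = [];
  let current = null;
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line.startsWith("#")) continue; // comment line
    const header = /^\[ad\b(.*)\]$/.exec(line);
    if (header) {
      current = { ...defaults, html: "" };
      for (const pair of header[1].trim().split(/\s+/)) {
        if (!pair) continue; // header with no parameters
        const eq = pair.indexOf("=");
        if (eq < 0) {
          current[pair] = true; // bare keyword
          continue;
        }
        const value = pair.slice(eq + 1);
        // Numeric values become numbers; everything else stays a string.
        current[pair.slice(0, eq)] =
          value !== "" && !Number.isNaN(Number(value)) ? Number(value) : value;
      }
      ads.push(current);
    } else if (current && line !== "") {
      current.html += (current.html ? "\n" : "") + rawLine;
    }
  }
  return ads;
}
```

With a layout like this, maintaining the file stays a paste-and-label job: add a header line, paste the E-tailer's fragment under it, and save. The `html` field remains untouched text until an ad is actually chosen.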

Another consideration is that all the display spaces in one rendering of a page should result in just one retrieval of the ad file(s) from the server and every execution should share the same content so that each script knows what has already been displayed. This, I expect, is tied to the JavaScript language model (distinct from the Document Object Model?)
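On the sharing question: all classic (non-module) scripts on one page run in the same global scope, so a function and its cached result defined by one script are visible to every later script on that page. A sketch of a "fetch once, share everywhere" loader follows; the names, the `shown` set, and the injectable `fetchText` parameter are assumptions made for the example, and the default uses the standard `fetch` API.

```javascript
// Cached promise: created on the first call, reused by every later call.
let adLibraryPromise = null;

// Default loader: read a same-site file over HTTP as text.
function fetchAsText(url) {
  return fetch(url).then((response) => {
    if (!response.ok) throw new Error("Could not load " + url);
    return response.text();
  });
}

// Every display space calls this; only the first call hits the server.
// The resolved object carries the raw file text plus a shared record of
// which ads have already been shown on this page.
function loadAdLibrary(url, fetchText = fetchAsText) {
  if (!adLibraryPromise) {
    adLibraryPromise = fetchText(url).then((text) => ({ text, shown: new Set() }));
  }
  return adLibraryPromise;
}
```

Each display space would then do something like `loadAdLibrary("ads.txt").then((library) => { ... })`, where `ads.txt` is a placeholder file name, and record its choices in `library.shown` so later spaces can skip them.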

If you’ve read this far, I thank you for your patience and attention, and I will be grateful for any pointers you care to offer.

JavaScript is pretty much tied to the DOM. The industry-standard best practice for inserting content (or ads) into the HTML content of the page is to do it on the server side, using server-side code such as PHP.

I don’t have access to the server-side. The hosting service has a fairly narrow interface. The only place I can run code is in the browser. I can, however, store files on the server; these files can be read by URL via http (but not ftp). The only way to store data via a web page is to use a form that emails it or buries it in an Excel file.

My point about the DOM is that I don’t want to insert one hundred, or two hundred, or more, parsed, digested, and internally represented objects into it when only twenty may be needed, -especially- when I don’t know for sure whether each one will result in a request to the E-tailer’s servers, with internal data requirements probably thirty times or more that of the HTML fragment. I’d rather handle the HTML fragments as text. It’s less a burden, in space, in execution, and especially in delay to the user. I realize that my variables/arrays/whatever will be part of the larger object model, but I don’t want to insert hundreds of (X-)HTML objects, each specifying a request to the E-tailer’s servers, and then move them around and shut ninety percent or more of them off, when all I need is to manipulate the text. Unless you can convince me that it is cheaper in memory and compute time for the browser to parse and build all the linked data structures, I question this “best practice” on the grounds that I have yet to meet a browser that was fast enough, or that I could not drive into paging hell with an hour of moderate use. (And yes, I’m using Chrome and I use Firefox and this machine has 8GB and multiple processor cores with an aggregate of over 12 GHz of CPU between them.)

In terms of code complexity, letting the environment parse my “data” reduces my programming, but may make it impossible to write the parameters I need … unless the XML is just a wrapper over the HTML source text. But that opens the door to another problem: it’s easier to show someone how to cut and paste into a text file than it is to teach someone how to write the syntactically correct XML wrappers. Otherwise the difference in complexity may cut the other way; I may need to learn how to manipulate objects in the DOM when otherwise all I need is to output strings. Yes, this is a learning curve, but that’s why I’m here.

So: it has to run in the browser (but can be stored in a separate server file). The store of ad code fragments must be suitable for maintenance by someone with minimal training, including perhaps a comment syntax in the file or files. There should be room for parameters to control preferred placement and describe the kind of article advertised. And it should not place burdens an order of magnitude greater than necessary on its execution environment.

Which of these requirements violates good practice?

The difficulty with JavaScript is that it cannot be relied on from a business perspective. It may not even be turned on and enabled in a user’s web browser, so the fundamental basis of the user experience has to begin from a perspective of using no JavaScript. It is from there that JavaScript starts to become useful to enhance the user experience.

I think it would be best though for me to bow out of assisting you at this point, because we’re just going to be at cross purposes to each other.

All the best with your project.