Easy URL Parsing With Isomorphic JavaScript

Originally published at: http://www.sitepoint.com/url-parsing-isomorphic-javascript/
Most web applications require URL parsing whether it’s to extract the domain name, implement a REST API or find an image path. A typical URL structure:

[Image: URL structure]

You can break a URL string into constituent parts using regular expressions but it’s complicated and unnecessary…

Server-side URL Parsing

Node.js (and forks such as io.js) provides a URL API:
// Server-side JavaScript
var urlapi = require('url'),
    url = urlapi.parse('http://site.com:81/path/page?a=1&b=2#hash');

console.log(
	url.href + '\n' +		// the full URL
	url.protocol + '\n' +		// http:
	url.hostname + '\n' +		// site.com
	url.port + '\n' +		// 81
	url.pathname + '\n' +		// /path/page
	url.search + '\n' +		// ?a=1&b=2
	url.hash			// #hash
);

Client-side URL Parsing

There’s no equivalent API in the browser. But if there’s one thing browsers do well, it’s URL parsing and all links in the DOM implement a similar Location interface… Continue reading this article on SitePoint
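The excerpt stops before the client-side code, so the following is only a minimal sketch of the general idea and not the article's actual implementation. It assumes that an anchor element exposes the Location-like properties mentioned above and that Node's url module is available otherwise. The helper name parseUrl is hypothetical.

```javascript
// Hypothetical helper (not from the article): choose a parser per environment.
function parseUrl(href) {
  var parts;
  if (typeof document !== 'undefined' && typeof document.createElement === 'function') {
    // Browser: an anchor element parses its own href into Location-like parts.
    parts = document.createElement('a');
    parts.href = href;
  } else {
    // Node.js: fall back to the built-in url module shown earlier.
    parts = require('url').parse(href);
  }
  // Copy only the properties both environments share.
  return {
    protocol: parts.protocol,
    hostname: parts.hostname,
    port: parts.port,
    pathname: parts.pathname,
    search: parts.search,
    hash: parts.hash
  };
}

var parsed = parseUrl('http://site.com:81/path/page?a=1&b=2#hash');
console.log(parsed.hostname, parsed.port, parsed.pathname);
```

The environment check here is exactly the kind of test debated in the replies below, so treat it as a convenience rather than a guarantee.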

From the article:

[quote="ceeb, post:1, topic:114532"]
As you can see in the snippet above, the parse() method returns an Array containing the data you need such as the protocol, the hostname, the port, and so on.
[/quote]

parse() returns an object, not an array. :wink:

Well spotted. We’ll get it changed…

The isNode variable should be renamed to isCommonJs, because that check doesn’t tell you whether you’re running in Node. Consider what happens when someone uses this script with JSPM or with Browserify.

That’s a good point but I took the route which was most likely to succeed. Unfortunately, there’s no guaranteed way to distinguish Node from client-side JavaScript. Or not that I’m aware of. In some ways, that’s good. In others, it’s painful!

Hi Craig!
What am I missing? Why is a check for the existence of document or window not a guaranteed way to distinguish NodeJS from browser?

Because a Node.js program could easily set similar variables, e.g.

var window = {}, document = {};

There are DOM parsing libraries which do this sort of thing to enable server-side processing.

Similarly, a client-side JS program can define…

var module = { exports: function() { ... } };

Part of the beauty - and frustration - of JavaScript is anything can appear to be native. Detecting whether code is running on Node or in a browser isn’t guaranteed to work. But perhaps that’s a good thing?

Hi Craig!

OK, but that would mean one has to write something like:

global.document = {};
global.window = {};

…in nodejs right before the require call to fool a check in a module and override / change the global object.
That’s definitely bad practice, but I agree that the check is unsafe because one can work around it as described.
Thanks for the hint!
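To make the workaround described above concrete, here is a small Node.js sketch. It assumes a plain Node process with no window or document globals defined beforehand, and the function name looksLikeBrowser is made up for illustration.

```javascript
// Illustrative check of the kind discussed in this thread.
function looksLikeBrowser() {
  return typeof window !== 'undefined' && typeof document !== 'undefined';
}

// In plain Node.js neither global exists yet.
var before = looksLikeBrowser();

// Any code that runs earlier can add look-alike globals...
global.window = {};
global.document = {};

// ...and the very same check now reports a browser.
var after = looksLikeBrowser();

// Clean up so the rest of the program is unaffected.
delete global.window;
delete global.document;

console.log('before:', before, 'after:', after);
```

The check itself never changes; only the global object does, which is why it cannot be relied on as proof of the environment.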

Globals would do it, but so would any document or window variable defined in a module before isNode is set. While that would seem bad practice, what if you loaded a module named “window”…

var window = require('window');

Sure, you can shoot yourself in the foot in various ways. :wink:
What I thought of was more like the following:

A utility module env.js

module.exports = {
    isBrowser: function() {
        return typeof window !== 'undefined' && typeof document !== 'undefined';
    }
};
console.log('document: ', typeof document);
console.log('window: ', typeof window);

And some app.js

// global.document = {}; global.window = {};
// or
// require('/some/malicious/module/which/alters/globals');
var env = require('./env.js'); // relative path, so Node loads the local file
console.log('isBrowser: ', env.isBrowser());

As long as you don’t run code that manipulates the globals before requiring env.js, it should be “safe”.
Assigning the return value to a variable called window only affects the module scope.
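The scoping argument can be tried out in a single file. Node.js wraps every module in a function, so this sketch uses two function scopes as stand-ins for env.js and app.js. That simulation is an assumption made for brevity, not how the files above are actually loaded.

```javascript
// Stand-in for env.js: its check only ever sees real globals.
var env = (function () {
  return {
    isBrowser: function () {
      return typeof window !== 'undefined' && typeof document !== 'undefined';
    }
  };
}());

// Stand-in for app.js, which declares its own local window and document.
var results = (function () {
  var window = {}, document = {};
  return {
    // A check written inside this scope is fooled by the local variables...
    localCheck: typeof window !== 'undefined' && typeof document !== 'undefined',
    // ...but the check living in the other "module" is not.
    envCheck: env.isBrowser()
  };
}());

console.log('localCheck:', results.localCheck, 'envCheck:', results.envCheck);
```

So a local variable named window only misleads checks made in the same scope, while changes to the global object mislead every module.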

Another interesting approach is this one: http://www.timetler.com/2012/10/13/environment-detection-in-javascript/
I did not expect such a “trivial” problem to be unsolved still. I bet Node.js and io.js will come up with a safe solution once isomorphic code becomes more relevant.
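Until something official exists, one heuristic is to look at process.versions.node, which Node.js itself sets. This is only a sketch: the function name isProbablyNode is made up, and like every check in this thread it can be spoofed by code that defines a similar process object.

```javascript
// Heuristic only: looks deeper than the mere existence of a process object,
// because other tools may define one of their own.
function isProbablyNode() {
  return typeof process !== 'undefined' &&
    process !== null &&
    typeof process.versions === 'object' &&
    process.versions !== null &&
    typeof process.versions.node === 'string';
}

console.log('isProbablyNode:', isProbablyNode());
```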

How would you load env.js client-side? Unless you were using Browserify or similar, you couldn’t depend on env.isBrowser(). But if Browserify is a dependency, it’s simpler to check for the existence of that … in which case, you don’t need env.js and your application disappears in a puff of logic!

Yeah, it’s surprising that Node’s been around for 5 or 6 years yet this remains a problem.
