Accessing pseudo-elements

While I’m on a hunt for a wrapped text drop cap, I have reached a point where I need to dismember my text in a <p> into <span>s, rendered line by rendered line.

Now, I don’t know if rendered text from that <p> will show \r\n when read from the DOM. I guess I would go with that, but I need to research.

One other thing I need to research is the following. I’ve thought of an algorithm involving the :first-line pseudo-element. My idea is to build these <span>s by repeatedly taking the text wrapped in :first-line, while at the same time trimming the <p> text by just that much, so the following line of text would become the new :first-line.

Something like this:

  1. create a new <p>
  2. take :first-line text from old <p>
  3. create a <span> for it
  4. add the <span> to new <p>
  5. delete :first-line text from old <p>
  6. repeat steps 2-5 with the trimmed old <p> until condition
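
The six steps above can be sketched as a plain loop. Since the DOM does not hand over the text covered by :first-line, this sketch takes a hypothetical takeFirstLine(text) callback (my assumption, not a real API). Here a fixed character budget stands in for the browser's layout, and all function names are made up for illustration.

```javascript
// Hypothetical stand-in for "the text covered by :first-line":
// greedily takes whole words until a character budget is used up.
function takeFirstLineByWidth(maxChars) {
  return function (text) {
    const words = text.split(' ');
    let line = words.shift();
    while (words.length && (line + ' ' + words[0]).length <= maxChars) {
      line += ' ' + words.shift();
    }
    return line;
  };
}

// Steps 1-6: keep taking the first line off the old text until nothing is left.
function peelLines(text, takeFirstLine) {
  const lines = [];                          // step 1: the "new <p>"
  let rest = text.trim();
  while (rest.length > 0) {                  // step 6: repeat until the old text is empty
    const first = takeFirstLine(rest);       // step 2
    if (first.length === 0) break;           // guard against a callback that takes nothing
    lines.push(first);                       // steps 3-4: one entry per future <span>
    rest = rest.slice(first.length).trim();  // step 5
  }
  return lines;
}

// Wrap each collected line in its own <span>.
function toSpans(lines) {
  return lines.map(function (l) { return '<span>' + l + '</span>'; }).join('');
}
```

The "until condition" in step 6 becomes "until the old text is empty". The hard part is still step 2, which this sketch only fakes.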

Can anyone spare me the research and just hand it to me? :blush: Does it show \n? Can I use :first-line in the DOM with JS?

Big thanks.

“newp” is the string that contains the contents of the paragraph, with each space <span>ed. It has to be “stored” because otherwise, when the window is resized, the function will be applied on text that already has <span>s (wrapping each line). It also makes it that little bit faster the second time around.

function spanify(p, newp) {
  // if newp isn't supplied as an argument, that means
  // this was called for the first time (not by resize event)
  if (!newp) {
    // replace spaces with spans
    var newp = p.innerHTML.replace(/\s+/g, '<span> </span>');
    // every time the window is resized, call the function, passing the string newp over.
    window.onresize = function() {spanify(p, newp)}
  }
  // replace the contents of the paragraph with the string we just created
  // when the window is resized, newp already exists, so this is effectively where the function starts
  p.innerHTML = newp + '<span> </span>';
  // create a nodeList of the new span elements so we can loop through them
  // also find offsetTop of first span 
  var spans = p.getElementsByTagName('span'), offset = spans[0].offsetTop, firstline = '', lines = '';
  for (var i = 0, j = spans.length; i < j; i++) {
    // if we have reached the first span of the next line...
    if (spans[i].offsetTop !== offset) {
      // that means 'firstline' is now complete
      // add this complete line to the "lines" string
      lines += '<span>' + firstline + '</span>';
      // reset the reference offset to the new offset of the next line
      offset = spans[i].offsetTop;
      // and reset firstline as well
      firstline = '';
    }
    // add the current span contents to the firstline var (might not be the first one any more though)
    firstline += spans[i].previousSibling.nodeValue + ' ';
  }
  // this is necessary, otherwise the last line gets missed out
  lines += '<span>' + firstline + '</span>';
  // replace paragraph contents with each <span>ed line 
  p.innerHTML = lines;
}

// run the function for the first paragraph in the document
spanify(document.getElementsByTagName('p')[0]);

That way we reduce the number of loops (computing time)?

That’s beyond overkill. What we’re doing here is peanuts in terms of computing. You’d only need to start thinking about stuff like that if we were dealing with a paragraph containing, say, 1000+ words.

Before I go any further, let me see if I understand the code correctly.

spanify (two <p>; the second one optional) {

if the second one not specified {
create a new var that holds our p, only with spaces <span>ed.
associate the resize event with the actions performed by this function (isn’t it redundant, as it will make the association on every call? and isn’t the association missing in the case of a second (clone) <p>?)
}

add a new <span>ed space at the end of our paragraph

build the array of this <span>ed spaces; get the first offset; initialize empty line

loop through the array {
if we find a difference in offset values {
build the current <span>ed line from firstline variable element
reset the offset and firstline
}
get the words, put space between them and put them in firstline variable
}

put the last line

replace <p> content with new content

? where is the resize event for this part?
}

OK?

I love it ! It’s getting there! But it’s not impossible with the second loop. Not entirely.

What I’ve said about building that array of spaces: in order to optimize our search, we can use a divide-et-impera algorithm (something like binary search etc.):

  • divide the vector in half;
  • and then in another half, and so on;
  • we compare two spaces; when finding two spaces with different offsets, we can exclude their immediate neighbours from the algorithm. Something like minesweeper.

That way we reduce the number of loops (computing time)? But that’s for later and that’s why I thought of a vector, an array for these <span>ed spaces. Which you provided! Thank you!
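
A divide-and-conquer search is possible here if one thing holds: the offsetTop values never decrease as you walk through the spaces in document order. I'm assuming that is the case for an ordinary paragraph. Under that assumption, a rough sketch is a binary search for the last space on each line. The array tops is a plain array standing in for spans[i].offsetTop, and the function name is made up.

```javascript
// Returns the index of the last entry on each line, given the offsetTop of
// every spanned space in document order.
// Assumption: tops never decrease from one entry to the next.
function lastIndexPerLine(tops) {
  const ends = [];
  let start = 0;
  while (start < tops.length) {
    const top = tops[start];
    let lo = start;
    let hi = tops.length - 1;
    // Binary search for the last index whose top still equals this line's top.
    while (lo < hi) {
      const mid = Math.ceil((lo + hi) / 2);
      if (tops[mid] === top) {
        lo = mid;
      } else {
        hi = mid - 1;
      }
    }
    ends.push(lo);
    start = lo + 1; // continue with the first entry of the next line
  }
  return ends;
}
```

This reads fewer offsets than a full loop when lines are long, but for a paragraph of normal size the plain loop is probably just as good.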

Thanks for the addition. Strangely enough, IE showed the last line. Opera does not comply. Firefox is a good boy. Chrome doesn’t like the sizing up. Works fine with minimize/maximize or sizing down.

As for wrapping spaces instead of the words being pretty much the same thing: well, yes, it’s pretty much the same number of <span>s (actually 2 fewer). But 1-char <span>s don’t take up as much memory as <span>s around whole words. And it also resolves the problem of words splitting over two lines.

Think you can adjust the code? Pretty please !

Actually I think that using/<span>ing the spaces between words and not the actual words in the paragraph, resolves the word-splitting issue. And I believe it leads to a better/faster/simpler solution for this line-splitting algorithm that uses the difference in offset.

This word-wrapping must come with a “trace”… for JS to “sniff”.

Thanks for now. I’ll sleep on it and give them a go, both your spanify and getComputedStyle. I’ll let you know how it goes.

Thanks again. Have a good one.

Thanks Raffles, will try. It’s a very good solution.

I’ve come across this solution you got for me, but I’d hoped for a different one, based on CSS styling read by JS.

Is there really no JS/DOM way of telling what the text for the :first-line is if, let’s say, I have a different CSS style for it (since it’s rendered differently)? I mean, by testing the style word by word, or char by char. When the UAs calculate where the :first-line stops, they must have something somewhere to base their decision on. And I guess JS is the tool to access that something.

And when reading the text content of the node (not via innerHTML), isn’t JS reading the rendered result, which should contain \n control characters for the lines created? 'Cause I believe this info exists somewhere, even if only temporarily, until the rendering parameters change.

Thanks again.

You can’t just “target the last space in a line” without studying the entire line. JavaScript has no idea of “lines”. This is why it’s necessary to look at every single word/space until they change offsetTop values.
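
To make the "look at every one until offsetTop changes" idea concrete without a browser, here is a minimal sketch on plain data. Each item pairs a word with a top value that stands in for the offsetTop measured next to it. The item shape and the function name are assumptions for illustration only.

```javascript
// items: [{ text: 'word', top: 0 }, ...] in document order, where `top`
// stands in for the offsetTop measured on the span next to the word.
function groupIntoLines(items) {
  const lines = [];
  let current = [];
  let top = items.length ? items[0].top : 0;
  for (const item of items) {
    if (item.top !== top) {          // first item of the next line
      lines.push(current.join(' '));
      current = [];
      top = item.top;
    }
    current.push(item.text);
  }
  if (current.length) {
    lines.push(current.join(' '));   // don't lose the last line
  }
  return lines;
}
```

Nothing here knows what a line is. Lines only show up as a change in the measured value, which is the whole trick.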

This is still rather nebulous:

But I would like this code to target only spaces. […] Get a vector with the ones last in the line of text. Then use their positions to insert </span><span> text in innerHTML (or nodes) at those positions.
It’s pretty hard to see exactly what you mean. Perhaps you mean string replacement rather than looping:

function spanify(p, newp) {
  if (!newp) {
    var newp = p.innerHTML.replace(/\s+/g, '<span> </span>');
    window.onresize = function() {spanify(p, newp)}
  }
  p.innerHTML = newp + '<span> </span>';
  var spans = p.getElementsByTagName('span'), offset = spans[0].offsetTop, firstline = '', lines = '';
  for (var i = 0, j = spans.length; i < j; i++) {
    if (spans[i].offsetTop !== offset) {
      lines += '<span>' + firstline + '</span>';
      offset = spans[i].offsetTop;
      firstline = '';
    }
    firstline += spans[i].previousSibling.nodeValue + ' ';
  }
  lines += '<span>' + firstline + '</span>';
  p.innerHTML = lines;
}

spanify(document.getElementsByTagName('p')[0]);

It’s even simpler now and the regular expression can deal with tabs and newline characters too. But it’s impossible to not do the second loop.
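
For anyone wondering what the regular expression buys over split(' '), here is a small comparison on a plain string. The helper name spanSpaces and the sample string are made up. Note that on innerHTML containing markup, \s+ would also hit spaces inside tags (e.g. between a tag name and its attributes), so this assumes a plain-text paragraph.

```javascript
// Hypothetical helper: replace every run of whitespace with one spanned space.
function spanSpaces(text) {
  return text.replace(/\s+/g, '<span> </span>');
}

const sample = 'one\ttwo\nthree  four';

// split(' ') only knows about single spaces: the tab and newline stay glued
// to their words, and the double space yields an empty entry.
const bits = sample.split(' ');

// \s+ treats the tab, the newline and the double space as one gap each.
const spanned = spanSpaces(sample);
```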

Wow, that was fast! And it’s also pretty! You also covered all bases with extra <p> or not! Thank you so much!

I’ll put it to work and take it under advisement. I guess I’m only looking at it twice because of the innerHTML thing. About the resize, it’s a better choice than mine, I guess. There’s something I can’t grasp there, though. It applies to only one use case

spanify(document.getElementsByTagName('p')[0]);

and only once? Thanks again. I’m learning.

Still, I would love to see a space-between-words implementation. 'Cause I got some extra plans for that, a change in algorithm.

And this brings up another problem: it only works when words are not splitting between lines.

This reminds me, Firefox for example can wrap at - and / I believe. I remember some browsers wrapping long urls in anchor tags and others (opera for example) not. So, each browser has its own list of what it considers a word-break.

Which makes me wonder if \b metacharacters in regexen are affected by whether or not a browser considers non-space chars to be “wrappable”.
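
As far as I know, \b is defined purely by the language: it is the boundary between a \w character (letters, digits, underscore) and anything else. Where a browser chooses to wrap text does not enter into it. A quick sketch, with a made-up helper name:

```javascript
// \b marks the edge between a \w character ([A-Za-z0-9_]) and any other
// character (or the start/end of the string). Layout plays no part.
function wordsOf(text) {
  return text.match(/\b\w+\b/g) || [];
}

// Hyphens, slashes, colons and dots all count as non-word characters,
// whether or not a given browser would break a line at them.
const parts = wordsOf('well-known and/or http://a.b');
```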

CSS seems to access the DOM in a completely different way than JavaScript, which might explain why it’s somehow a huge task for CSS to have parent selectors, while for JavaScript it’s just a .parentNode away.

I’m sorry to keep bugging you.

To be clear, p.innerHTML.split(' ') is not the approach I have in mind. Your solution of shifting <span> tags is teaching me something, and I like it. But I would like this code to target only spaces. It’s doable with your current code, yes. But never to “load” the words in bits. Get a vector with the ones last in the line of text. Then use their positions to insert </span><span> text in innerHTML (or nodes) at those positions.

I meant window.onresize = function() {spanify(p, newp)}. It only applies in one case, when only a <p> is passed as a parameter?

And probably to use some divide-et-impera techniques to speed it up, as after finding a different offset, it’s unlikely that two adjacent spaces will be the last ones on their lines.

Am I sounding like a client from hell? (:

@Raffles

A small correction: in order to split lines into <span>s, you need to first check the offset, then add the word to firstline, so I’ve moved the condition up when checking for offset and building the lines.

function spanify(p) {
  var bits = p.innerHTML.split(' '), newp = '';
  for (var  i = 0, j = bits.length; i < j; i++) {
    newp += '<span>' + bits[i] + " " + '</span>';
  }
  p.innerHTML = newp;
  var spans = p.getElementsByTagName('span'), offset = spans[0].offsetTop, firstline = '';
  newp = '';
  for (var i = 0, j = spans.length; i < j; i++) {
    if (spans[i].offsetTop !== offset) {
      newp += '<span>' + firstline + '</span>';
      offset = spans[i].offsetTop;
      firstline = '';
    }
    firstline += spans[i].innerHTML;
  }
  p.innerHTML = newp;
}

This approach is bad, as it takes up with a lot of words. I’m thinking of a reverse approach: <span>ing the spaces between words instead of the actual words, and looking for a different offset among them, while building a vector of the positions of the spaces where the offset changes. Later, I use this vector to “cut” the paragraph using string functions.
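
The "cut" step on its own might look something like this. It assumes the vector already holds the character positions of the spaces that end a line. How to get those positions is the part that still needs the offset check, and both function names are made up.

```javascript
// text: the paragraph text; breakAt: indexes (into text) of the spaces that
// end a rendered line, in ascending order. Both inputs are assumed here.
function cutIntoLines(text, breakAt) {
  const lines = [];
  let start = 0;
  for (const pos of breakAt) {
    lines.push(text.slice(start, pos));
    start = pos + 1;                // skip the space itself
  }
  lines.push(text.slice(start));    // whatever is left is the last line
  return lines;
}

// Wrap each line in a <span>, putting the removed spaces back between them.
function wrapLines(lines) {
  return lines.map(function (l) { return '<span>' + l + '</span>'; }).join(' ');
}
```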

And this brings up another problem: it only works when words are not splitting between lines. But this is not that big of a problem at this point.

What do you think?

This is a space-between words implementation! The spaces are wrapped with <span> tags, not the words. If you mean something else, clarify!

Well, the way I have it it works for the first <p> element in the document. But you could call spanify() and pass whatever you want as the argument. You could do it for all paragraphs by looping through them, calling spanify() for each one.
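
One thing to watch when looping: the resize-aware version assigns window.onresize, and every assignment replaces the previous handler, so only the last paragraph would re-wrap on resize. A rough sketch of one way around that, using addEventListener. Here spanifyOne is a hypothetical function that re-wraps one paragraph and remembers its original text itself, and target is whatever object you listen on (window in a page).

```javascript
// paragraphs: array-like of elements; spanifyOne: function (p) that re-wraps
// one paragraph; target: the object to listen on (window in a page).
// All three names are assumptions for this sketch.
function spanifyAll(paragraphs, spanifyOne, target) {
  Array.prototype.forEach.call(paragraphs, function (p) {
    spanifyOne(p);
    // addEventListener keeps every handler; assigning target.onresize would
    // keep only the last one.
    target.addEventListener('resize', function () {
      spanifyOne(p);
    });
  });
}

// In a page, usage might be:
// spanifyAll(document.getElementsByTagName('p'), myRewrap, window);
```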

What does “takes up with” mean?

Wrapping spaces instead of the words is pretty much the same thing, requiring pretty much the same number of SPAN elements. I really don’t see another solution.

I admittedly didn’t test the code at all, so thanks for spotting the mistake. I also noticed the last line wasn’t added, which was a simple fix:

function spanify(p) {
  var bits = p.innerHTML.split(' '), newp = ''
  for (var  i = 0, j = bits.length; i < j; i++) {
    newp += '<span>' + bits[i] + " " + '</span>';
  }
  p.innerHTML = newp;
  var spans = p.getElementsByTagName('span'), offset = spans[0].offsetTop, firstline = '';
  newp = '';
  for (var i = 0, j = spans.length; i < j; i++) {
    if (spans[i].offsetTop !== offset) {
      newp += '<span>' + firstline + '</span>';
      offset = spans[i].offsetTop;
      firstline = '';
    }
    firstline += spans[i].innerHTML;
  }
  newp += '<span>' + firstline + '</span>';
  p.innerHTML = newp;
}

I think it works quite well, considering it’s a rather hacky workaround.

That said, what you want to do is going to be very difficult, because there is no way with JavaScript to tell where the browser has decided to start the second line of text. JavaScript is pretty much entirely ignorant of pseudo-classes and pseudo-elements, though for pseudo-classes there are sometimes ways to achieve similar effects (e.g. the mouseover+mouseout events for :hover).

However, for :first-line I think the only solution is to do it by chopping up the paragraph word by word, and wrapping each word in a <span>. Then loop through them and stop when you reach one that has a larger offsetTop than before (i.e. it’s on a new line). Now you know which words make up the first line.

function spanify(p) {
  var bits = p.innerHTML.split(' '), newp = ''
  for (var  i = 0, j = bits.length; i < j; i++) {
    newp += '<span>' + bits[i] + " " + '</span>';
  }
  p.innerHTML = newp;
  var spans = p.getElementsByTagName('span'), offset = spans[0].offsetTop, firstline = '';
  newp = '';
  for (var i = 0, j = spans.length; i < j; i++) {
    firstline += spans[i].innerHTML;
    if (spans[i].offsetTop !== offset) {
      newp += '<span>' + firstline + '</span>';
      offset = spans[i].offsetTop;
      firstline = '';
    }
  }
  p.innerHTML = newp;
}

Now each line should be in a <span>. It’s a bit hacky, but should work. The argument for the function should be a P element (the actual DOM element).

Is there really no JS/DOM way of telling what is the text for the :first-line, if, let’s say, I’m having a different CSS style for it (since it’s rendered differently)? I mean, by testing style word by word, or char by char.

I’d forgotten about this. getComputedStyle has an argument for pseudo-elements. I’m tired now, but you might want to give it a go and see if it works. Unfortunately IE doesn’t support it at all so even if it works it might be of no practical use.
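
If you do try it, the call would look roughly like this. I have not verified which browsers actually report ::first-line styles this way. At best it tells you how the first line is styled, not which words are on it. The wrapper name and its win parameter are made up so it can be tried with a stand-in object.

```javascript
// win: a window-like object; el: the paragraph element; property: a CSS
// property name such as 'font-weight'.
function firstLineStyle(win, el, property) {
  // The second argument to getComputedStyle names the pseudo-element.
  return win.getComputedStyle(el, '::first-line').getPropertyValue(property);
}

// In a page, usage might be:
// firstLineStyle(window, document.getElementsByTagName('p')[0], 'font-weight');
```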

Other than that, there is no way I know of for JavaScript to know that the first line of some text is styled differently to the rest via :first-line. It’s not “reported” anywhere (and it can’t be defined in the style object (style=" ")). It seems to exist directly between CSS and the browser’s rendering engine, without any effect on the DOM (thus being invisible to JavaScript).

Regarding newlines (\n), JavaScript can detect them. But there will not be a \n at the end of each line, unless the lines were created by someone actually pressing “enter” rather than the browser simply word-wrapping the text in a container. When the browser word-wraps, it doesn’t introduce \n characters at those points.
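
In other words, splitting on newline characters only finds the breaks that are really in the text. A tiny sketch on plain strings (the helper name is made up; in a page you might feed it p.textContent):

```javascript
// Splits on newlines that are actually present in the string
// (\r\n, \r or \n). Wrapping done by the browser leaves no such characters.
function hardLines(text) {
  return text.split(/\r\n|\r|\n/);
}

const typed = hardLines('first line\nsecond line');
const wrappedOnly = hardLines('a long sentence that the browser may wrap anywhere it likes');
```

However narrow the container gets, the second string still comes back as a single entry.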

Regarding the amount of memory, it’s pretty much identical, because <span>we’re</span> <span>dealing</span> <span>with</span> <span>strings</span> (and innerHTML) so it doesn’t matter where the <span> tags are. Still, you might have a point with the words splitting over lines issue. It was a trivial fix anyway, just moving the <span> tags around.

I also realised that it would all get messed up if the window was resized, so added something to deal with that:

function spanify(p, newp) {
  if (!newp) {
    var bits = p.innerHTML.split(' '), newp = ''
    for (var  i = 0, j = bits.length; i < j; i++) {
      newp += bits[i] + '<span> </span>';
    }
    window.onresize = function() {spanify(p, newp)}
  }
  p.innerHTML = newp;
  var spans = p.getElementsByTagName('span'), offset = spans[0].offsetTop, firstline = '', lines = '';
  for (var i = 0, j = spans.length; i < j; i++) {
    if (spans[i].offsetTop !== offset) {
      lines += '<span>' + firstline + '</span>';
      offset = spans[i].offsetTop;
      firstline = '';
    }
    firstline += spans[i].previousSibling.nodeValue + ' ';
  }
  lines += '<span>' + firstline + '</span>';
  p.innerHTML = lines;
}

spanify(document.getElementsByTagName('p')[0]);

I’m afraid I can’t test with IE as I’m using Ubuntu.