[WordPress] Unwanted HTML removal

Hi,

I’ve searched around for how to remove unwanted HTML tags which are added by TinyMCE, etc.
The only solution I’ve come across removes all p tags by disabling various filters.

This isn’t any good for semantic page structure.
Does anyone have any tips which could point me in the right direction?

Cheers.
Michael

Hi,

Which tags is TinyMCE adding to your markup?

You can customize a great many aspects of TinyMCE’s behaviour using the tiny_mce_before_init filter.
So, if you can find out exactly what is bothering you, you can more than likely do something about it.

For example, adding the following code to your functions.php file will ensure that TinyMCE produces <br /> tags instead of <p> tags when you press enter.

function change_mce_options($initArray) {
  $initArray['forced_root_block'] = false;
  $initArray['force_br_newlines'] = true;
  $initArray['force_p_newlines'] = false;
  return $initArray;
}

add_filter('tiny_mce_before_init', 'change_mce_options');

In this case, if you want a new <p> tag, leave a blank line, e.g.

This
is all in
one <p> tag.

This is in a second
one.

Hi,

Thanks for the reply.
My problem is that nearly everything is being wrapped in p tags, even header tags and div tags.

I’ve written a few shortcodes which output markup for the likes of jQuery accordions. I need the user to be able to write a clean, well-formatted shortcode structure so they can maintain it more easily. The only fix I’ve found is to remove every single line break within the structure, but that makes it almost unmaintainable.

I’ve also got another shortcode which simply inserts an opening and closing div tag into the content, to be styled as a divider, and even that is getting wrapped.

Regards
Michael

Does adding the above code to your functions.php help at all?

There is also a function called wpautop which changes double line-breaks in the text into HTML paragraphs (<p>…</p>).
You can disable this from within functions.php:

remove_filter( 'the_content', 'wpautop' );

There’s also a plugin available to enable/disable the filter on a post-by-post basis. Maybe that might help:
http://wordpress.org/extend/plugins/wpautop-control/

Hi,

Your solution didn’t work the way I’d like it to.
I want to keep the paragraphs which are created by the user.

Disabling wpautop is no good either, as that removes all paragraphs.

Basically, I only want markup that the user has entered, including paragraphs, to be output.

This is my problem:

What I want is this:


[accordion]
[accordion_item]
<p>Paragraph</p>
[/accordion_item]
[accordion_item]
<p>Paragraph</p>
[/accordion_item]
[/accordion]

But instead I get this:


<p>[accordion]<br />
[accordion_item]</p>
<p>Paragraph</p>
<p>[/accordion_item]<br />
[accordion_item]</p>
<p>Paragraph</p>
<p>[/accordion_item]<br />
[/accordion]</p>

I’m not sure whether adding a preg_replace to the shortcode would help, as not every user will format the content the way I did above.

Any tips?

Cheers

Yeah, that’s a tricky one.
It would be easier if you knew how you are expecting your users to format the input.

But what if you run wpautop from within your shortcode?

function accordion_shortcode($atts, $content = null) {
   $content = wpautop(trim($content));
   return '<div class="accordion">' . $content . '</div>';
}
add_shortcode('accordion', 'accordion_shortcode');

This was suggested here: http://sww.co.nz/solution-to-wordpress-adding-br-and-p-tags-around-shortcodes/

Other than that, there is a good plugin which allows you to wrap code in [raw] tags and have WordPress leave it as it is (i.e. not format it):
http://wordpress.org/extend/plugins/raw-html/

If you set the wpautop filter so that those extra p elements get removed, why not use the HTML (instead of visual) view of the editor? That would do what you want. If the user enters markup anyway, there’s no reason not to use the HTML view. That way you get exactly what you entered, and WP does not add extra markup either.

That method isn’t any good.
I added the clean ‘markup’ I provided in a previous post in the HTML editor, switched to the visual editor and then immediately back, and it added the unwanted markup.

I need the markup to be clean, and using the HTML editor is not an option for a lot of users: they may not have HTML experience, and many of them will not want to gain any.

Cheers

For now I’ve implemented a cleanup function which removes all the unwanted markup surrounding the shortcode tags, i.e. ‘<p>[shortcode_name]</p>’ is cleaned up to ‘[shortcode_name]’, etc.
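
A cleanup function along those lines could look roughly like the sketch below. This is not the poster's actual code. The function name strip_shortcode_wrappers is made up for illustration, and the priority of 10 assumes WordPress's default setup, where wpautop runs on the_content at priority 10 and do_shortcode at priority 11, so a filter registered later from functions.php at priority 10 should run between the two.

```php
<?php
// Hypothetical helper (name is an assumption): removes the <p> and <br />
// markup that wpautop places directly around shortcode tags, while leaving
// ordinary paragraphs between the tags untouched.
function strip_shortcode_wrappers($content) {
    $replacements = array(
        '<p>['    => '[',  // opening <p> directly before a shortcode tag
        ']</p>'   => ']',  // closing </p> directly after a shortcode tag
        ']<br />' => ']',  // <br /> directly after a shortcode tag
    );
    // strtr() performs all replacements in a single pass over the string.
    return strtr($content, $replacements);
}

// Only register the filter when running inside WordPress.
// Assumption: priority 10 places this after wpautop and before do_shortcode.
if (function_exists('add_filter')) {
    add_filter('the_content', 'strip_shortcode_wrappers', 10);
}
```

Note that this is a plain string replacement, so it will also strip a <p> in front of any other text that starts with a square bracket, and it will drop the closing </p> of a paragraph that ends with a shortcode. If that matters, a preg_replace restricted to the known shortcode names would be a safer variant.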

The visual editor was never meant for users to add their own markup to begin with, so naturally it will add markup, that’s the point of the visual editing mode.

Have you looked at alternative text entry methods? E.g. using something like Markdown or Textile? They’re a lot cleaner but may not be for everyone.