Some strange ASCII characters are afoot

On my website users can submit testimonials that get put into a database. Apparently some of these people type up their testimonial in a word processor first, then copy/paste it into my form, because sometimes I encounter issues with what they submit.

Today I found an example on my website where, at first glance, it looked like there was not a proper hard return between paragraphs, but it’s likely the way I’m displaying text from the database. I copied the paragraphs and pasted them into “Clean Text”, which is a cool app for Macs. In there it showed that there were actually two paragraphs.

Next I put my cursor at the beginning of the second paragraph and expected to hit the backspace key twice, which should have eliminated the hard return. But in this case it took hitting the backspace key three times. So this made me think there is an extra unseen character in the text.

Next I went to asciivalue.com, pasted in my text example, and it spit out the decimal values of all the ASCII characters. Below is a screenshot:

You can see that my example of text was, “improved. I started”. So these results are showing several characters after the period. I’m not sure if this comes from a person’s word processor when they are submitting their testimonial or what.

Can someone help me with the next step? I think the best way to fix these types of issues in the database is to run a MySQL query that will locate all of the places where this happens. Has anyone run into this problem before who can help me figure out what to do next?

Thanks!

I would address this in the script handling the testimonial submissions.
Would that be written in PHP?

Yes, the site uses PHP. But what I need help with is to identify why there seems to be 5 characters that represent the two spaces after the period. Can you help?

Hi,

This is not something that I have ever run into, but as I said, it would probably make more sense to handle this before the testimonials are saved in the database.

I’ll move this to the PHP forum and see if anyone else can help.

Going by the Decimal values the characters are

..... 
46 period 
13 carriage return 
10 new line 
32 space 
13 carriage return 
10 new line 
..... 

What to do about them depends on how they are messing things up when you are using the values.
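If you would rather check the values from PHP instead of a website, here is a minimal sketch. The helper name dumpCharCodes() and the sample string are my own assumptions, not anything from the site's code; it relies only on the built-in str_split() and ord() functions.

```php
<?php
// Hypothetical helper: list the decimal code of every byte in a string.
function dumpCharCodes(string $text): array
{
    $codes = [];
    foreach (str_split($text) as $byte) {
        $codes[] = ord($byte);
    }
    return $codes;
}

// Sample fragment, assumed to match the submitted testimonial.
$sample = "improved.\r\n \r\nI started";

// Show the period and the five characters that follow it.
echo implode(' ', dumpCharCodes(substr($sample, 8, 6))), PHP_EOL;
// prints: 46 13 10 32 13 10
```

Running this against a value read from the database should show whether the same CR, LF, space, CR, LF run is present there.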

Maybe you’re looking for nl2br() ?

Those characters add a blank line in your text

It adds more than just a blank line. See the conversation above.

It adds a line with one space on it. So it’s blank visually, but there is one space character on it. That’s why you had to press backspace one extra time.

Yes, but when viewing the text on a webpage, the two paragraphs are not separated by a blank line. This is what I’m trying to fix. Do you have any suggestions that will achieve what I’m seeking?

Thank you.

Of course they aren’t. The browser treats newline characters in HTML as ordinary whitespace.
It only breaks a line when it encounters a <br> tag.
If you want to convert your newlines into <br> tags, use the nl2br() function, which @Mittineague already mentioned earlier.
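As a rough sketch of what that could look like at display time: the function name renderTestimonial() is made up for illustration, and escaping with htmlspecialchars() first is my own assumption about how user-submitted text should be handled, not something taken from your code.

```php
<?php
// Hypothetical helper: prepare a stored testimonial for HTML output.
function renderTestimonial(string $body): string
{
    // Escape the user-supplied text first, then let nl2br() insert
    // a <br /> tag before each line break (\r\n counts as one break).
    return nl2br(htmlspecialchars($body, ENT_QUOTES, 'UTF-8'));
}

echo renderTestimonial("It improved.\r\n \r\nI started over.");
```

With the CR, LF, space, CR, LF run from this thread, that produces two <br /> tags in a row, so the browser shows the blank line between the paragraphs.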

Thanks everyone for your input. The following query is what I used to find the problematic paragraphs:

select dateAdded, testimonialID from testimonials where body LIKE CONCAT('%', CHAR(46), CHAR(13), CHAR(10), CHAR(32), CHAR(13), CHAR(10), '%');

If it’s the extra blank line you’re searching for, I’d probably have left the CHAR(46) off the start of that query, as that’s the full-stop ( . ) from the end of the preceding paragraph. So if you get someone who isn’t so good at punctuation but still has the extra blank line, you’ll miss it.

droopsnoot, when I altered the query as you suggested, it simply found paragraphs that ended in a ! or a :, etc. The query I posted seems to do the trick.

Thanks though!

Yes, I’d expect it would. I was intending it would find any place you had a CR, LF, space, CR, LF sequence, so any paragraph end mark (including no paragraph end) would match.
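For what it’s worth, the same check could be done in the PHP submission script rather than in a query. This is only a sketch under my own assumptions: both function names are hypothetical, and it looks for exactly the CR, LF, space, CR, LF run discussed above, whatever character comes before it.

```php
<?php
// The exact byte sequence from this thread: CR, LF, space, CR, LF.
const SPACE_ONLY_LINE = "\r\n \r\n";

// Hypothetical helper: true if the text contains the sequence anywhere.
function hasSpaceOnlyLine(string $body): bool
{
    return strpos($body, SPACE_ONLY_LINE) !== false;
}

// Hypothetical helper: drop the stray space so a truly empty line remains.
function stripSpaceOnlyLine(string $body): string
{
    return str_replace(SPACE_ONLY_LINE, "\r\n\r\n", $body);
}

var_dump(hasSpaceOnlyLine("Great!\r\n \r\nNext paragraph"));  // bool(true)
var_dump(hasSpaceOnlyLine("Great.\r\n\r\nNext paragraph"));   // bool(false)
```

Calling stripSpaceOnlyLine() before the insert would keep the extra space out of the database in the first place.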