We ran into an issue at work with a CMS a while back. Characters such as & were being input into the DB correctly, but when they were output back into the CMS for further editing they were sent to the text area as & and not &amp; as they should have been. After that, the actual character was being sent into the database, not the character code any more! This issue was also occurring with » (&raquo;), which was being used extensively, and with ™ (&#8482;).
When I say issue, I mean that this went just shy of out of control with our already tight deadline. I vowed to find a more efficient solution for our future “internal only” CMSs (the ones we use to create static pages to ship off to clients).
My solutions:
1. Never use a CMS to create static pages to ship out.
2. Write a function to handle HTML entities correctly.
Well, I’m still working on #1, and in the meantime I figured I ought to work on #2.
There are a lot of questions that need to be answered before this problem is solved:
- Do you transform characters before they enter the database or after they leave the database (edit vs. render)?
- How do you transform literal characters into something else? Can you search and replace?
- How do you make sure that every character ends up as the right character code?
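To make the "after they leave the database" side of that first question concrete, here is a minimal sketch. It assumes the database already holds entities (like `&amp;`) and that the page is UTF-8. The function name `escape_for_textarea` and the sample value are made up for illustration, and this uses PHP's built-in `htmlspecialchars()` rather than the regular expression this post goes on to build.

```php
<?php
// Hypothetical helper (not from the CMS in question): escape a stored value
// before printing it inside a <textarea>.
function escape_for_textarea(string $stored): string
{
    // Every "&" becomes "&amp;", so a stored "&amp;" is sent as "&amp;amp;".
    // The browser decodes that once and the editor sees "&amp;" again,
    // which is what gets posted back on the next save.
    return htmlspecialchars($stored, ENT_QUOTES, 'UTF-8');
}

// Made-up example of a value as it sits in the database.
$stored = '<p>Give &amp; take &raquo; Brand&#8482;</p>';

echo '<textarea name="body">' . escape_for_textarea($stored) . '</textarea>', "\n";
```

Without that escaping step the browser decodes `&amp;` to a literal & inside the textarea, which is the round-trip problem described at the top.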
So I decided to start simple. I just want to make sure my &’s are encoded properly. I know, that’s kinda simple and boring but it’s a good start to get into the swing of things.
Regular Expression: Search and Replace
We’ll start with the PHP function:
PHP
<?php
$text = preg_replace($pattern, $replace, $haystack);
?>
We know what replace is going to be:
PHP
<?php
$text = preg_replace($pattern, '&amp;', $haystack);
?>
We’ll create a simple haystack to search through for this sample:
PHP
<?php
$haystack = "<p>Amber & and I know that some “things” are a problem and it will be a give & take when she moves in…</p>";
// $pattern is not defined yet; building it is the next step (page 2).
$text = preg_replace($pattern, '&amp;', $haystack);
?>
We need this haystack to end up looking like this when it’s done, so that what is sent from the textarea -> DB -> textarea -> DB is consistent and logical:
HTML
<p>Amber &amp; and I know that some &#8220;things&#8221; are a problem and it will be a give &amp; take when she moves in&hellip;</p>
Simple enough?
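For the impatient, here is one rough way to get from the haystack to that output. Treat it as a sketch under assumptions, not necessarily the pattern worked out on page 2: the helper names `encode_ampersands` and `encode_typography` are hypothetical, the file and the input are assumed to be UTF-8, and only the three typographic characters from the sample are mapped.

```php
<?php
// Hypothetical helper: encode bare ampersands but leave existing entities alone.
function encode_ampersands(string $html): string
{
    // Match "&" only when it is NOT the start of a named (&amp;),
    // decimal (&#8220;) or hex (&#x201C;) entity.
    $pattern = '/&(?!(?:[a-zA-Z][a-zA-Z0-9]*|#[0-9]+|#[xX][0-9a-fA-F]+);)/';
    return preg_replace($pattern, '&amp;', $html);
}

// Hypothetical helper: swap a few literal characters for the entities
// used in the target output above. Extend the map as needed.
function encode_typography(string $html): string
{
    return strtr($html, [
        '“' => '&#8220;',
        '”' => '&#8221;',
        '…' => '&hellip;',
    ]);
}

$haystack = "<p>Amber & and I know that some “things” are a problem and it will be a give & take when she moves in…</p>";

// Either order works, because the pattern skips anything already shaped like an entity.
$text = encode_typography(encode_ampersands($haystack));
echo $text, "\n";
```

The negative lookahead is what keeps the textarea -> DB -> textarea loop stable: running the result through `encode_ampersands()` a second time changes nothing. The trade-off is that any text that merely looks like an entity (an & followed by letters and a semicolon) is left untouched too.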
Continue on to page 2: Make that regular expression to replace all those pesky &’s