GG网络技术分享 2025-03-18 16:15 31
We have a requirement to remove special characters from text strings. For example, we may get a string that looks like this, where &#174; is the entity code for the registered trademark symbol (®):
PEPSI&#174; Bottle 20 oz<br><br>
I'm not great with regex, and can't figure out how to edit the existing code to produce the result we want.
Here's what we currently have:
$ui = "PEPSI&#174; Bottle 20 oz<br><br>";
$ui = preg_replace('/[^A-Za-z0-9\.\' -]/', '', $ui);
This results in PEPSI174 Bottle 20 ozbrbr.
Our desired result is PEPSI Bottle 20 oz<br><br>.
How can I edit the regex to make sure that it removes the entity code (&#174;) but keeps <br>? We don't want it to remove all the numbers, as the string can obviously contain numbers; only the numbers that are part of an entity code need to be removed.
You could use this, but I can't guarantee that it covers every possible HTML entity:
$res = preg_replace('/&[A-Za-z0-9#]+;/', '', $ui);
That says: replace any substring that
- starts with &
- is followed by one or more alphanumeric characters or #, in any order
- ends with ;

In theory, regular expressions cannot do this.
Parse the HTML and remove it from the DOM.
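The DOM suggestion above could look roughly like the following in PHP. This is only a sketch under a few assumptions: the function name stripSpecialCharsKeepTags and the wrapper <div> are made up for illustration, the allowed character set is copied from the question's original regex, and the input is assumed to be a small, reasonably well-formed HTML fragment. The parser decodes entities such as &#174; into real characters, so the text nodes can be filtered without touching tags like <br>.

```php
<?php
// Hypothetical helper: strips disallowed characters from text nodes only,
// leaving the markup (for example <br>) in place.
function stripSpecialCharsKeepTags(string $html): string
{
    $doc = new DOMDocument();

    // Collect parser warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    // Wrap the fragment in a <div> so it has a single, easy-to-find root.
    $doc->loadHTML('<div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Entities are already decoded here, so "&#174;" is just the character "®".
    $xpath = new DOMXPath($doc);
    foreach ($xpath->query('//text()') as $textNode) {
        // Same whitelist as the question's regex; /u because DOM strings are UTF-8.
        $textNode->nodeValue = preg_replace("/[^A-Za-z0-9.' -]/u", '', $textNode->nodeValue);
    }

    // Serialize only the children of the wrapper, not the html/body scaffolding.
    $wrapper = $doc->getElementsByTagName('div')->item(0);
    $out = '';
    foreach ($wrapper->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo stripSpecialCharsKeepTags('PEPSI&#174; Bottle 20 oz<br><br>'), "\n";
```

With the question's sample string, this should give back PEPSI Bottle 20 oz<br><br>. The trade-off is that the HTML parser may normalize the markup (for example, by closing unclosed tags), so the output is not guaranteed to be byte-for-byte identical to the input outside the stripped characters.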