How to remove all attributes from HTML tags using PHP
Last Updated on Dec 13, 2022 - Written By Torikul Islam
You can remove the attributes inside HTML tags with the preg_replace() function, a PHP built-in function that works with regular expressions (RegEx). If you want to avoid RegEx, you can use the str_replace() function in a few cases.
In this article, I will discuss the different methods with examples.
Syntax
preg_replace('find', 'replace', 'string');
We need to provide three parameters.
1. Find is the first parameter: the regular expression pattern to search for.
2. Replace is the text that replaces each match.
3. String is the full string from which you want to remove attributes.
You can pass PHP variables for any of these parameters.
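As a minimal sketch of the syntax with variables, the snippet below collapses runs of whitespace into a single space. The variable names and sample text are illustrative assumptions, not part of the article's examples.

```php
<?php
// Illustrative values only; the variable names are hypothetical.
$find    = '/\s+/';   // pattern: one or more whitespace characters
$replace = ' ';       // replacement: a single space
$string  = "Example   text\nwith   gaps";

// preg_replace() returns the modified string and leaves $string untouched.
$result = preg_replace($find, $replace, $string);

echo $result; // Example text with gaps
```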
Discussion: remove attributes from HTML tags
Here I include a few examples to clarify the different methods.
Example 1: Remove all attributes from HTML tags
<?php
//input string
$string = '<p style="color:red;"><b style="padding:0;margin:0;">Example text</b></p>';
//removing all attributes from html tags
$string = preg_replace("/<([a-z][a-z0-9]*)[^>]*?(\/?)>/si",'<$1$2>', $string);
//showing result
echo $string;
?>
Original string:
<p style="color:red;"><b style="padding:0;margin:0;">Example text</b></p>
String after removing attributes:
<p><b>Example text</b></p>
When you echo the string in a browser, the tags are rendered rather than displayed. Use the htmlentities() function to see the tags themselves in the output.
echo htmlentities($string);
Output:
<p><b>Example text</b></p>
Note: htmlentities() is a PHP built-in function that converts special characters to HTML entities, so the tags appear as text in the browser output.
Note: In rare cases this method fails, as critics of parsing HTML with RegEx will point out. For instance, <span style=">"> would end up as <span>"> in the output, because the pattern stops at the first '>' it finds, even inside a quoted attribute value.
You can use Zend_Filter_StripTags as a more foolproof tag and attribute filter in PHP.
Another simple solution is the str_replace() function. Keep in mind that it replaces only exact matches, so it works only when you know the exact tag and attribute text in advance.
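If you would rather not add a library, a parser-based approach is another option. The following is only a sketch using PHP's bundled DOM extension, not the Zend filter mentioned above. It assumes ext-dom is enabled and that the input is a plain ASCII HTML fragment. The function name stripAttributesWithDom is hypothetical.

```php
<?php
// Hypothetical helper; assumes the DOM extension (ext-dom) is available.
function stripAttributesWithDom(string $html): string
{
    $doc = new DOMDocument();

    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML('<html><body>' . $html . '</body></html>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $body = $doc->getElementsByTagName('body')->item(0);

    // Visit every element inside <body> and drop its attributes one by one.
    foreach ($body->getElementsByTagName('*') as $element) {
        while ($element->attributes->length > 0) {
            $element->removeAttributeNode($element->attributes->item(0));
        }
    }

    // Serialize only the children of <body>, not the wrapper we added.
    $result = '';
    foreach ($body->childNodes as $child) {
        $result .= $doc->saveHTML($child);
    }
    return $result;
}

$clean = stripAttributesWithDom('<p style="color:red;"><span style=">">Example text</span></p>');
echo htmlentities($clean);
```

Because the parser understands quoted attribute values, the '>' inside style=">" should not cut the tag short the way it does with the RegEx above.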
Example 2: Remove specific attribute from specific HTML tag
<?php
//input string
$string = '<span style="color:red;"><b style="padding:0;margin:0;">Example text</b></span>';
//removing specific attribute from html tag
$string = str_replace('<span style="color:red;">','<span>', $string);
//Or, for an attribute value that the RegEx in Example 1 cannot handle:
//input string
$string = '<span style=">"><b style="padding:0;margin:0;">Example text</b></span>';
//removing specific attributes from html tag
$string = str_replace('<span style=">">','<span>', $string);
//showing results
echo htmlentities($string);
?>
Output:
<span><b style="padding:0;margin:0;">Example text</b></span>
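str_replace() also accepts arrays, so several known tags can be cleaned in one call. This is a small sketch under the same assumption as Example 2, namely that you know the exact tag text in advance. The variable names are illustrative.

```php
<?php
$string = '<span style="color:red;"><b style="padding:0;margin:0;">Example text</b></span>';

// Each entry in $search is replaced by the entry at the same index in $replace.
$search  = ['<span style="color:red;">', '<b style="padding:0;margin:0;">'];
$replace = ['<span>', '<b>'];

$clean = str_replace($search, $replace, $string);

// $clean now holds: <span><b>Example text</b></span>
echo htmlentities($clean);
```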
RegExp explanation
1. / : Start of the pattern (opening delimiter)
2. < : Match '<' at the beginning of a tag
3. ( : Start capture group $1 (the tag name)
4. [a-z] : Match one letter, 'a' through 'z'
5. [a-z0-9]* : Match 'a' through 'z' or '0' through '9', zero or more times
6. ) : End capture group $1
7. [^>]*? : Match anything other than '>', zero or more times, non-greedy (so it won't eat the '/')
8. (\/?) : Capture group $2, the '/' if it is there
9. > : Match '>'
10. /si : End of the pattern, followed by two modifiers. The i modifier makes the match case-insensitive, and s lets a dot match newlines. The pattern contains no dot, so s has no effect here; [^>] already matches newlines, which is why attributes spread over several lines are still removed.
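Finally, the replacement <$1$2> keeps only the tag name ($1) and, for self-closing tags, the trailing '/' ($2). Closing tags such as </p> and the text between tags do not match the pattern, so they pass through unchanged.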
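To see the two capture groups at work, the sketch below runs the same pattern over three sample tags. The sample strings are assumptions chosen for illustration.

```php
<?php
// Same pattern as Example 1, written with single quotes.
$pattern = '/<([a-z][a-z0-9]*)[^>]*?(\/?)>/si';

$samples = [
    '<img src="a.png" alt="A">', // opening tag with attributes
    '<br class="gap" />',        // self-closing tag: $2 captures the '/'
    '</p>',                      // closing tag: no match, left unchanged
];

$results = [];
foreach ($samples as $sample) {
    $results[] = preg_replace($pattern, '<$1$2>', $sample);
}

// $results is ['<img>', '<br/>', '</p>']
echo htmlentities(implode("\n", $results));
```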