PHP regex: skip <link> tags when rel="canonical"


I run a PHP script in WordPress that removes the http: and https: protocols from links, using the following regex:

$links = preg_replace( '/<input\b[^<]*\bvalue=[\"\']https?:\/\/(*SKIP)(*F)|https?:\/\//', '//', $links );

The first part, <input\b[^<]*\bvalue=[\"\']https?:\/\/(*SKIP)(*F), skips <input> tags whose value starts with http:// or https://, such as:

<input type="url" value="http://example.com"> 

Additionally, I'd like to skip <link> tags that have a rel="canonical" attribute:

<link rel="canonical" href="http://example.com/remove-http/" /> 

Using a regex tester, I've been trying to update the logic. This is how far I've come:

<(input|link)\b[^<]*\b(value|rel)=[\"\'](https?:\/\/|canonical)(*SKIP)(*F)|https?:\/\/

But it hasn't worked for me.

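The (*SKIP)(*F) verbs discard the text matched so far and make the engine search for the next match starting at the position right after the text matched by the pattern that precedes these verbs. Note that the verb names are case-sensitive and must be written in uppercase.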

So, to match word1 or word2, drop them, and go on to match word3, you need to use

'~(?:word1|word2)(*SKIP)(*F)|word3~'

The (?:...) non-capturing group groups the alternatives that must be dropped.
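As a rough illustration of that idiom, here is a minimal sketch. The words foobar, foobaz and foo, the sample string, and the X replacement are all made up for the demonstration and are not part of the question. It shows that text consumed by a skipped alternative is protected even when it contains the word that would otherwise be replaced.

```php
<?php
// Hypothetical sample text: "foo" also occurs inside "foobar" and "foobaz".
$text = 'foo foobar foobaz food';

// "foobar" and "foobaz" are matched first and then thrown away by (*SKIP)(*F);
// the search resumes right after them, so the "foo" inside them is never replaced.
$result = preg_replace('~(?:foobar|foobaz)(*SKIP)(*F)|foo~', 'X', $text);

echo $result, PHP_EOL; // X foobar foobaz Xd
```

Without the first alternative, every foo would be replaced, including the ones inside foobar and foobaz.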

In your case, the whole <link...> tag should be matched, not just the attribute. Thus, you need link\b[^>]*?\brel=[\'\"]canonical[\'\"][^>]*> instead of word2 in the above regex.
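Put together, the replacement might look like the sketch below. It assumes the leading < is factored out in front of the group so that it is shared by the input and link alternatives, uses ~ as the delimiter so the slashes need no escaping, and runs on sample markup assembled from the snippets in the question plus two extra tags that are hypothetical.

```php
<?php
// Sample markup: the <input> and canonical <link> come from the question,
// the <a> and stylesheet <link> are hypothetical additions.
$links = <<<'HTML'
<a href="http://example.com/page">page</a>
<input type="url" value="http://example.com">
<link rel="canonical" href="http://example.com/remove-http/" />
<link rel="stylesheet" href="https://example.com/style.css" />
HTML;

// Alternative 1: <input ... value="http(s):// or a whole canonical <link ...> tag.
// Both are matched and then discarded with (*SKIP)(*F).
// Alternative 2: any remaining http:// or https://, which gets replaced.
$pattern = '~<(?:input\b[^<]*\bvalue=[\"\']https?://'
         . '|link\b[^>]*?\brel=[\'\"]canonical[\'\"][^>]*>)'
         . '(*SKIP)(*F)|https?://~';

$result = preg_replace($pattern, '//', $links);

echo $result, PHP_EOL;
```

With this input, the <a> and stylesheet <link> URLs should become protocol-relative, while the <input> value and the canonical <link> are left unchanged.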

However, you should think about using an HTML parser that is compatible with your environment (I saw a note that DOMDocument malfunctions there).

