Regex for scraping content from html

K C

14y
Hi all,

I need a regex to match the text between <a> tags that are themselves nested inside <h3> tags, for example:

<h3><a href="">this text</a></h3>

I started with (?<=<h3>)[\s\S]*?(?=</h3>), which does match the text between the header tags. I also experimented with <a.*?>[\s\S]*?</a> to retrieve links, and that works too, but I am having problems combining the two into a single pattern.
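For illustration, here is one way the two ideas might be combined into a single pattern, sketched in Python. The sample input string and the variable names are assumptions made up for this example, and the pattern assumes the <a> sits directly inside the <h3> with no other markup between them and no nested tags inside the link text. A capture group is used in place of the lookarounds so that only the link text is returned.

```python
import re

# Hypothetical sample input, modelled on the example in the question.
html = '<h3><a href="">this text</a></h3> <p><a href="#">not in a header</a></p>'

# Match an <a> element directly inside an <h3> and capture only the link text.
# Assumption: nothing but whitespace separates the <h3> and <a> tags.
pattern = re.compile(
    r'<h3[^>]*>\s*<a\b[^>]*>(.*?)</a>\s*</h3>',
    re.IGNORECASE | re.DOTALL,
)

# findall returns the contents of the single capture group for each match.
link_texts = pattern.findall(html)
print(link_texts)  # ['this text']
```

The second link in the sample is skipped because it is not wrapped in an <h3>. For real pages with irregular markup, an HTML parser is usually more robust than a regex.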

I might be going about this the wrong way altogether! I am new to regex and still learning, so any help with this would be greatly appreciated!

Krishna

Answers (4)