[Interest] Extract only src attribute of an image tag into XmlRole with XPath functions

Gian Maxera gmaxera at gmail.com
Wed Jun 3 11:26:51 CEST 2015


Hello,
I have an RSS feed coming from a Tumblr blog page.
The description element of each feed item contains a lot of HTML content that I want to discard, keeping only the first image found in it.
For example, this is the content of one description tag:

<description><img src="http://33.media.tumblr.com/bd4312958b742a21221e87c0a96d52c1/tumblr_np213siZRu1tbs1mwo1_500.gif"/><br/> <br/><img src="http://33.media.tumblr.com/6f1ca4ab1ef3d2504b2da48f2616df6e/tumblr_np213siZRu1tbs1mwo2_400.gif"/><br/> Paris Marriott Champs Elysees<br/><br/> <img src="http://36.media.tumblr.com/0f4726dd2f19c8d4a042f72786987573/tumblr_np213siZRu1tbs1mwo3_500.jpg"/><br/> <br/><h2><b>Marriott Hotels in France Celebrate Earth Hour 2015</b></h2><p>See what happened during Earth Hour celebrations at Marriott hotels in France.</p></description>

What I would like is an XPath expression, usable in an XmlRole query, that returns only the URL of the first image:
http://33.media.tumblr.com/bd4312958b742a21221e87c0a96d52c1/tumblr_np213siZRu1tbs1mwo1_500.gif

How can I do that?

Thanks,
Gianluca.
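
The extraction asked about above can be sketched outside QML. This is a minimal Python illustration, not an XmlRole answer. It covers two cases, since the post does not say which applies: the HTML may be real child markup of description, or it may arrive as escaped text (common in RSS feeds), where only string functions can reach the src value. The second case mirrors the standard XPath string functions substring-after and substring-before, as in substring-before(substring-after(description, 'src="'), '"'); whether XmlRole accepts that expression is an assumption that has not been tested here. The function names and the shortened two-image item are hypothetical.

```python
import xml.etree.ElementTree as ET

FIRST_URL = ("http://33.media.tumblr.com/bd4312958b742a21221e87c0a96d52c1/"
             "tumblr_np213siZRu1tbs1mwo1_500.gif")
SECOND_URL = ("http://33.media.tumblr.com/6f1ca4ab1ef3d2504b2da48f2616df6e/"
              "tumblr_np213siZRu1tbs1mwo2_400.gif")

# Shortened stand-in for the description quoted in the post (two images only).
HTML = ('<img src="%s"/><br/> <br/><img src="%s"/><br/> '
        '<h2><b>Title</b></h2><p>Text.</p>' % (FIRST_URL, SECOND_URL))

# Case 1: the HTML is real markup nested inside <description>.
ITEM_WITH_ELEMENTS = "<item><description>%s</description></item>" % HTML

# Case 2: the HTML is escaped, so <description> holds plain text only.
ESCAPED = HTML.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
ITEM_WITH_TEXT = "<item><description>%s</description></item>" % ESCAPED


def first_img_src_from_elements(item_xml):
    """Return src of the first <img> element in document order, or None."""
    item = ET.fromstring(item_xml)
    img = next(item.iter("img"), None)
    return img.get("src") if img is not None else None


def first_img_src_from_text(item_xml):
    """Return the text between the first 'src="' and the next '"', or None.

    Same idea as substring-before(substring-after(text, 'src="'), '"').
    """
    text = ET.fromstring(item_xml).findtext("description") or ""
    _, marker, rest = text.partition('src="')
    if not marker:
        return None
    url, closing_quote, _ = rest.partition('"')
    return url if closing_quote else None


if __name__ == "__main__":
    print(first_img_src_from_elements(ITEM_WITH_ELEMENTS))
    print(first_img_src_from_text(ITEM_WITH_TEXT))
```

The string-based variant depends on the first src=" in the text belonging to an img tag, which holds for the sample above but is not guaranteed for arbitrary HTML.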

