preg_match_all problem

Hello,

If I have HTML code like this:


<div class="profile">
<h1>TOM</h1>
<h2><a href="http://www.domain.com" title="Tom Site">Tom Site</a>&nbsp;</h2>
<div><span class="about">About Tom</span></div>
</div>

<div class="profile">
<h1>Mark</h1>
<h2><a href="http://www.domain.com" title="Mark Site">Mark Site</a>&nbsp;</h2>
<div><span class="about">About Mark</span></div>
</div>

<div class="profile">
<h1>David</h1>
<h2><a href="http://www.domain.com" title="David Site">David Site</a>&nbsp;</h2>
<div><span class="about">About David</span></div>
</div>

And I want to extract some information from it, like this:


TOM
Tom Site
About Tom

Mark
Mark Site
About Mark

David
David Site
About David


<?php
$html='
<div class="profile">
<h1>TOM</h1>
<h2><a href="http://www.domain.com" title="Tom Site">Tom Site</a>&nbsp;</h2>
<div><span class="about">About Tom</span></div>
</div>

<div class="profile">
<h1>Mark</h1>
<h2><a href="http://www.domain.com" title="Mark Site">Mark Site</a>&nbsp;</h2>
<div><span class="about">About Mark</span></div>
</div>

<div class="profile">
<h1>David</h1>
<h2><a href="http://www.domain.com" title="David Site">David Site</a>&nbsp;</h2>
<div><span class="about">About David</span></div>
</div>';

preg_match_all( '/<div class="profile">.*?<h1>(.*?)<\\/h1>/s', $html, $name, PREG_SET_ORDER );

foreach ($name as $val) {
    echo $val[1] . "<br>";

}
?>

I tried this code, but I got only the names:


TOM
Mark
David

Now how can I get all of the information in one go?

Thank You

$str = '<div class="profile">
<h1>TOM</h1>
<h2><a href="http://www.domain.com" title="Tom Site">Tom Site</a>&nbsp;</h2>
<div><span class="about">About Tom</span></div>
</div>
 
<div class="profile">
<h1>Mark</h1>
<h2><a href="http://www.domain.com" title="Mark Site">Mark Site</a>&nbsp;</h2>
<div><span class="about">About Mark</span></div>
</div>
 
<div class="profile">
<h1>David</h1>
<h2><a href="http://www.domain.com" title="David Site">David Site</a>&nbsp;</h2>
<div><span class="about">About David</span></div>
</div>';

echo strip_tags( $str);

// TOM Tom Site  About Tom 
// Mark Mark Site  About Mark 
// David David Site  About David

Cups cheats…

@Cups thank you for your help,

Unfortunately, strip_tags only works well for simple markup like this.

I need another solution using preg_match_all for more advanced cases, for example when I need to select only part of the markup.
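One way to get all three values per profile with a single preg_match_all call is to add one capture group per field. This is only a minimal sketch: it assumes the markup always has exactly the shape shown in the first post, and the names $pattern and $profiles are made up for illustration.

```php
<?php
// Sample markup in the shape used in the first post.
$html = '
<div class="profile">
<h1>TOM</h1>
<h2><a href="http://www.domain.com" title="Tom Site">Tom Site</a>&nbsp;</h2>
<div><span class="about">About Tom</span></div>
</div>

<div class="profile">
<h1>Mark</h1>
<h2><a href="http://www.domain.com" title="Mark Site">Mark Site</a>&nbsp;</h2>
<div><span class="about">About Mark</span></div>
</div>

<div class="profile">
<h1>David</h1>
<h2><a href="http://www.domain.com" title="David Site">David Site</a>&nbsp;</h2>
<div><span class="about">About David</span></div>
</div>';

// "~" is used as the delimiter so the slashes in closing tags need no escaping.
// Group 1: name, group 2: link text, group 3: "about" text.
$pattern = '~<div class="profile">\s*'
         . '<h1>(.*?)</h1>\s*'
         . '<h2><a [^>]*>(.*?)</a>.*?</h2>\s*'
         . '<div><span class="about">(.*?)</span></div>~s';

// PREG_SET_ORDER gives one array per profile: [full match, name, site, about].
preg_match_all($pattern, $html, $profiles, PREG_SET_ORDER);

foreach ($profiles as $p) {
    echo $p[1], "<br>", $p[2], "<br>", $p[3], "<br><br>";
}
```

A pattern like this is tied to the exact tag order, so it stops matching as soon as the markup changes.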

My friend would slap me if I didn’t suggest DOM (see the DOM section of the PHP manual). Unfortunately, I don’t know how to use it…
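For anyone who has not used the DOM extension, here is a rough sketch of how it could be applied to the sample markup. It assumes each profile contains exactly one h1, one link, and one span; the variable names ($doc, $rows) are only illustrative.

```php
<?php
// Two profiles in the shape used in the first post.
$html = '
<div class="profile">
<h1>TOM</h1>
<h2><a href="http://www.domain.com" title="Tom Site">Tom Site</a>&nbsp;</h2>
<div><span class="about">About Tom</span></div>
</div>

<div class="profile">
<h1>Mark</h1>
<h2><a href="http://www.domain.com" title="Mark Site">Mark Site</a>&nbsp;</h2>
<div><span class="about">About Mark</span></div>
</div>';

// Collect parser warnings internally instead of printing them;
// the input is only a fragment, not a complete document.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$rows = [];
foreach ($doc->getElementsByTagName('div') as $div) {
    if ($div->getAttribute('class') !== 'profile') {
        continue; // skip the inner <div> that wraps the "about" span
    }
    // Take the first h1, a and span found inside this profile block.
    $name  = $div->getElementsByTagName('h1')->item(0)->textContent;
    $site  = $div->getElementsByTagName('a')->item(0)->textContent;
    $about = $div->getElementsByTagName('span')->item(0)->textContent;
    $rows[] = [trim($name), trim($site), trim($about)];
}

foreach ($rows as $row) {
    echo implode("<br>", $row), "<br><br>";
}
```

Looking up all three elements inside the same loop iteration is what keeps the three values of one profile together.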

@derokorian I used DOM, but I still get only one line per profile (the names), and I need all three lines, not just one.

We can’t help you fix your code without seeing it. :eye:

Ask a silly question and you get a silly answer. :slight_smile:

Why don’t you post a typical real example of what you are fetching, say what you want to extract from it, and tell us roughly why and what you are going to do with the variables you extract?

Have you tried XPath? It’s pretty much made for this kind of thing.

@Salathe This is the full HTML code:


<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Profiles Page</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="generator" content="Profiles" />
<meta name="description" content="Profiles Page" />
<link rel="shortcut icon" href="http://www.sitepoint.com/forums/images/favicon.ico" />
<link rel="stylesheet" type="text/css" href="css/style.css" />
<script type="text/javascript" src="code.js"></script>
<!--[if IE]>
<link rel="stylesheet" type="text/css" href="css/ieonly.css" />
<![endif]-->
<script type="text/javascript">

  var _gaq = _gaq || [];
  _gaq.push(['_setAccount', 'UA-********-2']);
  _gaq.push(['_trackPageview']);

  (function() {
    var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
    ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
    var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
  })();

</script>
</head>

<body>

<div class="profile">
<h2><a href="http://www.domain1.com/" title="Mark Site">Mark Site</a>&nbsp;</h2>
<p>Some text about Mark profile.</p>
<div><span class="url">http://www.domain1.com/</span></div></div>
<br />
<div class="profile">
<h2><a href="http://www.domain2.com/" title="Jordan Site">Jordan Site</a>&nbsp;</h2>
<p>Some text about Jordan profile.</p>
<div><span class="url">http://www.domain2.com/</span></div></div>
<br />
<div class="profile">
<h2><a href="http://www.domain3.com/" title="Michael Site">Michael Site</a>&nbsp;</h2>
<p>Some text about Michael profile.</p>
<div><span class="url">http://www.domain3.com/</span></div></div>
<br />
<div class="profile">
<h2><a href="http://www.domain4.com/" title="David Site">David Site</a>&nbsp;</h2>
<p>Some text about David profile.</p>
<div><span class="url">http://www.domain4.com/</span></div></div>
<br />
<div class="profile">
<h2><a href="http://www.domain5.com/" title="Rachael Site">Rachael Site</a>&nbsp;</h2>
<p>Some text about Rachael profile.</p>
<div><span class="url">http://www.domain5.com/</span></div></div>


</body>
</html>


<?php
$data = file_get_contents('test.html');

preg_match_all( '/<div class="profile">.*?<h2><a href=".*?" title=".*?">(.*?)<\\/a>&nbsp;<\\/h2>/s', $data, $name, PREG_SET_ORDER );
 
foreach ($name as $val) {
    echo $val[1] . "<br>";
 
}
// This code works for the first value only (the name); I need to extract the profile owner's name, the text about the owner, and the owner's URL.
?>
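A possible way to extend that pattern to all three fields is one named capture group per field. Treat this as a sketch under assumptions: it only holds while every profile block is laid out exactly as in the page above, and the group names (name, about, url) and the inline $data sample are made up here; in practice $data would come from file_get_contents('test.html').

```php
<?php
// Two profile blocks copied from the full page, used here in place of test.html.
$data = '
<div class="profile">
<h2><a href="http://www.domain1.com/" title="Mark Site">Mark Site</a>&nbsp;</h2>
<p>Some text about Mark profile.</p>
<div><span class="url">http://www.domain1.com/</span></div></div>
<br />
<div class="profile">
<h2><a href="http://www.domain2.com/" title="Jordan Site">Jordan Site</a>&nbsp;</h2>
<p>Some text about Jordan profile.</p>
<div><span class="url">http://www.domain2.com/</span></div></div>';

// One named group per field, in the order the elements appear in a profile.
$pattern = '~<div class="profile">\s*'
         . '<h2><a [^>]*>(?<name>.*?)</a>.*?</h2>\s*'
         . '<p>(?<about>.*?)</p>\s*'
         . '<div><span class="url">(?<url>.*?)</span>~s';

// With PREG_SET_ORDER each element of $profiles holds one whole profile.
preg_match_all($pattern, $data, $profiles, PREG_SET_ORDER);

foreach ($profiles as $p) {
    echo $p['name'], ' - ', $p['about'], ' - ', $p['url'], "<br>";
}
```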

@Cups sorry about that, I have now posted the full HTML code without any silly stuff :slight_smile:

@xzyfer I haven’t used XPath before.
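For reference, an XPath version for the page above might look roughly like the sketch below. It relies on PHP's DOMDocument and DOMXPath classes; the array keys and the inline $data sample are illustrative assumptions, and in practice $data would be the contents of test.html.

```php
<?php
// Two profile blocks copied from the full page, used here in place of test.html.
$data = '
<div class="profile">
<h2><a href="http://www.domain1.com/" title="Mark Site">Mark Site</a>&nbsp;</h2>
<p>Some text about Mark profile.</p>
<div><span class="url">http://www.domain1.com/</span></div></div>
<br />
<div class="profile">
<h2><a href="http://www.domain2.com/" title="Jordan Site">Jordan Site</a>&nbsp;</h2>
<p>Some text about Jordan profile.</p>
<div><span class="url">http://www.domain2.com/</span></div></div>';

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($data);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$profiles = [];

// Select every profile block, then read each field relative to that block.
foreach ($xpath->query('//div[@class="profile"]') as $profile) {
    $profiles[] = [
        'name'  => $xpath->evaluate('string(h2/a)', $profile),
        'about' => $xpath->evaluate('string(p)', $profile),
        'url'   => $xpath->evaluate('string(.//span[@class="url"])', $profile),
    ];
}

foreach ($profiles as $p) {
    echo $p['name'], ' - ', $p['about'], ' - ', $p['url'], "<br>";
}
```

Because each field is queried relative to its own profile node, the three values stay grouped even if the whitespace or attribute order in the markup changes.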