Extract entire lines from text file to array

Inspector_Mills · June 4, 2012, 2:45pm

Hi there

I need a PHP script to extract a number of lines from a text file, and insert them into an array.

The lines that I need to extract all have the following in common:

the line preceding it is a line of dashes, preceded by 4 spaces;
the line after it is blank, and is then followed by another line of dashes (with no preceding spaces).

Like this:
----------------------
Total : 26.225 %

Any suggestions on how to do this?

Thanks in Advance.

Mills

MarPlo · June 4, 2012, 3:43pm

Hi
One idea is to use the file('file.txt') function (see php.net). It creates an array with the lines of a file.
Then you can traverse the array (with for()) and apply some instructions, like preg_match(), to store the lines you want in another array.
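
A minimal sketch of that approach, assuming the layout described in the first post (an indented dash line before, then a blank line and an unindented dash line after). The function name extractMarkedLines() and the sample lines are made up for illustration; with a real file you would pass in the result of file('file.txt') instead.

```php
<?php
// Hypothetical helper: returns every line that sits between an indented
// dash line and a blank line followed by an unindented dash line.
function extractMarkedLines(array $lines): array
{
    $found = array();
    $count = count($lines);
    // Start at 1 and stop 2 short of the end so $i-1 and $i+2 always exist.
    for ($i = 1; $i < $count - 2; $i++) {
        $before = rtrim($lines[$i - 1]);
        $after  = rtrim($lines[$i + 2]);

        if (trim($lines[$i]) !== ''                 // candidate line has content
            && preg_match('/^ {4}-+$/', $before)    // 4 spaces, then only dashes
            && trim($lines[$i + 1]) === ''          // next line is blank
            && preg_match('/^-+$/', $after)) {      // then dashes with no indent
            $found[] = trim($lines[$i]);
        }
    }
    return $found;
}

// Sample input standing in for file('file.txt').
$lines = array(
    "some unrelated data\n",
    "    ----------------------\n",
    "    Total     :   26.225 %\n",
    "\n",
    "--------------------------\n",
    "dsd\n",
);
var_dump(extractMarkedLines($lines));
```

Checking all three neighbouring lines is stricter than matching only the dash line above, so stray dash lines elsewhere in the file should not produce false hits.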

Jake_Arkinstall · June 4, 2012, 3:47pm

If your file isn’t very large, preg_match_all() will do what you want:

Data.txt:


fsaffjsf some unrelated data
sdsds
sgfag
238123942 
------

    ----------------------
    Total     :   26.225 %

--------------------------
dsd
sdaffgasf35515
2------------
23----
    ---------------------------
    Monkeys  : Furry

--------------------

sds
afaf


$Content = file_get_contents('Data.txt');
preg_match_all('~\s{4}[\r\n-]+\s*(.+):(.+)\s*[\r\n-]+~m', $Content, $Matches);
$Data = array();
for($i = 0; $i < count($Matches[1]); $i++){
	$Field = trim($Matches[1][$i]);
	$Value = trim($Matches[2][$i]);
	$Data[$Field] = $Value;
}
var_dump($Data);

Output:


array(2) {
  ["Total"]=>
  string(8) "26.225 %"
  ["Monkeys"]=>
  string(5) "Furry"
}

If you just want an array of the matching lines, without parsing the data, then:

preg_match_all('~\s{4}[\r\n-]+\s*(.+)\s*[\r\n-]+~m', $Content, $Matches);
$Data = array();
foreach($Matches[1] as $Match){
	$Data[] = trim($Match);
}

Output:


array(2) {
  [0]=>
  string(22) "Total     :   26.225 %"
  [1]=>
  string(16) "Monkeys  : Furry"
}

tom8 · June 4, 2012, 11:45pm

If you do have a large file, it is best to read it line by line, look for the start of the pattern, and then extract the line that follows.

Here is an example:

<?php
   $pattern = "    ----------------------";
   $data = array();
   $extract = 0;
   $file = fopen('data.txt', 'r');
   while (($line = fgets($file)) !== false) // fgets() returns false at end of file
   {
	   $line = rtrim($line); // strip trailing whitespace and the newline
	   if($extract)
	   {
		   $data[] = $line;
		   $extract = 0;
	   }
	   if($line==$pattern) $extract = 1;
   }
   fclose($file);
   echo "<pre>";
   var_dump($data);
?>

I’m using the same file as Jake Arkinstall’s, but modified so that every start-of-pattern line is identical to $pattern.

Here is the result:

array(2) {
  [0]=>
  string(26) "    Total     :   26.225 %"
  [1]=>
  string(20) "    Monkeys  : Furry"
}

Jake_Arkinstall · June 5, 2012, 8:58am

My thoughts exactly. I wasn’t sure whether or not the start would be uniform, though, so I just made it arbitrary. It’s a strange format for a data file, I must say.

Out of interest, why did you opt for $extract being an integer rather than a boolean?

tom8 · June 5, 2012, 12:35pm

If the start pattern could come in different formats, then we can use preg_match(), as you did, to check for the various patterns that might be used.
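
As a rough sketch of that idea, the line-by-line loop can test each line against a regular expression instead of a fixed string. The function name extractAfterMarker() and the pattern '/^ {4}-+$/' (exactly 4 leading spaces, then any number of dashes) are assumptions for illustration, and an in-memory stream stands in for fopen('data.txt', 'r').

```php
<?php
// Hypothetical helper: reads an open stream line by line and collects
// the line that follows each line matching $startPattern.
function extractAfterMarker($handle, string $startPattern): array
{
    $data = array();
    $extract = 0;
    while (($line = fgets($handle)) !== false) {
        $line = rtrim($line); // strip trailing whitespace and the newline
        if ($extract) {
            $data[] = $line;
            $extract = 0;
        }
        // preg_match() returns 1 on a match, so dash lines of any length qualify.
        if (preg_match($startPattern, $line)) {
            $extract = 1;
        }
    }
    return $data;
}

// Assumed marker format: 4 spaces followed only by dashes.
$startPattern = '/^ {4}-+$/';

// In-memory stream used here in place of a real file.
$handle = fopen('php://memory', 'w+');
fwrite($handle, "x\n    ------\n    Total     :   26.225 %\n\n-----\n"
              . "    ---------\n    Monkeys  : Furry\n\n---\n");
rewind($handle);
$result = extractAfterMarker($handle, $startPattern);
fclose($handle);
var_dump($result);
```

Only one line is held in memory at a time, so this keeps the advantage of the fgets() version for large files.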

The example could use a boolean: if $extract is true, extract the line and reset it back to false. However, if the match requirement changes, say you want to extract all the lines until an end pattern is found, which might span more than one line, then we need to set it to some number telling us whether or not the line is to be extracted. There might be other conditions where $extract needs a different value so that the appropriate action is taken. Hope this makes sense.
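
To illustrate, here is one way the multi-line case might look, with $extract acting as a small state value. The constants SKIP and IN_BLOCK, the function name extractBlocks(), and the choice of an unindented dash line as the end pattern are all assumptions for this sketch.

```php
<?php
// Hypothetical states for $extract; the names are illustrative only.
const SKIP = 0;      // outside a block, waiting for a start marker
const IN_BLOCK = 1;  // between a start marker and an end marker

function extractBlocks(array $lines): array
{
    $blocks = array();
    $current = array();
    $extract = SKIP;
    foreach ($lines as $line) {
        $line = rtrim($line);
        if ($extract === SKIP) {
            // Start marker: 4 spaces followed only by dashes.
            if (preg_match('/^ {4}-+$/', $line)) {
                $extract = IN_BLOCK;
                $current = array();
            }
        } elseif (preg_match('/^-+$/', $line)) {
            // End marker: dashes with no indent closes the block.
            $blocks[] = $current;
            $extract = SKIP;
        } elseif ($line !== '') {
            // Any non-blank line inside the block is kept.
            $current[] = trim($line);
        }
    }
    return $blocks; // a block with no end marker is dropped
}

$lines = array(
    "    -----", "    a : 1", "    b : 2", "", "-----",
    "junk",
    "    ---", "    c : 3", "", "---",
);
var_dump(extractBlocks($lines));
```

A third state (for example, "skip the next line") could be added as another constant without changing the shape of the loop, which is the kind of flexibility a plain true/false flag does not give.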