06-29-2007, 12:26 PM
Because I'm an object-oriented kind of guy, I'll show you not only how to make a recursive directory search, but also how to make an object that does it for you upon creation. This tutorial requires PHP 5 and that is all (the code relies on scandir(), which PHP 4 does not have); surprised?
All right, so let's create an initial framework. My explanations will be in the comments, which describe how each member is used (or recommend that it not be used outside the class). Then we'll get into the guts of the script.
PHP Code:
<?php
// This is how you start a class. You can say class x extends y, but we're not doing that yet.
class RecursiveSearch {
    // You can declare your variables here, but it is not good form to initialize them here.
    // The reason is that a default value in a declaration must be a constant; function calls
    // are not allowed there. That's right. So if I wrote:
    //     var $abc = dirname(__FILE__);
    // PHP would stop with a parse error. Let us begin in earnest.

    // This is the root directory of our search. It should always be an absolute path, not a
    // relative path.
    var $basedir = '';
    // Once our object is created, this will be an array of all files found, as paths relative
    // to the base dir.
    var $files;
    // This is an array of our directories, just in case someone happens to want those sometime. ;-)
    var $folders;
    // Now, this is different. I use a double underscore for internal members, and this one is a
    // callback function for any action the user may want to execute on a per-file basis.
    // The function will be called like this: call_user_func_array($__filefunc, array($file))
    // A function prototype should be: find_file($file){}
    var $__filefunc;

    // This is our framework constructor. All we need passed to it is the root directory to search,
    // and an optional per-file callback. Because we don't want to supply the latter every time
    // we'll make it default to ''.
    function RecursiveSearch($root, $callback = '')
    {
    }

    // This is our prototype internal search function. You shouldn't call this on your own. The
    // dir string that gets passed is what lets us go recursive in our search. There are two
    // pitfalls, though, which I will outline later.
    function __search($dir = '')
    {
    }

    // I won't explain why this is here yet, but it is crucial if you don't like infinite loops.
    function __isdot($s)
    {
    }
}
?>
Now that means that you can simply use your class like this:
PHP Code:
$search = new RecursiveSearch(dirname(__FILE__)."/inc/plugins");
$output = "<b>Search Results:</b><ul>\n";
foreach($search->files as $file)
{
    // The .= operator appends; it is the same as: $var = $var."Hi!";
    $output .= "<li>$file</li>\n";
}
$output .= "</ul>";
echo $output;
In fact, we'll come back to this a little later and show how you can get acquainted with its usage. For now, back to the object, one function at a time, to make it work like a real wonder.
Our constructor really needs some serious work; without initializing variables and starting our search, there's just no way for our object to work. So step one is to initialize the variables. Remember, a class/object variable is referenced inside the object's code as $this->varname, so all initialization must use this convention, and so must all internal function calls.
PHP Code:
// This is our framework constructor. All we need passed to it is the root directory to
// search, and an optional per-file callback. Because we don't want to supply the latter
// every time we'll make it default to ''.
function RecursiveSearch($root, $callback = '')
{
    $this->__filefunc = $callback; // We want this assigned even if blank. More later!
    $this->basedir = $root;
    $this->files = array();
    $this->folders = array();
    // This is how hard it is to initialize the object. Wow huh? In fact, more on this later.
    $this->__search();
}
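As a minimal sketch of the $this-> convention on its own, separate from the search class: the Counter class below is made up purely for illustration, and it spells its constructor __construct(), which PHP 5 also accepts, rather than using the class-named constructor from this tutorial.

```php
<?php
// Hypothetical class, for illustration only; it is not part of RecursiveSearch.
class Counter {
    var $total; // declared here, initialized in the constructor

    function __construct($start)
    {
        // Properties are always reached through $this-> inside the class.
        $this->total = $start;
        // Internal method calls follow the same convention.
        $this->add(1);
    }

    function add($n)
    {
        $this->total = $this->total + $n;
    }
}

$c = new Counter(10);
$c->add(5);
echo $c->total, "\n"; // prints 16
```

Writing plain $total or add(1) inside the class would refer to a local variable or a global function instead, which is the usual way this goes wrong.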
Now, before we tackle the __search function, let's fix up the __isdot function, because it is very important.
PHP Code:
// I won't explain why this is here yet, but it is crucial if you don't like infinite loops.
function __isdot($s)
{
    return ($s == '.' || $s == '..');
}
Yes, that's it. You might be thinking that I'm nuts, but stop and think for a minute, okay? What are the first two results always found on both Windows and Linux computers? Allow me to show you:
Quote:
Microsoft Windows XP [Version 5.1.2600]
© Copyright 1985-2001 Microsoft Corp.
C:\Documents and Settings\{...}>dir
Volume in drive C has no label.
Volume Serial Number is {...}
Directory of C:\Documents and Settings\{...}
04/05/2007 10:55 PM <DIR> .
04/05/2007 10:55 PM <DIR> ..
02/21/2007 03:26 PM <DIR> .borland
01/15/2007 04:12 PM <DIR> .musikproject
02/26/2007 02:47 PM <DIR> .scribus
01/29/2007 02:55 PM <DIR> AbiSuite
04/05/2007 03:10 PM <DIR> Desktop
04/05/2007 09:04 PM <DIR> My Documents
04/05/2007 10:55 PM 4,980,736 NTUSER.DAT
02/02/2007 02:27 PM 127 quotes.txt
09/06/2006 12:40 AM 7,052 reg736.txt
01/16/2007 05:15 PM <DIR> Start Menu
03/06/2007 07:05 PM <DIR> WINDOWS
3 File(s) 4,987,915 bytes
10 Dir(s) 127,047,512,064 bytes free
C:\Documents and Settings\{...}>
Sometimes people ignore these conventions, but . is the current directory itself, and .. is its parent, the directory one level above your current location. So if we recursively search without checking for these, we get one of two problems:
1. We search the same directory infinitely.
2. We search both downwards and upwards through the entire computer's structure.
Yikes! Think about it for a second. You are searching in C:\Temp\htdocs, so the first result is the single dot (.). Well, let's say you skip it and go on to the second result, the double dot (..). Because of how we prioritize things, you would search C:\Temp long before you ever so much as touched a file in C:\Temp\htdocs! And, worse still, you would then search not only the root directory/drive, but every path that branches off it.
Ugly? Yeah. That's why, if your script ever locks up or starts turning out pages upon pages of irrelevant results, your first check should be whether you called __isdot(), and then whether you handled its result correctly.
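To see the two dot entries for yourself, here is a small sketch. It assumes you can write to the system temp directory; the dotdemo_ folder name and a.txt are invented for the demo.

```php
<?php
// Build a throwaway directory containing one file (names are made up for this demo).
$dir = sys_get_temp_dir() . '/dotdemo_' . uniqid();
mkdir($dir);
touch("$dir/a.txt");

$all = scandir($dir); // the raw listing, dot entries included
$real = array();
foreach ($all as $entry) {
    if ($entry == '.' || $entry == '..') {
        continue; // the same test that __isdot() performs
    }
    $real[] = $entry;
}

print_r($all);  // raw listing
print_r($real); // listing with the dot entries filtered out

// "." resolves to the directory itself, ".." to its parent.
$self = realpath("$dir/.");
$parent = realpath("$dir/..");
echo $self, "\n", $parent, "\n";

// Clean up the demo directory.
unlink("$dir/a.txt");
rmdir($dir);
```

Following the .. entry from inside the demo folder lands you in the temp directory itself, which is exactly the upward escape described above.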
Enough doom and gloom; let's move on to the real meat and potatoes of the search: __search($dir)!
PHP Code:
// This is our internal search function. You shouldn't call this on your own. The dir string
// that gets passed is what lets us go recursive in our search.
function __search($dir = '')
{
    // The ?: (ternary) operator is shorthand for:
    //     if($dir == '') { $path = ...; } else { $path = ...; }
    // This tutorial explains classes in a basic sense and how to do a stable recursive
    // search, not elementary PHP coding. ;)
    $path = $dir == '' ? $this->basedir : "{$this->basedir}/$dir";
    foreach(scandir($path) as $found)
    {
        // Now, this is extremely critical, as the __isdot call must come before everything
        // else, or the dot entry *will* register as a valid directory to be searched!
        if(!$this->__isdot($found))
        {
            $absolute = "$path/$found";
            $relative = $dir == '' ? $found : "$dir/$found";
            // We check for folders first and descend into each one as soon as we find it, so
            // the script dives to the deepest level and then works its way back outwards.
            if(is_dir($absolute))
            {
                $this->folders[] = $relative; // Store the result... again, with relative pathing.
                // And this is how you search recursively. :D Just call it with the relative
                // path, and you're good to go!
                $this->__search($relative);
            }elseif(is_file($absolute)){
                $this->files[] = $relative;
                // And this is how we add a callback hook, so that if there is a function to
                // call whenever a file is found, this is it! Note that the arguments must be
                // passed as an array, even when there is only one.
                if($this->__filefunc != '')
                    call_user_func_array($this->__filefunc, array($relative));
            }
        }
    }
}
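If you want to watch the order in which results arrive, the same idea can be sketched as a stand-alone function. Everything here is an assumption for the demo: walk() is a made-up name, and the little directory tree is created in the system temp directory and removed again afterwards.

```php
<?php
// Hypothetical stand-alone version of the same idea; walk() is a made-up name.
function walk($base, $dir, &$files)
{
    $path = $dir == '' ? $base : "$base/$dir";
    foreach (scandir($path) as $found) {
        if ($found == '.' || $found == '..') {
            continue; // never follow the dot entries
        }
        $relative = $dir == '' ? $found : "$dir/$found";
        if (is_dir("$path/$found")) {
            walk($base, $relative, $files); // dive in before reading on
        } elseif (is_file("$path/$found")) {
            $files[] = $relative;
        }
    }
}

// Throwaway tree: a.txt, sub/b.txt, sub/deep/c.txt, z.txt
$base = sys_get_temp_dir() . '/walkdemo_' . uniqid();
mkdir("$base/sub/deep", 0777, true);
foreach (array('a.txt', 'sub/b.txt', 'sub/deep/c.txt', 'z.txt') as $f) {
    touch("$base/$f");
}

$files = array();
walk($base, '', $files);
print_r($files);

// Clean up: remove the files first, then the folders from the deepest one up.
foreach ($files as $f) {
    unlink("$base/$f");
}
rmdir("$base/sub/deep");
rmdir("$base/sub");
rmdir($base);
```

Because scandir() returns names in alphabetical order by default, z.txt is only recorded after everything under sub has been visited.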
Done. But there's gotta be more to it! Really? Recursive searches are deceptively simple, and before you fool yourself, make sure that you know, and I mean know, that there really isn't more to it. Sure, you could add a count property and such, like this, but it really isn't essential:
PHP Code:
$this->count = count($this->files)+count($this->folders);
Our final source code:
PHP Code:
<?php
// This is how you start a class. You can say class x extends y, but we're not doing that yet.
class RecursiveSearch {
    // You can declare your variables here, but it is not good form to initialize them here.
    // The reason is that a default value in a declaration must be a constant; function calls
    // are not allowed there. That's right. So if I wrote:
    //     var $abc = dirname(__FILE__);
    // PHP would stop with a parse error. Let us begin in earnest.

    // This is the root directory of our search. It should always be an absolute path, not a relative path.
    var $basedir = '';
    // Once our object is created, this will be an array of all files found, as paths relative to the base dir.
    var $files;
    // This is an array of our directories, just in case someone happens to want those sometime. ;-)
    var $folders;
    var $count; // Just for kicks.
    // Now, this is different. I use a double underscore for internal members, and this one is a callback
    // function for any action the user may want to execute on a per-file basis.
    // The function will be called like this: call_user_func_array($__filefunc, array($file))
    // A function prototype should be: find_file($file){}
    var $__filefunc;

    // This is our constructor. All we need passed to it is the root directory to search, and an optional
    // per-file callback. Because we don't want to supply the latter every time we'll make it default to ''.
    function RecursiveSearch($root, $callback = '')
    {
        $this->__filefunc = $callback; // We want this assigned even if blank.
        $this->basedir = $root;
        $this->files = array();
        $this->folders = array();
        $this->__search(); // This is how hard it is to initialize the object. Wow huh?
        // The following line is not executed until after the entire search is finished.
        $this->count = count($this->files)+count($this->folders);
    }

    // This is our internal search function. You shouldn't call this on your own. The dir string that gets
    // passed is what lets us go recursive in our search.
    function __search($dir = '')
    {
        // The ?: (ternary) operator is shorthand for:
        //     if($dir == '') { $path = ...; } else { $path = ...; }
        // This tutorial explains classes in a basic sense and how to do a stable recursive search,
        // not elementary PHP coding. ;)
        $path = $dir == '' ? $this->basedir : "{$this->basedir}/$dir";
        foreach(scandir($path) as $found)
        {
            // Now, this is extremely critical, as the __isdot call must come before everything else, or
            // the dot entry *will* register as a valid directory to be searched!
            if(!$this->__isdot($found))
            {
                $absolute = "$path/$found";
                $relative = $dir == '' ? $found : "$dir/$found";
                // We check for folders first and descend into each one as soon as we find it, so the
                // script dives to the deepest level and then works its way back outwards.
                if(is_dir($absolute))
                {
                    $this->folders[] = $relative; // Store the result... again, with relative pathing.
                    // And this is how you search recursively. :D Just call it with the relative path, and you're good to go!
                    $this->__search($relative);
                }elseif(is_file($absolute)){
                    $this->files[] = $relative;
                    // And this is how we add a callback hook, so that if there is a function to call whenever a
                    // file is found, this is it! The arguments must be passed as an array, even if there is only one.
                    if($this->__filefunc != '')
                        call_user_func_array($this->__filefunc, array($relative));
                }
            }
        }
    }

    function __isdot($s)
    {
        return ($s == '.' || $s == '..');
    }
}

/*
The following is a test script; be careful with how you mangle it. :P
*/
echo "<b>Files:</b>\n<ul>\n";
$search = new RecursiveSearch(dirname(__FILE__),create_function('$found','echo "<li>$found</li>\n";'));
echo "</ul>\n",
     "<table border='0'>\n",
     "<tr><td><i>Number of Files:</i></td><td>".count($search->files)."</td></tr>\n",
     "<tr><td><i>Total Results:</i></td><td>{$search->count}</td></tr>\n",
     "</table>";
?>
That's all, folks! And yes, you can make an anonymous function as your callback, but sometimes that isn't what you want. You might want to reference a global object or do something that is messy in a string. In any case, it works for me... just drop it into your own web server's root directory (as long as you delete it immediately afterwards, since you don't want folks knowing the whole layout of your files) and try it for yourself.
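For the cases where a string of code is too messy, here is a rough sketch of three other kinds of callback that call_user_func_array() accepts. The names FileLog, record and shout_file are invented for this example, and the file path is just a sample string. Closures need PHP 5.3 or later; they are also the way to go on PHP 8, where create_function() no longer exists.

```php
<?php
// Hypothetical collector object; it stands in for the "global object" mentioned above.
class FileLog {
    var $lines = array();

    function record($file)
    {
        $this->lines[] = "found: $file";
    }
}

// Option 1: a plain named function, passed by its name as a string.
function shout_file($file)
{
    echo strtoupper($file), "\n";
}

// Option 2: a closure (PHP 5.3 and later).
$closure = function ($file) {
    echo "<li>$file</li>\n";
};

// Option 3: an array of object and method name calls that method on that object.
$log = new FileLog();

$callbacks = array('shout_file', $closure, array($log, 'record'));
foreach ($callbacks as $cb) {
    // The second argument must be an array of arguments.
    call_user_func_array($cb, array('inc/plugins/demo.php'));
}

print_r($log->lines);
```

Any of the three could be handed to the search object as its second constructor argument in place of the create_function() call.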


This is a tutorial on how you search directories recursively, meaning that you search subdirectories too, and it presumes prior programming knowledge in PHP.