PHP include path problems

Posted by: Kim N. Lesmer on 15.09.2009
Working with include paths in PHP can sometimes be quite a challenge. When everything resides in the document root there are no problems, but once the files are moved into one or more subdirectories difficulties may arise. This article uses Unix paths, not Microsoft Windows paths.

Relative vs. absolute path

First a note about the relative path vs. the absolute path.

The relative path points to a file or directory in relation to the current location. For a script that is requested directly, that is the directory in which the script itself is located.

If you have an include in your index.php and are using a relative path for an inclusion like this:

<?php
include ("foo.php");
?>

Then the script will look for the file foo.php in the same place as index.php.

The absolute path is the "full path" as seen on the web server's file system. It starts at the root of the file system and, for the files of a website, it contains the document root, for example /var/www/mydomain.com/.

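If you have your index.php located in /var/www/mydomain.com/myproject/ (notice the subdirectory) and are using an absolute path for an inclusion like this:

<?php
include ("/var/www/mydomain.com/foo.php");
?>

(Notice the leading slash, meaning an absolute path), then the script will look for the file foo.php in the document root and NOT in the same place as index.php. Hence the script expects to locate foo.php in /var/www/mydomain.com/ and NOT in /var/www/mydomain.com/myproject/. Keep in mind that PHP resolves an absolute include path against the file system and not against the document root: include ("/foo.php") would look for foo.php in the root of the file system. Only in URLs, such as the links in an HTML page, does a leading slash refer to the document root.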

The problem

If you keep everything in the document root, and always know where your application is going to be installed, then it is easy to simply use an absolute path for inclusion every time. If you are working on a Unix-like operating system, the variable $_SERVER['DOCUMENT_ROOT'] will provide you with the path to the document root, unless of course someone has disabled this feature.
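As a minimal sketch of the document root approach, the helper below joins the document root and a path below it. The function name path_from_document_root() is made up for this illustration, and the sketch assumes that the web server actually fills in $_SERVER['DOCUMENT_ROOT'], which is not guaranteed on every setup.

```php
<?php
// Hypothetical helper, not part of PHP: joins the document root and a
// path below it without producing a doubled or missing slash.
function path_from_document_root(string $document_root, string $file): string
{
    return rtrim($document_root, '/') . '/' . ltrim($file, '/');
}

// Typical use inside a web request (assumes DOCUMENT_ROOT is available):
// include path_from_document_root($_SERVER['DOCUMENT_ROOT'], 'incl/menu.php');
```

Passing the document root in as an argument, instead of reading $_SERVER inside the function, keeps the helper usable from the command line as well.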

But what if you are developing an Open Source application, and you don't know where it's going to be installed? Maybe it will be installed in the document root, or maybe it will be installed in a subdirectory inside the document root, or maybe even inside a subdirectory residing in another subdirectory.

Once the application goes inside a subdirectory you can no longer trust absolute paths that are based on the document root for inclusion.

In most cases the safest bet is to use the relative path for inclusion, but there is a potential problem with that as well.

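If your application needs to include a lot of different files located in a lot of different subdirectories, and the result is "nested includes", then a relative path can stop working. PHP resolves a relative path such as ../incl/menu.php against the current working directory, which is normally the directory of the script that was requested, and not against the directory of the file that contains the include statement.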
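The effect can be reproduced with a small experiment. The sketch below builds a throwaway directory tree in the system's temporary directory (all directory and file names are made up for this illustration) and includes the same file twice, once from each working directory. It uses chdir() to imitate what the web server does when different scripts are requested.

```php
<?php
// Throwaway layout, created only for this demonstration:
//   <base>/site/incl/menu.php    returns the string "menu"
//   <base>/site/doc/header.php   includes "../incl/menu.php"
$base = sys_get_temp_dir() . '/include_demo_' . uniqid();
$site = $base . '/site';
mkdir($site . '/incl', 0777, true);
mkdir($site . '/doc', 0777, true);
file_put_contents($site . '/incl/menu.php', '<?php return "menu";');
file_put_contents(
    $site . '/doc/header.php',
    '<?php return @include "../incl/menu.php";' // @ hides the warning on failure
);

$old_cwd = getcwd();

// Case 1: the working directory is doc/, as when a script in doc/ is requested.
chdir($site . '/doc');
$from_doc = include $site . '/doc/header.php';  // "../incl" is site/incl

// Case 2: the working directory is the site root, as when index.php is requested.
chdir($site);
$from_root = include $site . '/doc/header.php'; // "../incl" is outside the site

var_dump($from_doc);   // string(4) "menu"
var_dump($from_root);  // bool(false)

// Clean up the temporary files and restore the working directory.
if ($old_cwd !== false) {
    chdir($old_cwd);
}
unlink($site . '/doc/header.php');
unlink($site . '/incl/menu.php');
rmdir($site . '/doc');
rmdir($site . '/incl');
rmdir($site);
rmdir($base);
```

The included file is identical in both cases. Only the working directory differs, and that alone decides whether "../incl/menu.php" can be found.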

Imagine that you are including a menu in order to avoid having to update every single file if a change to the menu must take place.

Example:

/var/www/mydomain.com/index.php
/var/www/mydomain.com/contact.php
/var/www/mydomain.com/information.php
/var/www/mydomain.com/doc/documentation.php
/var/www/mydomain.com/doc/some_doc1.php
/var/www/mydomain.com/doc/some_doc2.php
/var/www/mydomain.com/incl/menu.php

Inside index.php and every other file in the document root you include the menu like this, using a relative path:

<?php
include ("incl/menu.php");
?>

Inside the files in doc you include the menu like this, using a relative path:

<?php
include ("../incl/menu.php");
?>

Now, at first this seems to be perfect, but what about the links inside the menu itself?

If the menu.php file is using relative paths pointing to all the relevant files in the application, then the references need to change if you are suddenly located inside the doc directory.

Let's imagine that the links in menu.php look like this, again relative paths:

<a href="index.php">Home</a>
<a href="contact.php">Contact</a>
<a href="doc/documentation.php">Documentation</a>

If you have loaded the index.php document and click on any link in the menu, it will work perfectly. But if the documentation.php file is loaded and you click the index.php link in order to go home, the browser will look for index.php inside the doc directory.

A solution could be to only use absolute paths inside the menu itself, like this:

<a href="/index.php">Home</a>
<a href="/contact.php">Contact</a>
<a href="/doc/documentation.php">Documentation</a>

But then again, what if the application gets installed inside a subdirectory? A link with a leading slash always starts at the document root, and the document root doesn't include subdirectories.

The same goes for using the $_SERVER['HTTP_HOST'] variable as a reference.

Example:

<a href="http://<?php echo $_SERVER['HTTP_HOST']; ?>/index.php">Home</a>
<a href="http://<?php echo $_SERVER['HTTP_HOST']; ?>/contact.php">Contact</a>
<a href="http://<?php echo $_SERVER['HTTP_HOST']; ?>/doc/documentation.php">Documentation</a>

In the above, $_SERVER['HTTP_HOST'] works the same way as an absolute path: it contains only the host name and doesn't include the subdirectory.

Keep It Simple, Stupid

There exist several common approaches to solving the problems described above, but keeping it simple sometimes isn't an option (unfortunately).

The simple system

If you are developing a specific application to work on only one specific web server, and the application isn't supposed to be distributed for installation on several different web servers, you should keep things as simple as possible.

If you are loading a file that is located in the same directory as the source file, just include the file name and don't use any path. This is the most basic example of a relative path: because both files are in the same directory, no path information is required.

If you are including the menu into each file where it is needed, and the menu is located inside a subdirectory like this: /var/www/mydomain.com/incl/menu.php, you need to remember to avoid including the menu from files located anywhere other than the document root itself.

Define an install path

The common solution to the problems described above is to define an install path and then only use absolute paths and URLs.

This solution is extremely effective, but the challenge lies in how to define something that you don't know.

The install path is where the application gets installed. It is the document root plus any subdirectories.
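To illustrate how a defined install location takes care of the menu links, here is a minimal sketch. The function name menu_link() and the idea of passing the install URL (for example /myproject/) as an argument are assumptions made for this illustration. How that value is obtained is a separate question.

```php
<?php
// Hypothetical helper: builds a menu link that starts at the install URL,
// so it works no matter which subdirectory the current page is in.
function menu_link(string $install_url, string $target, string $label): string
{
    // Join the two parts with exactly one slash between them.
    $href = rtrim($install_url, '/') . '/' . ltrim($target, '/');

    return '<a href="' . htmlspecialchars($href, ENT_QUOTES, 'UTF-8') . '">'
        . htmlspecialchars($label, ENT_QUOTES, 'UTF-8') . '</a>';
}

// Application installed in the subdirectory "myproject":
echo menu_link('/myproject/', 'index.php', 'Home'), "\n";
echo menu_link('/myproject/', 'doc/documentation.php', 'Documentation'), "\n";
```

With an install URL of "/" the same function produces the plain absolute links /index.php and /doc/documentation.php.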

The Wordpress way

I don't know if Wordpress still uses the same method, but a commonly used way to deal with the above is to have the install script scan for the current working directory upon installation and then save the information in the database (or elsewhere) for later usage.

/var/www/mydomain.com/myproject/install.php

The install script will scan its working directory and remove the filename and then save that information as the install path in the database. The install path will be defined as a constant and then be used as an absolute path throughout all the files in the system.

Here's an example that will result in the install path:

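<?php
// Fixing the install path.
// SCRIPT_NAME is the URL path of the running script, for example /myproject/install.php.
$script_name = getenv("SCRIPT_NAME");
// realpath() resolves the bare filename against the current working directory.
$script_path = realpath(basename($script_name));
$slash = explode('/', $script_name);
$current_filename = $slash[count($slash) - 1];
// Cut the filename off the end only; str_replace() would also hit a directory with the same name.
$install_path = substr($script_path, 0, -strlen($current_filename));

// Fixing the host URL.
$host_url = substr($script_name, 0, -strlen($current_filename));
?>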

The explode function returns an array. The script collects the last element of the array, which is the filename of the running script, no matter where it is located. Once the filename is removed from the full path, what is left is the document root plus any subdirectories.
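The same result can be reached more compactly with dirname(), which strips the last component of a path. In the sketch below the function name detect_install_paths() and the array keys are made up for this illustration, and the two arguments are assumed to come from $_SERVER['SCRIPT_FILENAME'] and $_SERVER['SCRIPT_NAME'] during a web request.

```php
<?php
// Hypothetical helper: derives the install path on disk and the matching
// URL path from the location of the running script (Unix paths only).
function detect_install_paths(string $script_filename, string $script_name): array
{
    return array(
        // Directory of the script on the file system, with a trailing slash.
        'install_path' => rtrim(dirname($script_filename), '/') . '/',
        // Directory of the script as seen in the URL, with a trailing slash.
        'host_url'     => rtrim(dirname($script_name), '/') . '/',
    );
}

$paths = detect_install_paths(
    '/var/www/mydomain.com/myproject/install.php',
    '/myproject/install.php'
);
echo $paths['install_path'], "\n"; // /var/www/mydomain.com/myproject/
echo $paths['host_url'], "\n";     // /myproject/
```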

The method is quite effective, but it does contain one small problem. If, at a later time, the user needs to move his application to another host, and the install path changes, he can't simply move the application and restore the database. He has to change the install path inside the database first.

The session variable way

This method is exactly the same as the Wordpress method, only instead of storing the install path in the database, the install path is scanned from the index.php file and the result is stored in a session variable.

This method is more user-friendly as the client can move the system around as he desires, and he doesn't have to mess around with the database.

The problem with this method is that it forces every single user of the system to allow the usage of cookie-based sessions (keeping the install path in the URL isn't really an option).
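A minimal sketch of the session approach could look like this. The function name install_path_from_session() and the session key 'install_path' are made up for this illustration. The detection itself is passed in as a callback so that it only runs when the session doesn't hold a value yet.

```php
<?php
// Hypothetical helper: returns the install path kept in the session array
// and runs the detection callback only when no value is stored yet.
function install_path_from_session(array &$session, callable $detect): string
{
    if (!isset($session['install_path'])) {
        $session['install_path'] = $detect();
    }
    return $session['install_path'];
}

// Typical use at the top of index.php (requires cookie-based sessions):
// session_start();
// $install_path = install_path_from_session($_SESSION, function () {
//     return dirname(__FILE__) . '/';
// });
```

Because the value lives in the session and not in the database, moving the application only requires that the visitors start a new session.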

The save-in-a-file way

This method is exactly the same as the Wordpress method only this time the install path gets stored in a configuration file on the server.

The potential problem with this method is that the client has to make sure that the webserver has write access to the file in which the install path gets stored.
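As a rough sketch of this approach, the two helpers below write the install path to a small PHP file and read it back. The function names, the array key 'install_path' and the layout of the configuration file are assumptions made for this illustration, not the format of any particular application.

```php
<?php
// Hypothetical helper: writes the install path to a PHP file that returns
// an array. Returns false when the web server may not write to the target.
function save_install_path(string $config_file, string $install_path): bool
{
    // An existing file must be writable, otherwise its directory must be.
    $target = file_exists($config_file) ? $config_file : dirname($config_file);
    if (!is_writable($target)) {
        return false;
    }
    $code = "<?php\nreturn "
        . var_export(array('install_path' => $install_path), true) . ";\n";

    return file_put_contents($config_file, $code, LOCK_EX) !== false;
}

// Hypothetical helper: reads the install path back, or returns null when
// the file is missing, unreadable or doesn't have the expected content.
function load_install_path(string $config_file): ?string
{
    if (!is_readable($config_file)) {
        return null;
    }
    $config = include $config_file;
    if (is_array($config) && isset($config['install_path'])) {
        return (string) $config['install_path'];
    }
    return null;
}
```

The is_writable() check up front turns the write access problem into a return value that the install script can report to the user.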

Get the user to do it

This method demands that the user manually stores the full document root plus any subdirectories inside a variable in the configuration file (or at least any subdirectories).

Since the user has to enter his database credentials in some kind of configuration file or during installation, he might as well enter the install path.

The main problem with this method is that a lot of people get the install path wrong, for example by adding or leaving out the needed slashes. And since it can be scanned automatically there really isn't any reason why the user should enter it manually.
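If the path is entered manually anyway, the slash problem can at least be softened by tidying the input. The sketch below is one possible way to do that for Unix paths. The function name normalize_install_path() is made up for this illustration.

```php
<?php
// Hypothetical helper: tidies a manually entered Unix install path so that
// it always has exactly one leading and one trailing slash.
function normalize_install_path(string $input): string
{
    // Drop surrounding whitespace and collapse repeated slashes.
    $path = preg_replace('#/+#', '/', trim($input));
    // Remove the slashes at both ends, then add them back exactly once.
    $path = trim($path, '/');

    return $path === '' ? '/' : '/' . $path . '/';
}

echo normalize_install_path(' var/www/mydomain.com//myproject '), "\n";
// /var/www/mydomain.com/myproject/
```

This only repairs the formatting. It can't tell whether the path that was entered really is the place where the application is installed.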

The best practice

Each solution has its own pros and cons and I don't think there exists a best practice for the problem. Different frameworks use different solutions to the problems.

I personally find "the Wordpress way" or "the session variable way" the best ways to deal with the problems when dealing with a system of "unknown installation path".

The best thing to do is to plan for the solution from the start in order to get the best result.

Include vs. require

The only difference between include() and require(), or between include_once() and require_once(), is what happens when the file cannot be loaded. The include statements generate a warning and the script continues to execute, whereas the require statements generate a fatal error and the script stops execution.
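The difference is easy to see with a file that doesn't exist. In the sketch below the file name is made up and assumed to be missing. The require line is left commented out because it would end the script.

```php
<?php
// A path that is assumed not to exist (the file name is made up).
$missing = __DIR__ . '/no_such_file_for_this_demo.php';

// include: a missing file only raises a warning (hidden here with @),
// the expression evaluates to false and the script carries on.
$result = @include $missing;
$still_running = true;

var_dump($result);         // bool(false)
var_dump($still_running);  // bool(true)

// require: the same missing file would stop the script with a fatal error.
// require $missing;
```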

I hope you find this article useful. In case you have any comments feel free to email me. If I find them relevant, I will add them at the bottom of the article.