PHP HTTP Socket Client: TinyHttpClient

The crippling of file_get_contents()

These days, security is becoming more and more of an issue, not only for your own website, but also for the servers of hosting companies.  The file_get_contents() function allows a PHP script to open a file and get the file contents as a string.  In addition to opening files on the local server, the function can also connect to a remote server and download the contents of a URL resource.

This simple example will download the page at www.citumpe.com and save it in the file cache.html:

$url = "http://www.citumpe.com";
$cacheFile = "./cache.html";
if(($content = file_get_contents($url)) !== false)
{
  $fp = fopen($cacheFile, 'w');
  fwrite($fp, $content);
  fclose($fp);
}

There are numerous legitimate reasons why a web based application would need to access and download content from a remote URL.  Perhaps your application needs to obtain the latest RSS feed from a site or perhaps you need to verify that a site is working on a periodic basis.

In an effort to reduce the risk of malicious code being downloaded and executed directly by a PHP instance on a shared hosting service, many web hosts are disabling URL access for the file_get_contents() function (the allow_url_fopen setting) in their PHP configuration.  Local files can still be read; only remote URLs are blocked.

1and1 hosting has disabled URL paths for file_get_contents() under PHP 5, but it is still available under PHP 4.

It’s pretty easy to tell if your web host has disabled the URL functionality of the file_get_contents() function.  Usually, you will receive a message similar to:
Warning: file_get_contents() [function.file-get-contents]: URL file-access is disabled in the server configuration in “path/to/your/php/file”.

To check whether file_get_contents() is enabled for URLs on your web server, simply execute the following in a PHP file on your server:

echo "allow_url_fopen setting is: ".ini_get("allow_url_fopen");
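The raw value returned by ini_get() can be a little awkward to read: a boolean setting that is on usually comes back as "1", while one that is off comes back as an empty string or "0", and some configurations return the literal text from the ini file.  Here is a minimal sketch that turns the raw value into a plain yes/no answer.  The helper name iniFlagEnabled is made up for this example and is not part of PHP or of TinyHttpClient.

```php
<?php
// Hypothetical helper (not part of PHP or TinyHttpClient): interpret a raw
// ini value such as "1", "On", "0", "" or false as a boolean.
function iniFlagEnabled($rawValue): bool
{
    // FILTER_VALIDATE_BOOLEAN treats "1", "true", "on" and "yes"
    // (case-insensitive) as true; everything else becomes false here.
    return filter_var($rawValue, FILTER_VALIDATE_BOOLEAN);
}

$enabled = iniFlagEnabled(ini_get('allow_url_fopen'));
echo 'URL access for file_get_contents() is ' . ($enabled ? 'enabled' : 'disabled') . "\n";
```

This only reports the setting; on most shared hosts you will not be able to change it yourself.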

So what do you do when your web host has disabled the URL functionality of this method?

Using cURL

Some folks have advocated the use of the cURL libraries.  While this is a simple approach, with few lines of code required, it does require that your host has enabled the cURL bindings in your PHP instance.  Some hosts have it enabled, while others have blocked it.
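Before committing to one approach, it can help to see what your host actually offers.  The following is a rough sketch, not a definitive test: the function name availableTransports and the labels it returns are invented for this example.  It only checks whether each mechanism appears to be present, not whether the host's firewall will let the connection through.

```php
<?php
// Hypothetical helper: list the remote-fetch mechanisms that appear usable
// in this PHP instance. The labels are arbitrary names for this example.
function availableTransports(): array
{
    $transports = [];

    // URL support in file_get_contents() depends on allow_url_fopen.
    if (filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN)) {
        $transports[] = 'file_get_contents';
    }

    // The cURL functions exist only when the curl extension is loaded.
    if (extension_loaded('curl')) {
        $transports[] = 'curl';
    }

    // Raw sockets need fsockopen().
    if (function_exists('fsockopen')) {
        $transports[] = 'socket';
    }

    return $transports;
}

echo 'Available: ' . implode(', ', availableTransports()) . "\n";
```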

Getting remote file contents via cURL can be as simple as this example:

$myCurl = curl_init($url);
curl_setopt($myCurl, CURLOPT_FOLLOWLOCATION, 1); // follow HTTP redirects
curl_setopt($myCurl, CURLOPT_RETURNTRANSFER, 1); // return the body instead of printing it
$content = curl_exec($myCurl);                   // false on failure
curl_close($myCurl);

A socket based http client: TinyHttpClient

What is needed is a raw socket based approach which avoids the security risk of the file_get_contents() method and the portability risk of the cURL approach.

The TinyHttpClient class allows you to connect to a remote host and download a file (aka resource).  It supports both GET and POST requests. In addition to requesting from open servers, the TinyHttpClient also supports Basic Authentication. The class implements a small subset of the HTTP 1.0 protocol and does not support chunked transfer encoding, which is part of the HTTP 1.1 standard.
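To give a feel for what a socket based HTTP 1.0 exchange involves, here is a small sketch.  It is not the TinyHttpClient source, and the function names (buildHttp10Request, splitHttpResponse, fetchOverSocket) are made up for illustration.  It assumes a plain, unencrypted connection and a form-encoded POST body, and it simply reads until the server closes the connection, which is how a response normally ends under HTTP 1.0.

```php
<?php
// Sketch only: NOT the TinyHttpClient source. All function names are hypothetical.

// Build a minimal HTTP/1.0 request. $userColonPass is "user:password" or "".
function buildHttp10Request(string $host, string $path, string $method = 'GET',
                            string $userColonPass = '', string $postData = ''): string
{
    $method = strtoupper($method);
    // Host is not required by HTTP/1.0, but name-based virtual hosts rely on it.
    $lines = ["$method $path HTTP/1.0", "Host: $host"];
    if ($userColonPass !== '') {
        // Basic Authentication is the base64 encoding of "user:password".
        $lines[] = 'Authorization: Basic ' . base64_encode($userColonPass);
    }
    if ($method === 'POST') {
        $lines[] = 'Content-Type: application/x-www-form-urlencoded';
        $lines[] = 'Content-Length: ' . strlen($postData);
    }
    $lines[] = 'Connection: close';
    $body = ($method === 'POST') ? $postData : '';
    // A blank line separates the headers from the (possibly empty) body.
    return implode("\r\n", $lines) . "\r\n\r\n" . $body;
}

// Split a raw response into [status code, header block, body].
function splitHttpResponse(string $raw): array
{
    $parts = explode("\r\n\r\n", $raw, 2);
    $head = $parts[0];
    $body = $parts[1] ?? '';
    $status = preg_match('#^HTTP/\d\.\d\s+(\d{3})#', $head, $m) ? (int) $m[1] : 0;
    return [$status, $head, $body];
}

// Send the request over a TCP socket and read until the server closes it.
// Returns the raw response, or null if the connection could not be opened.
function fetchOverSocket(string $host, int $port, string $request, int $bufferSize = 2048): ?string
{
    $fp = @fsockopen($host, $port, $errno, $errstr, 10);
    if ($fp === false) {
        return null;
    }
    fwrite($fp, $request);
    $raw = '';
    while (!feof($fp)) {
        $chunk = fread($fp, $bufferSize);
        if ($chunk === false) {
            break;
        }
        $raw .= $chunk;
    }
    fclose($fp);
    return $raw;
}
```

A call might look like fetchOverSocket($host, 80, buildHttp10Request($host, "/")), with the result passed to splitHttpResponse() to separate the headers from the page content.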

Using the TinyHttpClient class is simple. Here is an example:

//set up the parameters...
$host = "www.citumpe.com";
$port = 80;
$remoteFile = "/";
$basicAuthUsernameColonPassword = "";
$bufferSize = 2048;
$mode = "post";
$fromEmail = "[email protected]";
$postData = "";
$localFile = $cacheFile;

//initialize the class instance
include 'TinyHttpClient.php';
$tinyHttpClient = new TinyHttpClient();
$tinyHttpClient->debug = true;

//get the remote file...
$retVal = $tinyHttpClient->getRemoteFile($host, $port, $remoteFile, $basicAuthUsernameColonPassword, $bufferSize, $mode, $fromEmail, $postData, $localFile);


Version 1.2:

Changed the name of the client to TinyHttpClient from CleanHttpClient.

Version 1.1:

Fixed copy/paste bug in authentication string generation.

Version 1.0:

Initial release of the CleanHttpClient class.  This software is licensed under a BSD style license with the added clause that you must provide a link back to www.henryranch.net on any site which this software is used or on which a derivative of this software is used.
