PHP Classes

File: README

  Classes of Manuel Lemos   Htdig site indexing and searching interface   README   Download  
Role: Documentation
Content type: text/plain
Description: Basic instructions to use the Ht:/Dig interface class
Class: Htdig site indexing and searching interface
Interface with Ht:/Dig indexing and search engine.
Author: Manuel Lemos
Last change:
Date: 19 years ago
Size: 4,031 bytes
 

Contents

/*
 * README
 *
 * Purpose: Basic instructions to use this class.
 *
 * @(#) $Header: /home/mlemos/cvsroot/htdiginterface/README,v 1.1 2005/02/08 06:14:30 mlemos Exp $
 *
 */

PHP interface for Ht:/Dig versions 3.1.x or 3.2.x:

This class provides an interface to the Ht:/Dig package of programs to
simplify the process of configuring, indexing and searching a site.

Although Ht:/Dig can work with existing configuration files, this class
only works properly with a configuration file generated by the class
itself.

The class sets certain configuration directives to work with special
result page template files. These templates are necessary to let the
class parse the search results and extract the information returned by
the htsearch program.

The special template files are supplied within this class package. There
are also example scripts to perform each of the steps to configure, index
and search a site with Ht:/Dig.

To make this class work properly, please follow these steps:

1. The htdig_setup_configuration.php example script demonstrates how to
   set up the class so it can create a suitable configuration file for
   Ht:/Dig.

   You can tell it to supersede the default Ht:/Dig configuration file or
   to generate a new file in a different path.

   You may generate as many different configuration files as you want,
   possibly one configuration file for each site that you host on the
   same server. In this case, you may want to specify different
   directories for the database files that will contain each site index.

   The script should call the GenerateConfiguration function to tell the
   class to create the configuration file. This function takes an array
   of values for any Ht:/Dig options that you may want to set to
   customize the indexing and searching processes of your site.

   The GenerateConfiguration function merges your custom options with
   some options that the class needs to set to make the parsing of the
   search results page work properly. Those options set the file names
   of the output results templates to: htdig_header.html,
   htdig_nomatch.html, htdig_syntaxerror.html and htdig_template.html.

   The GenerateConfiguration function also accepts a special option
   named template_path to specify an alternative directory for the
   template files, for example if you want to keep them in the directory
   of your site index and search page scripts.

2. The next step after creating a suitable configuration file is to
   start the process of crawling a site to build the index database
   files.

   The htdig_build_databases.php example script demonstrates how to
   start a crawling session. It calls the class function named Dig, which
   wraps around the htdig, htmerge and htfuzzy commands.

   This function can be called as often as you want, possibly using
   different configuration files to index different sites. You will
   probably schedule it to run once a day, during low traffic hours, for
   each of your sites.

   Scheduled crawling can be done using tools like cron or an equivalent
   in your operating system, using the CGI or CLI version of PHP to run
   the crawler script outside the Web server.

   The Dig function calls the Ht:/Dig programs in a way that makes them
   create temporary index database files during the indexing process.
   Only when the process ends are the final index database files
   replaced with the contents of the temporary files. This way you can
   run a crawling process while your users are searching the site using
   the database files from the previous crawling session.

3. Once your site has been indexed at least once, you can start using
   the class to provide an interface to search your site pages.

   Take a look at the htdig_search.php script for an example site search
   page. You can use this example script as a base for your customized
   site search page.

   The example script presents a simple search form. When the form is
   submitted, it calls the Search function and outputs the results split
   into pages, with links to navigate between the pages of search
   results. The number of results per page is configurable.
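
As an illustration of the option merging described in step 1, here is a minimal sketch of how custom options could be combined with the template directives and written out as Ht:/Dig "attribute: value" lines. It does not use the class. The function name sketch_generate_configuration, the example URL and paths, the mapping of Ht:/Dig attributes to the four template files, and the rule that the template directives override custom values are all assumptions made for this example. The real GenerateConfiguration function may behave differently.

```php
<?php
// Hypothetical helper, NOT part of the class: merges site options with the
// template directives and renders them as Ht:/Dig configuration lines.
function sketch_generate_configuration(array $custom_options, string $template_path = '.'): string
{
    // Assumed mapping of Ht:/Dig template attributes to the supplied files.
    $required = [
        'search_results_header' => "$template_path/htdig_header.html",
        'nothing_found_file'    => "$template_path/htdig_nomatch.html",
        'syntax_error_file'     => "$template_path/htdig_syntaxerror.html",
        'template_map'          => "Default default $template_path/htdig_template.html",
        'template_name'         => 'default',
    ];

    // With string keys, array_merge lets the later array win, so the
    // template directives replace any conflicting custom value.
    $options = array_merge($custom_options, $required);

    $lines = [];
    foreach ($options as $name => $value) {
        $lines[] = "$name: $value";
    }
    return implode("\n", $lines) . "\n";
}

// Usage: one configuration per site, each with its own database directory.
$config = sketch_generate_configuration(
    [
        'start_url'    => 'http://www.example.com/',
        'database_dir' => '/var/lib/htdig/example',
    ],
    '/var/www/example/templates'
);
echo $config;
```

Keeping the template directives in a separate array makes it easy to see which settings the results parser depends on and which ones are free for each site to customize.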
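
The temporary file approach described in step 2 can be sketched in general terms as follows. This is not the code of the Dig function, which runs the Ht:/Dig programs themselves. It only illustrates the idea of building into a work file and replacing the live file when the build succeeds. The function name sketch_publish_index, the ".work" suffix and the file names are assumptions made for this example.

```php
<?php
// Hypothetical helper, NOT part of the class: builds a new index into a
// temporary work file and only replaces the live file if the build succeeds.
function sketch_publish_index(string $final_file, callable $build): bool
{
    $work_file = $final_file . '.work';

    // The builder writes to the work file; the live file stays untouched,
    // so searches keep using the index from the previous session.
    if (!$build($work_file)) {
        if (file_exists($work_file)) {
            unlink($work_file);
        }
        return false;
    }

    // On POSIX file systems rename() replaces the destination in one step.
    return rename($work_file, $final_file);
}

// Usage: a stand-in builder that just writes a line of text.
$index = sys_get_temp_dir() . '/sketch_index_' . getmypid() . '.db';
file_put_contents($index, "old index\n");

$published = sketch_publish_index($index, function (string $work_file): bool {
    return file_put_contents($work_file, "new index\n") !== false;
});
```

If the builder fails, the previous index stays in place, which is the property that lets a crawl run while users are searching.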
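
The paging arithmetic behind the search page in step 3 can be sketched like this. The class may well delegate paging to htsearch instead of slicing results in PHP, so treat this only as an illustration of how a configurable number of results per page maps to pages and navigation links. The function name sketch_paginate and the "words" and "page" query parameter names are assumptions made for this example.

```php
<?php
// Hypothetical helper, NOT part of the class: selects the results for one
// page and builds links to the other pages.
function sketch_paginate(array $results, string $words, int $page, int $per_page): array
{
    $pages = max(1, (int) ceil(count($results) / $per_page));
    // Clamp out-of-range page numbers coming from the request.
    $page = min(max(1, $page), $pages);

    $items = array_slice($results, ($page - 1) * $per_page, $per_page);

    $links = [];
    for ($p = 1; $p <= $pages; $p++) {
        if ($p !== $page) {
            // http_build_query takes care of URL-encoding the search words.
            $links[$p] = '?' . http_build_query(['words' => $words, 'page' => $p]);
        }
    }

    return ['page' => $page, 'pages' => $pages, 'items' => $items, 'links' => $links];
}

// Usage: 25 stand-in results, 10 per page, showing the last page.
$view = sketch_paginate(range(1, 25), 'php class', 3, 10);
```

Remember to escape the link URLs with htmlspecialchars when writing them into the HTML of the results page.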