IntraSearch©
by Kaloian Gorolomov

 

IntraSearch

As a web site grows in size and complexity, users find it harder and more time-consuming to locate the information they are interested in. That is why an increasing number of web sites offer a site search. This program allows you to add that feature to your site easily and without spending a lot of money on software.

Some of the advantages of IntraSearch are:

  • Convenient GUI with DnD support
  • Lets users search only within the files you select
  • JSP and Servlet versions
  • "Whole word" search supported
  • Search can be performed locally through the Administrator
  • Text within HTML tags and between <SCRIPT> and <STYLE> tags can be ignored
  • Easy to deploy and manage
  • Easy to integrate in your web pages
  • Cross-platform
  • IntraSearch is free

IntraSearch is designed mainly for web sites with small to medium traffic. Sites attracting a substantial number of hits should consider commercial solutions.

System Requirements

IntraSearch requires:

  • JDK 1.1.7 or higher
  • Web server which supports JSP or servlets

Installation

Note:
Most of the installation information provided here applies to the Tomcat server. This shouldn't prevent you from installing and starting IntraSearch on any other server that supports servlets and/or JSP, provided you are familiar with that server. As this is the first version of this product, there simply weren't enough time and resources to test it under several web servers.

IntraSearch includes both a servlet version and a JSP version. Both are included in this package, and it's up to you which one to use.

Installation on a Tomcat Server

  1. Locate the webapps directory of your Tomcat server.
  2. Create a subdirectory therein called search. You can name it differently if you wish; just remember to use your name wherever search is mentioned.
  3. Create a subdirectory of search named web-inf.
  4. Create a subdirectory of web-inf named lib.
  5. Copy files.txt, intrasearch.bat, sitesearch.jar and intrasearch.properties into the lib directory.
  6. Copy web.xml into web-inf.
  7. Copy indexServlet.html, indexJSP.html and WebClient.jsp into the search directory.
  8. If you are going to use the servlet, rename indexServlet.html to index.html. If you prefer to use the JSP, do the same with indexJSP.html.
  9. Open web.xml in a plain text editor and edit the line:
    <param-value>D:\tomcat\webapps\search\web-inf\lib\</param-value>

    to point to the absolute path to the lib directory on your computer.

  10. Start Tomcat, add the new context /search (if you don't know how to do it, consult your server documentation), and restart the server.
  11. Open a browser and point it to <server_host>:<server_port>/search. A search page should appear. Right now you won't be able to perform a valid search, because the files to be searched need to be selected first. Read on.
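
Steps 1 to 8 above can also be scripted. The following is only a minimal sketch, not part of IntraSearch: the class name InstallLayout and its methods are hypothetical, it assumes a modern JDK (7 or later, for java.nio.file) rather than the JDK 1.1.7 baseline, and it assumes all package files have been unpacked into a single directory. Directory and file names are taken verbatim from the steps. Editing web.xml and adding the context (steps 9 to 11) are still done by hand.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper that lays out the files as described in steps 1-8. */
final class InstallLayout {
    // Step 5: files that go into search/web-inf/lib.
    static final String[] LIB_FILES = {
        "files.txt", "intrasearch.bat", "sitesearch.jar", "intrasearch.properties"
    };
    // Step 7: files that go into the search directory itself.
    static final String[] WEB_FILES = {
        "indexServlet.html", "indexJSP.html", "WebClient.jsp"
    };

    /** Copies the package files from pkg into webapps/search and returns the lib directory. */
    static Path install(Path pkg, Path webapps, boolean useServlet) throws IOException {
        Path search = webapps.resolve("search");
        Path webInf = search.resolve("web-inf");
        Path lib = webInf.resolve("lib");
        Files.createDirectories(lib); // steps 2-4: creates search, web-inf and lib

        for (String name : LIB_FILES) {
            copy(pkg.resolve(name), lib.resolve(name));
        }
        copy(pkg.resolve("web.xml"), webInf.resolve("web.xml")); // step 6
        for (String name : WEB_FILES) {
            copy(pkg.resolve(name), search.resolve(name));
        }

        // Step 8: the chosen start page becomes index.html.
        String chosen = useServlet ? "indexServlet.html" : "indexJSP.html";
        Files.move(search.resolve(chosen), search.resolve("index.html"),
                   StandardCopyOption.REPLACE_EXISTING);
        return lib;
    }

    private static void copy(Path from, Path to) throws IOException {
        Files.copy(from, to, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: InstallLayout <package-dir> <tomcat-webapps-dir>");
            return;
        }
        Path lib = install(Paths.get(args[0]), Paths.get(args[1]), true);
        // Step 9 is manual: put this path into <param-value> in web.xml.
        System.out.println("web.xml <param-value> should point to: " + lib.toAbsolutePath());
    }
}
```

The printed path is the one to enter in web.xml in step 9. Note that the sample value shown in that step ends with a path separator.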

Administration

IntraSearch includes a visual administration tool to help you tune the search process. Using the administrator, you can also test how IntraSearch works, or simply use it as a tool to locate text within a number of files without any regard to the web.

The administrator is started by executing the IntraSearch script. The application window consists of a main menu and two tabs: Site Settings and Search.

Main Menu

Listed below are all items on the main menu and their functions:

Item    Subitem         Description
File    Add             Adds a file to the search list
        Remove          Removes the selected file(s) from the search list
        Exit            Exits the administrator
Tools   Clean up list   Removes from the list all files that cannot be found (files that no longer exist in the location they were added from)
        Update list     Saves the changes to the list. No changes take effect until the list is updated
        Preferences     Opens the Preferences window
Edit    Undo            Undoes the last action. If the last action involved several items, you might need to press Undo several times to undo it completely
        Redo            Redoes an undone action
Help    About           Shows information about this program and its version

Site Settings Tab

This tab allows you to manage the list of files that will be searched when a search is performed. It is time-consuming and unsafe to have the search routine go through all files on the server. There is usually no need to search within binary files or files whose content is of no interest to the user. Besides, revealing the existence of and path to these files may open security holes and assist an attacker. That is why the search should be performed only within a limited number of files, usually HTML or TXT files.

Use the Add Files button to add files to that list or simply drag and drop them in the list area.

Using the Remove File button you can remove a file from that list.

The Clear List button clears the list of all its content.

The Up and Down buttons control the order in which the files are searched. Although all results will be shown, you might choose to have some files searched and listed (if the search string is found) before the others.

The Update List button applies the changes you have made and writes them to the file listing the searchable files and their order. Until this button is pressed, none of your changes take effect.
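
To make the list operations described above concrete, here is a minimal sketch of an ordered search list. It is not IntraSearch's own code: the class SearchList and its method names are hypothetical, and the on-disk format (one path per line, in search order) is an assumption, since the actual format of the list file is not documented here. It also assumes Java 8 or later.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical in-memory model of the search list; the order of entries is the search order. */
final class SearchList {
    private final List<String> paths = new ArrayList<>();

    /** "Add Files": appends a path unless it is already listed. */
    void add(String path) {
        if (!paths.contains(path)) {
            paths.add(path);
        }
    }

    /** "Remove File": removes a single entry. */
    void remove(String path) {
        paths.remove(path);
    }

    /** "Clear List": removes every entry. */
    void clear() {
        paths.clear();
    }

    /** "Up": moves the entry at index one position earlier; does nothing at the top. */
    void moveUp(int index) {
        if (index > 0 && index < paths.size()) {
            Collections.swap(paths, index, index - 1);
        }
    }

    /** "Down": moves the entry at index one position later; does nothing at the bottom. */
    void moveDown(int index) {
        if (index >= 0 && index < paths.size() - 1) {
            Collections.swap(paths, index, index + 1);
        }
    }

    /** "Clean up list": drops entries whose file no longer exists; returns how many were dropped. */
    int cleanUp() {
        int before = paths.size();
        paths.removeIf(p -> !new File(p).isFile());
        return before - paths.size();
    }

    /** "Update List": writes the entries to listFile, one per line (assumed format). */
    void update(Path listFile) throws IOException {
        Files.write(listFile, paths, StandardCharsets.UTF_8);
    }

    /** Read-only view of the entries in search order. */
    List<String> entries() {
        return Collections.unmodifiableList(paths);
    }
}
```

In this sketch nothing reaches the disk until update is called, which mirrors the rule that changes take effect only after Update List is pressed.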

Some of these commands are also available from the main menu. By selecting Tools -> Preferences from the main menu you can set additional preferences:

Item                                   Description
Web Server Root Directory              The absolute path to the web server root directory. If this is not set correctly, the links in the search results will not work.
Site DNS name                          The DNS name of the site the search will be performed for, for example www.yoursite.com
Ignore text in HTML tags               Text in HTML tags is ignored. For example, if a search for the word "table" is performed and the searchable files contain "<table>" tags, the tags will not be counted as matches. (Recommended)
Ignore context between <SCRIPT> tags   Content between <script> and </script> tags will not be searched. It is very unlikely that a site visitor will search for something within your scripts. It is even more unlikely that you will want them to find it :-). (Recommended)
Ignore context between <STYLE> tags    The same as for the <script> tags, except that the ignored content is between <style> and </style> tags. (Recommended)

Make sure that you press the Set button after editing the preferences. Pressing only OK will not save the changes.
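
The effect of the three "Ignore" preferences can be illustrated with a short sketch. This is not how IntraSearch is necessarily implemented: the class HtmlTextFilter is hypothetical, and it uses regular expressions from java.util.regex, a simplification that does not handle every HTML corner case (for example, a ">" inside an attribute value or inside a comment).

```java
import java.util.regex.Pattern;

/** Hypothetical filter that removes the parts of an HTML page that should not be searched. */
final class HtmlTextFilter {
    // (?is): case-insensitive, and '.' also matches line breaks.
    // The reluctant .*? stops at the first closing tag.
    private static final Pattern SCRIPT =
        Pattern.compile("(?is)<script\\b[^>]*>.*?</script\\s*>");
    private static final Pattern STYLE =
        Pattern.compile("(?is)<style\\b[^>]*>.*?</style\\s*>");
    // Any remaining tag: '<' up to the next '>'.
    private static final Pattern TAG = Pattern.compile("<[^>]*>");

    /** Returns the text that remains searchable under the given preferences. */
    static String searchableText(String html, boolean ignoreTags,
                                 boolean ignoreScript, boolean ignoreStyle) {
        String text = html;
        // Script and style blocks are removed first, while their tags still mark the boundaries.
        if (ignoreScript) {
            text = SCRIPT.matcher(text).replaceAll(" ");
        }
        if (ignoreStyle) {
            text = STYLE.matcher(text).replaceAll(" ");
        }
        if (ignoreTags) {
            text = TAG.matcher(text).replaceAll(" ");
        }
        return text;
    }

    public static void main(String[] args) {
        String page = "<table><tr><td>A table of data</td></tr></table>"
                    + "<script>var table = 1;</script>";
        // With all three options on, only the visible text remains searchable.
        System.out.println(searchableText(page, true, true, true).trim());
    }
}
```

Each removed span is replaced by a space so that words on either side of a tag do not run together.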

Search Tab

When you are done setting the search files and the search preferences, try a search. Use the Search tab to test the search process, or to locate a string in files on your computer when no remote access is needed.

Two checkboxes allow you to customize your search.

Ignore Case - Looks for the search string regardless of case.

Whole Word - Reports the found search string only if it is a separate word. If you are looking for "but" and this checkbox is checked, "button" will not be a match; if the checkbox is not checked, it will.
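
The two options can be sketched as follows. This only illustrates the behaviour described above and is not IntraSearch's actual matching code: the class WordSearch is hypothetical, and treating any character that is not a letter or digit as a word boundary is an assumption.

```java
import java.util.Locale;

/** Hypothetical matcher illustrating the Ignore Case and Whole Word options. */
final class WordSearch {

    /** Returns true if needle occurs in text under the given options. */
    static boolean contains(String text, String needle, boolean ignoreCase, boolean wholeWord) {
        if (needle.isEmpty()) {
            return false;
        }
        // Ignore Case: compare lower-cased copies of both strings.
        String haystack = ignoreCase ? text.toLowerCase(Locale.ROOT) : text;
        String target = ignoreCase ? needle.toLowerCase(Locale.ROOT) : needle;

        int from = 0;
        while (true) {
            int at = haystack.indexOf(target, from);
            if (at < 0) {
                return false; // no further occurrences
            }
            // Whole Word: the characters just before and just after must not be part of a word.
            if (!wholeWord
                    || (isBoundary(haystack, at - 1)
                        && isBoundary(haystack, at + target.length()))) {
                return true;
            }
            from = at + 1; // keep looking after this partial match
        }
    }

    /** A position is a boundary if it lies outside the string or holds a non-letter, non-digit. */
    private static boolean isBoundary(String s, int i) {
        return i < 0 || i >= s.length() || !Character.isLetterOrDigit(s.charAt(i));
    }

    public static void main(String[] args) {
        String text = "Press the button";
        System.out.println(contains(text, "but", false, true));  // whole word: "button" is not a match
        System.out.println(contains(text, "but", false, false)); // substring: "button" is a match
    }
}
```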

Try a search and, if it works well, move on to trying it through the web.

Point your browser to <server_host>:<server_port>/search and try a search. You should get a page listing the search results.

If you did, then IntraSearch is installed and set up correctly. For future searches, you may either direct your site visitors to the search page you used or, more likely, integrate a search area somewhere within your web pages.

Embedding Search in the Web Pages

Take a look at indexServlet.html and indexJSP.html. Choose whether you are going to work with the servlet or the JSP, and open the corresponding file. Basically, everything between the <form> tags needs to be copied into your web page at the place where you want the search area to appear. Of course, you can modify the look of the form components; just make sure that all of them are present and that the properties of the form and form objects are retained. If you don't want some of the checkboxes, I would think that you could change their type to hidden and set the value to the one you want. This should work, but I haven't tried it, so you are on your own here.

Credits and Support

The author of this program is Kaloian Gorolomov. I would welcome any suggestions for improving IntraSearch, as well as any bug reports. As this is FREEWARE, I cannot provide any support at this time unless you are willing to pay for it.

I hope you find this program useful.

e-mail: k_gorolomov@yahoo.com

Disclaimer

This program is FREEWARE and you are using it at your own risk. The author cannot be held responsible for any damages, direct or otherwise, resulting from the use of this software.