The Caprina Image Indexing Experiment
| 001 | 002 | 003 | 004 | 005 | 006 | 007 | 008 | 009 | 010 |
| 011 | 012 | 013 | 014 | 015 | 016 | 017 | 018 | 019 | 020 |
| 021 | 022 | 023 | 024 | 025 | 026 | 027 | 028 | 029 | 030 |
| 031 | 032 | 033 | 034 | 035 | 036 | 037 | 038 | 039 | 040 |
| 041 | 042 | 043 | 044 | 045 | 046 | 047 | 048 | 049 | 050 |
| 051 | 052 | 053 | 054 | 055 | 056 | 057 | 058 | 059 | 060 |
| 061 | 062 | 063 | 064 | 065 | 066 | 067 | 068 | 069 | 070 |
| 071 | 072 | 073 | 074 | 075 | 076 | 077 | 078 | 079 | 080 |
| 081 | 082 | 083 | 084 | 085 | 086 | 087 | 088 | 089 | 090 |
| 091 | 092 | 093 | 094 | 095 | 096 | 097 | 098 | 099 | 100 |
| 101 | 102 | 103 | 104 | 105 | 106 | 107 | 108 | 109 | 110 |
| 111 | 112 | 113 | 114 | 115 | 116 | 117 | 118 | 119 | 120 |
We are trying an experiment with Caprina to address the problem of generalized, text-based searching for images. It is generally not too hard to devise a scheme for indexing and retrieving images in a single discipline (for example, art history, medicine, or botany) by defining a set of data fields pertinent to that category of image collection. However, no one has yet devised an effective scheme for indexing and retrieving a multidisciplinary image collection.
Web search engines (e.g., HotBot, Yahoo, AltaVista) successfully index millions of disparate Web pages by their content and support both simple and complex queries against this vast data collection. If all Web pages can be indexed, then a subset of Web pages that represent images can certainly be indexed as well. Therefore, we are creating a simple Web page for each image in the Caprina collection (currently, in 1998, about 15,000 images) for which we have descriptive information.
The Caprina image collection is organized on the server in groups of about 100 images. Each group (directory) holds the images from one Kodak Photo-CD; we have our images commercially digitized onto Photo-CD, which gives us a good-quality archive as well as high-resolution images. Within each directory, the images are sequentially numbered. The Web pages for the images are generated automatically. Since the goal is to be able to find these pages once they have been indexed by a search engine, they are quite simple. The array at the top of this page contains the links to the index pages. The ones in black currently have the text, but their images are still being copied from the CDs to the server. [Note: These pages rely heavily on JavaScript.]
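
To make the idea of one simple, automatically generated page per image concrete, here is a minimal sketch in Python. It is not the generator actually used for Caprina. The function name `image_page`, the three-digit directory and file naming, and the `.jpg` extension are all assumptions made for illustration.

```python
from html import escape


def image_page(group: int, number: int, description: str) -> str:
    """Return a minimal HTML page for one image (illustrative only).

    group       -- Photo-CD directory number (hypothetical 3-digit naming)
    number      -- sequential image number within that directory
    description -- free-text descriptive information for the image
    """
    # Assumed identifier scheme: directory number / image number.
    image_id = f"{group:03d}/{number:03d}"
    # Escape all text so that characters such as & or " stay valid HTML.
    title = escape(f"Caprina image {image_id}")
    body = escape(description)
    return (
        "<html>\n"
        f"<head><title>{title}</title></head>\n"
        "<body>\n"
        f"<h1>{title}</h1>\n"
        # Assumed file name: the image sits next to its page as NNN.jpg.
        f'<img src="{number:03d}.jpg" alt="{body}">\n'
        f"<p>{body}</p>\n"
        "</body>\n"
        "</html>\n"
    )
```

Because the descriptive text appears as plain page content (title, heading, and paragraph), a general-purpose search engine can index it like any other Web page, which is the point of keeping the pages simple.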