--------


Program:  Quranref (built on top of Bibleref, a Bible concordance
	  program archived on cs.arizona.edu [icon/contrib/bible-
	  ref-2.1.tar.Z])

Purpose:  Perform word and passage-based retrievals on M. H.
	  Shakir's translation of the Holy Qur'an.

Language: Icon (ftp-able from cs.arizona.edu [icon/interpreter])

Files:    qur2rtv.icn bibleref.src convertb.icn listutil.icn
          name2num.icn passutil.icn readfile.icn ref2bmap.icn
          srchutil.icn

Requires: Icon version 8, a working, up-to-date Icon Program Library
	  (see below on IPATH), and M. H. Shakir's translation of the
	  Holy Qur'an, as archived by Cary Maguire on princeton.edu
	  (pub/Quran.tar.Z).


--------


Overview:

	This package, Quranref, offers simple tools for word and
passage-based access to M. H. Shakir's translation of the Holy Qur'an.
Quranref is quick, and fairly easy to install (assuming you possess
the machine readable text and a sufficiently powerful machine, and know a
little about Icon).  It will also run with stock terminals - even
nasty old ones that leave magic cookies on your screen.  Quranref
will, however, put a bit of a dent in your mass storage resources.
Your 900k or so Qur'an text will get block Huffman encoded, which will
bring it down to about 500k.  The freed space, however, will be
gobbled up immediately by some 500k of auxiliary files, and by the
180k executable (more if you compile, rather than interpret).  In-core
requirements for the executable start at about 300k, and go up from
there (if your searches are complex enough, you could easily eat up a
megabyte or two).  In brief: Quranref enjoys dining on your memory
resources.  Once set up, though, it can operate with fairly minimal
impact on the CPU.
	With Quranref, you can perform most of the more basic,
low-level functions commercial browsing packages offer (and perhaps a
few not found in some of the commercial Qur'an study packages).  You
can, for example,

	-  retrieve any passage by section:verse number
	-  move forward or backward relative to the retrieved passage
	-  search the entire Qur'an for words and/or word-patterns
	-  search for word co-occurrences (or the absence thereof)
	-  save passages and/or passage-lists for use with an editor

Although this program is hardly the product of any major research
effort :-), it should prove sophisticated enough for casual use.  Its
main fault right now is that it relies on a newly scanned text which
is positively rife with errors.  The high number of errors is nothing
unusual for a text put into machine readable form by current OCR
technology, so no insult is intended to the people who rendered us
this great service.  It is merely a fact about the current
distribution that should be noted before attempting any serious
research.


--------


Installation:

	The following set-up and installation instructions tacitly
assume that you received Quranref as a shell archive, with no prebuilt
data files.  If you snarfed the distribution from an ftp site with the
data files already built, you can skip all the instructions about
obtaining and indexing the Princeton scan of M. H. Shakir's Qur'an
text.  If there's any doubt over whether your data files are prebuilt,
check to see if the file "index.done" exists in your Quranref source
directory.  If it does, then your data files have already been built.
Otherwise, you'll need to do a "from scratch" installation.
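	The check just described can be sketched in sh.  Only the file
name "index.done" comes from this distribution; the function name
"data_prebuilt" is a hypothetical one chosen for illustration.

```shell
# Sketch only: report whether a Quranref source directory already
# holds prebuilt data files.  "data_prebuilt" is a hypothetical name.
data_prebuilt() {
    # index.done is the marker file named in the text above
    [ -f "$1/index.done" ]
}

# Example use, run from the Quranref source directory:
if data_prebuilt .; then
    echo "index.done found: data files are already built"
else
    echo "no index.done: plan on a from-scratch installation"
fi
```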
	In brief, the setup process consists of the following steps
(those with prebuilt data files only need to do the starred ones):

	*  make sure Icon is installed & your IPATH variable is set
	-  ftp the Qur'an from princeton.edu, and unarchive it
	-  link the Qur'an files to the dir where you unpacked Quranref
	*  modify the makefile to suit your machine
	*  make

Although I will discuss it below briefly, installation of the Icon
programming language falls outside the scope of this manual.  I will
therefore begin setup instructions with directions on how to get the
necessary machine-readable text, and on how to set this text up for
indexing.
	In order to obtain the Qur'an text, you must ftp the necessary
files from princeton.edu.  If you have direct internet connections,
just type "ftp princeton.edu," and when you are asked for a login
name, type "anonymous."  Type your e-mail address as your password,
and you will be logged in.  Change to the "pub" directory by typing
"cd pub."
To retrieve the Qur'an package, type "binary" or "type binary," and
then request a transfer by typing "get Quran.tar.Z".  When the server
lets you know the transfer is done, type "bye."  Back at your local
UNIX machine, unpack the Qur'an files by typing "zcat Quran.tar.Z |
tar -xf -" (this will set up a "Quran" directory, and fill it with the
appropriate files).  Once you've unpacked the Qur'an files, change
your current directory to the one you unpacked this Quranref
distribution in, and then link or copy the Qur'an files you just
unpacked to the current directory (i.e. you must type "ln [-s]
Quran-directory/* ."; in place of "ln" or "ln -s" you can also use
"cp").
	At this point, UNIX novices will probably need some technical
help, since installing Quranref requires modifying the set-up files,
and assumes you know at least something about the UNIX file structure,
makefiles, and operating environment.
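	For those who would rather script the linking step described
above, here is a minimal sketch.  It assumes symbolic links are wanted
("ln -s"), and "link_quran_files" is a hypothetical helper name, not
something shipped with Quranref.

```shell
# Sketch only: symlink every file from the unpacked Qur'an directory
# into the Quranref source directory.
# Usage: link_quran_files QURAN-DIRECTORY QURANREF-DIRECTORY
link_quran_files() {
    src=$1
    dest=$2
    # Resolve the source to an absolute path so the links stay valid
    # no matter which directory they are later read from.
    abs_src=$(cd "$src" && pwd) || return 1
    for f in "$abs_src"/*; do
        [ -e "$f" ] || continue     # source directory was empty
        ln -s "$f" "$dest/" || return 1
    done
}
```

After the ftp transfer, the two steps would then look something like

	zcat Quran.tar.Z | tar -xf -
	link_quran_files Quran .
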
	When finished linking your files to the Quranref source
directory, locate the Makefile.dist file included with this package
and read it.  Copy it to "Makefile."  If you find you must modify any
of the user-configurable sections, then edit them as you see fit.  Pay
special attention to the RAWFILES variable.  Also, be sure to change
the DESTDIR, LIBDIR, USER and GROUP variables to appropriate values
for your machine.  If you are doing a "from scratch" installation,
check to be sure that the variable CAT is set to "zcat" if you have
compressed the 114 basic text files, or to "cat" if you have not.
	Assuming you've modified the makefile correctly, you can then
simply type "make."  If you are installing from scratch, then you
might as well have lunch.  The entire reformatting, collating, coding,
compressing, and indexing process takes as much as a full three hours
on a medium-range workstation.  Don't even *think* of indexing on a
system with less than 5 meg of core memory and/or a lot of swap space.
If you are really savvy about Icon storage management, you can try
setting your regions to have larger values than the defaults.  This
won't cause the indexing program to take up any less RAM.  It may,
however, decrease the amount of time it takes to index.
	If you find you get messages like "out of space in X region"
or "system stack overflow," then you are probably using an Icon
implementation that doesn't know how to make its own special block and
string regions bigger (e.g. on the NeXT it can't, because Mach doesn't
know about sbrk()).  If the make aborts with such a message, you will
need to set the sizes manually.  Try starting a new shell, and then
typing (assuming [ba|k]sh):

	HEAPSIZE=5000000
	STRSIZE=1000000
	export HEAPSIZE STRSIZE

If you are using (t)csh, then type

	setenv HEAPSIZE 5000000
	setenv STRSIZE 1000000

You may need to increase STRSIZE.  Can't hurt, if you have the memory,
to set it to some ridiculous amount like 5000000.  Otherwise, start
low, and creep up to whatever minimum amount seems needed.  When done,
back out of the current shell, or unsetenv these variables.
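	The "start low, and creep up" approach can be automated along
the following lines.  This is only a sketch for [ba|k]sh users: the
helper name "grow_strsize" and the doubling strategy are assumptions,
and HEAPSIZE still has to be exported separately as shown above.

```shell
# Sketch only: rerun a command with STRSIZE doubled after each
# failure, up to a ceiling.  "grow_strsize" is a hypothetical helper.
# Usage: grow_strsize START MAX command [args...]
grow_strsize() {
    size=$1
    max=$2
    shift 2
    while [ "$size" -le "$max" ]; do
        # The assignment is handed to this one command only (when it
        # is an external program such as make).
        if STRSIZE=$size "$@"; then
            echo "succeeded with STRSIZE=$size"
            return 0
        fi
        size=$((size * 2))
    done
    echo "still failing at the STRSIZE ceiling of $max" >&2
    return 1
}
```

A call such as "grow_strsize 1000000 8000000 make" would try 1000000,
2000000, 4000000 and 8000000 in turn.  Since a from-scratch make is
slow, first check that the failure really is an "out of space"
message; the loop cannot tell that apart from any other make error.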
	Once you've gotten past the indexing "hump" (which is quite a
hump for people starting from scratch), you are free to install.  Su
root first, if you plan on making the executables public, then "make
install."  Again, be sure the directories and ownership conventions
are set correctly in the makefile.
	Let me emphasize here that Quranref assumes the existence on
your system, not only of an Icon interpreter or compiler, but also of an
up-to-date Icon Program Library.  There are several routines included
in the IPL which Quranref uses.  Make sure the system administrators
(if "they" are not you :-)) have put the IPL online, and have
translated the appropriate object modules.  Set your IPATH (or, for
the compiler, LPATH) environment variable to point to the place where
those modules reside, and you should be in business.  Note that I've
heard of people having problems with old IPL routines silently
failing, resulting in a truncated .IS file.  Suspect you have fallen
victim to this problem if you get through the installation process
fine, and then start Quranref up, only to have it terminate with an
error message and a trace reporting that a null expression was
received in place of a record.
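	Before running make, it may be worth confirming that IPATH
actually points somewhere.  The sketch below assumes that IPATH holds
a blank-separated list of directories; "check_ipath" is a hypothetical
helper, and it only checks that the directories exist, not that the
modules inside them are current.

```shell
# Sketch only: confirm that IPATH is set and that each directory it
# names exists.  Assumes a blank-separated list of directories.
check_ipath() {
    if [ -z "${IPATH:-}" ]; then
        echo "IPATH is not set" >&2
        return 1
    fi
    status=0
    for dir in $IPATH; do          # unquoted on purpose: split on blanks
        if [ ! -d "$dir" ]; then
            echo "IPATH names a missing directory: $dir" >&2
            status=1
        fi
    done
    return $status
}
```

With a made-up library location, usage might look like

	IPATH=/usr/local/lib/icon/ipl
	export IPATH
	check_ipath
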
	Any IPL material that I've written myself has been packed with
Quranref, by the way.  I'm constantly tweaking these programs, and I
just want to be sure everyone has the latest version.  IPL programs
written by other people, though, have not been built into the Quranref
distribution.  If you don't have the IPL, ftp it from cs.arizona.edu
along with the full Icon distribution, and get the updates as well
(~ftp/icon/library).  If all else fails, write to me and I'll package
up the necessary routines for you :-(.  In general, the best solution
is simply to keep your Icon run-time and support systems in sync with
the latest revisions from the U. of Arizona.  They are public domain,
after all, and the installation is pretty simple.


--------


Running Quranref:

	There is precious little in the way of documentation on
running Quranref.  This is it.  I wrote the program originally as a
test wrapper to put around some software I'm using for personal
research purposes.  In that incarnation it functioned as a Bible
research program for Christians and Jews.  That package was tried out
here at home by my wife and 5 year-old.  The 5 year-old likes to break
programs I write.  Thanks to him, the visual interface had all of the
obvious bugs worked out.  Others have checked other features, and have
compared retrieval lists to the listings in paper concordances, and
have found them to be complete.  This program - Quranref - is much
less well-tested.  It's a younger program, and the text it's based on
has not been put through a rigorous editing process yet.  Expect the
usual growing pains.
	When it is first invoked, Quranref will print a little message
about initializing one or another of its auxiliary files.  After
several seconds, this message will disappear - replaced by a prompt
which asks for a passage reference or "f" (to find a word or
word-pattern).
	If you are interested in looking at a specific passage in the
Qur'an, type it in.  Use the section or chapter number, followed by
the passage or verse number (as in "1:3" or "2:201").  If the passage
doesn't exist, then you'll get an error message (note that some
passages appear to be missing due to scan errors and missing newlines
in the scanned text).
	If you want to use Quranref in order to locate a word-pattern,
rather than a passage, type "f" and press return at the main command
loop.  Just about every command in Quranref is invoked by typing a
single letter + a carriage return.  It's simple and consistent, and
lets me keep the terminal in its normal mode.  If at any time you must
type in additional input, Quranref will ask you for that input.  The
"f" (find) command is one such case.
	After you press f+return, Quranref will ask you to input a
word.  This "word" can be a simple string.  It can also be an
egrep-style pattern:

	(dis|un)?believers?
	dr[iau]nk.*
	parable.*
	etc.
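
	To experiment with patterns like these outside Quranref,
standard grep can stand in, provided the -x option is used so that a
pattern has to cover a whole word.  This is an analogy only, not a
description of Quranref's internals, and "match_whole_words" is a
hypothetical name.

```shell
# Sketch only: read one word per line on stdin and print the words
# that the egrep-style pattern in $1 matches in their entirety (-x).
match_whole_words() {
    grep -E -x -e "$1"
}

# Prints "believers" and "unbelievers", but not "disbelief":
printf '%s\n' believers unbelievers disbelief |
    match_whole_words '(dis|un)?believers?'

# Prints "drink", "drank" and "drunken", but not "dry":
printf '%s\n' drink drank drunken dry |
    match_whole_words 'dr[iau]nk.*'
```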

Note that the pattern you specify must match words in their entirety.
For instance

	sacrifice

will only match "sacrifice."  That is, it will cause Quranref to
retrieve only those passages which contain the word "sacrifice."  To
catch "sacrifices," "sacrificed," etc. you need to input a regular
expression which will match this string in full as well (e.g.
"sacrifice.*").
	When you are done typing in a word or word-pattern, Quranref
will ask you if you are finished.  If you are, then press the
appropriate key ("f"), then hit a carriage return.  In a couple of
seconds, you should see a list of passages which contain the word or
pattern which you specified (if in fact any such passages were found).
Along with this list comes a new set of options, including ! (escape
to a shell), a (append the current list to a file), b (back up), c
(clear and redraw screen), m (view next screen), and several others.
Try the various options out and see what they do.  Two commands not
listed in the prompt string (which has to be of finite length!) are
"?" and "/."  With some differences, these do pretty much what UNIX
users expect.  If, say, you have a passage list containing 2000 hits,
and you want to find the section where references from the chapter
"Women" begin, then type in "/4:" (traditionally, this chapter is
number 4; "4:" will match everything from 4:1 on).
	While viewing a passage here, or at the top level, you may
jump to an arbitrary location by typing in a reference (e.g. "4:5").
You may also move to the next passage in the text (+), or move to the
previous one (-).  Finally, you may also either write (w) or append
(a) the current passage to a file.  Again, try out the various
options.  When you are done, type "q" and press return.  You will be
brought back to the previous menu, where you can *v*iew another
passage, or quit.
	At the main menu, you can invoke several additional functions
beyond retrieving specific passages or passage lists.  Quranref keeps
a resume of all passage lists you have retrieved, storing it in a
globally accessible structure.  You can look at this structure by
typing "l" (for "list") at the main menu prompt.  From the resulting
display menu, you can then *v*iew any of your previous passage lists
(i.e. "hit lists" resulting from invocation of the *f*ind function).
When you are done, you can once again press "q"+return, and go back to
the top-level command loop.
	From most of the display menus you can write or append
passages or passage lists to files.  If at any time you wish to reread
such files back into Quranref, type "r" at the main command prompt.
You will then be prompted for a filename.  If the file you name has
not been corrupted somehow, and is in the right format, Quranref will
read it in and display it just the way it would the results of a
*f*ind operation.
	A final basic function I'd like to mention here is Quranref's
facility for manual display of passage lists.  As was mentioned a few
paragraphs before, your passage lists are all kept in memory, and
you can view the list of available ones by typing "l" at the main
prompt.  I'd just like to point out that if this seems too cumbersome,
you can simply type "d" and return.  Quranref will retrieve and
*d*isplay the last passage list you created.
	One somewhat under-tested aspect of Quranref's search facility
is its ability to handle ranges and Boolean operators in search
specifications.  You can, for instance, execute a *f*ind, using a
pattern such as "unbelieve.*".  When asked if you are finished, you
can then respond, not with the normal "f," but rather with an "a" (for
"and").  This tells Quranref that you want to perform an intersection
with respect to another set of passages.  After you type "a" and hit
return, Quranref asks you for a unit (c = chapter, v = verse).
Normally you would press "v."  You are then asked for a range
(normally 0).  After entering a unit and range, you would enter a new
word or pattern to look for, and then press "f" to tell Quranref you
are finished.  What Quranref would retrieve in this instance is a list
of all verses which contain both of the words, or
word-patterns, that you specified.  Note that if you had entered 1 as
your range, you would have gotten a list of all passages containing
word(s) matching the first pattern and which either contain, *or are
adjacent to another passage containing*, a word matching the next
pattern you specified.
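	The effect of a verse-based "and" with a range can be imitated
on two saved hit lists, one "chapter:verse" reference per line.  The
sketch below is an illustration, not Quranref's own code: "and_within"
is a hypothetical name, and it assumes that "adjacent" means verse
numbers within the same chapter that differ by no more than the range.

```shell
# Sketch only.  Usage: and_within RANGE LIST_A LIST_B
# Prints each reference in LIST_A for which LIST_B holds the same
# verse, or a verse of the same chapter at most RANGE verses away.
and_within() {
    awk -F: -v range="$1" '
        # First file read (LIST_B): remember every reference in it.
        FILENAME == ARGV[1] { in_b[$1 ":" $2] = 1; next }
        # Second file read (LIST_A): look for a close enough partner.
        {
            for (d = -range; d <= range; d++) {
                key = $1 ":" ($2 + d)
                if (key in in_b) { print; break }
            }
        }
    ' "$3" "$2"
}
```

If list A held 1:3, 2:10 and 2:201, and list B held 1:3, 2:11 and 5:1,
then a range of 0 would give just 1:3, while a range of 1 would give
1:3 and 2:10 (since 2:11 is adjacent to 2:10).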
	In addition to "a" ("and"), Quranref also accepts "o" ("or")
and "n" (and-not) directives.  Also, words and patterns preceded by an
exclamation point and a space ("! ") are inverted (a la egrep -v).  I
would not recommend using the "! " much, though.  It is slow, and
usually brings about massive hit lists.  If you want, say, all
passages containing the word "woman" that don't contain the word
"child," then formalize your search as "woman" and-not "child," rather
than as "woman" and "! child."  The only thing slower than a search
for "woman" and "! child" would be to look for "woman" together with
the words "the" and "child."  There are about 4100 passages containing
"the," and although I've used a cute trick to reduce the number that
have to be stored, retrieving them all is still a mess (takes almost
30 seconds on my machine).
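
	In set terms, "or" is a union of two hit lists and "and-not" is
a difference.  A sketch on saved lists (one reference per line)
follows; the function names are hypothetical, and nothing here claims
to mirror how Quranref stores its lists.  It does show why and-not is
the cheaper formulation: it needs only the two ordinary hit lists,
whereas "! child" must first produce every passage lacking "child."

```shell
# Sketch only.  hits_or A B: every reference found in A or in B,
# in order of first appearance, duplicates dropped.
hits_or() {
    awk '!seen[$0]++' "$1" "$2"
}

# hits_and_not A B: references in A that do not appear in B.
hits_and_not() {
    awk 'FILENAME == ARGV[1] { drop[$0] = 1; next }
         !($0 in drop)' "$2" "$1"
}
```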


--------


Additional Notes:

	As mentioned above, this package is really just a wrapper
around a more general set of indexing and retrieval utilities I'm
using for personal research.  Despite the way they are used here,
these utilities are *not* geared solely for the Quran.  In fact, they
are set up so that they can be used with just about any text broken up
into hierarchically arranged divisions.  As noted above, this
distribution is actually built on top of a similar package geared for
Christian and Jewish Bible research.  If you need help integrating a
new text into the retrieve package (e.g. new Quran translations,
'ahadith, biblical texts, etc.), drop me a line.  If I'm not busy, and
the job looks to be one I can help you out with, I'll be glad to do
so.  If nothing else, I can at least get you started, and offer
pointers on how to proceed.  If, though, you don't have M. H.
Shakir's Quran translation, and can't ftp from the location specified
earlier on in this document (i.e. princeton.edu), please *DON'T* write
to me asking me to e-mail you the files, or to package them up on
disks.  I've been flooded with requests for biblical texts already,
and can't reasonably oblige them all.
	It is with some reservation that I mention here several
features Quranref possesses that I haven't fully documented.  Most are
ones 1) that I'm not likely to continue supporting, 2) that haven't
been tested, 3) that are too slow to be practical, or 4) that are
likely to change.  First of all, the "d" command can take a number
argument, which causes it to display the list whose position in the
global list of lists corresponds to that number.  On the top level
(and in some other places), the "!" command can also pass arguments to
a shell (/bin/sh, or the value of your SHELL environment variable).
Also, if you type "f lord god" at the main prompt, and press return,
you'll get a list of verses containing the words "lord" and "god"
(in other words, Quranref will perform a verse-based, range 0 "and"
on the respective hit lists for these two words).  Finally, when
browsing search lists, you can look at the first line of each verse by
typing "l" and return (typing l+return again turns this feature off).
	While I don't want to hide the existence of these marginal
features, I don't want to encourage anyone to expect their presence in
later versions, or to suggest that they will work properly in the
current one.  I'm particularly worried about the "l" and "f lord god"
examples above.  The "l" listing option is very slow.  Also, telling
people that "f lord god" is okay might lead them to think that
Quranref has a concept of word order within verses.  In fact, this is
just an alternate way of performing a set intersection on the hit
lists for "lord" and "god."  If you use "undocumented" features such
as these, be aware that there may be difficulties inherent in their
use, and that, in general, I've avoided mentioning them until now
precisely because I'm not quite sure they are worthy of mention in the
first place.


--------


Problems:

	Doubtless you will find problems, more options not discussed
in the documentation, and just general indications that this program
was written late at night after I was done with all my serious work
for the day :-).  If - no, when - this happens, I encourage you to
drop me a line.  I'd like to know about any flaws you run into,
especially major, systemic ones.
	I really hope that the bugs will not prove too
annoying, and that the package will prove generally useful to you the
user, and, if you place it in a public directory, to anyone else who
might happen to try it out.


   -Richard L. Goerwitz              goer%sophist@uchicago.bitnet
   goer@sophist.uchicago.edu         rutgers!oddjob!gide!sophist!goer
