Notes on setting up the exercises

The exercises consist of a set of HTML pages and accompanying example data. In order for the exercises to run smoothly, it is essential that they be set up correctly so that the user will not become confused by inconsistencies between the notes and what actually happens when they attempt to follow the instructions.

It is anticipated that the user will be browsing the HTML pages as they work through the exercises. It is highly recommended that the user also be provided with a hard copy of the exercises (by printing out the pages from the browser) in addition to being able to view them online. At a pinch, it should be possible to work through the exercises purely from the hard copy without an HTML browser, but this would remove the hypertext links available for help on particular topics and in practice generally leads to other problems as well.

For the exercises to work correctly, three things must be in place: the HTML pages must be available (for browsing or as hard copy, or preferably both), the example data files must be available, and the necessary executables must be available and correctly configured. The HTML pages and the master copies of the example data files should be available to the user as read-only files, usually on a central server. The user also needs a read-write directory in which to carry out the exercises, and this should initially contain copies of the example files with read-write access enabled. Additionally, all the necessary programs must be available to the user, again usually from a central server.

Setting up the HTML pages

The HTML pages should all be set up on a local machine in a directory with read-only access for all users. (Although the pages are all available on our web site this method of access is intended only to present an impression of the exercises and not for people to actually use them on-line.) As well as linking to each other, the pages also link to example files which should be available in subdirectories original, edited and edited2 of the main directory holding the HTML files. Having these example data files available here means that the user can always download new copies of the example files if their own become corrupted or if they wish to skip some of the exercises and hence do not produce some of the intermediate files required for the exercises.

The HTML browser should be set up with a bookmark to the contents page of the exercises, contents.html. It would be desirable to set the default downloading directory (i.e. the browser working directory) to be the main directory in which the user will be working through the exercises, to make it easier to obtain new copies of the example data files if these are required. The browser could be set up to have the contents page of the exercises as the home page, but there is no real necessity for this. Bookmarking the contents page, however, is important.

Setting up the HTML and example files correctly should be readily accomplished by unzipping the zip file provided, taking care to restore the directory structure (unzip or pkunzip -d). When unzipping on a Unix platform, the filenames should be converted to lower case if necessary. Also, text files should be treated as such. All files are text files except those with extension .dbf and .gif, which should be unzipped as binary files. (If any files with extension .idc slip through then these must also be unzipped as binary files, but they should be created during the exercises rather than being provided from the start.)
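Where unzip or pkunzip is not to hand, the same unpacking could be sketched in Python. This is only an illustration under stated assumptions: the function name extract_exercises is hypothetical, and "treating text files as such" is taken here to mean converting DOS line endings (CR LF) to Unix line endings (LF), which may not be what is needed on every platform. The list of binary extensions comes from the paragraph above.

```python
import zipfile
from pathlib import Path

# Extensions the notes identify as binary; every other file is treated as text.
BINARY_EXTENSIONS = {".dbf", ".gif", ".idc"}


def extract_exercises(zip_path, dest_dir, to_unix_newlines=True):
    """Unpack the archive, keeping its subdirectories and lower-casing names.

    Hypothetical helper, not part of the supplied files. Returns the list of
    paths written.
    """
    dest = Path(dest_dir)
    written = []
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            # Lower-case the whole relative path so that links to
            # original/, edited/ and edited2/ resolve on Unix.
            relative = Path(info.filename.lower())
            # Ignore entries that would land outside dest_dir.
            if relative.is_absolute() or ".." in relative.parts:
                continue
            target = dest / relative
            target.parent.mkdir(parents=True, exist_ok=True)
            data = archive.read(info)
            # Assumption: text conversion means CR LF -> LF.
            if to_unix_newlines and target.suffix not in BINARY_EXTENSIONS:
                data = data.replace(b"\r\n", b"\n")
            target.write_bytes(data)
            written.append(target)
    return written
```

On a DOS/Windows machine one would pass to_unix_newlines=False so that the text files are left exactly as they are in the archive.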

The files how2os.html and how2run.html describe how to carry out operating system commands and run programs for a variety of platforms (although currently just DOS/Windows). It may be desirable to replace these with modified versions which explain how to carry out these actions only for the platform actually being used, to make the descriptions clearer and more concise.

Setting up the example data files

Each user should have their own copy of these files in a directory for which they have read-write access, and the files themselves must also be read-write. The files required are all those in the original directory of the zip file (i.e. the directory which becomes the original subdirectory of the directory containing the HTML files). Although each user could download the necessary files using the HTML browser, the exercises will only run smoothly if these files are already present in the user's directory at the start. It would be a simple matter to write a little batch or script file which copies the necessary files for a new user so that the exercises are ready to run.

In the course of the exercises the user will create new files in their own directory. If they corrupt any example files they can obtain the original copies by downloading them (using the browser) from the original directory. If they miss out exercises they can obtain the files that would have been created by downloading them from the edited directory. The files necessary for each exercise are listed at the top of the exercise as hotlinks, so that they can be downloaded automatically if required.
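A script for preparing a new user's directory might look something like the following. It is a minimal sketch rather than a supplied tool: the function name prepare_user_directory is hypothetical, and the choice to leave existing files alone (so that a user's edited work is never overwritten) is an assumption about what would be wanted.

```python
import shutil
import stat
from pathlib import Path


def prepare_user_directory(original_dir, user_dir):
    """Copy the example files into user_dir and make the copies read-write.

    Hypothetical helper. Files already present in user_dir are left
    untouched. Returns the list of files actually copied.
    """
    source = Path(original_dir)
    target = Path(user_dir)
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for item in sorted(source.iterdir()):
        if not item.is_file():
            continue
        destination = target / item.name
        if destination.exists():
            # Assumption: never overwrite a file the user may have edited.
            continue
        # copyfile copies contents only, so the read-only status of the
        # master copy is not carried over to the working copy.
        shutil.copyfile(item, destination)
        # Make read-write access explicit on the working copy.
        mode = destination.stat().st_mode
        destination.chmod(mode | stat.S_IRUSR | stat.S_IWUSR)
        copied.append(destination)
    return copied
```

For example, one might call prepare_user_directory with the read-only original subdirectory on the server as the first argument and the user's own working directory as the second.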

Note: Downloading replacement copies of the example files works very smoothly with many HTML browsers. However, at least some versions of Internet Explorer insist on adding the extension .html to files when they are saved. This is incredibly annoying and to my mind a good reason to use a different browser. It is possible to force the correct name to be used by surrounding it with double inverted commas ("").

Setting up the programs

It is crucial for the smooth running of the exercises that all the necessary programs be set up and configured correctly. It is also crucial that compatible versions of the programs be used. It has often been my experience with this kind of exercise that things grind to a complete halt when some unexpected incompatibility suddenly appears. When the exercises are set up, make sure that you work through them all with a system configuration exactly the same as the one the user will have.

All the programs which are to be run from the operating system prompt must be available on the PATH, that is to say that it must be possible to run them just by typing the program name without the full pathname. The exercises do not include any instructions for running programs not on the PATH, and the command files to run the linkage analysis programs will not run if these programs are not on the PATH.
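Whether the programs really are on the PATH can be checked before the users arrive. The sketch below uses the standard library function shutil.which, which looks a name up the same way the operating system prompt would. The function name find_missing_programs is hypothetical, and the list passed in the example is only an assumption made up of program names mentioned elsewhere in these notes, not the complete list of required programs.

```python
import shutil


def find_missing_programs(program_names):
    """Return the names that cannot be run by typing the bare program name."""
    return [name for name in program_names if shutil.which(name) is None]


# Illustrative list only: names taken from elsewhere in these notes.
EXAMPLE_PROGRAMS = ["lcp", "lrp", "quiklink", "table", "vitesse"]

if __name__ == "__main__":
    for name in find_missing_programs(EXAMPLE_PROGRAMS):
        print(f"not on the PATH: {name}")
```

Running this check from the account and machine that the user will actually use is the point of the exercise, since the PATH may differ between accounts.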

Here are the programs which should be available (some are more essential than others):

None of the graph-drawing programs is essential for the exercises. The preferred ones are now Excel for Windows and Ace/gr for Unix. Easigraf only runs in a full-screen MS-DOS window under Windows. For rapid multipoints, vitesse is recommended, but only minor modifications to the exercises would be needed for them to make use of linkmap instead. The LINKAGE package is essential, although at a pinch one could do without lcp and lrp if one were prepared to use quiklink and table instead. It is essential to realise that when running the exercises on a PC, lcp and lrp require ansi.sys to be loaded.

If the exercises are to be run on a PC, then be sure to obtain at least version 1.2 of table, since earlier versions use "/" as their switch character rather than "-". In fact, it is important to make sure that one has the latest available versions of all programs (at least mine) since earlier ones may have bugs or incompatibilities.

Versions of table before 1.7 do not support gnuplot. Otherwise, fairly recent versions of the programs should be OK, but you should run through the exercises to make sure there are no problems.

The programs are available from a variety of sites. My programs, comprising qdb, table, quiklink, pedraw, clump, etdt and easigraf (in estat21.zip) are available from my site at http://www.mds.qmw.ac.uk/statgen/dcurtis.html. These programs are mirrored at the EBI site at ftp://ftp.ebi.ac.uk/pub/software/linkage_and_mapping/gene_ucl_uk/dcurtis/ and this site also mirrors the LINKAGE programs, although for these programs American sites will be more accessible from the US. My homepage may carry later versions of my programs than does the EBI site.

Setting the exercises up for Windows

If the exercises are to be run from Windows, the user needs to be able to run the following programs, and icons for them should be available on the desktop:

The working directory for the first two programs should be set to the directory in which the user will be performing the exercises. In addition, do not forget that the directory containing the executables must be added to the PATH, presumably in the autoexec.bat file. If it is not convenient to do this in the autoexec.bat file then users should be provided with a batch file which they can run themselves to add this directory to the PATH. In order for lcp and lrp to run at all, the config.sys file must contain the line device=ansi.sys. Alternatively, users must be instructed to run dvansi.com each time before running lcp and lrp.

A typical Windows setup would be as follows:

N:\EXERFILE - HTML files, etc.

N:\EXERFILE\ORIGINAL - example files in their original form

N:\EXERFILE\EDITED - modified example files as produced in the course of the exercises

N:\EXERFILE\EDITED2 - additional modified files

N:\EXERFILE\RESULTS - edited files and results from exercises

N:\EXERBIN - executables

C:\LINKEX - directory containing copies of original example files, in which user will perform exercises

Here N: would be a networked drive with all users having read-only access to the relevant directories. Each user's HTML browser would be bookmarked to n:\exerfile\contents.html, and each user's autoexec.bat file would contain a line reading PATH=%PATH%;n:\exerbin or something similar. To set the exercises up to use on a single machine, one would adopt a similar plan, but one would set the necessary files to be read-only using the DOS attributes, setting the copies to be read-write after copying to the user's own \linkex directory.

I would strongly recommend a program called PFE as the text editor, as it is powerful and allows multiple documents to be open at once. It is freeware, excellent, and available from the standard ftp sites. Windows Notepad or WordPad will do at a pinch, although neither is really recommended. Additionally, some versions of Notepad insist on adding the extension .txt to files when they are saved with a different name. Again, this is incredibly annoying. It is possible to force the correct name to be used by surrounding it with double inverted commas (""). The PFE program can be downloaded from the Simtel collection, at http://www.simtel.net/pub/dl/11983.shtml.

If ghostview is to be used as the PostScript viewer then both it and ghostscript need to be installed on each machine, rather than just having the executable available in the directory with the others. Installation is accomplished by running both set-up programs, e.g. by running gs814w32.exe and then gsv46w32.exe.

Setting up the exercises for Unix

This should be similar in principle to the set-up for Windows. The executables, HTML and example files need to be read-only and present in publicly accessible directories, while each user will have their own directory for the exercises which will initially contain the example files from the original subdirectory.

Other systems

Given the universal nature of HTML files, I would expect that it should be a simple matter to set up these exercises on any system for which the appropriate executables could be obtained.

Problems and modifications

Please contact me if you have difficulty setting up the exercises. If you need to make any modifications to the supplied files in order to have the exercises run smoothly on your system then you must inform me of these. I would also appreciate hearing about any difficulties, suggestions for improvements or additions you would like to see. It is intended that once a core set of exercises and data files is established, additional exercises will be contributed from a variety of sources in order to illustrate ideas or techniques which people think of interest, and I will coordinate these contributions while fully acknowledging the contributors.

Exercises in genetic linkage analysis

All material copyright (C) Dave Curtis 1996-2000

dcurtis@hgmp.mrc.ac.uk