Multipoint linkage analysis

Example data files needed for this exercise

Executables required: linkmap, quiklink, table, easigraf (optional), gnuplot (optional), ACE/gr (optional), Excel (optional).

Multipoint analysis

So far, we have observed the results of carrying out a linkage analysis of the disease against each marker locus separately. However, suppose that we knew that the two marker loci were linked to each other. Then we would want to consider both markers simultaneously against the disease. This would be called a three-point analysis, because three loci are involved. In order to compare the lod scores for the disease locus in various positions relative to the marker loci, the linkmap program of the LINKAGE package is used. We will set up a three-point linkmap analysis using quiklink.

Setting up a three-point analysis with quiklink

To see how you can use quiklink to set up a three-point analysis, at the system prompt enter:

quiklink autdom2.ppd autdom2.par

Then in response to the prompt Enter switches or fileroot locus_numbers (blank to finish): type in the following line and press Enter:

d1m1m2 dis1 mar1 mar2

Now quiklink will request a value for the recombination fraction between the two fixed loci, these being MAR1 and MAR2. These two loci are separated by a recombination fraction of 10%, so input:

0.1

Finally, press Enter again to leave quiklink.

To run this new analysis, at the DOS prompt enter:

d1m1m2

or at the Unix prompt:

sh d1m1m2.sh

Using table to examine the linkmap output

To use table to reformat the output from linkmap so that it can be more easily understood, at the operating system prompt enter:

table d1m1m2.res

This should create a file called d1m1m2.tab. Examine this with a text editor. It should appear as follows:

theta     0.400   0.300   0.200   0.100   0.000   0.000   0.020   0.040   0.060   0.080   0.100   0.100   0.100   0.100   0.100   0.100
theta     0.100   0.100   0.100   0.100   0.100   0.100   0.083   0.065   0.045   0.024   0.000   0.000   0.100   0.200   0.300   0.400
cM      -54.931 -34.657 -21.182 -10.137   0.000   0.000   2.001   4.009   6.029   8.069  10.137  10.137  20.273  31.319  44.794  65.067
     4    0.220   0.363   0.419   0.322 -15.051 -99.000   0.765   1.073   1.258   1.394   1.505   1.505   1.276   1.021   0.731   0.396
     7    0.031   0.095   0.124   0.022 -15.352 -99.000   0.332   0.524   0.543   0.389 -99.000 -99.000   0.022   0.124   0.095   0.031
total     0.251   0.457   0.543   0.344 -30.404 -99.000   1.097   1.597   1.801   1.783 -97.495 -97.495   1.298   1.145   0.825   0.427

There are now two lines of recombination fractions, because three loci always have two intervals between them. At first the disease locus lies outside the two markers, so the intervals are between the disease locus and the first marker, and between the two markers. The disease is brought gradually closer to the first marker, so the first recombination fraction diminishes while the recombination fraction between the two markers is kept constant.

Next, the disease is placed between the two markers, so the first recombination fraction is for the interval between the first marker and the disease, and the second is for the interval between the disease and the second marker. As the disease locus is moved from being close to the first marker to being close to the second, the first recombination fraction increases while the second decreases.

Finally, the disease is placed on the other side of the second marker. Now the first interval is between the two markers, so its recombination fraction remains constant, while the second interval is between the second marker and the disease and is gradually increased.

The line under the recombination fractions shows the position of the disease locus relative to the first marker in centimorgans. Recombination fractions can be converted to map distances using a variety of mapping functions, and table uses the Kosambi mapping function for this purpose. The three-point lod scores for each position on the map are then displayed for each pedigree and totalled over both pedigrees.
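The conversion from recombination fraction to map position can be sketched in a few lines of Python. This is only an illustration and not part of table or linkmap: the function names kosambi_cm and flanking_theta are invented here. The first function assumes the standard Kosambi formula, d = 25 ln((1 + 2 theta) / (1 - 2 theta)) centimorgans. The second assumes no interference between the two intervals, which appears consistent with the second line of recombination fractions in the table but is an inference from those numbers, not a documented statement about linkmap.

```python
import math


def kosambi_cm(theta):
    """Map distance in centimorgans for recombination fraction theta (Kosambi)."""
    return 25.0 * math.log((1.0 + 2.0 * theta) / (1.0 - 2.0 * theta))


def flanking_theta(theta1, theta_markers):
    """Recombination fraction for the second interval, assuming no interference.

    Solves theta_markers = theta1 + theta2 - 2 * theta1 * theta2 for theta2.
    """
    return (theta_markers - theta1) / (1.0 - 2.0 * theta1)


# Disease locus placed between the markers, which are 0.1 apart.
for theta1 in (0.02, 0.04, 0.06, 0.08):
    theta2 = flanking_theta(theta1, 0.1)
    print(f"{theta1:.3f}  {theta2:.3f}  {kosambi_cm(theta1):7.3f} cM")
```

The printed values can be compared with the columns of the table in which the disease lies between the two markers. For positions to the left of the first marker the map distance is shown with a negative sign, and for positions to the right of the second marker it is added to the distance between the markers.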

Graphing the multipoint lod scores using gnuplot

If you are not using gnuplot as your graph-drawing program, you can skip to the relevant instructions for easigraf, ACE/gr or Excel, or omit these graph-drawing instructions altogether if none of these programs is available.

If you have the gnuplot program available, you can see a graphical representation of the lod scores because table can automatically prepare graph files which can be read in by gnuplot. In order to do this, at the operating system prompt issue the following command:

table d1m1m2.res -p

This command instructs table to produce graph files called d1m1m2.plt and d1m1m2.gda, as well as a table of lod scores in d1m1m2.tab. (The graph files are simple text files containing plotting commands and data in a form which gnuplot can understand. You can examine these files with a text editor if you wish, although there is no need to do so.) In order to have gnuplot display this graph, run gnuplot and select File, Open, and choose the file called d1m1m2.plt.

You should see the multipoint lod scores displayed, although the scale of the graph is not ideal. To fix this, select Axes, Y change and enter -2 as the lower bound and 3 as the upper bound. Then select Plot, Replot (or just click on the Replot button). The graph should appear approximately as shown below.

Graphing the multipoint lod scores using easigraf

If you are using MSDOS and have the easigraf program available, you can see a graphical representation of the lod scores because table can automatically prepare a graph file which can be read in by easigraf. In order to do this, at the DOS prompt issue the following command:

table d1m1m2.res -g

This command instructs table to produce a graph file called d1m1m2.grp, as well as a table of lod scores in d1m1m2.tab. (The graph file is a simple text file containing the data in a form which easigraf can understand. You can examine this file with a text editor if you wish, although there is no need to do so.) If you are running under Windows, then before you attempt to run easigraf it is very important that you set the display to use the full screen by switching to the MS-DOS Prompt window and pressing Alt-Enter. It is equally important that you quit easigraf (by pressing Q) before you switch the display back to a window or switch to another program. In order to have easigraf display this graph, at the DOS prompt enter:

easigraf d1m1m2.grp

You should see the multipoint lod scores displayed, although the scale of the graph is not ideal. To fix this, press a to bring up the axes menu, press 5 and 6 to switch on the X and Y scales and press 7 to adjust the scales. Then press Enter to leave the X scale unchanged, but backspace over the parameters for the Y scale and enter instead -2,3,1 to produce a scale which ranges from -2 to 3, with ticks every 1 lod unit. Press Enter twice to return to the graph. You can press M to switch off the menu display. When you have finished looking at the lod score graphs press Q to leave easigraf. The graph should appear approximately as shown below.

Graphing the multipoint lod scores using ACE/gr

If you have the ACE/gr program available, you can see a graphical representation of the lod scores because table can automatically prepare graph files which can be read in by ACE/gr. In order to do this, at the operating system prompt issue the following command:

table d1m1m2.res -x

This command instructs table to produce a graph file called d1m1m2.xgr, as well as a table of lod scores in d1m1m2.tab. (The graph file is a simple text file containing plotting commands and data in a form which ACE/gr can understand. You can examine this file with a text editor if you wish, although there is no need to do so.) In order to have ACE/gr display this graph, run ACE/gr and Load the file called d1m1m2.xgr. Then click the AS (autoscale) button.

You should see the multipoint lod scores displayed, although the scale of the graph is not ideal. To fix this, click the Z (zoom) button and select the region of interest.

Graphing the multipoint lod scores using Excel

If you are using Windows and have the Excel program available, you can see a graphical representation of the lod scores because the .tab file which table produces can be imported into Excel and then used to make a chart. In order to do this, at the DOS prompt issue the following command:

table d1m1m2.res

This command instructs table to produce a file called d1m1m2.tab.

You then need to run Excel from the Windows Start menu and follow the detailed instructions to import a .tab file and create an XY chart. The graph should appear approximately as shown below.

Whichever method is used to produce the graph of multipoint lod scores, it should appear approximately as follows:

[Figure: three-point lod scores of disease against two markers]

It can be seen that the maximum lod score of 1.8 appears between the two markers, perhaps slightly closer to the second one.
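If no graphing program is available, the position of the maximum can also be read off the tabulated values. The following Python sketch is illustrative only: it copies the cM and total rows from the d1m1m2.tab listing shown earlier, and the helper name parse_row is invented here. It assumes each row consists of a label followed by whitespace-separated numbers.

```python
# Rows copied from the d1m1m2.tab listing (map positions and total lod scores).
cm_row = ("cM      -54.931 -34.657 -21.182 -10.137   0.000   0.000   2.001   4.009"
          "   6.029   8.069  10.137  10.137  20.273  31.319  44.794  65.067")
total_row = ("total     0.251   0.457   0.543   0.344 -30.404 -99.000   1.097   1.597"
             "   1.801   1.783 -97.495 -97.495   1.298   1.145   0.825   0.427")


def parse_row(line):
    """Split a table row into its label and a list of numeric values."""
    label, *values = line.split()
    return label, [float(v) for v in values]


_, positions = parse_row(cm_row)
_, totals = parse_row(total_row)

# Index of the column with the highest total lod score.
best = max(range(len(totals)), key=totals.__getitem__)
print(f"maximum total lod {totals[best]:.3f} at {positions[best]:.3f} cM")
```

This only picks the highest of the tabulated positions, so the true maximum of the lod score curve may lie between two of the evaluated points.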

Using quiklink input files to set up analyses automatically

Rather than enter the loci to be used for each analysis separately each time, it can be convenient to use an input file which contains this information and which will cause quiklink to set up all of the required analyses at once. This is especially helpful when, for example, additions or modifications are made to some of the data and it is desired to repeat a set of analyses. The input file also provides a permanent and easily accessed record of which analyses were set up for a particular project. A quiklink input file is a simple text file consisting of lines of input, just as they would be typed into quiklink interactively. An example input file called alltest.inp has been provided which would set up the three analyses above, and if you examine it with a text editor you will see it appears as follows:


d1m1 dis1 mar1
d1m2 dis1 mar2
d1m1m2 dis1 mar1 mar2
0.1

Since the third analysis is a three-point analysis, it requires a second line specifying the recombination fraction between the marker loci.
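When many analyses are needed, the lines of such an input file could be generated by a short script and not typed by hand. The Python sketch below is a hypothetical helper, not something supplied with quiklink: the function name quiklink_input_lines and the tuple layout are assumptions made for illustration. It covers only the two-point and three-point formats shown above, and it prints the lines without writing a file so that the supplied alltest.inp is not overwritten. Whether quiklink needs a final blank line to finish when reading from a file is not addressed here.

```python
def quiklink_input_lines(analyses):
    """Build quiklink input lines from (fileroot, loci, marker_theta) tuples.

    marker_theta is None for a two-point analysis, or the recombination
    fraction between the two markers for a three-point analysis.
    """
    lines = []
    for fileroot, loci, marker_theta in analyses:
        # First line: the fileroot followed by the locus names.
        lines.append(" ".join([fileroot, *loci]))
        # Three-point analyses need a second line with the marker-marker theta.
        if marker_theta is not None:
            lines.append(str(marker_theta))
    return lines


analyses = [
    ("d1m1", ["dis1", "mar1"], None),
    ("d1m2", ["dis1", "mar2"], None),
    ("d1m1m2", ["dis1", "mar1", "mar2"], 0.1),
]

for line in quiklink_input_lines(analyses):
    print(line)
```

The printed lines reproduce the four lines of input listed above and could be redirected to a new file for use with quiklink.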

To see how this input file is used, at the system prompt enter:

quiklink autdom2.ppd autdom2.par <alltest.inp

This causes quiklink to read input from alltest.inp rather than from the keyboard. You should see that quiklink then automatically writes pedigree, locus and command files for each of the three analyses. (Of course in practice a quiklink input file might contain specifications for setting up many more analyses, perhaps for a set of markers covering a whole chromosome.) There is no need to re-run these analyses now, although in future exercises we will modify this input file and use it for setting up additional analyses incorporating a new marker.

Summary

This section demonstrates multipoint analysis with linkmap, where likelihoods are calculated as the disease locus is moved over a fixed map of two markers. Lod scores are tabulated using the table program, which can also produce graph files for other programs.

Exercises in genetic linkage analysis

All material copyright (C) Dave Curtis 1996-2006

david.curtis@qmul.ac.uk