Explanation of mlink output

Example data files needed for this exercise

Annotation of autdom.res

The results of the two-point analysis between the disease locus (DIS1) and the marker locus (MAR1) are contained in the file autdom.res. Use a text editor to examine this file. It should appear as follows:

Length of real variables = 8 bytes
LINKAGE (V5.1) WITH  2-POINT AUTOSOMAL DATA
 ORDER OF LOCI:   1  2
-----------------------------------
-----------------------------------
THETAS  0.500
-----------------------------------
PEDIGREE |  LN LIKE  | LOG 10 LIKE
-----------------------------------
        4   -25.391461   -11.027348
        7   -21.936689    -9.526963
-----------------------------------
TOTALS      -47.328150   -20.554311
-2 LN(LIKE) =  9.46562998410630E+0001 LOD SCORE =     0.000000
-----------------------------------
-----------------------------------
THETAS  0.000
-----------------------------------
PEDIGREE |  LN LIKE  | LOG 10 LIKE
-----------------------------------
        4 -100000000000000000000.000000 -43429355638650388500.000000
        7 -100000000000000000000.000000 -43429355638650388500.000000
-----------------------------------
TOTALS    -200000000000000000000.000000 -86858711277300777000.000000
-2 LN(LIKE) =  4.00000000000000E+0020 LOD SCORE = -86858711277300777000.000000
-----------------------------------
-----------------------------------
THETAS  0.050
-----------------------------------
PEDIGREE |  LN LIKE  | LOG 10 LIKE
-----------------------------------
        4   -25.126631   -10.912334
        7   -22.364860    -9.712914
-----------------------------------
TOTALS      -47.491490   -20.625248
-2 LN(LIKE) =  9.49829808289596E+0001 LOD SCORE =    -0.070938
-----------------------------------
-----------------------------------
THETAS  0.100
-----------------------------------
PEDIGREE |  LN LIKE  | LOG 10 LIKE
-----------------------------------
        4   -24.649752   -10.705229
        7   -21.886756    -9.505277
-----------------------------------
TOTALS      -46.536509   -20.210506
-2 LN(LIKE) =  9.30730176092179E+0001 LOD SCORE =     0.343805
-----------------------------------
-----------------------------------
THETAS  0.150
-----------------------------------
PEDIGREE |  LN LIKE  | LOG 10 LIKE
-----------------------------------
        4   -24.472921   -10.628432
        7   -21.705815    -9.426696
-----------------------------------
TOTALS      -46.178736   -20.055128
-2 LN(LIKE) =  9.23574722464653E+0001 LOD SCORE =     0.499183
-----------------------------------
-----------------------------------
THETAS  0.200
-----------------------------------
PEDIGREE |  LN LIKE  | LOG 10 LIKE
-----------------------------------
        4   -24.427737   -10.608809
        7   -21.650608    -9.402720
-----------------------------------
TOTALS      -46.078345   -20.011528
-2 LN(LIKE) =  9.21566906886759E+0001 LOD SCORE =     0.542782
-----------------------------------
-----------------------------------
THETAS  0.250
-----------------------------------
PEDIGREE |  LN LIKE  | LOG 10 LIKE
-----------------------------------
        4   -24.462748   -10.624014
        7   -21.664755    -9.408863
-----------------------------------
TOTALS      -46.127503   -20.032877
-2 LN(LIKE) =  9.22550059063503E+0001 LOD SCORE =     0.521433
-----------------------------------
-----------------------------------
THETAS  0.300
-----------------------------------
PEDIGREE |  LN LIKE  | LOG 10 LIKE
-----------------------------------
        4   -24.556398   -10.664685
        7   -21.719000    -9.432422
-----------------------------------
TOTALS      -46.275398   -20.097107
-2 LN(LIKE) =  9.25507957943628E+0001 LOD SCORE =     0.457203
-----------------------------------
-----------------------------------
THETAS  0.350
-----------------------------------
PEDIGREE |  LN LIKE  | LOG 10 LIKE
-----------------------------------
        4   -24.698679   -10.726477
        7   -21.791982    -9.464118
-----------------------------------
TOTALS      -46.490662   -20.190595
-2 LN(LIKE) =  9.29813231712021E+0001 LOD SCORE =     0.363716
-----------------------------------
-----------------------------------
THETAS  0.400
-----------------------------------
PEDIGREE |  LN LIKE  | LOG 10 LIKE
-----------------------------------
        4   -24.885319   -10.807533
        7   -21.864182    -9.495473
-----------------------------------
TOTALS      -46.749501   -20.303007
-2 LN(LIKE) =  9.34990011077663E+0001 LOD SCORE =     0.251304

At each value of the recombination fraction, theta, mlink calculates the log likelihood for each pedigree; the sum of these is the total log likelihood for the pedigree set. The overall lod score is the difference between the total log (base 10) likelihood at a given value of theta and the total log (base 10) likelihood at theta=0.5. Similarly, the lod score for an individual pedigree is the difference between the log (base 10) likelihood for that pedigree at a given value of theta and its log (base 10) likelihood at theta=0.5. For example, at theta=0.2 the lod score for pedigree 4 is -10.608809 - (-11.027348) = 0.419.

Although mlink does not provide these individual lod scores automatically, it is simple to calculate them by hand. (The lod scores need only be calculated to two or three significant figures.) Do this now to get the lod scores for each pedigree at a couple of values of theta. (In subsequent exercises we will see how other programs can be used to obtain these lod scores automatically from the mlink output.) If this is done for all the values of theta, the following table is obtained:

theta     0.000   0.050   0.100   0.150   0.200   0.250   0.300   0.350   0.400
     4 -infinity  0.115   0.322   0.399   0.419   0.403   0.363   0.301   0.220
     7 -infinity -0.186   0.022   0.100   0.124   0.118   0.095   0.063   0.031
total  -infinity -0.071   0.344   0.499   0.543   0.521   0.457   0.364   0.251
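
The hand calculation described above can also be sketched in a few lines of Python. This is only an illustration and not part of the LINKAGE package: the dictionary LOG10_LIKE and the function pedigree_lods are hypothetical names, and the numbers are copied by hand from the LOG 10 LIKE column of the autdom.res listing shown earlier (theta=0 is left out because its lod score is negative infinity).

```python
# log10 likelihoods copied from the LOG 10 LIKE column of autdom.res,
# keyed by theta and then by pedigree number (hypothetical layout).
LOG10_LIKE = {
    0.50: {4: -11.027348, 7: -9.526963},
    0.05: {4: -10.912334, 7: -9.712914},
    0.10: {4: -10.705229, 7: -9.505277},
    0.15: {4: -10.628432, 7: -9.426696},
    0.20: {4: -10.608809, 7: -9.402720},
    0.25: {4: -10.624014, 7: -9.408863},
    0.30: {4: -10.664685, 7: -9.432422},
    0.35: {4: -10.726477, 7: -9.464118},
    0.40: {4: -10.807533, 7: -9.495473},
}


def pedigree_lods(log10_like, null_theta=0.5):
    """Return {theta: {pedigree: lod}}, where
    lod = log10 L(theta) - log10 L(null_theta) for each pedigree."""
    null = log10_like[null_theta]
    return {
        theta: {ped: value - null[ped] for ped, value in by_ped.items()}
        for theta, by_ped in log10_like.items()
        if theta != null_theta
    }


lods = pedigree_lods(LOG10_LIKE)
for theta in sorted(lods):
    row = lods[theta]
    total = sum(row.values())  # the overall lod is the sum over pedigrees
    print(f"theta={theta:.2f}  ped4={row[4]:7.3f}  ped7={row[7]:7.3f}  total={total:7.3f}")
```

Each printed row should agree with the corresponding column of the table above, and the total should agree with the LOD SCORE value that mlink reports for that theta.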

Understanding the lod scores produced

These results have a number of interesting features, which can be understood by referring to the original pedigrees:

[Figure: pedigree diagrams with genotypes of first marker]

The lod score for both pedigrees at theta=0 is negative infinity. This is because the log likelihood is negative infinity, equivalent to a likelihood of zero; in the output above, mlink prints this as the very large negative value -100000000000000000000. Both pedigrees are thus inconsistent with the hypothesis that there is no recombination between the disease and marker loci.

Taking pedigree 4 first, it is clear that subject 3 has inherited the disease allele from his father, subject 1. Since subject 1 is homozygous for marker allele 2, we know that subject 3 has inherited an allele 2 of the marker locus along with the disease allele. We can refer to the disease allele and allele 2 of the marker locus as being "in phase" in subject 3, since they form part of the same haplotype, and the meioses producing the large sibship are termed "phase-known" meioses. We can see that subjects 5, 6 and 8 have all inherited both the disease allele and allele 2 of the marker locus from subject 3, whereas subject 7 has received neither the disease allele nor allele 2. However, subject 9 has received allele 2 but not the disease allele, and so must represent a recombinant. Hence this pedigree displays 4 non-recombinants and 1 recombinant. The pedigree is thus inconsistent with the hypothesis that recombination never occurs between the loci, and the lod score at theta=0 is negative infinity. All other things being equal, the most probable value for the recombination fraction between the loci is 1/5=0.2, and the pedigree yields a maximum lod score at theta=0.2.
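
The counting argument for pedigree 4 can be checked with a short Python sketch. It assumes the five phase-known meioses are independent and are the only source of linkage information, so that the likelihood is theta^1 * (1-theta)^4 and the lod score compares it with its value at theta=0.5. The function name phase_known_lod is hypothetical; this is not how mlink itself computes likelihoods.

```python
import math


def phase_known_lod(theta, recombinants, non_recombinants):
    """Lod score for independent phase-known meioses:
    log10 of L(theta) / L(0.5), with
    L(theta) = theta^recombinants * (1 - theta)^non_recombinants."""
    n = recombinants + non_recombinants
    likelihood = theta ** recombinants * (1.0 - theta) ** non_recombinants
    if likelihood == 0.0:
        # A likelihood of zero corresponds to a lod score of negative infinity.
        return -math.inf
    return math.log10(likelihood / 0.5 ** n)


# Pedigree 4: 1 recombinant (subject 9) and 4 non-recombinants.
thetas = [0.0, 0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4]
for theta in thetas:
    print(f"theta={theta:.2f}  lod={phase_known_lod(theta, 1, 4):.3f}")

best_theta = max(thetas, key=lambda t: phase_known_lod(t, 1, 4))
```

Under these assumptions the sketch gives negative infinity at theta=0 and a maximum at theta=0.2 on this grid, and its values agree with the pedigree 4 row of the table to three decimal places.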

Turning to pedigree 7, because the grandparents are untyped we do not know the parental origin of the alleles in subject 1, and so we do not know the arrangement of the haplotypes. There are thus two possibilities. One is that the disease allele is in phase with allele 1 of the marker locus, in which case subjects 3, 4, 5 and 6 are non-recombinant and subject 7 is recombinant. Alternatively, the disease allele might originally be in phase with allele 4, in which case subjects 3, 4, 5 and 6 are recombinant and subject 7 is the only non-recombinant. In this "phase-unknown" situation there are two equally likely estimates for the recombination fraction, 1/5=0.2 or 4/5=0.8. There is no biological mechanism which can produce a recombination fraction greater than 0.5, so we generally do not evaluate lod scores at these values. In the range theta<0.5 the maximum lod score occurs at theta=0.2. Note, however, that because phase is unknown the lod score is lower than in the phase-known situation (0.124 compared with 0.419 at theta=0.2), despite being based on the same number of subjects.
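
The phase-unknown case can be sketched in the same way by averaging the likelihood over the two possible phases, each given probability 1/2. As before, this assumes five independent meioses that carry all the linkage information. The function name phase_unknown_lod is hypothetical, and the calculation is a simplified illustration, not mlink's own algorithm.

```python
import math


def phase_unknown_lod(theta, k, n):
    """Lod score for n phase-unknown meioses in which k offspring are
    recombinant under one phase and n - k under the other phase.
    The likelihood is the average over the two equally likely phases."""
    likelihood = 0.5 * (
        theta ** k * (1.0 - theta) ** (n - k)
        + theta ** (n - k) * (1.0 - theta) ** k
    )
    if likelihood == 0.0:
        # A likelihood of zero corresponds to a lod score of negative infinity.
        return -math.inf
    return math.log10(likelihood / 0.5 ** n)


# Pedigree 7: 1 recombinant out of 5 under one phase, 4 out of 5 under the other.
for theta in [0.0, 0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4]:
    print(f"theta={theta:.2f}  lod={phase_unknown_lod(theta, 1, 5):.3f}")
```

Under these assumptions the values agree with the pedigree 7 row of the table to three decimal places. The function is symmetric about theta=0.5, which reflects the two equally likely estimates of 0.2 and 0.8.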

Overall, of course, these lod scores are far too low to convincingly demonstrate linkage or to provide a good estimate of the true recombination fraction between the loci.

Summary

This section explains the output produced by the mlink program and also illustrates the lod scores produced by simple phase-known and phase-unknown pedigrees.

Exercises in genetic linkage analysis

All material copyright (C) Dave Curtis 1996-2000

dcurtis@hgmp.mrc.ac.uk