References

[1]

C. D. A program to draw pedigrees using LINKAGE or LINKSYS data files. Annals of Human Genetics, 54:365-367, 1990.

[2]

M. J. Daly, J. D. Rioux, S. F. Schaffner, T. J. Hudson, and E. S. Lander. High-resolution haplotype structure in the human genome. Nat Genet, 29(2):229-32, 2001.

[3]

S. B. Gabriel, S. F. Schaffner, H. Nguyen, J. M. Moore, J. Roy, B. Blumenstiel, J. Higgins, M. DeFelice, A. Lochner, M. Faggart, S. N. Liu-Cordero, C. Rotimi, A. Adeyemo, R. Cooper, R. Ward, E. S. Lander, M. J. Daly, and D. Altshuler. The structure of haplotype blocks in the human genome. Science, 296(5576):2225-9, 2002.

[4]

L. Helmuth. Genome research: Map of the human genome 3.0. Science, 293(5530):583-585, 2001.

[5]

J. R. O'Connell. Zero-recombinant haplotyping: applications to fine mapping using SNPs. Genet Epidemiol, 19 Suppl 1:S64-70, 2000.

[6]

D. Qian, 2002. Personal communication.

[7]

D. Qian and L. Beckmann. Minimum-recombinant haplotyping in pedigrees. Am J Hum Genet, 70(6):1434-1445, 2002.

[8]

A. H. Sherman. Algorithms for sparse Gaussian elimination with partial pivoting. ACM Transactions on Mathematical Software (TOMS), 4(4):330-338, 1978.

[9]

P. Tapadar, S. Ghosh, and P. P. Majumder. Haplotyping in pedigrees via a genetic algorithm. Hum Hered, 50(1):43-56, 2000.


 

Figures and Tables

Figure 1: An illustration of a pedigree with 15 members. A square represents a male node, a circle represents a female node, and a solid (round) node represents a mating node. The children (e.g. 3-3, 3-5 and 3-7) are placed under their parents (e.g. 3-1 and 3-2).

Figure 2: An example recombination event. The notation i | j means that the haplotype information at the locus has been resolved, and we know that allele i is from the father and allele j is from the mother.

Figure 3: An illustration of the block-extension algorithm. The blank between two alleles at a locus indicates that the locus is PS-unresolved. Again, a | indicates that the locus is PS-resolved. For a PS-resolved locus, we use two numbers in parentheses to indicate the GS values of the paternal and maternal alleles.


Figure 4: An illustration of level 3 and level 4 constraints.

 

 

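|   | x | y | z | Constraint equations |
|---|---|---|---|---|
| 1 | 1 2 | 1 * | 1 1 | x1 = x2 |
|   | 1 2 | 1 * | 1 1 |   |
| 2 | 1 2 | 1 * | 1 1 | x1 + x2 = 1 |
|   | 2 1 | 2 * | 2 2 |   |
| 3 | 1 2 | 1 * | 1 1 | x1 + x2 = 1 |
|   | 2 1 | 1 1 | 2 1 |   |
| 4 | 1 2 | 1 1 | 2 1 | x1 = x2 |
|   | 1 2 | 1 1 | 2 1 |   |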

Table 1: The possible level 3 constraints.

 

 

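|   | x | y | z | Constraint equations |
|---|---|---|---|---|
| 1 | 1 2 | 1 2 | 1 2 | x1 + x2 = y1 + y2 = z1 + z2 |
|   | 1 2 | 1 2 | 1 2 |   |
| 2 | 1 2 | 1 2 | 1 2 | x1 + x2 = z1, y1 + y2 + z1 = 1 |
|   | 1 2 | 2 1 | 1 1 |   |
| 3 | 1 2 | 1 2 | 2 1 | x1 + x2 = z1 + z2 |
|   | 1 2 | 1 1 | 2 1 |   |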

Table 2: The possible level 4 constraints.
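
The constraint equations in Tables 1 and 2 can be collected into a single linear system and solved by elimination. The sketch below is a minimal illustration under two assumptions that the tables do not state: each variable takes the value 0 or 1, and addition is modulo 2. The function name `solve_gf2`, the variable indexing, and the example system are hypothetical and are not taken from the authors' implementation.

```python
from typing import List, Optional, Tuple

# One equation: (indices of the variables being summed, right-hand-side bit).
Equation = Tuple[List[int], int]


def solve_gf2(num_vars: int, equations: List[Equation]) -> Optional[List[int]]:
    """Return one 0/1 assignment satisfying every equation modulo 2, or None."""
    # Encode each equation as a bitmask: bit i is the coefficient of
    # variable i, and bit num_vars holds the right-hand side.
    rows = []
    for indices, rhs in equations:
        mask = 0
        for i in indices:
            mask ^= 1 << i  # a variable listed twice cancels modulo 2
        rows.append(mask | ((rhs & 1) << num_vars))

    pivots = []  # (column, row index) pairs
    r = 0
    for col in range(num_vars):
        pivot = next((k for k in range(r, len(rows)) if (rows[k] >> col) & 1), None)
        if pivot is None:
            continue  # no equation pins this variable down: it stays free
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # Gauss-Jordan step: clear this column from every other row.
        for k in range(len(rows)):
            if k != r and (rows[k] >> col) & 1:
                rows[k] ^= rows[r]
        pivots.append((col, r))
        r += 1

    # A leftover row reading "0 = 1" means the constraints are inconsistent.
    if any(row == (1 << num_vars) for row in rows[r:]):
        return None

    solution = [0] * num_vars  # free variables are arbitrarily set to 0
    for col, k in pivots:
        solution[col] = (rows[k] >> num_vars) & 1
    return solution


# Hypothetical system over (x1, x2, y1, y2, z1, z2) = indices 0..5,
# combining equation shapes that appear in the tables:
#   x1 + x2 = 1,  x1 + x2 = z1 + z2,  y1 + y2 + z1 = 1
example = [([0, 1], 1), ([0, 1, 4, 5], 0), ([2, 3, 4], 1)]
print(solve_gf2(6, example))
```

An equation of the form a = b is passed as a + b = 0, which is equivalent modulo 2. Setting free variables to 0 is an arbitrary choice made here for brevity; any other rule for choosing among the valid solutions could be substituted.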

 

| Parameters | Time used by the block-extension algorithm | Time used by MRH |
|---|---|---|
| (17,10,6,0) | 2.1s | 1m16s |
| (17,10,6,4) | 2.1s | 2m14s |
| (15,25,6,0) | 2.7s | 28m |
| (15,25,6,4) | 2.9s | 30m |
| (29,10,6,0) | 3.2s | 6m17s |
| (29,10,6,4) | 3.1s | 8m47s |
| (29,25,6,0) | 15s | 2h58m |
| (29,25,6,4) | 10s | 3h9m |

Table 3: Speeds of the block-extension algorithm and MRH on multi-allelic markers.
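
To compare the two timing columns numerically, the duration strings can be converted to seconds and divided. The following is a small sketch; the helper names `to_seconds` and `speedup` are hypothetical, and the only format assumed is the one used in the table (a number followed by `h`, `m`, or `s`, possibly repeated).

```python
import re

_UNIT_SECONDS = {"h": 3600.0, "m": 60.0, "s": 1.0}


def to_seconds(text: str) -> float:
    """Convert a duration such as '2.1s', '1m16s' or '2h58m' to seconds."""
    text = text.strip()
    parts = re.findall(r"(\d+(?:\.\d+)?)([hms])", text)
    # Reject anything that is not fully made of number+unit pairs.
    if not parts or "".join(value + unit for value, unit in parts) != text:
        raise ValueError(f"unrecognised duration: {text!r}")
    return sum(float(value) * _UNIT_SECONDS[unit] for value, unit in parts)


def speedup(fast: str, slow: str) -> float:
    """Ratio of the slower running time to the faster one."""
    return to_seconds(slow) / to_seconds(fast)


# Row (29,25,6,0) of Table 3: 15s for block-extension, 2h58m for MRH.
print(speedup("15s", "2h58m"))
```

The strict format check means an entry with a missing unit raises an error instead of being silently misread.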

 

| Parameters | Time used by the block-extension algorithm | Time used by MRH |
|---|---|---|
| (17,10,2,0) | 1.9s | 6m17s |
| (17,10,2,4) | 2.3s | 16m16s |
| (15,25,2,0) | 4.7s | 3h43m |
| (15,25,2,4) | 4.8s | 4h44m |
| (29,10,2,0) | 2.8s | 1h3m |
| (29,10,2,4) | 2.7s | 57m |
| (29,25,2,0) | 2.3s | 28h |
| (29,50,2,0) | 16s | ≥ 20h/run |

Table 4: Speeds of the block-extension algorithm and MRH on biallelic markers.

 

| Parameters | Percentage correctly recovered out of 100 runs |
|---|---|
| (15,50,6,0) | 100 |
| (15,50,6,4) | 91 |
| (17,50,6,0) | 100 |
| (17,50,6,4) | 91 |
| (29,10,6,0) | 100 |
| (29,10,6,4) | 99 |
| (29,25,6,0) | 100 |
| (29,25,6,4) | 95 |
| (29,50,6,0) | 100 |
| (29,50,6,1) | 96 |
| (29,50,6,2) | 93 |
| (29,50,6,3) | 95 |
| (29,50,6,4) | 91 |

Table 5: Accuracy of the block-extension algorithm on multi-allelic markers.

 

| Parameters | Percentage correctly recovered out of 100 runs |
|---|---|
| (15,10,2,0) | 100 |
| (15,10,2,4) | 96 |
| (15,25,2,0) | 98 |
| (15,25,2,4) | 78 |
| (15,50,2,0) | 100 |
| (15,50,2,1) | 82 |
| (17,10,2,0) | 97 |
| (17,10,2,4) | 92 |
| (17,25,2,0) | 100 |
| (17,25,2,1) | 84 |
| (17,50,2,0) | 100 |
| (17,50,2,1) | 72 |
| (29,10,2,0) | 95 |
| (29,10,2,4) | 93 |
| (29,25,2,0) | 100 |
| (29,25,2,1) | 91 |
| (29,25,2,2) | 87 |
| (29,50,2,0) | 100 |
| (29,50,2,1) | 88 |

Table 6: Accuracy of the block-extension algorithm on biallelic markers.

 

| Number of recombinants in the pedigree | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| Number of correct reconstructions out of 100 runs | 100 | 88 | 72 | 64 | 54 |

Table 7: Accuracy decreases when the number of recombination events increases for the pedigree in Figure  with 50 biallelic marker loci.

 

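| Region name | Physical length (kbp) | Genotyped SNPs | Block | SNPs in each block |
|---|---|---|---|---|
| 16a | 40 | 14 | 1 | 5 |
| 16b | 106 | 53 | 1 | 6 |
|   |   |   | 2 | 4 |
| 17a | 186 | 70 | 1 | 6 |
|   |   |   | 2 | 5 |
|   |   |   | 3 | 4 |
|   |   |   | 4 | 6 |
| 18a | 286 | 74 | 1 | 16 |
|   |   |   | 2 | 6 |
|   |   |   | 3 | 4 |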

Table 8: The regions and blocks on chromosome 3.

 

| Block | EM: common haplotypes | EM: frequencies | PedPhase: common haplotypes | PedPhase: frequencies | MRH: common haplotypes | MRH: frequencies |
|---|---|---|---|---|---|---|
| 16a-1 | 4 2 2 2 2 | 0.4232 | 4 2 2 2 2 | 0.3817 | 4 2 2 2 2 | 0.3779 |
|   | 3 4 3 4 4 | 0.2187 | 3 4 3 4 4 | 0.1720 | 3 4 3 4 4 | 0.1744 |
|   | 4 2 2 2 4 | 0.2018 | 4 2 2 2 4 | 0.1935 | 4 2 2 2 4 | 0.1802 |
|   | 3 4 2 2 4 | 0.1432 | 3 4 2 2 4 | 0.1613 | 3 4 2 2 4 | 0.1802 |
| 16b-1 | 3 2 4 1 1 2 | 0.8014 | 3 2 4 1 1 2 | 0.7634 | 3 2 4 1 1 2 | 0.7849 |
|   | 1 3 2 3 3 4 | 0.0833 | 1 3 2 3 3 4 | 0.0753 | 1 3 2 3 3 4 | 0.0753 |
| 16b-2 | 4 1 2 2 | 0.5410 | 4 1 1 2 | 0.4892 | 4 1 1 2 | 0.4826 |
|   | 2 3 3 4 | 0.2812 | 2 3 3 4 | 0.2581 | 2 3 3 4 | 0.2616 |
|   | 2 3 3 2 | 0.1562 | 2 3 3 2 | 0.1344 | 2 3 3 2 | 0.1512 |
| 17a-1 | 3 1 3 4 4 4 | 0.3403 | 3 1 3 4 4 4 | 0.3172 | 3 1 3 4 4 4 | 0.3226 |
|   | 1 3 3 2 4 2 | 0.3021 | 1 3 3 2 4 2 | 0.2419 | 1 3 3 2 4 2 | 0.2473 |
|   | 3 3 2 4 2 4 | 0.1354 | 3 3 2 4 2 4 | 0.0914 | 3 3 2 4 2 4 | 0.0914 |
|   | 3 3 3 4 4 4 | 0.1021 | 3 3 3 4 4 4 | 0.1183 | 3 3 3 4 4 4 | 0.1183 |
|   | 3 3 2 4 4 4 | 0.0681 | 3 3 2 4 4 4 | 0.0806 | 3 3 2 4 4 4 | 0.0806 |
|   | 1 3 3 2 4 4 | 0.0521 |   |   |   |   |
| 17a-2 | 2 3 2 4 2 | 0.3542 | 2 3 2 4 2 | 0.2903 | 2 3 2 4 2 | 0.2903 |
|   | 3 3 4 2 4 | 0.3333 | 3 3 4 2 4 | 0.2957 | 3 3 4 2 4 | 0.3118 |
|   | 3 3 4 4 2 | 0.1458 | 3 3 4 4 2 | 0.1344 | 3 3 4 4 2 | 0.1237 |
|   | 3 4 4 4 4 | 0.1250 | 3 4 4 4 4 | 0.1452 | 3 4 4 4 4 | 0.1452 |
| 17a-3 | 4 4 3 1 | 0.4129 | 4 4 3 1 | 0.4355 | 4 4 3 1 | 0.4167 |
|   | 3 1 1 2 | 0.2813 | 3 1 1 2 | 0.2258 | 3 1 1 2 | 0.2051 |
|   | 4 1 3 1 | 0.2363 | 4 1 3 1 | 0.1935 | 4 1 3 1 | 0.2115 |
|   | 4 1 3 2 | 0.0696 | 4 1 3 2 | 0.0753 | 4 1 3 2 | 0.0705 |
| 17a-4 | 3 4 4 1 2 4 | 0.3854 | 3 4 4 1 2 4 | 0.3710 | 3 4 4 1 2 4 | 0.4429 |
|   | 2 3 2 4 3 2 | 0.3333 | 2 3 2 4 3 2 | 0.2903 | 2 3 2 4 3 2 | 0.2357 |
|   | 3 4 2 4 2 4 | 0.2500 | 3 4 2 4 2 4 | 0.1881 | 3 4 2 4 2 4 | 0.1857 |
| 18a-1 | 1444231214144132 | 0.2697 | 1444231214144132 | 0.2473 | 1444231214144132 | 0.1706 |
|   | 1444111214144132 | 0.2396 | 1444111214144132 | 0.2151 | 1444111214144132 | 0.2357 |
|   | 1444131214144132 | 0.1887 | 1444131214144132 | 0.2204 | 1444131214144132 | 0.2176 |
|   | 4222133313412211 | 0.1250 |   |   |   |   |
|   | 1444231234144132 | 0.0833 | 1444231234144132 | 0.0699 | 1444231234144132 | 0.0764 |
| 18a-2 | 3 1 2 4 4 2 | 0.4967 | 3 1 2 4 4 2 | 0.4892 | 3 1 2 4 4 2 | 0.4765 |
|   | 1 3 2 4 3 4 | 0.2604 | 1 3 2 4 3 4 | 0.1935 | 1 3 2 4 3 4 | 0.1765 |
|   | 3 1 2 2 4 2 | 0.1271 | 3 1 2 2 4 2 | 0.0753 | 3 1 2 2 4 2 | 0.0765 |
|   | 1 3 4 4 4 4 | 0.0938 | 1 3 4 4 4 4 | 0.0806 | 1 3 4 4 4 4 | 0.0941 |
|   |   |   | 1 3 2 4 3 2 | 0.0538 | 1 3 2 4 3 2 | 0.0588 |
| 18a-3 | 2 2 1 1 | 0.4186 | 2 2 1 1 | 0.4032 | 2 2 1 1 | 0.4214 |
|   | 4 3 3 3 | 0.2188 | 4 3 3 3 | 0.1935 | 4 3 3 3 | 0.1714 |
|   | 2 3 1 1 | 0.2064 | 2 3 1 1 | 0.2204 | 2 3 1 1 | 0.1928 |
|   | 4 3 1 3 | 0.1250 | 4 3 1 3 | 0.1559 | 4 3 1 3 | 0.1857 |

Table 9: Common haplotypes and their frequencies obtained by PedPhase, MRH and the EM method. In haplotypes, the alleles are encoded as 1=A, 2=C, 3=G, and 4=T.
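
The numeric allele encoding in Table 9 can be translated back to nucleotide letters with a simple lookup. The sketch below uses only the mapping stated in the caption (1=A, 2=C, 3=G, 4=T); the function name `decode_haplotype` is hypothetical, and it accepts both the space-separated and the compact digit forms that appear in the table.

```python
ALLELE_CODES = {"1": "A", "2": "C", "3": "G", "4": "T"}


def decode_haplotype(encoded: str) -> str:
    """Translate e.g. '4 2 2 2 2' or '1444231214144132' into nucleotide letters."""
    # Ignore the whitespace that separates alleles in the shorter haplotypes.
    symbols = [ch for ch in encoded if not ch.isspace()]
    unknown = sorted({s for s in symbols if s not in ALLELE_CODES})
    if unknown:
        raise ValueError(f"unexpected allele codes: {unknown}")
    return "".join(ALLELE_CODES[s] for s in symbols)


print(decode_haplotype("4 2 2 2 2"))         # first haplotype of block 16a-1
print(decode_haplotype("1444231214144132"))  # first haplotype of block 18a-1
```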