Microsatellite Variation in Mountain Pine Beetle (Dendroctonus ponderosae Hopkins) in Western Canada: Spatial Genetic Analyses of Neutral Variation and Development of Genic Markers

G. D. N. Gayathri Samarasekera
BSc, University of Sri Jayewardenepura, Sri Lanka, 2004

Thesis Submitted in Partial Fulfilment of the Requirements for the Degree of Master of Science in Natural Resources and Environmental Studies (Biology)

The University of Northern British Columbia
April 2011

© G.D.N. Gayathri Samarasekera, 2011

Library and Archives Canada, ISBN: 978-0-494-75181-7

Abstract

The mountain pine beetle (MPB), Dendroctonus ponderosae Hopkins, is the most destructive pest of pine forests in western North America. In this study, genetic variation of MPB in western Canada was explored at both neutral genomic and gene-linked microsatellite markers. To understand the spatial genetic structure and dispersal patterns, beetles from 49 sampling locations throughout western Canada were analyzed at 13 neutral microsatellite loci. The MPB exhibits significant north-south population structure in western Canada. The decline of genetic diversity from south to north and the lack of isolation by distance in the northernmost cluster are consistent with patterns expected from postglacial recolonization. In terms of dispersal patterns, northern outbreaks are consistent with an expansion out of Tweedsmuir Provincial Park, an early site of infestation in the current epidemic, while southern outbreaks are consistent with multiple centers of origin. To explore the gene-linked variation in the MPB genome, a panel of gene-linked markers was developed by screening an EST database of MPB. Fifty of the 79 EST-derived markers developed were found to be polymorphic. A preliminary survey of five EST-derived microsatellite markers showed evidence of selection at two loci. The genetic variation at the other three loci was comparable with the genetic variation at neutral microsatellite markers.
This study showed that the use of EST-derived microsatellites is a promising approach for identifying signatures of selection. Further, the new markers developed will be useful in constructing linkage maps and in performing genome scans and further landscape-scale genomic studies in both MPB and other related bark beetles.

Table of Contents

Abstract
Table of Contents
List of Tables
List of Figures
Abbreviations
Acknowledgements

1. Chapter One: General introduction
    1.1. Background information on mountain pine beetle (MPB)
        1.1.1. Life cycle of MPB and the nature of damage
        1.1.2. Impacts of the beetle on the forest system
        1.1.3. Geographical distribution & MPB epidemics
    1.2. Spatial genetic studies
        1.2.1. Spatial genetic studies of other bark beetles
        1.2.2. Spatial genetic studies of MPB
    1.3. Microsatellite DNA
        1.3.1. Microsatellites as genetic markers
        1.3.2. Genomic distribution of microsatellites
        1.3.3. Functional roles of microsatellites
        1.3.4. EST-derived microsatellite markers (Genic microsatellites)
        1.3.5. EST databases
        1.3.6. EST database of MPB
    1.4. Thesis Organisation and Objectives

2. Chapter Two: Spatial genetic structure of the mountain pine beetle (Dendroctonus ponderosae) outbreak in western Canada: historical patterns and contemporary dispersal
    Abstract
    2.1. Introduction
    2.2. Materials and methods
        2.2.1. Site selection and beetle collection
        2.2.2. DNA extraction and evaluation
        2.2.3. Microsatellite amplification
        2.2.4. Hardy-Weinberg equilibrium and linkage disequilibrium
        2.2.5. Genetic diversity
        2.2.6. Population structure
        2.2.7. Genetic differentiation
        2.2.8. Gene flow
        2.2.9. Historical demography
    2.3. Results
        2.3.4. Hardy-Weinberg equilibrium and linkage disequilibrium
        2.3.5. Genetic diversity
        2.3.6. Population structure
        2.3.7. Genetic differentiation
        2.3.8. Gene flow
        2.3.9. Historical demography
    3.4. Discussion

3. Chapter Three: Isolation and characterization of EST-derived microsatellite markers for the mountain pine beetle (Dendroctonus ponderosae Hopkins)
    Abstract
    3.1. Introduction
    3.2. Materials and methods
        3.2.1. Screening of EST database
        3.2.2. Primer designing
        3.2.3. Microsatellite amplification
        3.2.4. Polymorphism within a location
        3.2.5. Characterization of the loci
    3.3. Results
        3.3.1. Screening of EST database
        3.3.2. Microsatellite amplification
        3.3.3. Characterization of the loci
        3.3.4. Polymorphism within a location
    3.4. Discussion

4. Chapter Four: A preliminary survey of spatial genetic variation of the mountain pine beetle (Dendroctonus ponderosae) outbreak in western Canada with five EST-derived microsatellite markers: detecting signatures of selection
    Abstract
    4.1. Introduction
    4.2. Materials and methods
        4.2.1. Analysis for selection
        4.2.2. Variation link to the genes
    4.3. Results
        4.3.1. Evidence for selection
        4.3.2. Variation link to the genes
    4.4. Discussion

5. Chapter Five: General discussion

Literature cited
Appendix

List of Tables

Table 1.1. Mountain pine beetle cDNA libraries used to develop the EST database. The library name, tissues used, construction method and the sampling location of beetles collected for each library are given.

Table 2.1. Sampling locations (49), by region, for the mountain pine beetle. GPS locations, year sampled, number of beetles genotyped (N) (the number of sites, for locations with more than one collection, is given in brackets), mean observed heterozygosity (Ho), mean expected heterozygosity (He), mean number of alleles (NA), allelic richness (AR) and inbreeding coefficient (Fis) are shown.

Table 2.2. Loci typed.
Total number of alleles (NA), mean observed heterozygosity (Ho), mean expected heterozygosity (He), number of loci deviating from HWE before and after (in brackets) sequential Bonferroni correction, and fixation indices (FIS, FST) with significance levels (*** P < 0.001) are shown.

Table 2.3. Results of likelihood-based assignment tests for locations in northern Alberta compared to all other locations. The rank (top 5) and score (%) are estimated by GENECLASS2. The scores given are based on the frequency-based likelihood method (Paetkau et al. 1995). The ranking was the same when tested with a Bayesian likelihood-based method (Rannala and Mountain 1997). Location codes are found in Table 2.1.

Table 2.4. Results of likelihood-based assignment tests for locations northeast of the Rocky Mountains compared to all other locations. The rank (top 3) and score (%) are estimated by GENECLASS2. The scores given are based on the frequency-based likelihood method (Paetkau et al. 1995) and a Bayesian likelihood-based method (Rannala and Mountain 1997). Location codes are found in Table 2.1.

Table 3.1. PCR primers and reaction conditions for 50 polymorphic EST-derived microsatellite markers for Dendroctonus ponderosae. Representative GenBank accession numbers, repeat motif and location in the putative mRNA (coding sequence (CDS) or untranslated region (3' or 5')) are shown. A '?' indicates the position is unknown or uncertain.

Table 3.2. Genetic diversity statistics within MPB sampled in western Canada. For each locus, the allele size range for all samples and the number of alleles (NA) found in the initial survey of eight geographically separated MPB (†) and in 16 MPB from the Quesnel location are shown. For the Quesnel sampling location, observed heterozygosity (Ho), expected heterozygosity (He), and polymorphic information content (PIC) are also shown.

Table 4.1. Predicted function of EST loci based on BLASTx. The description, E values and scores are given.

Table 4.2. Genetic diversity of EST-derived loci and genomic loci at each sampling location. The observed (Ho) and expected (He) heterozygosities and number of alleles (NA) at each sampling location are shown. The data of locus 054, which was out of HWE (Chapter 2), were not included in the average statistics of genomic loci.

Table A.1. The composition of di- and tri-nucleotide repeats found among 14441 contigs in build 8 of the MPB EST database.

Table A.2. The PCR primers of monomorphic EST-derived microsatellite markers for Dendroctonus ponderosae.

Table A.3. Genotypic data of 16 MPB from the Quesnel sampling location at 50 polymorphic EST-derived microsatellite markers.

Table A.4. Genotypic data of MPB from six sampling locations in western Canada at five chosen EST-derived microsatellite markers. Sample locations are Houston (HO), Mackenzie (MA), Grande Prairie (GP), Whistler (WH), Banff (BA), and Nancy Green (NG).

List of Figures

Figure 2.1. Map of mountain pine beetle sampling locations in BC and Alberta. Sampling locations in 2005/06 are represented by light (red) circles and 2007/08 sampling locations are represented by dark (blue) circles. One location, Cypress Hills (CH) on the Alberta-Saskatchewan border (GPS coordinates 49.6130, -110.1884; sampled in 2008), is not shown in the map.
The location name, GPS location, year sampled and number of beetles genotyped (N) are given in Table 2.1.

Figure 2.2. Genetic diversity. (a) Decrease of mean expected heterozygosity (He) and allelic richness from south to north; (b) pattern of genetic diversity within the southern cluster; (c) pattern of genetic diversity within the northern cluster.

Figure 2.3. STRUCTURE analysis of MPB individuals in western Canada. (a) Mean log probability of data LnP(D) over 10 runs for each K value as a function of K (error bars represent standard deviation); (b) Evanno's ad hoc statistic ΔK as a function of K (over 20 replicates); (c) STRUCTURE results shown as pie charts: proportion of membership of each predefined sampling location in each of the K = 2 clusters, with prior sampling data not used. The solid line represents the boundary, or the 0.50/0.50 membership isocline, between the two main clusters. The 80 percent membership isoclines in each cluster are represented by the dashed lines.

Figure 2.4. Subclustering pattern of MPB in western Canada. Results of the TESS, BAPS (with and without spatial information), and BARRIER analyses are overlaid on top of STRUCTURE results.

Figure 2.5. Neighbor-joining tree of MPB sampling locations based on pairwise Nei's genetic distance. Bootstrap values > 50% are shown at the respective nodes. Clusters identified by STRUCTURE, TESS and BAPS are overlaid on the tree. Sampling locations with sample sizes < 15 are marked by red dots.

Figure 2.6. Isolation-by-distance (IBD) analysis within (w/n) and between (b/w) the clusters identified. Regression of genetic differentiation [estimated by FST/(1 - FST)] against the logarithm of geographical distance (km), following Rousset (1997). (a) IBD across the study area and within each main cluster; (b) IBD between clusters; (c) IBD within southern subclusters; (d) IBD within northern subclusters.

Figure 2.7. Examples of interconnected sampling locations based on non-significant pairwise FST values. Solid lines between locations represent connections (i.e., non-significant pairwise FST) (a) between sampling locations of the southern and northern clusters and (b) between four expansion sampling locations in Fairview (FVf), Fox Creek (FOf) and Grande Prairie (GPf and GP). Note that, uniquely among the sampling locations, all pairwise FST values (at P = 0.05) were significant for Whistler (WA). Results are overlaid on the relevant STRUCTURE, BAPS, TESS and BARRIER results (Figure 2.4). The location of Tweedsmuir Provincial Park (TM) relative to the sampling locations is shown.

Figure 3.1. The composition of microsatellite sequences with four or more repeats in contigs in build 8 of the MPB EST database. A total of 2938 microsatellite loci were found in 14441 contigs.

Figure 4.1. Sampling locations and number of samples at each location. The location codes are: HO (Houston), MA (Mackenzie), GP (Grande Prairie), WH (Whistler), NG (Nancy Green), and BA (Banff).

Figure 4.2. Comparison of the FST values of EST-derived (darker, maroon) and genomic loci (lighter, purple). The overall FST at EST loci was 0.16 and that at genomic loci was 0.064.

Figure 4.3. Comparison of FCT values, which represent genetic differentiation between northern and southern clusters.

Figure 4.4. Allele compositions at each locus in each cluster. (a) The north-south divergence shown at locus_675; (b) the allele compositions of the other EST-derived loci. Different colors represent different alleles at each locus.

Figure 4.5. Analysis with the program DETSEL 1.0. Locus_675 was an outlier at P = 0.01. Locus_054 and 884 were also outliers at P = 0.05.

Figure 4.6. Analysis with the program LOSITAN 2.0 under the SMM model at 0.99 confidence intervals.
Locus_675 was a candidate for positive selection and locus_4357 was a candidate for balancing selection.

Abbreviations

Abbreviation    Description
AB              Alberta
AFLP            Amplified fragment length polymorphism
BC              British Columbia
CO              Cytochrome oxidase
cDNA            Complementary DNA
CDS             Coding sequence
DNA             Deoxyribonucleic acid
EST             Expressed sequence tags
GBP             GAGA-binding protein
ITS2            Internal transcribed spacer 2
rDNA            Ribosomal DNA
RAPD            Randomly amplified polymorphic DNA
RFLP            Restriction fragment length polymorphism
MPB             Mountain pine beetle
mRNA            Messenger RNA
mtDNA           Mitochondrial DNA
ORF             Open reading frame
PCR             Polymerase chain reaction
SSCP            Single strand conformation polymorphism
UTR             Untranslated region

Acknowledgements

First and foremost, I would like to express my sincere gratitude to my supervisor, Dr. Brent W Murray. I am grateful for the continuous guidance, encouragement and support you gave me from the day I started. You contributed a lot to improving my understanding of the subject and to bringing the project to this stage.

I am also grateful to my thesis committee, Drs. Staffan Lindgren, Dezene Huber and David Coltman, for your contributions and suggestions to improve this project. Thank you for devoting your valuable time to my academic success.

Many thanks to the TRIA research management and, again, to my supervisor, for the funding provided through Genome BC, Genome Alberta and Genome Canada grants. It was great to work under the large TRIA research project. I also thank UNBC for offering me a travel award to participate in the Annual Conference of the Canadian Society of Ecology and Evolution and for offering me teaching assistant positions. Those greatly increased my academic experience while compensating my expenses.

I would also like to thank all the members of the lab. Thank you very much for all your help and cooperation from the beginning to the end. It was really nice to work with you all. A very big thanks goes to Dr. Celia Boon for bringing me to the field.
How nice it was to have weekly meetings with the forest insect research group (FIRG) at UNBC. The presentations and discussions helped me to improve in many ways. As a foreign student from a tropical country, it was important for me to listen to the FIRG members to get the background on other insects in pine forests in western Canada.

Many thanks go to Dr. Jean Wang in the HPC lab at UNBC for providing me a great working place for data analyses and thesis writing. I enjoyed that working place a lot and I miss it now.

I am also grateful to Annick Pereira, the international student advisor at UNBC, and Kendra Starr in the international office. You were always available to help and advise me in many ways.

My warmest thanks go to Dr. Brent W Murray's wife, Jacqui Dockray. Thank you very much for everything you did for me and my family to start our lives in this new country. I will never forget how thoughtful and warm you were.

My heartiest thanks to Mark and Binithi. Mark, thank you for tolerating all the changes in your life and for being the constant source of encouragement. Binithi, I will never forget how understanding you were for your age. I also thank my parents. You never compelled me towards studies, but let me feel how interesting it is.

Thank you very much to all of those who supported me in any respect for the completion of the project and during my stay in Prince George. I miss the many warm-hearted people in nice cold Prince George.

Lastly, in addition to the material world, I respect and thank the universal rhythm behind everything.

Chapter One

GENERAL INTRODUCTION

The mountain pine beetle (MPB), Dendroctonus ponderosae Hopkins (Coleoptera: Curculionidae: Scolytinae), is a native bark beetle in the forests of western North America. This species has recently caused an outbreak of record size and is arguably the most destructive pest of pine forests in western North America (Safranyik & Carroll 2006).
The current outbreak has led to the destruction of large tracts of lodgepole pine (Pinus contorta Dougl. ex Loud. var. latifolia Engelm), the primary host species in British Columbia (BC), and has expanded the historic range of the beetle into the Peace River area of Alberta, where it is now threatening jack pine (Pinus banksiana Lamb.) of the boreal forest of northern Canada (Pan et al. 2007; Kurz et al. 2008; Carroll et al. 2004; Safranyik et al. 2010). Because of the severity of its impact on the environment, as well as subsequent effects on the timber industry and potential for further spread, the MPB has been the focus of a large amount of recent research (Pan et al. 2007; Kurz et al. 2008; Safranyik et al. 2010; Tidiane et al. 2010; Zheng & Aukema 2010). A detailed understanding of the biology of the MPB is critical in order to mitigate the effects of the current outbreak and to predict the occurrence of future outbreaks.

Among many ongoing research projects, "TRIA-Mountain pine beetle systems genomics" (www.thetriaproject.ca) is a large multi-institutional research project on the epidemic. The long-term goal of this project is to use genomic information of the three interacting organisms in MPB epidemics (the bark beetles, the associated fungal pathogens, and the host trees) to improve ecological risk models as well as economic models, in order to make reliable predictions of the future forest inventory for industry. The studies described in this thesis were conducted under the larger umbrella of the TRIA project.

Three studies were conducted, each using genetic markers generated from genomic data of MPB to explore spatial genetic structure, information which will lead to insights into the biology of the MPB. In the first study I conducted a detailed population genetic analysis of the MPB in western Canada using a dataset of 15 genomic microsatellite markers.
The objective of this study was to increase the resolution of the population genetic structure of MPB described in Bartell (2008) and to understand dispersal of MPB in western Canada. The objective of the second study was to develop a suite of microsatellite markers within, or closely linked to, the coding loci of the MPB genome (i.e., genic microsatellite markers or EST-derived microsatellite markers) in order to increase the marker coverage available for future MPB genetic/genomic research. The third study was a preliminary survey using chosen EST-derived markers, with the objective of exploring the utility of genic microsatellites to identify loci under selection and to study spatial genetic structure.

1.1. Background information on mountain pine beetle (MPB)

1.1.1. Life cycle of MPB and the nature of damage

The life history of the mountain pine beetle is well known (Safranyik & Carroll 2006). Reproductively mature adults emerge from brood trees in mid- to late summer and disperse below the canopy in search of suitable hosts. Females initiate attack by boring through the bark and cambium, where they construct vertical ovipositional galleries. Aggregation pheromones that attract both sexes are emitted, mediating the mass attack that is required to successfully colonize a host (Borden et al. 1982). As females excavate ovipositional galleries, they inoculate the tree with symbiotic ophiostomatoid fungi, including the primary mutualist Grosmannia clavigera (previously Ophiostoma clavigerum) (Solheim & Krokene 1998; Six 2003) and usually also Ophiostoma montium (Harrington 1993) and Leptographium longiclavatum (Kim et al. 2005; Lee et al. 2006). Once a male has joined the female, mating occurs inside the gallery. The female then lays up to 75 eggs individually and alternately along the sides of the gallery. About two weeks after oviposition, larvae hatch and mine horizontal galleries by feeding on phloem tissues.
Both larval activity and fungal infection are responsible for killing the host tree (Safranyik & Carroll 2006). In a heavy attack the transport of food and water throughout the tree is disrupted and the host tree can be killed within one month (Amman et al. 1990; Niemann & Visintini 2005). However, the dying trees may remain green for the next 8 to 12 months. Larvae overwinter under the bark and resume feeding the following spring. By late June to early July the larvae make oval-shaped chambers at the end of the larval galleries, where they pupate. Adults remain under the bark, where they complete maturation by feeding on phloem and fungi, after which they emerge and the life cycle is repeated.

The mountain pine beetle normally completes its life cycle in one year (univoltine). However, its rate of development is determined by temperature; thus in colder regions, e.g., at higher elevations and latitudes, the life cycle may take up to two years (semivoltine) to complete (Bentz et al. 2001; Gibson et al. 2006). Most of its life cycle is spent under the bark of the host pine trees. The MPB overwinters mainly as larvae, partially protected from the cold temperatures by the bark. In addition, the increase in blood glycerol levels as temperature declines indicates that MPB larvae use antifreeze-like substances to withstand the winter. Studies have shown that most of the larvae can survive temperatures as low as -40°C (Wygant 1940 in Safranyik & Linton 1998; Régnière & Bentz 2007).

Of key importance during the life cycle of MPB (and many other bark beetles) is the ability to transmit blue stain fungi. Spores of these fungi are transmitted to new trees during attack. Many different symbiotic fungal species have been found associated with MPB. Among the symbiotic fungi transmitted by MPB, G. clavigera is highly pathogenic in Pinaceae (Yamaoka et al. 1995; Solheim & Krokene 1998).
The fungi germinate underneath the bark and grow rapidly along the galleries constructed by MPB as well as through the vascular tissues. The characteristic blue stain in MPB-killed trees results from the melanin produced by the symbiotic fungi. The fungi benefit from the association by being dispersed (Harrington 1993; Paine et al. 1997), whereas the beetles may benefit in several ways. For example, the fungi can help to overcome the host defences (Berryman 1972; Owen et al. 1987; Lee et al. 2006), and can provide favourable conditions for the MPB by altering the chemical and/or moisture composition of the host tissue (Reid 1961; Wagner et al. 1979). Further, the fungi provide nutrients required for MPB reproduction and/or development (Whitney et al. 1987; Goldhammer et al. 1990; Six & Paine 1998; Lee et al. 2006).

1.1.2. Impacts of the beetle on the forest system

Mountain pine beetle populations have four distinct phases: endemic, incipient-epidemic, epidemic and post-epidemic or collapse (Safranyik & Carroll 2006). These phases are defined based on the population size of the beetles relative to the abundance of available hosts. In the endemic phase the MPB density is low and the beetles can only live in small, weakened and stressed trees. Co-occurrence with other secondary beetles is common in the endemic phase. Relatively larger numbers of beetles are required to overcome the defences of an average large-diameter tree, and when population densities reach this level the MPB population is defined as incipient-epidemic. As beetle numbers increase further, MPB attain the epidemic phase and may spread rapidly, killing the majority of large-diameter trees over a vast area (Carroll et al. 2006; Safranyik & Carroll 2006). Finally, the MPB population collapses as the availability of hosts declines over time and space (Carroll et al. 2006; Safranyik & Carroll 2006).
The MPB and regularly occurring forest fires historically played balancing roles in lodgepole pine forests by replacing older, weaker trees with younger, healthier trees (Safranyik & Carroll 2006). MPB outbreaks have recently become more frequent than in the past (Kurz et al. 2008). At the current magnitude of the epidemic, the loss of millions of hectares of forest has serious environmental (Uunila et al. 2006) and economic effects (Wagner et al. 2006). Large volumes of high-value mature trees have been killed, reducing the available timber supply and thereby affecting the forest industries and forestry-dependent communities (Wagner et al. 2006). The destruction of habitat and changes in hydrology will also cause detrimental effects to wildlife and aquatic ecosystems (Uunila et al. 2006). In addition, modelling has indicated that the removal of vast areas of forest cover by MPB will increase the global carbon dioxide burden (Kurz et al. 2008). Finally, the loss of aesthetic values of the forests is also a concern.

1.1.3. Geographical distribution & MPB epidemics

The geographical range of MPB depends on the availability of both suitable host species and favourable climatic zones (Safranyik 1978). The historical range of MPB extended from northern Mexico to northwestern Canada through 12 U.S. states and 3 Canadian provinces (Carroll et al. 2004). The range of the primary host, lodgepole pine, extends further into the Yukon and northeast into much of Alberta (Carroll et al. 2004). Expansion of MPB into those regions may be restricted by the unfavourable climate (Carroll et al. 2004). The MPB range expansion has clearly followed the shifts of climatically suitable habitats (Carroll et al. 2004). Climatic suitability is expected to increase both within and outside the historic range of MPB, further increasing the potential for future epidemics (Carroll et al. 2004).
Although the MPB was historically widespread, outbreaks in Canada have been mainly restricted to the southern half of British Columbia (BC) and the extreme southwestern portion of Alberta (Safranyik & Carroll 2006; Safranyik et al. 2010). The spread of outbreaks was also likely restricted by the non-forested prairies and the high elevations of the Rocky Mountains (Carroll et al. 2006). Only one outbreak has been recorded in the Cypress Hills at the Alberta-Saskatchewan border (Safranyik & Carroll 2006).

After the MPB epidemic started in the mid-1990s in interior BC, outbreaks were reported in many regions across BC and Alberta (Carroll et al. 2006; Burton 2010). On account of the large area affected and considerable timber loss, this epidemic is recorded as the largest in Canadian history (Ritchie 2008; Tsui et al. 2009). So far, the ongoing MPB epidemic has attacked over 16 million hectares of pine forests in western Canada (Kurz et al. 2008; Burton 2010). Mountain pine beetle is responsible for the loss of 400 million cubic meters of merchantable lodgepole pine in BC from the mid-1990s to 2008 (Stickney 2007).

During the current epidemic, MPB crossed the Rocky Mountain barrier and invaded northeastern BC (Carroll et al. 2004). By 2006, the MPB epidemic extended into adjacent regions in Alberta (Safranyik & Carroll 2006; Raffa et al. 2008). The MPB is now situated in close proximity to the boreal forest of northern Canada, and modelling has predicted that much of the boreal forest will become climatically available to the beetle in the near future (Carroll et al. 2004). Of special concern is the jack pine (Pinus banksiana), whose range extends all the way to the eastern seaboard. Pinus banksiana is dominant in boreal forests and is a potential host for MPB (Cerezke 1995; Safranyik et al. 2010). The two hybrid zones of lodgepole and jack pine in northern Alberta were expected to act as effective corridors for MPB to invade the boreal forests (Carroll et al. 2004). Confirming the earlier predictions, MPB was reported from lodgepole pine stands neighbouring jack pine forests in boreal forests (Safranyik et al. 2010). Mock et al. (2007) reported that if MPB successfully invades the jack pine in boreal forests, the range of MPB may increase considerably, as boreal forests extend into eastern Canada and the central United States. Due to the lower susceptibility of jack pine, the rates of growth and spread of MPB populations within the boreal forests are expected to be low (Safranyik et al. 2010). However, Cullingham et al. (2011) have recently shown through genetic analysis successful MPB attack in pure jack pine. The shifts of MPB into new habitats and hosts will undoubtedly result in significant ecological and socio-economic consequences (Ayres & Lombardero 2000; Cullingham et al. 2011).

1.2. Spatial genetic studies

1.2.1. Spatial genetic studies of other bark beetles

Most species of plants and animals show some level of genetic structuring (Allendorf & Luikart 2007). The degree of population differentiation is related to the amount of current and historic gene flow among populations. Analysis of population structure has been used to study the amount of current connectivity among populations of a species (Balloux & Lugon-Moulin 2002) as well as to investigate evolutionary and historical events that may have led to the observed patterns of genetic differentiation (Cruzan & Templeton 2000). Hence, population genetic approaches are used to understand ecological and evolutionary aspects of many species.

Hundreds of species of bark beetles have been reported in the United States and Canada (Bentz et al. 2010), where they are common pests of conifers. The preferred host range, the size of the trees attacked, and the location and nature of the damage are specific for a given bark beetle (Cognato & Grimaldi 2009).
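As a rough illustration of how the population differentiation discussed in this section is commonly quantified, the sketch below computes expected heterozygosity (He) and Wright's FST for a single locus from allele frequencies. It is only a minimal sketch under stated assumptions: the function names, the assumption of equally weighted populations, and the allele frequencies for the "north" and "south" populations are all hypothetical and are not data or methods from this thesis.

```python
from typing import Dict, List


def expected_heterozygosity(freqs: Dict[str, float]) -> float:
    """Expected heterozygosity at one locus: He = 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs.values())


def fst(populations: List[Dict[str, float]]) -> float:
    """Single-locus FST = (HT - HS) / HT, weighting all populations equally.

    HS is the mean within-population He; HT is He computed from the
    mean allele frequencies across populations.
    """
    alleles = set().union(*populations)  # every allele seen in any population
    n = len(populations)
    mean_freqs = {
        a: sum(pop.get(a, 0.0) for pop in populations) / n for a in alleles
    }
    h_t = expected_heterozygosity(mean_freqs)
    h_s = sum(expected_heterozygosity(pop) for pop in populations) / n
    return (h_t - h_s) / h_t if h_t > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical allele frequencies at one microsatellite locus.
    north = {"A": 0.9, "B": 0.1}
    south = {"A": 0.5, "B": 0.5}
    print(round(expected_heterozygosity(north), 3))  # lower diversity in "north"
    print(round(fst([north, south]), 3))             # roughly 0.19
```

In this toy case the two populations share alleles but differ in frequency, so FST falls between 0 (identical frequencies) and 1 (populations fixed for different alleles). Published analyses typically use sample-size-corrected estimators across many loci rather than this simplified form.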
Spatial genetic studies have been done on bark beetles using different molecular genetic markers and at various geographical scales (i.e., regional and range-wide) to address various ecological and evolutionary questions. Among the molecular markers available to date, allozymes were the first genetic markers used to study population structure (Avise 2004). Stock et al. (1979) assessed the degree of genetic divergence between Douglas-fir beetles (Dendroctonus pseudotsugae) in the Pacific Northwest using isozymes produced by 13 gene loci, and noted a clear genetic differentiation between coastal and inland groups. Anderson et al. (1979) studied six genes electrophoretically in five populations of the southern pine beetle, Dendroctonus frontalis, and found that eastern and western populations of D. frontalis have become genetically differentiated. The populations in Mexico and Arizona were genetically different from the others (Texas, Georgia, and Virginia) and from each other, reflecting the geographical separation at the genetic level (Anderson et al. 1979). Six et al. (1999) assessed the range-wide population genetic structure of the sister species of MPB (Kelley & Farrell 1998), the Jeffrey pine beetle (D. jeffreyi), and discovered two groups (northern and southern), consistent with geographic isolation.

Some population genetic studies that utilized methods such as mitochondrial and nuclear sequencing showed the role of geographic barriers in the population dynamics of bark beetles (see Bartell 2008). Duan et al. (2004) analyzed the population genetic structure of Tomicus piniperda in 12 populations in Yunnan (southern China) and one in Jilin (northern China) using mitochondrial cytochrome oxidase (COI and COII) and nuclear internal transcribed spacer 2 (ITS2) of ribosomal DNA (rDNA) and 28S-rDNA sequences, and compared the results to those obtained in France.
They showed that the Yunnan populations differed markedly from the French and Jilin populations and concluded that the individuals sampled in Yunnan belong to a new, undescribed species (Tomicus sp. nov.). Ritzerow et al. (2004) further supported the population structuring and differentiation in T. piniperda by analyzing the sequences of a region of the mitochondrial COI gene in beetle samples from European, Asian and American populations. The bark beetle Tomicus destruens is restricted to the Mediterranean basin and the Atlantic coasts of North Africa and Portugal (Horn et al. 2006). A phylogeographic analysis using single-strand conformation polymorphism (SSCP) of COI showed that T. destruens populations of southern and central Italy differ strongly from a population of northern Italy (Faccoli et al. 2005). The authors attributed the observed geographic structure to the fragmentation of the host pine ranges. Horn et al. (2006) studied 42 populations of the same species using sequences of the mitochondrial genes COI and COII and identified two clades, eastern and western, among 53 haplotypes. Horn et al. (2006) discussed the potential roles of host species, climatic parameters and geographical barriers, and compared the phylogeographic patterns to classical models of postglacial recolonization in Europe to explain the two clades and the contrasting levels of genetic structure observed within each clade.

Compared to the population genetic studies on bark beetles in Europe, considerably fewer studies have been done on the bark beetles of North America. Among them, Dendroctonus brevicomis populations from across its range in North America have been studied using PCR-based restriction fragment length polymorphism (RFLP) analysis of a 1250-bp region of the mitochondrial COI gene (Kelley et al. 1999). Differences between western (California, Oregon, Idaho, and British Columbia) and eastern (Colorado, Utah, Arizona, New Mexico) populations suggested that D.
brevicomis is composed of two cryptic species (Kelley et al. 1999). The authors suggested that those populations of D. brevicomis may have become reproductively isolated as a consequence of the geographic separation of the host varieties. Cognato et al. (2003) performed cladistic and nested clade analyses of mtDNA COI sequence data from 95 pinyon pine beetle (Ips confusus) individuals collected from two hosts, Pinus monophylla and Pinus edulis, and an atypical host, spruce (Picea pungens), from 10 western United States populations. The three main haplotype lineages identified by the nested clade analysis corresponded with three geographic localities, reflecting the effect of past glaciation events but not differences in host use (Cognato et al. 2003). Maroja et al. (2007) used mitochondrial DNA sequences and allele frequencies at nine microsatellite loci to examine genetic population structure across the current range (North America) of the spruce beetle (Dendroctonus rufipennis). Three major groups were identified, two of which were found across northern North America, from Newfoundland to Alaska, on white spruce (Picea glauca), while a third group was found in the Rocky Mountains, on Engelmann spruce (Picea engelmannii). This study showed the effect of both geographical region and differences in host use on population structuring in D. rufipennis.

Contradictory results have been observed in some studies regarding the population structuring of bark beetles (i.e., population structure was detected only with some markers and/or only at some geographical scales). Stauffer et al. (1999) analyzed the spruce bark beetle (Ips typographus) by enzyme electrophoresis and by mitochondrial DNA sequence analysis in order to quantify the degree of population differentiation. Mitochondrial DNA grouped the beetles into three groups: central European, Scandinavian, and central Russian.
Among the eight haplotypes identified, one was found only in Russian and Lithuanian populations, while the other seven haplotypes showed different degrees of admixture over the study area. In contrast, enzyme electrophoresis showed high gene flow among all European populations (Stauffer et al. 1999). Other studies of genetic variation in the spruce bark beetle show no evidence of population structure. For example, Salle et al. (2007) analyzed five nuclear microsatellite markers and found a homogeneous genetic structure in spruce bark beetles across 28 locations in Europe. Felix et al. (2008) reported that spruce bark beetles in Switzerland had experienced various and repeated population dynamics, and hence they expected to see greater genetic differentiation and population structuring. However, their analysis of five nuclear microsatellites in spruce bark beetles sampled from pheromone traps at 30 locations across Switzerland gave no evidence for population structure, isolation by distance, or recent bottlenecks. Felix et al. (2008) assumed that high gene flow or high effective population sizes prevent the genetic differentiation of spruce bark beetles (I. typographus) in Switzerland.

Similarly, contradictory results have also been reported in studies of Tomicus piniperda. Kerdelhue et al. (2002) showed that T. piniperda in 16 populations sampled from six pine species in France clustered in two mitochondrial DNA haplotypic groups (clades A and B). Further, the length differences observed in internal transcribed spacer 1 of ribosomal DNA also supported the divergence between clades A and B. In contrast, only very weak population differentiation and a weak isolation by distance effect were observed in T. piniperda in France when five microsatellite markers were used by Kerdelhue et al. (2006) in a separate study.
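Population differentiation of the kind compared in these studies is commonly summarized with F-statistics. As a minimal sketch, assuming allele frequencies are already known for each population, the Python below computes Nei's G_ST, a simple F_ST analogue, for a single locus. The function names and the example frequencies are hypothetical, populations are weighted equally, and this is not the estimator used in any of the studies cited.

```python
from typing import Dict, List


def expected_heterozygosity(freqs: Dict[str, float]) -> float:
    """Expected heterozygosity, 1 - sum(p_i^2), for one population at one locus."""
    return 1.0 - sum(p * p for p in freqs.values())


def gst(pop_freqs: List[Dict[str, float]]) -> float:
    """Nei's G_ST = (H_T - H_S) / H_T for one locus, with equal population weights."""
    n = len(pop_freqs)
    # H_S: mean within-population expected heterozygosity
    h_s = sum(expected_heterozygosity(f) for f in pop_freqs) / n
    # H_T: expected heterozygosity of the pooled (mean) allele frequencies
    alleles = set().union(*pop_freqs)
    mean_freqs = {a: sum(f.get(a, 0.0) for f in pop_freqs) / n for a in alleles}
    h_t = expected_heterozygosity(mean_freqs)
    return 0.0 if h_t == 0 else (h_t - h_s) / h_t


if __name__ == "__main__":
    # Hypothetical allele frequencies at one locus in two populations
    north = {"A": 0.6, "B": 0.4}
    south = {"A": 0.4, "B": 0.6}
    print(round(gst([north, south]), 3))
```

Because each locus yields its own estimate, values from different markers can differ considerably, so multi-locus studies usually report a combined figure rather than relying on a single locus.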
The studies reviewed above show that the marker system used and the scale of the study may affect the ability to detect population structure. The discrepancies between nuclear and mitochondrial markers could be due to the maternal inheritance of mitochondrial DNA and to sex-biased dispersal (Salle et al. 2007). The ability to recognize population structure is not the same for all markers (Dayanandan et al. 1999). Even among microsatellite markers, the mutation rate differs from locus to locus. Depending on the mutation rate of the locus, a microsatellite marker may or may not show the true structuring pattern (Balloux & Lugon-Moulin 2002). Further, marker number is typically related to variance in parameter estimates: with fewer markers, the variance around the true structure is expected to increase. Bamshad et al. (2003) reported that the use of a higher number of markers can increase the resolution of a study. As the two contradictory microsatellite studies described above each used only five markers, chance alone (i.e., marker selection) might also be a reason for the lack of genetic structure noted in each species.

Most population genetic studies of bark beetles support the idea that geographic barriers such as mountain ranges (Horn et al. 2006) and large deserts (Mock et al. 2007) can prevent gene flow and cause genetic differentiation between populations. In addition, host species, climatic parameters and postglacial recolonization can lead to population structuring (Horn et al. 2006).

1.2.2. Spatial genetic studies of MPB

Spatial genetic studies of MPB have also been done using different molecular markers and at various scales. Two studies (Stock et al. 1984; Calaps et al. 2002 cited in Bartell 2008) reported no evidence of population structuring in MPB. Calaps et al. (2002) used randomly amplified polymorphic DNA (RAPD) markers to study MPB and found no evidence for population structure.
However, this study might have been affected by fungal DNA, as the initial DNA extractions were performed on whole beetles and did not control for fungal contamination (see Bartell 2008). Further, the small sample size per population (15) may not have been adequate to detect genetic differentiation. Stock et al. (1984) used allozyme analysis (18 allozyme loci: six polymorphic and 12 monomorphic) to study MPB at 15 sites in seven western states of the United States, and observed a high level of genetic similarity across the study area. However, in locus-specific analyses, Stock et al. (1984) reported that some allozyme markers (e.g., esterase, aspartate aminotransferase 1, leucine aminopeptidase 2) showed greater genetic differentiation among MPB. The use of monomorphic loci could be one reason for the low resolution of this study. Further, the effect of sample size is unknown, since the sample size was not reported.

Most of the other previous studies provide evidence for genetic differentiation and population structuring in MPB collected from different geographic locations. Stock & Guenther (1979) examined MPB at six locations in the Pacific Northwest using isozyme variation, and identified two major groups by analyzing pairwise similarity coefficients of the locations. The observed north-south distribution of beetles in Idaho was well reflected by the gene frequencies at the aspartate aminotransferase locus (Stock & Guenther 1979). In addition, pronounced differences in allele frequencies among MPB in geographically distinct locations (Mammoth Lake and Lassen National Forest in California) were found in a study based on three polymorphic allozyme markers (esterase, peptidase and phosphoglucose isomerase) (Kelley et al. 2000).
The scale of this study was smaller than that of Stock et al. (1984), but the use of loci that showed higher differentiation might have increased the resolution (i.e., Stock et al. (1984) reported that the esterase locus varied most among groups). Mock et al. (2007) used a combination of amplified fragment length polymorphism (AFLP) analysis and mitochondrial sequencing of portions of the COI, COII and tRNA-Leu genes, and found evidence of genetic structuring of MPB and a broad isolation-by-distance pattern.

Bartell (2008) used microsatellite markers to study the population genetic structure of MPB and generated many informative results. The sampling design in his study was uniform over a large geographical area. Bartell (2008) used six microsatellite markers on adult beetles collected from 35 infested lodgepole pine stands throughout western Canada (BC and Alberta). At each sampling site, an average of 46 beetles (standard deviation of 5.17) were analyzed. AMOVA revealed shallow but highly significant D. ponderosae population structure in western Canada (global FST = 0.03828; P < 0.00001). The Bayesian analyses (STRUCTURE and TESS) supported the existence of northern and southern clusters. A SAMOVA analysis further refined the boundary between the two groups. The decline of genetic diversity from south to north and the presence of an isolation by distance (IBD) effect in the southern cluster but not in the northern cluster were among the important findings of Bartell (2008).

1.3. Microsatellite DNA

1.3.1. Microsatellites as genetic markers

Microsatellites, also called Simple Sequence Repeats (SSRs) or Short Tandem Repeats (STRs), are tandem repeats of short (1-6 bp) nucleotide motifs. Microsatellites have been found in all eukaryotic genomes sequenced so far (Goldstein & Schlotterer 2001; Mudunuri & Nagarajaram 2007). Microsatellite loci are distributed throughout the nuclear and chloroplast genomes, and are sometimes found in mitochondrial DNA.
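The definition above (tandem repeats of a 1-6 bp motif) can be illustrated with a simple pattern search. The Python below is a minimal sketch only, not the screening procedure used in this thesis: the function name find_ssrs, the threshold of five repeats, and the example sequence are assumptions chosen for illustration, and only perfect (uninterrupted) repeats are reported.

```python
import re
from typing import List, Tuple


def find_ssrs(seq: str, min_repeats: int = 5, max_motif: int = 6) -> List[Tuple[int, str, int]]:
    """Return (start, motif, repeat_count) for each perfect tandem repeat in seq.

    The lazy quantifier makes the regex try the shortest motif first, so a
    run such as CTCTCTCTCTCT is reported with motif CT rather than CTCT.
    min_repeats and max_motif are illustrative defaults, not published thresholds.
    """
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif, min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        repeats = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, repeats))
    return hits


if __name__ == "__main__":
    # Hypothetical sequence: a (CT)6 dinucleotide repeat and an (A)8 mononucleotide run
    example = "GGA" + "CT" * 6 + "GGTT" + "A" * 8 + "C"
    for start, motif, repeats in find_ssrs(example):
        print(start, motif, repeats)
```

In practice, screening tools usually apply a different minimum repeat number to each motif length; a single threshold is used here only to keep the sketch short.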
A large percentage of microsatellite loci exhibit high levels of variability in repeat number, but the underlying molecular mechanism generating microsatellite variability is not fully understood (Goldstein & Schlotterer 2001). It is believed that microsatellites evolve through slipped-strand mispairing (Levinson & Gutman 1987), unequal recombination, or a combination of both (Jakupciak & Wells 1999; Wells 1996). The number of repeats at a microsatellite locus may vary from 5 to 80 within a given population. Microsatellite loci can be amplified by PCR, and the length differences can be detected using various detection systems. These codominant genetic markers are extensively used in forensics, parentage testing, analysis of the genetic structure of populations and the assessment of phylogenetic relationships (Goldstein & Schlotterer 2001).

1.3.2. Genomic distribution of microsatellites

Microsatellites were initially thought to occur only in noncoding regions of the genome, but recent studies have shown the presence of microsatellite sequences in all regions of a gene: within the coding sequence (CDS) as well as within the 5' and 3' untranslated regions (UTRs) and introns (Peng & Lapitan 2005; Varshney et al. 2005; De la RosaReyna et al. 2006). However, the abundance of microsatellite sequences may be low in coding regions relative to noncoding portions of the genome (Goldstein & Schlotterer 2001; Hancock 1999). Among microsatellites, poly(A) or poly(T) mononucleotide repeats appear to be the most abundant in many genomes, but beyond this a common frequency distribution pattern cannot be defined, since slight differences occur among organisms (Goldstein & Schlotterer 2001). However, dinucleotide repeats have been reported as the most common type of microsatellite found in arthropods (Demuth et al. 2007).

1.3.3. Functional roles of microsatellites

Although it was generally assumed that microsatellites are selectively neutral and represent the 'junk DNA' of a genome, there is evidence that some microsatellites have functional importance (Kunzler et al. 1995; Belkum 1999; Moxon & Wills 1999; Kashi & King 2006; Donaldson et al. 2008). Regulatory elements involved in gene expression are usually located upstream of the coding sequence, so microsatellite sequences found within the upstream regulatory region may be involved in gene expression. They may serve as regulatory elements which, through interactions with gene regulatory proteins, may either switch a gene on or off or control its level of expression (Riva 2000). The ability of some microsatellite repeats to bind proteins, indicating a possible functional role in gene regulation, has been shown both in vivo and in vitro (Timchenko et al. 1996; Epplen et al. 1996; Jacob et al. 2004; Gangwal et al. 2008). Indeed, various studies have provided evidence for regulatory functions of microsatellites in the expression of some genes. For example, the (GA)n repeat sequences present in promoters, named 'GAGA' elements, regulate numerous developmental events in animals (Bevilacqua et al. 2000; Busturia et al. 2001). Studies have shown that a protein called GAGA-binding protein (GBP) binds only to this microsatellite sequence in the promoter and that this binding enhances the expression of the gene (Sangwan & O'Brian 2002).

Polymorphic microsatellite loci found within some genes have been associated with quantitative genetic variation (Kashi et al. 1997). For example, the length polymorphism of the (CT)n microsatellite locus in the 5' UTR of the waxy gene in rice has a quantitative effect on the percentage of amylose content (Ayers et al. 1997; Prathepha 2008).
Further, microsatellite sequences found in introns have been shown to be involved in a range of activities in various organisms, including control of transcription and control of splicing, either by altering the final protein product (alternative splicing) or by changing the efficiency of splicing, while microsatellites in 3' UTR regions might be involved in gene silencing or in producing longer mRNAs by transcription slippage (Pagani et al. 2000; Fabre et al. 2002; Li et al. 2004; Varshney et al. 2005).

The association between microsatellite repeat number and some diseases further reveals the functional significance of microsatellite sequences (Kovtun et al. 2001; Goldstein & Schlotterer 2001). Recent studies show that changes in microsatellite sequences can have selectively disadvantageous effects (Goldstein & Schlotterer 2001). For example, length changes of tandem repeats have been identified as the causes of about 20 diseases in humans (Goldstein & Schlotterer 2001). Variation in microsatellite repeat number is known to have definite effects not only on the expression of a gene (Liu et al. 2000) but also on its protein product (Li et al. 2004). For example, expanded CAG/CTG repeats have a higher propensity to form secondary structures and hence have been found to be responsible for some triplet expansion disorders (Goldstein & Schlotterer 2001). Chamberlain et al. (1994) reported that some transcription factors contain stretches of CAG repeats that code for polyglutamine tracts. Using an androgen receptor, they showed that length changes in this region alter its ability to bind the target, hence preventing the expression of several different reporter genes.

The presence of polymorphic microsatellite sequences within genes, and the growing evidence of their involvement in gene regulation, indicate that genic microsatellites can be a major source of functional molecular evolution (Kunzler et al. 1995; Rosenberg et al. 1994).
Microsatellite sequences occurring within or linked to coding regions may enhance, suppress or modify the functions of genes, leading to the evolution of the functional portion of a genome (Kunzler et al. 1995; Rosenberg et al. 1994).

1.3.4. EST-derived microsatellite markers (genic microsatellites)

Microsatellite markers are broadly divided into two categories, genomic and genic, usually depending on the method of development. Microsatellite markers developed from expressed sequence tag (EST) databases are known as genic microsatellite markers, i.e., they are part of a gene. In contrast, microsatellite markers isolated by traditional methods, such as from repeat-enriched genomic libraries, are known as genomic microsatellites and are usually considered neutral, non-gene-linked markers.

Expressed sequence tags are short sequence reads generated from a cDNA library constructed from a whole organism, a specific tissue or a developmental stage. A cDNA library is a collection of clones, each of which contains a DNA copy of a messenger RNA (mRNA) present at the time of RNA extraction. A collection of EST sequences for an organism (also known as a library or database) therefore represents the expressed genes (i.e., genes transcribed to mRNA) of that organism. A single EST sequence, however, usually represents only a portion of an expressed gene.

Many studies have shown the presence of microsatellites in EST libraries (Prasad et al. 2005; Kim et al. 2008; Pannebakker et al. 2010). With the advancement of sequencing technology, EST libraries have been generated for many organisms, and hence ESTs have become a very efficient source for microsatellite marker development (Kantety et al. 2002).
The in silico mining of EST databases allows rapid identification of large numbers of microsatellite markers and is therefore an attractive alternative to the earlier approach of generating repeat-enriched genomic libraries (Liu et al. 1999; Coulibaly et al. 2005; Perez et al. 2005; Bouck & Vision 2007).

In addition to the ease and efficiency of their development, EST-derived microsatellite markers are becoming increasingly popular for several reasons (Hisano et al. 2008). First, microsatellites in ESTs are flanked by transcribed regions, which are more conserved than the noncoding regions of a genome (Grillo et al. 2006). This increases the stability of the primers and hence reduces the occurrence of null alleles (Grillo et al. 2006). Second, the conserved nature of the flanking regions makes EST-derived microsatellites more transferable between species than genomic microsatellites (Chabane et al. 2007). Last, EST-derived microsatellites are more informative because they not only provide information about genetic polymorphism but are also closely linked to, or are part of, the functional allelic differences of expressed genes (Decroocq et al. 2003). Genic microsatellites have also been used to study allelic differences in gene expression (Ayers et al. 1997; Varshney et al. 2005). Genic microsatellites have been categorized as "perfect" genetic markers, since they exist within genes and represent actual gene-linked variation of an organism (Chabane et al. 2007). EST-derived microsatellites have been used in both population genetic and population genomic studies (Kim 2008). Owing to the efficiency of their development and their large number of potential applications, genic microsatellite markers have been developed for many species of both animals and plants (Prasad et al. 2005; Varshney et al. 2005; Kim et al. 2008; Pannebakker et al. 2010).

1.3.5. EST databases

ESTs are often partial sequences, and many of them match the same mRNA of a gene. To facilitate the use of EST databases, the partial sequences of the same gene are usually assembled into EST contigs. Sequences in a database that have no matching sequences are known as singletons. The function of many of these transcripts can be predicted using bioinformatics tools. In addition to their usefulness in developing gene-linked molecular markers, EST databases have many applications in genomic science, including gene discovery, gene expression studies (EST microarrays), identification of potential microRNAs, comparative genomics and studies of functional diversity, and are therefore being rapidly generated for various organisms (Ayers et al. 1997; Mita et al. 2003; Peng & Lapitan 2005; Varshney et al. 2005; Zhang et al. 2005; Clynen et al. 2006).

1.3.6. EST database of MPB

Under the TRIA project, an EST database has been developed for MPB (password-protected site: http://seqweb.bcgsc.ca/BSF_Beetle/homepage.pt). The MPB samples for this database were collected from three locations in northern BC (Table 1.1). The ESTs in this database were generated from a total of 14 cDNA libraries, as summarized in Table 1.1. The database represents mRNAs from different tissues (antennae, head, midgut, etc.) and different developmental stages (larvae, pupae, teneral adults and adults). Further, to induce the expression of certain genes, some beetles were treated with certain chemicals before mRNA extraction (Table 1.1). Hence, the EST database of MPB represents a large array of expressed genes. The most recent build of the MPB EST database (build eight, Feb 2011) consists of 178536 total reads. These reads have been assembled into 14441 contigs and 7955 singletons. Chapters three and four are based on microsatellite markers developed by screening the contigs in build eight.

Table 1.1.
Mountain pine beetle cDNA libraries used to develop the EST database. The library name, tissues used, construction method and the sampling location of beetles collected for each library are given.

Library | Tissue | Construction method | Location
Dpo01 | Larvae, mixed instar and undetermined sex, whole | Untreated, un-normalized | KK
Dpo02 | Pupae, undetermined sex, whole | Untreated, un-normalized | KK
Dpo03 | Pupae, undetermined sex, whole | Untreated, normalized | KK
Dpo04 | Antennae from adults of undetermined sex | Untreated, un-normalized | KK
Dpo05 | Whole teneral adults of undetermined sex | Untreated, normalized | KK
Dpo06 | Whole emerged adults, equal quantity of sexes | 24 h after juvenile hormone III treatment, un-normalized | KK
Dpo07 | Whole emerged adults, unsexed | Untreated, un-normalized | KK
Dpo08 | Whole emerged adults, equal number of sexes | Treated with α-pinene, β-pinene, 3-carene, verbenone, or myrcene vapour, un-normalized | KK
Dpo09 | Midgut and adhering fatbody of emerged adults of both sexes | 1, 5, and 16 hr after antennectomy and topical juvenile hormone treatment, un-normalized | JC
Dpo10 | Midgut and adhering fatbody of emerged adults of both sexes | 1, 5, and 16 hr after antennectomy and topical juvenile hormone treatment, normalized | JC
Dpo11 | Midgut and adhering fatbody of emerged adults of both sexes | After feeding on lodgepole pine for up to 64 h, normalized | JC
Dpo12 | Midgut and adhering fatbody of emerged adults of both sexes | After feeding on lodgepole pine for up to 64 h, un-normalized | JC
Dpo13 | Heads from untreated teneral adults and adults topically treated | Treated with JHIII or exposed to various monoterpenes, normalized | KK
Dpo14 | Whole larvae, cold acclimated | Normalized | TJ

Sampling locations: KayKay Forest Service Road, Prince George (KK), Jackman Flats Provincial Park (JC), and Crown land just west of Tête Jaune Cache (TJ). The approximate coordinates are N 54° 02.731' W 123° 19.109', N 52° 55.958' W 119° 22.457' and N 53° 03.588' W 119° 36.879', respectively.

1.4. Thesis Organisation and Objectives

This thesis is written in a sandwich format in which each chapter is a self-contained study. The five chapters included in the thesis are: an introductory chapter (Chapter 1), three data chapters (Chapters 2, 3 and 4) and a final chapter with a general discussion of the overall study (Chapter 5). The references cited in all the chapters are combined into one reference list, which follows Chapter 5. The appendix contains the genotypic data and supplementary tables and figures.

The overall objective of this thesis was to analyze the spatial genetic variation of MPB over a large geographical area in western Canada, including recent outbreak locations. Chapter 2 describes the analysis of genomic, 'neutral' microsatellite variation over the entire sample of MPB in western Canada. Genic microsatellite variation is explored in Chapters 3 and 4. Chapter 3 describes the development of genic microsatellites for MPB, while Chapter 4 illustrates their usefulness in a preliminary survey of 5 loci at 6 key sampling locations. The studies described in Chapters 2 and 3 are intended for publication (Chapter 3 was accepted for publication in Molecular Ecology Resources) and have received input from the co-authors. The main objective of each data chapter, as well as the contribution of all other authors to this work, is described below.

Chapter Two

Title: Spatial genetic structure of the mountain pine beetle (Dendroctonus ponderosae) outbreak in western Canada: historical patterns and contemporary dispersal

List of Authors: N. Gayathri Samarasekera, Nicholas V. Bartell, B. Staffan Lindgren, Janice E.K. Cooke, Corey S. Davis, Patrick M. A. James, David W. Coltman, Karen E. Mock, and Brent W. Murray

The main objective of the work conducted and described in Chapter 2 was to determine the population genetic structure of MPB in western Canada and to understand dispersal patterns.
For this study, a large microsatellite dataset developed under the mountain pine beetle system genomics (TRIA) project was analyzed. This dataset was an extension of the dataset developed by Bartell (2008). Bartell's (2008) dataset consisted of genotypic data for about 2000 beetles genotyped at six microsatellite markers, while the current dataset consists of genotypic data for 4607 beetles at 15 microsatellite markers, including one sex-linked marker. Further, the beetle samples in the Bartell (2008) study were mainly from BC and were collected in 2005/06, while the current dataset contains beetles from BC and from recent outbreak locations in Alberta. The sampling of beetles used in this study was done by both the University of Northern British Columbia (supervised by Dr. Brent W. Murray and Dr. B. Staffan Lindgren) and the University of Alberta (supervised by Dr. Janice E.K. Cooke). The genotyping was done at the University of Alberta under the supervision of Dr. Janice E.K. Cooke and Dr. David W. Coltman. The microsatellite markers used in this study were from Davis et al. (2009), and the lab work for genotyping was mainly carried out by Andrea Singh under the supervision of Corey Davis. My responsibility was to carry out a detailed population genetic analysis of the large microsatellite dataset and to be the lead author of the resulting manuscript (i.e., primary author of the methods, results and discussion sections). Dr. Brent W. Murray was the primary supervisor of this study. This chapter contains input and editorial contributions from David W. Coltman, Karen E. Mock, B. Staffan Lindgren, Nicholas V. Bartell, Patrick M. A. James and Brent W. Murray. This chapter was formatted for the journal Molecular Ecology.

Chapter Three

Title: Isolation and characterization of EST-derived microsatellite markers for the mountain pine beetle (Dendroctonus ponderosae Hopkins)

List of Authors: N. Gayathri Samarasekera, Christopher I. Keeling, Jorg Bohlmann, Brent W.
Murray

The main objective of the study described in this chapter was to develop a suite of genic microsatellite markers (i.e., markers closely linked to, or part of, the coding region of mountain pine beetle genes). For this study, an EST database developed by the TRIA project was used as the genetic source to identify possible microsatellite loci in the expressed genes of MPB. This MPB EST database was developed through collaboration between UBC (Dr. Christopher I. Keeling and Dr. Jorg Bohlmann) and UNBC (Dr. Dezene Huber). My role in this study was to screen the database for microsatellite sequences and to develop a set of polymorphic genic microsatellite markers. I conducted all lab work and was the lead author in the preparation of the manuscript. Dr. Brent W. Murray was the primary supervisor of this study. The manuscript has received editorial comments from all co-authors and from the subject editor of Molecular Ecology Resources. It was accepted for publication by the journal Molecular Ecology Resources in Dec 2010. All genic microsatellite markers developed have been submitted to the Molecular Ecology Resources Primer Database (www.tomato.biol.trinity.edu).

Chapter Four

Title: A preliminary survey of spatial genetic variation of the mountain pine beetle (Dendroctonus ponderosae) outbreak in western Canada with five EST-derived microsatellite markers: detecting signatures of selection

List of Authors: N. Gayathri Samarasekera, Brent W. Murray

Chapter 4 describes a preliminary population survey done using five of the newly developed EST-derived markers (Chapter 3). The main objective of this work was to assess the feasibility of using these markers to identify loci under selection. I conducted all lab work and analysis for this study under the supervision of Brent W. Murray.
Chapter Two

SPATIAL GENETIC STRUCTURE OF THE MOUNTAIN PINE BEETLE (DENDROCTONUS PONDEROSAE) OUTBREAK IN WESTERN CANADA: HISTORICAL PATTERNS AND CONTEMPORARY DISPERSAL¹

¹ Chapter 2 is a modified version of the manuscript that has been prepared for submission for publication.

Abstract

The mountain pine beetle is currently causing an epidemic of record size in western Canada. A lack of long distance dispersal data on bark beetles has limited our understanding of, and ability to manage, epidemics. Our goal was to determine the spatial genetic variation found among western Canadian mountain pine beetle populations, from which genetic structure and dispersal patterns may be inferred. Beetles from 49 sampling locations throughout British Columbia and Alberta were analyzed at 13 microsatellite loci. The mountain pine beetle exhibits significant north-south population structure in western Canada, as supported by 1) Bayesian-based analyses (STRUCTURE, TESS, BAPS); 2) north-south genetic relationships and diversity gradients; and 3) the lack of isolation by distance in the northernmost cluster. The north-south structure is proposed to have arisen from the processes of postglacial recolonization as well as climate-driven differences in population dynamics. In terms of dispersal patterns, a prominent initial hypothesis suggested that the epidemic grew from an epicenter in Tweedsmuir Provincial Park, an early site of infestation in the current epidemic. Our findings, however, are consistent with spatiotemporal analyses of the current epidemic that support a multicenter hypothesis. Northern outbreaks are consistent with an expansion out of Tweedsmuir Provincial Park, while southern outbreaks are consistent with multiple centers of origin.

2.1. Introduction

The mountain pine beetle, Dendroctonus ponderosae Hopkins (Coleoptera: Curculionidae: Scolytinae), is the most destructive pest of pine forests in western North America (Safranyik & Carroll 2006).
An ongoing and unprecedented mountain pine beetle outbreak in western Canada has affected over 16 million hectares of pine forests (Kurz et al. 2008). Most of the mortality has involved the primary host, lodgepole pine (Pinus contorta Dougl. ex Loud. var. latifolia Engelm.), but most pine species are acceptable hosts to the beetle (Wood 1982). The outbreak expanded into lodgepole pine stands in northern Alberta in 2006 (Safranyik & Carroll 2006; Raffa et al. 2008), leading to fears that the epidemic could continue to spread eastwards into jack pine (Pinus banksiana Lamb.) in the boreal forest. This expansion would dramatically increase the range of MPB into eastern Canada and the central United States (Mock et al. 2007; Safranyik et al. 2010).

Fire suppression and limited harvest of lodgepole pine over the past century have led to large, contiguous areas with a high density of trees vulnerable to beetle attack (Konkin & Hopkins 2009). In combination with a northern shift in climatic suitability, this has created ideal conditions for mountain pine beetle population expansion (Safranyik & Carroll 2006; Clark et al. 2010; Cudmore et al. 2010). Although D. ponderosae and other bark beetles are native agents of forest disturbance (Brunelle et al. 2008), epidemics make large contributions to global carbon dioxide emissions (Kurz et al. 2008) and inflict severe economic damage on forest industries and forestry-dependent communities (Wagner et al. 2006). With the current predictions of climate change, it is likely that the frequency and severity of other bark beetle outbreaks, e.g., Douglas-fir beetle, D. pseudotsugae (Hopkins), and spruce beetle, D. rufipennis (Kirby), will increase in the future (Raffa et al. 2008). It is, therefore, vital to study the dynamics of the spread of the current outbreak, in order to aid in our understanding and ongoing management of the current, as well as future, bark beetle outbreaks.
Mountain pine beetle outbreaks may arise locally, from the expansion of numerous endemic-phase populations, or as a result of long distance dispersal from epicenters. Long distance bark beetle dispersal is thought to be a passive process in which emerging beetles are caught in updrafts (Chapman 1962; Furniss & Furniss 1972; Safranyik et al. 1989). This moves them above the canopy (Safranyik et al. 1992), from where they may be transported hundreds of kilometers by atmospheric winds (Jackson et al. 2008). A complex combination of the two modes is also possible (Namkoong et al. 1979). As D. ponderosae is endemic to many regions of BC (Wood & Unger 1996; Nelson et al. 2007a), it is generally assumed that isolated outbreaks in Alberta have originated via dispersal from numerous localized outbreaks in adjacent areas of BC. At the same time, there is also a widespread perception that the epidemic has spread from an epicenter in Tweedsmuir Provincial Park (located south of the Houston site; Figure 2.1) in the mid-1990s, as this was one of the first regions to erupt during the current epidemic. The relative contributions of dispersal from a single epicenter and of the coalescence of multiple local outbreaks to the overall extent of the outbreak are an area of ongoing study. Both may be important. Indeed, in a spatiotemporal analysis of the current epidemic, Aukema et al. (2006) found evidence for both a true epicenter in Tweedsmuir Provincial Park and simultaneous geographically isolated outbreaks in southern BC.

Our understanding of the development of the current epidemic is hampered by a paucity of data regarding long distance bark beetle dispersal events (Safranyik & Carroll 2006). Without this knowledge, outbreak management is largely reactive, and hence its effectiveness is limited (Robertson et al. 2007).
Mark-recapture techniques to characterize long distance dispersal events and their contribution to outbreak development are not economically feasible (e.g., Salom & McLean 1990), and prior genetic techniques (e.g., RAPD; Calpas et al. 2002) have not produced informative results. However, microsatellites provide a powerful tool for investigating population structure, colonization patterns and characterization of migration and dispersal patterns (Balloux & Lugon-Moulin 2002). Most population genetic studies of bark beetles support the idea that significant geographic barriers, such as mountain ranges (e.g., Horn et al. 2006) and large deserts (e.g., Mock et al. 2007), can prevent or limit inter-population gene flow, causing genetic differentiation among populations. Studying patterns of gene flow can provide critical information for both preventative and reactive forest management and may prove useful for predicting future climate-induced range changes.

The main aim of this study was to characterize the spatial genetic structure and dispersal patterns of D. ponderosae over the current epidemic area in western Canada using microsatellites. More specifically, our objectives were first to determine whether patterns of genetic differentiation were concordant with geographic distribution, and second to identify dispersal patterns and thereby infer the origin(s) of the current outbreak.

2.2. Materials and methods

2.2.1. Site selection and beetle collection

Beetles were collected in British Columbia and Alberta from 2005 to 2008 prior to summer dispersal in each year (Figure 2.1). Sample sites were selected based on current and past mountain pine beetle outbreak activity. In 2005 and 2006 a wide range of sites were sampled. In 2007 and 2008, sample sites were mainly in the newly infested areas at the eastern edge of the outbreak with the intention of identifying the origin of recent dispersal flights.
Trees within 80 km of log yards were not sampled in order to minimize possible bias due to anthropogenic transport of potentially infested trees. During the entire period, a total of 86 sites were sampled. Following Francois & Nicolas (2001), sample sites in close proximity, those with similar landscapes and lacking obvious geographic barriers, were initially analyzed separately. These sites were pooled into a single sample location if there were no significant spatial or temporal differences in FST and FIS. A total of 47 sample locations were identified, but to enable comparisons between all 2005/06 and 2007/08 sample locations, sites sampled in 2005/06 in Golden (GO) and Grande Prairie (GP) were not pooled with those collected in 2007/08. This resulted in 49 sample locations that were used for the analysis of population structure below (Table 2.1).

Table 2.1. Sampling locations (49), by region, for the mountain pine beetle. GPS locations, year sampled, number of beetles genotyped (N; the number of sites, in locations with more than one collection, is given in brackets), mean observed heterozygosity (Ho), mean expected heterozygosity (He), mean number of alleles (NA), allelic richness (AR) and inbreeding coefficient (FIS) are shown.

| Region | Location | Code | Latitude (N) | Longitude (W) | Year sampled | N | Ho | He | NA | AR | FIS |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Rocky Mountains | Pine Pass | PP | 55.6352 | 122.2522 | 2006 | 38 | 0.492 | 0.496 | 4.69 | 4.52 | NS |
| | Willmore Wilderness | WW | 53.3421 | 119.4744 | 2006 | 37 | 0.514 | 0.533 | 5.46 | 5.26 | NS |
| | Kakwa† | KA† | 53.8036 | 119.6004 | 2007/08 | 280 (6) | 0.522 | 0.524 | 7.92 | 5.22 | NS |
| | Mount Robson | MR | 52.8949 | 118.7348 | 2005 | 45 | 0.583 | 0.609 | 6.69 | 6.32 | NS |
| | Banff | BA | 51.1779 | 115.5588 | 2006 | 52 | 0.633 | 0.626 | 6.69 | 5.95 | NS |
| | Lake Louise | LL | 51.4172 | 116.1793 | 2006 | 42 | 0.610 | 0.629 | 6.46 | 6.12 | NS |
| | Canmore† | CA† | 50.9852 | 115.3086 | 2007/08 | 472 (7) | 0.609 | 0.627 | 11.31 | 6.55 | 0.028*** |
| | Kootenay | KO | 50.6435 | 115.9786 | 2005/06 | 44 (2) | 0.643 | 0.644 | 6.54 | 6.16 | NS |
| | Golden | GO | 51.2402 | 116.6555 | 2005 | 39 | 0.653 | 0.624 | 7.08 | 6.67 | NS |
| | Golden† | GO† | 51.3094 | 116.7834 | 2008 | 274 (3) | 0.621 | 0.625 | 10.92 | 6.60 | NS |
| | Yoho† | YO† | 51.1229 | 116.2908 | 2008 | 154 (2) | 0.645 | 0.635 | 9.31 | 6.51 | NS |
| | Crowsnest Pass† | CP† | 49.7485 | 114.5360 | 2007/08 | 99 | 0.639 | 0.637 | 8.00 | 6.14 | NS |
| | Sparwood† | SP† | 49.8046 | 114.8557 | 2008 | 217 (3) | 0.612 | 0.631 | 10.92 | 6.53 | 0.029** |
| Northeast of Rocky Mountains | Tumbler Ridge | TR | 4.93010 | 121.2959 | 2005/06 | 32 (2) | 0.519 | 0.515 | 4.92 | 4.92 | NS |
| | Tumbler Ridge† | TR† | 55.2598 | 121.4616 | 2008 | 307 (4) | 0.485 | 0.488 | 6.92 | 4.63 | NS |
| | Grande Prairie | GP | 54.7540 | 118.9333 | 2006 | 33 | 0.462 | 0.478 | 4.77 | 4.73 | NS |
| | Grande Prairie† | GP† | 54.9332 | 119.1002 | 2007/08 | 434 (6) | 0.482 | 0.489 | 7.46 | 4.64 | NS |
| | Fox Creek† | FO† | 54.6456 | 116.6522 | 2007/08 | 129 (3) | 0.500 | 0.494 | 6.23 | 4.68 | NS |
| | Fairview† | FV† | 56.4020 | 119.2572 | 2007/08 | 367 (7) | 0.492 | 0.490 | 6.62 | 4.47 | NS |
| Nechako Plateau | Fort St. James | FJ | 54.6452 | 124.4203 | 2005 | 44 | 0.467 | 0.484 | 4.46 | 4.25 | NS |
| | Francois Lake | FL | 54.0318 | 124.9387 | 2006 | 53 | 0.463 | 0.461 | 4.46 | 4.07 | NS |
| | Houston | HO | 53.9940 | 126.6527 | 2006 | 50 | 0.486 | 0.481 | 5.00 | 4.56 | NS |
| | Telkwa | TE | 54.6674 | 127.0887 | 2006 | 51 | 0.463 | 0.479 | 4.54 | 4.22 | NS |
| West of Rocky Mountains | Mackenzie | MA | 54.6963 | 122.8210 | 2005 | 50 | 0.512 | 0.503 | 5.00 | 4.66 | NS |
| | Prince George | PG | 53.9065 | 122.8077 | 2005 | 48 | 0.492 | 0.516 | 5.92 | 5.42 | NS |
| | Salmon Valley | SA | 54.2957 | 122.8949 | 2006 | 12 | 0.474 | 0.476 | 3.46 | NI | NS |
| | Norman Lake | NL | 53.7497 | 123.4426 | 2006 | 65 | 0.473 | 0.484 | 5.31 | 4.51 | NS |
| | McBride | MB | 53.3116 | 120.1266 | 2005 | 50 | 0.523 | 0.541 | 6.15 | 5.60 | NS |
| | Valemount | VM | 52.6739 | 119.0190 | 2005 | 47 | 0.604 | 0.614 | 7.00 | 6.39 | NS |
| | Valemount† | VM† | 52.8994 | 119.3538 | 2008 | 197 (3) | 0.568 | 0.585 | 9.23 | 6.11 | 0.030* |
| Cariboo-Chilcotin | Quesnel | QU | 53.0370 | 122.2741 | 2006 | 55 | 0.510 | 0.532 | 6.08 | 5.28 | NS |
| | Bowron Lake | BL | 53.2488 | 121.4172 | 2006 | 50 | 0.515 | 0.539 | 6.38 | 5.74 | NS |
| | Farwell Canyon | FC | 51.6665 | 122.9033 | 2006 | 56 | 0.501 | 0.531 | 5.77 | 5.20 | 0.056* |
| | Tatla Lake | TA | 51.9715 | 124.4130 | 2006 | 49 | 0.487 | 0.490 | 5.31 | 4.89 | NS |
| | Lac La Hache | LH | 51.7307 | 121.5984 | 2006 | 48 | 0.554 | 0.567 | 6.31 | 5.77 | NS |
| | Wells Gray | WG | 51.7411 | 120.0120 | 2006 | 50 | 0.623 | 0.620 | 7.08 | 6.43 | NS |
| Coast Mountains | Whistler | WH | 50.1678 | 122.9251 | 2006 | 43 | 0.572 | 0.621 | 5.69 | 5.36 | 0.079** |
| Cascade Mountains | Manning Park | MP | 49.2162 | 121.0697 | 2006 | 46 | 0.604 | 0.638 | 7.15 | 6.44 | 0.055* |
| Thompson-Okanagan | Lillooet | LI | 50.4566 | 121.6350 | 2006 | 48 | 0.571 | 0.585 | 6.77 | 6.13 | NS |
| | Merritt | ME | 50.0352 | 120.6562 | 2006 | 49 | 0.614 | 0.627 | 7.23 | 6.54 | NS |
| | Kamloops | KL | 50.4859 | 120.5316 | 2006 | 45 | 0.619 | 0.642 | 7.23 | 6.56 | NS |
| | Falkland | FA | 50.5200 | 119.6018 | 2006 | 52 | 0.623 | 0.619 | 7.31 | 6.51 | NS |
| | Kelowna | KE | 49.9965 | 119.6693 | 2006 | 43 | 0.601 | 0.603 | 7.08 | 6.53 | NS |
| Kootenays | Nancy Greene | NG | 49.2591 | 117.9275 | 2006 | 47 | 0.660 | 0.641 | 7.15 | 6.50 | NS |
| | Valhalla | VA | 49.7503 | 117.5181 | 2006 | 41 | 0.587 | 0.606 | 7.15 | 6.68 | NS |
| | West Arm | WA | 49.5244 | 117.2324 | 2006 | 13 | 0.621 | 0.641 | 5.00 | NI | NS |
| | Argenta | AR | 50.1578 | 116.9173 | 2006 | 48 | 0.596 | 0.624 | 7.54 | 6.76 | 0.045* |
| | Kimberley | KI | 49.5841 | 116.1417 | 2005 | 49 | 0.625 | 0.633 | 7.08 | 6.45 | NS |
| Southeastern Alberta | Cypress Hills† | CH† | 49.6130 | 110.1884 | 2008 | 13 | 0.604 | 0.638 | 4.85 | NI | NS |

†, locations with beetle, fungal and host samples. NI, not included. NS, non-significant. * P < 0.05, ** P < 0.01, *** P < 0.001 (significant after the sequential Bonferroni correction).

Figure 2.1. Map of mountain pine beetle sampling locations in BC and Alberta. Sampling locations in 2005/06 are represented by light-colored (red) circles and 2007/08 sampling locations by dark (blue) circles. One location, Cypress Hills (CH), on the Alberta-Saskatchewan border (GPS coordinates 49.6130, -110.1884; sampled in 2008) is not shown on the map. The location name, GPS location, year sampled and number of beetles genotyped (N) are given in Table 2.1.
At each site, beetles were exclusively sampled from lodgepole pine to avoid the potentially confounding influence of beetles taken from different host trees (i.e., jack pine) (Langor & Spence 1991; but see Mock et al. 2007). We sampled 13 to 20 infested trees separated by a minimum of 10 meters at each site. In most cases, beetles were collected from separate galleries from each of the four sides of the tree. For each tree, a GPS location was taken and collected beetles were stored at -80°C in 95% ethanol.

2.2.2. DNA extraction and evaluation

One beetle per gallery was randomly selected for genetic analysis to ensure each analyzed beetle had different parents. We aimed to amplify 40-60 samples per site. DNA was extracted using a standard phenol/chloroform procedure (Sambrook & Russell 2001). Following precipitation, DNA was resuspended in Tris-EDTA (pH 8.0) and concentration normalized using a NanoDrop® ND-1000 UV-Vis Spectrophotometer.

2.2.3. Microsatellite amplification

A total of 4607 beetles were genotyped at 16 beetle-specific microsatellite loci using four multiplexes following the conditions described by Davis et al. (2009). Amplified fragments were co-loaded into two injections on an AB 3730 DNA analyzer and band sizes were determined relative to GeneScan-500 LIZ (AB) and scored using GeneMapper software. One locus, MPB012, proved unreliable and one locus, Dpo486, was identified to be sex linked (Davis et al. 2009). Both were removed from further analysis.

2.2.4. Hardy-Weinberg equilibrium and linkage disequilibrium

Genotypic data from each site were checked for Hardy-Weinberg equilibrium (HWE) across loci and sites using an expansion of Fisher's exact test. To ensure that all loci are independently assorting at all sites, linkage disequilibrium (LD; Slatkin & Excoffier 1996) was assessed using a likelihood ratio test.
Statistical significance was evaluated both before and after sequential Bonferroni correction for multiple tests (Holm 1979; Rice 1989). All analyses were conducted using Arlequin 3.11 (Excoffier et al. 2005).

2.2.5. Genetic diversity

Gene diversity and allelic richness were used to describe patterns of genetic diversity across the study area. Observed and expected heterozygosity were calculated for each sample location using the MICROSATELLITE TOOLKIT (Park 2001). We regressed mean expected heterozygosity and allelic richness for each sample location on latitude. Allelic richness was corrected for variation in sample size through rarefaction (Petit et al. 1998) implemented in FSTAT 2.9.3.2 (Goudet 2002), and sampling locations with fewer than 30 beetles were excluded. Patterns of genetic diversity were studied for the entire study area as well as within the main clusters identified by the Bayesian analysis of population structure described below.

2.2.6. Population structure

Population genetic structure was examined using three Bayesian approaches. We first used STRUCTURE 2.3.1 (Pritchard et al. 2000; Falush et al. 2003), assuming an admixture model and correlated allele frequencies. Analyses were done without prior sampling information. Each run with STRUCTURE was performed with 10,000 burn-in and 10,000 MCMC steps. The number of steps was selected after several trial runs which examined variance and outcome. Default values were maintained for all other parameters. Population structure was tested at K values ranging from 1 to 49 with ten replicates, followed by 20 replicates each at K = 1-10. The best value of K was chosen using the second-order rate of change (ΔK) method suggested by Evanno et al. (2005). To correctly assess the membership proportions (q values) for clusters identified by STRUCTURE, the results of 20 replicates at the best-fit K were post-processed using CLUMPP 1.1.2 (Jakobsson & Rosenberg 2007).
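The ΔK criterion can be illustrated with a short Python sketch. It follows one common formulation of the Evanno et al. (2005) statistic: the absolute second-order difference of the mean Ln P(D) across replicate runs, divided by the standard deviation of Ln P(D) at K. The function name `evanno_delta_k` and the toy log-likelihood values are hypothetical and are not taken from this study.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp_by_k):
    """Return {K: delta K} for every K whose neighbours K-1 and K+1 were also run.

    lnp_by_k maps each K to the list of Ln P(D) values from its replicate runs.
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    delta_k = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in means or k + 1 not in means:
            continue  # the second-order difference needs both neighbours
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        spread = stdev(lnp_by_k[k])  # sample standard deviation across replicates
        delta_k[k] = second_diff / spread if spread > 0 else float("inf")
    return delta_k


# Hypothetical Ln P(D) values for illustration only (two replicates per K).
toy_runs = {
    1: [-1000.0, -1002.0],
    2: [-900.0, -902.0],
    3: [-890.0, -894.0],
}
result = evanno_delta_k(toy_runs)
best_k = max(result, key=result.get)
print(best_k, round(result[best_k], 2))
```

Because ΔK needs both neighbouring K values, it is undefined at the smallest and largest K tested, so K = 1 cannot be selected by this criterion.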
The CLUMPP-averaged q values were used to generate pie charts for each location to show the geographical pattern of the clusters. A line was drawn to visualize the possible boundary. STRUCTURE and the Evanno method capture only the uppermost level of structure when hierarchical levels of structure exist within a population (Evanno et al. 2005). Therefore, each cluster was further analyzed for nested substructure and evaluated with the Evanno method as described above.

Though STRUCTURE (Pritchard et al. 2000) is commonly used for the analysis of population structure, it may not correctly identify population structure when overall FST is small (Latch et al. 2006; Waples & Gaggiotti 2006; Chen et al. 2007). To further explore population structure, we used TESS 2.3 (Durand et al. 2009), which implements a Bayesian clustering algorithm that uses spatial information to ascertain spatial population structure and performs well with small FST values between 0.03 and 0.05 (Chen et al. 2007). With TESS, runs were done with 10,000 burn-in and 25,000 total sweeps, and default values were maintained for all other parameters. We assumed no admixture and started the analysis at K = 2; K values were increased until the estimated number of clusters stabilized, based on no further changes in the Deviance Information Criterion (DIC). Ten replicates were done for each K value. Taking the value at which DIC stabilized as the upper bound for the model with admixture, 100 replicates were done (assuming admixture) at each K from 2 to the upper bound (Fedy et al. 2008). The estimated membership probabilities of the 20 highest-likelihood runs at the best-fit K were averaged using CLUMPP (Jakobsson & Rosenberg 2007) to correct for between-run discrepancies common to cluster analyses (Chen et al. 2007; Fedy et al. 2008).

BAPS (Corander et al. 2003; Corander et al. 2005; Corander & Marttinen 2006) has also been shown to be capable of identifying population structure when FST is small (Latch et al. 2006).
BAPS determines optimal partitions for each K value and then merges the results according to the log-likelihood values to determine the best K value. Clustering analysis with BAPS 5.2 was done at the level of groups of individuals (population level), independently using two models (i.e., with and without spatial information). Each analysis was done with K values from 2 to 49 (2 to 10 consecutively, then at intervals of 5 up to 45, and finally 49). Five repetitions were done at each K value.

2.2.7. Genetic differentiation

We partitioned genetic variance among and within clusters using analysis of molecular variance (AMOVA) carried out in Arlequin 3.11 (Excoffier et al. 2005), based on pairwise FST corrected for unequal sample size using the method of Weir & Cockerham (1984). To study differentiation among clusters, two independent nested AMOVAs were carried out in which groups of locations were based on the results of the Bayesian analyses. Sample locations were grouped at K = 2 (STRUCTURE results) and K = 4 (BAPS results) independently. In each analysis, variance components were extracted for three hierarchical levels: (i) among groups (FCT), (ii) among locations within groups (FSC), and (iii) within sampling locations (FIS). Further, independent AMOVAs were done for each cluster and each subcluster to compare the level of genetic differentiation. Each AMOVA was run with 10,000 permutations at a significance level of 0.05.

To summarize the population structure and relationships among locations, a neighbor-joining tree was constructed using the program POPTREE. For the tree construction, Nei's genetic distance was used with 1000 bootstrap replicates, resampling loci, to assess node confidence.

2.2.8. Gene flow

Relationships between genetic and linear geographic distances (i.e., isolation by distance; IBD) were examined using a Mantel test (Mantel 1967).
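As a rough illustration of the idea behind a Mantel test, the sketch below correlates the off-diagonal entries of a genetic and a geographic distance matrix and builds a null distribution by jointly permuting the rows and columns of one matrix. It is a minimal Python sketch using Pearson correlation, not the GENEPOP implementation; the function names and the four-location toy matrices are hypothetical.

```python
import math
import random


def upper_triangle(matrix, order):
    """Off-diagonal entries above the diagonal, after reordering rows and columns."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel(genetic, geographic, n_perm=2000, seed=1):
    """Return (observed r, one-sided permutation P value)."""
    identity = list(range(len(genetic)))
    geo = upper_triangle(geographic, identity)
    r_obs = pearson(upper_triangle(genetic, identity), geo)
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        order = identity[:]
        rng.shuffle(order)  # relabel locations in the genetic matrix only
        if pearson(upper_triangle(genetic, order), geo) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (n_perm + 1)


# Hypothetical locations on a line; genetic distance made proportional to geography.
coords = [0.0, 1.0, 3.0, 7.0]
geographic = [[abs(a - b) for b in coords] for a in coords]
genetic = [[0.01 * d for d in row] for row in geographic]

r_obs, p_value = mantel(genetic, geographic)
print(round(r_obs, 3), p_value)
```

Permuting whole rows and columns together, rather than shuffling individual matrix entries, keeps the dependence among distances that share a location, which is the reason a Mantel test is used instead of an ordinary correlation test.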
Mantel tests implemented in GENEPOP 3.3 (Raymond & Rousset 1995) were done using the "Isolde" option with 10,000 permutations. To visualize IBD patterns, FST/(1 - FST) estimates from GENEPOP were regressed on the logarithm of geographic distance, since the locations are distributed in two dimensions (Rousset 1997). Following Garnier et al. (2004), IBD patterns were studied for the whole study area, as well as within and between the clusters and subclusters identified in the Bayesian analyses.

Gene flow among locations was assessed using pairwise FST. We considered nominally non-significant pairwise FST to indicate recent and/or historical gene flow between that pair of sample locations. We also used the program BARRIER 2.2 (Manni et al. 2004) to identify and graphically visualize barriers to gene flow. BARRIER uses Monmonier's (1973) maximum-difference algorithm to identify likely barriers to gene flow, i.e., areas where genetic differences between pairs of sampling locations are largest (Manni et al. 2004).

To trace the origin of MPB expansion into northern Alberta, beetles from locations that represent recent infestation were assigned to a 'resource dataset' (first, all data minus the assigned location; second, to explore the temporal patterns, all data minus the locations of interest) using assignment tests in GENECLASS 2 (Piry et al. 2004). Beetles from Fox Creek (FO†), Fairview (FV†), and the two Grande Prairie locations (GP and GP†) were tested in turn, assigning one sampling location at a time. Individual and population assignments were done using likelihood-based assignment methods (Paetkau et al. 1995).

2.2.9. Historical demography

Signatures of bottlenecks and/or population expansion were tested in each sample location with a minimum of 30 beetles using the program BOTTLENECK 1.2.02 (Cornuet & Luikart 1997). We considered both the stepwise mutation model (SMM) and the two-phased mutation model (TPM).
For the TPM, the variance was set at 30% and the proportion of SMM in the TPM at 70%. Wilcoxon signed-rank tests were used to determine whether deviations from mutation-drift equilibrium (MDE) were statistically significant.

2.3. Results

2.3.1. Hardy-Weinberg equilibrium and linkage disequilibrium

Averaged across all sites for each of the 14 loci, observed and expected heterozygosity ranged from 0.145 to 0.821 and from 0.215 to 0.820, respectively (Table 2.2). Deviations from HWE at 13 of the loci were not consistent across the 86 sites, i.e., no site had more than two loci out of HWE, and only 71 out of 1204 total tests (5.9%) were significant before correction for multiple tests (P < α = 0.05). Only three tests were significant after the sequential Bonferroni correction was applied for each locus across sites (i.e., at α = 0.05/86). Hence, those 13 loci were regarded as being in HWE.

One locus, MPB054, displayed a significant deviation from HWE. MPB054 was monomorphic in 14 sites, while tests for HWE showed a significant deviation in another 41 sites (P < α = 0.05), and in 19 sites after the sequential Bonferroni correction was applied across sites. As illustrated by the large and significant FIS value across all sites, these deviations were due to locus-specific heterozygote deficiencies that may suggest the presence of a null allele. Therefore, locus MPB054 was excluded from the main analysis due to possible bias.

Significant LD (P < α = 0.05) was occasionally detected between some pairs of loci in some sites, i.e., out of 7826 total comparisons (among 14 loci at each of the 86 sites), only 91 tests (1.16%) were significant before the correction for multiple tests. These were not clustered at any pair of loci.
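The test counts quoted above follow from the study design, and the sequential Bonferroni procedure (Holm 1979; Rice 1989) is simple to state in code. The following Python sketch checks the arithmetic and shows the step-down rule on made-up P values; `holm_reject` is a hypothetical helper and is not part of Arlequin.

```python
from math import comb

N_LOCI, N_SITES = 14, 86

hwe_tests = N_LOCI * N_SITES           # one HWE test per locus per site
ld_tests = comb(N_LOCI, 2) * N_SITES   # one LD test per locus pair per site
print(hwe_tests, f"{71 / hwe_tests:.1%}")
print(ld_tests, f"{91 / ld_tests:.2%}")


def holm_reject(p_values, alpha=0.05):
    """Sequential (step-down) Bonferroni: return a reject flag for each P value."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The smallest P is compared with alpha/m, the next with alpha/(m-1), ...
        if p_values[idx] > alpha / (m - rank):
            break  # once one test fails, all larger P values are retained
        reject[idx] = True
    return reject


# Made-up P values for illustration only.
flags = holm_reject([0.001, 0.04, 0.03, 0.2])
print(flags)
```

With 14 loci there are 91 locus pairs per site, so the 7826 LD comparisons correspond to 91 pairs at each of 86 sites; the 91 significant tests are a separate count that happens to share the same value.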
None of these tests were significant after the sequential Bonferroni correction for multiple tests at any level (i.e., at all loci across all sites, α = 0.05/7826 comparisons, or at all loci within a site, α = 0.05/91 comparisons), suggesting that these loci segregate independently.

2.3.2. Genetic diversity

Mean observed and expected heterozygosity among the 49 locations varied between 0.46-0.65 and 0.46-0.64, respectively (Table 2.1). Mean expected heterozygosity and allelic richness by sample location declined from south to north with latitude (Figure 2.2).

Table 2.2. Loci typed. Total number of alleles (NA), mean expected heterozygosity (He), mean observed heterozygosity (Ho), number of sites deviating from HWE before and after (in brackets) sequential Bonferroni correction, and fixation indices (FIS and FST, with significant values; *** P < 0.001) are shown.

| Locus | NA | He | Ho | HWE | FIS | FST | P |
|---|---|---|---|---|---|---|---|
| Dpo028 | 19 | 0.461 | 0.439 | 5 | 0.042 | 0.056 | *** |
| Dpo103 | 26 | 0.820 | 0.821 | 3 (1) | NS | 0.034 | *** |
| Dpo160 | 38 | 0.702 | 0.696 | 5 (1) | NS | 0.023 | *** |
| Dpo453 | 22 | 0.680 | 0.663 | 14 | 0.022 | 0.008 | *** |
| Dpo479 | 11 | 0.655 | 0.657 | 5 | NS | 0.070 | *** |
| Dpo530 | 9 | 0.660 | 0.644 | 6 (1) | 0.030 | 0.016 | *** |
| Dpo566 | 12 | 0.298 | 0.299 | 2 | NS | 0.019 | *** |
| Dpo760 | 14 | 0.594 | 0.588 | 9 | NS | 0.031 | *** |
| Dpo780 | 16 | 0.553 | 0.550 | 5 | NS | 0.026 | *** |
| Dpo793 | 14 | 0.557 | 0.552 | 3 | NS | 0.089 | *** |
| MPB011 | 10 | 0.566 | 0.537 | 4 | 0.050 | 0.014 | *** |
| MPB017 | 20 | 0.411 | 0.403 | 4 | NS | 0.024 | *** |
| MPB038 | 14 | 0.321 | 0.319 | 3 | NS | 0.076 | *** |
| MPB054 | 11 | 0.215 | 0.145 | 41 (19) | 0.349 | 0.067 | *** |

NS, non-significant.

Figure 2.2. Genetic diversity. (a) Decrease of mean expected heterozygosity (He) and allelic richness from south to north. (b) Pattern of genetic diversity within the southern cluster. (c) Pattern of genetic diversity within the northern cluster. Significance: * < 0.05, ** < 0.005, *** < 0.0005.

2.3.3. Population structure

We identified two clusters using STRUCTURE (Figure 2.3), which were supported by the ΔK criterion (Evanno et al. 2005) and were geographically distinct. In most locations, more than 80 percent of individuals had similar cluster membership. We did not identify further substructure within either cluster.

Figure 2.3. STRUCTURE analysis of MPB individuals in western Canada. (a) Mean log probability of data, LnP(D), over 10 runs for each K value as a function of K (error bars represent standard deviation). (b) Evanno's ad hoc statistic, ΔK, as a function of K (over 20 replicates). (c) North and south clusters. Pie charts are based on the proportion of membership of each predefined sampling location in each of the K = 2 clusters, with prior sampling data not used. The solid line represents the boundary, or the 0.50/0.50 membership isocline, between the two main clusters. The 80 percent membership isoclines in each cluster are represented by the dashed lines.

We identified a similar boundary between southern and northern clusters at K = 2 using the program TESS (not shown). However, the lowest DIC value before the plateau was observed at K = 3, which would yield an east-west subdivision of the southern cluster into southwest (SW) and southeast (SE) clusters as well as the northern cluster (Figure 2.4).
The location Lac La Hache (LH), close to the boundary between the northern and southern clusters, was grouped differently in the K = 2 and K = 3 outputs.

Our population-level analysis using BAPS (without spatial information) identified four clusters, while BAPS (with spatial information) identified three (Figure 2.4). These comprise the same main north and south clusters identified using STRUCTURE and TESS, as well as further subclustering of each main cluster. Similar to TESS, the south was divided into SW and SE subclusters. Two locations at the boundary between the southern subclusters were grouped differently in BAPS and TESS (Figure 2.4). In contrast to TESS, the northern cluster was split into lower (NL) and upper (NU) subclusters, but only in the BAPS (without spatial information) analysis. Hereafter we refer to K = 2 for the two main clusters identified using the program STRUCTURE and K = 4 for the four subclusters identified using the program BAPS.

Figure 2.4. Subclustering pattern of MPB in western Canada. Results of the TESS, BAPS (with and without spatial information), and BARRIER analyses were overlaid on top of STRUCTURE results. Pie charts are based on STRUCTURE results at K = 2.

A neighbor-joining tree was constructed to examine the phylogeographic patterning of the observed variation (Figure 2.5). Locations grouped according to the predicted STRUCTURE, TESS and BAPS assignments. A clear division was noted between the northern and southern clusters, with the NU and SE subclusters forming weakly supported terminal monophyletic groups.
Figure 2.5. Neighbor-joining tree of MPB sampling locations based on pairwise Nei's genetic distance. Bootstrap values > 50% are shown at the respective nodes. STRUCTURE, TESS and BAPS identified clusters were overlaid on the tree. Sampling locations with sample sizes < 15 are noted by red dots.

2.3.4. Genetic differentiation

Genetic differentiation among the 49 sample locations was low (AMOVA FST = 0.037) but highly statistically significant (P < 0.00001). We observed significant genetic differentiation between the northern and southern clusters (nested AMOVA at K = 2 STRUCTURE clusters; FCT = 0.057, P < 0.00001). Similarly, there was significant genetic differentiation between the K = 4 clusters identified using BAPS (FCT = 0.045, P < 0.00001). The level of population structure was slightly greater among locations within the southern main cluster (FST = 0.0075, P < 0.00001) than within the northern cluster (FST = 0.0048, P < 0.00001). Further, the genetic differentiation between subclusters (FCT) within the southern cluster was slightly higher (0.0080) than that in the northern cluster (0.0064) (both P < 0.00001). When each subcluster was analyzed independently, the highest among-location variation was found within the SW subcluster (FST = 0.0085, P < 0.00001) and the lowest within the NU subcluster (FST = 0.00122, P = 0.0018). Among-location variation within the SE and NL subclusters was intermediate, at FST = 0.0018 (P < 0.00001) and 0.0028 (P < 0.00001), respectively.

The gradient of declining diversity from south to north was apparent within each cluster, but more pronounced in the north (Figure 2.2).
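The latitudinal gradient can be illustrated with an ordinary least-squares fit. The Python sketch below regresses mean expected heterozygosity on latitude for the 13 Rocky Mountains locations listed in Table 2.1 only. It illustrates the approach and is not a reproduction of the regressions in Figure 2.2; the helper `ols_slope_intercept` is hypothetical.

```python
def ols_slope_intercept(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Latitude (N) and mean expected heterozygosity (He) for the Rocky Mountains
# locations in Table 2.1, in table order (Pine Pass through Sparwood).
latitude = [55.6352, 53.3421, 53.8036, 52.8949, 51.1779, 51.4172, 50.9852,
            50.6435, 51.2402, 51.3094, 51.1229, 49.7485, 49.8046]
he = [0.496, 0.533, 0.524, 0.609, 0.626, 0.629, 0.627,
      0.644, 0.624, 0.625, 0.635, 0.637, 0.631]

slope, intercept = ols_slope_intercept(latitude, he)
print(f"He = {slope:.4f} * latitude + {intercept:.3f}")
```

A negative slope in this subset is in line with the south-to-north decline in diversity described in the text, although a single region cannot stand in for the cluster-level regressions.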
The relationship was statistically significant for both heterozygosity and allelic richness within the northern cluster (Figure 2.2c), while in the southern cluster only the gradient in expected heterozygosity was significant (Figure 2.2b). When locations were pooled, more private alleles were detected in the southern cluster (mean of 6.62 alleles per locus) than in the northern cluster (0.38); note that the number of beetles in each cluster was ~2000.

2.3.5. Gene flow

There was a highly significant IBD relationship across the whole range studied (Figure 2.6a). The slope of the relationship for comparisons within the southern cluster was steeper than within the northern cluster (Figure 2.6a). Within each of the four subclusters, significant IBD patterns were detected in SE, SW and NL, but not within the NU subcluster (Figure 2.6c & 2.6d). IBD between locations in the two main clusters was highly significant (Figure 2.6b). A strong and significant IBD effect could also be observed between subclusters within the southern group, while the IBD effect between the two subclusters within the northern group was relatively weak. When the program BARRIER was used to identify likely barriers to gene flow, the first barrier identified corresponded to the boundary between the two main clusters, while the second corresponded to the boundary between the two subclusters in the northern group (Figure 2.4).

Routes of gene flow were also examined using non-significant pairwise FST values. Non-significant values may reflect recent gene flow between sampling locations. Among locations with sample sizes of at least 30, the percentage of pairwise FST values that were not significantly different from 0 was 37.7% within the southern cluster (N = 276 comparisons) and 31.2% within the northern cluster (N = 231 comparisons).
In contrast, almost all pairwise FST values between locations in the two main clusters were significantly > 0 (the exceptions were 5 location pairs close to the boundary, out of 528 total comparisons; Figure 2.7a). When comparisons within each of the four subclusters were considered, 40% (SW), 67.9% (SE), 68.8% (NL) and 71.2% (NU) of the pairwise FST comparisons were not significantly different from 0. In contrast, the percentage of non-significant comparisons between locations in SW and SE was 17.5%, and between locations in NU and NL it was 8.3%. Whistler (WH) was unique among the sampling locations in that all of its pairwise FST values were significant (at P = 0.05). The new and expanding locations in Alberta during the current epidemic, Fox Creek (FO), Fairview (FV) and Grande Prairie (GP) (Wood and Unger 1996; Safranyik and Carroll 2006; Raffa et al. 2008), were genetically differentiated from all southern cluster locations and genetically indistinct from most northern ones (Figure 2.7b). Tatla Lake and Fox Creek were not significantly differentiated despite the large geographical distance (~596 km) between the two locations.

[Figure 2.6: scatter plots, panels (a) to (d), each with fitted regression lines and R2 values. x-axis: Ln(geographical straight-line distance between locations) (km).]

Figure 2.6. Isolation-by-distance (IBD) analysis within (w/n) and between (b/w) the clusters identified. Regression of genetic differentiation [estimated by FST/(1 - FST)] against the logarithm of geographical distance (km), based on Rousset (1997). (a) IBD across the study area and within each main cluster; (b) IBD between clusters; (c) IBD within southern subclusters; (d) IBD within northern subclusters.

[Figure 2.7: maps (a) and (b) of British Columbia and Alberta. Legend: north-south boundary defined by STRUCTURE and BAPS (without spatial information); north-south boundary defined by BAPS (with spatial information) and TESS; boundary between subclusters defined by BAPS (without spatial information); barriers to gene flow defined by BARRIER; pie charts of STRUCTURE results at K=2; scale bar in metres.]

Figure 2.7. Interconnected sampling locations. Solid lines between locations represent connections (i.e., nonsignificant pairwise FST) (a) between sampling locations of the southern and northern clusters and (b) to the expansion sampling locations in Fairview (FV†), Fox Creek (FO†) and Grande Prairie (GP† and GP). Results were overlaid on the relevant STRUCTURE, BAPS, TESS and BARRIER results (Figure 2.4). The location of Tweedsmuir Provincial Park (TM) relative to the sampling locations is shown.

Assignment tests (GENECLASS) were also used to explore possible source locations for sampling locations northeast of the Rocky Mountains. Individuals from the 2007/2008 sampling locations (Tumbler Ridge, Fairview, Fox Creek and Grande Prairie) and from the 2006 Grande Prairie location were assigned to all other sampling locations. Consistent with the shallow genetic divergence noted, few exclusions were found in the probability-based analysis. Likelihood-based analyses, however, indicated that individuals were most likely of NU subcluster origin.
For 70.2% of the 1270 individuals tested, the top likelihood score was of NU origin, while for 98.2% it was of northern cluster origin. Similar percentages were observed for all five sampling locations tested.

In the sampling location level analysis, sampling locations northeast of the Rocky Mountains were assigned only to the other tested locations in the NU cluster, at scores greater than 95% (Table 2.3). When the same tests were done after removing all tested locations (Table 2.4), all of the 2007/08 locations were assigned to the 2006 Francois Lake sampling location (NU cluster, scores > 99%). In contrast, the 2006 GP sample was assigned to a 2005 Mackenzie location (NU subcluster, scores > 97%).

2.3.6. Historical demography

No signature of a recent bottleneck event was detected in any location. However, significant deviations (P < 0.05) from mutation-drift equilibrium that may suggest population expansion (i.e., expected heterozygosity less than heterozygosity at equilibrium) were found under the stepwise mutation model in all locations except Whistler. Under the two-phase model (TPM), evidence for expansion was detected in 27 locations, including most of the locations at the eastern edge of the epidemic (Table 2.1).

Table 2.3. Results of likelihood-based assignment tests for locations in northern Alberta compared to all other locations. The rank (top 5) and score (%) are estimated by GENECLASS2. The scores given are based on the frequency-based likelihood method (Paetkau et al. 1995). Location codes are found in Table 2.1.

Location Tested Source Location & Assignment Score (%) Rank 1 2 f 4 3 Assigned to TR Score 98.65 1.34 0.01 FV f Assigned to TRf 0 FV 1 0 FO f FO1^ Assigned to GP f 100 TRf 100 GP f Score 95.25 3.28 FO Score GP 1 Assigned to Score GP GP f f f 5 1 0 FO f 0 TR ! KA 0 KA f 0 PP 0 MA MA 0 FL 0 MA 0 FJ 0.99 0.25 0.21 FV

Table 2.4. Results of likelihood-based assignment tests for locations northeast of the Rocky Mountains compared to all other locations¹.
The rank (top 3) and score (%) are estimated by GENECLASS2. The scores given are based on the frequency-based likelihood method (Paetkau et al. 1995). Location codes are found in Table 2.1.

Location tested | Rank 1: assigned to (score %) | Rank 2: assigned to (score %) | Rank 3: assigned to (score %)
FO† | FL (99.72) | PP (0.28) | TE (0)
FV† | FL (99.98) | MA (0.02) | PP (0)
GP† | FL (100) | MA (0) | PP (0)
TR† | FL (100) | MA (0) | PP (0)
GP | MA (97.91) | FJ (2.09) | FL (0)

¹ The locations GP, GP†, FO†, FV† and TR† were not included in the reference dataset.

2.4. Discussion

The mountain pine beetle in western Canada exhibits significant population genetic structure. We identified a clear north-south clustering pattern using all three Bayesian approaches. Only one location at the boundary between the main northern and southern clusters, Lac La Hache (LH), was inconsistently classified. Approximately 5.7% (P < 0.00001) of the genetic variance was partitioned by AMOVA between the northern and southern clusters, and the primary barrier to gene flow delineated by the program BARRIER corresponded with the north-south cluster boundary. Furthermore, the pattern of pairwise FST > 0 and the presence of private alleles within each of the two main clusters also indicate restricted gene flow (Allendorf & Luikart 2007). Together, these results indicate that the strongest barrier to gene flow within the studied area lies between the north and south clusters.

The observed population structure may be explained by a number of non-mutually exclusive hypotheses, including:
1) The existence of physical or climatic barriers
2) Differing selective pressures between the northern and southern habitats
3) The post-glacial expansion of mountain pine beetles into the northernmost portions of their range.
Previous studies in bark beetle population genetics collectively support the role of geographic barriers, such as mountain ranges and large distances, in limiting gene flow and causing divergence among populations. These conclusions have been drawn from other population genetic studies of the mountain pine beetle (Stock and Guenther 1979; Langor and Spence 1991; Kelley et al. 2000; Mock et al. 2007) as well as from studies of other bark beetles (Coleoptera: Scolytidae), e.g., the Douglas-fir beetle, D. pseudotsugae Hopkins (Stock et al. 1979), the southern pine beetle, D. frontalis Zimmermann (Anderson et al. 1979; Namkoong et al. 1979; Roberds et al. 1987), the Jeffrey pine beetle, D. jeffreyi Hopkins (Six et al. 1999), the western pine beetle, D. brevicomis LeConte (Kelley et al. 1999), the pinyon pine beetle, Ips confusus (LeConte) (Cognato et al. 2003), and the spruce beetle, D. rufipennis (Kirby) (Maroja et al. 2007) in North America, and in Europe, I. typographus (Linnaeus) (Stauffer et al. 1999), the Tomicus piniperda (Linnaeus) species complex (Duan et al. 2004; Ritzerow et al. 2004), and T. destruens (Wollaston) (Faccoli et al. 2005; Horn et al. 2006).

In this study, however, there exists neither a large distance nor an obvious geographic barrier that separates the northern and southern clusters. Excluding the recent expansion locations northeast of the Rocky Mountains, the northern cluster beetles are generally found on the Chilcotin, Cariboo and Nechako Plateaus, in an area jointly known as the Fraser Plateau. Here, the primary host, lodgepole pine, is found in large, widely distributed forest stands (Taylor and Carroll 2004). In contrast, the beetles in the southern cluster are found in more mountainous habitats, where the suitable hosts are generally found in a more patchy distribution along the valley slopes (Ritchie 2008).
It is not clear why the transition from mountain habitat to the northern plateau would limit gene flow from the southern beetles into this region, although biological or climatic factors may be involved. Ongoing studies are looking at the possible roles of host availability, host genotype, the presence and diversity of fungal associates, and other geographic or climatic features (P. James, personal communication). The wind direction in the summer can also be a factor that determines the direction of spread of the beetles (P. Jackson, unpublished data). At the same time, methods are being developed to study local adaptation within the clusters. Sets of gene-linked markers, including EST-linked microsatellites (Chapters 3 & 4) and single nucleotide polymorphisms (F. Sperling, personal communication), will be used to perform genome scans to identify loci with a signature of selection (e.g., Neilson 2005), and to establish the degree of adaptive polymorphism associated with the clusters (i.e., population adaptive index, Bonin et al. 2007).

A hypothesis of post-Ice Age range expansion would predict that populations are oldest in the southern part of Western Canada, as these areas would have been recolonized first following glacial retreat (Abbott & Brochmann 2003; Beatty & Provan 2010). In the presence of limited gene flow, newly founded populations are expected to contain lower levels of diversity. Genetic diversity should, thus, decline from south to north. The pattern of southern richness and northern purity has been found in many taxa (Hewitt 1999; 2004), including species endemic to the Pacific northwest (e.g., Green et al. 1996), indicating that this is a common scenario in post-glacial recolonization. This predicted genetic pattern is concordant with the diversity gradient observed in this study, i.e., a reduction in heterozygosity, allelic diversity, and numbers of private alleles from south to north.
The diversity gradient, however, is more pronounced within the northern cluster than the southern cluster. Recolonization within the northern cluster seems to follow the pattern predicted by the stepping stone colonization model (Slatkin 1991). Long-term persistence of beetles in the southern cluster, and hence a likely series of complex historic events, may have disrupted this pattern within the southern cluster.

In terms of recolonization, there is mounting evidence (Kelley et al. 2000; Mock et al. 2007) that the mountain pine beetle exhibits range-wide patterns of decreasing genetic diversity from "central" populations in Idaho/Utah. Our results also suggest that D. ponderosae repopulated Western Canada from a single refugium south of the continental ice sheet. The existence of such a refugium during the last glaciation is supported by genetic (Marshall et al. 2002; Fazekas and Yeh 2006) and fossil pollen data (MacDonald and Cwynar 1985; Cwynar and MacDonald 1987) for P. contorta var. latifolia, the beetle's primary host. The mountain pine beetle may also have persisted in minor coastal refugia populated by P. contorta var. contorta, the shore pine (Heusser 1960; Peteet 1991; Fazekas and Yeh 2006). The phylogeography of mountain pine beetle populations endemic to regions with shore pine (e.g., northwest BC; Vancouver Island) should be compared with that of beetles from continental regions to determine whether different refugial populations existed. Further, recent genetic and pollen evidence supports another possible refugium of flora and fauna during the last glaciation, in the Beringia region (Brubaker et al. 2005; Anderson et al. 2006; Demboski & Cook 2009; Beatty & Provan 2010). However, our study does not support a spread of beetles from that region (i.e., there is no decline in diversity from the northwest).
Fine-scale research to isolate the descendants of separate refugia may help to further understand whether beetles in different clusters represent different refugia.

The evidence for population structure within clusters was not as strong as that found between clusters. Subclustering within the southern cluster was identified by both TESS and BAPS, while subclustering within the northern cluster was identified only by BAPS. The second likely barrier to gene flow identified by BARRIER supported the subclustering of the northern cluster defined by BAPS. The genetic differentiation between subclusters in either the northern or southern cluster (FCT = 0.0064 and 0.008, respectively) was 7-9 times lower than that found between the clusters (FCT = 0.057). In the southern cluster, this structure may simply reflect the spatial IBD trends observed in the data and not have any further biological significance. In the northern cluster, however, where IBD trends are weaker (and even lacking in the NU subcluster), we feel this structure is most likely the signature of the northeastern expansion of the beetles in the current outbreak. In this regard the NU subcluster can be viewed as an expanding group that originated in the northern cluster.

Among the four subclusters, the NU subcluster is characterized by a lack of IBD, the lowest genetic differentiation among locations (FST), and the least genetic diversity. The nonsignificant IBD pattern in the NU subcluster indicates a lack of equilibrium between genetic drift and gene flow and is consistent with the recent expansion of MPB to many of the sampling locations in the subcluster. Plots of genetic by geographic distance (Figure 2.6b) show that pairwise FST values in the NU subcluster vary within a small range. This type of scatter plot reveals a nonequilibrium situation and the dominance of gene flow over drift (Hutchison & Templeton 1999).
Hence, the lack of IBD in the NU subcluster is most likely due to both long distance dispersal events and its recent age. Compatible with this, the highest percentage of nonsignificant pairwise FST comparisons was found in this subcluster. Low levels of differentiation can be due to high gene flow among locations and/or the recent origin of beetles from one or a few common sources. As Namkoong et al. (1979) reported, a region-wide homogenization of population allele frequencies typically occurs when epidemics spread from epicenters. The comparative lack of mountain ranges, as well as the contiguous cover of susceptible hosts over large areas, in the north versus the south may have facilitated more long distance gene flow among locations in the north.

Field observations combined with the results of this study support the assumption that the mountain pine beetle outbreaks in locations in the NU cluster are mainly due to long distance dispersers originating from an epicenter. Mountain pine beetles were not reported northeast of the Rocky Mountains in northern Alberta prior to the current outbreak, so these are the best locations in which to study assumptions about dispersal. Indeed, the movement of beetles into this region was so pronounced in the summer of 2006 (corresponding to the 2007 sample) that it was described as a "rain of beetles" (S. Lindgren, BBC teleconference). Assignment tests clearly show that the likely origin of the 2007/2008 samples northeast of the Rocky Mountains was NU subcluster locations west of the Rocky Mountains. Previous studies have reported long distance dispersal of bark beetles by atmospheric winds (Furniss & Furniss 1972; Safranyik et al. 1992; Jackson et al. 2008; Westfall & Ebata 2008), and beetles have been captured moving eastward over the Rocky Mountains (Jackson et al. 2008).
Curiously, the pre-2007 Grande Prairie samples were assigned to a different location west of the Rocky Mountains, suggesting multiple waves of immigration. The results show that the beetles in northern Alberta are mainly or completely from northern BC.

Tweedsmuir Provincial Park, located in west-central BC, has been implicated as the epicenter of the current outbreak. Although the data set did not include beetles from the park itself, the Houston site just to the north of the park and Tatla Lake just south of the park may be considered surrogates for the Tweedsmuir beetles. Indeed, the Tatla Lake area is close to one of the first regions that erupted in the mid-1990s during the onset of the current epidemic. As both these locations are part of the NU subcluster, the results are consistent with the park being the epicenter of the NU cluster outbreaks.

Although a spatiotemporal analysis of the current outbreak by Aukema et al. (2006) found evidence for a northern epicenter in Tweedsmuir Provincial Park, it also indicated simultaneous, geographically isolated outbreaks in southern BC. The existence of genetic diversity gradients and substructure during the outbreak clearly supports multiple epicenters across BC. Further, the presence of IBD patterns indicates the long-term persistence of beetle populations at locations throughout most of the study area. Where IBD exists, an equilibrium has most likely been reached between gene flow and genetic drift (Slatkin 1993), a situation that may take thousands of generations to develop (Johnson et al. 2007). The epidemics in the southern cluster locations seem to be mainly due to the expansion of numerous endemic-phase native populations. The isolated nature of locations in the southern cluster is confirmed by the high among-location differentiation and the low percentage of nonsignificant pairwise FST comparisons.
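The IBD patterns referred to in this discussion come from regressing FST/(1 - FST) on the natural logarithm of geographical distance (Rousset 1997; Figure 2.6). A minimal sketch of that regression is given below, assuming ordinary least squares on a list of pairwise (FST, distance in km) values. The function name and input layout are illustrative, and the significance testing used for pairwise data (typically a permutation-based Mantel test) is not reproduced here.

```python
import math


def ibd_regression(pairs):
    """Least-squares fit of FST/(1 - FST) against ln(distance in km).

    pairs: iterable of (fst, distance_km) tuples, one per pair of locations.
    Returns (slope, intercept, r_squared).
    """
    pairs = list(pairs)
    xs = [math.log(dist) for _, dist in pairs]        # ln(geographic distance)
    ys = [fst / (1.0 - fst) for fst, _ in pairs]      # linearized FST
    n = len(pairs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 0.0
    return slope, intercept, r_squared
```

Under this formulation a flat, tightly clustered scatter (slope and R2 near zero), as described for the NU subcluster, is what a lack of IBD looks like numerically.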
The Whistler (WH) sampling location can be viewed as an extreme example of an isolated outbreak in the southern cluster. Among all locations studied, Whistler was the only one for which all pairwise FST values were significant and for which the program BOTTLENECK showed no evidence of population expansion. This corresponds with the sparse infestations recorded in recent field observations in this location (C. Boone, personal communication). The beetle activity in this location seems to be in the endemic phase. Whistler is close to the west coast of BC. The predominant west to east atmospheric wind direction during the summers would not favour passive beetle movement into this location from the more easterly locations sampled.

Expansion of the mountain pine beetle range northeastward across the Rocky Mountains and into the lodgepole and subsequent jack pine forests of the boreal forest is a major concern. Climate modeling predicts that climatic suitability for the beetles will increase in this region (Carroll et al. 2004). Mountain pine beetle outbreaks are not considered endemic to these forests, and recent successful invasions into northern Alberta have been noted (Raffa et al. 2008). This study clearly shows that the spread of beetles into northern Alberta has occurred mainly from northern BC. Either beetles have not dispersed from the southern cluster to northern Alberta, or the southern beetles have not survived in northern Alberta due to a lack of adaptations to this less climatically suitable region. Ongoing studies on the MPB, based on single nucleotide polymorphisms (SNPs), variation in gene-linked microsatellite markers, and modeling with other factors such as wind direction, will help to further understand the spread of the epidemic into new areas. Warm temperatures, drought, over-mature forests, and fire suppression are all factors suspected of having a role in fueling the current epidemic (Wulder et al. 2009).
Further fine scale studies are needed to investigate the relative roles of endemic population expansion versus short or long-distance immigration events in the genesis of various outbreaks. Studying the genomes of the close associates of the MPB, its main host, and its main fungal symbionts, will also contribute to a better understanding of both the current epidemic and the population structure of the MPB. The results of this study suggest that to prevent further dispersal of beetles into northern Alberta, the first priority should be controlling beetle populations in the northern-upper (NU) cluster locations.

Acknowledgments

Financial assistance was provided by the University of Northern BC, the Forest Investment Account of the BC Forest Science Program, Genome BC, Genome Alberta and Genome Canada (The TRIA Project - mountain pine beetle system genomics), and the USDA Forest Service Rocky Mountain Research Station. Undergraduate students E. Carlson and B. Shelest assisted with sample collection and analysis. Assistance with sample collection was also provided by M. Duthie-Holt and M. Cleaver, and numerous employees of the Canadian Forest Service, BC Ministry of Forests and Range, Alberta Sustainable Resource Development, BC Provincial Parks, and Canadian National Parks. We thank Allan Carroll, Celia Boone and Barbara Bentz for their critical review of this research. As this study was conducted under the umbrella of a larger mountain pine beetle system genomics (TRIA) project our gratitude goes out to all project members (www.thetriaproject.ca).

Chapter Three

ISOLATION AND CHARACTERIZATION OF EST-DERIVED MICROSATELLITE MARKERS FOR THE MOUNTAIN PINE BEETLE (DENDROCTONUS PONDEROSAE HOPKINS)¹

¹ These results were accepted for publication in Molecular Ecology Resources (2011), in press. Chapter 3 is a modified version of the accepted publication.

Abstract

Gene-linked microsatellite markers were developed from an EST database of Dendroctonus ponderosae.
Fifty markers out of 79 were polymorphic in a collection of eight geographically separate beetles. Forty-eight of these markers were polymorphic (2-8 alleles) in 16 beetles from a single sampling location. Polymorphic information content ranged from 0.062 to 0.785. The observed and expected heterozygosities ranged from 0.000 to 0.786 and from 0.067 to 0.839, respectively. Nine loci showed deviation from Hardy-Weinberg equilibrium after sequential Bonferroni correction. Linkage disequilibrium was not detected. Most polymorphic loci found were within the 3' untranslated region, followed by the open reading frame and the 5' untranslated region, respectively.

3.1. Introduction

Mountain pine beetle (MPB), Dendroctonus ponderosae, is the most significant pest of pine forests in Western North America (Safranyik & Carroll 2006), and with the current epidemic its historical range has been greatly expanded. To study the current outbreak and associated range expansion, 'neutral' microsatellites have been developed (Davis et al. 2009) and genomics projects (i.e., EST-transcriptome sequencing) have been initiated. Polymorphic EST-derived microsatellite markers are potentially useful sources of gene-associated polymorphism and are useful for whole-genome surveys in the fields of molecular ecology, quantitative genetics and genomics (Bouck & Vision 2007). Therefore, the main objective of this study was to develop additional microsatellite markers from an EST database of MPB.

3.2. Materials and methods

3.2.1. Screening of EST database

A total of 14,441 contigs in a mountain pine beetle EST database (C. Keeling, unpublished data) were screened with the program SSRIT (Temnykh et al. 2001) to identify microsatellite sequences linked to the coding region. To ensure a complete screening for available microsatellites and to understand the nature of the repeat sequences (combined or perfect repeats), the online programs Websat (Martins et al. 2009) and TRF 4.0 (Benson 1999) were also used.
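The kind of screening described in this section can be illustrated with a small regular-expression scan. This is a hedged sketch only: it is not SSRIT, Websat or TRF, and it assumes the detection criteria stated for this study (perfect repeats with motifs of 2-6 bp, at least four repeat units, mononucleotide runs excluded). The function name `find_ssrs` and the returned tuple layout are illustrative choices.

```python
import re

# A motif of 2-6 bases (shortest candidate tried first) followed by at
# least three further copies of itself, i.e. four or more repeat units.
SSR_PATTERN = re.compile(r"([ACGT]{2,6}?)\1{3,}")


def find_ssrs(sequence):
    """Return (start, motif, repeat_count) for perfect repeats in a sequence.

    Runs of a single base (e.g. AAAAAAAA, which would match as the
    motif 'AA') are skipped so that mononucleotide repeats are excluded.
    """
    hits = []
    for match in SSR_PATTERN.finditer(sequence.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # homopolymer run, not a di- to hexanucleotide repeat
        repeat_count = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, repeat_count))
    return hits
```

For example, `find_ssrs("GG" + "CA" * 7 + "TT")` reports a single (CA)7 repeat starting at position 2. Compound or interrupted repeats, such as several listed in Table 3.1, would need a dedicated tool.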
Mononucleotide repeats were deliberately avoided and the detection criteria were constrained to motifs of length 2-6 bp and a minimum repeat number of four. Microsatellite loci were selected for marker development if they were polymorphic within the library, the number of repeats was large, or if the loci were within a gene of interest.

3.2.2. Primer designing

Primers were developed for the selected loci (Table 3.1) using PRIMER3 (Rozen & Skaletsky 2000). The parameters for primer design were as follows: product length 150-350 bp (optimum 200 bp) (100-250 bp for locus 7119), primer size 18-25 bp (optimum 20 bp), primer melting temperature 57-63 °C (optimum 60 °C) and GC content from 40% to 70% (optimum 50%).

3.2.3. Microsatellite amplification

For genotyping, an M13-tailed primer method was used (Schuelke 2000). The total volume of each polymerase chain reaction (PCR) was 15 µl and contained 20 ng genomic DNA, 1 unit of Taq DNA polymerase (Invitrogen, Carlsbad, CA), 1× PCR buffer, 200 µM of each dNTP, 0.16 µM of forward M13-tailed primer, 0.32 µM of reverse primer, and 0.32 µM of fluorescence-labeled M13 primer. Two concentrations of MgCl2 (1.6 mM or 2.4 mM) were used depending on the locus (Table 3.1). To reduce nonspecific primer binding, the fragments were amplified using a touchdown PCR protocol (Don et al. 1991). The PCR cycling conditions were an initial denaturing cycle of 94 °C for 5 min, followed by a 16-cycle touchdown stage that consisted of 94 °C for 1 min, 1 min at an annealing temperature beginning at 53 °C (decreasing by 0.5 °C every cycle) and a 72 °C extension for 30 sec. Touchdown cycles were followed by 17 cycles that consisted of 94 °C for 1 min, 52 °C for 1 min and 72 °C for 30 sec. The final extension was at 72 °C for 11 min. For some loci the first touchdown annealing temperature was 57 °C (Table 3.1). For the initial scoring for polymorphism, the amplified loci were tested with eight MPB DNA samples.
To increase the likelihood of finding polymorphic markers, these individuals were selected from eight different regions in Western Canada (i.e., Alberta: Banff and Willmore Wilderness; BC: Houston, McBride, Kelowna, Mackenzie, Whistler and Prince George). Fragment sizes (alleles) were analyzed with a Beckman-Coulter CEQ8000 automated DNA sequencer and the allele sizes were scored manually.

3.2.4. Polymorphism within a location

To further test the level of polymorphism within a single location, each polymorphic marker identified above was genotyped with 16 individuals from one location (Quesnel, BC). The level of polymorphism (the absolute number of different alleles per microsatellite locus), polymorphism information content (PIC), and observed (HO) and expected (HE) heterozygosity of each new locus were determined with the program MICROSATELLITE TOOLKIT (Park 2001). Tests for Hardy-Weinberg equilibrium (HWE) and linkage disequilibrium (LD) were conducted with the program ARLEQUIN 3.11 (Excoffier et al. 2005).

Table 3.1. PCR primers and reaction conditions for 50 polymorphic EST-derived microsatellite markers for Dendroctonus ponderosae. Representative GenBank accession numbers, repeat motif and location in the putative mRNA (coding sequence (CDS) or untranslated region (3' or 5')) are shown. A '?' indicates the position is unknown or uncertain.
| Locus | GenBank accession¹ | Repeat motif | Location | M13-tailed primer² | Reverse primer | T³ |
|---|---|---|---|---|---|---|
| MPBC7_1771 | GT416554 | (CA)7 | 5' | AGAGTAATGCGACGGATGCT | GGGGACGTTTCTCATATGTTT | 53 |
| MPBC8_3135 | GT363660 | (AC)8 | 5' | CACGTCACTGAAACCACACC | TAGGACTGCATGCACTTTCG | 53 |
| MPBC6_893 | GT369500 | (CG)5 | 3' | ACAGCTGGAACGTCACACAC | GGTGCAGCGTTTCACTAGC | 57 |
| MPBC6_1403 | GT393905 | (TA)6 | 3' | TTGCTTAATGTGCAGCTTCG | ACACAAAAGTGCAACGACGA | 53 |
| MPBC6_1504 | GT430043 | (TA)[?] | 3' | TGTCAAACCGCATCATCAGT | TCGAAGGCTGCTAAGGAAAA | 53 |
| MPBC6_3837 | GO486077 | (TA)9 | 3' | CAAGTCGAAGAATAGAACAGTTGC | CCGAAACAAAATGCTACCAAA | 53 |
| MPBC7_1284 | GT461671 | (TA)[?] | 3' | TGCTAGGTGACTGTACAAGTTGA | CAGGCAAGATCGATCAGAAA | 53 |
| MPBC8_14256 | GT383057 | (AT)12 | 3' | CGGGGATTTAAGAAGCGAGA | GGACTGCCATTTCCATCTGT | 53 |
| MPBC7_12514 | GT345241 | (AC)9 | 3' | TCAGTTTGCTGTCGTTTTCG | GGGGTGCGATTGTTGAATA | 53 |
| MPBC8_297 | GT381367 | (TA)6 | 3' | CATGTTCCAACAACATTAGCA | GCAAGTGCAATGAAGGGAAT | 53 |
| MPBC7_11362 | GT457678 | (AT)[?] | ? | CCCTAATGGCAGCAGTTTTG | CCGAACCGTCGACTTTATGT | 53 |
| MPBC8_3807 | GT485805 | (AC)[?] | ? | CATATTCAATGGCACGACGA | CCGACATAATGCAAAAACTTAACA | 53 |
| MPBC8_6132 | GT413201 | (AT)[?] | ? | TGGTTTCGAAAGGTTGGTTC | CAGCTGCCAAATGGAACTTT | 53 |
| MPBC8_11035 | GT489170 | (AT)[?] | ? | GGATTGCGTTTTGGAGATTC | GTGGGTGCACTTGACACATC | 53 |
| MPBC7_11376 | GT458184 | (GT)[?] | ? | CGCGTCTCGCCTCTTATTAC | TTGTTGAGCGATTTTTGCAG | 57 |
| MPBC8_7725 | GT436798 | (TAA)7 | 5' | CAAGGTTTCATCTGGCCAAC | GGACAGACAGCTGTTCGTTTG | 53 |
| MPBC5_6124 | GT403944 | (AGG)6 | CDS | AGGGGAACGTAATGTGCAAG | CGCATTCTCGCTTAATAGCC | 53 |
| MPBC5_811 | | (GAG)6 | CDS | CCGAAGCTCCCACTAGACTG | ATGTCGTCAAAGTGCAACCA | 53† |
| MPBC6_675 | GT357891 GT401041 GT408450 | (AGT)9 | CDS | GTCTTCGGGCACTGAATTTG | CCCAGCTTGTCCCTTTGTAA | 53 |
| MPBC6_7245 | GT339861 | (TGC)7 | CDS | TTATGCACACAAACTGGGAAA | GGAAAAGAGCCAGCTTAGGG | 53 |
| MPBC7_548 | GT320845 | (GTG)6 | CDS | GCTTCCGATTCTGGAGTGAG | AAGAAATCAGCCCAGCAAGA | 53 |
| MPBC8_2778 | GT344705 | (CAG)7 | CDS | GTTGAAAGACAACCCGAAGG | GGTGCGCTCTCTACTGGTTT | 53 |
| MPBC8_4511 | GT415941 | (TCA)[?] | CDS | AATTGGCATTTGTCGCATTT | TGCCGTCGTTTAATTGTTCA | 53 |
| MPBC8_6649 | GT419741 | (TCA)8 | CDS | ATCATTGCCTTCTCGTTTGG | CCCAGCCTCCAATGAAGTAA | 53 |
| MPBC8_9094 | GT433817 | (CAT)6 | CDS | TCTTATACAATTTAATCATCGTTCCAA | TGGCAGATTTGCATCTGAAG | 53 |
| MPBC8_9385 | GT451465 | (CTA)10 | CDS | TTCGACATGTTCGTGTTAATTC | TCATGCAGTGTTGAAGCTGA | 53 |
| MPBC8_10137 | GT464982 | (ATT)4 | CDS | CGCATCAGGCAATAAGTTAGC | GATGCGTTTTTGGGGAATTA | 53 |
| MPBC5_4313 | GT350467 | (TCA)8 | 3' | GGATCCGAAACAGCAGAAAG | AGTCCCAACCACATCAGAGC | 53 |
| MPBC8_8574 | GT490424 | (ATG)7 | 3' | TGGGTCCAAATGGGGTATTA | CCAAACTCATCCGTCGATCT | 53 |
| MPBC5_73 | GT328703 | (TAC)6 | 3' | CGTGCGTTGGCTCATAATAA | AAGCTTTTTGTGCTGGTTTTT | 53† |
| MPBC8_884 | GT421807 | (GTA)8 | 3' | TTTTTAGCCTTGCATCAGCA | ATTTGGTTGCGCAATTTGA | 53 |
| MPBC6_655 | GT324623 | (TAC)6 | 3' | TTCCACCACATCAATGCCTA | GGCTTGTCGAAAAAGTACGG | 57 |
| MPBC8_12800 | GT490735 | (AAC)4 | 3' | CTTGAAGGAGAGCGTGAACC | AGGTGCAGTCTTGCTGTTTG | 53 |
| MPBC6_4141_1 | GT350767 | (AGT)[?] | ? | GCCACGCGTTTAATAACACA | TTGCCGATGTTGAGGATGTA | 57 |
| MPBC5_6823 | GT404280 | (AAT)5 | ? | TTTGCCGTTCAAATTGAGGT | TCCGCACAACATTATTACCG | 53 |
| MPBC5_7119 | GT490498 | (TTG)6 | ? | GATAATGCCGCTTTCACCAT | GCCATAGGAATCAACGTCAAA | 53 |
| MPBC5_7419 | GT473994 | (TAG)7 | ? | TTGGTCTGAGCTCGATTGTG | GGAAGCAACGAATCCCAATA | 53† |
| MPBC6_656 | GT325939 | (TAG)8 | ? | TCAACGGTGGTGTTCGATAA | CGCTAAAGTCGTCCTCAGGT | 53 |
| MPBC5_1480 | GT331212 | (TAAA)4 | 3' | TGAATTCTTGAAAGCATCTTTATTTC | CCCCGTAGTAACCAAAGCAA | 53 |
| MPBC6_4141_2 | GT356832 | (TGAC)4 | ? | AATTTCCGGTGTGCATGTTC | TGTTTGTAAATGGGGGAATGA | 53 |
| MPBC8_5651 | GT413070 | (TTAA)[?] | ? | AAGAACAACCGCCACAATTT | CCAACGAGGACTTTCCATGT | 53 |
| MPBC8_10169 | GT465588 | (TAAA)7 | ? | ACTGGTTTGCCAACATGTGA | CCATTGAATCGCATTGAAGA | 53 |
| MPBC8_12235 | GT491361 | (ATACGC)4 | ? | CTGCAAGATTTGCTCAAAATTA | CGGTACCACGAACACGACTA | 53 |
| MPBC5_4357 | GT429515 | (TA)2TTC(TA)7 | 3' | AGGCAGTAAGACGACGATCC | TGCAATCCCTCCTAATTTGC | 53 |
| MPBC8_368 | GT324841 | (CAGT)5T(CAGT) | 3' | CATCTCGAGGCCTTCCACTA | AGGCACGGAGTTGACATACC | 53 |
| MPBC8_10875 | GT474165 | (CAA)2G(AAT)7 | ? | TGCATTTCGAACAACCATTG | GCTGAGACGTTGGGTGTTTT | 53 |
| MPBC8_12050 | GT486724 | (CT)9CACT(CA)4 | 5' | GGGCTCTTCTTATCGCTTTT | GTAGCGCTTCTCCATCTTGG | 53 |
| MPBC7_24 | GT317345 | (TA)3TG(TA)8 | 3' | ATGCGTTCACAAAAGGGTTT | ACTTTCACGACGGCCATTAT | 53 |
| MPBC7_101 | GT373329 | (ATT)5(AGT)6 | 3' | AACCAAATGAGGAGCGGTAA | ATTTTAGGCCGCGTGTATTG | 53 |
| MPBC7_1578 | GT322895 | (CTG)2CGG(CTG)4 | 3' | GGAGCAGGAGTAGCCAGAGA | TAATGATGGAGGGCAAGACC | 53 |

¹ GenBank accession numbers are given as a representative EST for each locus.
² M13 sequence (TGTAAAACGACGGCCAGT) was added to the 5' end of M13-tailed primers.
³ MgCl2 concentration was 1.6 mM, except where indicated by † (2.4 mM).

3.2.5. Characterization of the loci

The anticipated functions of the genes containing polymorphic microsatellites were predicted by BLASTx searches against the NCBI nr database. The contig sequences were analyzed with the 'Open Reading Frame Finder' tool on the NCBI website. Both the BLASTx matches and the open reading frame (ORF) analyses were used to determine the location of the polymorphic markers within each gene.

3.3. Results

3.3.1. Screening of EST database

Of the 14,441 contigs screened, a total of 2,938 microsatellites with four or more repeats were identified. Dinucleotide repeats were the most abundant type (72%) of microsatellite sequence in the MPB transcriptome (Figure 3.1). Among them, AT/TA was the most common motif, followed by TG/CA.
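The screening criterion above (perfect repeats of four or more units, grouped by motif length) can be illustrated with a short Python sketch. This is only a hedged illustration: the software actually used to screen the EST database is not named in this section, the function names `find_ssrs` and `is_primitive` are hypothetical, and the example contig is invented.

```python
import re
from collections import Counter

def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'ATAT' or 'AA')."""
    k = len(motif)
    return not any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0)

def find_ssrs(seq, min_repeats=4, motif_sizes=(2, 3, 4, 5, 6)):
    """Return (start, motif, n_repeats) for perfect di- to hexanucleotide repeats.

    min_repeats=4 mirrors the 'four or more repeats' threshold in the text;
    everything else here is an illustrative assumption.
    """
    seq = seq.upper()
    # a k-mer followed by at least (min_repeats - 1) further copies of itself
    patterns = {k: re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
                for k in motif_sizes}
    hits = []
    i = 0
    while i < len(seq):
        for k in motif_sizes:
            m = patterns[k].match(seq, i)
            if m and is_primitive(m.group(1)):
                hits.append((i, m.group(1), len(m.group(0)) // k))
                i = m.end() - 1  # continue scanning after this repeat
                break
        i += 1
    return hits

# Hypothetical contig containing one dinucleotide and one trinucleotide repeat.
contig = "GGCATATATATATGGCTTCAGCAGCAGCAGTT"
hits = find_ssrs(contig)
by_type = Counter(len(motif) for _, motif, _ in hits)  # counts per motif length
```

Counting hits by motif length across all contigs would give a composition summary of the kind plotted in Figure 3.1. Grouping complementary motifs (for example AT with TA) is left out of the sketch.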
Among trinucleotide repeats, TAA/TTA, AAT/ATT, AAG/CTT and TGA/TCA were the most common motifs. The number of loci decreased with repeat-unit length, with very few penta- (1) and hexanucleotide (6) loci identified (Figure 3.1).

[Figure 3.1: bar chart. x-axis: type of microsatellite (di, tri, tetra, penta, hexa); y-axis ticks from 0 to 2500.]

Figure 3.1. The composition of microsatellite sequences with four or more repeats in contigs in build 8 of the MPB EST database. A total of 2,938 microsatellite loci were found in 14,441 contigs.

3.3.2. Microsatellite amplification

From the identified microsatellite loci, those showing polymorphism in the EST database, those with a large number of repeats, and those from genes of interest were chosen for further analysis. In total, 120 microsatellite loci were selected as potential markers. Of the 120 primer pairs tested, 79 (65.8%) amplified successfully with the initial two DNA samples. When those 79 loci were genotyped separately with eight different DNA samples, 50 were polymorphic, each having at least two alleles and one heterozygote. Of the 29 loci considered monomorphic, three had more than one allele but were not included in the list of polymorphic markers because no heterozygotes were observed.

3.3.3. Characterization of the loci

BLASTx matches were found for 35 of the 50 polymorphic loci (Table 4.1). The majority of the loci (20) were found within the 3'UTR, four loci were found within the 5'UTR, and 11 loci were found within the ORF (i.e., coding sequence, CDS) of a gene (Table 3.1). All the polymorphic loci found within the ORF were trinucleotide repeats. The position within the gene could not be determined with confidence for 15 loci, either because the possible ORFs were very small and/or because there was no significant hit in the BLASTx search.

3.3.4. Polymorphism within a location

Within the Quesnel dataset, 48 loci were polymorphic.
The number of alleles per locus (NA) ranged from 2 to 8 (mean = 3.9) and PIC ranged from 0.062 to 0.785 (mean = 0.480). HO and HE ranged from 0.000 to 0.786 (mean = 0.359) and from 0.067 to 0.839 (mean = 0.546), respectively (Table 3.2). No significant LD was detected between any pair of loci. Before correction for multiple tests, 24 loci showed deviations from HWE at the 0.05 significance level. Nine loci showed deviation from HWE after the sequential Bonferroni correction (Rice 1989). All 24 deviations were due to a deficiency of observed heterozygosity.

Table 3.2. Genetic diversity statistics within MPB sampled in western Canada. For each locus, the allele size range for all samples and the number of alleles (NA) found in the initial survey of eight geographically separated MPB (†) and in 16 MPB from the Quesnel location are shown. For the Quesnel sampling location, observed heterozygosity (HO), expected heterozygosity (HE), and polymorphic information content (PIC) are also shown.
| Locus | Size range (total) | NA† (total) | NA (Quesnel) | HO (Quesnel) | HE (Quesnel) | PIC (Quesnel) |
|---|---|---|---|---|---|---|
| MPBC7_1771 | 193-198 | 2 | 5 | 0.467* | 0.789 | 0.724 |
| MPBC8_3135 | 314-323 | 3 | 4 | 0.308 | 0.637 | 0.552 |
| MPBC6_893 | 161-165 | 3 | 3 | 0.071* | 0.606 | 0.498 |
| MPBC6_1403 | 196-202 | 3 | 4 | 0.583 | 0.699 | 0.614 |
| MPBC6_1504 | 214-220 | 3 | 2 | 0.308 | 0.443 | 0.335 |
| MPBC6_3837 | 169-177 | 5 | 3 | 0.313 | 0.446 | 0.378 |
| MPBC7_1284 | 233-237 | 2 | 3 | 0.286 | 0.532 | 0.450 |
| MPBC8_14256 | 243-263 | 5 | 3 | 0.214 | 0.415 | 0.359 |
| MPBC7_12514 | 232-246 | 3 | 4 | 0.333* | 0.752 | 0.676 |
| MPBC8_297 | 217-221 | 2 | 2 | 0.063 | 0.175 | 0.155 |
| MPBC7_11362 | 168-200 | 2 | 6 | 0.250* | 0.722 | 0.648 |
| MPBC8_3807 | 338-364 | 4 | 4 | 0.214* | 0.706 | 0.621 |
| MPBC8_6132 | 278-288 | 5 | 5 | 0.400 | 0.789 | 0.723 |
| MPBC8_11035 | 292-310 | 5 | 8 | 0.500 | 0.757 | 0.704 |
| MPBC7_11376 | 208-242 | 5 | 7 | 0.786* | 0.839 | 0.785 |
| MPBC8_7725 | 349-355 | 2 | 3 | 0.267 | 0.248 | 0.227 |
| MPBC5_6124 | 221-227 | 3 | 3 | 0.750 | 0.675 | 0.580 |
| MPBC5_811 | 260-289 | 3 | 5 | 0.438 | 0.774 | 0.706 |
| MPBC6_675 | 171-180 | 2 | 4 | 0.333 | 0.743 | 0.664 |
| MPBC6_7245 | 212-233 | 4 | 5 | 0.500 | 0.472 | 0.429 |
| MPBC7_548 | 280-289 | 4 | 4 | 0.571 | 0.669 | 0.577 |
| MPBC8_2778 | 170-172 | 2 | 1 | 0.000 | 0.000 | 0.000 |
| MPBC8_4511 | 385-409 | 6 | 6 | 0.417 | 0.699 | 0.628 |
| MPBC8_6649 | 315-324 | 2 | 4 | 0.417 | 0.721 | 0.635 |
| MPBC8_9094 | 229-232 | 2 | 2 | 0.214 | 0.198 | 0.173 |
| MPBC8_9385 | 275-281 | 2 | 2 | 0.385 | 0.508 | 0.369 |
| MPBC8_10137 | 364-378 | 2 | 2 | 0.067 | 0.067 | 0.062 |
| MPBC5_4313 | 261-267 | 2 | 3 | 0.143 | 0.204 | 0.186 |
| MPBC8_8574 | 228-234 | 3 | 2 | 0.286 | 0.254 | 0.215 |
| MPBC5_73 | 225-234 | 3 | 2 | 0.000 | 0.148 | 0.132 |
| MPBC8_884 | 167-176 | 3 | 2 | 0.400 | 0.331 | 0.269 |
| MPBC6_655 | 267-282 | 3 | 4 | 0.538 | 0.578 | 0.506 |
| MPBC8_12800 | 177-201 | 2 | 5 | 0.214* | 0.706 | 0.644 |
| MPBC6_4141_1 | 167-185 | 5 | 6 | 0.438 | 0.815 | 0.759 |
| MPBC5_6823 | 246-249 | 2 | 1 | 0.000 | 0.000 | 0.000 |
| MPBC5_7119 | 125-134 | 3 | 3 | 0.500 | 0.522 | 0.406 |
| MPBC5_7419 | 191-209 | 3 | 5 | 0.625 | 0.655 | 0.565 |
| MPBC6_656 | 259-280 | 2 | 7 | 0.615 | 0.652 | 0.600 |
| MPBC5_1480 | 256-272 | 2 | 3 | 0.125 | 0.123 | 0.116 |
| MPBC6_4141_2 | 239-247 | 2 | 3 | 0.250 | 0.333 | 0.299 |
| MPBC8_5651 | 194-202 | 3 | 2 | 0.429 | 0.423 | 0.325 |
| MPBC8_10169 | 241-285 | 2 | 5 | 0.125 | 0.341 | 0.317 |
| MPBC8_12235 | 333-348 | 3 | 2 | 0.214 | 0.198 | 0.173 |
| MPBC5_4357 | 235-245 | 4 | 6 | 0.385* | 0.837 | 0.775 |
| MPBC8_368 | 219-246 | 2 | 4 | 0.333 | 0.614 | 0.536 |
| MPBC8_10875 | 278-290 | 3 | 3 | 0.429 | 0.561 | 0.453 |
| MPBC8_12050 | 277-281 | 2 | 3 | 0.286 | 0.622 | 0.529 |
| MPBC7_24 | 180-186 | 3 | 4 | 0.615 | 0.742 | 0.661 |
| MPBC7_101 | 178-196 | 3 | 5 | 0.313* | 0.724 | 0.662 |
| MPBC7_1578 | 188-197 | 2 | 4 | 0.500 | 0.722 | 0.639 |

* Loci with significant deviation from HWE after sequential Bonferroni correction.

3.4. Discussion

The composition of the microsatellites in the EST database may represent the overall composition of microsatellites in the genes of the MPB genome. This composition was consistent with studies of various other organisms. The most common repeat type in the MPB EST database was dinucleotide. Dinucleotide repeats are also the most frequent type found in other arthropods (Demuth et al. 2007). CA/TG repeats form the most common microsatellites in Drosophila melanogaster (Schug et al. 1998). In humans, CA/TG repeats are also the most common, with their abundance estimated at twice that of AT/TA repeats (Beckmann & Weber 1992). In contrast, among the ESTs of MPB, AT/TA was the most common type of microsatellite sequence, although the percentage of CA/TG was close to it. As in many organisms, trinucleotide repeats were the second most common type of microsatellite in the database. However, different species have shown different mixtures of trinucleotide repeats (Smit et al. 1995; Song et al. 2002; Prasad et al. 2005). In this study five types of trinucleotide repeats were shown to be equally abundant among the ESTs of MPB. Consistent with other organisms, tetra-, penta- and hexanucleotide microsatellites were relatively rare in MPB.

Among the new polymorphic markers, all 24 deviations from HWE were due to a deficiency of observed heterozygosity. As only 2 of 13 'neutral' microsatellite markers (Davis et al. 2009) showed evidence of deviation from HWE in this sample of 16 beetles (data not shown), and none did in a larger sample of 55 beetles (Chapter 2), this may indicate the presence of null alleles.
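The per-locus statistics discussed here (HO, HE and PIC) can be sketched in a few lines of Python. This is a minimal illustration, not the software used in the thesis: it assumes Nei's unbiased estimator for HE and the usual PIC formula, and the function name `locus_stats` and the example genotypes (allele sizes 229 and 232 in 15 beetles) are hypothetical.

```python
from collections import Counter

def locus_stats(genotypes):
    """Diversity statistics for one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid individual,
    with missing data already removed (an assumption of this sketch).
    """
    n = len(genotypes)
    counts = Counter(allele for g in genotypes for allele in g)
    freqs = [c / (2 * n) for c in counts.values()]
    homozygosity = sum(p * p for p in freqs)

    # Observed heterozygosity: proportion of individuals with two different alleles.
    ho = sum(1 for a, b in genotypes if a != b) / n
    # Expected heterozygosity with the small-sample (unbiased) correction.
    he = (2 * n / (2 * n - 1)) * (1 - homozygosity)
    # Polymorphic information content.
    pic = 1 - homozygosity - sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    # Inbreeding coefficient: positive values indicate a heterozygote deficit.
    fis = 0.0 if he == 0 else 1 - ho / he
    return {"NA": len(counts), "Ho": ho, "He": he, "PIC": pic, "Fis": fis}

# Hypothetical locus: 14 homozygotes and a single heterozygote.
example = [(229, 229)] * 14 + [(229, 232)]
stats = locus_stats(example)
```

A strongly positive `Fis` at a locus, when loci assumed to be neutral in the same beetles show none, is the pattern that the text attributes to possible null alleles.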
However, sufficient variation was found to show that the markers were polymorphic and of potential use for population-level comparisons. The M13-tailed primer labeling system used here may decrease primer stability and may therefore have led to an increase in null alleles. It is recommended that primers for loci showing heterozygote deficiency be labeled directly prior to population studies.

Chapter Four

A PRELIMINARY SURVEY OF SPATIAL GENETIC VARIATION OF THE MOUNTAIN PINE BEETLE (DENDROCTONUS PONDEROSAE) OUTBREAK IN WESTERN CANADA WITH FIVE EST-DERIVED MICROSATELLITE MARKERS: DETECTING SIGNATURES OF SELECTION

Abstract

A preliminary survey was carried out with five EST-derived microsatellite markers to study levels of variation and to detect evidence of selection at gene-linked markers in the mountain pine beetle (MPB). Each marker was amplified from beetles collected from three northern and three southern locations in western Canada. Sample sizes at the six locations varied from 21 to 24 beetles. All five markers conformed to the expectation of linkage equilibrium, although some markers were out of Hardy-Weinberg equilibrium in some locations. Locus 6823, linked to the ornithine decarboxylase antizyme gene, was less polymorphic than the other four. The EST-derived markers further confirmed the north-south clustering pattern (Chapter 2). Overall genetic differentiation among sample locations was higher at the five EST-derived markers than at the 13 genomic markers (global FST values were 0.15 (P < 0.00001) and 0.064 (P < 0.00001), respectively). Two EST-derived loci showed evidence of selection. When those two were excluded from the analysis, FST and FCT were comparable at genomic and EST-derived markers. Locus 4357, found within the ring finger protein 141 gene, showed the highest polymorphism but the lowest genetic differentiation between clusters, giving evidence for balancing selection.
Locus 675, located within the coding sequence of the gene for an inhibitor of apoptosis 1 protein, showed the greatest genetic differentiation (locus-specific FST = 0.503, P < 0.00001) and showed a clear signature of directional selection in all the tests done to identify deviation from neutral expectations. This study showed that the use of EST-derived microsatellites is a promising approach for identifying signatures of selection and for studying gene-linked variation in the mountain pine beetle.

4.1. Introduction

The objective of this study was to use EST-derived microsatellite markers in a preliminary survey to identify evidence of selection and to study gene-linked variation in the mountain pine beetle (MPB). Although the current MPB epidemic can be traced to a single outbreak in the mid-1990s, evidence suggests that since then several MPB outbreaks have erupted in different locations in western Canada (Carroll et al. 2006; Burton 2010; Chapter 2) and have coalesced to form the largest insect outbreak in recorded Canadian history (Clark et al. 2007). MPB eruptions at multiple places, coupled with the invasion of new habitats (Aukema et al. 2006), have made this epidemic remarkable. The study described in Chapter 2 identified two main clusters of MPB (north and south), and the northernmost locations were identified as the sources most likely responsible for the north-eastward expansion of outbreaks. Identifying loci under selection in different localities may help to determine whether any genetic factors are involved in the spread of the current outbreaks. EST-derived microsatellite markers are very useful in this respect because they are physically linked to genes. EST-derived microsatellites have been used to find evidence of selection in a range of species (Li et al. 2002; Vigouroux et al. 2002; Luikart et al. 2003; Vasemagi et al. 2005; Yatabe et al. 2007).
The changes in the DNA sequence of a particular locus may influence the morphological and/or physiological phenotype of an organism. If the changes are heritable and affect the fitness of an organism, they provide the source of variation for evolution by natural selection (Whitehead & Crawford 2006). Selective pressures can vary among localities due to differences in environmental and biological factors and, depending on the traits and the corresponding genotypes, can conserve diversity (balancing selection) or lead to divergence in space (directional selection) (Endler 1986 in Whitehead & Crawford 2006; Schluter 2001; Langerhans et al. 2007). Spatial genetic variation of a species, however, is the result not only of natural selection but also of random neutral changes (genetic drift) that accumulate over time (Whitehead & Crawford 2006). Teasing apart the changes due to random drift from those due to directional selection is challenging. At the molecular level, however, selection can be distinguished because it is locus specific, while random changes affect the entire genome (Luikart et al. 2003).

A signature of selection can be detected by examining spatial genetic differences at many loci, including both those of possible adaptive significance and those that are unlikely to lead to fitness differences (neutral markers). Neutral markers thus represent the expectation of changes due to random drift, and deviation from these neutral expectations can be thought of as a signature of selection (Nielsen et al. 2005; Worley et al. 2006; Tian et al. 2009). The effects of selection on a specific locus, however, may also affect variation in the flanking genomic region. The indirect effects of selection on linked genetic polymorphism are known as genetic hitchhiking (Vasemagi et al. 2005). Recent and strong directional selection on a mutation in a gene can reduce or eliminate the variation of the neighboring neutral DNA, a process known as a selective sweep (Galtier et al. 2000; Wootton et al. 2002; Nielsen et al. 2005). Therefore, polymorphic neutral markers can also be used to detect a signature of selection (acting at closely linked loci) by screening for deviations from neutral expectations (Vasemagi et al. 2005). However, flanking neutral DNA markers give evidence only for recent selective events, because the strength of the effect of selection shown by a neutral genetic marker decays with time as random genetic variation accumulates and masks the effect of the selection (Vasemagi et al. 2005).

Selection acts on the functional regions (genes) of the genome. At the molecular level the effect of selection is strongest at the selected site; consequently, polymorphic loci closer to the actual site under selection will be more likely to show evidence of selection (by hitchhiking) than loci farther from the selected site (Vasemagi et al. 2005). As EST-derived microsatellites occur within genes, they have a higher power to detect selection than genomic microsatellites (Vasemagi et al. 2005; Bouck & Vision 2007). Since microsatellites have functional roles in some genes, these repetitive sequences themselves can be a source of functional variation (Kashi & King 2006).

Different statistical methods are available to identify loci that show significant deviations from neutral expectations (Vasemagi et al. 2005). If a locus that was known to be polymorphic in the past shows a greater reduction of variability in a locus-specific and space-specific manner, it is an indication of positive selection either on the particular locus or on a nearby locus, the locus itself being part of a selective sweep (Ihle et al. 2006). Hence, distinguishing selection from a selective sweep requires additional studies to explore functional differences. Sometimes the effect of a selective sweep may extend over the entire geographical range, and differentiating it from a low mutation rate of a gene is then also complicated.
The use of population-genomic studies to identify adaptive molecular variation is an increasingly common approach in the recent literature (Luikart et al. 2003; Nielsen et al. 2009; Narum et al. 2011; Sattath et al. 2011). These studies employ large sets of polymorphic markers in an attempt to survey variation across the genome. Different types of markers have been used, including AFLPs and gene-derived single nucleotide polymorphisms (SNPs) (Luikart et al. 2003; Narum et al. 2011). Genic microsatellites are another source of variation with which to conduct genome scans. Like SNPs, they have the advantage of being part of, or closely linked with, genes, while having a greater information content per locus than SNPs.

There is evidence that microsatellites can be involved in gene expression and function (Kashi & King 2006). For example, changes in the length of microsatellites in gene regulatory regions can affect the binding of transcription factors, making qualitative changes that affect the expression of certain genes (Martin et al. 2005). Changes in the length of microsatellites located in introns may also affect levels of gene expression, leading to quantitative changes (Pagani et al. 2000; Fabre et al. 2002; Li et al. 2004; Varshney et al. 2005). As microsatellites can play a role in gene expression, they can be a source of variation important for adaptive evolution (Kashi & King 2006). Further, if a particular microsatellite sequence is located within the coding sequence, allelic variants will cause changes to the peptide sequence, which may affect the overall function of the polypeptide produced. Genetic markers that are linked to such phenotypic diversity will help in understanding possible adaptive variation (Eujayl et al. 2001).

There are a number of examples in which variation at genic microsatellites has been linked to functional changes that are therefore possibly adaptive.
In rodents, variation in the expression of the V1aR gene, and thereby in social behavior, is coupled with the polymorphism of a genic microsatellite in the 5' region (Hammock & Young 2005; Young & Hammock 2007; Donaldson et al. 2008). Further, Walum et al. (2008) reported that microsatellite polymorphism in the same region was linked to differences in social behavior in humans. The number of repeats at a genic microsatellite locus in the 'clock' gene period of Drosophila varies with latitudinal temperature (Sawyer et al. 1997), and this might be linked to adaptation to different temperature zones.

Compared to genomic microsatellites, EST-derived microsatellites can be more useful for detecting variation in the expressed portion of the genome and for identifying adaptive variation, as they occur within genes. Indeed, genic microsatellites have proven to be useful for studying quantitative and qualitative variation linked to genes.

The main objective of this study was to conduct a preliminary survey of five EST-derived microsatellite markers to study variation in genes that may have adaptive significance. The spatial genetic patterns of the EST-derived markers were compared with those of the genomic microsatellite markers (Chapter 2) at locations selected to represent the two main clusters found in a larger survey of genomic microsatellites (Chapter 2). This study allows for a comparison of the ability of genic and genomic/neutral microsatellite markers to detect spatial genetic patterns. Further, it enables the detection of markers displaying signatures of selection, i.e., markers with patterns of spatial genetic diversity outside the expectations of neutrality.

4.2. Materials and methods

From the 50 EST-derived microsatellite markers developed in Chapter 3, five markers were chosen for a preliminary survey (Table 4.1). The five markers were chosen so as to represent both functional and structural genes.
The chosen markers were amplified from DNA samples of 132 beetles collected from six locations (Houston-HO, Mackenzie-MA, Grande Prairie-GP, Whistler-WH, Nancy Greene-NG and Banff-BA; Figure 4.1). The locations were selected to represent the northern (HO, MA, GP) and southern (WH, NG, BA) clusters identified by population structure analysis of genotypic data from genomic microsatellite markers (Chapter 2). The sample size per location ranged from 21 to 24. Pureplex PCR reactions were done for each of the five loci following the optimized PCR conditions described in Table 3.1 (Chapter 3). The five loci were combined and analysed for fragment size with a Beckman-Coulter CEQ8000 automated DNA sequencer. Scoring (assessment of fragment size) was done manually with the aid of the Beckman-Coulter CEQ8000 analysis software.

Table 4.1. Predicted function of EST loci based on the Basic Local Alignment Search Tool (BLASTx). The description, E values and scores are given.

| Locus | Description (BLASTx): predicted similar to | E value | Score |
|---|---|---|---|
| MPBC7_1771 | ref\|XP_002052252.1\| GJ17452 [Drosophila virilis] gb\|EDW64407.1\| | 9.00E-12 | 183 |
| MPBC8_3135 | No significant hit | | |
| MPBC6_893 | ref\|XP_966783.1\| calcium-transporting ATPase sarcoplasmic isoform 1 [Tribolium castaneum] | 7.00E-85 | 815 |
| MPBC6_1403 | ref\|XP_001814047.1\| CG1440 CG1440-PC [Tribolium castaneum] | 0 | 691 |
| MPBC6_1504 | ref\|XP_975044.2\| ribonucleic acid binding protein S1 [Tribolium castaneum] | 4.00E-30 | 344 |
| MPBC6_3837 | ref\|XP_394382.2\| Guanine nucleotide-binding protein subunit alpha homolog (Protein concertina) [Apis mellifera] | 1.00E-153 | 1405 |
| MPBC7_1284 | ref\|XP_971447.1\| set domain protein [Tribolium castaneum] | 1.00E-149 | 1372 |
| MPBC8_14256 | ref\|XP_970261.2\| DEAD box ATP-dependent RNA helicase [Tribolium castaneum] | 5.00E-25 | 300 |
| MPBC7_12514 | ref\|XP_967517.1\| GA11062-PA [Tribolium castaneum] | 8.00E-16 | 217 |
| MPBC8_297 | ref\|XP_973290.2\| bat5 hla-b-associated transcript [Tribolium castaneum] | 2.00E-67 | 664 |
| MPBC7_11362 | No significant hit | | |
| MPBC8_3807 | No significant hit | | |
| MPBC8_6132 | No significant hit | | |
| MPBC8_11035 | ref\|XP_001944000.1\| hypothetical protein [Acyrthosiphon pisum] | 9.00E-05 | 125 |
| MPBC7_11376 | No significant hit | | |
| MPBC8_7725 | ref\|XP_972981.1\| Mps one binder kinase activator-like 4 (Mob as tumor suppressor protein 4) [Tribolium castaneum] | 1.00E-112 | 1048 |
| MPBC5_6124 | ref\|XP_001810630.1\| CG14722 CG14722-PA [Tribolium castaneum] | 1.00E-154 | 1412 |
| MPBC5_811 | ref\|XP_001120969.1\| CG14972-PA [Apis mellifera] | 5.00E-89 | 853 |
| MPBC6_675† | ref\|XP_001606017.1\| inhibitor of apoptosis 1 protein [Nasonia vitripennis] | 7.00E-45 | 470 |
| MPBC6_7245 | ref\|XP_972313.1\| prefoldin subunit 5 [Tribolium castaneum] | 1.00E-53 | 543 |
| MPBC7_548 | ref\|XP_966368.2\| GA15696-PA [Tribolium castaneum] | 7.00E-67 | 659 |
| MPBC8_2778 | No significant hit | | |
| MPBC8_4511 | ref\|XP_002223455.1\| hypothetical protein BRAFLDRAFT_85737 [Branchiostoma floridae] | 5.00E-16 | 221 |
| MPBC8_6649 | No significant hit | | |
| MPBC8_9094 | ref\|XP_968173.1\| open reading frame 19 [Tribolium castaneum] | 4.00E-62 | 620 |
| MPBC8_9385 | No significant hit | | |
| MPBC8_10137 | ref\|XP_001899738.1\| glycogen synthase kinase [Brugia malayi] gb\|EDP31460.1\| | 2.00E-25 | 297 |
| MPBC5_4313 | ref\|XP_976414.2\| AGAP002756-PA [Tribolium castaneum] | 2.00E-31 | 355 |
| MPBC8_8574 | ref\|XP_001812169.1\| predicted protein [Tribolium castaneum] | 6.00E-44 | 463 |
| MPBC5_73 | ref\|XP_970661.1\| profilin [Tribolium castaneum] | 5.00E-64 | 636 |
| MPBC8_884† | ref\|XP_974179.1\| F-actin-capping protein subunit alpha [Tribolium castaneum] | 1.00E-138 | 1272 |
| MPBC6_655 | ref\|XP_970549.1\| V-1 protein, putative [Tribolium castaneum] | 1.00E-49 | 512 |
| MPBC8_12800 | ref\|XP_970633.1\| antennae-rich cytochrome P450 [Tribolium castaneum] | 1.00E-44 | 470 |
| MPBC6_4141_1 | No significant hit | | |
| MPBC5_6823† | gb\|EEC18673.1\| ornithine decarboxylase antizyme, putative [Ixodes scapularis] | 6.00E-20 | 256 |
| MPBC5_7119 | No significant hit | | |
| MPBC5_7419 | PREDICTED CG11267-PA [Apis mellifera] | 1.00E-29 | 1100 |
| MPBC6_656 | ref\|XP_970474.1\| AGAP001222-PA [Tribolium castaneum] | 2.00E-27 | 317 |
| MPBC5_1480 | No significant hit | | |
| MPBC6_4141_2 | No significant hit | | |
| MPBC8_5651 | No significant hit | | |
| MPBC8_10169 | No significant hit | | |
| MPBC8_12235 | No significant hit | | |
| MPBC5_4357† | ref\|XP_974067.1\| ring finger protein 141 [Tribolium castaneum] | 1.00E-62 | 625 |
| MPBC8_368 | ref\|XP_973939.2\| igf2 mRNA binding protein, putative [Tribolium castaneum] | 8.00E-69 | 676 |
| MPBC8_10875 | No significant hit | | |
| MPBC8_12050 | ref\|XP_969579.2\| CG10082 CG10082-PA [Tribolium castaneum] | 1.00E-107 | 1012 |
| MPBC7_24† | ref\|NP_001095946.1\| chitin deacetylase 1 [Tribolium castaneum] gb\|ABU25223.1\| | 1.00E-131 | 1214 |
| MPBC7_101 | hypothetical protein TcasGA2_TC014704 [Tribolium castaneum] | 8.00E-05 | 663 |
| MPBC7_1578 | ref\|XP_002014489.1\| GL18928 [Drosophila persimilis] gb\|EDW28485.1\| | 1.00E-162 | 1482 |

† Loci chosen for the study described in Chapter 4.

Figure 4.1. Sampling locations and number of samples at each location. The location codes are: HO (Houston), MA (Mackenzie), GP (Grande Prairie), WH (Whistler), NG (Nancy Greene), and BA (Banff).

To allow comparison between genomic markers and EST-derived markers, the neutral genomic microsatellite data (at 14 loci) for the same 132 beetles were extracted from the main data set described in Chapter 2. Parallel analyses were done for the five EST-derived markers and the 14 neutral genomic markers. Note that the genotypic data for locus MPB054 in the genomic microsatellite dataset were also evaluated (this marker was omitted from the main analysis in Chapter 2 because it was out of Hardy-Weinberg equilibrium (HWE)).

4.2.1. Analysis for selection

The genotypic data of the EST-derived markers and the genomic markers were analyzed together, since the genomic markers are presumed to give an estimate of the degree of neutral change. First, to identify outlier loci, the degree of genetic differentiation at each locus was compared in terms of locus-specific FST values. Deviations from HWE and linkage disequilibrium (LD) were also tested. Further, the beetles used in this study were grouped based on the sex-linked genomic marker (Chapter 2) and analysed for LD to detect sex-linked genic markers.
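To make the idea of a locus-specific FST concrete, the following Python sketch computes a simple heterozygosity-based estimate, FST = (HT - HS) / HT, for one locus across several sampling locations. This is an assumption-laden illustration only: the thesis obtains its locus-specific values from AMOVA in ARLEQUIN, which uses a different (variance-component) estimator, and the function names and toy allele lists below are hypothetical.

```python
from collections import Counter

def allele_freqs(alleles):
    """Allele frequencies from a flat list of allele calls (two per beetle)."""
    counts = Counter(alleles)
    total = sum(counts.values())
    return {a: c / total for a, c in counts.items()}

def fst_locus(pops):
    """Heterozygosity-based FST for one locus.

    pops: list of allele lists, one list per sampling location.
    Locations are weighted equally, which is a simplification.
    """
    freqs = [allele_freqs(p) for p in pops]
    alleles = set().union(*freqs)
    # HS: mean expected heterozygosity within locations.
    hs = sum(1 - sum(f.get(a, 0.0) ** 2 for a in alleles) for f in freqs) / len(freqs)
    # HT: expected heterozygosity from the mean allele frequencies.
    mean_p = {a: sum(f.get(a, 0.0) for f in freqs) / len(freqs) for a in alleles}
    ht = 1 - sum(p * p for p in mean_p.values())
    return 0.0 if ht == 0 else (ht - hs) / ht

# Toy data: a hypothetical northern and southern location at one locus.
north = [171, 171, 171, 174]
south = [171, 174, 174, 174]
fst_example = fst_locus([north, south])
```

Computing such a value for every locus and looking for loci far above or below the bulk of presumed-neutral loci is the outlier logic described above. Significance testing by permutation, as done in the AMOVA, is not part of this sketch.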
Since the analysis done in Chapter 2 gave clear evidence for two clusters, analyses were performed to identify any selection associated with the identified clustering pattern. For that, the locus-specific genetic differentiation between the two clusters in terms of FCT values, heterozygosity, and differences in allelic composition were compared. Finally, simulation-based neutrality tests were performed to confirm that the observed outlier markers deviate statistically from neutral expectations.

Genetic diversity within sample locations

Sample location statistics, allelic diversity, and expected and observed heterozygosity were calculated with the program MICROSATELLITE TOOLKIT (Park 2001). Deviations from Hardy-Weinberg equilibrium (HWE) and linkage disequilibrium (LD) were tested with the program ARLEQUIN 3.11 (Excoffier et al. 2005).

Locus-specific FST as evidence of selection

Analysis based on FST has been used as the first step in identifying candidate genes that might be under selection (Beaumont 2005). Hence, to identify any locus-specific effect, and thereby to identify outlier loci, locus-specific AMOVAs were done using the program ARLEQUIN 3.11 (Excoffier et al. 2005) for all 19 loci. Each AMOVA was run with 10,000 permutations, and significance was assessed at the 0.05 level.

FCT, heterozygosity and allele compositions as evidence of divergent selection

To understand the association between the genetic differentiation shown by EST-derived microsatellites and the north-south clustering pattern identified by genomic markers, analyses were done after grouping locations into the northern and southern clusters. Locus-specific FCT values between the two main clusters were studied independently. Further, examining reductions in heterozygosity is a strictly empirical approach to detecting selection (Kauer et al. 2003).
Markedly reduced heterozygosity (less variability) at a locus in one population relative to another population may indicate selection at that particular locus in the first population (Kauer et al. 2003). Hence, the expected heterozygosity values at each locus were compared between the two clusters. Afterward, the allele compositions within the northern and southern clusters were determined and compared at each EST-derived microsatellite marker to detect whether any allele or alleles appeared to be selected in one cluster relative to the other. The program ARLEQUIN was used to calculate the locus-specific FCT values.

Simulation based tests for selection

Two different distance-based tests of selection were done. Both methods were based on coalescent simulation. The Vitalis et al. (2001) approach implemented in the program DETSEL 1.0 (http://www.univ-montp2.fr/~genetix/detsel.html) assumes that a common ancestral population gives rise to two different populations and hence employs pairwise comparisons. Depending on the number of alleles at a locus, population-specific parameters (denoted by F) are calculated for each population for each locus (Vitalis et al. 2001), where F reflects the amount of locus-specific population divergence relative to the common ancestral population and can therefore be used to identify loci affected by selection (Vitalis et al. 2001). Finally, the analysis shows the expected distribution of data points for all the loci based on neutral expectations, and the loci that have undergone recent selection fall outside the defined area (Kane & Rieseberg 2007). Since this method involves pairwise comparisons, the locations in each cluster were pooled in order to compare southern and northern beetles. The parameters used in the analysis were: population size before the split N0 = 50, 500; mutation rate μ = 0.01, 0.001, 0.0001, 0.00001; ancestral population size Ne = 50, 500, 5000, 50,000; time since bottleneck T0 = 50, 500, 5000, 10,000; and time since divergence t = 50, 500. Significance was tested at both q = 95% and q = 99% (P = 1 - q).

As a second approach, the Beaumont & Nichols (1996) coalescent-based distance method (FDIST), implemented in the program LOSITAN 2 (Antao et al. 2008), was used. The Beaumont & Nichols (1996) method identifies outlier loci by determining a range of expected FST values based on the observed heterozygosity in the dataset. Analyses were done separately under two mutational models of microsatellites (IAM and SMM) with 95000 simulations. All loci were used in each simulation, and the loci outside the desired confidence intervals were detected at three levels: 0.95, 0.99 and 0.995.

4.2.2. Variation linked to the genes

Global FST values were calculated independently with EST-derived markers and genomic markers in order to compare differentiation at the two types of markers. To compare the level of genetic differentiation between clusters at EST-derived markers and genomic markers, nested AMOVAs (grouping locations into the southern and northern clusters identified in Chapter 2) were carried out independently for both types of markers. At each location, the mean observed heterozygosity and expected heterozygosity were compared.

4.3. Results

4.3.1. Evidence for selection

Among the EST-derived loci, locus 6823 was monomorphic in two locations while locus 675 was monomorphic in one location (Table 4.2). Mean observed and expected heterozygosity at the EST-derived microsatellite loci ranged from 0.226 to 0.445 and from 0.334 to 0.537, respectively. At the genomic microsatellite loci these ranges were 0.452 to 0.643 and 0.482 to 0.625 for the same 132 beetles. The mean number of alleles per location ranged from 3 to 4 at EST-derived markers and from 3.7 to 5.7 at genomic microsatellite markers (Table 4.2).

Table 4.2.
Genetic diversity of EST-derived loci and genomic loci at each sampling location. The observed (Ho) and expected (He) heterozygosities and the number of alleles (NA) at each sampling location are shown. The data of locus 054, which was out of HWE (chapter 2), were not included in the average statistics of genomic loci.

Locus     HO                    GP                    MA                    BA                    WH                    NG
          Ho     He        NA   Ho     He        NA   Ho     He        NA   Ho     He        NA   Ho     He        NA   Ho     He        NA
EST-derived
24        0.238  0.432*    4    0.348  0.422     4    0.5    0.645**   1    0.625  0.662     4    0.545  0.652     4    0.682  0.659     4
4357      0.429  0.684*    4    0.261  0.693***  5    0.6    0.683     5    0.583  0.738     5    0.682  0.72      4    0.409  0.706**   5
884       0.333  0.361     3    0.217  0.27      3    0.25   0.39      5    0.583  0.602     4    0.591  0.521     3    0.5    0.53      4
6823      0.143  0.136     2    0.043  0.043     2    0      0         1    0      0         1    0.091  0.09      3    0.045  0.13      2
675       0.095  0.093     2    0.261  0.243     4    0      0         1    0.208  0.198     4    0.318  0.702***  6    0.5    0.547     4
Average   0.248  0.341     3    0.226  0.334     3.6  0.270  0.344     3.8  0.400  0.440     3.6  0.445  0.537     4    0.427  0.514     3.8
Genomic
Dpo028    0.300  0.262     2    0.174  0.165     2    0.400  0.383     3    0.667  0.684     5    0.409  0.394     5    0.864  0.742     7
Dpo103    0.750  0.768     6    0.783  0.819     7    0.850  0.831     9    0.750  0.809     9    0.727  0.777     8    0.864  0.838     8
Dpo160    0.550  0.581     4    0.522  0.531     5    0.800  0.686*    5    0.750  0.799     8    0.909  0.841     8    0.955  0.872     12
Dpo453    0.450  0.642     4    0.652  0.727     6    0.750  0.697     5    0.625  0.598     5    0.682  0.781     5    0.591  0.721     6
Dpo479    0.800  0.691     4    0.522  0.663     5    0.700  0.641     4    0.708  0.693     4    0.591  0.758     4    0.773  0.678     4
Dpo530    0.600  0.553     5    0.478  0.679*    4    0.600  0.662     4    0.542  0.593     4    0.545  0.734     5    0.682  0.656     5
Dpo566    0.450  0.422     3    0.304  0.264     2    0.200  0.185     2    0.417  0.424     5    0.136  0.133     4    0.136  0.133     4
Dpo760    0.700  0.606     5    0.826  0.678     6    0.600  0.605     5    0.583  0.505     7    0.500  0.701*    6    0.455  0.453     5
Dpo780    0.450  0.499     3    0.435  0.619*    4    0.550  0.558     4    0.667  0.700     6    0.500  0.679     4    0.636  0.604     4
Dpo793    0.600  0.458     4    0.217  0.202     3    0.400  0.383     3    0.667  0.744     8    0.955  0.723     5    0.727  0.693     5
MPB011    0.650  0.569     3    0.522  0.486     4    0.550  0.524     3    0.542  0.528     4    0.636  0.594     4    0.500  0.511     4
MPB017    0.250  0.304     3    0.348  0.341     3    0.300  0.350     4    0.542  0.480     4    0.455  0.474     2    0.591  0.538     4
MPB038    0.050  0.050     2    0.087  0.086     3    0.050  0.050     2    0.542  0.489     5    0.455  0.538*    5    0.591  0.606     5
MPB054    0      0         1    0      0         1    0      0         1    0.333  0.384     4    0.182  0.169     2    0.318  0.322     4
Average   0.508  0.493     3.7  0.452  0.482     4.2  0.519  0.504     4.1  0.615  0.619     5.7  0.577  0.625     5.0  0.643  0.619     5.6

Deviation from HWE at different significance levels: * at 0.05, ** at 0.01 and *** at 0.005.

HWE and LD

Significant deviation (P < 0.05) from HWE was noted in six out of 54 tests. A sequential Bonferroni correction applied at the sample location level (i.e., α = 0.05; P < 0.01) leaves four HWE deviations. After correction, two sample locations showed deviations at locus 4357. Loci 24 and 675 each had one sample location with a significant deviation from HWE. No deviations from HWE were noted for locus 884 or 6823, and no sample location had more than one HWE deviation. Significant LD was detected in only two of 60 comparisons. Neither of these was significant after the sequential Bonferroni correction for multiple tests. When analysed with the sex-linked marker, locus 4357 showed significant linkage in five out of six locations (at P = 0.05). After the sequential Bonferroni correction two locations still showed strong linkage with sex. None of the other loci showed significant linkage disequilibrium with the sex marker. Among genomic markers, locus 054 was monomorphic in all three northern locations and was in HWE in all three southern locations. Among the 13 other genomic markers, only five deviations were noted (P < 0.05) across the study area and all of those were nonsignificant after the sequential Bonferroni correction.

FST as evidence of selection

The global FST over all 13 neutral loci was 0.064 (P < 0.00001). All locus-specific FST values were significantly greater than zero (P < 0.01) at both genic and genomic microsatellite markers.
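The sequential Bonferroni correction applied to the HWE and LD tests above can be sketched as follows. This is a minimal illustration, assuming the step-down form usually attributed to Holm (the text does not state which variant was run); the function name and the P values in `example_p` are hypothetical and are not taken from the data. With five EST-derived loci tested per sampling location, the first threshold is 0.05 / 5 = 0.01, which matches the P < 0.01 cut-off quoted above.

```python
def holm_rejections(p_values, alpha=0.05):
    """Return a list of booleans, True where a test remains significant
    after a step-down (Holm-type) sequential Bonferroni correction."""
    k = len(p_values)
    # Indices of the tests, ordered from smallest to largest P value.
    order = sorted(range(k), key=lambda i: p_values[i])
    significant = [False] * k
    for rank, idx in enumerate(order):
        # Thresholds relax step by step: alpha/k, alpha/(k-1), ..., alpha/1.
        threshold = alpha / (k - rank)
        if p_values[idx] <= threshold:
            significant[idx] = True
        else:
            break  # once one test fails, all larger P values fail too
    return significant


# Hypothetical P values for five loci at one sampling location.
example_p = [0.004, 0.030, 0.200, 0.650, 0.012]
result = holm_rejections(example_p)
print(result)
```

In this made-up example the two smallest P values remain significant and the procedure stops at the third, so a test with P = 0.030 that would pass an uncorrected 0.05 threshold is no longer counted as a deviation.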
The locus-specific AMOVAs revealed that the FST associated with locus 675 (found within the gene of inhibitor of apoptosis protein 1, FST = 0.503, P < 0.00001) was much higher than at all other loci (Figure 4.2). The range of locus-specific FST of neutral markers was 0.025 - 0.126. The differentiation at loci 4357 (in the gene of ring finger protein 141) and 6823 (in the gene of ornithine decarboxylase antizyme) was the lowest among all loci compared (P < 0.001 and 0.01, respectively).

Figure 4.2. Comparison of the FST values of EST-derived (darker, maroon) and genomic loci (lighter, purple), plotted by locus. The overall FST at EST loci was 0.16 and that at genomic loci was 0.064.

FCT as evidence of selection

Analysis of among-group variance (FCT) reveals similar patterns to the FST. Locus-specific FCT values revealed that the genetic differentiation between southern and northern clusters was remarkably higher at locus 675. A significant between-cluster genetic differentiation was not seen at loci 4357 and 6823 (Figure 4.3).

Figure 4.3. Comparison of FCT values, which represent genetic differentiation between northern and southern clusters, plotted by locus.

Heterozygosity as evidence of selection

Consistent with the diversity patterns identified by genomic markers (chapter 2), most of the EST-derived markers showed reduced diversity at northern locations (Table 4.2). When the genotypic data were pooled into northern and southern clusters, the difference in heterozygosity between clusters (reduction of heterozygosity) was highest at locus 675. At the EST locus 6823, heterozygosity was low in all populations.

Allele compositions as evidence of selection

Allele compositions at locus 675 were clearly different in northern and southern beetles (Figure 4.4).
The most common allele in the northern group was allele 177 and in the southern group allele 174. In the northern cluster all the beetles studied contained at least one copy of the '177' allele and 87.5% of the beetles were homozygous at this locus. In the southern cluster a different allele (allele 174) was most common: 77.9% of the beetles contained at least one copy of the '174' allele. The between-cluster differences in allelic composition at the other EST-derived loci were not as pronounced as at locus 675 (Figure 4.4b).

Figure 4.4. Allele compositions at each locus in the northern and southern clusters. (a) The north-south divergence shown at locus_675. (b) The allele compositions of the other EST-derived loci (loci 24, 4357, 884 and 6823). Different colors represent different alleles at each locus.

Simulation based analysis as evidence for selection

Outlier analyses found four markers which had spatial genetic patterns outside of the neutral expectations. Divergence simulation implemented in DETSEL 1.0 (Figure 4.5) indicated locus 675 as an outlier (q = 99%; P < 0.01). Loci 054 and 884 were also shown to be outliers at a lower confidence level (q = 95%; P < 0.05). In a separate approach, estimates of the expected range of FST values given He were used to identify outliers (Figure 4.6). Locus 675 was again confirmed as an outlier under all the parameters studied (i.e., in both IAM and SMM models and at all confidence intervals). Locus 675 had an FST greater than the expected range, indicating directional selection. In addition, locus 4357 was identified as a locus under balancing selection (characterized by high heterozygosity and low FST) in all the analyses except at the 0.995 confidence interval under SMM. All the other loci were within the neutral range.
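As a consistency check on the allele composition reported for locus 675, the frequency of allele 177 in the northern cluster can be worked out from the figures given in this chapter. The sketch below assumes diploid genotypes and that the 64 northern-cluster beetles mentioned in the Discussion are the same sample described here. Because every beetle carried at least one copy of allele 177, each homozygous beetle must be 177/177. The function and variable names are illustrative only.

```python
def allele_177_frequency(n_beetles, frac_homozygous):
    """Frequency of allele 177 when every beetle carries at least one copy.

    Homozygous beetles contribute two copies of allele 177 and the
    remaining (heterozygous) beetles contribute one copy each.
    """
    homozygous = round(n_beetles * frac_homozygous)
    heterozygous = n_beetles - homozygous
    copies_177 = 2 * homozygous + heterozygous
    total_gene_copies = 2 * n_beetles  # diploid: two gene copies per beetle
    return homozygous, heterozygous, copies_177 / total_gene_copies


# Values reported in the text: 64 northern beetles, 87.5% homozygous.
hom, het, freq_177 = allele_177_frequency(64, 0.875)
print(hom, het, freq_177)
```

Under these assumptions 56 beetles are 177/177 and 8 are heterozygous, giving a frequency of 120/128 = 0.9375 for allele 177, which fits the description of near fixation in the northern cluster.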
Figure 4.5. Analysis with the program DETSEL 1.0, showing loci 675, 884 and 054 relative to the neutral expectation. Locus_675 was an outlier at P = 0.01 (locus_054 and locus_884 were also outliers at P = 0.05).

Figure 4.6. Analysis with the program LOSITAN 2.0 under the SMM model at 0.99 confidence intervals (FST plotted against He; markers classed as candidate balancing selection, candidate neutral or candidate positive selection). Locus_675 was a candidate for positive selection and locus_4357 was a candidate for balancing selection.

4.3.2. Variation link to the genes

Global FST, which represents the overall genetic differentiation, was higher at the five EST-derived loci (FST = 0.16, P < 0.00001), being more than double the value seen at the 13 genomic loci (FST = 0.064, P < 0.00001). The greatest genetic differentiation was seen at locus 675 (locus-specific FST = 0.503, P < 0.00001). Hence, the higher genetic differentiation observed at the EST-derived markers could be heavily influenced by locus 675. The overall population differentiation shown by the three EST-derived markers that appeared to be neutral (loci 884, 24 and 6823) was similar (FST = 0.07, P < 0.00001) to the differentiation shown by the 13 genomic markers. The overall genetic differentiation between southern and northern clusters (FCT) was 0.201 and 0.078 at EST-derived loci and genomic loci, respectively (P < 0.00001 in both). When the loci that were under selection were excluded, the FST and FCT values calculated from the three remaining EST-derived markers were 0.066 and 0.101, respectively (P < 0.00001 in both), revealing that they are comparable to neutral genomic markers.

4.4. Discussion

In this study spatial genetic variation was compared at 19 loci, presumably representing both neutral genomic and EST-derived microsatellites. The presumably neutral genomic markers should provide the background level of spatial genetic variation, caused by historic demographic factors and the resulting random drift among locations, which is needed to detect signatures of selection. Genomic microsatellite loci are usually considered ideal markers for neutral evolution as they are generally selected without regard to their position in the genome. In this case the genomic microsatellites were selected from repeat-enriched libraries and chosen for use based on the presence of polymorphism (Davis et al. 2009). As the vast majority of genomic sequences in most animals, including insects, have no known function, most markers chosen from the genome without consideration of location should also have no function. The narrow range of locus-specific FST values found at these markers in both this study and chapter 2 supports their use as an indicator of the background level of spatial genetic variation.

Among the five EST-derived microsatellite markers, most show similar trends to the genomic microsatellite markers (i.e., reduction of genetic diversity in northern locations and similar levels of genetic differentiation among sampling locations). However, these comparisons also allowed the identification of outlier markers. FST and FCT were the two major genetic measures used in this study. FST values were first used to identify outlier loci under selection by Lewontin & Krakauer (1973) (cited in Worley et al. 2006). FST is a suitable measure for identifying outlier loci because if a locus is under selection and diverges differently between populations, this is reflected in the allele frequency differences at the locus (Beaumont & Balding 2004).
Allele frequency differences can also result from neutral demographic events, however, which complicates the identification of the effects of selection (Beaumont & Balding 2004). Since the effect of selection is locus specific, locus-specific analyses of FST and FCT are useful measures when data from many loci are available (Worley et al. 2006). By comparing values among loci, the effects of demographic events affecting the whole genome can be distinguished from the locus-specific effect of selection.

In this study five markers showed evidence indicative of natural selection. Among all the loci studied, locus 675 was clearly an outlier in all the analyses, giving a strong selection signature. Loci 4357, 6823, 884 and 054 each showed some indication in some of the analyses.

Since a significant north-south clustering pattern was identified by genomic markers (Chapter 2), divergence between clusters was also expected at EST-derived markers. However, if a locus has diverged due to adaptation to the local environments in the south or north, this should be reflected by higher locus-specific FST and FCT values. The high degree of divergence at locus 675 compared with other loci, confirmed by both coalescent-based simulation methods, may indicate directional selection between clusters at this locus. Allelic profiles within each cluster indicated a high frequency of allele 174 in the southern cluster and near fixation of allele 177 in the northern cluster. Among the 64 beetles in northern cluster locations, all had at least one '177' allele, indicating that this allele might confer higher fitness in northern populations. In a BLASTx search, locus 675 best matches the inhibitor of apoptosis protein 1 (IAP) of a parasitic wasp, Nasonia vitripennis (Walker) (Hymenoptera: Pteromalidae). IAP proteins contain at least one copy of a conserved baculoviral inhibitor of apoptosis domain, BIR, so named because it was first discovered in baculoviruses.
Homologous IAP genes have been identified from a range of animals including many insect species. All have one to three copies of the BIR domain (Clem & Duckett 1998; Huang et al. 2001; Vilaplana et al. 2007). The presence of a RING domain is also common to most IAPs (Huang et al. 2001; Vilaplana et al. 2007). Comparison of the EST-based consensus MPB IAP sequence with an annotated IAP gene in Bombyx mori Linnaeus (Lepidoptera: Bombycidae) (Huang et al. 2001) reveals significant similarity. The Bombyx IAP gene contains two BIR domains and a downstream RING domain. The MPB consensus sequence is incomplete, comprising a large section of the CDS and the 3' UTR. It contains a section of the initial BIR domain as well as the entire second BIR and downstream RING domains. The microsatellite locus is found between the two BIR domains. This trinucleotide (AGT) microsatellite motif codes for the amino acid serine (Ser). The corresponding region of the IAP1 gene in Nasonia vitripennis contains only two 'Ser' repeats, and no repeats were found in Bombyx mori. Since this microsatellite locus is located within the CDS of the gene, variation in the microsatellite repeat length (i.e., the number of 'Ser' residues in the polypeptide chain) may affect the function of the protein.

Biochemical analyses have revealed that the activation of caspase enzymes is a main step in apoptosis pathways (Mace et al. 2010). The IAP proteins inhibit apoptosis by binding to activated caspases or by blocking the pathways that activate caspases (Mace et al. 2010). Two types of apoptosis are found in cells: programmed cell death and stress-induced cell death (Verheij et al. 1996). Mace et al. (2010) reported that IAPs inhibit signals generated through both major pathways of apoptosis: the extrinsic (death receptor mediated) and the intrinsic (mitochondrial mediated) pathways.
It can be hypothesized that colder temperatures, or any other stress factor, in the northern environment may induce a stress response which can lead to cell death. The inhibitor of apoptosis might play a role in preventing stress-induced cell death caused by cold or any number of other stress factors. Zhang et al. (2005) showed that H2O2 treatment up-regulates IAPs and protects cells from stress-induced apoptosis in macrophages in vitro. Further, recent studies show that IAPs are involved in many cellular processes. In addition to regulating caspases and thereby apoptosis, IAP is known to modulate signaling pathways of immunity, mitosis, cell invasion, metamorphosis and development (Vilaplana et al. 2007; Gyrd-Hansen & Meier 2010). Hence, further studies are needed to understand the potential additional roles of IAPs in beetles and to explore the possible functional differences among alleles. These studies could explore differences in gene expression among alleles. The link between microsatellite length variation and gene expression level differences has been shown in many studies (Kashi & King 2006). Some examples include the expression of a vasopressin receptor gene and thereby social behavior in rodents (Donaldson et al. 2008), and prolactin expression and thereby the growth of tilapia fish (Streelman & Kocher 2002).

According to the information available in the EST database of MPB, locus 675 seems to have a broad range of expression, with transcripts being sequenced in both larval and adult libraries. In adults the IAP transcripts were found in antennae as well as midgut and fatbody-derived libraries. This expression pattern is similar to that of Bombyx, where a broad tissue and life stage expression pattern is noted (UNIGENE: Organized View of the Transcriptome, March 2011, www.ncbi.nlm.nih.gov/UniGene/). However, all the MPB collected for EST library construction were from the northern cluster.
Hence it would be important to analyze the expression of this gene in southern beetles. Further gene expression level analyses comparing the different alleles detected at this locus would also help to understand the role of variation (i.e., whether these alleles are related to different levels of protein expression, activity and/or other cellular functions).

While potential functional differences among alleles lead to many interesting hypotheses, selection at a closely linked gene may also have led to the results observed in this study. More studies should be carried out on locus 675 to distinguish selection acting on the locus itself from the effect of selection at a nearby region (part of a selective sweep). Development of a linkage map for MPB is an ongoing process under the TRIA project. Determining the linkage relationship of the IAP gene (locus 675) would help identify other candidate loci for investigation. This could lead to a genomic investigation of the region around the IAP gene. Genomic sequencing of the region of interest would enable the development of additional polymorphic markers. The markers could be analyzed for effects of genetic hitchhiking. Since the effect of selection on polymorphic markers is strongest close to the actual site under selection and gets weaker as the distance from that site increases (Vasemagi et al. 2005), linkage disequilibrium, levels of heterozygosity reduction and locus-specific FST-like parameters can be mapped along the chromosome to identify the exact site under selection and determine how large a region a selective sweep has affected. This approach has been used in other species (Wiehe et al. 2007).

Two loci, 4357 and 6823, showed low differentiation compared to all other loci, indicating possible locus-specific effects. The low genetic differentiation detected at locus 6823 can be attributed to the low level of polymorphism at the locus.
In contrast, locus 4357 showed a low level of genetic differentiation and a high level of genetic variation. Curiously, the highest number of alleles was detected at this locus. Maintenance of different alleles can result from balancing selection, either through heterozygote advantage or frequency-dependent selection (Borghan et al. 2005; Charlesworth 2006), an interpretation supported by the simulation studies. Analysis of the MPB EST contig containing locus 4357 with ORF and BLASTx searches showed that this dinucleotide repeat sequence occurs at the 3' end of a gene similar to ring finger protein 141 (RNF141) in Tribolium castaneum (Herbst) (Coleoptera: Tenebrionidae). RNF141 is a member of a large zinc finger protein family that consists of structurally and functionally diverse members. Some of the many known functions of zinc finger proteins are DNA recognition, transcriptional activation, RNA packaging, ubiquitination and, interestingly, regulation of apoptosis (Laity et al. 2001; Vazquez et al. 2007; Duan et al. 1999; Deng et al. 2009). Deng et al. (2009) reported that RNF141 possibly has a broad function during early development of vertebrates. More studies should be carried out on this gene to understand the function of RNF141 in MPB and to detect whether the dinucleotide microsatellite locus found within the 3' UTR region has a functional significance, i.e., whether it regulates the expression of the RNF gene, affects the structure of the gene product, or is linked to a site under balancing selection in the CDS.

The evidence for selection was weak at loci 6823, 884 and 054. As noted above, locus 6823 was characterized by a low level of polymorphism across the entire range. Low diversity across the entire range can be due to a number of factors, including a low mutation rate at this locus, purifying selection constraining repeat size, the recent emergence of the repeat, or a shared selective sweep (Kane & Rieseberg 2007) across the entire range.
Loci 884 and 054 both had a modest level of diversity and FST values within the neutral range. For loci with weak evidence of selection, expanded surveys should be conducted to confirm the results. Locus 054 was the only genomic repeat identified with a selective signature. This locus was found to be out of HWE in a large number of sample locations in the previous large-scale survey of variation (chapter 2) and the presence of a null allele was suspected. In that larger survey this locus was polymorphic, having 11 different alleles across the study area. Curiously, this locus was monomorphic or out of HWE only in some of the northern upper (NU) subcluster locations. The spatial pattern of diversity is intriguing, but it may also reflect the presence of a common null allele in the north. It is recommended that additional primers be designed for locus 054 to determine whether the variation has been adequately sampled; if new alleles are detected, the analysis for signatures of selection should be repeated.

The FST, FCT, heterozygosity and coalescent-based methods have been used to identify outlier loci under the influence of selection (Payseur et al. 2002; Storz 2005; Stajich & Hahn 2005). Although different approaches are available, all of these can detect only strong signatures of selection (Ford 2002). Hence, loci (such as locus 675) that are repeatedly encountered as outliers in several analyses are particularly good examples of genes or genomic regions under selection. However, finding signatures of selection is only the first step in studying adaptive variation in MPB. A more detailed analysis of variation coupled with expanded surveys should be conducted to confirm and explore the patterns of spatial genetic variation found in this survey prior to the initiation of complex functional studies.
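The logic of using FST to flag outlier loci, as discussed in this chapter, can be illustrated with a short sketch. It assumes Nei's heterozygosity-based definition, FST = (HT - HS) / HT, for two equally weighted populations. This is not the AMOVA-based estimator from ARLEQUIN used in this chapter, and the allele frequencies and function names below are hypothetical. The point is only that a locus whose allele frequencies have diverged strongly between clusters yields a much higher value than a locus with similar frequencies in both.

```python
def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2) for one set of allele frequencies."""
    return 1.0 - sum(p * p for p in freqs.values())


def nei_fst(pop_a, pop_b):
    """FST = (HT - HS) / HT for two equally weighted populations."""
    alleles = set(pop_a) | set(pop_b)
    # HS: mean within-population expected heterozygosity.
    hs = (expected_heterozygosity(pop_a) + expected_heterozygosity(pop_b)) / 2
    # HT: expected heterozygosity of the averaged allele frequencies.
    pooled = {a: (pop_a.get(a, 0.0) + pop_b.get(a, 0.0)) / 2 for a in alleles}
    ht = expected_heterozygosity(pooled)
    return (ht - hs) / ht if ht > 0 else 0.0


# Hypothetical allele frequencies (not taken from the thesis data).
divergent_locus = nei_fst({"A": 0.9, "B": 0.1}, {"A": 0.1, "B": 0.9})
similar_locus = nei_fst({"A": 0.6, "B": 0.4}, {"A": 0.5, "B": 0.5})
print(round(divergent_locus, 3), round(similar_locus, 3))
```

The divergent locus gives an FST of about 0.64 while the similar locus gives about 0.01. Comparing such values across many loci is what allows a locus-specific effect to be separated from the genome-wide background produced by demography.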
Chapter Five

GENERAL DISCUSSION

The overall objective of this research was to study the spatial genetic structure of the current mountain pine beetle, Dendroctonus ponderosae (MPB), outbreak in western Canada in order to gain insights into the biology of the beetle and the progression of the outbreak. This information also provides key knowledge for a larger MPB systems genomics project (the TRIA Project). The knowledge gained will be used to achieve the long-term goals of the TRIA project (i.e., to generate biological information that can be integrated into accurate risk and economic models that are critical in the management of current and future outbreaks). To achieve this objective, both neutral and gene-linked microsatellite variation of MPB in western Canada were investigated. The neutral microsatellite analyses described in Chapter 2 provide critical information on spatial genetic variation that has arisen due to recent and historic demographic processes. In contrast, the gene-linked markers developed in Chapter 3 are ideally suited for the detection of local adaptation. The preliminary study in Chapter 4 illustrated the promise of this approach. Ultimately this spatial genetic information can be combined with information on neutral and adaptive genetic markers in the other two major biological components of the outbreak, the host pine and associated fungi, to study the interaction among the genomes. This analysis, termed an integrated genomics map, will also be used to study the interaction of physical and environmental variables (P. James, personal communication). By combining the spatial genomic information of all three genomes, a synergy may be obtained leading to new insights into the progression of bark beetle outbreaks.

One of the main findings of this research was the north/south population structure of MPB in western Canada.
Various hypotheses may be put forth to explain this structure (Chapter 2) and it seems likely that a combination of ongoing postglacial expansion and local adaptation may be key drivers in maintaining the structure. Of the environmental factors within the MPB range studied, temperature is the one that varies from south to north (Carroll et al. 2004). Indeed, the warmer climate relative to the past has been identified as a main factor responsible for the continued outbreaks and spread of MPB (Carroll et al. 2004). It is not known whether the MPB has evolved strategies (either structural, functional or both) in response to the harsh environmental factors that it faces. Investigating adaptation to cold is currently a focus of ongoing work (D. Huber, unpublished data). It can be expected, however, that local adaptations to environmental and geographical differences will also exist. Morphological, behavioral and physiological differences have been observed among MPB collected from different locations within the species range (Stock et al. 1984; Bentz et al. 2001). Stock et al. (1984) reported that host selection also differed for MPB in different locations. These differences may be linked with, and be reflected by, adaptive divergence in the MPB genome. Supporting this, the analysis done in Chapter 4 gave signs of adaptive differences between the two main clusters.

The use of molecular markers from different genomic regions will help to uncover the extent of local adaptive differences between the north and south MPB populations (Eujayl et al. 2004). Many metabolic genes in other organisms have shown patterns of variation incompatible with neutral evolution but consistent with temperature variations within the species range (Whitehead & Crawford, 2006).
Hence the continued study of molecular markers linked to functional genes, especially metabolic genes, will help to understand the extent to which the clustering pattern is governed by any climate-derived adaptation (Whitehead & Crawford, 2006). The EST-derived microsatellite markers will facilitate this screening as they are found within, or very closely linked to, the expressed portion of the MPB genome. Therefore, the study done in Chapter 4 can easily be expanded to further explore the degree of adaptive variation found among clusters. The development of single nucleotide polymorphism (SNP) markers (F. Sperling, unpublished data) will also provide a valuable source of genetic variation with which to explore local adaptation.

When genotypic data from many genetic markers are available for a particular species, they facilitate genome-wide scans to detect signatures of selection and genes of adaptive significance (Schlotterer 2003; Storz 2005; Bonin et al. 2007). This allows assessment of the adaptive value of a species in a particular environment even though nothing is known about the traits or genes involved in the adaptation process of the target species (Schlotterer 2003; Storz 2005). From analysis of genome scans, a new diversity index called the 'population adaptive index' was defined (Bonin et al. 2007). This represents the percentage of loci (presumably adaptive) with allelic frequencies significantly different from neutral expectations in one population compared to other populations. Both an expanded set of EST-linked markers (Chapter 3) and newly developed SNP markers should be used to explore the population adaptive index of the clusters and subclusters identified (Chapter 2).

Of five markers screened in the preliminary study, the inhibitor of apoptosis protein (IAP) gave a clear signature of divergent selection between the north and south clusters identified by neutral microsatellite markers.
While further confirming the structuring pattern, this locus gave a genetic indication of possible biological differences between the north and south clusters. Therefore, detailed studies investigating the functional differences caused by microsatellite sequence repeats and, thereby, exploring the mechanism underlying any adaptations to local environments will be important. Complete gene annotation of the entire coding sequence as well as sequence analysis of the major alleles (i.e., 177 versus 174, etc.) are the immediate first steps. Analysis of the sequence differences among alleles found in northern and southern beetles will detect whether there is any variation in the functional domains of this gene in addition to the observed microsatellite variation. Further, gene expression level studies using beetles of known genotypes and/or transformation of insect cell lines will help to detect whether protein level or activity changes are associated with allelic variation.

Loci with signatures of directional selection may also provide useful information for tracing the origins and the potential for spread of new outbreaks. For example, allelic composition analysis revealed that all the beetles studied in the northern cluster contained at least one 177 allele at the IAP locus. Hence, this marker is very informative in terms of the spread of the outbreak. By screening beetles from the most recent outbreaks in northern Alberta, the relative frequency of this allele can be determined. This will allow a refinement of the assignment tests performed in Chapter 2, most likely increasing the likelihood of correct assignments. Further, alleles under differential selection, like 177, might be used as molecular tags to rapidly detect populations of interest, i.e., those with higher survival and expansion potential.
Although the biological meaning of the differences between the MPB clusters still has to be investigated further, based on the available information it can be speculated that beetles in the north and south would respond differently to new environments. The different local adaptations may affect the nature of local expansions and thereby have management implications. Studies done on various strains of the same insect species have shown that control strategies should be varied depending on location. For example, Anopheles culicifacies (Giles) (Diptera: Culicidae), the primary vector of malaria in some regions, shows different levels of insecticide resistance among strains (Curtis et al. 1978). Although the overall neutral genetic differentiation between the MPB clusters (FCT) was not very high (Chapter 2), the evidence of divergent selection may indicate a need for different control strategies between outbreaks in the clusters identified.

In addition to the use of EST-derived markers to study gene-linked variation and to detect selection signatures, these markers will have many other applications in MPB research. Among these applications, the use of these markers in developing linkage maps of MPB will be an important step. Indeed, the development of linkage maps for MPB was a recent recommendation of the TRIA scientific advisory board (meeting held in January, 2011). Microsatellites, especially EST-derived microsatellites due to the efficiency of marker development and high information content, are widely used in constructing linkage maps (Prasad et al. 2005; Marcel et al. 2007; Slate et al. 2007). Therefore, the 50 newly developed polymorphic EST-derived markers (Chapter 3) and 16 genomic markers (Davis et al. 2009) of MPB will provide a good platform for linkage map construction in MPB. The genome size of MPB was estimated at approximately 200-300 MB (T. Clarke & D.
Huber, unpublished data), and the genome consists of 12 chromosome pairs including the sex chromosomes (Zuniga et al. 2002). Linkage groups should be detectable, as one of the 66 polymorphic markers can be expected approximately every 3-4.5 Mb. However, the degree of linkage among the markers is unknown; to date, no published research is available on the recombination rate of the MPB genome. Another technical challenge will be the development of family groups with known paternity. Multiple paternity has been found in the naturally occurring galleries tested (F. Sperling, unpublished data). Therefore, controlled breeding of virgin beetles would be needed to develop the large true-breeding family groups that are ideal for the successful construction of linkage maps.
Expanding the scale of population genomic studies across the entire range of the MPB will also be an important area for future studies. During the last glaciation, MPB may have survived in several different refugia, since there is evidence for multiple refugia of flora and fauna (Beatty & Provan 2010). So far, no evidence has been found to indicate whether the MPB survived in one or in multiple refugia. Expanding the geographic scale will allow phylogeographic hypotheses of post-glacial movement out of likely refugia to be tested. This type of study can also be used to explore the genetic basis for previous classifications of MPB populations. Historically, MPB were split into two species, D. ponderosae (the pine beetle found in Pinus ponderosa) and D. monticolae (the mountain pine beetle, found in several other pine species including whitebark pine) (Stock & Amman 1980; Stock et al. 1984). Later, Wood (1963) synonymized the two species into the currently recognized species D. ponderosae (Stock et al. 1984).
Range-wide studies of spatial genetic variation with EST-derived markers will help to further examine the biological validity of the old split between the MPB and the ponderosa pine beetle. Viewing the spatial genetic variation on a larger scale should also help contextualize the variation observed in western Canada. That is, is the observed northward reduction in variation localized in western Canada, or is it part of a range-wide trend, as suggested by a previous study conducted at a coarser resolution (Mock et al. 2007)? Expanding the scale of the study will also allow a better assessment of the degree of local adaptation.
The markers developed in this study are useful not only for investigating the MPB genome but may also be of use in studies of other members of the genus Dendroctonus and of other closely related species. As species diverge genetically over time, sequence differences accumulate, predominantly in the noncoding DNA regions first, making it difficult to develop markers that are useful across species (Wordley et al. 2011). However, compared to neutral microsatellite markers, EST-linked markers should have greater success in cross-species amplification because of the relatively conserved nature of coding regions (Wordley et al. 2011). EST-linked markers should therefore be useful in studying the evolution of genes among closely related species and, hence, in studies of phylogenetic relationships, species conservation and the molecular evolution of genes (Barbara et al. 2007; Chapman et al. 2007). The cross-species amplification success of these newly developed markers (Chapter 3) can be tested on different bark beetles with relatively little extra effort. With the current trend of global warming, bark beetle outbreaks are predicted to increase in the future (Carroll et al. 2006).
Therefore, the availability of common, 'universal' bark beetle markers will facilitate the study of future outbreaks of related bark beetles.
Overall, this study has helped to increase our understanding of the population genetic structure and the recent patterns of dispersal of the current MPB outbreak in western Canada. The EST-derived markers developed in this study have increased the marker availability for spatial genetic studies and provided another way to study the functional portion of the MPB genome. Preliminary surveys with EST-derived markers showed selection signatures at some markers, which underlines the importance of further studies. In addition, the new markers will have many important applications in bark beetle genetics. Ultimately, these studies will form a key data set that provides researchers with vital information on the biology of the MPB, which can be used to assess current and future outbreaks and to inform biologically relevant management strategies.
Literature Cited
Abbott RJ, Brochmann C (2003) History and evolution of the Arctic flora: in the footsteps of Eric Hulten. Molecular Ecology, 12, 299-313.
Allendorf FW, Luikart G (2007) Conservation and the Genetics of Populations. 1st ed. Wiley-Blackwell Publishing, Oxford, pp 222 (642).
Amman GD, McGregor MD, Dolph RE (1990) Mountain pine beetle. Forest Insect and Disease Leaflet 2, USDA Forest Service, Washington, DC.
Anderson WW, Berisford CW, Kinrich R (1979) Genetic differences among five populations of the southern pine beetle. Annals of the Entomological Society of America, 72, 323-327.
Anderson LL, Hu FS, Nelson DM, Petit RJ, Paige KN (2006) Ice-age endurance: DNA evidence of a white spruce refugium in Alaska. Proceedings of the National Academy of Sciences, 103, 12447-12450.
Antao T, Lopes A, Lopes RJ, Beja-Pereira A, Luikart G (2008) LOSITAN: a workbench to detect molecular adaptation based on a FST-outlier method. BMC Bioinformatics, 9, 323-328.
Aukema BH, Carroll AL, Zhu J et al. (2006) Landscape level analysis of mountain pine beetle in British Columbia, Canada: spatiotemporal development and spatial synchrony within the present outbreak. Ecography, 29, 427-441.
Aukema BH, Carroll AL, Zheng Y et al. (2008) Movement of outbreak populations of mountain pine beetle: Influence of spatiotemporal patterns and climate. Ecography, 31, 348-358.
Avise JC (2004) Molecular markers, natural history, and evolution (second edition). Sinauer, Sunderland, MA. pp 41.
Ayers NM, McClung AM, Larkin PD et al. (1997) Microsatellite and single nucleotide polymorphism differentiate apparent amylose classes in an extended pedigree of US rice germplasm. Theoretical and Applied Genetics, 94, 773-781.
Ayres MP, Lombardero MJ (2000) Assessing the consequences of global change for forest disturbance from herbivores and pathogens. Science of the Total Environment, 262, 263-286.
Barbara A, Palma-Silva C, Paggi GM et al. (2007) Cross-species transfer of nuclear microsatellite markers: potential and limitations. Molecular Ecology, 17, 3759-3767.
Balanya J, Oller JM, Huey RB, Gilchrist GW, Serra L (2006) Global genetic change tracks global climate warming in Drosophila subobscura. Science, 313, 1773-1775.
Balloux F, Lugon-Moulin N (2002) The estimation of population differentiation with microsatellite markers. Molecular Ecology, 11, 155-165.
Bartell N (2008) A Microsatellite Analysis of the Western Canadian Mountain Pine Beetle (Dendroctonus ponderosae) Epidemic: Phylogeography and Long Distance Dispersal Patterns. MSc Thesis, University of Northern BC, Prince George, BC, 164 pp.
Beatty GE, Provan J (2010) Refugial persistence and postglacial recolonization of North America by the cold-tolerant herbaceous plant Orthilia secunda. Molecular Ecology, 19, 5009-5021.
Beaumont MA, Nichols RA (1996) Evaluating loci for use in the genetic analysis of population structure.
Proceedings of the Royal Society of London, Series B: Biological Sciences, 263, 1619-1626.
Beaumont MA, Balding DJ (2004) Identifying adaptive genetic divergence among populations from genome scans. Molecular Ecology, 13, 969-980.
Beaumont MA (2005) Adaptation and speciation: what can F(st) tell us? Trends in Ecology & Evolution, 20, 435-440.
Beckmann JS, Weber JL (1992) Survey of human and rat microsatellites. Genomics, 12, 627-631.
Belkum VA (1999) The role of short sequence repeats in epidemiologic typing. Current Opinion in Microbiology, 2, 306-311.
Benson G (1999) Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Research, 27, 573-80.
Bentz BJ, Logan JA, Vandygriff JC (2001) Latitudinal variation in Dendroctonus ponderosae (Coleoptera: Scolytidae) development time and adult size. The Canadian Entomologist, 133, 375-387.
Bentz BJ, Regniere J, Fettig CJ et al. (2010) Climate change and bark beetles of the western United States and Canada: direct and indirect effects. BioScience, 60, 602-613.
Berryman AA (1972) Resistance of conifers to invasion by bark beetle - fungal associations. BioScience, 22, 598-602.
Bevilacqua A, Fiorenza MT, Mangia F (2000) A developmentally regulated GAGA box-binding factor and Sp1 are required for transcription of the hsp70.1 gene at the onset of mouse zygotic genome activation. Development, 127, 1541-1551.
Borghans J, Borghans L, Weel B (2005) Is there a link between economic outcomes and genetic evolution? Cross-country evidence from the major histocompatibility complex. Working Paper, Department of Economics, Maastricht University.
Bonin A, Nicole F, Pompanon F, Miaud C, Taberlet P (2007) Population adaptive index: a new method to help measure intraspecific genetic diversity and prioritize populations for conservation. Conservation Biology, 21, 697-708.
Borden JH (1982) Aggregation pheromones.
In: Bark beetles in North American conifers: a system for the study of evolutionary biology (eds Mitton JB, Sturgeon KB), pp. 74-139.
Bouck A, Vision T (2007) The molecular ecologist's guide to expressed sequence tags. Molecular Ecology, 16, 907-924.
Brubaker LB, Anderson PM, Edwards ME, Lozhkin AV (2005) Beringia as a glacial refugium for boreal trees and shrubs: new perspectives from mapped pollen data. Journal of Biogeography, 32, 833-848.
Brunelle A, Rehfeldt GE, Bentz B, Munson AS (2008) Holocene records of Dendroctonus bark beetles in high elevation forests of Idaho and Montana, USA. Forest Ecology and Management, 255, 836-846.
Burton PJ (2010) Striving for sustainability and resilience in the face of unprecedented change: the case of the mountain pine beetle outbreak in British Columbia. Sustainability, 2, 2403-2423.
Busturia A, Lloyd A, Bejarano F et al. (2001) The MCP silencer of the Drosophila Abd-B gene requires both pleiohomeotic and GAGA factor for the maintenance of repression. Development, 128, 2163-2173.
Calpas J, Konschuh M, Lisowski S (2002) Preliminary investigation into developing DNA fingerprints for "geographically distinct" populations of the mountain pine beetle Dendroctonus ponderosae (Coleoptera: Scolytidae). Alberta Agriculture Research Institute Project # 99E252 Final Report: 19 pp.
Carroll AL, Taylor SW, Regniere J, Safranyik L (2004) Effects of climate change on range expansion by the mountain pine beetle in British Columbia. In: Challenges and Solutions: Proceedings of the Mountain Pine Beetle Symposium (eds Shore TL, Brooks JE, Stone JE), Natural Resources Canada. Information Report BC-X-399. pp. 223-232.
Carroll AL, Shore TL, Safranyik L (2006) Direct control: theory and practice. In: The mountain pine beetle: a synthesis of biology, management, and impacts on lodgepole pine (eds Safranyik L, Wilson B). Natural Resources Canada, Canadian Forest Service, Pacific Forestry Centre, Victoria, BC. pp. 155-172.
Cerezke HF (1989) Mountain pine beetle aggregation semiochemical use in Alberta and Saskatchewan, 1983-1987. In: Symposium on the management of lodgepole pine to minimize losses to the mountain pine beetle (ed. Amman GD), pp. 113. USDA Forest Service. General Technical Report INT-262.
Cerezke HF (1995) Egg gallery, brood production, and adult characteristics of mountain pine beetle, Dendroctonus ponderosae Hopkins (Coleoptera: Scolytidae), in three pine hosts. The Canadian Entomologist, 127, 955-965.
Chabane K, Abdalla O, Sayed H, Valkoun J (2007) Assessment of EST-microsatellites markers for discrimination and genetic diversity in bread and durum wheat landraces from Afghanistan. Genetic Resources and Crop Evolution, 54, 1073-1080.
Chamberlain NL, Driver ED, Miesfeld RL (1994) The length and location of CAG trinucleotide repeats in the androgen receptor N-terminal domain affect transactivation function. Nucleic Acids Research, 22, 3181-3186.
Chapman JA (1962) Field studies on attack flight and log selection by the ambrosia beetles Trypodendron lineatum (Oliv.) (Coleoptera: Scolytidae). The Canadian Entomologist, 94, 74-92.
Chapman MA, Chang J, Weisman D, Kesseli RV, Burke JM (2007) Universal markers for comparative mapping and phylogenetic analysis in the Asteraceae (Compositae). Theoretical and Applied Genetics, 115, 747-755.
Charlesworth D (2006) Balancing selection and its effects on sequences in nearby genome regions. PLoS Genetics, 2, 379-384.
Chen C, Durand E, Forbes F, Francois O (2007) Bayesian clustering algorithms ascertaining spatial population structure: A new computer program and a comparison study. Molecular Ecology Notes, 7, 747-756.
Clark EL, Carroll AL, Huber DPW (2010) Differences in the Constitutive Terpene Profile of Lodgepole Pine Across a Geographical Range in British Columbia, and Correlation with Historical Attack by Mountain Pine Beetle. Entomological Society of Canada, 142, 557-573.
Clem RJ, Duckett CS (1998) The IAP genes: unique arbiters of cell death. Trends in Biochemical Sciences, 23, 159-162.
Clynen E, Huybrechts J, Verleyen P, De Loof A, Schoofs L (2006) Annotation of novel neuropeptide precursors in the migratory locust based on transcript screening of a public EST database and mass spectrometry. BMC Genomics, 7, 201.
Cognato AI, Harlin AD, Fisher ML (2003) Genetic structure among pinyon pine beetle populations (Scolytinae: Ips confusus). Environmental Entomology, 32, 1262-1270.
Cognato AI, Grimaldi D (2009) 100 million years of morphological conservation in bark beetles (Coleoptera: Curculionidae: Scolytinae). Systematic Entomology, 34, 93-100.
Colbourne JK, Robison B, Bogart K, Lynch M (2004) Five hundred and twenty eight microsatellite markers for ecological genomic investigations using Daphnia. Molecular Ecology Notes, 4, 485-490.
Conrad C, Ben A, Arthur HJ (2001) A Dendroctonus bark engraving (Coleoptera: Scolytidae) from a middle Eocene Larix (Coniferales: Pinaceae): early or delayed colonization? American Journal of Botany, 88, 2026-2039.
Coombs JA, Letcher BH, Nislow KH (2008) CREATE: a software to create input files from diploid genotypic data for 52 genetic software programs. Molecular Ecology Resources, 8, 578-580.
Corander J, Marttinen P (2006) Bayesian identification of admixture events using multilocus molecular markers. Molecular Ecology, 15, 2833-2843.
Corander J, Waldmann P, Sillanpaa MJ (2003) Bayesian analysis of genetic differentiation between populations. Genetics, 163, 367-374.
Corander J, Waldmann P, Marttinen P, Sillanpaa MJ (2004) BAPS 2: enhanced possibilities for the analysis of genetic population structure. Bioinformatics, 20, 2363-2369.
Corander J, Marttinen P, Mantyniemi S (2006) Bayesian identification of stock mixtures from molecular marker data. Fishery Bulletin, 104, 550-558.
Coulibaly I, Gharbi K, Danzmann RG, Yao J, Rexroad CE (2005) Characterization and comparison of microsatellites derived from repeat-enriched libraries and expressed sequence tags. Animal Genetics, 36, 309-315.
Cruzan MB, Templeton AR (2000) Paleoecology and coalescence: phylogeographic analysis of hypotheses from the fossil record. Trends in Ecology and Evolution, 15, 491-496.
Cudmore TJ, Bjorklund N, Carroll AL, Lindgren BS (2010) Climate change and range expansion of an aggressive bark beetle: evidence of higher beetle reproduction in naïve host tree populations. Journal of Applied Ecology, 47, 1036-1043.
Cullingham CI, Cooke JEK, Dang S, Davis CS, Cooke BJ, Coltman DW (2011) Mountain pine beetle host-range expansion threatens the boreal forest. Molecular Ecology, 20, 2157-2171. doi: 10.1111/j.1365-294X.2011.05086.x.
Curtis CF, Cook LM, Wood RJ (1978) Selection for and against insecticide resistance and possible methods of inhibiting the evolution of resistance in mosquitoes. Ecological Entomology, 3, 273-287.
Cwynar LC, MacDonald GM (1987) Geographical variation of lodgepole pine in relation to population history. The American Naturalist, 129, 463-469.
Davidson S, Starkey A, MacKenzie A (2009) Evidence of uneven selective pressure on different subsets of the conserved human genome; implications for the significance of intronic and intergenic DNA. BMC Genomics, 10, 614.
Davis CS, Mock KE, Bentz BJ et al. (2009) Isolation and characterization of sixteen microsatellite loci in the mountain pine beetle Dendroctonus ponderosae Hopkins (Coleoptera: Curculionidae: Scolytinae). Molecular Ecology Resources, 9, 1071-1073.
Dayanandan S, Dole J, Bawa K, Kesseli R (1999) Population structure delineated with microsatellite markers in fragmented populations of a tropical tree, Carapa guianensis (Meliaceae). Molecular Ecology, 8, 1585-1592.
Decroocq V, Fave MG, Hagen L, Bordenave L, Decroocq S (2003) Development and transferability of apricot and grape EST microsatellite markers across taxa. Theoretical and Applied Genetics, 106, 912-922.
De la Rosa-Reyna XF, Rodriguez Perez MA, Sifuentes-Rincon AM (2006) Microsatellite polymorphism in intron 1 of the bovine myostatin gene. Journal of Applied Genetics, 47, 55-57.
Demuth JP, Drury DW, Peters ML et al. (2007) Genome-wide survey of Tribolium castaneum microsatellites and description of 509 polymorphic markers. Molecular Ecology Notes, 7, 1189-1195.
Deng Wenqian, Sun Huaqin, Liu Yunqiang et al. (2009) Molecular cloning and expression analysis of a zebrafish novel zinc finger protein gene rnf141. Genetics and Molecular Biology, 32, 594-600.
Don RH, Cox PT, Wainwright BJ, Baker K, Mattick JS (1991) 'Touchdown' PCR to circumvent spurious priming during gene amplification. Nucleic Acids Research, 19, 4008.
Donaldson ZR, Kondrashov FA, Putnam A et al. (2008) Evolution of a behavior-linked microsatellite-containing element in the 5' flanking region of the primate AVPR1A gene. BMC Evolutionary Biology, 8, 180.
Duan H, Wang Y, Aviram M et al. (1999) SAG, a Novel Zinc RING Finger Protein That Protects Cells from Apoptosis Induced by Redox Agents. Molecular Cell Biology, 19, 3145-3155.
Duan Y, Kerdelhue C, Ye H, Lieutier F (2004) Genetic study of the forest pest Tomicus piniperda (Col., Scolytinae) in Yunnan province (China) compared to Europe: new insights for the systematics and evolution of the genus Tomicus. Heredity, 93, 416-422.
Durand E, Chen C, Francois O (2009) Tess 2.3 reference manual. Available at http://membres-timc.imag.fr/Olivier.Francois/tess.html.
Endler JA (1986) Natural Selection in the Wild, Princeton Univ. Press, Princeton in Whitehead A & Crawford DL 2006.
Epplen JT, Kyas A, Maueler W (1996) Genomic simple repetitive DNAs are targets for differential binding of nuclear proteins. FEBS Letters, 389, 92-95.
Eujayl I, Sorrells M, Baum M, Wolters P, Powell W (2001) Assessment of genotypic variation among cultivated durum wheat based on EST-SSRs and genomic SSRs. Euphytica, 119, 39-43.
Eujayl I, Sledge MK, Wang L et al. (2004) Medicago truncatula EST SSRs reveal cross-species genetic markers for Medicago spp. Theoretical and Applied Genetics, 108, 414-422.
Evanno G, Regnaut S, Goudet J (2005) Detecting the number of clusters of individuals using the software STRUCTURE: a simulation study. Molecular Ecology, 14, 2611-2620.
Excoffier L, Laval G, Schneider S (2005) Arlequin ver. 3.0: An integrated software package for population genetics data analysis. Evolutionary Bioinformatics Online, 1, 47-50.
Fabre E, Dujon B, Richard GF (2002) Transcription and nuclear transport of CAG/CTG trinucleotide repeats in yeast. Nucleic Acids Research, 30, 3540-3547.
Faccoli M, Piscedda A, Salvato P et al. (2005) Genetic structure and phylogeography of pine shoot beetle populations (Tomicus destruens and T. piniperda, Coleoptera, Scolytidae) in Italy. Annals of Forest Science, 62, 361-368.
Falush D, Stephens M, Pritchard JK (2003) Inference of population structure: Extensions to linked loci and correlated allele frequencies. Genetics, 164, 1567-1587.
Falush D, Stephens M, Pritchard JK (2007) Inference of population structure using multilocus genotype data: dominant markers and null alleles. Molecular Ecology Notes, 7, 574-578.
Fazekas AJ, Yeh FC (2006) Postglacial colonization and population genetic relationships in the Pinus contorta complex. Canadian Journal of Botany, 84, 223-234.
Fedy BC, Martin K, Ritland C, Young J (2008) Genetic and ecological data provide incongruent interpretations of population structure and dispersal in naturally subdivided populations of white-tailed ptarmigan (Lagopus leucura). Molecular Ecology, 17, 1905-1917.
Felix G, Rolf G, Franz M, Wermelinger B (2008) Pronounced fluctuations of spruce bark beetle (Scolytinae: Ips typographus) populations do not invoke genetic differentiation. Forest Ecology and Management, 256, 405-409.
Ford MJ (2002) Applications of selective neutrality tests to molecular ecology. Molecular Ecology, 11, 1245-1262.
Furniss MM, Furniss RL (1972) Scolytids (Coleoptera) on snowfields above timberline in Oregon and Washington. The Canadian Entomologist, 104, 1471-1478.
Galtier N, Depaulis F, Barton NH (2000) Detecting bottlenecks and selective sweeps from DNA sequence polymorphism. Genetics, 155, 981-987.
Gangwal K, Sankar S, Hollenhorst PC et al. (2008) Microsatellites as EWS/FLI response elements in Ewing's sarcoma. Proceedings of the National Academy of Science of the USA, 105, 10149-10154.
Garnier S, Alibert P, Audiot P, Prieur B, Rasplus JY (2004) Isolation by distance and sharp discontinuities in gene frequencies: implications for the phylogeography of an alpine insect species, Carabus solieri. Molecular Ecology, 13, 1883-1897.
Gaston KJ (2009) Geographic range limits of species. Proceedings of the Royal Society of London, Series B: Biological Sciences, 276, 1391-1393.
Gibson K, Skov K, Kegley S et al. (2006) Mountain pine beetle conditions in whitebark pine stands in the Greater Yellowstone Ecosystem, Rep. 06-03. Missoula, MT: U.S. Department of Agriculture, Forest Service, Northern Region, Forest Health Protection, Missoula Field Office. p. 7.
Glaubitz JC (2004) CONVERT: A user-friendly program to reformat diploid genotypic data for commonly used population genetic software packages. Molecular Ecology Notes, 4, 309-310.
Goldhammer DS, Stephen FM, Paine TD (1990) The effect of the fungi Ceratocystis minor (Hedgecock) Hunt, Ceratocystis minor (Hedgecock) Hunt var. Barrasii Taylor, and SJB 122 on reproduction of the southern pine beetle, Dendroctonus frontalis Zimmermann (Coleoptera: Scolytidae). Canadian Entomologist, 122, 407-418.
Goldstein DB, Schlotterer C (2001) Microsatellites: Evolution and applications. New York, Oxford, Oxford University Press.
Goudet J (2001) FSTAT, a program to estimate and test gene diversities and fixation indices (version 2.9.3). Available from http://www.unil.ch/izea/softwares/fstat.html. Updated from Goudet (1995).
Green DM, Sharbel TF, Kearsley J, Kaiser H (1996) Postglacial range fluctuation, genetic subdivision and speciation in the western North American spotted frog complex, Rana pretiosa. Evolution, 50, 374-390.
Grillo V, Jackson F, Gilleard JS (2006) Characterisation of Teladorsagia circumcincta microsatellites and their development as population genetic markers. Molecular and Biochemical Parasitology, 148, 181-189.
Gyrd-Hansen M, Meier P (2010) IAPs: from caspase inhibitors to modulators of NF-kappaB, inflammation and cancer. Nature Reviews Cancer, 8, 561-574.
Hammock EAD, Young LJ (2005) Microsatellite instability generates diversity in brain and socio behavioral traits. Science, 308, 1630-1634.
Hancock JM (1999) Microsatellites and other simple sequences: genomic context and mutational mechanisms. In Microsatellites: Evolution and applications. Goldstein D, Schlotterer C (2001) New York, Oxford University Press, 1-9.
Hardy OJ, Maggia L, Bandou E et al. (2006) Fine-scale genetic structure and gene dispersal inferences in 10 neotropical tree species. Molecular Ecology, 15, 559-571.
Harrington TC (1993) Diseases of conifers caused by species of Ophiostoma and Leptographium. In: Ceratocystis and Ophiostoma: taxonomy, ecology, and pathogenicity.
Hewitt G (2000) The genetic legacy of the Quaternary ice ages. Nature, 405, 907-913.
Heusser CJ (1960) Late Pleistocene environments of north Pacific North America. American Geographical Society Special Publication No. 35.
Hewitt GM (1999) Postglacial recolonization of European Biota. Biological Journal of the Linnean Society, 68, 87-112.
Hewitt GM (2004) Genetic consequences of climatic oscillations in the Quaternary. Philosophical Transactions of the Royal Society of London B, 359, 183-195.
Hisano H, Sato S, Isobe S et al. (2008) Characterization of the soybean genome using EST-derived microsatellite markers. DNA Research, 14, 271-281.
Holm S (1979) A simple sequential rejective multiple test procedure. Scandinavian Journal of Statistics, 6, 65-70.
Horn A, Roux-Morabito G, Lieutier F, Kerdelhue C (2006) Phylogeographic structure and past history of the circum-Mediterranean species Tomicus destruens Woll. (Coleoptera: Scolytinae). Molecular Ecology, 15, 1603-1615.
Huang Q, Deveraux QL, Maeda S et al. (2001) Cloning and characterization of an inhibitor of apoptosis protein (IAP) from Bombyx mori. Biochimica et Biophysica Acta, 1499, 191-198.
Hutchison DW, Templeton AR (1999) Correlation of pairwise genetic and geographic distance measures: inferring the relative influences of gene flow and drift on the distribution of genetic variability. Evolution, 53, 1898-1914.
Ihle S, Ravaoarimanana I, Tautz D (2006) An analysis of signatures of selective sweeps in natural populations of the house mouse. Molecular Biology and Evolution, 23, 790-794.
Jackson PL, Straussfogel D, Lindgren BS, Mitchell S, Murphy B (2008) Radar observation and aerial capture of mountain pine beetle, Dendroctonus ponderosae Hopk. (Coleoptera: Scolytidae) in flight above the forest canopy. Canadian Journal of Forest Research, 38, 2313-2327.
Jacob E, Pucshansky L, Zeruya E, Baran N, Manor H (2004) The human protein translin specifically binds single-stranded microsatellite repeats, d(GT)n, and G-strand telomeric repeats, d(TTAGGG)n: a study of the binding parameters. Journal of Molecular Biology, 344, 939-950.
Jakobsson M, Rosenberg NA (2007) CLUMPP: a cluster matching and permutation program for dealing with label switching and multimodality in analysis of population structure. Bioinformatics, 23, 1801-1806.
Jakupciak JP, Wells RD (1999) Genetic instabilities in (CTGCAG) repeats occur by recombination. The Journal of Biological Chemistry, 274, 23468-23479.
Johnson JA, Dunn PO, Bouzat JL (2007) Effects of recent population bottlenecks on reconstructing the demographic history of prairie-chickens. Molecular Ecology, 16, 2203-2222.
Kane NC, Rieseberg LH (2007) Selective sweeps reveal candidate genes for adaptation to drought and salt tolerance in common sunflower, Helianthus annuus. Genetics, 175, 1823-1834.
Kantety RV, Rota ML, Matthews DE, Sorrells ME (2002) Data mining for simple sequence repeats in expressed sequence tags from barley, maize, rice, sorghum and wheat. Plant Molecular Biology, 48, 501-510.
Kashi Y, King D, Soller M (1997) Simple sequence repeats as a source of quantitative genetic variation. Trends in Genetics, 13, 74-78.
Kashi Y, King DG (2006) Simple sequence repeats as advantageous mutators in evolution. Trends in Genetics, 22, 253-259.
Kauer M, Dieringer D, Schlotterer C (2003) A microsatellite variability screen for positive selection associated with the 'Out of Africa' habitat expansion of Drosophila melanogaster. Genetics, 165, 1137-1148.
Kelley ST, Farrell BD (1998) Is specialization a dead end? The phylogeny of host use in Dendroctonus bark beetles (Scolytidae). Evolution, 52, 1731-1743.
Kelley ST, Mitton JB, Paine TD (1999) Strong differentiation in mitochondrial DNA of Dendroctonus brevicomis (Coleoptera: Scolytidae) on different subspecies of ponderosa pine. Annals of the Entomological Society of America, 92, 193-197.
Kelley ST, Farrell BD, Mitton JB (2000) Effects of specialization on genetic differentiation in sister species of bark beetles. Heredity, 84, 218-227.
Kerdelhue C, Roux-Morabito G, Forichon J, Chambon J, Robert A, Lieutier F (2002) Population genetic structure of Tomicus piniperda L. (Curculionidae: Scolytinae) on different pine species and validation of T. destruens (Woll.). Molecular Ecology, 11, 483-494.
Kerdelhue C, Magnoux E, Lieutier F, Roques A, Rousselet J (2006) Comparative population genetic study of two oligophagous insects associated with the same hosts. Heredity, 97, 38-45.
Kim JJ, Allen EA, Humble LM, Breuil C (2005) Ophiostomatoid and basidiomycetous fungi associated with green, red, and grey lodgepole pines after mountain pine beetle (Dendroctonus ponderosae) infestation. Canadian Journal of Forest Research, 35, 274-284.
Kim KS, Ratcliffe ST, French BW, Liu L, Sappington TW (2008) Utility of EST-derived SSRs as Population Genetics Markers in a Beetle. Journal of Heredity, 99, 112-124.
Konkin D, Hopkins K (2009) Learning to deal with climate change and catastrophic forest disturbances. Unasylva, 60, 17-23.
Korol L, Shklar G, Schiller G (2002) Diversity among circum-Mediterranean populations of Aleppo pine and differentiation from Brutia pine in their isoenzymes: additional results. Silvae Genetica, 51, 35-41.
Kovtun IV, Goellner G, McMurray CT (2001) Structural features of trinucleotide repeats associated with DNA expansion. Biochemistry and Cellular Biology, 79, 325-326.
Kurz WA, Dymond CC, Stinson G et al. (2008) Mountain pine beetle and forest carbon feedback to climate change. Nature, 452, 987-990.
Kunzler P, Matsuo K, Schaffner W (1995) Pathological, physiological, and evolutionary aspects of short unstable DNA repeats in the human genome. Biological Chemistry Hoppe-Seyler, 4, 201-211.
Lampert KP, Rand AS, Mueller UG, Ryan MJ (2003) Fine-scale genetic pattern and evidence for sex biased dispersal in the tungara frog, Physalaemus pustulosus. Molecular Ecology, 12, 3325-3334.
Langerhans RB, Gifford M, Joseph E (2007) Ecological speciation in Gambusia fishes. Evolution, 61, 2056-2074.
Laity JH, Lee BM, Wright PE (2001) Zinc finger proteins: New insights into structural and functional diversity. Current Opinion in Structural Biology, 11, 39-46.
Langor DW, Spence JR (1991) Host effects on allozyme and morphological variation of the mountain pine beetle, Dendroctonus ponderosae Hopkins (Coleoptera: Scolytidae). The Canadian Entomologist, 123, 395-410.
Latch EK, Dharmarajan G, Glaubitz JC, Rhodes JR (2006) Relative performance of Bayesian clustering software for inferring population substructure and individual assignment at low levels of population differentiation. Conservation Genetics, 7, 295-302.
Lee S, Kim JJ, Breuil C (2006) Fungal diversity associated with the mountain pine beetle, Dendroctonus ponderosae, and infested lodgepole pines in British Columbia. Fungal Diversity, 22, 91-105.
Lee S, Hamelin RC, Six DL, Breuil C (2007) Genetic diversity and the presence of two distinct groups in Ophiostoma clavigerum associated with Dendroctonus ponderosae in British Columbia and the Northern Rocky Mountains. Phytopathology, 97, 1177-1185.
Lewontin RC, Krakauer J (1973) Distribution of gene frequency as a test of theory of selective neutrality of polymorphisms. Genetics, 74, 175-195.
Levinson G, Gutman GA (1987) Slipped-strand mispairing: a major mechanism for DNA sequence evolution. Molecular Biology and Evolution, 4, 203-221.
Li YC, Korol AB, Fahima T, Beiles A, Nevo E (2002) Microsatellites: genomic distribution, putative functions and mutational mechanisms: a review. Molecular Ecology, 11, 2453-2465.
Li YC, Korol AB, Fahima T, Nevo E (2004) Microsatellites within genes: structure, function, and evolution. Molecular Biology and Evolution, 21, 991-1007.
Liu Z, Karsi A, Dunham RA (1999) Development of polymorphic EST markers suitable for genetic linkage mapping of catfish. Marine Biotechnology, 1, 437-447.
Liu L, Dybvig K, Panangala VS, Van Santen VL, French CT (2000) GAA trinucleotide repeat region regulates M9/pMGA gene expression in Mycoplasma gallisepticum. Infection and Immunity, 68, 871-876.
Lugon-Moulin N, Hausser J (2002) Phylogeographical structure, postglacial recolonization and barriers to gene flow in the distinctive Valais chromosome race of common shrew (Sorex araneus). Molecular Ecology, 11, 785-794. Luikart G, AUendorf FW, Cornuet JM, Sherwin WB (1998) Distortion of allele frequency distributions provides a test for recent population bottlenecks. Journal of Heredity, 89, 238-247. Luikart G, England PR, Tallmon D, Jordan S, Taberlet P (2003) The power and promise of population genomics: from genotyping to genome typing. Nature Reviwes Genetics, 4, 981-994. Mace PD, Shirley S, Day CL (2010) Assembling the building blocks: structure and function of inhibitor of apoptosis proteins. Cell Death and Differentiation, 17, 46-53. MacDonald GM, Cwynar LC (1985) A fossil pollen based reconstruction of the late Quaternary history of lodgepole pine (Pinus contorta ssp. latifolia) in the western interior of Canada. Canadian Journal of Forest Research, 15, 1039-1044. 135 Manni, Guerard FE, Heyern E (2004) Geographic patterns of (genetic,morphologic, linguistic) variation: how barriers can be detected by using Monmoniers algorithm. Human. Biology, 76, 173-190. Martin P, Makepeace K, Stuart A. et al. (2005) Microsatellite instability regulates transcription factor binding and gene expression. Proceedings of the National Academy of Science of the USA, 102, 3800-3804. Mantel N (1967) The detection of disease clustering and a generalized regression approach. Cancer Research, 27, 209-220. Maroja LS, Bogdanowicz SM, Wallin KF, Raffa KF, Harrison RG (2007) Phylogeography of spruce beetles (Dendroctonus rufipennis Kirby) (Curculionidae: Scolytinae) in North America. Molecular Ecology, 16, 2560-2573. Marshall HD, Newton C, Ritland K (2002) Chloroplast phylogeography and evolution of highly polymorphic microsatellites in lodgepole pine (Pinus contortd). Theoretical and Applied Genetics, 104, 367-378. Martins WS, Cesar D, Lucas S, et al. 
(2009) Bioinformation WebSat - A web software for microsatellite marker development. Bioinformation, 6, 282-283. Bamshad MJ, Wooding S, Watkins WS, et al. (2003) Human population genetic structure and inference of group membership. The American Society of Human Genetics, 11, 578-589. Mita K, Morimyo M, Okano K et al.. (2003) The construction of an EST database for Bombyx mori and its application. Proceedings of the National Academy of Sciences, USA,100, 14121-14126. Mock KE, Bentz BJ, O'Neill EM et al. (2007) Landscape-scale genetic variation in a forest outbreak species, the mountain pine beetle (Dendroctonus ponderosae). Molecular Ecology, 16, 553-568. Moxon R, Willis C (1999) DNA microsatellites: Agents of evolution? Scientific American, 280, 94-99. Mudunuri SB, Nagarajaram HA (2007) IMEx: Imperfect Microsatellite Extractor. Bioinformatics, 23, 1181-1187. Namkoong G, Roberds JH, Nunnally LB, Thomas HA (1979) Isozyme variations in populations of southern pine beetles. Forest Science, 25, 197-203. Narum SR, Hess JE (2011) Comparison of FST outlier tests for SNP loci under selection. Molecular Ecology Resources, 11, 184-194. 136 Nelson TA, Boots B, Wulder MA, Carroll AL (2007) Environmental characteristics of mountain pine beetle infestation hot spots. BC Journal of Ecosystems and Management, 8, 91-108. Nielsen R, Williamson S, Kim Y, et al. (2005) Genomic scans for selective sweeps using SNP data. Genome Research, 15, 1566-1575. Nielsen EE, Hemmer-Hansen J, Larsen PF, Bekkevold D (2009) Population genomics of marine fishes: identifying adaptive variation in space and time. Molecular Ecology, 18,3128-3150. Niemann KO & Visintini F (2005) Assessment of potential for remote sensing detection of bark beetle-infested areas during green attack: a literature review. Natural Resources Canada, Canadian Forest Service, 1-14. Owen DR, Lindahl KQ, Wood DL, and Parmeter JR (1987) Pathogenicity of fungi isolated from Dendroctonus valens, D. brevicomis, and D. 
ponderosae to ponderosa pine seedlings. Phytopathology, 77, 631-636. Paine TD, Raffa KF, and Harrington TC (1997) Interactions among scolytid bark beetles, their associated fungi, and live host conifers. Annual Review of Entomology, 42, 179206. Pan X, Xie D, Yu RW, Lam D, Saddler JN (2007) Pretreatment of lodgepole pine killed by mountain pine beetle using the ethanol organosolv process: fractionation and process optimization. Industrial and Enginering Chemistry Research, 46, 2609-2617. Pannebakker B, Niehuis O, Hedley AA, Gadau J, Shuker DM (2010) The distribution of microsatellites in the Nasonia parasitoid wasp genome. Insect Molecular Biology, 19, 91-98. Park SDE (2001) Trypanotolerance in West African cattle and the population genetic effects of selection. PhD thesis. University of Dublin. Pagani F, Buratti E, Stuani C, et al. (2000) Splicing factors induce cystic transmembrane regulator exon 9 skipping through a nonevolutionary conserved intronic element. The Journal of Biological Chemistry, 275, 210141-210147. Payseur BA, Cutter AD, Nachman MW (2002) Searching for evidence of positive selection in the human genome using patterns of microsatellite variability. Molecular Biology and Evolution, 19, 1143-1153. Peakall R, Smouse P, (2006) GENALEX 6: genetic analysis in Excel. Population genetic software for teaching and research, Molecular Ecology Notes, 6, 288-295. 137 Peng JH, Lapitan NLV (2005) Characterization of EST-derived microsatellites in the wheat genome and development of eSSR markers. Functional and Integrated Genomics, 5, 80—96. Perez F, Ortiz J, Zhinaula M, Gonzabay C, Calderon J, Volckaert FA (2005) Development of EST-SSR markers by data mining in three species of shrimp: Litopenaeus vannamei, Litopenaeus stylirostris, and Trachypenaeus birdy. Marine Biotechnology, 7, 554-569. Peteet DM (1991) Postglacial migration history of lodgepole pine near Yakutat, Alaska. Canadian Journal of Botany, 69, 786-796. 
Petit R, Mousadik A, Pons O (1998) Identifying populations for conservation on the basis of genetic markers. Conservation Biology, 12, 844-855. Piry S, Luikart G, Cornuet JM (1999) Bottleneck: a computer program for detecting recent reductions in the effective population size using allele frequency data. Journal of Heredity, 90, 502-503. Piry S, Alapetite A, Cornuet JM et al. (2004) GENECLASS2: a software for genetic assignment and first generation migrant detection. Journal of Heredity, 95, 536-539. Prasad MD, Muthulakshmi M, Madhu M et al. (2005) Survey and analysis of microsatellites in the silkworm, Bombyx mori: frequency, distribution, mutations, marker potential and their conservation in heterologous species. Genetics, 169, 197— 214. Prathepha P (2008) Variation of the waxy microsatellite allele and its relation to amylose content in wild rice (Oryza rufipogon Griff.). Asian Jarnal of Plant Science, 7, 156— 162. Pritchard JK, Stephens M, Donnelly P (2000) Inference of population structure using multilocus genotype data. Genetics, 155, 945-959. Raffa KF, Aukema BH, Bentz BJ et al. (2008) Cross-scale drivers of natural disturbance prone to anthropogenic amplification: The dynamics of bark beetle eruptions. BioScience, 58, 501-517. Raymond M, Rousset F, (1995) GENEPOP: Population genetics software for exact tests and ecumenicism. Journal of Heredity, 86, 248-249. Regniere J & Bentz B (2007) Modeling cold tolerance in the mountain pine beetle, Dendroctonus ponderosae. Journal of Insect Physiology, 53, 559-572. Reid RW (1961) Moisture changes in lodgepole pine before and after attack by mountain pine beetle. The Forestry Chronicle, 37, 368-75. 138 Rice WR (1989) Analyzing tables of statistical tests. Evolution, 43, 223-225. Ritchie C (2008) Management and challenges of the mountain pine beetle infestation in British Columbia ALCES, 44, 127-135. 
Ritzerow S, Konrad H, Stauffer C (2004) Phylogeography of the Eurasian pine shoot beetle Tomicus piniperda (Coleoptera: Scolytidae). European Journal of Entomology, 101, 13-19. Riva G, Cohen CJ, Eitan Y, et al. (2000) Simple Sequence Repeats in Escherichia coli: Abundance, Distribution, Composition, and Polymorphism. Genome Research, 10, 62-71. Roberds JH, Hain FP, Nunnally LB (1987) Genetic structure of southern pine beetle populations. Forest Science, 33, 52-69. Robertson C, Nelson TA, Boots B (2007) Mountain pine beetle dispersal: the spatialtemporal interaction of infestations. Forest Science, 53, 395-405. Rosenberg SM, Longerich S, Gee P, Harris RS (1994) Adaptive mutation by deletions in small mononucleotide repeats. Science, 265, 405-407. Rousset F (1997) Genetic differentiation and estimation of gene flow from F-statistics under isolation by distance. Genetics, 145, 1219-1228. Rozen S, Skaletsky H (2000) Primer3 on the WWW for general users and for biologist programmers. In: Bioinformatics Methods and Protocols: Methods in Molecular Biology (eds. Krawetz S, Misener S), pp.365-386. Humana Press, Totowa, New Jersey. Safranyik L (1978) Effects of climate and weather on mountain pine beetle populations. Pages 77-84 in Kibbee DL, Berryman AA., Amman G.D, and Stark RW, eds. Theory and practice of mountain pine beetle management in lodgepole pine forests. Symposium Proceedings, University of Idaho, Moscow, ID. Safranyik L, Silversides R, McMullen LH, Linton DA (1989) An empirical approach to modeling the local dispersal of the mountain pine beetle (Dendroctonus ponderosae Hopk.) (Col., Scolytidae) in relation to sources of attraction, wind direction and speed. Journal of Applied Entomology, 108, 498-511. Safranyik L, Linton DA, Silversides R, McMullen LH (1992) Dispersal of released mountain pine beetles under the canopy of a mature lodgepole pine stand. Journal of Applied Entomology, 113, 441-450. 
139 Safranyik L, Linton DA (1998) Mortality of mountain pine beetle larvae, Dendroctonus ponderosae (Coleoptera: Scolytidae) in logs of lodgepole pine (Pinus contorta var. latifolia) at constant low temperatures. Journal of the Entomological Society of British Columbia, 95, 81-87. Safranyik L, Carroll AL (2006) The biology and epidemiology of the mountain pine beetle in lodgepole pine forests. In: The mountain pine beetle: a synthesis of biology, management, and impacts on lodgepole pine (eds Safranyik L, Wilson B), pp. 3-66. Natural Resources Canada. Safranyik L, and Wilson B (2006) The mountain pine beetle a synthesis of biology, management, and impacts on lodgepole pine, Natural Resources Canada, Canadian forest service Pacific Forestry Centre. Victoria, BC. p3. Safranyik L, Carroll AL, Regniere J, et al. (2010) Potential for range expansion of mountain pine beetle into the boreal forest of North America. Canadian Entomologist, 142, 415-442. Salle A, Arthofer W, Lieutier F, Stauffer C Kerdelhue C (2007) Phylogeography of a host-specific insect: genetic structure of Ips typographus in Europe does not reflect past fragmentation of its host. Biological Journal of the Linnean Society, 90, 239246. Salom SM, McLean JA (1990) Dispersal of Trypodendron lineatum (Olivier) within a valley setting. The Canadian Entomologist, 122, 43-58. Salvador L, Alia R, Agundez D, Gil L (2000) Genetic variation and migration pathways of maritime pine {Pinus pinaster Ait.) in the Iberian peninsula. Theoretical and Applied Genetics, 100, 89-95. Sambrook J, Russell DW (2001) Molecular Cloning: A Laboratory Manual, 3 r edn. Cold Spring Harbour Laboratory Press, New York. Sangwan I, O'Brian MR (2002) Identification of a soybean protein that interacts with GAGA element dinucleotide repeat DNA. Plant Physiology, 129, 1788-1794. 
Sattath S, Elyashiv E, Kolodny O, Rinott Y, Sella G (2011) Pervasive Adaptive Protein Evolution Apparent in Diversity Patterns around Amino Acid Substitutions in Drosophila simulans. PLoS Genetics, 7, 1-6. Sawyer LA, Hennessy JM, Peixoto AA et al. (1997) Natural variation in a Drosophila clock gene and temperature compensation. Science, 278, 2117-2120. Schlotterer C (2003) Hitchhiking mapping-functional genomics from the population genetics perspective. Trends in Genetics, 19, 32-38. 140 Schluter D (2001) Ecology and the origin of species. Trends in Ecology Evolution, 16, 372-380. Schuelke M (2000) An economic method for the fluorescent labeling of PCR fragments. Nature Biotechnology, 18, 233-234. Schug MD, Hutter CM, Wetterstrand KA et al. (1998) The mutation rates of di-, tri- and tetranucleotide repeats in Drosophila melanogaster. Molecular Biology and Evolution, 15,1751-1760. Six DL, Paine TD (1998) Effects of mycangial fungi and host tree species on progeny survival and emergence of Dendroctonus ponderosae (Coleoptera: Scolytidae). Environmental Entomology, 27, 1393-1401. Six DL, Paine TD, Hare JD (1999) Allozyme diversity and gene flow in the bark beetle, Dendroctonus jeffreyi (Coleoptera: Scolytidae). Canadian Journal of Forest Research, 29, 315-323. Six DL (2003) A comparison of mycangial and phoretic fungi of individual mountain pine beetles. Canadian Journal of Forest Research, 33, 1331-4. Slatkin M, Maddison WP (1990) Detecting isolation by distance using phylogenies of genes. Genetics, 126, 249-260. Slatkin M (1991) Inbreeding coefficients and coalescence times. Genetical Research, 58, 167-175. Slatkin M, (1993) Isolation by distance in equilibrium and non-equilibrium populations. Evolution, 47, 264-279. Slatkin, M., and L. Excoffier (1996) Testing for linkage disequilibrium in genotypic data using the expectation-maximization algorithm. Heredity, 76, 377-383. 
Smit AF, Toth G, Riggs AD, Jurka J (1995) Ancestral, mammalian-wide subfamilies of LINE-1 repetitive sequences. Journal of Molecular Biology, 246, 401^-17. Solheim H, Krokene P (1998) Growth and virulence of mountain pine beetle associated blue-stain fungi, Ophiostoma clavigerum and Ophiostoma montium. Canadian Journal of Botany, 76, 561-566. Solignac M, Vautrin D, Loiseau A, et al. (2003) Five hundred and fifty microsatellites markers for the study of the honey bee genome (Apis mellifera L). Molecular Ecology Notes, 3,307-311. Song QJ, Fickus EW, Cregan PB (2002) Characterization of trinucleotide SSR motifs in wheat. Theoretical and Applied Genetics, 104,286-293. 141 Soranzo N, Alia R, Provan J, Powell W (2000) Patterns of variation at mitochondrial sequence-tagged-site provide new insights into the postglacial history of European Pinus sylvestris populations. Molecular Ecology, 9, 1205-1211. Stajich JE, Hahn MW (2005) Disentangling the effects of demography and selection in human history. Molecular Biology and Evolution, 22, 63-73. Stauffer C, Lakatos F, Hewitt G (1999) Phylogeography and postglacial colonization routes of Ips typographus L. (Coleoptera, Scolytidae). Molecular Ecology, 8, 763773. Stock MW, Pitman B, Guenther JD (1979) Genetic differences between Douglas-fir beetles {Dendroctonus pseudotsugae) from Idaho and coastal Oregon. Annals of the Entomological Society of America, 72, 394-397. Stock MW, Guenther JD (1979) Isozyme variation among mountain pine beetle {Dendroctonus ponderosae) populations in the Pacific Northwest. Environmental Entomology, 8, 889-893. Stock MW, Amman GD (1980) Genetic differentiation among mountain pine beetle populations from lodgepole pine and ponderosa pine in northeast Utah. Annals of the Entomological Society of America, 73, 472-478. Stock MW, Amman GD, Higby PK (1984) Genetic variation among mountain pine beetle {Dendroctonus ponderosae) (Coleoptera; Scolytidae) populations from seven western states. 
Annals of the Entomological Society of America, 11, 760-764. Stock MW, Amman GD (1985) Host effects on the genetic structure of mountain pine beetle, Dendroctonus ponderosae, populations. In: Proceedings oflUFRO Conference on: The Role of the Host in the Population Dynamics of Forest Insects, pp. 83-95. Storz JF (2005) invited review: Using genome scans of DNA polymorphism to infer adaptive population divergence. Molecular Ecology, 14, 671-688. Streelman JT, Kocher TD (2002) Microsatellite variation associated with prolactin expression and growth of salt-challenged Tilapia. Physiological Genomics, 9, 1-4. Sturgeon KB, Mitton JB (1986) Allozyme and morphological differentiation of mountain pine beetles Dendroctonus ponderosae Hopkins (Coleoptera: Scolytidae) associated with host tree. Evolution, 40, 290-302. Taylor SW and Carroll AL (2004) Disturbance, forest age dynamics, and mountain pine beetle outbreaks in BC: a historical perspective. In: Shore, T. L. et al. (eds), Challenges and solutions. Proc. of the Mountain Pine Beetle Symp., Kelowna, BC, 142 Canada October 30-31, 2003, Canadian Forest Service, Pacific Forestry Centre, Information Report BC-X- 399, NRC Research Press, pp. 41-51. Telles M, Diniz-Filho AJ (2005) Multiple Mantel tests and isolation-by-distance, taking into account long-term historical divergence. Genetics and Molecular Research, 4, 742-748. Temnykh S, Declerck G, Lukashova A, et al. (2001) Computational and experimental analysis of microsatellites in rice (Oryza sativa L.): frequency, length variation, transposon associations, and genetic marker potential. Genome Research, 11, 14411452. Tsui CKM, Feau N, Ritland CE, et al. (2009) Characterization of microsatellite loci in the fungus, Grosmannia clavigera, a pine pathogen associated with the mountain pine beetle. Molecular Ecology Resources. Tian F, Stevens NM, Buckler ES (2009) Tracking footprints of maize domestication and evidence for a massive selective sweep on chromosome. 
Proceedings of the National Academy of Science of the USA, 106, 9979-9986. Tidiane Aw, Schlauch K, Keeling CI, et al. (2010) Functional genomics of mountain pine beetle (Dendroctonus ponderosae) midguts and fat bodies. BMC Genomics, 11, 215— 227. Timchenko LT, Timchenko NA, Caskey CT, Roberts R (1996) Novel proteins with binding specificity for DNA CTG repeats and RNA CUG repeats: implications for myotonic dystrophy. Human Molecular Genetics, 5, 115-121. Uunila L, Guy B, and Pike R (2006) Hydrologic effects of mountain pine beetle in the interior pine forests of British Columbia: key questions and current knowledge. Streamline Watershed Management Bulletin, 9, 1-6. Varshney RK, Graner A, Sorrells ME (2005) Genie microsatellite markers in plants: Features and applications. Trends in Biotechnology, 23, 48-55. Vasconcelos T, Horn A, Lieutier F, Branco M, Kerdelhue C (2006) Distribution and population genetic structure of the Mediterranean pine shoot beetle Tomicus destruens in the Iberian Peninsula and Southern France. Agricultural and Forest Entomology, 8, 103-111. Vasemagi A, Nilsson J, Primmer CR (2005) Expressed sequence tag-linked microsatellites as a source of gene-associated polymorphisms for detecting signatures of divergent selection in Atlantic salmon (Salmo salar L.). Molecular Biology and Evolution, 22, 1067-1076. 143 Vazquez O, Vazquez ME, Blanco JB, Castedo L, Mascarenas JL (2007) Specific DNA recognition by a synthetic, monomeric Cys2His2 zinc-finger peptide conjugated to a minor-groove binder. Angewandte Chemi Intearnational Edition, 46, 6886-6890. Verheij M, Bose R, Lin XH, et al. (1996) Requirement for ceramide-initiated SAPK/JNK signalling in stress-induced apoptosis, Nature, 380, 75 - 79. Vilaplana L, Nuria P, Perera N, Belles X (2007) Molecular characterization of an inhibitor of apoptosis in the Egyptian armyworm, Spodoptera littoralis, and midgut cell death during metamorphosis. Insect Biochemistry and Molecular Biology, 37, 1241-1248. 
Vigouroux Y, McMullen M, Hittinger CT, et al. (2002) Identifying genes of agronomic importance in maize by screening microsatellites for evidence of selection during domestication. Proceedings of the National Academy of Science of the USA, 99, 9650-9655. Vitalis R, Dawson K, Boursot P (2001) Interpretation of variation across marker loci as evidence of selection. Genetics, 158, 1811-1823. Wagner TL, Gagne JA, Doraiswamy PC, Coulson RN, Brown KW (1979) Development time and mortality of Dendroctonus frontalis in relation to changes in tree moisture and xylem water potential. Environmental Entomology, 8, 1129-38. Wagner WL, Wilson B, Peter B, Wang S, Stennes B (2006) Economics in the management of mountain pine beetle in lodgepole pine in British Columbia: A synthesis. In: The mountain pine beetle: a synthesis of biology, management, and impacts on lodgepole pine (eds Safranyik L, Wilson B), Natural Resources Canada, pp. 277-300. Walum H, Westberg L, Henningsson S et al. (2008) Genetic variation in the vasopressin receptor la gene (AVPR1A) associates with pair-bonding behavior in humans Proceedings of the National Academy of Sciences of the U.S.A, 105, 14153. Waples RS, Gaggiotti O (2006) What is a population? An empirical evaluation of some genetic methods for identifying the number of gene pools and their degree of connectivity. Molecular Ecology, 15, 1419-1439. Weir BS, Cockerham CC (1984) Estimating F-statistics for the analysis of population structure. Evolution, 38, 1358-1370. Wells RD (1996) Molecular basis of genetic instability of triplet repeats. The Journal of Biological Chemistry, 111, 2875-2878. Westfall J, Ebata T (2008) 2007 Summary afforest health conditions in British Columbia. BC Ministry of Forests and Range. 144 Whitehead A & Crawford DL 2006. Neutral and adaptive variation in gene expression. Proceedings of the National Academy of Science of the USA, 103, 5425-5430. Whitney HS, Bandoni RJ, Oberwinkler F (1987) Entomocorticium dendroctoni gen. et. 
Sp. Nov. (Basidiomycotina), a possible nutritional symbiote of the mountain pine beetle in lodgepole pine in British Columbia. Canadian Journal of Botany, 65, 9 5 102. Wiehe T, Nolte V, Zivkovic D, Schlotterer C (2007) Identification of selective sweeps using a dynamically adjusted number of linked microsatellites. Genetics, 175,207-218. Wood CS, Unger L (1996) Mountain pine beetle - a history of outbreaks in pine forests in British Columbia, 1910 to 1995. Natural Resources Canada. Wootton, JC, Feng X, Ferdig MT, et al. (2002) Genetic diversity and chloroquine selective sweeps in Plasmodium falciparum. Nature, 418, 320-323. Wordley C, Slate J, Stapley J (2011) Mining online genomic resources in Anolis carolinensis facilitates rapid and inexpensive development of cross-species microsatellite markers for the Anolis lizard genus. Molecular Ecology Resources, 11, 126-133. Worley K, Carey J, Veitch A, Coltman DW (2006) Detecting the signature of selection on immune genes in highly structured populations of wild sheep {Ovis dalli). Molecular Ecology, 15,623-37. Wulder MA White JC, Grills D, et al. (2009) Aerial overview survey of the mountain pine beetle epidemic in British Columbia : Communication of impacts. BC Journal of Ecosystems and Management, 10, 45—58. Wygant ND (1940) Effects of low temperature on the Black Hills beetle {Dendroctonus ponderosae Hopkins). Ph.D. Dissertation, State College of New York, Syracuse, NY. in Safranyik & Linton (1998) YamaokaY, HiratsukaY, and Maruyama PJ (1995) The ability of Ophiostoma clavigerum to kill mature lodgepole pine trees. European Journal of Forest Pathology, 25, 401^104. Yatabe Y, Kane NC, Scotti-Saintagne C, Rieseberg LH (2007) Rampant gene exchange across a strong reproductive barrier between the annual sunflowers, Helianthus annuus and H. petiolaris. Genetics, 175, 1883-1893. Young LJ, Hammock EA (2007) On switches and knobs, microsatellites and monogamy. Trends in Genetics, 23, 209-212. 
Appendix

Table A.1. The composition of di- and trinucleotide repeats found among 14441 contigs in build 8 of the MPB EST database.

Repeat Motif      Total Number Found
Dinucleotide
  AT              560
  TA              420
  TG/CA           349
  AC/GT           294
  TC/GA           221
  AG/CT           210
  GC              44
  CG              25
Trinucleotide
  TAA/TTA         52
  AAT/ATT         51
  AAG/CTT         51
  TGA/TCA         48
  ATA/TAT         44
  ATC/GAT         38
  TTC/GAA         38
  ATG/CAT         36
  TTG/CAA         35
  ACA/TGT         27
  TCC/GGA         26
  AGA/TCT         25
  TGG/CCA         25
  CAG/CTG         25
  GAG/CTC         24
  AGG/CCT         22
  TAC/GTA         22
  TGC/GCA         22
  AAC/GTT         21
  GTG/CAC         21
  AGC/GCT         20
  ACC/GGT         19
  TCG/CGA         12
  AGT/ACT         11
  GAC/GTC         10
  ACG/CGT         9
  GGC/GCC         9
  TAG/CTA         8
  CGG/CCG         3
  GCG/CGC         2

Table A.2. The PCR primers of monomorphic EST-derived microsatellite markers for Dendroctonus ponderosae.

Primer Name    M13-Tailed Primer¹            Reverse Primer
MPBC5_1418     AATTGCCACCGTCATTATCC          ATTGGCTGGAAAAACACCTG
MPBC5_5860     CATCGATACGCAATTCACAA          CCCTGATTGCTATGCCACTT
MPBC5_6396     ATTTGGCTTGCAGTTGATTC          CACGCGGATTGGACTAGATT
MPBC5_8974     TCATGTTCACGCACAAAACA          GGAACTGGGCAGCAAGTAAA
MPBC6_1320     GCACATATACATGCAAGACATTCA      CGAAAAAGGAAAGTGCCAAA
MPBC7_2797     GCTAACAAACCTGCCGACAT          TGCCTAAGAATTGGCTAGGG
MPBC7_8439     TGAAGTCATTTCGCTGAACG          AGAGAAGCTTTCGTGCCTCA
MPBC7_9712     TGCCCAGAAAAATGTGTCCT          AAGGGCCAAGGAGTGAAATC
MPBC7_10522    GGCAATCCAACCGAGTATGT          TGTGATGGAAGAACCCATGA
MPBC8_3        CCCTTCTCCCACCACTAACA          TTCATTCCCTCCTGCACTTC
MPBC8_90       GGCTAACAACACTGCCCACT          CCGCAAAAGCACACTAGCTT
MPBC8_277      GAACAGGTTCCAGTGGGTGT          GAGAACGTGGTGGGCTTTAG
MPBC8_353      CAAAGAACCCGTTTTCTGGA          GTGCTTCGCCTTAAGAATCG
MPBC8_853      CCATCTCCATCAGCCCTAGA          GAAGTGGCCGATGAAATGTT
MPBC8_2065     GTTGAACTACCTCCCGTCCA          CCTTCCCTTGACTCTGTTCG
MPBC8_2085     GATATCCATGTCCGCCAAAC          ATGCCCAGTCATCTGACCTC
MPBC8_7377     TCAGCCTTTTCCTTTCCAGA          CTATCTCCTTTGCCCCGATT
MPBC8_7519     GGGATCGCAACCAAACAG            CGCTTTGGTCAGCTTTTTCT
MPBC8_8887     AATTGCGTTTTCTCCCATCA          TTTTTGCGGGTTTAAATCTAGG
MPBC8_9052     ATTTAAACCACTGTTAGTACA         TGTATGGGACCAGTTGGTGA
MPBC8_9623     TTTTTCGACAACATAGCTTTA         GATCTTGAAAGGCAGGTGGA
MPBC8_10732    AACATGAACTGAAAAGCCATTG        GCTTATTTGCCAACGTCAAAC
MPBC8_11497    CGTGAGCGCTTAAAGTGATG          GCTTCGGTGACGTAAAAAGG
MPBC8_11671    GACAGTTGCCACAACCAGTG          TCAGTCGAACGAAAACCAAA
MPBC8_11724    TTTTTGAGTGATGTTTCTTGGA        CCGATTCAGATTCTAGTGATGATG
MPBC8_11887    TTTTTCGCTTTGTCCATAAAA         AGTTACGCTTTTGCGCTGAT
MPBC8_12559    CTCATTTCGGACGAGAAAGC          AAAAACTGCCGCCAGAACTA
MPBC8_14233    TGGGATTTTTATGAAATTAACACATT    CCACCAATTTCAGGAGGAAA
MPBC8_14256    CGGGGATTTAAGAAGCGAGA          GGACTGCCATTTCCATCTGT

¹M13 sequence (TGTAAAACGACGGCCAGT) was added to the 5' end of M13-tailed primers.

Table A.3. Genotypic data of 16 MPB from the Quesnel sampling location at 50 polymorphic EST-derived microsatellite markers.
            Locus
Beetle #    MPBC5_811   MPBC5_7119   MPBC5_6823   MPBC5_7419   MPBC5_73   MPBC5_6124
1           280 280     128 131      246 246      203 203      228 228    221 224
2           287 284     128 131      246 246      200 203      228 228    224 227
3           287 287     128 131      246 246      200 203      228 228    224 227
4           281 281     128 128      246 246      203 203      228 228    221 224
5           260 287     125 131      246 246      200 200      228 228    221 224
6           284 287     128 128      246 246      203 203      228 228    224 227
7           287 287     128 128      246 246      200 203      228 228    227 227
8           284 287     128 128      246 246      200 200      225 225    221 221
9           281 287     128 131      246 246      191 203      228 228    221 224
10          281 281     131 128      246 246      200 203      228 228    221 224
11          281 284     131 131      246 246      200 200      228 228    227 227
12          260 260     128 128      246 246      200 203      228 228    224 224
13          281 281     131 131      246 246      200 209      228 228    224 227
14          281 284     128 128      246 246      200 204      225 225    224 227
15          284 284     128 131      246 246      191 203      228 228    221 227
16          284 284     128 131      246 246      200 209      228 228    224 227

Table A.3. continued

            Locus
Beetle #    MPBC5_4313   MPBC5_4357   MPBC5_1480   MPBC6_1403   MPBC6_675   MPBC6_656
1           261 264      243 243      260 272      196 198      177 177     280 280
2           267 267      243 243      260 260      200 200      174 177     268 268
3           264 267      245 245      260 260      200 200      177 180     268 274
4           267 267      243 243      260 260      198 200      180 180     268 271
5           267 267      237 237      260 260      196 200      177 177     259 268
6           267 267      245 245      260 260      196 200      174 177     268 268
7           267 267      237 245      260 260      196 198      177 177     268 268
8           267 267      243 243      260 260      196 198      174 174     265 268
9           267 267      235 235      256 260      200 200      174 174     259 262
10          267 267      235 239      260 260      196 200      171 171     259 268
11          267 267      235 239      260 260      200 202      174 174     259 268
12          267 267      235 241      260 260      198 202      171 171     259 262
13          267 267      239 239      260 260      198 198      171 174     268 274
14          267 267      239 239      260 260      198 198      171 171     259 268
15          261 264      235 239      260 260      198 198      171 174     268 268
16          267 267      235 239      260 260      198 198      171 171     268 268

Table A.3.
continued Locus Beetle # 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 MPBCC; 4i4iM P B C 6 3837 173 173 173 175 173 173 173 173 173 173 173 173 173 169 173 173 173 173 173 175 173 175 175 175 173 175 173 173 175 169 173 173 1 173 182 173 176 185 173 182 176 173 179 173 176 173 176 173 167 173 182 173 179 185 179 176 182 173 179 185 176 176 176 185 167 M P B C 6 7245 M P B C 6 1504 M P B C 6 4141 2 M P B C 6 893 227 227 227 224 212 227 212 227 227 212 227 227 227 227 212 227 214 214 214 220 214 214 214 214 214 214 214 214 220 214 214 214 239 243 243 243 243 243 243 243 239 243 243 243 243 243 243 243 161 161 161 161 163 161 161 161 165 165 165 165 165 165 163 163 227 227 227 227 227 227 227 233 227 224 227 227 224 230 227 227 220 214 214 220 214 214 214 220 214 214 220 220 220 214 214 220 239 243 243 247 243 247 243 243 243 243 243 243 243 247 243 243 Table A.3. continued Locus Beetle # 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 M P B C 6 655 M P B C 7 24 M P B C 7 101 276 276 270 276 267 276 276 276 267 267 267 267 276 276 276 270 180 182 182 182 180 182 182 184 182 182 186 184 184 184 184 184 184 184 187 187 187 184 187 187 187 196 187 184 190 190 187 178 276 276 276 276 276 276 276 276 270 276 276 267 282 282 276 276 180 182 184 184 184 182 184 184 182 182 186 186 186 186 186 186 184 196 187 187 187 187 187 187 190 196 190 196 190 190 187 178 M P B C 7 1578 191 191 191 191 191 194 191 194 191 197 194 194 194 197 188 188 194 194 194 194 194 194 191 194 191 197 197 197 197 197 188 188 M P B C 7 1284 MPBC7' 11362 235 235 237 235 237 233 235 237 235 235 235 235 235 233 233 233 200 168 168 168 168 168 168 168 178 178 178 178 178 180 178 178 235 235 237 237 237 235 235 237 235 235 235 235 235 235 235 235 200 168 168 180 168 200 168 168 178 178 178 184 178 180 178 176 161 161 161 161 163 161 161 161 165 165 165 165 165 165 165 165 Table A.3. 
continued Locus Beetle # 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 M P B C 7 1771 193 194 194 193 193 193 196 193 195 195 195 194 195 195 195 198 196 196 196 193 193 193 196 194 195 195 198 195 198 195 198 198 M P B C 7 548 MPBC7' 12514 M P B C 8 6649 M P B C 8 9385 M P B C 8 5651 283 283 280 280 283 283 283 283 286 283 286 286 283 286 283 280 232 232 232 232 238 232 232 232 246 238 244 244 244 244 238 246 275 281 275 275 281 275 275 275 275 281 281 275 281 275 275 275 194 198 194 194 194 198 198 198 198 198 194 194 198 194 198 198 283 286 283 283 283 283 286 286 286 286 289 286 286 289 283 280 232 238 232 238 238 238 232 238 246 238 244 244 244 244 244 246 315 315 321 315 315 321 318 318 315 315 315 315 318 318 318 318 321 315 324 315 315 324 321 318 321 321 321 315 318 318 318 318 281 281 281 275 281 275 275 281 275 281 281 281 281 281 281 281 198 198 198 198 198 198 198 198 198 194 198 198 198 194 198 198 Table A3, continued Locus Beetle # 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 M P B C 8 4511 MPBra ! 12050 M P B C 8 3135 403 403 403 397 400 403 403 397 403 400 403 406 385 397 406 397 277 277 281 279 277 277 277 279 277 277 277 277 277 279 277 279 403 403 403 403 406 403 403 397 403 403 403 409 397 397 409 397 277 277 281 279 277 281 279 281 277 277 281 277 281 279 277 279 323 316 323 323 316 323 323 321 321 321 314 314 321 321 314 321 323 321 323 323 321 323 323 321 321 323 314 321 321 321 321 321 M P B C 8 3807 338 364 364 340 340 340 340 340 338 338 338 338 338 342 338 338 342 364 364 340 340 340 340 340 342 338 338 342 338 342 342 338 MPBC*5 11376 MPBCS! 
12235 218 210 348 348 222 222 348 342 222 222 348 348 222 222 348 342 210 218 348 348 218 220 348 348 222 220 348 348 242 240 348 348 342 210 218 348 222 218 348 348 222 222 342 348 210 218 348 348 210 218 348 348 220 348 218 348 222 348 220 348 242 240 348 348

[Table A.3, continued: the genotypes at the remaining loci are illegible in the source.]

Table A.4. Genotypic data of MPB from six sampling locations in western Canada at five chosen EST-derived microsatellite markers. Sample locations are Houston (HO), Mackenzie (MA), Grande Prairie (GP), Whistler (WH), Banff (BA), and Nancy Green (NG).
Beetle sample  Locus
               24          4357        884         6823        675
HO01           184 184     237 245     173 173     246 246     177 177
HO02           184 184     237 237     170 173     246 246     177 177
HO06           180 180     237 237     164 170     246 246     177 177
HO09           184 184     243 245     170 170     246 249     177 177
HO10           184 184     237 245     170 170     246 246     177 177
HO13           182 182     237 243     170 173     246 246     177 177
HO14           184 184     245 245     170 170     246 246     177 177
HO15           184 184     245 245     170 170     246 246     177 177
HO19           184 184     245 245     170 170     246 246     177 177
HO26           182 184     245 245     170 173     246 246     177 177
HO31           184 184     245 249     164 170     246 246     177 180
HO37           184 186     243 243     170 170     246 246     177 177
HO40           184 184     237 243     170 170     246 246     177 177
HO41           184 184     237 243     170 170     246 246     177 177
HO42           184 184     237 237     170 170     246 246     177 177
HO44           182 184     237 243     170 170     246 249     177 177
HO45           182 184     245 245     170 173     246 246     177 177
HO47           182 182     243 243     170 173     246 246     177 180
HO49           180 184     237 237     170 170     246 246     177 177
HO50           184 184     243 245     170 170     246 249     177 177
HO51           184 184     245 245     170 170     246 246     177 177

MA04           182 186     243 243     170 170     246 246     177 177
MA05           184 184     237 243     170 170     246 246     177 177
MA06           184 186     237 243     170 170     246 246     177 177
MA09           182 184     245 245     170 170     246 246     177 177
MA10           182 184     237 249     170 170     246 246     177 177
MA13           184 186     243 243     170 170     246 246     177 177
MA14           184 186     243 245     170 170     246 246     177 177
MA17           182 184     245 245     170 170     246 246     177 177
MA23           184 184     237 243     173 173     246 246     177 177
MA25           180 182     237 243     170 170     246 246     177 177
MA27           184 184     243 245     170 176     246 246     177 177
MA33           184 184     245 247     170 173     246 246     177 177
MA34           184 184     245 245     170 170     246 246     177 177
MA36           184 184     243 243     170 170     246 246     177 177
MA40           184 184     237 237     170 170     246 246     177 177
MA41           166 166     245 243     158 158     246 246     177 177
MA45           162 166     243 243     170 173     246 246     177 177
MA47           184 184     243 245     170 173     246 246     177 177
MA49           192 192     237 243     161 170     246 246     177 177
MA50           184 186     243 245     170 170     246 246     177 177

GP01           184 184     243 243     170 170     246 246     177 177
GP02           184 184     243 243     170 170     246 246     177 183
GP03           182 182     247 247     170 164     246 246     177 177
GP04           184 184     243 245     173 173     246 246     174 177
GP05           182 184     243 243     170 173     246 246     177 180
GP06           182 184     237 245     170 170     246 246     177 177
GP07           184 184     243 245     170 170     246 246     177 180
GP08           184 184     245 245     170 173     246 249     177 177
GP10           184 186     243 243     170 170     246 246     177 183
GP13           182 184     249 249     170 170     246 246     177 177
GP14           182 184     245 249     170 173     246 246     177 177
GP15           182 182     237 237     170 170     246 246     177 177
GP16           184 186     243 243     170 170     246 246     177 177
GP17           184 184     243 243     170 170     246 246     177 177
GP18           184 184     243 243     170 170     246 246     177 177
GP19           184 184     243 245     170 173     246 246     177 177
GP20           184 184     243 245     170 170     246 246     177 180
GP21           184 184     237 237     170 170     246 246     177 177
GP22           182 184     243 243     170 170     246 246     177 177
GP24           184 184     243 243     170 170     246 246     177 177
GP30           180 184     237 237     170 170     246 246     177 177
GP32           184 184     237 237     170 170     246 246     177 177
GP34           184 184     237 237     170 170     246 246     177 177

WH01           180 182     237 245     170 173     243 246     177 177
WH03           182 184     243 245     170 170     246 246     159 174
WH06           184 184     237 237     173 173     246 246     174 183
WH09           184 184     243 243     173 173     246 246     171 174
WH10           182 184     245 245     170 173     246 246     171 174
WH11           182 186     243 243     173 173     246 246     171 177
WH15           180 182     237 245     170 173     246 246     177 177
WH16           180 180     243 245     170 173     246 246     174 174
WH18           182 182     237 245     170 173     246 246     174 174
WH19           184 184     243 243     170 173     246 246     171 174
WH28           180 184     237 245     170 170     246 246     174 174
WH30           180 184     245 247     170 170     246 246     177 162
WH34           184 184     243 243     173 173     246 246     177 177
WH35           182 184     237 247     173 173     246 246     174 174
WH37           182 184     237 245     170 173     246 246     171 171
WH43           186 186     237 237     170 173     246 246     174 174
WH48           184 184     243 247     170 173     246 246     174 174
WH54           184 184     237 243     170 173     246 246     171 171
WH55           184 186     237 245     170 173     246 246     171 171
WH56           182 184     237 243     173 173     246 246     177 177
WH57           184 184     237 245     173 176     246 249     177 177
WH59           182 184     237 243     170 173     246 246     174 174

BA13           180 182     237 243     173 173     246 246     174 174
BA15           180 182     245 245     170 173     246 246     174 180
BA17           182 184     237 237     170 173     246 246     174 174
BA19           180 182     237 243     164 173     246 246     174 174
BA23           182 184     243 245     170 173     246 246     171 174
BA26           184 186     247 247     173 173     246 246     174 174
BA27           184 184     237 240     170 170     246 246     174 174
BA30           184 184     243 243     170 173     246 246     174 174
BA31           184 184     237 237     170 170     246 246     174 174
BA33           180 182     243 245     170 170     246 246     174 180
BA35           180 184     245 245     170 173     246 246     174 177
BA36           184 184     237 245     170 173     246 246     174 174
BA38           180 184     243 245     164 170     246 246     174 174
BA39           182 184     237 237     170 173     246 246     174 174
BA41           182 184     237 245     173 173     246 246     174 174
BA44           182 182     237 245     170 173     246 246     174 174
BA45           180 182     245 245     170 170     246 246     174 174
BA47           180 186     245 245     164 170     246 246     174 174
BA49           184 184     243 247     164 173     246 246     174 174
BA50           184 184     243 243     170 170     246 246     174 177
BA52           184 184     237 243     173 173     246 246     174 174
BA55           184 184     245 247     167 173     246 246     174 174
BA60           180 182     237 243     170 173     246 246     174 174
BA63           184 186     243 245     170 170     246 246     174 174

NG01           182 184     245 245     170 170     246 246     174 186
NG02           180 182     237 237     170 170     246 246     174 174
NG03           184 186     237 245     170 173     246 246     174 177
NG06           184 184     237 245     170 173     246 246     174 174
NG11           180 182     237 243     173 173     246 246     177 177
NG12           182 184     243 243     170 173     246 246     177 177
NG13           180 184     243 245     164 173     246 246     174 177
NG14           182 184     237 237     173 173     246 246     177 177
NG15           182 182     243 245     170 170     246 246     174 177
NG16           182 184     237 249     170 170     246 254     174 174
NG20           180 184     243 243     170 170     246 246     174 177
NG23           184 186     245 245     170 173     254 254     174 177
NG24           182 182     243 243     170 173     246 246     174 177
NG26           180 184     245 245     170 170     246 246     174 174
NG28           182 182     243 243     173 173     246 246     171 177
NG30           182 184     237 245     167 170     246 246     174 174
NG32           182 184     245 245     170 170     246 246     174 177
NG34           182 184     243 245     170 173     246 246     174 174
NG41           180 184     241 241     170 173     246 246     174 177
NG42           184 184     243 245     170 170     246 246     177 177
NG49           182 182     243 243     170 173     246 246     174 177
NG52           184 184     245 245     170 173     246 246     174 174