Evolutionary history and diversity of human-specific FAM72A paralogs
Metadata
Author
Kisselev, Ilya
Date
2023-12-14
Citation
Kisselev, Ilya. Evolutionary history and diversity of human-specific FAM72A paralogs: insights from population genetics; A thesis submitted to the Faculty of Graduate Studies in partial fulfillment of the requirements for the Master of Science in Bioscience, Technology and Public Policy. Winnipeg, Manitoba, Canada: The University of Winnipeg, 2023. DOI: 10.36939/ir.202312181420.
Abstract
Gene duplication is a key driver of genetic diversity and adaptation, allowing genomes to develop complexity and redundant sequences that evolve along different trajectories. In human evolution, gene duplication played an important role: since divergence from the common ancestor with chimpanzees, humans have gained approximately 75 lineage-specific genes, influencing brain development, dietary adaptation, and immune regulation. The FAM72 gene family, with four paralogs (FAM72A-D) that arose after human-chimpanzee divergence, illustrates this process.
The evolutionary history and function of the FAM72 paralogs remain poorly described. The ancestral FAM72A protein drives early stages of somatic hypermutation in B cells by antagonizing UNG2. However, the FAM72C-D paralogs carry a Trp125Arg amino acid substitution that prevents them from interacting with UNG2. I hypothesize that after the initial duplication from FAM72A to FAM72B, FAM72B duplicated to give rise to FAM72C and FAM72D. I further hypothesize that opposing selective forces operate on the FAM72A-B and FAM72C-D paralog pairs. A third hypothesis is that population-specific exposure to local environments during human evolution has driven the selection of population-specific adaptive haplotypes of FAM72A paralogs. The study used the 1000 Genomes dataset to test for selection with neutrality metrics and haplotype-based scores, and investigated functional divergence by comparing conserved amino acid sites and gene-wide linkage disequilibrium (LD) patterns across human populations. Bayesian divergence time estimation between FAM72 paralogs was performed using the most common haplotypes in humans and chimpanzees. The phylogenetic analysis supported the hypothesized sequence of duplication events. The neutrality metrics identified FAM72C as recovering from a selective sweep, while the other paralogs showed no signals of positive selection. Integrated haplotype scores for FAM72D suggested a recent selective sweep in African populations, and FAM72A-B showed high conservation. LD analysis highlighted functional regions: FAM72A and FAM72B share active LD-enriched promoters, while FAM72C contains an active enhancer linked to immune cell function. Finally, multiple signatures of balancing selection were observed in an intronic region of FAM72C. The results suggest neutral or relaxed selection for FAM72A-B, but purifying selection following a selective sweep for FAM72C-D.
The divergence of paralog pairs is evident in regulatory and functional shifts, notably with FAM72C’s unique immune cell associations. No clear signs of population-specific adaptation were identified, but FAM72B shows distinct haplotypes between East Asian and South Asian populations, hinting at either population bottlenecks or adaptive evolution. The findings show how gene duplication within the FAM72 gene family has contributed to genetic diversity and potential adaptability, with some members potentially shaping the evolutionary trajectory of immune function in human populations.
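As an illustration of the kind of neutrality metric mentioned in the abstract, the following is a minimal sketch of Tajima's D computed from aligned haplotype strings. The abstract does not name the specific metrics used, so the choice of Tajima's D is an assumption, and the function name `tajimas_d` and the toy haplotypes are hypothetical and not taken from the thesis or the 1000 Genomes data.

```python
from collections import Counter
from math import sqrt


def tajimas_d(haplotypes):
    """Tajima's D for equal-length, aligned haplotype strings (illustrative only)."""
    n = len(haplotypes)
    if n < 4:
        raise ValueError("need at least 4 haplotypes for a usable variance estimate")
    if len({len(h) for h in haplotypes}) != 1:
        raise ValueError("haplotypes must be aligned to the same length")

    seg_sites = 0        # S: number of segregating (polymorphic) sites
    pairwise_diffs = 0   # total differences summed over all haplotype pairs
    for column in zip(*haplotypes):
        counts = Counter(column)
        if len(counts) > 1:
            seg_sites += 1
            # number of haplotype pairs carrying different alleles at this site
            pairwise_diffs += (n * n - sum(c * c for c in counts.values())) // 2
    if seg_sites == 0:
        raise ValueError("no segregating sites; D is undefined")

    # pi: mean number of pairwise differences
    pi = pairwise_diffs / (n * (n - 1) / 2)

    # constants from Tajima (1989)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


# Hypothetical toy data: '0' = ancestral allele, '1' = derived allele.
singletons = ["1000", "0100", "0010", "0001"]   # every variant is rare
balanced = ["11", "11", "00", "00"]             # variants at intermediate frequency

d_rare = tajimas_d(singletons)
d_balanced = tajimas_d(balanced)
```

In this toy input, an excess of rare variants gives a negative D (the pattern expected after a selective sweep or population expansion), while variants at intermediate frequency give a positive D (the pattern expected under balancing selection). Real analyses would operate on phased genotype calls rather than short strings.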