- May 2018
MITF binding site
A locus in DNA where the Melanogenesis Associated Transcription Factor (MITF) binds to exert its effects on gene expression.
MITF earned its name because it is a transcription factor associated with pigmentation. So, MITF binding sites are likely to be near genes involved in pigmentation.
The F<sub>ST</sub> is the fixation index, which describes genetic differentiation between populations by comparing the genetic variation within populations to the variation between them.
An F<sub>ST</sub> value close to 1.0 indicates that the populations being compared are highly divergent (very different) from one another.
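One way to make this concrete, for a single locus with two alleles, is Wright's formulation F<sub>ST</sub> = (H<sub>T</sub> - H<sub>S</sub>) / H<sub>T</sub>, where H<sub>S</sub> is the average expected heterozygosity within populations and H<sub>T</sub> is the expected heterozygosity of the pooled population. The sketch below assumes equally sized populations; the function names are made up for illustration, and it is not necessarily the estimator used in the paper.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity 2p(1-p) at a biallelic locus with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst(allele_freqs):
    """Wright's F_ST for one biallelic locus, assuming equally sized populations.

    allele_freqs: frequency of the same allele in each population.
    """
    n = len(allele_freqs)
    if n < 2:
        raise ValueError("need at least two populations")
    # Mean heterozygosity within populations
    h_s = sum(expected_heterozygosity(p) for p in allele_freqs) / n
    # Heterozygosity of the pooled population
    p_bar = sum(allele_freqs) / n
    h_t = expected_heterozygosity(p_bar)
    if h_t == 0.0:
        return 0.0  # the locus does not vary at all
    return (h_t - h_s) / h_t
```

With identical allele frequencies in both populations the function returns 0; with the allele fixed in one population and absent from the other it returns 1.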
A lysosome is an organelle in animal cells where unwanted material gets digested. A lysosomal protein is a molecule that performs its function inside (or on the surface of) the lysosome.
N. G. Jablonski, G. Chaplin, The evolution of human skin coloration. J. Hum. Evol. 39, 57–106 (2000). doi:10.1006/jhev.2000.0403 pmid:10896812
The work of Jablonski and Chaplin strongly supports the theory that melanin pigmentation in human skin is an adaptation to regulate the amount of UV radiation that gets into the epidermis, with different populations having different pressures based on their environments.
They argue that protection against UV-induced breakdown of nutrients (such as folate) was the primary selective agent that led to darker pigmentation of people living near the equator, because folate is closely linked to reproductive success in humans.
They also argue that skin pigmentation is so responsive to environmental conditions that it is not valuable when assessing the genetic relatedness of human groups.
Populations at lower latitudes have darker pigmentation than populations at higher latitudes, suggesting that skin pigmentation is an adaptation
The work of Jablonski and Chaplin strongly supports the theory that melanin pigmentation in human skin is an adaptation that helps regulate the amount of UV radiation that gets into the epidermis, with different populations having different selective pressures based on their environments.
They argue that protection against UV breakdown of nutrients (such as folate) was the primary selective agent that led to darker pigmentation of people living near the equator, because folate is closely linked to reproductive success in humans.
Jablonski and Chaplin also argued that skin pigmentation is so responsive to environmental conditions that it is not valuable when assessing the genetic relatedness of human groups.
- Apr 2018
A GWAS analysis with linear mixed models, controlling for age, sex, and genetic relatedness (9), identified four regions with multiple significant associations
The authors do a Genome-Wide Association Study (GWAS) analysis, which is a study of whether any particular loci in the genome are associated with a particular phenotype (in this case, skin pigmentation levels).
GWAS analyses look at variants (alleles) across the whole genome, rather than focusing only on regions that are thought to be associated with a phenotype. This way, new regions of the genome can be discovered as being involved in the genetic architecture of a trait.
The authors control for age, sex, and genetic relatedness, because these are variables that could confound (confuse) the results by affecting the phenotype of interest, in this case skin pigmentation. To avoid confusion, the authors take these variables into account when they look for associations between SNPs and skin pigmentation.
using the Illumina Infinium Omni5 Genotyping array
The authors used the Illumina Infinium Omni5 Genotyping array, a specific type of SNP array.
The array has a set of oligos (short stretches of single-stranded DNA) on a chip, each of which is complementary to the DNA right next to a SNP of interest. DNA taken from the person being genotyped gets broken up into small pieces and put on the chip, where it will stick to the complementary oligo.
Next, nucleotides (A, C, G, T) labeled with different color dyes get put on the chip and will bind to the DNA that is attached to the oligos. A scanner reads what color is found at each spot on the chip, which tells the investigator what nucleotide the person has at that locus—which is also known as what allele the person has.
Of these, there is limited knowledge about loci that affect pigmentation in populations with African ancestry (6, 7).
Two different research groups used the same population to look for genetic determinants of both skin and eye pigmentation in a population of people in Cape Verde. In this population, there has been extensive admixture (interbreeding) between European and African populations.
OCA2 was identified as being related to both skin and eye pigmentation, and another overlapping variant with this study, SLC24A5, was identified as being associated with skin pigmentation. Further work identified two additional genes associated with skin pigmentation; one (DDB1) was also a candidate gene in this paper.
only a subset of these genes have been linked to normal variation in humans (5)
Liu et al. perform a GWAS study similar to those performed in this paper, looking for genetic variants that correlate with skin pigmentation in a population of European people.
They identify 9 genes that may be associated with skin pigmentation in Europeans, one of which (HERC2/OCA2) overlaps with the genes identified in this study as being associated with skin pigmentation in Africans.
Nonhuman animals are used to study, or model, diseases and to understand the role of genes or proteins in processes such as skin pigmentation. Thanks to shared ancestry, there are many similarities between humans and nonhuman animals at the cellular and genetic levels. Thus, animals can be used to do experiments that can't be done in people.
Structure, shape, appearance (as opposed to function).
Cells in the basal layer of the epidermis that form a barrier against environmental damage.
A cell that produces melanin.
Defining the role of common variation in the genomic and biological architecture of adult human height. Nat. Genet. 46, 1173–1186 (2014). doi:10.1038/ng.3097 pmid:25282103
Wood et al. analyze multiple independent studies to identify the SNPs that are most strongly associated with adult height (in Europeans).
They identify thousands of variants that are associated with height, and even with the top ~9500 SNPs they can only explain ~29% of height variance. Like skin pigmentation, height is a complex trait (a trait controlled by more than one gene), so it is interesting to compare the number of variants associated with height to the number of variants associated with skin pigmentation.
R. Yu, R. Broady, Y. Huang, Y. Wang, J. Yu, M. Gao, M. Levings, S. Wei, S. Zhang, A. Xu, M. Su, J. Dutz, X. Zhang, Y. Zhou, Transcriptome analysis reveals markers of aberrantly activated innate immunity in vitiligo lesional and non-lesional skin. PLOS ONE 7, e51040 (2012). doi:10.1371/journal.pone.0051040 pmid:23251420
Yu et al. compare gene expression (as well as immune cell presence) in lesional (lacking melanin) and nonlesional (containing melanin) skin of vitiligo patients. They identify 17 genes that are expressed differently in the two kinds of skin patch, even though the patches are on the same person and share the same genetic background.
The authors of the current paper look at expression of one of the genes identified in Yu et al., MFSD12, because they identify SNPs near and in this gene that are associated with skin pigmentation in their study.
S. Mallick, H. Li, M. Lipson, I. Mathieson, M. Gymrek, F. Racimo, M. Zhao, N. Chennagiri, S. Nordenfelt, A. Tandon, P. Skoglund, I. Lazaridis, S. Sankararaman, Q. Fu, N. Rohland, G. Renaud, Y. Erlich, T. Willems, C. Gallo, J. P. Spence, Y. S. Song, G. Poletti, F. Balloux, G. van Driem, P. de Knijff, I. G. Romero, A. R. Jha, D. M. Behar, C. M. Bravi, C. Capelli, T. Hervig, A. Moreno-Estrada, O. L. Posukh, E. Balanovska, O. Balanovsky, S. Karachanak-Yankova, H. Sahakyan, D. Toncheva, L. Yepiskoposyan, C. Tyler-Smith, Y. Xue, M. S. Abdullah, A. Ruiz-Linares, C. M. Beall, A. Di Rienzo, C. Jeong, E. B. Starikovskaya, E. Metspalu, J. Parik, R. Villems, B. M. Henn, U. Hodoglugil, R. Mahley, A. Sajantila, G. Stamatoyannopoulos, J. T. S. Wee, R. Khusainova, E. Khusnutdinova, S. Litvinov, G. Ayodo, D. Comas, M. F. Hammer, T. Kivisild, W. Klitz, C. A. Winkler, D. Labuda, M. Bamshad, L. B. Jorde, S. A. Tishkoff, W. S. Watkins, M. Metspalu, S. Dryomov, R. Sukernik, L. Singh, K. Thangaraj, S. Pääbo, J. Kelso, N. Patterson, D. Reich, The Simons Genome Diversity Project: 300 genomes from 142 diverse populations. Nature 538, 201–206 (2016). doi:10.1038/nature18964 pmid:27654912
This report describes the data set obtained by the Simons Genome Diversity Project (SGDP), which contains genome data from 300 individuals from 142 diverse populations, and reveals more features of human genetic variation. The SGDP focused on smaller populations than the 1000 Genomes Project.
1000 Genomes Project Consortium, A global reference for human genetic variation. Nature 526, 68–74 (2015). doi:10.1038/nature15393 pmid:26432245
The 1000 Genomes Project was a massive effort toward understanding human genetic variation. This report describes the distribution of genetic variation across over 2500 individuals from 26 populations.
For each individual, data are available from whole-genome sequencing and from deeper sequencing of the exome (to identify rare variants in coding genes); for some individuals, SNP microarray data are also available.
F. Liu, M. Visser, D. L. Duffy, P. G. Hysi, L. C. Jacobs, O. Lao, K. Zhong, S. Walsh, L. Chaitanya, A. Wollstein, G. Zhu, G. W. Montgomery, A. K. Henders, M. Mangino, D. Glass, V. Bataille, R. A. Sturm, F. Rivadeneira, A. Hofman, W. F. J. van IJcken, A. G. Uitterlinden, R.-J. T. S. Palstra, T. D. Spector, N. G. Martin, T. E. C. Nijsten, M. Kayser, Genetics of skin color variation in Europeans: Genome-wide association studies with functional follow-up. Hum. Genet. 134, 823–835 (2015). doi:10.1007/s00439-015-1559-0 pmid:25963972
Liu et al. perform a GWAS study similar to those performed in this paper, looking for genetic variants that correlate with skin pigmentation in a population of European people.
They identify 9 genes that may be associated with skin pigmentation in Europeans, one of which (HERC2/OCA2) overlaps with the genes identified in this current paper as being associated with skin pigmentation in Africans.
The building or breaking down of pyrimidines in cells—pyrimidines include the DNA/RNA bases uracil, thymine, and cytosine.
mouse phenotype database
A database of results from published studies related to mouse phenotypes.
When one gene (or variant of a gene) has multiple effects on seemingly unrelated traits.
These observations are consistent with the hypothesis that darker pigmentation is a derived trait that originated in the genus Homo within the past ~2 million years after human ancestors lost most of their protective body hair, though these ancestral hominins may have been moderately, rather than darkly, pigmented
The authors conclude that darker skin pigmentation possibly appeared ~2 million years ago, in the genus Homo (which we, as Homo sapiens, are part of), and that this pigmentation may have arisen as our ancestors lost their protective body hair, which would have previously protected them from some of the harmful effects of the sun.
MC1R, which is under purifying selection in Africa
Harding et al. sequence the MC1R gene in individuals from Europe and Africa, and by a variety of statistical tests they determine that the gene is under purifying selection (selective removal of alleles that are deleterious, in this case, alleles that lead to reduced melanin) in Africa, but not Europe.
This observation indicates that the genetic architecture of skin pigmentation is simpler (i.e., fewer genes of stronger effect) than other complex traits, such as height
Wood et al. analyze multiple independent studies to identify the SNPs that are most strongly associated with adult height (in Europeans).
They identify thousands of variants that are associated with height, and even with the top ~9500 SNPs they can only explain ~29% of height variance.
Their work also suggests that increasing the number of individuals in a GWAS analysis will initially suggest new variants, but will eventually reach a point of "saturation" where new variants will highlight the same genes that have already been seen.
To estimate the proportion of pigmentation variance explained by the top eight candidate SNPs at SLC24A5, MFSD12, DDB1/TMEM138 and OCA2/HERC2, we used a linear mixed model with two genetic random effect terms, one based on the genome-wide kinship matrix, and the other based on the kinship matrix derived from the set of significant variants.
The authors estimate how much pigmentation variation can be explained by their top predicted variants.
A mixed model has both fixed effects and random effects. In this case the authors looked at random effect terms based on looking at the whole genome as well as looking at only variants that were identified as significant. The mixed model gives an estimation of how much each of the variants contributes to skin pigmentation in their study.
consensus SOX2 motif
A consensus motif is a pattern of the most frequent nucleotides that are bound by a particular protein, in this case SOX2.
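As a small illustration of how a consensus can be derived, the sketch below takes a set of aligned binding sites and reports the most frequent nucleotide at each position. The sequences are invented for the example and are not real SOX2 binding data; the function name is also just illustrative.

```python
from collections import Counter


def consensus(sites):
    """Return the most frequent nucleotide at each position of aligned, equal-length sites."""
    if len({len(s) for s in sites}) != 1:
        raise ValueError("all sites must be non-empty in number and have the same length")
    # zip(*sites) walks through the alignment one column (position) at a time
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*sites))


# Invented aligned sites, for illustration only
example_sites = ["ACAAAG", "ACAATG", "TCAAAG", "ACATAG"]
print(consensus(example_sites))
```

Real motif analyses usually keep the full table of nucleotide frequencies per position rather than only the single most common base, because some positions tolerate more than one nucleotide.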
A numeric scale that specifies the acidity (low pH) or basicity (high pH) of a solution (or in this case, cell). The number is based on the molar concentration of hydrogen ions in the sample.
The pH of cells affects the activity of many enzymes and other proteins.
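The relationship between pH and hydrogen ion concentration is logarithmic: pH is approximately the negative base-10 logarithm of the molar hydrogen ion concentration. A minimal sketch of that conversion, with illustrative function names:

```python
import math


def ph_from_concentration(h_molar):
    """Approximate pH from the molar concentration of hydrogen ions."""
    if h_molar <= 0:
        raise ValueError("concentration must be positive")
    return -math.log10(h_molar)


def concentration_from_ph(ph):
    """Molar hydrogen ion concentration corresponding to a given pH."""
    return 10.0 ** (-ph)
```

Because the scale is logarithmic, a drop of one pH unit corresponds to a tenfold increase in hydrogen ion concentration.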
affects pigmentation by modulating melanosomal
Bellono et al. identify OCA2 as being essential for eye and skin pigmentation. The gene encodes a protein that is a transmembrane channel in the membranes of melanosomes which allows passage of chloride out of the melanosome, thereby regulating melanosome pH.
Bellono et al. find that this passage of chloride ions out of the melanosome is essential for melanin formation, and changing the function of OCA2 by mutations that have been identified in albinism (a condition characterized by lack of melanin) reduces melanin formation.
chloride transporter protein
A protein that spans the cell membrane, allowing chloride ions (and sometimes other ions) to enter or exit the cell under specific conditions.
Loss of variation in the DNA near a mutation that has increased in the population due to positive selection.
both SNPs interact with the promoters of DDB1 and neighboring genes in MCF-7 cells (46, 47)
Li et al. develop and perform ChIA-PET (Chromatin Interaction Analysis by Paired-End-Tag sequencing), a method to identify regions of chromatin that interact with each other at higher resolution than previous methods.
Using this method in MCF-7 cells (a breast cancer cell line), they identify interactions between the locations of several variants of interest to Crawford et al. and the promoter of the gene DDB1.
When genetic variants are referred to as "tightly linked," it means that they are close together and usually inherited together.
An ovarian follicle is a group of cells inside of the ovary that releases an egg cell during ovulation. These follicles must be maintained for a female mammal to retain fertility.
E3 ubiquitin ligases
An E3 ubiquitin ligase assists in transferring a ubiquitin onto a protein substrate. Ubiquitin is a protein that occurs ubiquitously in mammalian tissues, and is a regulatory molecule that can affect proteins in many ways, including marking them for destruction, causing them to move to a different part of the cell, or altering their ability to interact with other proteins.
These results indicate that mutation of Mfsd12 is responsible for the gray coat color of gr/gr mutant mice, and that loss of Mfsd12 reduces pheomelanin within the hairs of agouti mice.
The authors conclude that the gr/gr "grizzled" mice are gray because of a mutation in Mfsd12, which they identified by sequencing DNA from these mice, and that losing expression of Mfsd12 leads to the same coat color phenotype in wild-type agouti mice.
Parts of a protein that span the cell membrane.
Taken together, these results indicate that MFSD12 plays a conserved role in vertebrate pigmentation. Depletion of MFSD12 increases eumelanin content in a cell-autonomous manner in skin melanocytes, consistent with the lower levels of MFSD12 expression observed in melanocytes from individuals with African ancestry. Since MFSD12 localizes to lysosomes and not to eumelanosomes, this may reflect an indirect effect through modified lysosomal function. By contrast, loss of MFSD12 has the opposite effect on pheomelanin production, reflecting a more direct effect on function of pheomelanosomes, which have a distinct morphology (3), gene expression profile (36), and, like zebrafish pterinosomes, a potentially different intracellular origin from eumelanosomes (37). While disruption of MFSD12 alone accounts for changes in pigmentation, the role of neighboring loci such as HMG20B on pigmentation remains to be explored.
The authors conclude that MFSD12 is important for skin pigmentation in vertebrates. Their data come from humans, mice, and zebrafish, suggesting that this function is conserved across vertebrates.
MFSD12 functions in a cell-autonomous manner in melanocytes, meaning the protein acts on the cell that it is in. As they saw MFSD12 localizing with lysosomes, not eumelanosomes, it doesn't seem to function by directly affecting the production of melanin; rather they suggest that it could be modifying lysosomal function leading to different levels of melanin in the cells.
The authors note that while changes to MFSD12 have an effect on pigmentation, there are other nearby genes that might play a role as well.
CRISPR/Cas9 was used to generate a Mfsd12 null allele in a wild-type mouse background
The authors extend their previous experiments in mouse cells to make mice that are missing Mfsd12 in all of their cells.
They use CRISPR/Cas9 to make a null allele of the gene, which is a nonfunctional copy of a gene due to genetic mutation. They do this in mice that are otherwise wild-type, or have typical genetic and phenotypic characteristics.
A pseudogene is a stretch of DNA that looks similar to a gene, but has lost some or all of its function.
We silenced expression of the mouse ortholog of MFSD12 (Mfsd12) using small hairpin RNAs (shRNAs) in immortalized melan-Ink4a mouse melanocytes derived from C57BL/6J-Ink4a−/−mice
In this experiment, the authors use a genetic tool called shRNA (small hairpin RNAs) to turn off expression of the gene in mice that is homologous to one of the genes they identified in the previous section as being associated with skin pigmentation. An ortholog is a homologous gene from a different species.
An shRNA is an artificial RNA molecule that is shaped like a hairpin: a strand with a tight turn where it folds back on itself. Mammalian cells process shRNAs into small interfering RNAs (siRNAs), which suppress expression of the gene whose RNA they are complementary to. In this case, the authors use shRNAs that silence the gene Mfsd12.
An individual with two identical alleles for a given locus.
Bonferroni correction (or adjustment) compensates for the increased likelihood of rejecting the null hypothesis incorrectly due to testing multiple hypotheses (testing multiple hypotheses increases the chance that a rare event occurs).
This correction is performed by dividing the desired significance level (generally 0.05) by the number of hypotheses being tested, and then testing each individual hypothesis against this new threshold (for example, when testing two hypotheses, 0.05/2 = 0.025).
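The arithmetic can be sketched in a few lines. The function names below are illustrative, and the p-values in the usage note are invented; some analyses compare with "less than or equal to" rather than strictly "less than".

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("need at least one test")
    return alpha / n_tests


def significant_after_bonferroni(p_values, alpha=0.05):
    """Flag which p-values pass the Bonferroni-corrected threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p < threshold for p in p_values]
```

For two tests with p-values 0.01 and 0.03, the corrected threshold is 0.025, so only the first result remains significant. In a GWAS, where hundreds of thousands or millions of SNPs are tested, the corrected threshold becomes very small.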
primary human melanocytes
Primary refers to cells (in this case, human melanocytes) that have been taken from a tissue (human skin) and are growing in a dish without further alterations.
Integrated haplotype score: A statistical test to measure how far from the SNP of interest haplotype homozygosity extends on the ancestral compared to the derived allele. Haplotype homozygosity measures the likelihood of selecting two identical haplotypes at random from a population.
If a SNP is under selection, it will often occur with longer haplotype homozygosity than expected (on either the ancestral or derived allele), and extreme iHS values will result.
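Haplotype homozygosity, the building block of this statistic, can be computed directly from a list of haplotypes. The sketch below is only that building block, with an illustrative function name and invented haplotypes; the full iHS calculation additionally tracks how homozygosity decays with distance from the SNP on each allele and standardizes the result, which is not shown here.

```python
from collections import Counter


def haplotype_homozygosity(haplotypes):
    """Probability that two haplotypes drawn at random (without replacement) are identical."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two haplotypes")
    counts = Counter(haplotypes)
    # Number of pairs made of two copies of the same haplotype
    identical_pairs = sum(c * (c - 1) // 2 for c in counts.values())
    total_pairs = n * (n - 1) // 2
    return identical_pairs / total_pairs
```

A sample in which every chromosome carries the same haplotype gives a value of 1, while a sample in which every haplotype is different gives 0.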
A synonymous variant is a change in the DNA that does not lead to a change in the amino acid sequence of the protein encoded by the gene.
This can happen because each amino acid is encoded by a stretch of three nucleotides in the DNA, called a codon, and there are multiple codons for the same amino acid.
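A short sketch makes the idea concrete. It uses only four codons from the standard genetic code (GAA and GAG encode glutamate; GAC and GAT encode aspartate); a real tool would use the full 64-codon table, and the function name is illustrative.

```python
# Small subset of the standard genetic code, enough for the example
CODON_TABLE = {
    "GAA": "Glu",
    "GAG": "Glu",
    "GAC": "Asp",
    "GAT": "Asp",
}


def classify_variant(ref_codon, alt_codon):
    """Label a codon change as synonymous or nonsynonymous."""
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    return "synonymous" if ref_aa == alt_aa else "nonsynonymous"
```

Changing GAA to GAG alters the DNA but still encodes glutamate, so it is synonymous; changing GAA to GAC swaps glutamate for aspartate, so it is nonsynonymous.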
A skin condition characterized by patches of skin losing pigment and appearing white.
time to most recent common ancestor (TMRCA)
The most recent common ancestor (MRCA) of a group of organisms is the most recent individual (in this case, a hominid) from which all the organisms in the group are directly descended.
The MRCA of a population is hard or even impossible to determine for a large population, but the time when this individual lived (the time to most recent common ancestor, TMRCA) can be estimated based on mathematical modeling and knowledge of the genetic variation in the population.
We observe a signature consistent with positive selection at SLC24A5 in Europeans on the basis of extreme values of
When looking at the variants in the gene SLC24A5 across different populations, the authors found that the pattern (or signature) of variants suggests that this gene is under directional, positive selection.
Positive selection is a process leading to a particular trait being more common in the population than other traits. This leads to a higher number of low-frequency alleles than one would expect by chance, leading to an extreme, negative value of Tajima's D statistic.
Finding a signature consistent with positive selection at this locus suggests that the particular variant that is under selection may provide an advantage to the people with that variant.
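To show why an excess of low-frequency alleles gives a negative value, here is a minimal sketch of Tajima's D using the standard formula. It assumes the input is, for each variable site, the number of sampled chromosomes carrying one of the two alleles; the function name and the example counts are invented, and this is not the authors' analysis pipeline.

```python
import math


def tajimas_d(allele_counts, n):
    """Tajima's D from per-site allele counts in a sample of n chromosomes."""
    if n < 4:
        raise ValueError("the variance term is zero for fewer than 4 chromosomes")
    counts = [k for k in allele_counts if 0 < k < n]  # keep segregating sites only
    s = len(counts)
    if s == 0:
        raise ValueError("no segregating sites")
    # Mean number of pairwise differences (pi), summed over sites
    pi = sum(2.0 * k * (n - k) / (n * (n - 1)) for k in counts)
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # Compare pi with the estimate based on the number of segregating sites
    return (pi - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


rare = tajimas_d([1] * 8, n=10)    # every variant seen on only 1 of 10 chromosomes
common = tajimas_d([5] * 8, n=10)  # every variant seen on 5 of 10 chromosomes
print(rare < 0, common > 0)
```

When all variants are rare, the average number of differences between pairs of chromosomes is lower than the number of variable sites would predict, and D is negative. When variants sit at intermediate frequency, D is positive.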
within SLC24A5 (rs1426654) (14)
Prior research used positional cloning (a genetic tool used to identify regions of the genome before sequencing was widely available), morpholino knockdown (a way to reduce specific protein levels), and DNA and RNA rescue, to identify a mutation in slc24a5 as being responsible for the reduced pigmentation phenotype of "golden" mutant zebrafish.
They also analyzed human data and identified a SNP in the coding region of SLC24A5 that was present at highly different frequencies in European and African populations, suggesting that it may have been the target of selection, and may be associated with skin pigmentation.
A nonsynonymous mutation is a change in the DNA sequence that leads to a different amino acid being used in the encoded protein.
using local imputation of high coverage sequencing data from a subset of 135 individuals and data from the Thousand Genomes Project (TGP)
To perform fine-mapping, the authors need more data than they have from their SNP arrays. To get more precise data, they performed high-coverage sequencing on a subset of 135 people from their original populations, sequencing only the regions of the genome they identified with the GWAS. Sequencing allows researchers to discover the order of nucleotides across a given region of the genome.
The authors also used previously published work from a data set called the Thousand Genomes Project, which looked at the sequences of the whole genome for over 1000 people.
Local imputation uses patterns of association between SNPs in the data sets with higher coverage, as well as known patterns of linkage disequilibrium (nonrandom association of alleles in a particular population) between SNPs, to predict what alleles a person has at those regions. This means that it is possible to confidently predict the alleles for each of the >1500 people in the current study, even for loci with SNPs that aren't included in the original genotyping array.
We then performed fine-mapping
The authors use fine-mapping to narrow down the larger regions identified in their GWAS results, allowing them to identify specific SNPs that are likely to be causal, so they can study these SNPs further. Fine-mapping uses a variety of methods to pinpoint the precise location of a gene or regulatory region that is associated with variation in a phenotype.
We genotyped 1,570 African individuals with quantified pigmentation levels
The authors determined the genotypes (the set of alleles in an organism's genome) of the people for whom they had quantified skin pigmentation.
They use a SNP array for genotyping, which is a tool that allows researchers to study small differences between genomes. A SNP (pronounced "snip") is a single nucleotide polymorphism, or a change to a single base (nucleotide) at a particular DNA locus.
To identify genes affecting skin pigmentation in Africa, we used a DSM II ColorMeter to quantify light reflectance from the inner arm as a proxy for melanin levels in 2,092 ethnically and genetically diverse Africans living in Ethiopia, Tanzania, and Botswana
The authors wanted to study a wide variety of Africans, and used a handheld, battery-operated tool called the DSM II ColorMeter to quantify how much pigment was in each person's skin.
The DSM II ColorMeter works by shining light at skin and detecting how much light reflects back, and at what wavelengths. They used a particular wavelength to determine how much melanin was present in the skin.
L. Teng, B. He, J. Wang, K. Tan, 4DGenome: A comprehensive database of chromatin interactions. Bioinformatics 31, 2560–2564 (2015). doi:10.1093/bioinformatics/btv158 pmid:25788621
Teng et al. describe a database they have curated called 4DGenome. They have collected published data related to how chromatin interacts with itself and make it available through a database that other researchers can use as a centralized location for chromatin interaction data.
Roadmap Epigenomics Consortium, Integrative analysis of 111 reference human epigenomes. Nature 518, 317–330 (2015). doi:10.1038/nature14248 pmid:25693563
The Roadmap Epigenomics Consortium set out to generate a resource consisting of a large number of epigenomes (the modifications that occur on top of the genome and impact gene expression, such as H3K27ac, used in this paper) of human cells.
This analysis begins to explore the connection between genetic variants and epigenomic states.
F. Hormozdiari, E. Kostem, E. Y. Kang, B. Pasaniuc, E. Eskin, Identifying causal variants at loci with multiple signals of association. Genetics 198, 497–508 (2014). doi:10.1534/genetics.114.167908 pmid:25104515
This paper describes the development of a new statistical framework to estimate the probability that variants cause phenotypes. This new method, called CAVIAR (Causal Variants Identification in Associated Regions), is an improvement over previous methods because it allows for the possibility that multiple variants in a region are causal.
The authors of the current paper used CAVIAR to rank the variants that are most likely to be causal for skin pigmentation.
Additionally, most candidate causal variants are in non-coding regions, indicating the importance of regulatory variants influencing skin pigmentation phenotypes.
The authors note that many of the variants they identified as causal are in noncoding regions, which indicates the importance of regulatory regions in controlling skin pigmentation (this means that the amount of specific proteins is important to skin pigmentation, not just having the functional proteins present).
suggesting it is identical by descent.
The authors conclude that this allele, rs4932620 (T), is identical by descent between Australo-Melanesians and Africans because the haplotype surrounding the allele is the same or similar. This suggests that they have a common ancestor.
Exome sequencing of an archived gr/gr DNA sample, subsequently confirmed by Sanger sequencing in an independent colony,
The authors look at the exome (all of the protein coding parts of the genome) by gene sequencing, to see what differences exist between gr/gr mice that are gray and wild-type mice with the usual agouti color.
They sequence a saved sample they had, and confirm that they can find the same mutation in a gr/gr mouse from a different colony, to be sure that the mutation they found in Mfsd12 wasn't a random mutation that arose in their colony.
For confirmation of the mutation, they use Sanger sequencing of the specific region around the mutation. Sanger sequencing is a method of sequencing shorter pieces of DNA, instead of the whole exome.
We targeted transmembrane domain 2 (TMD2) in the highly conserved zebrafish ortholog of mfsd12a with CRISPR/Cas9
Here, the authors investigate the effect of losing the zebrafish ortholog of MFSD12 by using a technique called CRISPR/Cas9.
Cas9 is an enzyme that cuts DNA, and is guided to particular DNA sequences by a guide RNA (gRNA). Researchers can use different gRNAs to make cuts to particular DNA regions in order to disrupt expression of genes.
The authors use CRISPR/Cas9 with gRNAs targeting the part of the zebrafish ortholog mfsd12a that encodes a membrane-spanning region (transmembrane domain 2), so that the protein will no longer function properly. They then examine the zebrafish to see what impact, if any, disrupting the gene has had.
We assessed the localization of human MFSD12 isoform c (RefSeq NM_174983.4) tagged at the C terminus with the HA epitope (MFSD12-HA). By immunofluorescence microscopy,
The authors determine where in the cell the MFSD12 protein is located. To find the protein, they first "tag" it at the C terminus with an HA epitope.
The C terminus is the end of the protein chain that is made last. Tags are often put on an end of the protein because in some cases they are less likely to interfere with expression or function there.
HA stands for human influenza hemagglutinin, which is a protein found on the surface of the influenza virus. A small part of it, the HA epitope, can be used to tag proteins in cells; because the tag is small, it is unlikely to interfere with protein function or location.
An epitope is a part of a protein that is recognized by an antibody, which can be bound to fluorescent molecules so it can be visualized by a technique called immunofluorescence microscopy.
These data suggest that MFSD12 suppresses eumelanin content in melanocytes
The authors conclude that MFSD12 normally acts to reduce eumelanin content in melanocytes, because when they reduce the amount of MFSD12 in cells they see more eumelanin. Eumelanin is brown/black pigment.
A lentivirus is a type of virus that is often used in research to deliver DNA into cells because it is efficient at doing so across many types of cells. In this case, lentivirus delivered DNA that encodes shRNAs.
derived rs56203814 and rs10424065 (T) alleles
A derived allele is an allele that is different from that carried by the common ancestor of the populations being examined.
transmembrane solute transporters
Proteins in the cell membrane that facilitate the movement of small molecules or ions across the cell membrane.
Analyses of gene expression using RNA-sequencing data from 106 primary melanocyte cultures
RNA-sequencing, or RNA-seq for short, is a method to quantify the amount of specific sequences of RNA in a sample.
To perform RNA-seq, the authors extracted the RNA from primary melanocyte cultures (melanocytes growing in a dish after being taken from human skin). Next they made complementary DNA, or cDNA, that corresponds to each RNA fragment from the cells. The more RNA there is for a particular gene, the more cDNA there will be.
Then, the authors sequence the cDNA, and determine how many times they detect each molecule of cDNA. The number of times they see each cDNA (the number of reads) corresponds to how much RNA for that gene was in the sample, and the authors compare the amount of RNA for different genes in different groups of people from whom they collected melanocytes.
Mann-Whitney-Wilcoxon (MWW) test
The MWW test is a statistical test used to determine whether a value selected from one sample at random will be equally likely to be greater than or less than a value selected at random from another sample.
In this case, the test is used to check for allelic imbalance. In heterozygotes, the level of expression of each allele may not be equal; if one is expressed more than the other, this is called allelic imbalance. In homozygotes, expression from the two alleles is more even because they are the same variant.
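The quantity at the heart of the MWW test can be sketched as follows. The function names are hypothetical, the code computes only the U statistic and the corresponding probability estimate (not a p-value), and any input values are for illustration rather than data from the study.

```python
def mann_whitney_u(sample_a, sample_b):
    """Count the (a, b) pairs in which a is greater than b; a tie counts as 0.5."""
    u = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


def prob_a_greater(sample_a, sample_b):
    """Estimate the chance that a random value from sample_a exceeds one from sample_b."""
    return mann_whitney_u(sample_a, sample_b) / (len(sample_a) * len(sample_b))
```

An estimate near 0.5 means that neither sample tends to have larger values; an estimate near 0 or 1 means that one sample is consistently smaller or larger than the other.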
We ranked potential causal variants within each locus using CAVIAR
Next, the authors ranked the known and predicted variants using a method called CAVIAR (CAusal Variants Identification in Associated Regions). CAVIAR is a statistical method used to determine the likely DNA variants causing differences in a trait. This method is better than previous methods used to determine causal variants, because it allows for the idea that there may be multiple causal variants in a region. This is likely to be the case for many traits, possibly including skin pigmentation.
MFSD12-HA localized to punctate structures throughout the cell. Surprisingly, these puncta, like those labeled by the endogenous lysosomal membrane protein LAMP2, but not the melanosomal enzyme TYRP1, overlapped only weakly with pigmented melanosomes (Fig. 6, E to G; quantified in Fig. 6H). Instead, MFSD12-HA co-localized with LAMP2 (Fig. 6E, quantified in Fig. 6H), indicating that MFSD12 protein localizes to late endosomes and/or lysosomes in melanocytes and not to eumelanosomes.
The authors see the tagged MFSD12 protein in punctate structures, or small spots, throughout the cells. They were surprised to find that these puncta barely overlapped with pigmented melanosomes.
They had expected to see MFSD12 localizing with (existing in the same spot as) TYRP1, an enzyme that is involved in melanin synthesis. Instead, MFSD12 was in the same spots as (co-localized with) LAMP2, a protein that functions in lysosomes.
A difference in expression between the two alleles at a particular locus.
For most genes, expression is usually equal from each of the two alleles a person has. If there is allelic imbalance, one allele will be expressed more than the other.
H3K27ac refers to a specific alteration to a DNA packaging protein that is associated with transcriptional activity, and often occurs on enhancer regions which are considered active when this mark is present.
H3 stands for Histone H3, a protein that DNA is wrapped around in chromatin. K27 refers to the lysine (abbreviated K) at position 27 of the Histone H3 protein. ac stands for acetylation, a modification in which an acetyl group is added to a protein.
By studying ethnically, genetically, and phenotypically diverse Africans, we identify novel pigmentation loci that are not highly polymorphic in other populations. Interestingly, the loci identified in this study appear to affect multiple phenotypes.
The authors have identified loci associated with pigmentation that were not known before, because they studied an ethnically, genetically, and phenotypically diverse group of Africans. They note that the loci they identified have multiple effects in people, so the role of these loci in skin pigmentation may not be the only reason they are associated with particular populations.
These results, combined with large F<sub>ST</sub> values between Africans and Europeans at SNPs tagging the extended haplotype near DDB1 (e.g., F<sub>ST</sub> = 0.98 between Nilo-Saharans and CEU at rs7948623, within the top 0.01% of values on chromosome 11, table S4) are consistent with differential selection of alleles associated with light and dark pigmentation in Africans and non-Africans at this locus.
The authors conclude that at this particular locus, their data suggest that Africans and non-Africans have been subjected to different selective pressures that have led to different haplotypes in the two populations.
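To give a feel for how a value such as F<sub>ST</sub> = 0.98 arises, here is a minimal sketch for a single two-allele SNP in two populations. It assumes equal weighting of the populations and the simple heterozygosity-based definition; the authors may have used a different estimator, and the function names are hypothetical.

```python
def heterozygosity(p):
    """Expected heterozygosity at a two-allele locus with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst_two_populations(p1, p2):
    """F_ST = (H_T - H_S) / H_T for two equally weighted populations (assumed form)."""
    h_s = (heterozygosity(p1) + heterozygosity(p2)) / 2.0  # mean within-population value
    p_bar = (p1 + p2) / 2.0                                # pooled allele frequency
    h_t = heterozygosity(p_bar)                            # value for the pooled population
    if h_t == 0.0:
        return 0.0  # the locus does not vary at all, so there is no divergence to measure
    return (h_t - h_s) / h_t
```

When the two populations carry an allele at the same frequency the result is 0, and when each population is fixed for a different allele the result is 1.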
Approximately 28.9% (S.E. 10.6%) of the pigmentation variance is attributable to these SNPs. Considering each locus in turn and all significantly associated variants (p-value < 5 × 10<sup>−8</sup>), the trait variation attributable to each locus is: SLC24A5 (12.8%, S.E. 3.5%), MFSD12 (4.5%, S.E. 2.1%), DDB1/TMEM138 (2.2%, S.E. 1.5%), and OCA2/HERC2 (3.9%, S.E. 2.9%). Thus ~29% of the additive heritability of skin pigmentation in Africans is due to variation at these four regions.
The authors conclude that around 29% of the heritability of skin pigmentation in Africans (heritability is the proportion of the variation in a trait that can be attributed to genetic differences) is due to DNA variation in just four regions, which is fewer regions than they might have expected for a complex trait.
Coalescent analysis indicates that the TMRCA of all lineages is 1.7 mya (95% CI: 1.5–2.0 mya) and the TMRCA of lineages containing the derived (T) allele is 629 kya (95% CI 426–848 kya) (Fig. 4). The deep coalescence of lineages, and the positive Tajima’s D values in this region in both African and non-African populations (fig. S5), is consistent with balancing selection acting at this locus
Using their mathematical models, the authors conclude that the old TMRCA (time to the most recent common ancestor) of the lineages at this locus, along with the positive Tajima's D values (which indicate an excess of intermediate-frequency alleles rather than rare alleles), is consistent with balancing selection at this locus (selection that maintains multiple alleles).
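The sign of Tajima's D can be illustrated with a small sketch that applies the standard formula to a set of aligned sequences. The function name and the toy sequences in the usage note are invented for illustration; this is not the authors' pipeline.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(sequences):
    """Tajima's D for equal-length aligned sequences (standard formula)."""
    n = len(sequences)
    length = len(sequences[0])
    # S: number of sites at which the sequences are not all identical.
    seg_sites = sum(1 for i in range(length) if len({s[i] for s in sequences}) > 1)
    if seg_sites == 0:
        raise ValueError("Tajima's D is undefined when there are no variable sites")
    # pi: average number of differences between pairs of sequences.
    pairs = list(combinations(sequences, 2))
    pi = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs) / len(pairs)
    # Constants that depend only on the sample size n.
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (pi - seg_sites / a1) / sqrt(e1 * seg_sites + e2 * seg_sites * (seg_sites - 1))
```

With toy data, a sample split evenly between two haplotypes (for example four copies of `"0000"` and four of `"1111"`) gives a positive D, while a sample in which every variant is seen in only one sequence gives a negative D.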
These results are consistent with selection for the rs1426654 (A) allele in African populations following introduction, although complex models of demographic history cannot be ruled out.
The authors found that it is possible that the variation at the rs1426654 locus is due to selection for the A allele in African populations. However, there are other possible explanations for the variation observed that they can't rule out based on their data.
Folate is a B vitamin, and is important for many aspects of health including the prevention of anemia and certain birth defects.
Folate can be broken down by UV radiation (sun exposure), leading to lower levels in people with high UV exposure.
The production of a pigment called melanin. Melanin gives skin and hair their color.
Many factors can contribute to differences between individuals of the same species. Some differences are due to differences in DNA; these are referred to as the "genetic basis" for variation.
Association tests using a permutation approach indicated that, of the 35 protein-coding genes with a transcription start site within 1 Mb of rs7948623, expression of DDB1 is most strongly associated with a SNP in an intron of DDB1, rs7120594, at marginal statistical significance after correction for ancestry and multiple testing (P<sub>adj</sub> = 0.06)
The authors use a statistical approach to determine which gene within 1 Mb (1 million basepairs) of a variant of interest, rs7948623, has expression levels most strongly associated with the variant.
A distance of 1 Mb is used as the search window because most regulatory regions act on genes whose transcription start sites lie within 1 Mb.
Allele-specific expression (ASE) analysis
The authors use allele-specific expression (ASE) analysis, a method to determine how much expression occurs from each allele at a particular locus in an individual.
ASE looks at the amount of RNA that gets transcribed from each copy of a gene (the copy inherited from the mother and the copy inherited from the father).
The authors use this analysis to determine whether people who are heterozygous for the variant of interest (people who carry different alleles on their two copies of the gene) have different levels of expression from the two copies. Such a difference would indicate that this particular variant could affect the expression level directly.
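As a rough illustration of what ASE measures, the sketch below turns read counts for the two alleles in one heterozygous individual into an allelic ratio. The function names and the idea of summarizing imbalance as the distance from 0.5 are assumptions made for this example, not the authors' exact procedure.

```python
def allelic_ratio(reads_allele_1, reads_allele_2):
    """Fraction of RNA-seq reads at a heterozygous site that carry allele 1."""
    total = reads_allele_1 + reads_allele_2
    if total == 0:
        raise ValueError("no reads cover this site")
    return reads_allele_1 / total


def allelic_imbalance(reads_allele_1, reads_allele_2):
    """Distance of the allelic ratio from 0.5, the value expected for equal expression."""
    return abs(allelic_ratio(reads_allele_1, reads_allele_2) - 0.5)
```

For example, 50 reads from each allele gives a ratio of 0.5 and no imbalance, whereas 80 reads from one allele and 20 from the other gives a ratio of 0.8.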
A process by which organisms that are not closely related evolve similar traits independently.
A process by which a single gene can encode multiple proteins.
RNA splicing is the removal of introns from the transcript made from a gene, leaving only the exons to be expressed. In alternative splicing, exons may be excluded, introns may be retained, or different junctions may be used between the introns and exons, leading to proteins with different amino acid sequences.
minor allele frequencies
The frequency of the second most common allele in a population for a given locus.
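A minimal sketch of this definition, assuming the input is simply a list of every allele copy observed at one locus (two per person); the function name is hypothetical.

```python
from collections import Counter


def minor_allele_frequency(alleles):
    """Frequency of the second most common allele among the observed allele copies."""
    counts = Counter(alleles)
    if len(counts) < 2:
        return 0.0  # only one allele was observed, so there is no minor allele
    frequencies = sorted((n / len(alleles) for n in counts.values()), reverse=True)
    return frequencies[1]
```

With six copies of A and four copies of G, for instance, the minor allele is G and its frequency is 4 out of 10.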
Insertion or deletion of one or more nucleotides in a DNA sequence.
Compound heterozygotes have two different mutated alleles at the same genetic locus.
Pearson Correlation Coefficient (PCC)
A measure of the correlation of two continuous variables.
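The coefficient can be computed directly from its definition, as in this short sketch; the function name is chosen for illustration.

```python
from math import sqrt


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length with at least two values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    # Note: this divides by zero if either list is constant.
    return covariance / sqrt(spread_x * spread_y)
```

The result ranges from -1 (one variable falls as the other rises, along a perfect straight line) through 0 (no linear relationship) to 1 (both rise together along a perfect straight line).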
DNase I hypersensitive sites
Regions of chromatin that are highly sensitive to cutting by DNase I, an enzyme that cleaves DNA at many locations in the genome.
Genomic regions are more sensitive to DNase I cleavage if their chromatin has lost its condensed structure, indicating that these regions would also be available for binding by transcription factors and subsequent transcription. Therefore, DNase I hypersensitive sites are associated with regions of the genome that control transcription.
ancestral (G) and (T) alleles
The ancestral allele is the allele that was carried by the common ancestor of the populations.
mRNA stands for messenger ribonucleic acid. mRNA is made based on the sequence of DNA that encodes it, and specifies the amino acid sequence that will be used by the ribosome to make a protein product.