How Genetic studies reveal new relationships, species
I've become absolutely fascinated with the newest genetic capabilities being used to identify relationships between taxa and to reveal new taxa, and with the apparent ease with which it's done (if you have the tools).
I've read a lot about it, but still understand so little.
JShuey posted a link to an awesome article https://www.researchgate.net/profile/Da ... axinae.pdf
Take a look at that paper. What's neat is that they have trees showing relationships of taxa, as revealed by genetics. What's somewhat understandable, while still being somewhat frustrating, is that as I read it, the trees don't all express the same relationships.
I don't really understand what these trees do, or how they work. And the Neighbor Joining article on Wikipedia is all math.
Anyway, this paper uses three trees:
Neighbor Joining Tree
Maximum Likelihood Tree
Bayesian Inference Tree
Now, I'm going to stop there because any further discussion on my part is likely to be wrong. But hopefully our smarter members can jump in and tell us more about these relationship trees, and (in this paper) why they don't seem to be fully in agreement.
And while they're at it, maybe they can explain "Depending on which route we go, methods-wise, we’ll be generating nuclear genome data either at very coarse (several thousand markers randomly found across the genome) to very fine scale (millions of markers found across the genome). In doing so, we’ll also generate mtDNA data, but the nuclear data is much more informative for what’s going on" which is not in the paper.
- adamcotton
- Global Moderators
- Reactions:
- Posts: 969
- Joined: Tue Mar 22, 2022 12:24 pm
- Location: Thailand
Re: How Genetic studies reveal new relationships, species
This is a huge subject, and I am not strictly a DNA analyst; rather I work with them on various Papilionidae projects, partly to ensure they put the right names on their samples, many of which I provide for the studies, and to assist with non-DNA implications of the results of their analyses.
One point I can make for starters is about 'Neighbor Joining' analysis. This is a relatively simple analysis which can quickly give an approximate tree, but it has a relatively high rate of errors because it does not take into account reversals or convergent sequences. It just links the sequences that are most similar to each other, and cannot confirm an evolutionary relationship. Sometimes unrelated taxa are placed together by this method. Nowadays Neighbor Joining analysis is not used as the main method to produce a tree, but differences between this and other methods can point out interesting anomalies.
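As a rough illustration of why neighbor joining is quick: it needs only a table of pairwise distances between sequences. The Python sketch below is a minimal illustration, not the method used in the paper; the function names and the five-taxon distance matrix are invented for the example. It computes the simple proportion of differing sites between two aligned sequences, then picks the first pair that neighbor joining would join using the standard Q criterion.

```python
def p_distance(seq_a, seq_b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    diffs = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return diffs / len(seq_a)


def first_join(dist):
    """Return the indices (i, j) of the first pair neighbor joining would join.

    dist is a symmetric matrix (list of lists) of pairwise distances.
    The pair chosen minimizes Q(i, j) = (n - 2) * d(i, j) - r(i) - r(j),
    where r(i) is the sum of distances from taxon i to every other taxon.
    """
    n = len(dist)
    row_sums = [sum(row) for row in dist]
    best_pair, best_q = None, None
    for i in range(n):
        for j in range(i + 1, n):
            q = (n - 2) * dist[i][j] - row_sums[i] - row_sums[j]
            if best_q is None or q < best_q:
                best_pair, best_q = (i, j), q
    return best_pair


# Hypothetical distances among five taxa (a, b, c, d, e).
example = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
print(first_join(example))
```

A full run repeats this step, replacing the joined pair with a new node, until the tree is complete. Everything rests on one summary number per pair of sequences, which is why the method is fast and also why it has so little information to work with.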
This Wikipedia page gives more useful information about the various methods:
https://en.wikipedia.org/wiki/Phylogenetic_tree
You also asked about mt and nuclear DNA. You probably know that 'mt' stands for mitochondrial: the DNA in the mitochondria of the organism's cells, not part of the nucleus. The mitochondria help provide energy for the cell, but do not affect the appearance of the animal. The important difference here, apart from the small number of genes in mitochondria compared to the nucleus, is that the mitochondrial DNA is inherited from the mother only, whereas nuclear DNA originates 50-50 from each parent. This means that mtDNA can only give information about the maternal lineage. For example, in a hybrid the mtDNA is identical to that of the female parent, so it gives no indication that the specimen is actually a hybrid at all.
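The inheritance pattern can be sketched as a toy model in Python. This is only an illustration under simplifying assumptions: the species labels, the five-locus genome and the function names are all hypothetical, and real genomes are of course far more complex.

```python
import random


def make_pure(species, n_loci=5):
    """A pure-bred individual: mtDNA and both nuclear copies from one species."""
    return {"mt": species, "nuclear": [(species, species) for _ in range(n_loci)]}


def cross(mother, father):
    """Offspring gets the mother's mtDNA and one nuclear copy from each parent."""
    nuclear = [
        (random.choice(m_locus), random.choice(f_locus))
        for m_locus, f_locus in zip(mother["nuclear"], father["nuclear"])
    ]
    return {"mt": mother["mt"], "nuclear": nuclear}


female = make_pure("species A")
male = make_pure("species B")
hybrid = cross(female, male)

print(hybrid["mt"])          # mtDNA matches the mother only
print(hybrid["nuclear"][0])  # each nuclear locus carries one copy from each parent
```

In this toy model the hybrid's mtDNA is indistinguishable from that of a pure "species A" individual, while every nuclear locus shows one copy from each parent.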
As well as nuclear DNA being derived from both parents, the genes in the nucleus are the actual 'blueprint' for all the morphological characters of the organism, so differences in the nuclear DNA (if examined in detail) are somewhat equivalent to morphological differences, or at least they code for those differences. This is of course a huge simplification of something I know very little about, but for example in humans presence or absence of certain genes is linked to particular diseases, such as breast cancer.
Adam.
Re: How Genetic studies reveal new relationships, species
Thanks Adam, good link. And I will admit, your explanation and examples did help tie some things together.
Thinking this through: for a hybrid (e.g., Ornithoptera allotei) the mtDNA would show it as either priamus or victoriae, while the nuclear DNA would show it's a hybrid. For a recombinant hybrid (e.g., Papilio appalachiensis), both mtDNA and nuclear DNA would show it as a hybrid.
The link you shared noted that these tree analyses have trouble with recombinant hybrids. I've read the papers on appalachiensis though can't quite understand how they arrive at the timeframe of historical recombination. How would science, using DNA, discern between a recombinant hybrid and two taxa in a hybrid zone that are constantly being "bombarded" by genetic swapping? Wouldn't they look genetically the same?
Re: How Genetic studies reveal new relationships, species
So (Adam) in a paper you linked elsewhere on Ornithoptera https://www.nature.com/articles/srep11860#Sec5
In Figure 2 (the Bayesian tree), it appears to me, surprisingly, that O. euphorion is more closely related to O. aesacus than to O. priamus. Am I reading that correctly?
- adamcotton
- Global Moderators
- Reactions:
- Posts: 969
- Joined: Tue Mar 22, 2022 12:24 pm
- Location: Thailand
Re: How Genetic studies reveal new relationships, species
According to this tree, yes, O. euphorion is sister to aesacus, but you should bear in mind that this is not fact; it is a hypothesis based on the topology of THIS tree.
Almost certainly this is the best statistically supported tree produced by this analysis, but if you check the supplementary file (downloadable from the link for the paper) there are a number of different topologies based on individual genes, and it seems that this tree is based only on a small number of genes, so it is likely to have some anomalous results. In the Taxon sampling section Condamine et al. state "Despite intense efforts, DNA sequencing was arduous and few genes were amplified (Supplementary Table S4)."
Whole genome analysis, whenever that happens, would give a much more reliable hypothesis; but even that is just a hypothesis, not fact.
Adam.
- adamcotton
- Global Moderators
- Reactions:
- Posts: 969
- Joined: Tue Mar 22, 2022 12:24 pm
- Location: Thailand
Re: How Genetic studies reveal new relationships, species
Chuck,
I think more people would see this thread if it was moved from 'Insect identification' to 'Open Topics', but since you initiated the thread I don't want to move it without your permission.
Adam.
Re: How Genetic studies reveal new relationships, species
adamcotton wrote: ↑Fri Dec 23, 2022 5:59 pm
Chuck,
I think more people would see this thread if it was moved from 'Insect identification' to 'Open Topics', but since you initiated the thread I don't want to move it without your permission.
Adam.
Hi Adam, feel free to move the thread if you believe it will benefit the forum. I try not to put everything under general categories. I always search for new posts with "Quick Links", so if others don't and instead browse each category they may not, as you state, find it.
- adamcotton
- Global Moderators
- Reactions:
- Posts: 969
- Joined: Tue Mar 22, 2022 12:24 pm
- Location: Thailand
Re: How Genetic studies reveal new relationships, species
I moved the thread, and hopefully one or two of the other knowledgeable members will contribute (maybe after the holidays).
Adam.
Re: How Genetic studies reveal new relationships, species
A comment was made to me, and I've heard it before: "COI isn't useful for discriminating close taxa."
What's confusing to me, in playing with BOLD and comparing similar taxa, I see 100% matches, 98% matches, and on down. The 100% matches make sense to me, so far as my research, and the 98% match to a known taxon also makes sense- and differentiates them.
Yes, there are other sequences that can be used, but it seems COI works in most cases. Please critique.
Re: How Genetic studies reveal new relationships, species
So - I'm no expert. But what I see is that DNA barcodes are quick and dirty tools that are great for identifying lineages that have evolutionarily diverged. And for the most part - barcodes are very useful and effective. But because the "barcode" snippet is so small, the actual relationships revealed between species are open to statistical interpretation because you are just seeing a small fraction of the picture.
Technology is now allowing whole genome comparison (or at least entire genes) - and because you are seeing a bigger picture (perhaps the entire genetic picture) the different analyses tend to produce the same results. But at the same time, because more of the genome is used, additional "new species" are being discovered that represent subtly different species. - https://www.pnas.org/doi/epdf/10.1073/pnas.1821304116
John
Re: How Genetic studies reveal new relationships, species
Thanks John- I'll read later, that tree looks very scary! I'm glad I don't do Skippers.
The challenge for me at least is getting ANY analysis; whole genome is out of the question unless I want to pony up $10,000.
BOLD for my specimens is showing a 99.54% match to one taxon and 98.32% match to another taxon. According to the old "2%" rule, there is no speciation, but that 2% rule has been largely thrown out. But my specimens are also 100% match to a bunch in BOLD, so clearly there are clades or something.
If the match between two taxa is only, say, 50% that's pretty clear using only barcoding. What is the minimum delta now, if 2% isn't used? Or is 2% only not being used because nuclear reveals so much more? But then still, a 2% differentiation would be a concrete separation of taxa, right?
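For a sense of scale, a percent match can be turned back into a rough count of differing bases. This Python sketch assumes a full-length COI barcode of 658 base pairs and that the reported percentage is a simple per-site similarity; both are assumptions made for illustration, and the function names are invented here.

```python
BARCODE_LENGTH = 658  # assumed length, in base pairs, of a full COI barcode


def divergence_percent(match_percent):
    """Percent of sites that differ, given a percent match."""
    return 100.0 - match_percent


def approx_base_differences(match_percent, length=BARCODE_LENGTH):
    """Approximate number of differing bases implied by a percent match."""
    return round(length * divergence_percent(match_percent) / 100.0)


for match in (100.0, 99.54, 98.32, 98.0):
    print(match, round(divergence_percent(match), 2), approx_base_differences(match))
```

Under those assumptions a 99.54% match is about 3 differing bases, 98.32% is about 11, and the old 2% rule corresponds to about 13.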
- adamcotton
- Global Moderators
- Reactions:
- Posts: 969
- Joined: Tue Mar 22, 2022 12:24 pm
- Location: Thailand
Re: How Genetic studies reveal new relationships, species
The 2% difference in COI sequences is still used, but it is not recognised as a fixed point that indicates different species. In some taxa the difference between species is much lower, whereas in others it is higher. Certainly a difference over 2% does suggest that there could be two species involved. I know of some clearly distinct species which differ by only a few base pairs, well below the 2% threshold.
Adam.
Re: How Genetic studies reveal new relationships, species
Chuck wrote: ↑Tue Nov 14, 2023 3:16 pm
Thanks John- I'll read later, that tree looks very scary! I'm glad I don't do Skippers.
The challenge for me at least is getting ANY analysis; whole genome is out of the question unless I want to pony up $10,000.
BOLD for my specimens is showing a 99.54% match to one taxon and 98.32% match to another taxon. According to the old "2%" rule, there is no speciation, but that 2% rule has been largely thrown out. But my specimens are also 100% match to a bunch in BOLD, so clearly there are clades or something.
If the match between two taxa is only, say, 50% that's pretty clear using only barcoding. What is the minimum delta now, if 2% isn't used? Or is 2% only not being used because nuclear reveals so much more? But then still, a 2% differentiation would be a concrete separation of taxa, right?
You need to think of COI as one of many characters to consider. When this was first beginning, the 2% level was a clue that you should look at other things as well - like solid wing pattern differences, hostplants, larval characters, and so on. But remember - 2% is a lot - and if the COI is that different, then the two things have been separated for quite a while.
But then, some weirdness has crept in. I can't find the paper, but in Calycopis it looks like mitochondria got swapped back into a lineage in some geographies (Nick Grishin and Bob Robbins led this work). And John Burns found that two very different species of skippers (based on wing pattern and genitalia) differed by just a smidge - like 0.1% if I recall. So - it's just another thing to factor in - a solid difference indicates that two entities haven't interbred in a fairly long time, while a tiny difference may not mean anything.
john
Re: How Genetic studies reveal new relationships, species
With COI, I like the concept of "barcode gap". If you consistently see a, say, 1.2 or 1.3% difference between the barcodes of specimens from two populations (or forms, or what-have-you) and there are no barcodes "filling the gap", you likely have two separate species on your hands. Personally, I'd also want to support the barcode hypothesis with concrete morphological and/or biogeographical evidence.
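The barcode gap idea can be sketched in a few lines of Python: compare the largest difference found within either group to the smallest difference found between the groups. This is a toy illustration only; the 20-base sequences and the function names are made up, and real work would use full-length aligned barcodes and many more specimens.

```python
from itertools import combinations, product


def percent_difference(seq_a, seq_b):
    """Percent of aligned sites that differ between two equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    diffs = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return 100.0 * diffs / len(seq_a)


def barcode_gap(group_a, group_b):
    """Return (largest within-group difference, smallest between-group difference)."""
    within = [
        percent_difference(x, y)
        for group in (group_a, group_b)
        for x, y in combinations(group, 2)
    ]
    between = [percent_difference(x, y) for x, y in product(group_a, group_b)]
    return max(within, default=0.0), min(between)


# Toy 20-base "barcodes"; real COI barcodes are far longer.
population_1 = ["ACGTACGTACGTACGTACGT", "ACGTACGTACGTACGTACGA"]
population_2 = ["ACGTTCGTACCTACGTACGT", "ACGTTCGTACCTACGAACGT"]

max_within, min_between = barcode_gap(population_1, population_2)
print(max_within, min_between, min_between > max_within)
```

A gap exists when the smallest between-group difference is larger than the largest within-group difference, meaning no barcode "fills the gap" between the two populations.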
I also like Klee Diagrams, but that may be because I'm not so interested in proposed evolutionary lineages (and as Adam has pointed out, sometimes the different trees conflict with each other) and more interested in evidence supporting the existence of separate species.
Re: How Genetic studies reveal new relationships, species
Given that one could well argue that COI is better at differentiation than morphological comparisons, it seems this gap is more reliable. That said, I've read several papers lately that show COI for broad-ranging taxa broken into clades (or some such), and so long as the gap exceeds 1.2% (or pick a number), Joe Anybody could jump in and describe dozens of new species based on the gap alone.
- adamcotton
- Global Moderators
- Reactions:
- Posts: 969
- Joined: Tue Mar 22, 2022 12:24 pm
- Location: Thailand
Re: How Genetic studies reveal new relationships, species
There is some discussion whether a new taxon based purely on COI sequence constitutes 'description in words' under the ICZN Code or not when naming it. Some argue that the letters are abbreviations of words, whereas others say these alone should not count as a description 'in words'.
I think it is desirable to find morphological as well as sequence differences when naming something.
Adam.
Re: How Genetic studies reveal new relationships, species
Do we have any BOLD experts?
BOLD gives me a list of the top 100 matches and the % match to a given barcode.
It also gives me a map with the location of matches greater than 98%. I don't want this. I want a map for only the 100% matches.
Any idea?
Re: How Genetic studies reveal new relationships, species
OK, next question! Subspecies.
I know IUCN doesn't recognize subspecies, but we all know it has been commonly used for centuries.
However, as of late, most of what I'm seeing is ssp or populations elevated to (or recognized as) full species. I can't even recall the last time I saw a new ssp described.
How do genetics play into this? The description of Dryocampa kendalli notes a difference in COI of 1.3% (plus morphological and range differences) from D. rubicunda to call it a new species. Papilio appalachiensis COI differs by 0.33% from glaucus.
Is subspecies simply out of vogue? We don't know what valid differentiators might define a subspecies?
- adamcotton
- Global Moderators
- Reactions:
- Posts: 969
- Joined: Tue Mar 22, 2022 12:24 pm
- Location: Thailand
Re: How Genetic studies reveal new relationships, species
Chuck wrote: ↑Thu Nov 30, 2023 2:26 pm
OK, next question! Subspecies.
I know IUCN doesn't recognize subspecies, but we all know it has been commonly used for centuries.
However, as of late, most of what I'm seeing is ssp or populations elevated to (or recognized as) full species. I can't even recall the last time I saw a new ssp described.
Lots of new subspecies are described nowadays; they probably just get less visibility.
IUCN is a conservation body, and it puts emphasis on species conservation rather than individual subspecies. If you meant ICZN, it treats subspecies as one of two taxon 'ranks' in the Species-group, so I guess you probably do mean IUCN.
Chuck wrote: ↑Thu Nov 30, 2023 2:26 pm
How do genetics play into this? The description of Dryocampa kendalli notes a difference in COI of 1.3% (plus morphological and range) from D rubicunda to call it a new species. Papilio appalachiensis COI differs .33% from glaucus.
Is subspecies simply out of vogue? We don't know what valid differentiators might define a subspecies?
Subspecies generally have a smaller difference in COI between them than between species, but it is often best to treat subspecies as visibly distinguishable from the nominotypical population in the large majority of specimens.
Difference in COI sequences is only an indicator that two taxa may be separate species or subspecies. Mostly it is a good indicator, but sometimes distinct species have almost identical COI sequences (e.g. Papilio eurymedon and P. rutulus).
Adam.
Re: How Genetic studies reveal new relationships, species
Thanks Adam.
"Smaller difference in COI". Noting the other factors- range, flight period, etc. how the heck then would one determine Sp vs SSP considering COI?
If we use 2% as a safe range for species (noting that many are now less than that), what might be the difference for a ssp? Must be less than 1/3 of 1%, considering glaucus and appalachiensis.
I expect the answer is "it depends" and "the other factors" but then still....we can argue morphology all day, argue about distant populations, etc...even combining all the factors, are there ANY guidelines for Sp vs SSP now?