Citing DNA Evidence

Having received "QuickSheet: Citing Genetic Sources for History Research," I am still a bit confused on deciding how to handle citations for comparing multiple atDNA matches.  For instance, using GEDmatch, I am comparing various atDNA cousin matches with two specific testers (i.e. "constants").  Do I create a separate reference note for each comparison with the two constants? Or is there a simpler way to cite multiple comparisons?

For an Ancestry DNA Circle with 10 or more testers, should I create a citation for each match within the Circle, or can I just cite the Circle (which by its nature may be continuously changing)?

Should I bother to cite testers that are within a Circle, but do not fall within the parameters of matching both of the constants to whom I am comparing in order to make my case?

What about privacy issues? As one cannot see Ancestry DNA matches unless one is an account holder within the Circle (and it appears that Private trees may not even show up in the Circle), is the use of their various kit numbers a privacy issue? At what point does the reader just have to take my word for the matches, since few will be able to access this data to see for themselves?  In the text of the article, I have used first names only for referring to various testers to preserve anonymity.  But if I use kit numbers (for FamilyTree DNA and GEDmatch) or user i.d. (for Ancestry) in the citation, their anonymity is compromised to some degree.

Thanks for helping to untangle this new kind of spiderweb!

Erick Montgomery

Submitted by EE on Mon, 06/29/2015 - 11:36

Great questions, Erick!  You've also given me an idea for a new QuickLesson.  I'll likely post it later today—with a much fuller answer than I could give you in this forum.


Submitted by Erickdm on Mon, 06/29/2015 - 20:40

Dear EE.  Thanks for your excellent review of my questions in QuickLesson 21. (And I must say, that was Quick....)  Now for the fun of constructing the citations!

I still have one question.  Should I create a separate full reference note for each atDNA comparison that I have used as evidence within the different databases (e.g. FamilyTree DNA, 23andMe, and Ancestry DNA) or tool (e.g. GEDmatch)? Or should I combine the reference notes for matches being compared to the constants against which (whom?) each match is being measured? (Actually, that's two questions.)

Knowing that editors value brevity, and my trouble with that concept, I am thinking in terms of trying to keep words to a minimum for a work that is already too long!

Many thanks,


Submitted by EE on Tue, 06/30/2015 - 09:51

Erick, you're right, editors like brevity. Then, to frustrate us, they tell us that we need to be much more explicit on this-or-that. All things considered, as both an editor and a writer, I'd say that editors of peer-reviewed journals want us to cut verbal fat, but provide enough explicit details in our citations for them and readers to understand the who, what, when, and where of the source, the information, and the informant.

That said, without seeing the structure of what you are creating and the details you need to cite, it's just not possible for me to give you a concrete answer. We're back to where we were yesterday: The citation should support what is asserted in the text. Beyond that, these thoughts come to mind:

  1. Have you found, in a peer-reviewed journal, a similar example that you can adapt?
  2. Combining sources into a single citation has a long tradition behind it and is still acceptable—assuming you structure it so that the multiple sources (test kits? data-base entries?) are being cited to support the same one assertion.
  3. If you are at the point that your citations would take up considerably more space on the page than the text itself, you might consider presenting your data in a different format—perhaps as a table.
  4. When in doubt (and after point 1 has been addressed) do what you feel works best in your situation while meeting standards. If it's overlong, so be it. With complicated sources, your editors would much rather have too much detail than too little. Give them enough so they can understand what you're doing, and they'll then help you do any whittling that's necessary.

Submitted by Erickdm on Tue, 06/30/2015 - 19:14

Many thanks for these additional pointers. I will take your sage advice and see where it leads in crafting my citations. Getting me to think about DNA evidence like other evidence should serve my needs well in trying to cite my assertions. And clarifying that it is the assertion that I will be citing, rather than the individual match, helps a great deal.


Submitted by Linda_Johnson on Tue, 07/07/2015 - 18:52

Thanks, EE, for writing the Quick Lesson on citing DNA evidence. Would you please elaborate on the part of the quotation in it from the Genetic Genealogy Standards that says, "Genealogists share DNA test results of living individuals in a work of scholarship only if the tester has given permission or has previously made those results publicly available." In the case of a tester who seems to have dropped out of the genealogical community but has previously posted his test results to GedMatch (where they remain at present):

1. May I cite those results without his explicit permission?

2. What about information in a Gedcom he posted to Gedmatch showing his descent from our most recent common ancestor?

3. Would your answer to #2 apply to a Gedcom he posted to the testing company's web site (FTDNA in this case)?

4. And finally, if I should learn that the tester is no longer living, may I cite his test results and Gedcom information without anyone's permission at all?

Thanks very much for your trusted advice.

Linda Johnson

Submitted by EE on Wed, 07/08/2015 - 08:51


First the caveats: the use of genetics for personal research on historical people is still in its nascence; and the ethical and legal issues are developing. You might get sounder advice from Dr. Blaine Bettinger at The Genetic Genealogist or Judy G. Russell at The Legal Genealogist.  That said, our own understanding here at EE is this:

  • If someone publicly posts their results, with their name attached, then they have publicized their genetic information. We are free to use it, and cite the publication.
  • If someone posted their results under a kit number or pseudonym, and did not attach their actual name to their public posting, then we are free to cite the kit number or the pseudonym, but not the person.
  • If we establish contact with a kit number and the person provides a name amid our private correspondence, we cannot use that person’s name in our distributed work without that person’s permission.
  • If the person died without giving that permission, then we still cannot use that person’s name because he or she likely has siblings, children, or other close family members whose genetic privacy would be violated.

If a posted kit number in a database asserts a specific line of descent from a specific person, it is not reliable evidence for our research until and unless we confirm that every generational link is soundly proved.  So, if we cannot identify that person in our distributed research, how can we validly establish the claimed line of descent? 

This last question is the purpose of that “scholarship” clause of the Genetic Genealogy Standards. It would allow us to identify the person and other information on living individuals to a skilled editor who would verify the accuracy of our conclusion while being ethically bound to divulge that information to no one else.

Submitted by Linda_Johnson on Wed, 07/08/2015 - 11:55

Thanks so much for your guidance about the privacy issues involved in citing DNA test results. If I may ask a follow-up question:

It's clear to me that your first two bulleted points apply to results posted at a site like Gedmatch. Do they also apply to the testing companies' web sites, such as the match lists and related pages at FamilyTreeDNA? It seems to me that the latter might not meet the threshold for "publicly posting" one's test results or Gedcoms because the FTDNA web site only displays information about a tester to his or her matches, not to anyone who cares to take a look.

Thanks very much.

Linda Johnson

Submitted by EE on Wed, 07/08/2015 - 21:13


So: at FTDNA or 23andMe, only our matches can see our data. But then we have 1,000+ matches. Would that be less of a "publication" than if, say, we printed 100 copies of a book and gave those away?

Submitted by Linda_Johnson on Wed, 07/08/2015 - 23:34

I hadn't thought of it that way, but what you wrote makes perfectly good sense to me.

Thanks so much!

Linda Johnson