Genetics and Race

matt harbowy
14 min read · Apr 25, 2017
How I differ from my brother, genetically, via 23andMe (company).

Brothers

I am roughly 3/4 identical in DNA to my brother, who is my biological sibling. We share the same parents, and were born about four years apart. Each of us is 50% identical to each of our parents, and there is absolutely no doubt we are biological siblings. However, we each received a different half of each parent’s DNA.

Since we are both male, you can see the effect of crossover between my mother’s X chromosomes: we each received different segments over about half of our X chromosome.

My brother’s Y chromosome has not been sequenced substantially, but there is no reason to believe we are significantly different in this, being both haplogroup G2 (like Dad). My brother and father have not had their Y STRs tested, so I don’t know if we have 100% identical Y STRs.

One important note: the biggest difference between my brother and me is that our copies of chromosomes 15 and 17 are almost completely different. In essence, he got one set of grandparents, and I got the other, though which set (Mom’s mom or dad, and Dad’s mom or dad) is not distinguishable from the data above.

Here’s the important conclusion to draw from the above picture: inheritance ain’t easy.

Now, add to this the fact that whether or not a person is identified as “eastern” or “western” European is determined by single “snippets”: stretches of DNA where only single positions are known across thousands of base pairs. In essence, if you are identified as partially western European, you might have a stretch that looks like

— ????????W????????????W???????????????????WW?????????? —

where ? is unknown (one of A/G/C/T, but not sequenced by 23andMe) and W is the one of A, G, C, or T that matches what other people of western European origin have at that exact position. But also, this more likely looks like

— xxxxxxxxWxxxxxxxxxxxxWxxxxxx?xxxxxxxxxxxxWWxxxxx?xxxx —

where “x” is the same (one of A G C or T at any position) in 99.999% of human beings. And also there might be

— xxxxxxxxWxxxxxxxxxxxxYxxxxxx?xxxxxxxxxxxxWZxxxxx?xxxx —

where W is common to people descended from a person born around 1200 in what would later become France, Y is common to people descended from a person born 10,000 years ago in what would later become Germany, and Z is common only in descendants of a person who emigrated from Bavaria in 1740.

Remember: 23andMe (similarly to FTDNA, Ancestry, and most other ancestry testing companies) doesn’t test stretches of DNA, it tests individual markers known to vary greatly between people worldwide. It also doesn’t test all the potential variations that are present in people worldwide.

Back to my brother and me. If you were to look only at Chromosome 15 and Chromosome 17, you could draw some very uncomfortable conclusions about what race my brother and/or I might be. We received completely different versions of SLC24A5: each of us got one copy from one of my mother’s parents and one copy from one of my father’s parents, and we got opposite ones, though I don’t know who got which pair. So the skin color set each of us inherited is essentially a completely different set from the one found in either of our parents, or in each other.

image of author and brother, 1996

Now that said, we both got some variation of white, but not the same genetic white, in fact, completely different white. From completely different parts of the world, white.

We also inherited completely different versions of BRCA1 and ERBB2, so we have completely different risks for cancer, and since we also have completely different combinations of the gene FBN1, it is unlikely we both have Marfan’s syndrome. This, despite the fact I was diagnosed with such as a teenager. (In retrospect, since we are both fairly healthy and in our 40s, it is unlikely that either of us had or have Marfan’s syndrome, but what do doctors know!)

In essence, based upon chromosomes 15 and 17, we could easily be from different parents based upon our genetic makeup. Based upon other chromosomes, we are completely identical. However, since we are 100% the product of both of our parents, it should be obvious through simple inspection that the subject of inheritance, as I said before, ain’t easy.

Anyone who demands accuracy in racial or ethnic assignment from genetic ancestry testing is making a completely unreasonable request- as humans, we are subject to an immense amount of mixing and variation, and races don’t have a straight family tree nor does any individual have “pure” descent from a limited set of racial “types”.

On “Accuracy” in DNA Testing

There’s only one “accuracy” that matters in DNA, and that’s whether or not a given SNP or mutation or genetic variant is called correctly by sequencing or by other form of genotyping. In my own practical experience with having my genotyping and sequencing done a number of times by different companies on different testing platforms, I’m pretty comfortable that for the most commonly detected variations, the accuracy is very, very good, on the order of 99.99% or better.

However, you cannot ignore the fact that there are large portions of the genome which are effectively “un-sequenceable”, either due to the fact that they involve lots of repeated base pairs or are in portions of the genome that are hard to “lay flat” to make a library of the segment. So, no matter what, there’s always going to be error and possibility of error in any genotyping effort.

There are a number of big companies and smaller competitors which have sprung up offering some form of ancestry analysis based upon the results of genotyping. I’ve made the argument that no company is currently providing accurate information on ethnicity, and there are two conflicting reasons for this.

  1. You don’t (really) know who your daddy is.
  2. Where your daddy came from has nothing to do with your ethnicity.

Let’s talk about paternity. This is an extremely fraught topic, and there’s a lot of judgement, racism, xenophobia, class warfare, and other general nastiness buried in this topic that needs to have a bright, disinfecting spotlight shone upon it. Maury Povich and other daytime talk show hosts have made decent bank on a format where otherwise ordinary people who have had sexual relations either deny or assert a paternity that may or may not be true. So, we can go to a very simple set of tests, which measure perhaps a couple hundred base pairs of DNA that are subject to immense variation between different people. If the child has the same or similar sets of markers above a certain threshold, one can be certain that the child and the father are related.

This tends to raise the likelihood of a particularly insidious source of error in science, which I like to call “false confidence”. If it is possible to know with absolute scientific certainty that a particular man is your father, people will begin to believe that they might be able to know more about their father’s father, their father’s father’s father, also with a similar degree of absolute certainty. But let’s chip at that certainty on paternity.

(note: a 2010 article by Razib Khan cites a 1–2% non-paternity rate among individuals who are reasonably certain of their paternity)

The assumption that you know who your father “is” rests upon a match between a known person and an identified, collected sample from another person, asserted to be the father. Without both samples, the likelihood of a successful paternity test begins to plummet dramatically, because over time the random changes in DNA begin to add up, and the confidence limit at which you can say with near certainty “you ARE the father” begins to fade away and looks more like “you are NOT the father”. You’d think that the probability of one would be 100% minus the probability of the other, but that’s not how this testing works. The two probabilities don’t add up to 100%, because there’s usually a large excluded middle where “I don’t know” reigns supreme.

What does that have to do with the accuracy of genetic testing, you might ask?

Fair enough. Bear with me. Let’s talk about your ancestors for a minute. Let’s say (unlike me…) your father and mother were both blond, grew up in German-speaking households when they were children, and have an undying love of sausage and spaetzle for dinner. You might think that you have German ancestry, and you wouldn’t necessarily be wrong, but there’s a high likelihood that you are. You could trace your ancestry over the last 200 years to nearly any corner of the globe, since your parents might actually be descended from South African Boers, Canadian Hutterites, Cleveland Brewers, Scandinavian Raiders, Prussian Revolutionaries, or any number of distinct ethnographic groups spread across multiple continents. You might be shocked to find a tribe of people in Africa whose skin is as dark as night but whose eyes are as blue as the ocean, who have lived in Africa for hundreds of generations, and yet might be distant relatives of yours and have passed to you those deep blue eyes as their sole inheritance to your genetic makeup. Here lies the problem: each generation’s contribution to your ethnic makeup is diluted by at least half with every step, such that by six to eight generations, the likelihood that the ancestor contributes any genetic material to you might approach zero.

How many years before the number of possible grandparents, assuming no one marries their cousin, exceeds the number of people on earth? (from “Family Jewels part 2” see also “part 1”)
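The doubling behind that question, and the halving behind the dilution described above, can be worked through in a few lines. This is only a rough sketch: the world population figure (about 7.5 billion) and the 30 years per generation are assumptions picked for illustration, and the function names are made up for this example.

```python
# Assumptions for illustration (not figures from this article): a present-day
# world population of about 7.5 billion and roughly 30 years per generation.
WORLD_POPULATION = 7_500_000_000
YEARS_PER_GENERATION = 30


def ancestors_in_generation(n: int) -> int:
    """Slots in your family tree n generations back (2 parents, 4 grandparents, ...)."""
    return 2 ** n


def expected_share(n: int) -> float:
    """Average fraction of your DNA expected from one ancestor n generations back."""
    return 1 / 2 ** n


def generations_to_exceed(population: int) -> int:
    """Smallest n for which the 2**n tree slots outnumber the given population."""
    n = 0
    while ancestors_in_generation(n) <= population:
        n += 1
    return n


if __name__ == "__main__":
    for n in (1, 2, 6, 8):
        print(n, ancestors_in_generation(n), f"{expected_share(n):.4%}")
    n = generations_to_exceed(WORLD_POPULATION)
    print(f"{n} generations, about {n * YEARS_PER_GENERATION} years")
```

Under those assumptions the tree outgrows the planet after 33 generations, or roughly 990 years, which is only possible because the same people fill many slots at once. The share numbers are averages only; what any one ancestor actually passed down can be more, less, or nothing at all.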

Due to the random nature of inheritance, for example, while both my brother and I are 50% each of both our parents, and I can speak with absolute certainty that he and I have the same parents, he and I share about 75% of our genetic makeup. Therefore, if my father is half German, half Ruthenian, for example, my brother and I are not equally one-quarter German and one-quarter Ruthenian.
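Here is a minimal sketch of where a figure like 75% can come from, assuming it counts every spot in the genome where two siblings match on at least one of their two copies. The model is deliberately simplified and is my assumption, not how 23andMe computes anything: at any one spot, each parent hands down one of their two copies at random, independently for each child.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Simplified model (an assumption, not 23andMe's method): at any one spot in
# the genome each parent passes on one of their two copies, labelled 0 or 1,
# with equal probability and independently for each child.
COPIES = (0, 1)


def sibling_sharing() -> dict:
    """Return the probability that two full siblings share 0, 1 or 2 copies."""
    counts = Counter()
    # Each outcome: (my copy from mom, my copy from dad, his from mom, his from dad)
    for my_mom, my_dad, his_mom, his_dad in product(COPIES, repeat=4):
        shared = int(my_mom == his_mom) + int(my_dad == his_dad)
        counts[shared] += 1
    total = sum(counts.values())
    return {k: Fraction(v, total) for k, v in counts.items()}


if __name__ == "__main__":
    sharing = sibling_sharing()
    print(sharing)
    print("at least one copy shared:", sharing[1] + sharing[2])
```

In this toy model a quarter of the genome matches on both copies, half matches on one, and a quarter matches on neither, so three quarters matches on at least one copy. Real chromosomes are inherited in long blocks, not spot by spot, which is why two brothers can end up nearly identical on one chromosome and nearly unrelated on another.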

Now, go back a step. I said my father was half German, and half Ruthenian, but in reality he was neither of those things. His mother descended from German immigrants, for sure, and his father descended from Polish immigrants, for sure, but it’s not like the German people descended from a single Übermensch way back when, no matter how many times my grandmother sang “Deutschland Über Alles” under her breath. They are composed of long lines of mixtures of traits as the face of Europe has been rewritten, invaded, and repopulated almost once every hundred years back for tens of thousands of years.

When a company like Ancestry, 23andMe, or DNA Land is trying to use genetics to determine your ancestral makeup, they are relying on collections of hundreds of stories about where people claim to be “from”. Putting aside the known probability of “paternity failure” where the father who raised you contributed no genetic material to you, the amount of actual DNA you got from one of your ancestors is not a nice, simply expressed percentage. Furthermore, there might be large variances where a given family line might have moved great distances over the course of a few generations, even well before the period of modern travel.

image: screen capture of 23andMe — DNA Genetic Testing & Analysis page. Green cone=me

There was a version 1.0 of this kind of ancestral study, where large numbers of stories from people who reported a particular ancestry were combined, compared, and plotted on graphs that looked at a subset of mutation markers which happened to correlate with those reports. Each of these markers was then assigned that ethnicity, and you had the early versions of ancestry finder: one that reported you were 45% eastern European, or 8% Ashkenazi.

Very quickly, as more people took the test, it became apparent that people who had no history of say, Jews, in their family tree were popping up as a small percentage Ashkenazi. And here was the problem- for those people who have a large percentage of Ashkenazi, they do so because they have a long tradition of having stayed within tightly knit eastern European communities whose original settlement only dates back about 1500 or so years. Yet, all this time, like Tevye’s daughters, there were people who might leave the community, or less likely, join the community (if they weren’t disowned). But just because they might move away, doesn’t mean that they left their DNA behind. So for those that were reported as 1% Ashkenazi, it might be that 2000 or 3000 years ago the original settlers of eastern Europe might have had a number of cousins who joined up with the forebears of the Ashkenazi, and others might have just lived nearby.

So, you began to see the tests further refine- where more ancestry was melted away to remove those 1% results and provide results that had greater appeal to a broader spectrum of people. And, you might claim, that the results became “less accurate”- where this melted specificity is seen as someone taking away or denying a particular deep ethnographic secret from your past.

In response, a number of competing companies now have different algorithms for how they compute ancestry, and we are seeing a mix of v1, v2, and now v3 level approaches to how a given marker on a small stretch of DNA is attributed to a particular ancestry. Get enough hits, and you are listed as 1% Cherokee, or 2% Bantu, or 3% Han Chinese.

Here’s the important thing to note, though- you might be getting that 3% Han Chinese because your great-great-great-grandmother married a missionary’s kid, or because you have a grandmother that lived in Korea, or Northern India, or Burma. Or, it might be because you have an ancestor from 20,000 years ago whose cousins all moved to northern India but your folks remained behind in Mesopotamia, and though your lineage was rare in eastern Europe, just enough of a snippet survived. You cannot know with certainty how any small stretch of DNA wound up in your makeup, and at best, you are relying on a mix of unverifiable rumors combined with overoptimistic reading of what are believed to be absolutely certain DNA results.

I repeat my assertion: None of the companies selling ancestry results on ethnicity are selling science or accuracy. That’s not to say, don’t take the tests, or don’t attempt to learn something about your ancestry from these tests! They are very useful, provided you start with the assumption that they are wrong, and begin to tease out the truth about what they are actually saying.

The overwhelming number of these tests are based upon genotyping chips, which at most measure a couple million possible variations known to be common among humans. They are not a representative or complete survey of the actual variation among human beings, they just happen to be known differences that diverged in the last 100,000 years or so among the many tribes of people worldwide. Those people walked around the planet! Those people survived natural disasters, plagues, violence, and whatnot on a scale we cannot comprehend. For each variation you have, there is an EPIC story that you should take immense pride to begin to understand and know.

Time erases all trace.

When I first started investigating genetic genealogy about 6 years ago, I did a quick back-of-the-envelope calculation to see what would happen if the fragments of a particular ancestor’s DNA became shorter and fragmented randomly. In about 30 or so generations, or about 1000 years, it was possible that a particular fragment would have shrunk to the point of being just a single base pair. At some point, we are all ultimately cut from the same cloth. In the end, there is “nothing left behind” of each of us, no matter how many generations we sire- poof!
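A back-of-the-envelope calculation of this kind can be sketched as follows. This is one plausible version, not necessarily the original one: it assumes a genome of about 3.2 billion base pairs, a fragment that is cut in half every generation, and roughly 30 years per generation. Real recombination is far less tidy than that.

```python
# Assumptions for illustration: a genome of about 3.2 billion base pairs,
# an inherited fragment that is cut in half every generation, and roughly
# 30 years per generation. Real recombination is far less tidy than this.
GENOME_BASE_PAIRS = 3_200_000_000
YEARS_PER_GENERATION = 30


def generations_until_single_base_pair(start_length: float) -> int:
    """Count how many halvings it takes for a fragment to shrink to 1 bp or less."""
    length = start_length
    generations = 0
    while length > 1:
        length /= 2
        generations += 1
    return generations


if __name__ == "__main__":
    g = generations_until_single_base_pair(GENOME_BASE_PAIRS)
    print(f"{g} generations, about {g * YEARS_PER_GENERATION} years")
```

With those inputs the loop stops after 32 halvings, or about 960 years, which lands in the same neighborhood as the 30 or so generations quoted above.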

I think on reading that last paragraph, some people might feel a little saddened, and assuredly, it’s not my intention to bring people down. I think what we leave in our genes is a very small part of the picture, and dwindles rapidly with each generation- but, too, we leave behind our thoughts, and our philosophies, and the impact that we’ve had on everyone else. If we’ve been really good, we leave the world a better place- safer, healthier, more easily understood. Leaving behind art, or science- these are things that can last for much more than 1000 years, if you care enough and work hard enough to make them count.

We are all, genetically speaking, cousins, even if the paper trail is really difficult to ferret out. It’s pointless to be xenophobic or to hoard your relationships and your wealth when the inexorable act of dilution is constantly reshuffling the deck.

My current haplogroup assignment is G-Y13112. My closest male cousins at YFull are Swedish, and we don’t share a common ancestor in about 3500 years. Source: https://yfull.com/tree/G-Y3098/

Only just beginning.

When I began this journey of discovery, my ancestral family tree landed on G2, one of maybe 100 or so possible tribes, just another man among the boringest of the boring Linear Pottery farmers. As we have moved away from gene chips to sequencing, I’ve discovered that I am most probably descended from an extremely rare surviving group of men, perhaps from Baden culture, who lived a couple of thousand years ago in central Europe, long before many of the most common actors in Europe ever arrived. More than just Ruthenian, my father’s father’s family represent a race of people, a subset of Ruthenians among whom only a few may still walk this earth. There’s probably no name for who this tribe is, and we may never know their name, and yet they survived thousands of onslaughts. How much more interesting is that than some mutedly “eastern European” blob that is trying to be so many things to so many people? As long as individuals look to genotyping chips and the companies that sell these $99 specials, they are only going to get the thinnest of weak tea explanations.

If you are adopted, or think your great grandparents moved here from some small country where no good records were kept, having a little clue as to the possible origin of your family might be a priceless addition to uncovering the story of your family- I don’t want to downplay that. But as far as the overall picture, whether or not the percentage is 1%, 10% or even 100% from that tiny country doesn’t make that story any more or less accurate- it is a clue, it is evidence, and the closer that evidence gets to 100%, the more you may feel it tells the whole story.

Just don’t fool yourself. Treat all results with a pinch of skepticism.

For more reading material on this subject, please see my earlier article, https://medium.com/@hbergeronx/genetic-genealogy-f45ef584358

Parts of this article were published previously and were adapted for the current article, including here, here, here, and here.

matt harbowy is a white male, scientist, activist, and data management expert. He is one of the founders of the non-profit Counter Culture Labs, working to bring fairness and egalitarian ideals to people interested in learning about science and biotechnology. He is also a top writer on the question and answer site, Quora.
