
How Rare is a Rare Disease?

By Melissa A. Wilson Sayres

There are more than 7000 rare diseases, also called orphan diseases. How does one decide the threshold for considering a disease to be rare? It depends. One study found more than 296 definitions of “rare disease.” In Europe, a disease is typically considered rare if it affects fewer than 1 in 2000 people (that is, 5 in 10,000), and the most widely used definitions place the threshold at 4 or 5 out of 10,000 people. But in today's world of population growth and virtual communities, in many ways rare diseases are becoming common.

Since 2008 a day has been set aside to raise awareness about rare diseases. This year Rare Disease Day is February 29, 2016. According to the National Institutes of Health:

“Rare Disease Day was established to raise awareness with the public about rare diseases, the challenges encountered by those affected, the importance of research to develop diagnostics and treatments, and the impact of these diseases on patients' lives.”


What causes rare diseases?

It is estimated that approximately 80% of rare diseases are due to genetic mutations, and more than 50% of the people with rare diseases are children. For physicians, patients, and advocates, the primary focus must be on treating the disease. Sometimes there may not be a treatment in the lifetime of the patient. So, we must ask ourselves, does knowing the genetic mutation leading to the rare disease help? There are a lot of strongly supported and nuanced answers to this question, but after talking with clinicians, patients, and advocates, I think the answer is a resounding, “yes.”

Perhaps the most compelling argument for learning the cause of a rare disease comes from following Ed Yong's story of Lilly Grossman, her family, and her genome. Although the article first came out in 2013, the story began in 1998, when Lilly’s symptoms first manifested. In 2015 we get an update, illuminating how genome sequencing, and subsequent identification of the causal mutation, enables patients and their advocates to build a community. And just a week later we learn more about how identifying mutations in this digital age facilitates connections between researchers, clinicians, parents, and patients.

Now we come to a different set of questions: Given my child has a rare disease, how likely is it that we are alone? Is our orphan disease unique? Are there even other families with whom we can build a community? A little mathematics and information about mutation rates tell us that we are not alone, and that there are, with near certainty, other families who have been, or will be, affected by the exact same mutation. These other families may be, well, rare, but they are out there.


Calculating how many people have the same mutation

How can we know that families affected by rare diseases are not truly alone? Let’s work it out together.

There are 23 pairs of chromosomes in the human genome. If you counted the number of nucleotides – the A's, T's, G's, and C's that make up our DNA – you would have approximately three billion nucleotide sites across those 23 chromosomes. And because each chromosome is part of a pair, you can multiply that number by two, for a total of six billion sites where a mutation can happen.

A mutation occurs when one of those six billion A’s, T’s, G’s or C’s changes from whatever nucleotide it was to a different nucleotide. For example, if at a particular site in the genome, both parents have a C, but the offspring has a T, we would say that that position mutated. There are other types of mutations – insertions and deletions – but for now we’ll just focus on the type of mutation where a single DNA site changes from one nucleotide to another. 


Aren’t mutations bad?

Mutations are not necessarily bad. Unlike the X-Men, your mutations probably won’t give you superpowers, but some mutations do result in traits that help organisms survive longer or have more offspring. Mutations might even result in something good while we’re young that harms our health when we’re older. But, looking across all six billion nucleotides, we think most mutations have very little effect. That’s actually a good thing. We can study rates and patterns of mutations across the human genome to learn about our own genetic ancestry, and even learn about the genetic history of our entire species.

However, if a mutation occurs somewhere that prevents fertilization, development, or survival, it is called a lethal mutation. Lethal mutations occur, but currently scientists think that only a very small subset of the six billion positions in our genome is lethal if mutated. So, for now, let’s assume that pretty much all of the six billion DNA positions in our genome can be mutated.

So, the question is, if we could sequence the DNA of every person alive on the planet, how many people would have a mutation at the exact same position in the genome? Or, related to our original question, how many families do we expect to have a family member with the same orphan disease (assuming it is caused by a single mutation)?

In fact, we expect that across all humans, every nucleotide is mutated in about 200 people. That is 200 people who we expect to share a mutation at the exact same position. Here is the math to support it:

There is a range of estimates, but for this exercise let’s say that the human mutation rate is 1.2 × 10⁻⁸ mutations per nucleotide site per generation. This mutation rate is pretty small (and thankfully so), meaning that in each person we expect to observe only a few dozen new mutations relative to their parents. But those few dozen mutations add up when you think of how many people are on the earth.
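As a rough check on how few new mutations each person carries, the mutation rate can be multiplied by the six billion sites described earlier. This is only a sketch under the post's simplifying assumptions (a single uniform rate, single-nucleotide changes only); the function name is made up for illustration.

```python
# Values taken from the post; everything else here is illustrative.
MUTATION_RATE = 1.2e-8   # mutations per nucleotide site per generation
DIPLOID_SITES = 6e9      # three billion sites x two copies of each chromosome


def expected_new_mutations(rate=MUTATION_RATE, sites=DIPLOID_SITES):
    """Expected number of new single-nucleotide mutations in one person."""
    return rate * sites


per_person = expected_new_mutations()
print(f"Expected new mutations per person: about {per_person:.0f}")
```

Under these assumptions the result is on the order of tens of new mutations per person, which is tiny compared with six billion sites.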

There are now about 7.4 billion people on earth (at the time of this post the estimate was 7,398,537,348-ish).

If we let the birth of every person alive represent a single generation event, then we can estimate the average number of new mutations at each position across all 7.4 billion people by multiplying the mutation rate per generation, by the number of generations:

(1.2 × 10⁻⁸ mutations/site/generation) × 7,398,537,348 generations ≈ 89 mutations/site

This says that if we could look at the genome of all seven billion people, on average, we expect to observe 89 new mutations at each of the six billion individual positions across the genome. But we usually don't talk about each copy of a chromosome individually (the one you got from your genetic mother and the one you got from your genetic father), we just talk about a single chromosome, like chromosome 1. That is, we think about the genome as folded in half (that three billion number I first mentioned).

Thinking about the number of differences across the three billion pairs of sites, we double that figure (2 × 89), so we expect that across the whole human population, there are about 178 people who experienced a new mutation at the exact same site.

That sounds like a lot of mutations, and it really is, but think about all of the people on the earth! When we consider all 7 billion of us together, a little math shows that, even though there are only a small number of new mutations in each individual, we expect about two hundred people to have a mutation at the exact same site. 
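The arithmetic in this section can be reproduced in a few lines. This is a minimal sketch that uses the mutation rate and population figure quoted above and keeps the post's simplification that each living person counts as one generation event; the function names are illustrative only.

```python
# Figures quoted in the post.
MUTATION_RATE = 1.2e-8            # mutations per site per generation
WORLD_POPULATION = 7_398_537_348  # each person treated as one generation event


def mutations_per_site_copy(rate, births):
    """Expected new mutations at one position on one chromosome copy."""
    return rate * births


def people_per_site(rate, births, copies=2):
    """Expected people with a new mutation at one site, counting both copies."""
    return copies * mutations_per_site_copy(rate, births)


per_copy = mutations_per_site_copy(MUTATION_RATE, WORLD_POPULATION)
per_site = people_per_site(MUTATION_RATE, WORLD_POPULATION)

print(f"Per chromosome copy: about {per_copy:.0f} mutations per site")
print(f"Per site (both copies): about {per_site:.0f} people")
```

The first figure comes out at about 89 and the second at about 178, which the post rounds to "about two hundred."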


What does this mean for people with rare diseases?

First and foremost, this back-of-the-envelope calculation strongly suggests that families with a member with a rare disease are probably not alone. If most rare diseases are genetic in origin, and are the consequence of a mutation at a single position, then we don’t expect any disease to be a solitary occurrence. There are some caveats with our estimates, especially for families with a member who has a rare disease. Perhaps the biggest is that we didn’t account for the age of the people in our calculations. Given that more than 50% of people affected by rare diseases are children, our estimates will be off. If we only count the number of children in our estimate, there will be fewer than 178 people who we expect to have a mutation at the exact same position. But we also inferred only the number of mutations expected at any single site. A given rare disease may be caused by any one of many mutations in a given gene, which means that our calculation is an underestimate.
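To see how these two caveats pull in opposite directions, the same calculation can be scaled by the share of the population being counted and by the number of positions that could cause the disease. This is a sketch only: the 0.25 share of children and the 10 causal sites below are placeholder values chosen for illustration, not demographic or clinical estimates, and the function name is hypothetical.

```python
MUTATION_RATE = 1.2e-8            # from the post
WORLD_POPULATION = 7_398_537_348  # from the post


def expected_people_sharing(rate, population, copies_per_site=2,
                            population_fraction=1.0, causal_sites=1):
    """Expected people with a new mutation at any of `causal_sites` positions,
    restricted to a `population_fraction` share of the population."""
    return (rate * population * copies_per_site
            * population_fraction * causal_sites)


baseline = expected_people_sharing(MUTATION_RATE, WORLD_POPULATION)

# Placeholder assumption: count only a quarter of the population (children).
children_only = expected_people_sharing(
    MUTATION_RATE, WORLD_POPULATION, population_fraction=0.25)

# Placeholder assumption: ten different positions can cause the same disease.
many_sites = expected_people_sharing(
    MUTATION_RATE, WORLD_POPULATION, causal_sites=10)

print(f"Baseline:      {baseline:.0f}")
print(f"Children only: {children_only:.0f}")
print(f"Ten sites:     {many_sites:.0f}")
```

Restricting the count to children lowers the estimate, while allowing several causal positions raises it, so the single-site figure is best read as a rough starting point.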


Connecting over rare diseases

Although we have convinced ourselves that there are about two hundred other people in the world with the exact same mutation, it is not likely that they all live near each other, or speak the same language, or even have access to technology to learn about others going through the same experience. But social media is making it easier for rare disease patients and their advocates to connect. In addition to personal blogs, there is a subreddit dedicated to discussions of rare diseases, families connect over Twitter using hashtags, and there are Facebook groups that can facilitate interactions.

As a geneticist, a parent, and someone lucky enough to be alive in the time of social media, I’ve thought a lot about what I’d do if any child of mine developed a rare disease. After the initial shock, I am comforted to know that rare disease resources exist, that communities of informed advocates are continuing to grow, and, most of all, that we will not be alone. 

You can follow Melissa Wilson Sayres, Assistant Professor at Arizona State University, on Twitter @mwilsonsayres.


Further Reading:

The Girl Who Turned to Bone by Carl Zimmer The Atlantic 2013.

Community-funded rare disease genomics 2012

A systematic survey of loss-of-function variants in human protein-coding genes MacArthur et al. 2012 Science

4-Leaf Clover image by Cygnus921 via Wikimedia Commons


Post Comments

Submitted by Dean Suhr on Fri, 02/12/2016 - 6:42pm
Thanks, great article. One correction ... Rare Disease Day is the last day of February which is usually the 28th but in rare years it's the 29th. More here:
Submitted by Melissa A. Wils... on Sat, 02/13/2016 - 11:38am
Thanks for catching my typo! I don't have access to editing posts, but it will be corrected as soon as we can.
Submitted by Dean Suhr on Fri, 02/12/2016 - 8:04pm
The US Federal government defines a rare disease as affecting less than 200,000 people in the US. We are very active in rare disease advocacy (at the disease specific and rare umbrella level) and I would say that no one here in the US quotes any other definition of a rare disease. I'd also point out that the NIH & FDA are looking at US prevalence so Malaria is a rare disease because we do not have it in the US in any great numbers, without regard to the millions who may have it in Africa - where it is common. An expanded perspective I would like to share about your discussion is that it focuses on de-novo mutations not inherited mutations. In x-linked diseases all males inherit the mutation. In autosomal recessive diseases 25% of offspring (on average ) from carrier parents will be affected. As an example, for MLD, the birth rate is about 1 in 40000, which equates to 1 in 100 people being a carrier for MLD. 50% of the carrier's offspring will be carriers even if the partner is not a carrier. And finally, there are typically many different mutations that result in a given rare disease. We know many of the rare diseases are single gene diseases but when we say that we do not mean to infer that they are single mutation diseases. For example, we know of over 200 unique mutations on the same gene that cause MLD.
Submitted by Melissa A. Wils... on Sat, 02/13/2016 - 11:40am
I agree completely. This is the first of what I hope will be a series talking about mutations, variations in frequency, and why understanding evolution is so important for medicine, in this case applied to rare diseases.
Submitted by gasstationwitho... on Wed, 02/17/2016 - 2:18pm
Your computation assumes that the rare disease is caused by a single mutation in a single position, that all mutations in that position cause the disease, and that the mutant allele is dominant (not masked by the non-mutant allele on the other copy of the genome). In practice, rare diseases are much more complex. Many are recessive, needing two copies of the damaged allele; many require more than one gene to be damaged; and many can be caused by any of a number of different forms of genetic damage. You also assumed that all rare diseases are from novel mutations, not inherited from the parents. Some of your simplifications raise the estimate of the number of people with the same rare disease, some lower the estimate. You have probably greatly overestimated the number of people getting the same rare disease from novel mutations, but greatly underestimated the number of people inheriting rare diseases from their parents.
Submitted by Melissa A. Wils... on Sat, 02/27/2016 - 2:52pm
Thanks for being so interested in the post! I intentionally keep this as a simple, back of the envelope, computation to introduce people to the concept. It is already a long post, and there will be follow up posts that address the oversimplifications here.