Friday, May 22, 2015

RNA doesn't correlate with protein? Huh?

tl;dr: I don’t know why people say that RNA doesn’t correlate with protein. There are different contexts to this question, and some recent experiments may make the question a bit confusing, but overall, I’m pretty sure that most of the time, if you increase the amount of RNA for a given gene, you will end up with more of the protein encoded by that gene. I’m sure there are counter-examples, though; if you know of any, please fill me in.

In our group, when we present work on RNA abundances, we are often faced with the question: “Well, what about the protein?” (fair enough). This is usually followed by the statement “Because of course it is well known that RNA doesn’t correlate with protein.” Umm, what?

I have to say that I’m a bit puzzled by this bit of apparently obvious and self-evident truth. I thought that most people accept that the central dogma of DNA to RNA to protein is a pretty solid fact in most cases. So… if you have more RNA, that should lead to more protein, right? Shouldn’t that be the null hypothesis?

Apparently this notion has been around for a long time, though nowadays it is perhaps a bit more conceptually confusing due to a few recent results. Perhaps the biggest one was the Schwanhäusser paper, which compared RNA-seq to mass-spec and showed a distinct lack of correlation between mean RNA levels and mean protein levels across all genes (see also the Weissman ribosome profiling paper). What this means, on the face of it, is that even if gene A produces more RNA than gene B, there may still be more protein B than protein A. Fine. Genes differ in their translation rates and protein degradation rates, and that leads to these differences; no surprises there. Plus, Mark Biggin and Allan Drummond make the point that any measurement noise will lead to decorrelation even if the underlying quantities are very correlated, and their reanalyses seem to indicate that the correlation between RNA and protein may actually be considerably higher than initially reported.
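To make the measurement-noise point concrete, here is a minimal sketch. It is not taken from either reanalysis; it is a toy simulation in which the true (log) protein level is assumed to be perfectly determined by the true (log) RNA level, and independent Gaussian noise is then added to both measurements. The function names, the number of genes, and the noise level are all made-up assumptions for illustration.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def noisy_correlation(n_genes=5000, noise_sd=1.0, seed=0):
    """Return (true correlation, measured correlation) across genes.

    Assumption: true log-RNA levels are standard normal, and true log-protein
    is an exact linear function of log-RNA (an extreme best case).
    """
    rng = random.Random(seed)
    true_rna = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
    true_protein = [2.0 + r for r in true_rna]  # perfectly correlated by design

    # Each assay adds its own independent measurement error.
    measured_rna = [r + rng.gauss(0.0, noise_sd) for r in true_rna]
    measured_protein = [p + rng.gauss(0.0, noise_sd) for p in true_protein]

    return pearson(true_rna, true_protein), pearson(measured_rna, measured_protein)


if __name__ == "__main__":
    true_r, measured_r = noisy_correlation()
    print(f"true correlation:     {true_r:.3f}")
    print(f"measured correlation: {measured_r:.3f}")
```

With noise variance equal to the biological variance in both assays, the measured correlation should land somewhere around 0.5 even though the underlying relationship is exact. Real data are messier than this, of course, but it shows why a modest measured correlation does not by itself imply a weak underlying one.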

The next example that’s a bit closer to home for me is whether RNA levels and protein levels correlate, even for the same gene, across single cells. Here, it gets a bit more complex, and one might expect a variety of behaviors depending on the burstiness of transcription, degradation rate of the RNA and the degradation rate of the protein. Experimentally, there are some cases in which the RNA and protein of a particular gene do not correlate in single cells (Taniguchi et al. Science 2010 is a particularly good example). This may be due to long protein half-life, which effectively smooths over RNA fluctuations. In our PLOS 2006 paper (Fig. 7), we showed that there can be a strong correlation between RNA and protein when the protein degrades fast, and that correlation goes down a lot when the protein degrades more slowly.
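The smoothing effect of a long protein half-life can be sketched with a toy simulation. To be clear, this is not the model from the paper: it assumes RNA fluctuates as a simple noisy mean-reverting process (rather than in discrete bursts), that protein is made in proportion to RNA and decays at a fixed rate, and that a long time trace of one cell can stand in for a snapshot of many cells. All parameter values and names are invented for illustration.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def rna_protein_correlation(protein_decay, rna_decay=1.0, dt=0.01,
                            n_steps=300_000, burn_in=20_000, seed=1):
    """Correlation between RNA and protein levels along one simulated trace.

    Assumptions (all hypothetical): RNA relaxes toward rna_mean at rate
    rna_decay with Gaussian noise; protein obeys
    dP/dt = translation_rate * RNA - protein_decay * P (Euler steps of size dt).
    """
    rng = random.Random(seed)
    rna_mean, rna_sd = 20.0, 4.0
    translation_rate = 1.0

    rna = rna_mean
    protein = translation_rate * rna_mean / protein_decay  # start at steady state
    noise_scale = rna_sd * math.sqrt(2.0 * rna_decay * dt)

    rna_trace, protein_trace = [], []
    for step in range(n_steps):
        # Protein integrates the current RNA level and decays at its own rate.
        protein += dt * (translation_rate * rna - protein_decay * protein)
        # RNA relaxes toward its mean and gets a random kick.
        rna += -rna_decay * (rna - rna_mean) * dt + noise_scale * rng.gauss(0.0, 1.0)
        if step >= burn_in:
            rna_trace.append(rna)
            protein_trace.append(protein)
    return pearson(rna_trace, protein_trace)


if __name__ == "__main__":
    fast = rna_protein_correlation(protein_decay=10.0)  # short-lived protein
    slow = rna_protein_correlation(protein_decay=0.1)   # long-lived protein
    print(f"fast-degrading protein: r = {fast:.2f}")
    print(f"slow-degrading protein: r = {slow:.2f}")
```

In this sketch the short-lived protein tracks the RNA closely, while the long-lived protein averages over many RNA fluctuations and so correlates much more weakly at any given moment, even though more RNA still means more protein on average.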

And of course there’s the whole world of post-translational modifications, such as those during the cell cycle, in which protein activity (and potentially protein levels) change independently of transcript abundance. Well, dunno what to say about that, I’m biased to just think about RNA. :)

Nevertheless, overall, I think it’s pretty safe to assume most of the time that if you increase RNA abundance for a particular gene, you will end up with more of the encoded protein. I think that should be the null hypothesis. If anyone knows of any counterexamples, please let me know.

Oh, and by the way, in case you’re wondering, transcription also correlates with RNA.

13 comments:

  1. What all those thousands of researchers who "over express" a protein using a strong promoter (e.g. CMV). I think that's a pretty good hint for the RNA-protein correlation...

    1. 'What about..'

      About transcription - not necessarily so - because there is this fine balance between transcription & decay. By looking solely at the "steady-state" level, one cannot say anything about transcription or decay separately. You can have increased transcription combined with rapid decay, or increased stability combined with reduced transcription, and in both cases you can end up with the same "steady-state" level.

    2. Hi Gal, yep, I think the overexpression experiments are pretty solid evidence as well! You are of course right about transcription levels not being the whole story for transcript abundance. I guess what I meant is that if you increase transcription of a given gene, you will end up with more transcript of that gene. The relationship between genes is certainly confounded by half-lives, etc.

  2. Hi Arjun,

    If we only increase the levels for a set of mRNAs without other changes, the corresponding protein levels must go up. The only exception is if the ribosomes are completely saturated by those transcripts, which is very unlikely.

    However, the picture is different if we study two conditions that differ substantially, e.g., two different cell types. They have different RNA-binding proteins, many mRNAs have alternative UTRs with known roles in translational regulation, and protein degradation times are different. The latter is one of the most interesting results from the Jovanovic paper that Mark Biggin was asked to highlight. In this context, predicting that protein levels are linearly dependent on mRNA levels is analogous to predicting that mRNA levels are linearly dependent on DNA levels. In fact, mRNA levels are linearly proportional to DNA levels in the case of gene duplications, but of course this does not prove the more general point. A general example of translational regulation is early development of lower metazoans; lower metazoans generally shut off transcription and use translation to make protein from the existing mRNAs in a highly developmentally regulated fashion.

    1. Hi Nikolai,

      You are of course right: context matters. The thing that bugs me is just the blanket statement that protein doesn't correlate with RNA, often in cases where it makes no sense. I think that the Jovanovic paper also shows that most proteins correlate with RNA. I think that's a fairly reasonable null hypothesis. I think the question of whether one *needs* to check that the protein indeed follows the RNA in any specific case is somewhat tricky and context dependent.

      I think it's also interesting to think about some bounds on translational regulation. To me, I think there can only be so much regulation at the level of translation, reason being that there's simply much less sequence space there to specify regulation. In higher metazoans, gene regulation requires a lot more bases to encode than is available in the UTRs of most RNA. Thus, I think it's pretty reasonable to assume that *in general* there is more regulation at the level of transcription, although I have not really done any analysis of this myself. Of course, there will be a lot of counterexamples.

      Those experiments in early development are really cool! I know in C. elegans you can shut off zygotic transcription and the thing still develops fairly normally to the 100 cell stage. Of course, some fraction of that is protein localization rather than translation. Any idea what percentage?

      Arjun

    2. Eric Lecuyer et al. estimated that in Drosophila over 70% of mRNAs are localized, and thus their translation will result in at least partially localized protein. All of the well-studied localized mRNAs (e.g., bicoid, oskar and others) are also translationally regulated, and this translational regulation is essential. The best global estimates that come to mind about translational regulation in early development are from Xenopus. They are all whole-embryo estimates, thus independent of localization, and report very weak or no correlation between most mRNAs and proteins, with a few prominent exceptions. I have examined some of these data, and I am convinced that in this context the lack of correlation is not due to noise.

      I fully agree with you that these examples should not be interpreted as the general rule and certainly do not warrant requests to "check that the protein indeed follows the RNA". This latter request is frequently unwarranted, and I think even somewhat separate from the discussion on what is the level at which a particular gene is regulated. The fact that bicoid is translationally regulated does NOT mean that its mRNA level is not important and should not be studied.

  3. Hi Arjun,

    In my discussions with immunologists, statements like "mRNA and protein don't correlate" often come up. The one example that is often given is that of NF-kappaB, whose activity is induced after LPS stimulation by post-translational modifications that are independent of new transcription. The general consensus among immunologists is therefore that NF-kappaB's activity increases in the absence of an increase in its mRNA.

    But, if you check the mRNA levels after stimulation of the genes encoding the subunits of NF-kappaB, they tend to go up too. So, even in this example case of "lack of correlation" between protein (activity) and mRNAs, in the end there is a clear (time-lagged) correlation. It's a funny case when you think about it: people often (correctly) claim that "correlation does not imply causality", but in this case, they do seem to have accepted that "lack of causality implies lack of correlation". This latter statement is just as incorrect as the former one.

    I have similar feelings about the notion that "alternative transcripts are not correlated".

    A.

    1. Agreed, especially on the point about lack of causality not implying a lack of correlation. And about the alternative transcripts, for sure.

  4. Hi,

    I was wondering if you have come across examples where you can reliably detect a protein but not mRNA transcript of the same gene? What sort of things would explain this? thanks.

    AE

    1. I don't *think* we've encountered such an example. We've definitely found cases where neither RNA nor protein are reliably detected, much to the chagrin of collaborators… :)

    2. ok thank you.

      AE

  5. Hi,
    This question troubles protein biochemists as well. I've spent many nights reading the literature to check the evidence, summarized in this white paper: http://www.kendricklabs.com/WP1_mRNAvsProtein-New2014.pdf. At best there's a 40% correlation between protein and mRNA levels. The correlation is highest for housekeeping proteins and poorest, ~25%, for regulatory proteins like tyrosine kinases.

    Metazoan evolution has had 700 million years to mess with cell growth mechanisms. When you think about the protein synchronicity required for generation of the numerous cell types that make up any mammalian organism, the disparity is not surprising. Post-translational modifications are running the show.
    That’s my 2 cents anyway.

  6. In response to a perturbation, the mRNAs that change in abundance are not translated fast enough to matter at the protein level for surviving the perturbation. The evidence: most of the genes that these mRNAs are transcribed from can be deleted without effect.
