Frayed Knot said:
			
		
	
	
		
		
			But in the case of DNA, what is the receiver?
		
		
	 
One example is the machinery that turns the DNA into a working protein.
		
 
		
	 
That's not what we're talking about here. We're talking about taking a strand of DNA and getting the sequence of its bases into our heads, or onto a sheet of paper. Under Stripe's definition, the word "information" can only be used to gauge the reduction in uncertainty of a receiver, which means that "information" is a term that applies to a process and not to a set of data. I'm willing to use his definition (even though it is not standard in information theory), so I have to word all my statements carefully.
The "receiver" in this example is our brain, or the sheet of paper we write the sequence on. At the beginning of the process, I am completely uncertain about the sequence of DNA in the strand. At the end of the process, I will have read the information, which I am able to do with near complete precision - in fact, I have the luxury of taking as much time with it as I need, so I can reduce the potential for error arbitrarily. Tell me how low you want the error probability to be, and I can get under that by taking more time and doing it more carefully. Therefore, I can guarantee that any transcription errors will be low enough to be neglected in our analysis of information (how low does the error rate need to be to be negligible? I can get below that).
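As a rough illustration of "taking more time" to push the error down, here is a minimal Python sketch. It assumes something the post does not state: that we read each base several times, that each read is wrong independently with probability p (below 0.5), and that we keep the majority answer. The function names `majority_error` and `reads_needed` are made up for this example. Counting a base as misread whenever more than half of the reads are wrong is a pessimistic bound, since wrong reads need not agree with each other.

```python
from math import comb

def majority_error(p, n):
    """Probability that more than half of n independent reads are wrong,
    when each read is wrong with probability p (assumed model)."""
    k_min = n // 2 + 1
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_min, n + 1))

def reads_needed(p, target):
    """Smallest odd number of reads whose majority-vote error is <= target."""
    if not 0 <= p < 0.5:
        raise ValueError("p must be in [0, 0.5) for repetition to help")
    if target <= 0:
        raise ValueError("target must be positive")
    n = 1
    while majority_error(p, n) > target:
        n += 2  # keep n odd so a majority always exists
    return n

if __name__ == "__main__":
    # Hypothetical per-read error rate of 1%
    for target in (1e-3, 1e-6, 1e-9):
        print(target, reads_needed(0.01, target))
```

Under these assumptions, any positive target error rate is reached with a finite number of reads, which is the point being made above.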
At the end of the process, I am almost completely certain of the DNA sequence. Now the question becomes: what is the reduction in uncertainty about the DNA sequence after it's been read? This is the core of information theory: with negligible reading errors, the reduction in uncertainty is equal to the entropy of the data that we started with. In other words, the information we get is equal to how unexpected the starting data set is, and a data set that has fewer patterns, and so is more like randomness, yields more information.
A pattern with more randomness results in more information.
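To make this concrete, here is a small Python sketch that estimates the Shannon entropy per base from the base frequencies in a sequence. This is only the simplest (zero-order) estimate: it looks at how often each base occurs and ignores patterns between neighbouring bases, so it understates how predictable a sequence like ACGTACGT really is. The function name `entropy_per_base` and the example sequences are invented for illustration.

```python
from collections import Counter
from math import log2

def entropy_per_base(seq):
    """Zero-order Shannon entropy in bits per symbol, from symbol frequencies."""
    n = len(seq)
    counts = Counter(seq)
    # Sum of p * log2(1/p) over the symbols that actually occur
    return sum((c / n) * log2(n / c) for c in counts.values())

if __name__ == "__main__":
    for seq in ("AAAAAAAA", "AAAACCCC", "ACGTACGT"):
        print(seq, entropy_per_base(seq))
```

A sequence made of a single repeated base gives 0 bits per base (nothing is learned by reading it), while one that uses all four bases equally often gives the maximum of 2 bits per base.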
	
	
		
		
			Unless there is such a thing as mutations.
		
		
	 
We are not talking about mutations, we are not talking about replication of DNA by a cell, we are talking about reading the DNA sequence.
	
	
		
		
			I'll say it again for a third time. One must always account for noise mutations.
		
		
	 
We did account for noise in our process of reading the data. The idea of mutations doesn't apply.
	
	
		
		
			
	
	
		
		
			Do you see the problem there? If that's the model you're using, you have to stick to it, so don't then try to refer to the information content of source data, because it's meaningless.
		
		
	 
The content can only be measured if you have all 3 (4) parts: encoded message, transmission, and decoded message (noise).
		
 
		
	 
In the scenario we're discussing, the message is the sequence of the DNA. Transmission is the act of reading it, and the decoded message is our understanding of what was in it at the beginning. The noise (errors) can be reduced as far as we want, low enough to be negligible. As the noise gets lower and lower, the information transmitted (the reduction in uncertainty in our understanding of what was there) approaches its limit, the entropy of the source data. Since we can get the errors as low as we desire, the information is equal to the entropy of the data.
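The limit can be checked numerically with a simple model. The sketch below assumes (these are illustrative assumptions, not part of the discussion) that the four bases are equally likely, so the source entropy is 2 bits per base, and that a misread base is equally likely to be recorded as any of the other three. Under that model the information received per base is I = H(Y) - H(Y|X) = 2 - H(noise). The function name `information_per_base` is hypothetical.

```python
from math import log2

def information_per_base(e):
    """Mutual information (bits per base) between the true base and the base
    we record, for a uniform 4-letter source and a symmetric read error
    probability e (assumed model)."""
    if not 0 <= e <= 1:
        raise ValueError("e must be a probability")

    def term(p):
        # Contribution -p*log2(p) to an entropy, with 0*log(0) taken as 0
        return 0.0 if p == 0 else -p * log2(p)

    # Uncertainty added by reading errors: one correct outcome, three wrong ones
    noise_entropy = term(1 - e) + 3 * term(e / 3)
    return 2.0 - noise_entropy

if __name__ == "__main__":
    for e in (0.1, 0.01, 1e-4, 1e-8, 0.0):
        print(e, information_per_base(e))
```

As e shrinks, the value climbs toward 2 bits per base, the entropy of the assumed source, and at e = 0 it equals it.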
I hope the non-technical people reading this thread can understand the key concepts.