
Maximum, Minimum, and Intermediate Entropy

The formula for Shannon's entropy is H = -sum(from i=1 to n) p(x_i)*log(p(x_i)), using the natural log here. So for example, let's say the young person after immunizations has this distribution (extreme case, but this is just an example):

    1,000 peptides with 30,000-40,000 intensity
    8,000 peptides with 40,000-50,000 intensity
    1,000 peptides with 50,000-60,000 intensity

Then the entropy would be -(1000/10000*ln(1000/10000) + 8000/10000*ln(8000/10000) + 1000/10000*ln(1000/10000)) ≈ 0.639.
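As a sanity check on that arithmetic, here is a minimal Java sketch (the class and method names are mine) that computes Shannon entropy with the natural log and reproduces the 0.639 figure:

    public class ShannonEntropy {
        // H = -sum over bins of p_i * ln(p_i), skipping empty bins
        static double entropy(double[] counts) {
            double total = 0.0;
            for (double c : counts) total += c;
            double h = 0.0;
            for (double c : counts) {
                if (c > 0) {
                    double p = c / total;
                    h -= p * Math.log(p);
                }
            }
            return h;
        }

        public static void main(String[] args) {
            // 1,000 + 8,000 + 1,000 peptides in three intensity bins
            System.out.println(entropy(new double[] {1000, 8000, 1000}));  // prints ~0.639
        }
    }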

I'm interested in the entropy of a distribution of elements. Let's say you have 10 elements that can be placed into 10 different bins. The maximum entropy would correspond to the state in which one element is in each bin (a very flat-looking distribution). The minimum entropy would correspond to the state in which all of the elements are in one bin (a very lopsided-looking distribution). Both of these distributions would look kind of boring. Then there would be many intermediate-entropy distributions that fall in between these two extremes. Perhaps the most interesting distribution would be the one with the maximum number of bins holding different numbers of elements. I perceive this as the most organized, hierarchical, interesting type of distribution, and I predict that this might be the type of distribution whose entropy relative to the maximum entropy is most similar to that of life and "beautiful" things in general. I wonder whether there would be any relationship between the golden ratio and such a distribution. Perhaps the ratio of the entropy of this distribution to the maximum entropy for that distribution would correspond to the golden ratio, or perhaps there is some other relationship. I am quite interested in this.

Maybe I could generate some code that would take n elements, create every possible distribution of n elements, find the distribution in which there is the greatest number of bins with different numbers of elements in them, and then compare the entropy of that distribution with the maximum entropy for that distribution. 10 elements might be too small, but with a program I could test any number of elements, as long as the problem remains computationally feasible. How can I create every possible distribution given n elements? Perhaps I can map this to a counting problem as I have done in the past.

Let's start with a simple concrete example. How many ways can 10 elements be put into 10 bins? All 10 e (elements) could be put into the 1st b (bin). All 10 e could also be put into the 2nd, 3rd, 4th, etc. b, but this would not change the calculated value of the entropy. Then 9 e could go in one b with 1 e in another b, and so on. Listing each arrangement by the number of elements per occupied bin:

    10
    9,1
    8,2  8,1,1
    7,3  7,2,1  7,1,1,1
    6,4  6,3,1  6,2,2  6,2,1,1  6,1,1,1,1
    5,5  5,4,1  5,3,2  5,3,1,1  5,2,2,1  5,2,1,1,1  5,1,1,1,1,1
    4,4,2  4,4,1,1  4,3,3  4,3,2,1  4,3,1,1,1  4,2,2,2  4,2,2,1,1  4,2,1,1,1,1  4,1,1,1,1,1,1
    3,3,3,1  3,3,2,2  3,3,2,1,1  3,3,1,1,1,1  3,2,2,2,1  3,2,2,1,1,1  3,2,1,1,1,1,1  3,1,1,1,1,1,1,1
    2,2,2,2,2  2,2,2,2,1,1  2,2,2,1,1,1,1  2,2,1,1,1,1,1,1  2,1,1,1,1,1,1,1,1
    1,1,1,1,1,1,1,1,1,1

I think that's all of the possible distributions, so that corresponds to 42 possible arrangements for 10 elements. Also, the arrangement that has the greatest number of bins with different numbers of elements would be (4,3,2,1): since 1+2+3+4 = 10 is the smallest possible sum of four distinct bin sizes, four different bin sizes is the most you can get with 10 elements.

Is there a way that we could have predicted that there are 42 distributions given 10 elements? Maybe this is some type of combinatorics problem?

Hmmm. . . Well, this may not be the optimal way to approach this problem, but I think I have a way of generating all of the unique distributions given n elements. It would involve several steps, and it does turn this into a counting problem.

I'm kind of going to count way more than I need to and then start throwing numbers out. Let each bin be represented by a binary number with as many digits as there are elements. The binary number representing all of the bins then has "number of bins" * "number of elements" digits, so there would be 10*10 = 100 digits for this 10-element scenario, and the count runs from 0 up to 2^100 - 1. The procedure:

1. Count from 0 to 2^100 - 1.
2. Only keep numbers whose digits sum to the total number of elements.
3. For the numbers left, split the number into 10 separate numbers (one for each bin).
4. Discard bins that contain no elements, and take each remaining bin's digit sum as its element count.
5. Sort each list of bin counts in descending order.
6. Remove any lists that are identical.
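Here is a minimal Java sketch of that procedure (class name mine), shrunk to n = 3 elements and 3 bins so it actually fits in memory: 3*3 = 9 binary digits means only 2^9 = 512 numbers to scan instead of 2^100.

    import java.util.*;

    public class BruteForceDistributions {
        public static void main(String[] args) {
            int n = 3;  // number of elements (and bins)
            Set<List<Integer>> unique = new HashSet<List<Integer>>();
            for (long code = 0; code < (1L << (n * n)); code++) {
                if (Long.bitCount(code) != n) continue;        // step 2: digits must sum to n
                List<Integer> bins = new ArrayList<Integer>();
                for (int b = 0; b < n; b++) {                  // step 3: one n-digit chunk per bin
                    int count = Long.bitCount((code >> (b * n)) & ((1L << n) - 1));
                    if (count > 0) bins.add(count);            // step 4: discard empty bins
                }
                Collections.sort(bins, Collections.reverseOrder());  // step 5: sort descending
                unique.add(bins);                              // step 6: duplicates collapse in the set
            }
            System.out.println(unique);  // the three distributions of 3 elements: [3], [2,1], [1,1,1]
        }
    }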

Using the above process, I should end up with a list like the one I generated above.

Alright, when I tried to count up to 2^100 and store the results in an ArrayList, Java ran out of memory, so it looks like I'll have to come up with a better way.

This problem is essentially the question "What are all of the possible ways to sum to a given number?", and that is the same as partitioning an integer: http://en.wikipedia.org/wiki/Partition_%28number_theory%29#Partition_function Euler's generating function counts the number of ways of splitting an integer into a sum of positive integers, without regard to order.

Computing the partitions of n: http://www.mathpages.com/home/kmath383.htm
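That page derives Euler's method; as a quick cross-check, here is a short dynamic-programming sketch in Java (my own, not the method from the link) that just counts the partitions:

    public class PartitionCount {
        // ways[t] = number of partitions of t using the part sizes considered so far
        static long p(int n) {
            long[] ways = new long[n + 1];
            ways[0] = 1;  // one way to make 0: the empty sum
            for (int part = 1; part <= n; part++)
                for (int total = part; total <= n; total++)
                    ways[total] += ways[total - part];
            return ways[n];
        }

        public static void main(String[] args) {
            System.out.println(p(10));  // 42, matching the list above
        }
    }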

Sweet, and now here's the Java code to generate all of the partitions of an integer, like I was looking for.

http://introcs.cs.princeton.edu/java/23recursion/Partition.java.html
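The heart of that code is a short recursion; roughly (this sketch is my paraphrase, not the file itself), each partition is built by appending parts no larger than the previous one:

    public class Partitions {
        static void partition(int n, int max, String prefix) {
            if (n == 0) { System.out.println(prefix); return; }  // nothing left: print the partition
            for (int part = Math.min(max, n); part >= 1; part--)
                partition(n - part, part, prefix + " " + part);  // next part can't exceed this one
        }

        public static void main(String[] args) {
            partition(10, 10, "");  // prints all 42 partitions of 10
        }
    }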

Great!

See Partition Class

Here's the class I made: IntermediateEntropyAnalyzer

Alright that code got me pretty far and has yielded some interesting results. So far I am seeing two different ratios.

(entropy of the distribution that has the maximum number of bins with different numbers of elements and, among those, the greatest entropy)/(maximum entropy for a distribution with this number of elements) -> seems to approach the golden ratio

(entropy of the distribution that has the maximum number of bins with different numbers of elements and, among those, the least entropy)/(maximum entropy for a distribution with this number of elements) -> seems to approach 1/2
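For concreteness, the ratio being reported is sketched below (method name mine); the maximum entropy for n elements is ln(n), from the flat distribution with one element per bin.

    // entropy of a distribution's bin counts divided by the max entropy ln(n)
    static double entropyRatio(int[] bins, int n) {
        double h = 0.0;
        for (int c : bins) {
            double p = (double) c / n;
            h -= p * Math.log(p);
        }
        return h / Math.log(n);
    }
    // e.g. entropyRatio(new int[] {7, 2, 1}, 10) is about 0.348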

Also, I am seeing that for large numbers the Partition class takes a very long time to compute everything. However, after seeing the pattern of distributions for different numbers, I can probably pick out the most organized distributions with the max and min entropy without generating every single possible distribution. For example, for a distribution with 50 elements, the following applies:

distribution with max entropy: 9, 8, 7, 6, 5, 4, 3, 2, 1, 1, 1, 1, 1, 1
distribution with min entropy: 14, 8, 7, 6, 5, 4, 3, 2, 1

Once I have the distribution with max entropy, the distribution with min entropy is pretty simple to get from it: just take the highest number and add all of the 1s except one to it, so that there is only one 1 left.

Here's how I could generate the distribution with max entropy.

(In runnable Java; the method name and list representation are my own, and java.util imports are assumed.)

    static List<Integer> mostOrganizedMaxEntropy(int n) {
        int pool = n;                  // start with all n elements as 1s
        int highest = 1;               // highest number so far
        List<Integer> parts = new ArrayList<Integer>();
        while (pool >= highest + 1) {  // enough 1s to exceed the highest number so far by one?
            parts.add(highest + 1);    // combine 1s into a part one larger than the highest
            pool -= highest + 1;
            highest++;
        }
        for (int i = 0; i < pool; i++) parts.add(1);  // leftover 1s stay as 1s
        Collections.sort(parts, Collections.reverseOrder());
        return parts;                  // e.g. n = 50 gives 9,8,7,6,5,4,3,2,1,1,1,1,1,1
    }
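And the companion rule from above for the min-entropy version, again as a sketch with my own naming:

    // take the max-entropy organized distribution and fold all but one of its
    // 1s into the largest part, leaving a single 1 behind
    static List<Integer> mostOrganizedMinEntropy(int n) {
        List<Integer> parts = mostOrganizedMaxEntropy(n);  // sorted descending
        int ones = 0;
        while (parts.size() > 1 && parts.get(parts.size() - 1) == 1) {
            parts.remove(parts.size() - 1);  // count and remove the trailing 1s
            ones++;
        }
        if (ones > 0) {
            parts.set(0, parts.get(0) + ones - 1);  // fold all but one 1 into the largest part
            parts.add(1);                           // put the single remaining 1 back
        }
        return parts;  // e.g. n = 50 gives 14,8,7,6,5,4,3,2,1
    }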

I'll represent a distribution as a "pool of 1s" plus "other numbers" so I can quickly perform operations on these distributions.

Alright, with this new code I looked at the entropy of the first 10,000 possible distributions. I now see that neither the min- nor the max-entropy organized distribution approaches the golden ratio or 0.5. They both seem to just approach some number a little above 0.5.

New code: OrganizedDistribution, OrganizedDistributionGenerator

Perhaps the most hierarchical distributions would approach the golden ratio? haha. Note that I looked at the distributions for 50 elements in more detail. It is interesting to compare the distributions with 0.9 entropy ratios to those with 0.1 ratios and to those with ratios near the golden ratio. Generally, the ones with entropy ratios at 0.618, and in the middle in general, look more interesting than the other two.

How could one determine the most hierarchical distributions? Perhaps they are the ones with the greatest number of different repeated elements, in which the smaller numbers are repeated more often than the larger numbers. I'm not sure exactly how you would quantitate this, but maybe there is a way.

I'm kind of curious about whether a distribution with a Fibonacci distribution of elements or a prime number distribution of elements would approach an interesting entropy ratio. Maybe I could try this out sometime.

3-5-12

I took a quick look at whether different types of distributions, such as a prime-number type, Fibonacci type, or golden-ratio type, had entropy-to-max-entropy ratios that approached 0.618. There is a possibility that the prime-number-type distribution approaches this number, or maybe it just keeps decreasing forever and will fall below it. I have not tested enough prime numbers to tell.

Files found here: L:\storage\Documents\Miscellaneous\Intermediate Entropy Analysis\diff distr

the first fifty million primes found here: http://primes.utm.edu/lists/small/millions/

If I did want to look at this further, perhaps I should just do some type of regression to get an equation that matches the current millions of points. Then I can take the limit of this function to see what it seems to be approaching.

3-11-12 Here's an e-mail I sent to John Lainson about tree branch diameters.

An idea struck me this morning. I've been trying to find the optimal type of entropy that life prefers. Then I was trying to find the most optimal or organized or hierarchical types of distributions, which might resemble life in some ways. Then this morning I was thinking I could just get the distribution from life directly. Before, I wasn't sure how I would do this. Now I think I could just measure the diameter of all of the branches on a tree. In a tree some of the branches are thick, then more are thinner, and then even more are thinner, etc. This is a very organized, hierarchical, fractal-like pattern. From this data, I could construct a histogram of the frequency of branches that lie within different diameter size ranges. In fact, perhaps I can just find algorithms which produce tree-like fractals, rather than measuring an actual tree, and use those to do my analysis. From such an algorithm, I could see how it relates to fundamental mathematical concepts such as the Fibonacci sequence or prime numbers or whatever.

-Kurt

Here are some e-mails to Johnny and Cosmo about intermediate entropies, varying stimulation, and power laws.
Here's an e-mail I sent to Cosmo about varying stimulation.
Here's a follow-up e-mail to Johnny about the optimal entropy for life.

A Google search for "entropy of a power law distribution" also seems to yield a great deal of interesting information.

I think the Wikipedia article on scale-free networks has the information I need to make the distribution that I want.

Here's an article that evaluates whether different systems actually follow a power law. http://arxiv.org/pdf/0706.1062v2.pdf

3-31-12 I'll make a program that creates a scale-free network distribution (the y-axis would be frequency and the x-axis would be the number of connections to a node) for a distribution with a certain number of elements. I will then calculate the entropy of this distribution and compare it to the maximum possible entropy for a distribution with that many elements.

The formula for a scale-free network is P(k) = c*k^(-gamma), where P(k) is the fraction of nodes in the network having k connections to other nodes, c is a normalization constant (c = 1 here), and gamma typically satisfies 2 < gamma < 3.
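Here is a minimal sketch of that calculation (my own stand-in, not the Scale_Free_Network class itself), assuming a finite number of degree bins maxK and normalizing c so the fractions sum to 1:

    public class ScaleFreeEntropy {
        public static void main(String[] args) {
            double gamma = 2.5;  // typical range: 2 < gamma < 3
            int maxK = 1000;     // number of degree bins; an arbitrary choice for this demo
            double[] p = new double[maxK];
            double norm = 0.0;
            for (int k = 1; k <= maxK; k++) norm += Math.pow(k, -gamma);
            for (int k = 1; k <= maxK; k++) p[k - 1] = Math.pow(k, -gamma) / norm;

            double h = 0.0;               // Shannon entropy of the power-law distribution
            for (double pi : p) if (pi > 0) h -= pi * Math.log(pi);
            double hMax = Math.log(maxK); // max entropy: uniform over maxK bins
            System.out.println("entropy ratio = " + (h / hMax));
        }
    }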

Scale_Free_Network Java Class

It looks like the entropy of a scale-free network decreases as the number of nodes increases, down to a certain intermediate-entropy asymptotic value.



E-mail about the effect of environment entropy on complex adaptive system entropy