Reading Shane Legg's post “unreasonable-effectiveness” and the related article “The Unreasonable Effectiveness of Data” by Alon Halevy, Peter Norvig and Fernando Pereira, I am pleased to see researchers' attention moving toward the investigation of huge data sets as an important resource for solving problems.

I disagree with the word “unreasonable”: I think this effectiveness is an inevitable consequence of the theoretical hypothesis that emerges from the empirical evidence.

In the article, Norvig and his co-authors discuss text translation, text comprehension, Web 2.0, and so on. I claim that a huge set of good data is inevitably a good resource for speeding up every “difficult” problem. (One terabyte of “0” is useless; which data is good? A first step here.) In the previous post I showed how the Solomonoff universal distribution changes under a limit M on the available resources, and in that example it is possible to see how important it is to know the set of existing data. Knowledge of a huge data set makes it possible to increase our knowledge of *R*, and this lets us know a General Real Distribution: a distribution of the real world, a sub-universal distribution that is simpler to compute (for the difference from exponential behaviour see **The Shannon Discrepance 1 2**) and correct for real problems.
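
To make the remark about one terabyte of “0” concrete, here is a minimal sketch that uses compressed size as a very rough proxy for how much information a data set carries. This is my own illustration, not a definition of “good” data: the helper name `compression_ratio` and the choice of zlib are assumptions, and a general-purpose compressor only gives a crude upper bound on the true description length.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Return compressed size / original size (lower means more redundant).

    Hypothetical helper for illustration only; zlib is a crude stand-in
    for the (uncomputable) shortest description of the data.
    """
    if not data:
        raise ValueError("data must not be empty")
    return len(zlib.compress(data, 9)) / len(data)


# One megabyte of the character "0": huge in size, almost no information.
zeros = b"0" * 1_000_000

# One megabyte of random bytes: incompressible, but pure noise.
noise = os.urandom(1_000_000)

print(f"zeros: {compression_ratio(zeros):.4f}")
print(f"noise: {compression_ratio(noise):.4f}")
```

The zeros shrink to a tiny fraction of their size, while the random bytes do not shrink at all. Neither extreme is useful: the first has nothing to say and the second has no structure to learn from. Under this rough reading, the data that helps with a “difficult” problem sits between the two, with regularities that reflect the real world.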
