The Inevitable Effectiveness of Data

Reading Shane Legg's post “unreasonable-effectiveness” and the related article “The Unreasonable Effectiveness of Data” by Alon Halevy, Peter Norvig and Fernando Pereira, I am pleased to see researchers' attention moving toward the investigation of huge data sets as an important resource for solving problems.

I disagree with the word “unreasonable”. I think the effectiveness is an inevitable consequence of the theoretical hypothesis emerging from empirical evidence.

In the article, Norvig and his co-authors discuss text translation, text comprehension, Web 2.0, and so on. I claim that a huge set of good data is inevitably a good resource for speeding up every “difficult” problem. (One terabyte of “0” is useless; which data is good? A first step here.) In the previous post I showed how the Solomonoff universal distribution changes under a limit M on the available resources, and that example shows how important it is to know the set of existing data. Knowledge of a huge data set makes it possible to increase our knowledge of R, and this in turn lets us know a General Real Distribution: a distribution in the real world, a sub-universal distribution that is simpler to compute (for the difference from exponential behaviour see The Shannon Discrepance 1 2) and correct for real problems.
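
The remark that one terabyte of “0” is useless can be illustrated with a small sketch. It is not the method of the post or of the article: it assumes that the compression ratio from Python's standard zlib module can serve as a crude, computable stand-in for information content, and the helper name compression_ratio is hypothetical. The proxy only rules out trivially redundant data; random noise is incompressible too and is no more useful, so it does not answer which data is good.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (lower means more redundant)."""
    if not data:
        raise ValueError("data must be non-empty")
    return len(zlib.compress(data, 9)) / len(data)


# A scaled-down version of "1 terabyte of 0": one megabyte of the character "0".
zeros = b"0" * 1_000_000

# Random bytes for contrast: incompressible, but not informative either.
noise = os.urandom(1_000_000)

print("zeros:", compression_ratio(zeros))
print("noise:", compression_ratio(noise))
```

The run of zeros shrinks to a tiny fraction of its size, while the random bytes barely shrink at all. Good data, in the sense of the claim above, presumably lies between these two extremes, with structure that reflects the real world.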

