Tuesday, 19 June 2018

Is data science the same as statistics?

The confusion continues to this day
"Statistics, according to the American Statistical Association, is the “science of learning from data”. So there is huge scope of confusing data science with statistics. Statistics is a data-driven science, but it focusses on developing theories based on data insights. In the early 1900s, William Gosset, under the pseudonym Student, used the Guinness brewery data to develop the famous Student’s t-distribution. Was he a data scientist? Important theories of statistics were developed by small data quite often. Take an interesting example from the 1930s. A woman colleague of the legendary statistician R.A. Fisher claimed that she could identify whether tea or milk was added first to a cup. In order to verify this, Fisher prepared eight cups of tea, of which milk was added first in four cups. The woman could correctly identify six cups, three from each group. Fisher analysed the data by his newly developed Fisher’s exact test. Half a century on, this ‘Lady Tasting Tea’ experiment would be treated as one of the two supporting pillars of the randomisation analysis of experimental data. There is no doubt that statistics was primarily data-driven. In 1997, C.F. Jeff Wu gave a famous lecture entitled “Statistics=Data Science?” at the University of Michigan. The confusion somewhat continues till date."

