You have probably come across the terms Data Science, Data Analytics, or Big Data in the media in recent years. Forbes, for example, has reported that data scientist was ranked the best job in the United States for three years in a row, with a median salary of $110,000 a year. At an exchange rate of 2,851 pesos per dollar, that is about 313.61 million pesos per year, or roughly 26 million pesos per month (Davenport & Patil, 2012).
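If you want to check the conversion yourself, here is a minimal sketch in Python. It uses only the two figures quoted above (the median salary and the exchange rate); the variable names are ours and purely illustrative.

```python
# Figures quoted in the paragraph above.
MEDIAN_SALARY_USD = 110_000  # median annual salary in US dollars
COP_PER_USD = 2_851          # exchange rate used in this post (pesos per dollar)

# Convert the annual salary to Colombian pesos, then split it into 12 months.
annual_cop = MEDIAN_SALARY_USD * COP_PER_USD
monthly_cop = annual_cop / 12

print(f"{annual_cop / 1e6:.2f} million pesos per year")
print(f"{monthly_cop / 1e6:.1f} million pesos per month")
```

With a different exchange rate you only need to change `COP_PER_USD`.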
If you live in Colombia, you may have read that Colombia is the ninth country in the world with a Big Data policy, after the United States (2012), Australia (2013), the United Kingdom (2013), South Korea (2013), Japan (2013), the European Union (2014), France (2014), and China (2014) (National Planning Department (DNP), 2017).
(Note: Something you will notice in all our entries is that we constantly cite official sources, indexed articles, and books. We hope they complement the information you are looking for. We also want you to know that we do not pull the information we share out of a hat, and we hope these sources raise new questions or suggestions that you can share with us 😊.)
Or have you read that Colombia's Ministry of Information and Communication Technologies (MinTIC) opened a call in early 2018 to train 200 people in data analytics and IT (MinTIC, 2017)?
Now, if you are like I was a few years ago, everything I have just told you means next to nothing to you. I understand: if you search Google for "Data Analytics", you will see 666,000,000 results!
The purpose of this blog is to try to answer questions such as the one in the title of this first entry, along with many others related to the topic. You may also wonder why we are opening this blog. The answer is that we are passionate about this topic, and we believe it is crucial to share it in an easy-to-understand way that gets people excited about it (students, professionals, business people, retirees; really anyone, since there is data everywhere).
So let’s start with a brief description of several terms.
Data Science Fields
Source: Dahl Winters (2015)
What is Data Science? Is it something 100% new? No. It is associated with business analytics, business intelligence, data analytics, among other terms (we will soon have entries for these terms as well).
Data science has been applied for a long time, but the name was coined only recently. (Note: In 2008, D.J. Patil and Jeff Hammerbacher coined the term. At the time, the former led the data analytics department at LinkedIn and the latter at Facebook.) What has it been applied to? Foreman (2014) defines it as the science that transforms data, through mathematics and statistics, into valuable insights, decisions, and products. To mathematics and statistics I would add other disciplines such as data engineering, pattern recognition and machine learning, visualization, uncertainty modeling, data storage, and high-performance computing (HPC). Data analytics is also related to data science, since it is the part in charge of extracting those valuable insights from data using many tools, which we will cover in another post.
And Big Data?
According to the United Nations in 2012, Big Data refers to massive volumes of data, both structured (e.g., databases) and unstructured (e.g., social networks, tweets, videos), that are too large and complicated to process with traditional databases and software (UN Global Pulse, 2012). Doug Laney articulated three keywords to define Big Data: volume, velocity, and variety (SAS, n.d.). Massive volume is hard to grasp in the language of systems engineers (massive volume = many terabytes or even exabytes of information), but we can turn it into tangible objects that we know. For example, the total amount of data held by the 16 ministries of Colombia (1,000 terabytes = Big Data) is equivalent to about 222,000 DVDs (DNP, 2016).
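To make the DVD comparison concrete, here is a rough sketch of the arithmetic in Python. The DNP figure does not say which DVD capacity it assumes, so the 4.5 GB value below is our assumption, worked backward from the quoted 222,000 DVDs; the nominal capacity of a single-layer DVD is 4.7 GB. The function name `dvds_needed` is hypothetical and only for illustration.

```python
import math

GB_PER_TB = 1_000  # decimal units: 1 terabyte = 1,000 gigabytes


def dvds_needed(total_tb: float, dvd_capacity_gb: float) -> int:
    """Return how many DVDs are needed to hold total_tb terabytes."""
    total_gb = total_tb * GB_PER_TB
    # Round up: a partially filled DVD still counts as one DVD.
    return math.ceil(total_gb / dvd_capacity_gb)


# Assumption: about 4.5 GB per DVD, which roughly reproduces the quoted figure.
print(dvds_needed(1_000, 4.5))
# Nominal single-layer DVD capacity (4.7 GB), shown for comparison.
print(dvds_needed(1_000, 4.7))
```

Either way the order of magnitude is the same: a couple of hundred thousand DVDs.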
Big Data brings several challenges in storage, processing, and security, among other aspects, especially because it grows exponentially and is extremely varied (databases, video or voice recordings, images, social networks, among others). It is estimated that every minute, 48 hours of video are uploaded to YouTube, 527 web pages are created, 204,166,667 emails are sent, 3,600 photos are shared on Instagram, and 684,478 individuals share content on Facebook (Simon, 2013).
The world's hottest job?
Finally, according to the October 2012 edition of the Harvard Business Review, the data scientist had (and still has) the sexiest job of the 21st century: demand for these professionals exceeds the current supply, which makes them valuable and explains why they are now among the highest-paid professionals in the world. For example, it was estimated that in the United States there would be a shortage of between 140,000 and 190,000 professionals in this field by this year (2018) (Simon, 2013). And what does this professional do? According to IBM's definition:
"What sets data scientists apart is business acumen, along with their ability to communicate their findings to both IT and administrative people, so they can influence how an organization addresses a business challenge. Good data scientists will not only tackle business problems. They will choose the right problems that have the most value for the organization." (Own translation)
That is, a data scientist must combine the skills and knowledge of a business expert (such as a business administrator or organization manager), a systems engineer, and a statistician.
The reality is that a single person is unlikely to know the last two areas in depth, and it is even harder for that person to also know various productive sectors. That is why, today, the data scientist is often a group of individuals from various disciplines who work in synergy to fulfill that role, applying data science and offering organizations the value of data through data analytics.
- Davenport, T.H. & Patil, D.J. (2012, October). Data Scientist: The Sexiest Job of the 21st Century. Harvard Business Review. Retrieved from https://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century
- Departamento Nacional de Planeación (2016, March). “Colombia entra a las grandes ligas del Big Data”: Simón Gaviria Muñoz. Retrieved from https://www.dnp.gov.co/Paginas/%E2%80%9CColombia-entra-a-las-grandes-ligas-del-Big-Data%E2%80%9D–Sim%C3%B3n-Gaviria-Mu%C3%B1oz-.aspx
- Departamento Nacional de Planeación (2017, October). Colombia será el noveno país del mundo en tener una política de Big Data: DNP. Retrieved from https://www.dnp.gov.co/Paginas/Colombia-ser%C3%A1-el-noveno-pa%C3%ADs-del-mundo-en-tener-una-pol%C3%ADtica-de-Big-Data-DNP-.aspx
- Foreman, J. W. (2014). Data smart: Using data science to transform information into insight. John Wiley & Sons.
- Ministerio de Tecnologías de Información y Comunicaciones (2017, December). 200 ciudadanos podrán formarse en analítica de datos y TI con la convocatoria de Científicos de Datos. Retrieved from http://www.mintic.gov.co/portal/604/w3-article-62098.html
- SAS. (n.d.). Big Data: what is it and why it matters. Retrieved from https://www.sas.com/en_us/insights/big-data/what-is-big-data.html
- Simon, P. (2013). Too big to ignore: the business case for big data. John Wiley & Sons.
- UN Global Pulse (2012). Big Data for Development: Challenges and Opportunities. Retrieved from http://www.unglobalpulse.org/projects/BigDataforDevelopmet