What is data science?

What is data science and why is it important now?
Author: Bohitesh Misra (bohitesh.misra@gmail.com), September 2017

Data Science!
Fundamentally, in layman's terms, data scientists collect data from various sources, clean it, organize it and shape it so that it can be analyzed. The data can be separated into training and testing sets to experiment with and assess an algorithm or model developed using statistics, which can then be applied to any area or sector we find suitable. Data mining helps end users extract useful business information from large databases.

Asking the right questions
Asking the right questions is extremely important, and hence apt communication skills are essential for data scientists. With the advent of technology and the internet, we now have instant access to data and the technology to test our interpretations and make decisions rapidly.

Data scientist
Data scientists use their data and analytical ability to find and interpret rich data sources; manage large volumes of data; merge data sources; ensure consistency of datasets; create visualizations that help in understanding data; build mathematical models using the data; and present and communicate the data insights and findings to business decision makers. "Data scientist" has become a popular buzzword, with Harvard Business Review dubbing it "The Sexiest Job of the 21st Century" and McKinsey & Company projecting a global excess demand of 1.5 million new data scientists.

Statistical models
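As a rough illustration of the training/testing idea mentioned earlier, the sketch below fits a straight line to part of a small dataset and then checks it against the held-out part. Everything here is an assumption made for illustration: the data is made up, and the helper names `train_test_split`, `fit_line` and `mean_squared_error` are hypothetical, written with the Python standard library only. It is not the author's method, just one simple way the idea could look in practice.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle rows reproducibly and split them into (train, test) lists."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def fit_line(points):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def mean_squared_error(points, slope, intercept):
    """Average squared gap between observed y and the line's prediction."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in points) / len(points)

# Made-up data (assumption): y is roughly 2x + 1 plus a little noise.
noise = random.Random(42)
data = [(x, 2 * x + 1 + noise.uniform(-0.5, 0.5)) for x in range(20)]

train, test = train_test_split(data)        # 15 rows to learn from, 5 held back
slope, intercept = fit_line(train)          # learn only from the training rows
test_error = mean_squared_error(test, slope, intercept)  # assess on unseen rows
print(f"slope={slope:.2f} intercept={intercept:.2f} test MSE={test_error:.3f}")
```

Keeping the test rows out of the fitting step is the point of the split: the error measured on them gives an estimate of how the model might behave on data it has not seen.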
How does data mining work?
It works the same way a human being does: it uses historical information to learn for the future. Mathematical tools such as linear algebra, probability, statistics, calculus, regression, clustering and predictive analysis are indispensable in data science. Python and R are the preferred programming languages, with packages and libraries built specifically for data science that let us learn programming and start applying it. I began with R and use basic libraries for text and data mining.

Data Cleaning
Data cleaning is often said to make up about 80% of a data scientist's work. Data is sometimes available in convenient formats such as csv and xls, but very little data is directly ready to be processed programmatically. APIs, web scraping and SQL come to the rescue of data scientists. Spark and MapReduce are used to clean and analyze large, distributed datasets.

It's everywhere!
Data-driven solutions are being used everywhere, from e-commerce websites and social networking sites to financial visualization and interpretation. Companies have increasingly adopted data-driven practices over the last few years. In fact, it would be difficult to find a sector in which data science cannot be used to make better decisions, and companies are slowly realizing this and adopting it.

Want to learn it?
I came across data science, decided it was the right fit for me, and recently completed the Executive Management Programme from the Indian Institute of Technology Delhi in the same subject. Learning data science is easy and convenient, with the large number of MOOCs and eBooks available for free online. I urge you to think about how it may be applied to you, whether it is your business where you can gather data in the form of reviews and opinions of
customers to make better data-driven decisions, or your free time, where you can use the data from movie review sites to choose your next movie.

Data science for Startups
Startups critically need a data strategy around the collection, storage and usage of large volumes of data, so that the data serves the purpose behind the startup's selling point and can also open up additional monetisation avenues in the future. A common case is a recommendation engine, which can benefit from all kinds of information about the users: age, gender, purchases, offerings and discounts. Designing the platform in a way that improves information collection from its users results in a large database that can be used to better manage discount deals, improve advertising or even improve the user experience on the platform. A clear data strategy can provide startups with additional revenue opportunities and a competitive advantage.
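To make the recommendation engine case a little more concrete, here is a minimal sketch that suggests items based on what other users bought together. It is only an illustration under stated assumptions: the purchase histories are invented, the function names `build_co_purchase_counts` and `recommend` are hypothetical, and it uses purchases alone, ignoring the other signals mentioned above (age, gender, offerings, discounts). A real engine would likely be far more elaborate.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase histories (assumption): user -> set of items bought.
purchases = {
    "user_a": {"laptop", "mouse", "keyboard"},
    "user_b": {"laptop", "mouse"},
    "user_c": {"mouse", "keyboard", "monitor"},
    "user_d": {"laptop", "monitor"},
}

def build_co_purchase_counts(purchases):
    """Count how many users bought each pair of items together."""
    counts = Counter()
    for items in purchases.values():
        for a, b in combinations(sorted(items), 2):
            counts[(a, b)] += 1
            counts[(b, a)] += 1  # store both directions for easy lookup
    return counts

def recommend(user, purchases, counts, top_n=2):
    """Rank items the user does not own by co-purchases with items they own."""
    owned = purchases[user]
    scores = Counter()
    for (a, b), n in counts.items():
        if a in owned and b not in owned:
            scores[b] += n
    # Highest score first; ties broken alphabetically for a stable result.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]

counts = build_co_purchase_counts(purchases)
print(recommend("user_b", purchases, counts))  # prints ['keyboard', 'monitor']
```

The more purchase data the platform collects, the more pairs the counts cover, which is one way to read the point about designing the platform to improve information collection.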