Thursday 12 September 2019

Tools and Components of Data Science

  Tools for Data Science
   The following tools are commonly used in data science:
  • Data analysis tools: R, Python, Statistics, SAS, Jupyter, RStudio, MATLAB, Excel, RapidMiner.
  • Data warehousing tools: ETL, SQL, Hadoop, Informatica/Talend, AWS Redshift.
  • Data visualization tools: R, Jupyter, Tableau, Cognos.
  • Machine learning tools: Spark, Mahout, Azure ML Studio.

Data Science Components:


The main components of Data Science are given below:

1. Statistics: Statistics is one of the most important components of data science. It is the practice of collecting and analyzing large amounts of numerical data and finding meaningful insights in it.
2. Domain expertise: Domain expertise binds data science together. It means specialized knowledge or skill in a particular area. Data science is applied in many areas, and each of them needs its own domain experts.
3. Data engineering: Data engineering is the part of data science that involves acquiring, storing, retrieving, and transforming data. It also includes attaching metadata (data about data) to the data.
4. Visualization: Data visualization means representing data in a visual context so that people can easily understand its significance. It makes large amounts of data accessible in visual form.
5. Advanced computing: Advanced computing does the heavy lifting of data science. It involves designing, writing, debugging, and maintaining the source code of computer programs.
6. Mathematics: Mathematics is a critical part of data science. It involves the study of quantity, structure, space, and change. A good knowledge of mathematics is essential for a data scientist.
7. Machine learning: Machine learning is the backbone of data science. It is about training a machine on data so that it can learn from examples and act somewhat like a human brain, rather than following only hand-written rules. Data science uses various machine learning algorithms to solve problems.
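
As a rough illustration of how the statistics and machine learning components work together, here is a minimal sketch in Python using only the standard library. The data values, the variable names, and the `fit_line` helper are invented for this example; a real project would more likely use one of the tools listed earlier.

```python
import statistics

# Hypothetical sample (made-up values): hours studied vs. exam score.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 55.0, 61.0, 64.0, 68.0]

# Statistics: summarize the numerical data.
mean_score = statistics.mean(scores)
stdev_score = statistics.stdev(scores)  # sample standard deviation


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    x_bar = statistics.mean(xs)
    y_bar = statistics.mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Machine learning (in its simplest form): learn a line from the examples,
# then use it to predict the score for an unseen input of 6 hours.
slope, intercept = fit_line(hours, scores)
predicted = slope * 6.0 + intercept

print(f"mean={mean_score:.1f}, stdev={stdev_score:.2f}")
print(f"score ~ {slope:.2f} * hours + {intercept:.2f}")
print(f"predicted score for 6 hours: {predicted:.1f}")
```

The summary numbers describe the data that was collected, while the fitted line generalizes from it to a case that was never observed.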

Difference between BI and Data Science
BI stands for business intelligence, which is also used to analyze business information. Some differences between BI and data science are listed below:

1) Business intelligence
Data Source: Business intelligence deals with structured data, e.g., a data warehouse.
Method: Analytical (historical data)
Skills: Statistics and visualization are the two skills required for business intelligence.
Focus: Business intelligence focuses on both past and present data.

2) Data science
Data Source: Data science deals with structured and unstructured data, e.g., weblogs, feedback, etc.
Method: Scientific (goes deeper to find the reasons behind what the data reports)
Skills: Statistics, visualization, and machine learning are the required skills for data science.
Focus: Data science focuses on past data, present data, and also future predictions.
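
The difference in focus can be sketched with a small Python example. The sales records below are made up, and the forecast is deliberately the simplest possible one (a naive drift forecast that extends the average month-over-month change); it only stands in for the predictive models a data scientist would actually build.

```python
from collections import defaultdict

# Hypothetical structured sales records: (month, region, revenue).
sales = [
    ("2019-06", "north", 100.0),
    ("2019-06", "south", 80.0),
    ("2019-07", "north", 110.0),
    ("2019-07", "south", 90.0),
    ("2019-08", "north", 120.0),
    ("2019-08", "south", 100.0),
]

# BI-style view: aggregate what has already happened (past and present).
revenue_by_month = defaultdict(float)
for month, _region, revenue in sales:
    revenue_by_month[month] += revenue

months = sorted(revenue_by_month)
totals = [revenue_by_month[m] for m in months]

# Data-science-style view: also estimate what happens next.
# Naive drift forecast: last total plus the average monthly change.
changes = [later - earlier for earlier, later in zip(totals, totals[1:])]
avg_change = sum(changes) / len(changes)
forecast_next = totals[-1] + avg_change

for month, total in zip(months, totals):
    print(f"{month}: {total:.0f}")
print(f"forecast for next month: {forecast_next:.0f}")
```

The monthly totals are the kind of report BI produces from a data warehouse; the last line is the extra step toward future prediction that the data science column describes.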



