Saturday 29 August 2020

apply function in R

1) apply function:
It takes three arguments: a matrix, a margin (1 for rows, 2 for columns), and a function.

Example:
m<-matrix(c(1,2,3,4),nrow=2,ncol=2)
m
#1 indicates it is applied in row
apply(m,1,sum)

#2 indicates it is applied in column
apply(m,2,sum)
apply(m,2,mean)#2 indicates it is applied in column
apply(m,2,min)#2 indicates it is applied in column

apply(m,1,mean)#1 indicates it is applied in row
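
The function passed to apply does not have to be a built-in one. Here is a minimal sketch, assuming the same matrix m as in the example; the anonymous "row spread" function is only an illustration and is not part of the original example.

```r
# Same 2x2 matrix as in the example (filled column-wise)
m <- matrix(c(1, 2, 3, 4), nrow = 2, ncol = 2)

# MARGIN = 1 applies the function to each row
row_totals <- apply(m, 1, sum)

# MARGIN = 2 applies the function to each column
col_totals <- apply(m, 2, sum)

# A user-defined (anonymous) function works the same way:
# spread of each row = largest value minus smallest value
row_spread <- apply(m, 1, function(r) max(r) - min(r))

print(row_totals)
print(col_totals)
print(row_spread)
```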

2) lapply function:
It takes a list and a function, and returns a list.

Example 1:
l<-list(a=c(1,2),b=c(3,4),c=c(5,6))
lapply(l,sum)
lapply(l,mean)

Example 2:
x<-c(4,5,9)
x
b<-c("RAM","SAM","GAM")
b
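
Example 2 only defines the two vectors. One possible way to use them with lapply is sketched below; the squaring function and the call to tolower are assumptions chosen for illustration.

```r
x <- c(4, 5, 9)
b <- c("RAM", "SAM", "GAM")

# lapply() treats each vector element as one item and always returns a list
squares <- lapply(x, function(v) v^2)
lower   <- lapply(b, tolower)

str(squares)
str(lower)
```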

3) sapply function:
It is the same as lapply, except that it simplifies the result to a vector or matrix where possible.
It takes a list and a function.
Example 1:
s<-list(a=c(1,2),b=c(3,4),c=c(5,6))
sapply(s,sum)
sapply(s,mean)
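
To see what "simplifies" means in practice, here is a small sketch that runs both functions on the same list and compares the type of the result. The variable names as_list and as_vector are made up for this illustration.

```r
s <- list(a = c(1, 2), b = c(3, 4), c = c(5, 6))

as_list   <- lapply(s, sum)   # a list with one element per input element
as_vector <- sapply(s, sum)   # simplified to a named numeric vector

print(class(as_list))
print(class(as_vector))
print(as_vector)
```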

4) tapply function:
It takes three arguments: a vector, a factor that groups the vector, and a function.

age<-c(20,50,78,89,90)
age
gender<-c("m","m","f","m","f")
gender
f<-factor(gender)
f
tapply(age,f,sum)
tapply(age,f,mean)

where age is the vector, f is the factor that groups it, and sum is the function.

Note: sum, mean, min, and max are predefined (built-in) functions.
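
A self-contained sketch of the same grouping follows. It passes the character vector directly, relying on tapply coercing its grouping argument to a factor; the result names total_by_gender and mean_by_gender are illustrative only.

```r
age    <- c(20, 50, 78, 89, 90)
gender <- c("m", "m", "f", "m", "f")

# tapply() splits age into one group per gender value,
# then applies the function to each group
total_by_gender <- tapply(age, gender, sum)
mean_by_gender  <- tapply(age, gender, mean)

print(total_by_gender)
print(mean_by_gender)
```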

factor in R

factor:
A factor stores categorical (character) values as integer codes, each mapped to a label called a level.
The factor function is used to create a factor.
The only required argument to factor is a vector of values, which is returned as a vector of factor values.

#factor: integer-coded representation of categorical (character) values

journey_type<-c("train","bus","flight","bus","flight","train")
journey_type
journey_typefact<-as.factor(journey_type)
journey_typefact
class(journey_typefact)
as.integer(journey_typefact) 

#as.integer() shows the integer codes (1, 2 and 3) stored behind the values
#bus is 1, flight is 2 and train is 3 because levels are sorted alphabetically by default
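
If alphabetical order is not what you want, the levels argument of factor lets you set the order yourself. A short sketch, where the order train, bus, flight is an arbitrary choice for illustration:

```r
journey_type <- c("train", "bus", "flight", "bus", "flight", "train")

# Default: levels are sorted alphabetically (bus, flight, train)
default_fact <- factor(journey_type)
default_codes <- as.integer(default_fact)

# Explicit level order: train = 1, bus = 2, flight = 3
custom_fact <- factor(journey_type, levels = c("train", "bus", "flight"))
custom_codes <- as.integer(custom_fact)

print(levels(default_fact))
print(default_codes)
print(levels(custom_fact))
print(custom_codes)
```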

dataframe in R

dataframe: A data frame is displayed in tabular format, has named variables (columns), and can hold mixed (heterogeneous) data types.

Example:
Name   Age   Address
jill   45    boston
nick   32    london
sam    25    india

Name: character
Age: numeric
Address: character

Note: R is case sensitive

#dataframe:displayed in tabular format, proper variable name and heterogeneous

Name<-c("jill","nick","sam")
Age<-c(45,32,25)
Address<-c("boston","london","india")
x<-data.frame(Name,Age,Address)
x
str(x)
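
Once the data frame exists, columns and rows can be pulled out of it. A minimal sketch using the same data; stringsAsFactors = FALSE is set explicitly here (an addition to the original code) so the text columns stay character in every R version.

```r
x <- data.frame(
  Name    = c("jill", "nick", "sam"),
  Age     = c(45, 32, 25),
  Address = c("boston", "london", "india"),
  stringsAsFactors = FALSE
)

ages       <- x$Age                   # one column as a vector
second_row <- x[2, ]                  # one row as a data frame
over_30    <- x[x$Age > 30, "Name"]   # names of people older than 30

print(ages)
print(second_row)
print(over_30)
```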

Thursday 27 August 2020

Data Structures In R

Data structures are used to organize and represent data.

There are different types of data structures in R, such as:

1) vector: homogeneous (same type), one dimension.

2) list: heterogeneous (different types), one dimension.

3) matrix: homogeneous, two dimensions.

4) data frame: heterogeneous, two dimensions.

5) factor: homogeneous, one dimension.

6) array: homogeneous, n dimensions.
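
The six structures can be created and inspected side by side. This is only a sketch; all object names and values are invented for illustration.

```r
v  <- c(1, 2, 3)                                   # vector: one type, 1 dimension
l  <- list(1, "a", TRUE)                           # list: mixed types, 1 dimension
m  <- matrix(1:6, nrow = 2)                        # matrix: one type, 2 dimensions
df <- data.frame(id = 1:2, name = c("a", "b"))     # data frame: mixed types, 2 dimensions
f  <- factor(c("x", "y", "x"))                     # factor: categorical values
a  <- array(1:24, dim = c(2, 3, 4))                # array: one type, n dimensions

print(dim(v))    # NULL: a plain vector has no dim attribute
print(dim(m))
print(dim(df))
print(levels(f))
print(dim(a))
```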


1) Vectors: one-dimensional, homogeneous arrays

syntax:

variable_name<-c()

Example:

a<-c(24,56,78)

a

b<-c("one","two","three")

b

x<-c(TRUE,FALSE,TRUE,FALSE)

x

y<-10   #single-element vector (scalar)

where a, b, x and y are variable names and c() combines values into a vector.

Accessing vector elements

vec<-c("a","d","w","e","g")

vec

vec[3]  #output: [1] "w"

vec[c(2,5)]  #output: [1] "d" "g"
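
Besides single positions and position vectors, R also supports ranges, negative indices and logical conditions. A brief sketch; the nums vector is an extra example value, not from the text above.

```r
vec <- c("a", "d", "w", "e", "g")

middle     <- vec[2:4]    # a range of positions
drop_first <- vec[-1]     # a negative index removes that position

nums <- c(24, 56, 78)
big  <- nums[nums > 50]   # a logical condition keeps matching elements

print(middle)
print(drop_first)
print(big)
```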

2) Matrix: a two-dimensional data structure

mat<-matrix(c(1,2,3,4),nrow=2,ncol=2)  #filled column-wise by default

mat

output:
1  3
2  4

mat1<-matrix(c(1,2,3,4),nrow=2,ncol=2,byrow=TRUE) #byrow=TRUE fills the matrix by rows

mat1

output:
1  2
3  4

Note:

Compare the two outputs to see the effect of byrow.

Accessing matrix elements

mat<-matrix(c(1,2,3,4),nrow=2,ncol=2) 

or

mat<-matrix(c(1:4),nrow=2,ncol=2)

mat[1,]  #returns the 1st row of the matrix

mat[,1]  #returns the 1st column of the matrix

mat[1,2] #returns the element in the first row and second column

Arrays: are multidimensional.

#dim=c(rows, columns, number of matrices)

x1<-array(1:12,dim=c(2,2,3)) 

x1

Accessing elements in arrays

#print the 2nd row of the 2nd matrix

print(x1[2,,2])

#print the 2nd column of the 3rd matrix

print(x1[,2,3])

#print all rows and columns of the 3rd matrix

print(x1[,,3])
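
To make the three-index notation concrete, here is a sketch that stores each selection in a variable. The variable names are illustrative; the array is the same x1 as above.

```r
# 2 rows, 2 columns, 3 matrices; values fill column-wise, matrix by matrix
x1 <- array(1:12, dim = c(2, 2, 3))

row2_mat2  <- x1[2, , 2]    # 2nd row of the 2nd matrix
col2_mat3  <- x1[, 2, 3]    # 2nd column of the 3rd matrix
single_val <- x1[1, 2, 3]   # row 1, column 2 of the 3rd matrix

print(row2_mat2)
print(col2_mat3)
print(single_val)
```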


Note:

Basic data types in R are character, numeric, integer, complex, and logical.

Objects may have attributes, such as name, dimension, and class.

String and Class in R

LEVELS: displays the levels of a factor variable in the data

TABLE: displays the count of each level present in the data

EMPLOYEE    PERFORMANCE
RAM         VG (Very good)
SAMANYU     G (Good)
JACK        EXC (Excellent)
JILL        VG (Very good)
JONE        EXC (Excellent)

LEVELS: 3 (EXC, VG, G)

TABLE: EXC=2, VG=2, G=1

str: displays the data type of each variable in the data

class: displays the class of the object (for example, data.frame)

Example: lp.csv 

Name   Age  Marks
surya  29   98
sivi   10   23
somu   24   99
bapu   26   80
jack   47   89
jone   34   79

solution:

data1<-read.csv("lp.csv")

levels(data1$Marks)  #NULL unless Marks is a factor; Marks is numeric here

table(data1$Marks)

str(data1)

class(data1)
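
The EMPLOYEE / PERFORMANCE table above can be reproduced without a file. A sketch, assuming the performance codes are stored as a factor; the object names are made up for illustration.

```r
employee    <- c("RAM", "SAMANYU", "JACK", "JILL", "JONE")
performance <- factor(c("VG", "G", "EXC", "VG", "EXC"))

perf_levels <- levels(performance)   # distinct categories, sorted alphabetically
perf_counts <- table(performance)    # number of employees in each category

print(perf_levels)
print(perf_counts)

# A numeric vector has no levels, so levels() returns NULL for it
print(levels(c(98, 23, 99)))
```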

Wednesday 26 August 2020

Calculate Mean,Median,Mode and standard deviation(SD) in R

MIN(): returns the minimum value

MAX(): returns the maximum value

MEAN(): calculated by taking the sum of the values and dividing by the number of values in a data series.

MODE(): R does not have a standard built-in function to calculate the statistical mode (the built-in mode() reports an object's storage mode instead). So we create a user-defined function to calculate the mode of a data set in R. This function takes a vector as input and gives the mode value as output.

MEDIAN(): Arrange your numbers in numerical order and count them. If the count n is odd, the median is the number at position (n + 1)/2. If n is even, the median is the average of the numbers at positions n/2 and n/2 + 1.

SD(): sd stands for standard deviation. The standard deviation of a variable is the square root of its variance. In R, var() and sd() use the sample variance, which divides the sum of squared deviations from the mean by n - 1. If a variable y is a linear transformation of x (y = a + bx), then the variance of y is b² times the variance of x, and the standard deviation of y is |b| times the standard deviation of x.

RANGE(): returns a vector containing the minimum and maximum of all the given arguments.
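
Since the mode has to be written by hand, one common way to do it is sketched below. The name get_mode is hypothetical, and when several values tie for most frequent, this version simply returns the one that appears first.

```r
# User-defined mode: the most frequent value in a vector
get_mode <- function(v) {
  uniq   <- unique(v)                    # distinct values, in order of appearance
  counts <- tabulate(match(v, uniq))     # how often each distinct value occurs
  uniq[which.max(counts)]                # first value with the highest count
}

num_mode  <- get_mode(c(2, 1, 2, 3, 1, 2))
char_mode <- get_mode(c("bus", "train", "bus"))

print(num_mode)
print(char_mode)
```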

Example: lp.csv 

Name   Age  Marks
surya  29   98
sivi   10   23
somu   24   99
bapu   26   80
jack   47   89
jone   34   79

solution:

data1<-read.csv("lp.csv")

min(data1$Marks)

max(data1$Marks)

mean(data1$Marks)

median(data1$Marks)

var(data1$Marks)

sd(data1$Marks)

range(data1$Marks)
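
If you do not have lp.csv at hand, the same calculations can be tried on an in-memory copy of the table. This is a sketch: the data frame is typed in by hand, and the last lines check numerically how var and sd react to a linear transformation (the constants 10 and -2 are arbitrary).

```r
# In-memory stand-in for lp.csv
data1 <- data.frame(
  Name  = c("surya", "sivi", "somu", "bapu", "jack", "jone"),
  Age   = c(29, 10, 24, 26, 47, 34),
  Marks = c(98, 23, 99, 80, 89, 79)
)
marks <- data1$Marks

print(min(marks))
print(max(marks))
print(mean(marks))     # sum of values divided by their count
print(median(marks))   # average of the two middle values (even count)
print(var(marks))      # sample variance (divides by n - 1)
print(sd(marks))       # square root of the variance
print(range(marks))    # minimum and maximum together

# Linear transformation y = a + b * x with a = 10, b = -2
y <- 10 - 2 * marks
print(var(y) / var(marks))   # ratio equals b squared
print(sd(y) / sd(marks))     # ratio equals the absolute value of b
```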

summary function in R

The summary function produces descriptive statistics for a whole data set or for a single variable.

For numeric data, it reports the minimum, first quartile, median, mean, third quartile, and maximum.

Note:

If the data is categorical (a factor), the summary function gives the count of each level of the variable (var).

Example: lp.csv

Name   Age  Marks
surya  29   98
sivi   10   23
somu   24   99

solution:

data1<-read.csv("lp.csv")

summary(data1)

summary(data1$Marks)
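
The difference between numeric and categorical input can be seen in a small sketch. The data frame is an in-memory copy of the three rows above, and the grade factor is an invented example.

```r
data1 <- data.frame(
  Name  = c("surya", "sivi", "somu"),
  Age   = c(29, 10, 24),
  Marks = c(98, 23, 99)
)

# Numeric column: minimum, quartiles, median, mean and maximum
marks_summary <- summary(data1$Marks)
print(marks_summary)

# Categorical data stored as a factor: one count per level
grade <- factor(c("VG", "G", "EXC", "VG", "EXC"))
grade_summary <- summary(grade)
print(grade_summary)
```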


What is data exploration?

Data exploration is the process of examining the data and learning the ins and outs of it before conducting further analysis.

1) HEAD (head): displays the top 6 obs (observations) by default

2) TAIL (tail): displays the bottom 6 obs (observations) by default

3) NROW (nrow): displays the number of obs (observations)

4) NCOL (ncol): displays the number of var (variables)

5) DIM (dim): displays the number of obs (observations) and the number of var (variables)


How to import a file in Data Science with R?

Syntax:

object_name<-read.csv("filename")

where object_name can be any name you choose.

<- or = (assignment operator)

read.csv(): a function used to import a CSV file into R.

Suppose you have a CSV file; then write read.csv("filename")

where filename is the name (or path) of your file.

Example:

data1<-read.csv("lp.csv")

where lp.csv   is my file name
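
To try the import without having lp.csv on disk, the sketch below first writes a stand-in file to a temporary location and then reads it back. The temporary path and the file contents are assumptions made for this illustration.

```r
# Write a small stand-in for lp.csv to a temporary file
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Name,Age,Marks",
  "surya,29,98",
  "sivi,10,23",
  "somu,24,99",
  "bapu,26,80",
  "jack,47,89",
  "jone,34,79"
), csv_path)

# Import the file; the first line is used as the column names
data1 <- read.csv(csv_path)
str(data1)

unlink(csv_path)   # remove the temporary file again
```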

Que: How do you run R code?

Ans: Ctrl+Enter (runs the current line or selection in RStudio)

Que: How do you display the data?

Ans: View(data1)

where data1 is my object name

More Examples are:

1)head(data1)

2)data2<-head(data1,100) #stores the top 100 observations (obs) in data2

3)tail(data1)

4)data2<-tail(data1,50) #stores the bottom 50 observations (obs) in data2

5)nrow(data1)

6)ncol(data1)

7)dim(data1)
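
The same exploration functions can be tried on any data frame. The sketch below uses iris, a data set that ships with R, as a stand-in for data1; that substitution is an assumption made so the code runs without a CSV file.

```r
data1 <- iris                  # built-in data set: 150 observations, 5 variables

top6     <- head(data1)        # first 6 observations by default
top100   <- head(data1, 100)   # first 100 observations
bottom6  <- tail(data1)        # last 6 observations by default
bottom50 <- tail(data1, 50)    # last 50 observations

print(nrow(data1))             # number of observations
print(ncol(data1))             # number of variables
print(dim(data1))              # both at once: rows, then columns
```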

Note:

# is used for comment lines.

A comment explains the code; R ignores it when the program runs.

Tuesday 25 August 2020

Careers in data science

There are important roles in data science, such as

1) Statistical analyst (freshers)

2) Consultant: Data Scientist

3) SME (subject matter expert) in data science

4) CDO (chief data officer)

What are the types of data analytics? OR What is data science analytics?

1)Descriptive:what has happened in the past?
2)Diagnostic:Why it has happened?
3)Predictive:What will happen?
4)Prescriptive:What can be done? OR Suggestion


Example: suppose you are facing a medical problem and decide to consult a doctor. Once you see the doctor, he will not provide medicine immediately. First he will analyze your problem.

1) Descriptive: What has happened in the past?
The doctor asks about any past diseases or related symptoms.

2) Diagnostic: Why has it happened?
The doctor will ask you:
a) Why are you facing this problem?
b) Did you neglect your health?

3) Predictive: What will happen?
The doctor tells you what will happen in the near future if the disease continues.

4) Prescriptive: What can be done? OR Suggestion
Here the doctor guides you or gives suggestions on how to cure or recover from the disease.

Why is Data Science required?

1) To decrease operating costs
2) To increase business revenue
3) To make effective business decisions

Friday 27 March 2020

What are the different types of machine learning algorithms?

 Types of machine learning Algorithms
Machine Learning Algorithms can be divided into categories according to their purpose and the main categories are the following:
·         Supervised learning
·         Unsupervised Learning
·         Semi-supervised Learning
·         Reinforcement Learning

Supervised Learning

·         Supervised learning rests on the concept of function approximation: we train an algorithm and, at the end of the process, pick the function that best describes the input data, the one that for a given X makes the best estimate of y (X -> y). Most of the time we cannot find the true function that always makes correct predictions. One reason is that the algorithm relies on assumptions made by humans about how the computer should learn, and these assumptions introduce a bias.
·         Here the human expert acts as the teacher: we feed the computer training data containing the inputs/predictors, show it the correct answers (outputs), and from this data the computer should be able to learn the patterns.
·         Supervised learning algorithms try to model the relationships and dependencies between the target output and the input features, so that we can predict the output values for new data based on the relationships learned from previous data sets.

Draft

·         Predictive Model
·         we have labelled data
·         The main types of supervised learning problems include regression and classification problems

List of Common Algorithms

·         Nearest Neighbor
·         Naive Bayes
·         Decision Trees
·         Linear Regression
·         Support Vector Machines (SVM)
·         Neural Networks
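
As a small illustration of the X -> y idea, the sketch below fits a linear regression (one of the algorithms listed above) with R's built-in lm function. The training data is synthetic and noise-free, generated from y = 3 + 2x purely for demonstration.

```r
# Synthetic labelled data (an assumption for illustration): y = 3 + 2 * x
train <- data.frame(x = 1:10)
train$y <- 3 + 2 * train$x

# Train: fit a linear regression that maps the input x to the output y
fit <- lm(y ~ x, data = train)

# Predict the output for new, unseen inputs
new_data <- data.frame(x = c(11, 12))
pred <- predict(fit, newdata = new_data)

print(coef(fit))   # learned intercept and slope
print(pred)        # predictions for x = 11 and x = 12
```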

Unsupervised Learning

·         The computer is trained with unlabeled data.
·         Here there’s no teacher at all. In fact, the computer might be able to teach you new things after it learns patterns in the data. These algorithms are particularly useful in cases where the human expert doesn’t know what to look for in the data.
·         Unsupervised learning algorithms are the family of machine learning algorithms mainly used in pattern detection and descriptive modelling. There are no output categories or labels on which the algorithm can model relationships. Instead, these algorithms apply techniques to the input data to mine for rules, detect patterns, and summarize and group the data points, which helps in deriving meaningful insights and describing the data better to users.

Draft

·         Descriptive Model
·         The main types of unsupervised learning algorithms include Clustering algorithms and Association rule learning algorithms.

List of Common Algorithms

·         k-means clustering, Association Rules
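
Here is a sketch of k-means clustering with R's built-in kmeans function. The six points are invented so that they form two clearly separated groups; no labels are given to the algorithm, and the cluster numbers it assigns are arbitrary.

```r
set.seed(42)   # k-means starts from random centers; fix the seed for repeatability

# Unlabelled data (made up for illustration): two well-separated groups
pts <- data.frame(
  x = c(1.0, 1.2, 0.9, 8.0, 8.2, 7.9),
  y = c(1.0, 0.8, 1.1, 8.0, 7.9, 8.1)
)

# Ask for 2 clusters; nstart = 10 tries several random starts and keeps the best
fit <- kmeans(pts, centers = 2, nstart = 10)

print(fit$cluster)   # cluster number assigned to each point
print(fit$centers)   # coordinates of the two cluster centers
```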

Semi-supervised Learning

In the previous two types, either none of the observations in the data set have labels or all of them do. Semi-supervised learning falls in between these two. In many practical situations, the cost of labelling is quite high, since it requires skilled human experts. So, when labels are absent for most observations but present for a few, semi-supervised algorithms are the best candidates for model building. These methods exploit the idea that even though the group memberships of the unlabeled data are unknown, this data carries important information about the group parameters.

Reinforcement Learning

This method aims at using observations gathered from the interaction with the environment to take actions that would maximize the reward or minimize the risk. Reinforcement learning algorithm (called the agent) continuously learns from the environment in an iterative fashion. In the process, the agent learns from its experiences of the environment until it explores the full range of possible states.
Reinforcement Learning is a type of Machine Learning, and thereby also a branch of Artificial Intelligence. It allows machines and software agents to automatically determine the ideal behavior within a specific context, in order to maximize its performance. Simple reward feedback is required for the agent to learn its behavior; this is known as the reinforcement signal.
There are many different algorithms that tackle this issue. In fact, Reinforcement Learning is defined by a specific type of problem, and all its solutions are classed as Reinforcement Learning algorithms. In the problem, an agent is supposed to decide the best action to select based on its current state. When this step is repeated, the problem is known as a Markov Decision Process.

In order to produce intelligent programs (also called agents), reinforcement learning goes through the following steps:
1.    Input state is observed by the agent.
2.    Decision making function is used to make the agent perform an action.
3.    After the action is performed, the agent receives reward or reinforcement from the environment.
4.    The state-action pair information about the reward is stored.
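
The four steps can be sketched with tabular Q-learning on a toy problem. Everything here is an assumption made for illustration: a corridor of 5 cells with a reward of 1 for reaching cell 5, the learning rate, discount and exploration values, and the helper names step_env and choose_action.

```r
set.seed(1)                      # make the random exploration repeatable

n_states  <- 5                   # corridor cells 1..5; cell 5 is the goal
n_actions <- 2                   # action 1 = move left, action 2 = move right
alpha   <- 0.5                   # learning rate
gamma   <- 0.9                   # discount factor for future reward
epsilon <- 0.2                   # probability of a random (exploring) action

# Step 4 storage: one value per state-action pair
Q <- matrix(0, nrow = n_states, ncol = n_actions)

# Environment: returns the next state and the reward for an action
step_env <- function(state, action) {
  next_state <- if (action == 1) max(state - 1, 1) else min(state + 1, n_states)
  reward <- if (next_state == n_states) 1 else 0
  list(state = next_state, reward = reward)
}

# Step 2: decision-making function (epsilon-greedy, ties broken at random)
choose_action <- function(state) {
  if (runif(1) < epsilon) return(sample.int(n_actions, 1))
  best <- which(Q[state, ] == max(Q[state, ]))
  best[sample.int(length(best), 1)]
}

for (episode in 1:500) {
  state <- 1                                   # Step 1: observe the input state
  for (t in 1:100) {
    action  <- choose_action(state)            # Step 2: pick an action
    outcome <- step_env(state, action)         # Step 3: receive the reward
    # Step 4: update the stored value for this state-action pair
    target <- outcome$reward + gamma * max(Q[outcome$state, ])
    Q[state, action] <- Q[state, action] + alpha * (target - Q[state, action])
    state <- outcome$state
    if (state == n_states) break               # goal reached, episode ends
  }
}

# Best action learned for each non-goal cell (2 means "move right")
greedy_policy <- apply(Q[1:(n_states - 1), ], 1, which.max)
print(round(Q, 3))
print(greedy_policy)
```

In this toy corridor the agent should end up preferring "move right" in every cell, since that is the shortest way to the reward.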

List of Common Algorithms

·         Q-Learning
·         Temporal Difference (TD)
·         Deep Adversarial Networks

Use cases:

Some applications of reinforcement learning algorithms are computer-played board games (Chess, Go), robotic hands, and self-driving cars.

Note:
List of Common Machine Learning Algorithms
·         Linear Regression.
·         Logistic Regression.
·         Decision Tree.
·         SVM.
·         Naive Bayes.
·         kNN.
·         K-Means.
·         Random Forest.

 


What are the types of Machine Learning ?


Machine learning is sub-categorized into three types:
·         Supervised Learning – Train Me!
·         Unsupervised Learning – I am self sufficient in learning.
·         Reinforcement Learning – My life My rules! (Hit & Trial)

1.    Supervised learning: the training data is labelled with the correct answers, e.g., “spam” or “ham.” The two most common types of supervised learning are classification (where the outputs are discrete labels, as in spam filtering) and regression (where the outputs are real-valued).
2.    Unsupervised learning: we are given a collection of unlabeled data, which we wish to analyze to discover patterns. The two most important examples are dimension reduction and clustering.
3.    Reinforcement learning: an agent (e.g., a robot or controller) seeks to learn the optimal actions to take, based on the outcomes of past actions.

Wednesday 25 March 2020

What are the Applications of Machine Learning ?

Arthur Samuel, an American pioneer in the field of computer gaming and artificial intelligence, coined the term "Machine Learning" in 1959 while at IBM.

Applications of Machine Learning :

1)Online Video Streaming (Netflix)

With over 100 million subscribers, there is no doubt that Netflix is the daddy of the online streaming world. Netflix’s speedy rise has all movie industrialists taken aback – forcing them to ask, “How on earth could one single website take on Hollywood?”. The answer is Machine Learning.
The Netflix algorithm constantly gathers massive amounts of data about users’ activities like:
  • When you pause, rewind, or fast forward
  • What day you watch content (TV Shows on Weekdays and Movies on Weekends)
  • The Date and Time you watch
  • When you pause and leave content (and if you ever come back)
  • The ratings Given (about 4 million per day), Searches (about 3 million per day)
  • Browsing and Scrolling Behaviour
2) Social Media 
·         One of the most common applications of Machine Learning is Automatic Friend Tagging Suggestion on Facebook or any other social media platform. Facebook uses face detection and image recognition to automatically find the face of a person that matches its database, and then suggests that we tag that person. Facebook’s Deep Learning project DeepFace is responsible for recognizing faces and identifying which person is in the picture. It also provides Alt Tags (Alternative Tags) for images already uploaded to Facebook. For example, if we inspect an image on Facebook, the alt tag contains a description.

3)Traffic Alerts   
Google Maps is probably the app we use whenever we go out and need help with directions and traffic. The other day I was travelling to another city and took the expressway, and Maps suggested: “Despite the Heavy Traffic, you are on the fastest route”. But how does it know that?
It’s a combination of people currently using the service, historic data of that route collected over time, and a few tricks acquired from other companies. Everyone using Maps provides their location, average speed, and the route on which they are travelling, which in turn helps Google collect massive data about traffic, predict upcoming traffic, and adjust your route accordingly.
4) Fraud Detection   
Experts predict online credit card fraud to soar to a whopping $32 billion in 2020. That’s more than the profit made by Coca Cola and JP Morgan Chase combined. That’s something to worry about. Fraud Detection is one of the most necessary Applications of Machine Learning. The number of transactions has increased due to a plethora of payment channels – credit/debit cards, smartphones, numerous wallets, UPI and much more. At the same time, criminals have become adept at finding loopholes.

5)Transportation and Commuting (Uber)
If you have used an app to book a cab, you are already using Machine Learning to an extent. It provides a personalized experience that is unique to you: it automatically detects your location and offers options to go home, to the office, or to any other frequent place based on your history and patterns.
It uses Machine Learning algorithm layered on top of Historic Trip Data to make a more accurate ETA prediction. With the implementation of Machine Learning, they saw  26% accuracy in Delivery and Pickup.
 6)Products Recommendations
Suppose you check an item on Amazon, but you do not buy it then and there. But the next day, you’re watching videos on YouTube and suddenly you see an ad for the same item. You switch to Facebook, there also you see the same ad. So how does this happen?
Well, this happens because Google tracks your search history, and recommends ads based on your search history. This is one of the coolest applications of Machine Learning. In fact, 35% of Amazon’s revenue is generated by Product Recommendations.

 7)Virtual Personal Assistants
As the name suggests, Virtual Personal Assistants assist in finding useful information when asked via text or voice. A few of the major Applications of Machine Learning here are:
  • Speech Recognition
  • Speech to Text Conversion
  • Natural Language Processing
  • Text to Speech Conversion

All you need to do is ask a simple question like “What is my schedule for tomorrow?” or “Show my upcoming Flights“. For answering, your personal assistant searches for information or recalls your related queries to collect info. Recently personal assistants are being used in Chatbots which are being implemented in various food ordering apps, online training websites and also in Commuting apps.
 8)Self Driving Cars
Well, here is one of the coolest applications of Machine Learning. It’s here and people are already using it. Machine Learning plays a very important role in Self Driving Cars, and I’m sure you have heard about Tesla. Tesla is the leader in this business, and its current Artificial Intelligence is driven by hardware manufacturer NVIDIA and is based on an Unsupervised Learning Algorithm.

NVIDIA stated that they didn’t train their model to detect people or any object as such. The model works on Deep Learning and it crowd sources data from all of its vehicles and its drivers. It uses internal and external sensors which are a part of IOT.  According to the data gathered by McKinsey, the automotive data will hold a tremendous value of $750 Billion. 

9)Google Translate

Remember the time when you travelled to a new place and found it difficult to communicate with the locals or to find local spots because everything was written in a different language?
Well, those days are gone now. Google’s GNMT (Google Neural Machine Translation) is a neural machine translation system that works across many languages and dictionaries and uses Natural Language Processing to provide the most accurate translation of any sentence or words. Since the tone of the words also matters, it uses other techniques like POS Tagging, NER (Named Entity Recognition) and Chunking. It is one of the best and most used Applications of Machine Learning.


