Saturday 29 August 2020
apply function in R
factor in R
A factor stores categorical (character) values as integer codes, with one label, called a level, for each distinct value.
The factor function is used to create a factor.
The only required argument to factor is a vector of values, which is returned as a vector of factor values.
journey_type<-c("train","bus","flight","bus") #example values, assumed here for illustration
journey_type
journey_typefact<-as.factor(journey_type)
journey_typefact
class(journey_typefact)
as.integer(journey_typefact)
#bus is 1, flight is 2 and train is 3: levels are sorted alphabetically, so b comes first and gets 1, and so on
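The example above uses as.factor. A minimal sketch of the factor function itself follows, showing how the optional levels argument changes the integer codes. The journey values are hypothetical, chosen only for illustration.

```r
# Hypothetical journey types, chosen only to illustrate factor()
journey <- c("train", "bus", "flight", "bus", "train")

# Default: levels are sorted alphabetically
f_default <- factor(journey)
levels(f_default)        # "bus" "flight" "train"
as.integer(f_default)    # 3 1 2 1 3

# Explicit level order through the levels argument
f_custom <- factor(journey, levels = c("train", "flight", "bus"))
levels(f_custom)         # "train" "flight" "bus"
as.integer(f_custom)     # 1 3 2 3 1
```

The integer codes are only positions in the level list, so the same data can get different codes when the level order changes.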
dataframe in R
Thursday 27 August 2020
Data Structures In R
Data structures are used to organize and represent data.
There are different types of data structures in R, such as:
1) vector: homogeneous (same type), one dimension.
2) list: heterogeneous (different types), one dimension.
3) matrix: homogeneous, two dimensions.
4) data frame: heterogeneous (columns may differ in type), two dimensions.
5) factor: homogeneous, one dimension.
6) array: homogeneous, n dimensions.
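As a quick sketch, each of the six structures can be created in one line. All values and variable names here are made up for illustration.

```r
v  <- c(10, 20, 30)                      # vector: one type, one dimension
l  <- list(1, "two", TRUE)               # list: mixed types, one dimension
m  <- matrix(1:6, nrow = 2, ncol = 3)    # matrix: one type, two dimensions
df <- data.frame(name = c("a", "b"),     # data frame: columns may differ in type
                 age  = c(20, 30))
f  <- factor(c("low", "high", "low"))    # factor: categorical values, one dimension
a  <- array(1:24, dim = c(2, 3, 4))      # array: one type, n dimensions

# Inspect what was created
class(v); class(l); class(df); class(f)
dim(m)   # 2 3
dim(a)   # 2 3 4
```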
1) Vectors: one-dimensional, homogeneous arrays
syntax:
variable_name<-c(...)
Example:
a<-c(24,56,78)
a
b<-c("one","two","three")
b
x<-c(TRUE,FALSE,TRUE,FALSE)
x
y<-10 #a single value is a vector of length one (a scalar)
Where a, b, x and y are variable names and c() combines the values into a vector.
Accessing vector elements
vec<-c("a","d","w","e","g")
vec
vec[3] #output:[1] "w"
vec[c(2,5)] #output: [1] "d" "g"
2) Matrix: a two-dimensional data structure
mat<-matrix(c(1,2,3,4),nrow=2,ncol=2) #filled column-wise by default
mat
output:
1 3
2 4
mat1<-matrix(c(1,2,3,4),nrow=2,ncol=2,byrow=TRUE) #if TRUE, the matrix is filled by rows
mat1
output:
1 2
3 4
Note:
Compare the two outputs to see the effect of byrow.
Accessing matrix elements
mat<-matrix(c(1,2,3,4),nrow=2,ncol=2)
or
mat<-matrix(c(1:4),nrow=2,ncol=2)
mat[1,] #returns 1st row in matrix
mat[,1] #returns 1st column in matrix
mat[1,2] #returns the element in the first row, second column (here 3)
Arrays: multidimensional data structures
#dim=c(rows,columns,number of matrices)
x1<-array(1:12,dim=c(2,2,3))
x1
Accessing array elements
#print the 2nd row of the 2nd matrix
print(x1[2,,2])
#print the 2nd column of the 3rd matrix
print(x1[,2,3])
#print all rows and columns of the 3rd matrix
print(x1[,,3])
Note:
Basic data types in R are character, numeric, integer, complex, and logical.
Objects may have attributes, such as name, dimension, and class.
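A short sketch of the note above: class() reports the basic type of a value, and names() and dim() read two of the common attributes. The marks values are hypothetical.

```r
# Basic data types
class("hello")   # "character"
class(3.14)      # "numeric"
class(5L)        # "integer" (the L suffix makes an integer)
class(2+3i)      # "complex"
class(TRUE)      # "logical"

# Attributes: names and dimensions
marks <- c(maths = 90, physics = 85)   # hypothetical values
names(marks)                           # "maths" "physics"

m <- matrix(1:6, nrow = 2)
dim(m)                                 # 2 3
attributes(m)                          # lists the dim attribute
```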
String and Class in R
LEVELS (levels): displays the levels of a categorical variable in the data
TABLE (table): displays the count of each level present in the data
EMPLOYEE   PERFORMANCE
RAM        VG (Very good)
SAMANYU    G (Good)
JACK       EXC (Excellent)
JILL       VG (Very good)
JONE       EXC (Excellent)
LEVELS: 3 (EXC, VG, G)
TABLE: EXC=2, VG=2, G=1
str: displays the data type of each variable in the data
class: displays the class of an object (data.frame for a data frame)
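The employee table above can be typed in directly to try these functions. This is a minimal sketch; the object name emp and the column names are assumptions made for the example.

```r
# Employee table from above; Performance is stored as a factor
emp <- data.frame(
  Employee    = c("RAM", "SAMANYU", "JACK", "JILL", "JONE"),
  Performance = factor(c("VG", "G", "EXC", "VG", "EXC"))
)

levels(emp$Performance)    # "EXC" "G" "VG" (sorted alphabetically)
nlevels(emp$Performance)   # 3
table(emp$Performance)     # EXC 2, G 1, VG 2
str(emp)                   # data type of each column
class(emp)                 # "data.frame"
```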
Example: lp.csv
Name Age Marks
surya 29 98
sivi 10 23
somu 24 99
bapu 26 80
jack 47 89
jone 34 79
solution:
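data1<-read.csv("lp.csv")
levels(factor(data1$Marks)) #Marks is numeric, so convert it to a factor first; levels() on a numeric column returns NULL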
table(data1$Marks)
str(data1)
class(data1)
Wednesday 26 August 2020
Calculate Mean,Median,Mode and standard deviation(SD) in R
min(): returns the minimum value.
max(): returns the maximum value.
mean(): calculated by taking the sum of the values and dividing by the number of values in the data series.
Mode: R does not have a standard built-in function to calculate the statistical mode (the built-in mode() reports an object's storage mode instead). So we create a user function that takes a vector as input and gives the most frequent value as output.
median(): Arrange the numbers in numerical order and count them. If the count is odd, divide it by 2 and round up to get the position of the median. If the count is even, divide it by 2, go to the number in that position and average it with the number in the next higher position.
sd(): sd stands for standard deviation. The standard deviation of a variable is the square root of its variance (the average squared deviation from the mean; var() and sd() in R divide by n-1). If a variable y is a linear transformation of x (y = a + bx), then the variance of y is b² times the variance of x and the standard deviation of y is |b| times the standard deviation of x.
range(): returns a vector containing the minimum and maximum of all the given arguments.
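One possible way to write the user function for the mode is sketched below. The name get_mode is hypothetical and is not part of base R. When several values tie for most frequent, this version returns the one that appears first.

```r
# get_mode is a hypothetical helper name, not a base R function
get_mode <- function(v) {
  uniq   <- unique(v)                   # distinct values, in order of appearance
  counts <- tabulate(match(v, uniq))    # how often each distinct value occurs
  uniq[which.max(counts)]               # first value with the highest count
}

get_mode(c(2, 3, 3, 5, 7, 3, 2))    # 3
get_mode(c("a", "b", "b", "c"))     # "b"
```

Because it relies only on unique and match, the same function works for numeric and character vectors.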
Example: lp.csv
Name Age Marks
surya 29 98
sivi 10 23
somu 24 99
bapu 26 80
jack 47 89
jone 34 79
solution:
data1<-read.csv("lp.csv")
min(data1$Marks)
max(data1$Marks)
mean(data1$Marks)
median(data1$Marks)
var(data1$Marks)
sd(data1$Marks)
range(data1$Marks)
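If lp.csv is not at hand, the Marks column from the table can be typed in as a vector. The sketch below also checks that sd is the square root of var, and how both behave under a linear transformation. The values of a and b are arbitrary, chosen for the example.

```r
marks <- c(98, 23, 99, 80, 89, 79)   # Marks column from the lp.csv table

min(marks)      # 23
max(marks)      # 99
range(marks)    # 23 99
mean(marks)     # 78
median(marks)   # 84.5

# Standard deviation is the square root of the variance
all.equal(sd(marks), sqrt(var(marks)))   # TRUE

# Linear transformation y = a + b*x (a and b are arbitrary example values)
a <- 5
b <- -2
y <- a + b * marks
all.equal(var(y), b^2 * var(marks))      # TRUE
all.equal(sd(y), abs(b) * sd(marks))     # TRUE
```

The negative b shows why the absolute value matters for the standard deviation: sd can never be negative.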
summary function in r
The summary function produces descriptive statistics for an object; for a data frame, it summarizes every variable.
For numeric variables it reports the minimum, first quartile, median, mean, third quartile and maximum.
Note:
If the data is categorical (a factor), the summary function gives the count of each level of the variable (var).
Example: lp.csv
Name Age Marks
surya 29 98
sivi 10 23
somu 24 99
solution:
data1<-read.csv("lp.csv")
summary(data1)
summary(data1$Marks)
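To see both behaviours side by side, here is a small sketch that types in the three rows above and adds a Grade column. The Grade column and its values are assumptions added for illustration and are not part of lp.csv.

```r
# The three rows of the example table, plus a hypothetical Grade column
data1 <- data.frame(
  Name  = c("surya", "sivi", "somu"),
  Age   = c(29, 10, 24),
  Marks = c(98, 23, 99),
  Grade = factor(c("A", "C", "A"))   # assumed values, for illustration only
)

summary(data1$Marks)   # Min., 1st Qu., Median, Mean, 3rd Qu., Max.
summary(data1$Grade)   # counts per level: A 2, C 1
summary(data1)         # one summary per column
```

A plain character column such as Name is reported by length and class only; counts appear once the column is a factor.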
What is data exploration?
Data exploration is the process of exploring the data and getting to know its ins and outs before conducting further analysis.
1)HEAD(head):Displays top 6 obs(observation) by default
2)TAIL(tail):Displays bottom 6 obs(observation) by default
3)NROW(nrow):Displays number of obs(observation)
4)NCOL(ncol):Displays number of var(variable)
5)DIM(dim) :Displays number of obs(observation) and number of var(variable)
How to import file in Data science with R?
Syntax:
object_name<-read.csv("filename")
where object_name can vary, meaning you can provide any name.
<- OR = (assignment operator)
read.csv(): a function used to import a CSV file into R.
If you have a CSV file, write read.csv("filename"),
where filename is the name (or path) of your file.
Example:
data1<-read.csv("lp.csv")
where lp.csv is my file name
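To try the import without an existing file, the sketch below first writes the lp.csv table to a temporary folder and then reads it back. The temporary path and the object name lp are choices made for this example only.

```r
# Build the lp.csv table and write it to a temporary file
lp <- data.frame(
  Name  = c("surya", "sivi", "somu", "bapu", "jack", "jone"),
  Age   = c(29, 10, 24, 26, 47, 34),
  Marks = c(98, 23, 99, 80, 89, 79)
)
path <- file.path(tempdir(), "lp.csv")
write.csv(lp, path, row.names = FALSE)   # row.names = FALSE avoids an extra index column

# Import the file into R
data1 <- read.csv(path)
head(data1, 3)   # first 3 observations
dim(data1)       # 6 3
```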
Que: How to run R code?
Ans: Ctrl+Enter (runs the current line or selection in RStudio)
Que: How to display the data?
Ans: View(data1)
where data1 is my object name
More Examples are:
1)head(data1)
2)data2<-head(data1,100) #stores the top 100 observations (obs) in data2
3)tail(data1)
4)data2<-tail(data1,50) #stores the bottom 50 observations (obs) in data2
5)nrow(data1)
6)ncol(data1)
7)dim(data1)
Note:
# is used to start a comment line.
A comment is used to annotate the code; R ignores it when the program runs.
Tuesday 25 August 2020
Careers in data science
There are important roles in data science, such as:
1)Statistical analyst(freshers)
2)Consultant:Data Scientist
3)SME(subject matter expert) in data science
4)CDO(chief data officer)
What are the types of data analytics? OR What is data science analytics?
Why Data Science is required?
Friday 27 March 2020
What are the different types of machine learning algorithms?
Supervised Learning
List of Common Algorithms
Unsupervised Learning
List of Common Algorithms
Semi-supervised Learning
Reinforcement Learning
In order to produce intelligent programs (also called agents), reinforcement learning goes through the following steps:
What are the types of Machine Learning ?
Wednesday 25 March 2020
What are the Applications of Machine Learning ?
1)Online Video Streaming (Netflix)
- When you pause, rewind, or fast forward
- What day you watch content (TV shows on weekdays and movies on weekends)
- The date and time you watch
- When you pause and leave content (and if you ever come back)
- The ratings given (about 4 million per day), searches (about 3 million per day)
- Browsing and scrolling behaviour
3)Traffic Alerts
- Speech Recognition
- Speech to Text Conversion
- Natural Language Processing
- Text to Speech Conversion
9)Google Translate