Machine Learning
By Bharat Sreeram
bsreeram.datascience@gmail.com
sankara.deva2016@gmail.com
6309613028
Machine Learning Session 1: Dec-5th-2018
https://global.gotomeeting.com/play/recording/d7c5c19ef6122846296292df6437e4f025f2e1393e60e72bfab1fd4ff97d26e3
Machine Learning Session 2: Dec-6th-2018
https://global.gotomeeting.com/play/recording/07314ce010d0ee64cebf32c399ab3993f05764ecd2c363c81a32e194b610ca6a
notes:
https://drive.google.com/file/d/1aVG9-No4h6fi8GY9QcZmwZe1IiVoL_CV/view?usp=sharing
Machine Learning Session 3: Dec-10th-2018
https://global.gotomeeting.com/play/recording/19cafe83765b2160951e806110e6bbedbc17eaf227341c2477c73e4a400d276d
Machine Learning Session 4: Dec-11th-2018
https://global.gotomeeting.com/play/recording/34593c679b22f3d51f22c1e61ad2c2f752a636c677d387d425cbc96e9f4cb654
notes:
https://drive.google.com/file/d/1dH65bwfH-KgrfTzPNp8hIBy6nBn3SrDj/view?usp=sharing
Machine Learning Session 5: Dec-12th-2018
https://global.gotomeeting.com/play/recording/0b1c3ba66022757756589a1d99abb2276d489108e35fbed78f05ed0b667ecff7
notes:
https://drive.google.com/file/d/1itBDGEH-E6JV9TmGAzjz_TvMhDnmGEME/view?usp=sharing
Machine Learning Session 6: Dec-13th-2018
https://register.gotowebinar.com/recording/8624091316151458310
notes:
https://drive.google.com/file/d/16k7HIybwRHYe7YhdEFNKTKAZAhoMeTDu/view?usp=sharing
Machine Learning Session 7: Dec-18th-2018
https://register.gotowebinar.com/recording/772111545429734659
notes:
https://drive.google.com/file/d/1TQjkFmYqBvhmpbMXJqzIw4_S_lpsc2dl/view?usp=sharing
Machine Learning Session 8:
https://register.gotowebinar.com/recording/410121531262710785
Machine Learning Session 9:
https://register.gotowebinar.com/recording/8589759065729297677
Machine Learning Session 10:
https://register.gotowebinar.com/recording/1690573190596158466
Machine Learning Session 11:
https://register.gotowebinar.com/recording/2632740306457288715
Machine Learning Session 12:
https://register.gotowebinar.com/recording/6611190090724288007
Machine Learning Session 13:
https://register.gotowebinar.com/recording/6446329317299855105
Machine Learning Session 14:
Implementation of Linear Regression using the R language (a statistical approach).
Data file: profiles.txt
Schema: name, age, exp, qual, income
"name","age","exp","qual","income"
aaa,21,0,btech,20000
bbbb,22,1,btech,25000
ccc,21,0,mtech,25000
dddd,22,1,mtech,30000
ee,25,4,btech,40000
ffff,25,3.5,mtech,47000
gggg,25,4,mtech,50000
hhh,30,8,btech,80000
jjjj,31,8,mtech,91000
eejj,31,9,mtech,95000
task:
construct the relationship between age, exp, qual and income:
Y = X.beta
beta = inv(t(X) %*% X) %*% (t(X) %*% Y)
using a custom function for beta (the coefficient matrix).
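The beta formula above is the standard least-squares solution; as a quick refresher (a standard derivation, not from the session itself):

```latex
\hat\beta = \arg\min_\beta \|Y - X\beta\|^2,
\qquad
\nabla_\beta \|Y - X\beta\|^2 = -2\,X^\top (Y - X\beta) = 0
\;\Rightarrow\;
X^\top X\,\hat\beta = X^\top Y
\;\Rightarrow\;
\hat\beta = (X^\top X)^{-1} X^\top Y .
```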
> df = read.csv('D:/mystuff/profiles.txt')
> df
   name age exp  qual income
1   aaa  21 0.0 btech  20000
2  bbbb  22 1.0 btech  25000
3   ccc  21 0.0 mtech  25000
4  dddd  22 1.0 mtech  30000
5    ee  25 4.0 btech  40000
6  ffff  25 3.5 mtech  47000
7  gggg  25 4.0 mtech  50000
8   hhh  30 8.0 btech  80000
9  jjjj  31 8.0 mtech  91000
10 eejj  31 9.0 mtech  95000
> df$age
 [1] 21 22 21 22 25 25 25 30 31 31
> df$q = 5
> df
   name age exp  qual income q
1   aaa  21 0.0 btech  20000 5
2  bbbb  22 1.0 btech  25000 5
3   ccc  21 0.0 mtech  25000 5
4  dddd  22 1.0 mtech  30000 5
5    ee  25 4.0 btech  40000 5
6  ffff  25 3.5 mtech  47000 5
7  gggg  25 4.0 mtech  50000 5
8   hhh  30 8.0 btech  80000 5
9  jjjj  31 8.0 mtech  91000 5
10 eejj  31 9.0 mtech  95000 5
> df$q[df$qual=='mtech'] = 8
> df
   name age exp  qual income q
1   aaa  21 0.0 btech  20000 5
2  bbbb  22 1.0 btech  25000 5
3   ccc  21 0.0 mtech  25000 8
4  dddd  22 1.0 mtech  30000 8
5    ee  25 4.0 btech  40000 5
6  ffff  25 3.5 mtech  47000 8
7  gggg  25 4.0 mtech  50000 8
8   hhh  30 8.0 btech  80000 5
9  jjjj  31 8.0 mtech  91000 8
10 eejj  31 9.0 mtech  95000 8
> X = data.frame(one=1, a=df$age, e=df$exp, q=df$q)
> X
   one  a   e q
1    1 21 0.0 5
2    1 22 1.0 5
3    1 21 0.0 8
4    1 22 1.0 8
5    1 25 4.0 5
6    1 25 3.5 8
7    1 25 4.0 8
8    1 30 8.0 5
9    1 31 8.0 8
10   1 31 9.0 8
> class(X)
[1] "data.frame"
> X = data.matrix(X)
> class(X)
[1] "matrix"
> X
      one  a   e q
 [1,]   1 21 0.0 5
 [2,]   1 22 1.0 5
 [3,]   1 21 0.0 8
 [4,]   1 22 1.0 8
 [5,]   1 25 4.0 5
 [6,]   1 25 3.5 8
 [7,]   1 25 4.0 8
 [8,]   1 30 8.0 5
 [9,]   1 31 8.0 8
[10,]   1 31 9.0 8
> dim(X)
[1] 10 4
> Y = matrix(df$income)
> Y
       [,1]
 [1,] 20000
 [2,] 25000
 [3,] 25000
 [4,] 30000
 [5,] 40000
 [6,] 47000
 [7,] 50000
 [8,] 80000
 [9,] 91000
[10,] 95000
> coeffs = function(x,y){
+   l = solve(t(x)%*%x)
+   r = t(x)%*%y
+   l%*%r
+ }
> beta = coeffs(X,Y)
> beta
            [,1]
one  -155369.530
a       7774.047
e      -1078.645
q       1932.194
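As a cross-check, the same normal-equation computation can be sketched in NumPy using the profiles data above. `np.linalg.lstsq` solves the identical least-squares problem without explicitly inverting t(X) %*% X, which is numerically safer; the data arrays below are transcribed from the session.

```python
import numpy as np

# design matrix [1, age, exp, q] and income vector from profiles.txt (as above)
X = np.array([[1, 21, 0.0, 5], [1, 22, 1.0, 5], [1, 21, 0.0, 8],
              [1, 22, 1.0, 8], [1, 25, 4.0, 5], [1, 25, 3.5, 8],
              [1, 25, 4.0, 8], [1, 30, 8.0, 5], [1, 31, 8.0, 8],
              [1, 31, 9.0, 8]])
Y = np.array([20000, 25000, 25000, 30000, 40000, 47000,
              50000, 80000, 91000, 95000], dtype=float)

# normal equations, exactly as in the custom R coeffs() function
beta = np.linalg.inv(X.T @ X) @ (X.T @ Y)
print(beta)   # ~ [-155369.53, 7774.05, -1078.64, 1932.19]

# lstsq solves the same problem via a more stable factorization
beta_ls, *_ = np.linalg.lstsq(X, Y, rcond=None)
```

Both routes reproduce the coefficients shown in the R transcript.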
Deriving coefficients using predefined functions:
> lmfit = lm(income ~ age + exp + q, data=df)
> beta = cbind(coefficients(lmfit))
> beta
                   [,1]
(Intercept) -155369.530
age            7774.047
exp           -1078.645
q              1932.194
lm() reproduces exactly the coefficients obtained from the custom function.
Quadratic polynomial model (adding squared terms so a non-linear relationship can be fitted with a linear model):
> a = df$age
> e = df$exp
> q = df$q
> a2 = a^2
> e2 = e^2
> q2 = q^2
> i = df$income
> beta2 = cbind(coefficients(lm(i ~ a + a2 + e + e2 + q + q2)))
> # deriving cubic polynomial coefficients.
> a3 = a^3
> e3 = e^3
> q3 = q^3
> beta3 = cbind(coefficients(lm(i ~ a + a2 + a3 +
+                               e + e2 + e3 +
+                               q + q2 + q3)))
> beta3
                    [,1]
(Intercept) 2691084.7350
a           -304049.7494
a2            11227.3276
a3             -134.8921
e             15809.6330
e2            -2587.9205
e3              148.3180
q              2222.2222
q2                    NA
q3                    NA
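The NA values for q2 and q3 are expected, not an error: q takes only two values (5 and 8), so q^2 and q^3 are exact linear functions of q (for example, q^2 = 13q - 40 on these two points), and lm() drops such aliased columns. A quick NumPy sketch of the rank deficiency:

```python
import numpy as np

q = np.array([5, 5, 8, 8, 5, 8, 8, 5, 8, 8], dtype=float)
M = np.c_[np.ones(len(q)), q, q**2, q**3]

# M has only two distinct rows, so its rank is 2, not 4
print(np.linalg.matrix_rank(M))   # -> 2
```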
In the next class we will see how to measure the accuracy of each model.
Machine Learning Session 15:
Developing a predict function and testing its accuracy.
> predict = function(x,b){
+   ycap = x%*%b
+   ycap
+ }
The input matrix:
> X = cbind(1, df$age, df$exp, df$q)
> dim(X)
[1] 10 4
> X
[,1] [,2] [,3] [,4]
[1,] 1 21 0.0 5
[2,] 1 22 1.0 5
[3,] 1 21 0.0 8
[4,] 1 22 1.0 8
[5,] 1 25 4.0 5
[6,] 1 25 3.5 8
[7,] 1 25 4.0 8
[8,] 1 30 8.0 5
[9,] 1 31 8.0 8
[10,] 1 31 9.0 8
>
> ycap = predict(X, beta)
> dim(ycap)
[1] 10 1
> ycap
[,1]
[1,] 17546.43
[2,] 24241.83
[3,] 23343.01
[4,] 30038.42
[5,] 44328.04
[6,] 50663.94
[7,] 50124.62
[8,] 78883.70
[9,] 92454.33
[10,] 91375.68
>
How to test the accuracy of model predictions:
accuracy is based on the closeness between the actual target value and the predicted value.
ex:
a is the actual age, acap is the predicted age;
the percentage distance between actual and predicted is then:
> a=25
> acap=26
> abs(a-acap)/acap * 100
[1] 3.846154
here the distance is 3.85%.
The closeness between actual and predicted is then 100 - distance:
> 100 - abs(a-acap)/acap * 100
[1] 96.15385
here closeness is 96%; if the expected closeness is 90%,
then the above prediction is good.
Let's apply the above accuracy measurement to our predictions.
ex: Y (actual incomes) and ycap (predicted incomes):
> Y = cbind(df$income)
> Y
[,1]
[1,] 20000
[2,] 25000
[3,] 25000
[4,] 30000
[5,] 40000
[6,] 47000
[7,] 50000
[8,] 80000
[9,] 91000
[10,] 95000
>
> ycap
[,1]
[1,] 17546.43
[2,] 24241.83
[3,] 23343.01
[4,] 30038.42
[5,] 44328.04
[6,] 50663.94
[7,] 50124.62
[8,] 78883.70
[9,] 92454.33
[10,] 91375.68
>
> dist = abs(Y-ycap)/ycap * 100
> dist
[,1]
[1,] 13.9832956
[2,] 3.1275150
[3,] 7.0984295
[4,] 0.1278863
[5,] 9.7636619
[6,] 7.2318577
[7,] 0.2486241
[8,] 1.4151260
[9,] 1.5730205
[10,] 3.9663939
>
> closeness = 100 - dist
> closeness
[,1]
[1,] 86.01670
[2,] 96.87248
[3,] 92.90157
[4,] 99.87211
[5,] 90.23634
[6,] 92.76814
[7,] 99.75138
[8,] 98.58487
[9,] 98.42698
[10,] 96.03361
>
> closeness>=90
[,1]
[1,] FALSE
[2,] TRUE
[3,] TRUE
[4,] TRUE
[5,] TRUE
[6,] TRUE
[7,] TRUE
[8,] TRUE
[9,] TRUE
[10,] TRUE
>
> closeness[closeness>=90]
[1] 96.87248 92.90157 99.87211 90.23634 92.76814 99.75138 98.58487
[8] 98.42698 96.03361
> length(closeness[closeness>=90])
[1] 9
> pcnt = length(closeness[closeness>=90])
> pcnt
[1] 9
> n = length(Y)
> n
[1] 10
> acc = pcnt/n * 100
> acc
[1] 90
>
Developing a function for accuracy testing:
---------------------------------------
> accuracy = function(y,ycap,closeness){
+ de = 100 - closeness
+ dist = abs(y-ycap)/ycap * 100
+ pcnt = length(dist[dist<=de])
+ n = length(y)
+ acc = pcnt/n * 100
+ acc
+ }
> accuracy(Y, ycap, 80)
[1] 100
> accuracy(Y, ycap, 85)
[1] 100
> accuracy(Y, ycap, 90)
[1] 90
> accuracy(Y, ycap, 95)
[1] 60
>
-------------------------------
Machine Learning Session 16:
Topic: Implementation of Regression models with Python NumPy.
Linear Regression implementation with Python NumPy.
--------------------------------------------
Data file: profiles.txt
"name","age","exp","qual","income"
aaa,21,0,btech,20000
bbbb,22,1,btech,25000
ccc,21,0,mtech,25000
dddd,22,1,mtech,30000
ee,25,4,btech,40000
ffff,25,3.5,mtech,47000
gggg,25,4,mtech,50000
hhh,30,8,btech,80000
jjjj,31,8,mtech,91000
eejj,31,9,mtech,95000
# read profiles.txt (assumed to be in the working directory) and skip the header row
lines = open('profiles.txt').readlines()[1:]
a = []
e = []
q = []
i = []
for line in lines:
    w = line.strip().split(',')
    age = float(w[1])
    exp = float(w[2])
    ql = 5                   # qualification encoded numerically: btech = 5
    if w[3] == 'mtech':
        ql = 8               # mtech = 8
    inc = float(w[-1])/1000  # income rescaled to thousands
    a.append(age)
    e.append(exp)
    q.append(ql)
    i.append(inc)
print(a)
print(e)
print(q)
print(i)
output:
[21.0, 22.0, 21.0, 22.0, 25.0, 25.0, 25.0, 30.0, 31.0, 31.0]
[0.0, 1.0, 0.0, 1.0, 4.0, 3.5, 4.0, 8.0, 8.0, 9.0]
[5, 5, 8, 8, 5, 8, 8, 5, 8, 8]
[20.0, 25.0, 25.0, 30.0, 40.0, 47.0, 50.0, 80.0, 91.0, 95.0]
# preparing input matrix.
import numpy as np
X = np.c_[np.ones(len(a)),a,e,q]
print(X)
[[ 1. 21. 0. 5. ]
[ 1. 22. 1. 5. ]
[ 1. 21. 0. 8. ]
[ 1. 22. 1. 8. ]
[ 1. 25. 4. 5. ]
[ 1. 25. 3.5 8. ]
[ 1. 25. 4. 8. ]
[ 1. 30. 8. 5. ]
[ 1. 31. 8. 8. ]
[ 1. 31. 9. 8. ]]
# output(target) matrix
Y = np.c_[i]
print(Y)
[[20.]
[25.]
[25.]
[30.]
[40.]
[47.]
[50.]
[80.]
[91.]
[95.]]
def coeffs(x,y):
from numpy.linalg import inv
l = inv(x.T.dot(x))
r = x.T.dot(y)
return l.dot(r)
# coefficients of the linear model (matching the R beta above, scaled by 1/1000 since income was divided by 1000)
beta1 = coeffs(X,Y)
print(beta1)
[[-155.36953015]
[ 7.77404719]
[ -1.07864489]
[ 1.93219399]]
# preparing input matrix for the quadratic model.
ones = np.ones(len(a))
a = np.array(a)
a2 = a**2   # 'as' is a reserved word in Python, so a2 is used
e = np.array(e)
es = e**2
q = np.array(q)
qs = q**2
XX = np.c_[ones, a, a2, e, es, q, qs]
print(XX)
[[ 1. 21. 441. 0. 0. 5. 25. ]
[ 1. 22. 484. 1. 1. 5. 25. ]
[ 1. 21. 441. 0. 0. 8. 64. ]
[ 1. 22. 484. 1. 1. 8. 64. ]
[ 1. 25. 625. 4. 16. 5. 25. ]
[ 1. 25. 625. 3.5 12.25 8. 64. ]
[ 1. 25. 625. 4. 16. 8. 64. ]
[ 1. 30. 900. 8. 64. 5. 25. ]
[ 1. 31. 961. 8. 64. 8. 64. ]
[ 1. 31. 961. 9. 81. 8. 64. ]]
# coefficients for quadratic model.
beta2 = coeffs(XX, Y)
print(beta2)
[[ 4.27008000e+05]
[-3.17610019e+04]
[ 5.43459599e+02]
[ 9.43313010e+03]
[-6.16516751e+02]
[ 0.00000000e+00]
[ 4.00000000e+00]]
# input matrix for cubic model
a3 = a**3
e3 = e**3
q3 = q**3
XXX = np.c_[ones, a,a2,a3,e,es,e3,q,qs,q3]
print(XXX)
# coefficients for cubic model
beta3 = coeffs(XXX, Y)
print(beta3)
[[ 5.46816000e+05]
[-6.54644536e+04]
[ 2.48430214e+03]
[-3.09940604e+01]
[ 2.98121043e+03]
[-6.74362346e+02]
[ 3.80136769e+01]
[ 4.06400000e+03]
[ 1.46000000e+02]
[-4.60000000e+01]]
def predict(x,b):
return x.dot(b)
ycap1 = predict(X, beta1)
ycap2 = predict(XX, beta2)
ycap3 = predict(XXX, beta3)
# predictions by linear model
print(ycap1)
[[17.54643073]
[24.24183303]
[23.3430127 ]
[30.038415 ]
[44.32803993]
[50.66394434]
[50.1246219 ]
[78.88369631]
[92.45432547]
[91.37568058]]
# predictions by quadratic model
print(ycap2)
[[ -207.35603075]
[ 217.01820823]
[ -51.35603075]
[ 373.01820823]
[ 613.45510333]
[-1635.17213331]
[ 769.45510333]
[ -600.44815338]
[ 945.58551178]
[ -102.06914593]]
# predictions by cubic model
print(ycap3)
[[-1176.27672812]
[ -459.63846022]
[-1092.27672812]
[ -375.63846022]
[ 399.22115776]
[ 718.43581996]
[ 483.22115776]
[ 288.18008023]
[ -54.26574259]
[ -288.24732376]]
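The accuracy figures below can be reproduced with a Python port of the Session 15 accuracy() function. One caveat: the exact Python code behind the reported numbers is not shown in the notes; the 80.0 for the linear model is consistent with measuring distance relative to the actual value y (rather than the prediction, as the R version did), so that variant is assumed in this sketch.

```python
import numpy as np

def accuracy(y, ycap, closeness):
    # Python port of the R accuracy() function from Session 15,
    # with distance measured relative to the actual value (an assumption)
    de = 100 - closeness
    dist = np.abs(y - ycap) / y * 100
    pcnt = np.sum(dist <= de)
    return pcnt / len(y) * 100

# actual incomes (in thousands) and the linear-model predictions from above
Y = np.array([20, 25, 25, 30, 40, 47, 50, 80, 91, 95], dtype=float)
ycap1 = np.array([17.54643073, 24.24183303, 23.3430127, 30.038415,
                  44.32803993, 50.66394434, 50.1246219, 78.88369631,
                  92.45432547, 91.37568058])

print(accuracy(Y, ycap1, 90))   # -> 80.0
```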
Accuracy of the linear model: 80.0
Accuracy of the quadratic model: 0.0
Accuracy of the cubic model: 0.0
The linear model is the best fit for the given data. (The quadratic and cubic input matrices are rank-deficient because q takes only two values, so inverting x.T.dot(x) is numerically unstable and those models' predictions are meaningless.)
--------------------------------
Above are 16 Machine Learning sessions, including today's.