Machine Learning Linear Regression
In this post, we will learn how to apply linear regression, a machine learning technique, in Python.
Requirements
For this tutorial, the following libraries should be installed on your system:
- pandas
- quandl
- numpy
- scikit-learn (imported as sklearn)
References:
Regression: identifying a data set, importing it, and putting it into a useful format.
Code snippet used in video:
import pandas as pd
import quandl
import math
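# Download Google's daily stock data from Quandl's WIKI dataset.
df=quandl.get('WIKI/GOOGL')
# Keep only the adjusted price and volume columns.
df=df[['Adj. Open', 'Adj. High', 'Adj. Low', 'Adj. Close', 'Adj. Volume']]
# Gap between the daily high and the close, as a percentage of the close.
df['HL_PCT']=(df['Adj. High']-df['Adj. Close'])/df['Adj. Close']*100.00
# Daily move from open to close, as a percentage of the open.
df['PCT_Change']=(df['Adj. Close']-df['Adj. Open'])/df['Adj. Open']*100.00
# Columns used as features from here on.
df=df[['Adj. Close','HL_PCT','PCT_Change','Adj. Volume']]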
print(df.head())
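To see what the two derived columns contain, the same formulas can be applied to a tiny hand-made frame. This is only a sketch: the prices below are invented for illustration and are not taken from the WIKI/GOOGL dataset, and the name toy is hypothetical.

```python
import pandas as pd

# Invented prices for illustration only (not real GOOGL data).
toy = pd.DataFrame({
    'Adj. Open':  [100.0, 102.0],
    'Adj. High':  [110.0, 104.0],
    'Adj. Low':   [99.0, 100.0],
    'Adj. Close': [105.0, 101.0],
})

# Gap between the day's high and its close, as a percentage of the close.
toy['HL_PCT'] = (toy['Adj. High'] - toy['Adj. Close']) / toy['Adj. Close'] * 100.0
# Move from open to close, as a percentage of the open.
toy['PCT_Change'] = (toy['Adj. Close'] - toy['Adj. Open']) / toy['Adj. Open'] * 100.0

print(toy[['HL_PCT', 'PCT_Change']].round(2))
```

A day that closes below its open, like the second row here, gets a negative PCT_Change.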
Further code:
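forecast_col='Adj. Close'
# Replace missing values with an outlier sentinel instead of dropping those rows.
df.fillna(-99999,inplace=True)
# Forecast horizon: 1% of the dataset length, measured in rows.
forecast_out=int(math.ceil(0.01*len(df)))
# Label each row with the closing price forecast_out rows into the future.
df['label']=df[forecast_col].shift(-forecast_out)
# The last forecast_out rows have no future price, so their label is NaN; drop them.
df.dropna(inplace=True)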
print(df.head())
print(df.tail())
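The effect of shift(-forecast_out) is easier to see on a short series. The following is a minimal sketch with ten invented prices; it uses a fraction of 0.2 rather than 0.01 (an assumption made only so that such a small frame shifts by more than one row).

```python
import math
import pandas as pd

# Ten invented closing prices: 10.0, 11.0, ..., 19.0.
toy = pd.DataFrame({'Adj. Close': [float(p) for p in range(10, 20)]})

# 0.2 instead of the post's 0.01, so the shift is visible on ten rows.
forecast_out = int(math.ceil(0.2 * len(toy)))
# Each row's label is the closing price forecast_out rows later.
toy['label'] = toy['Adj. Close'].shift(-forecast_out)

print(toy.tail(3))          # the final forecast_out labels are NaN
toy.dropna(inplace=True)    # those rows have no future price to learn from
print(len(toy))
```

The rows removed by dropna are exactly the most recent ones, which is why they cannot be used for training.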
Regression Training and Testing:
import pandas as pd
import quandl
import math
import numpy as np #Provides the n-dimensional array type that scikit-learn works with
from sklearn import preprocessing, model_selection , svm
from sklearn.linear_model import LinearRegression
df=quandl.get('WIKI/GOOGL')
df=df[['Adj. Open', 'Adj. High', 'Adj. Low', 'Adj. Close', 'Adj. Volume']]
df['HL_PCT']=(df['Adj. High']-df['Adj. Close'])/df['Adj. Close']*100.00
df['PCT_Change']=(df['Adj. Close']-df['Adj. Open'])/df['Adj. Open']*100.00
df=df[['Adj. Close','HL_PCT','PCT_Change','Adj. Volume']]
forecast_col='Adj. Close'
df.fillna(-99999,inplace=True)
forecast_out=int(math.ceil(0.01*len(df)))
print(forecast_out)
df['label']=df[forecast_col].shift(-forecast_out)
df.dropna(inplace=True)
print(df.head())
print(df.tail())
X=np.array(df.drop(columns=['label'])) #Our features: every column except the label
y=np.array(df['label']) #Our target
X=preprocessing.scale(X) #Standardize each feature to zero mean and unit variance
X_train, X_test, y_train, y_test=model_selection.train_test_split(X,y, test_size=0.2)
clf=LinearRegression(n_jobs=-1) #n_jobs=-1 uses all available CPU cores
clf.fit(X_train, y_train)
accuracy=clf.score(X_test, y_test) #For regressors, score returns R^2 (coefficient of determination), not squared error
print(accuracy)
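For scikit-learn regressors, score reports the coefficient of determination R^2, defined as 1 minus the ratio of the residual sum of squares to the total sum of squares. The sketch below computes it by hand on made-up numbers; r_squared is a hypothetical helper written for this illustration, not a scikit-learn function.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    # Squared error of the predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Squared error of always predicting the mean of y_true.
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

y_true = [3.0, 5.0, 7.0, 9.0]           # made-up targets
perfect = r_squared(y_true, y_true)      # predictions match exactly
mean_only = r_squared(y_true, [6.0] * 4) # always predict the mean
print(perfect, mean_only)
```

A perfect fit scores 1.0, a model no better than predicting the mean scores 0.0, and a worse model can score below zero.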
Using support vector regression (SVR):
clf=svm.SVR()
clf.fit(X_train, y_train)
accuracy=clf.score(X_test, y_test) #For regressors, score returns R^2 (coefficient of determination), not squared error
print(accuracy)
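svm.SVR() uses the 'rbf' kernel by default, and the kernel choice can change the score considerably. As a rough sketch, the loop below tries several built-in kernels on synthetic data; the data, the coefficients, and the fixed random_state are assumptions made so the example runs without a quandl download.

```python
import numpy as np
from sklearn import svm, model_selection

# Synthetic, noisy linear data standing in for the stock features (an assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=200)

X_train, X_test, y_train, y_test = model_selection.train_test_split(
    X, y, test_size=0.2, random_state=0)

scores = {}
for kernel in ('linear', 'poly', 'rbf', 'sigmoid'):
    clf = svm.SVR(kernel=kernel)
    clf.fit(X_train, y_train)
    scores[kernel] = clf.score(X_test, y_test)  # R^2 on the held-out split
    print(kernel, round(scores[kernel], 3))
```

On real stock data the ranking of kernels may differ, so it is worth rerunning the comparison on the actual features.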
Regression forecasting and predicting:
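The post gives no code for this step. One possible approach, offered as an assumption rather than the author's method, is to hold back the most recent forecast_out rows (whose labels are unknown), train on the rest, and predict for the held-back rows. The sketch uses a synthetic price series in place of the quandl data, and the names X_lately and forecast_set are illustrative.

```python
import math
import numpy as np
import pandas as pd
from sklearn import preprocessing
from sklearn.linear_model import LinearRegression

# Synthetic upward-drifting series in place of the quandl download (an assumption).
rng = np.random.default_rng(0)
prices = np.linspace(100.0, 200.0, 300) + rng.normal(scale=1.0, size=300)
df = pd.DataFrame({'Adj. Close': prices})

forecast_out = int(math.ceil(0.01 * len(df)))
df['label'] = df['Adj. Close'].shift(-forecast_out)

# Scale all rows together, then split off the rows that have no label yet.
X = preprocessing.scale(np.array(df.drop(columns=['label'])))
X_lately = X[-forecast_out:]       # most recent rows: features known, label unknown
X = X[:-forecast_out]
y = np.array(df['label'].dropna()) # labels for the remaining rows

clf = LinearRegression()
clf.fit(X, y)
forecast_set = clf.predict(X_lately)  # one predicted price per held-back row
print(forecast_set)
```

Each value in forecast_set is the model's estimate of the closing price forecast_out rows after the corresponding held-back row.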