Salary Prediction Model

Using Docker Container

Babuakash
May 27, 2021

Task -1

✍🏻 Pull the CentOS container image from Docker Hub and create a new container.

✍🏻 Install Python on top of the Docker container.

✍🏻 In the container, copy or create a machine learning model that you built in a Jupyter notebook.

✍🏻 Create a blog/article/video describing, step by step, what you did to complete this task.

First, I installed Docker on my CentOS host using the following command.

yum install docker-ce --nobest

We can check whether Docker is installed with the command below.

rpm -q docker-ce

Now start the Docker service.

systemctl start docker --now

At this point Docker is installed on our OS, so it is time to pull the CentOS image from Docker Hub.

To pull the CentOS image from Docker Hub, we can use the following command.

docker pull centos

Now we can list the images available on our OS using

docker images

Now run our container.

docker run -it centos

Now, inside this newly created container, we have to install Python and some dependencies.

yum install ncurses python3 -y

Next, install the pandas library using pip3.

pip3 install pandas

Also install the scikit-learn library, which provides linear regression and other functions.

pip3 install scikit-learn

At this point we have installed all dependencies.
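As a quick sanity check before moving on, a small script along these lines can confirm that the packages are importable inside the container. This is only a sketch: the helper name missing_modules is made up for illustration, and the list assumes that scikit-learn is imported as sklearn and that joblib was pulled in as one of its dependencies.

```python
import importlib.util

# Import names of the packages this article relies on.
# Assumption: joblib is installed automatically together with scikit-learn.
REQUIRED = ["pandas", "sklearn", "joblib"]


def missing_modules(names):
    """Return the names from `names` that cannot be found by the import system."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All dependencies are installed.")
```

If anything is reported as missing, install it with pip3 before continuing.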

Now we come to the main part.

Download the salary dataset from Kaggle.

After downloading the dataset, we have to copy the file from the host machine to the Docker container.

So go back to the host terminal and use the docker cp command.

docker cp /root/salary.csv your_container_id:/root/

Now the dataset is copied into the Docker container, and we can train our model using the Python code below.

Python code

import pandas as pd
import sklearn
import joblib
from sklearn.linear_model import LinearRegression

# load the dataset that was copied into the container
data = pd.read_csv("/root/salary.csv")

x = data["YearsExperience"]
y = data["Salary"]
print(x)
print(y)

# create the LinearRegression model
model = LinearRegression()
# scikit-learn expects a 2-D feature array, so reshape x into a single column
x = x.values
x = x.reshape(-1, 1)

# fit the model on x and y
model.fit(x, y)
print("Predicted Salary = ")
print(model.predict([[10]]))

# save the trained model as result.pkl
joblib.dump(model, "result.pkl")

Save the code as test.py and run it with the command python3 test.py.

After running this code, the model is trained and we get a file named result.pkl, which is our saved model.
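For intuition about what training does here: with a single feature, ordinary least squares (the method behind LinearRegression) reduces to a slope and an intercept that can be computed by hand. The sketch below uses plain Python and a tiny made-up dataset, not the Kaggle file; the function name fit_line and the numbers are purely illustrative.

```python
def fit_line(xs, ys):
    """Closed-form least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of the x/y co-deviations
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: years of experience and the matching salaries
years = [1, 2, 3, 4]
salaries = [35000, 40000, 45000, 50000]

slope, intercept = fit_line(years, salaries)
prediction = slope * 10 + intercept  # salary estimate for 10 years
print("slope:", slope, "intercept:", intercept, "prediction:", prediction)
```

The fitted slope and intercept are, in essence, what the saved model file stores.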

Now we can predict the salary by giving the years of experience as input.
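One possible way to do that is to load the saved file with joblib.load and call predict on it. The sketch below is self-contained, so it first trains and saves a throwaway model on made-up numbers in a temporary directory; in the container you would point the path at the result.pkl produced by test.py instead. The helper predict_salary and the sample data are assumptions for illustration only.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical stand-in data (NOT the Kaggle dataset): salary = 30000 + 5000 * years
years = np.array([1, 2, 3, 4, 5], dtype=float).reshape(-1, 1)
salary = 30000 + 5000 * years.ravel()

# Train and save a model with joblib.dump, as test.py does, but in a temp directory
model_path = os.path.join(tempfile.mkdtemp(), "result.pkl")
joblib.dump(LinearRegression().fit(years, salary), model_path)


def predict_salary(years_experience, path=model_path):
    """Load the saved model from `path` and predict the salary for one input value."""
    loaded = joblib.load(path)
    # predict expects a 2-D array: one row, one feature
    return float(loaded.predict([[years_experience]])[0])


predicted = predict_salary(10)
print(f"Predicted salary for 10 years of experience: {predicted:.2f}")
```

Loading the file this way means the model does not have to be retrained every time a prediction is needed.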

Hope you find this article helpful!

This is my GitHub repository: https://github.com/babuakash68/SummerInternship.git

For any queries, contact me on LinkedIn.

https://www.linkedin.com/in/akash-babu-mca-2019/

Thanks for Reading :)
