Final-Internship-Project

This repository contains my internship project, built with Python and Streamlit.

Mode of Execution Used

PyCharm

--> Visit the official website of PyCharm:

--> Download the installer for your platform (Linux, macOS or Windows).

--> Two versions of PyCharm are available:

Community version

--> The Community version is open source and free to use without any paid plan.

--> It can be downloaded from the bottom of the PyCharm website.

--> After downloading the Community version, follow the setup wizard to complete the installation.

Professional version

--> This is available at the top of the website and can be downloaded directly from there.

--> After downloading the Professional version, follow the setup wizard and sign up for the free trial, or continue with the paid version.

Using PyCharm

--> PyCharm works with virtual environments. A virtual environment holds all the libraries and frameworks that a project requires.

--> Each project has its own virtual environment, so requirements such as libraries or frameworks are installed for that project only.

--> Next, create a new file. PyCharm offers several file types, such as script files, text files and Jupyter Notebooks.

--> After selecting the required file type and saving the file, run it with the shortcut Shift+F10 (on Windows).

--> Output is shown in the Console, while installations are carried out in the Terminal in PyCharm.
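
PyCharm normally sets up the virtual environment through its interface. As a rough sketch of the same idea outside the IDE, the standard-library venv module can create one. The function name create_project_env and the .venv folder name are my own choices for illustration, not something PyCharm requires.

```python
import venv
from pathlib import Path


def create_project_env(project_dir: str, with_pip: bool = True) -> Path:
    """Create a virtual environment in <project_dir>/.venv and return its path."""
    env_path = Path(project_dir) / ".venv"
    # venv.create builds the isolated environment; with_pip=True also
    # bootstraps pip so libraries can be installed for this project only.
    venv.create(env_path, with_pip=with_pip)
    return env_path
```

Calling create_project_env(".") from the project folder would create the environment next to the code.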

Streamlit Server

--> Streamlit is a Python framework for deploying machine learning models and other Python projects with ease, without having to build the frontend separately.

--> Streamlit is very user-friendly.

--> Streamlit provides pre-defined functions for common frontend components, which can be used directly.

--> To install Streamlit on your system, run this command:

pip install streamlit
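
To give an idea of what these pre-defined functions look like, here is a minimal sketch of a Streamlit script. It is not the code of this project: the title, the widget labels and the helper describe_choice are made up for illustration. Only st.title, st.sidebar.selectbox, st.sidebar.slider and st.write are actual Streamlit calls.

```python
def describe_choice(algorithm: str, test_size: float) -> str:
    """Build the summary line shown to the user (plain Python, no Streamlit needed)."""
    train_pct = round((1 - test_size) * 100)
    return f"{algorithm} selected, training on {train_pct}% of the data"


def main() -> None:
    try:
        import streamlit as st
    except ImportError:
        print("Streamlit is not installed; run: pip install streamlit")
        return

    # Each call below renders one frontend component.
    st.title("Demo app")  # hypothetical title
    algorithm = st.sidebar.selectbox("Algorithm", ["KNN", "SVM", "Naive Bayes"])
    test_size = st.sidebar.slider("Test size", 0.1, 0.5, 0.2)
    st.write(describe_choice(algorithm, test_size))


if __name__ == "__main__":
    main()
```

Saved as app.py, this sketch would be started with streamlit run app.py.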

Running Project in Streamlit Server

Make sure all dependencies are installed before running the app.

A Streamlit app can be run directly with the following command:

streamlit run app.py

where app.py is the name of the file containing the Streamlit code.

By default, Streamlit runs on port 8501.

Several apps can also be run at the same time; each additional one is served on the next port, such as 8502 and so on.

Navigate to the URL http://localhost:8501

You should be able to view the homepage of your app.

🌟 The project and models may change, but this process remains the same for all Streamlit projects.

Deploying using Streamlit

Visit the official website of Streamlit:
Create an account using GitHub.
Add all the code to a GitHub repository.
In Streamlit, choose the option for a new deployment.
Enter your GitHub repository name and specify the file name. If the file is named streamlit_app.py it is picked up automatically; otherwise you have to specify its path.
Make sure all required libraries are listed in a requirements.txt file.
Library versions can also be pinned in this file, in the form library_name==version.
When versions are given in the requirements file, Streamlit installs those versions of the dependencies.
If everything goes well, the app is deployed on the web, and you can share the link and open the app from any browser.
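
One possible way to produce the entries for the requirements file is to read the versions installed in the local environment. This is only a sketch using the standard-library importlib.metadata module; the helper name pinned_requirements is hypothetical, and the package list is taken from the Libraries Used section plus Streamlit.

```python
from importlib.metadata import PackageNotFoundError, version


def pinned_requirements(package_names):
    """Return 'name==version' lines for the packages that are installed."""
    lines = []
    for name in package_names:
        try:
            lines.append(f"{name}=={version(name)}")
        except PackageNotFoundError:
            # Not installed in this environment, so there is nothing to pin.
            continue
    return lines


if __name__ == "__main__":
    wanted = ["streamlit", "numpy", "pandas", "matplotlib", "scikit-learn", "seaborn"]
    print("\n".join(pinned_requirements(wanted)))
```

The printed lines can be copied into the requirements file of the repository.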

About Project :

A complete description of the project and the resources used.

--> In this project I built a Streamlit website where you can apply multiple supervised learning algorithms to a Credit Card Fraud dataset.

--> I also added data visualizations to show how these algorithms work on the dataset.

--> I have deployed this website using Streamlit.

--> Visit the website: ML Algorithms on Credit Card Dataset

Algorithms Used :

Supervised Learning

--> Supervised learning means training the machine on data that is well labelled.

--> That is, each training example is already tagged with the correct answer.

--> The supervised learning algorithm analyses this training data (the set of training examples). When the machine is then given a new set of examples, it uses what it learned from the labelled data to predict their outcomes.

i) K-Nearest Neighbors (KNN)

--> K-Nearest Neighbors is one of the most basic yet essential classification algorithms in machine learning.

--> It belongs to the supervised learning domain and is widely applied in pattern recognition, data mining, and intrusion detection.

--> In this algorithm, a data point is assigned to a category based on the categories of its nearest neighbors.
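
The neighbor-based idea can be sketched in a few lines of plain Python. This is an illustration on made-up toy points, not the code used in the app; the function name knn_predict and the choice of k=3 are assumptions.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest neighbors and count their labels.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy data (made up): two clusters labelled 0 and 1.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = [0, 0, 0, 1, 1, 1]

print(knn_predict(points, labels, (1.1, 1.0)))  # 0
print(knn_predict(points, labels, (5.1, 5.0)))  # 1
```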

ii) Support Vector Machines (SVM)

--> The main idea behind SVMs is to find a hyperplane that maximally separates the different classes in the training data.

--> This is done by finding the hyperplane that has the largest margin, which is defined as the distance between the hyperplane and the closest data points from each class.

--> Once the hyperplane is determined, new data can be classified by determining on which side of the hyperplane it falls.

--> SVMs are particularly useful when the data has many features, and/or when there is a clear margin of separation in the data.
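
To make the hyperplane and margin concrete, here is a small sketch that classifies points with a given hyperplane and measures its margin. It does not train an SVM: the hyperplane x1 + x2 - 6 = 0 and the toy points are hand-picked assumptions, and an SVM would search for the weights that make this margin as large as possible.

```python
import math


def decision_value(weights, bias, point):
    """Signed score w.x + b; its sign tells which side of the hyperplane the point is on."""
    return sum(w * x for w, x in zip(weights, point)) + bias


def classify(weights, bias, point):
    """Return +1 or -1 depending on the side of the hyperplane."""
    return 1 if decision_value(weights, bias, point) >= 0 else -1


def margin(weights, bias, points):
    """Distance from the hyperplane to the closest point in `points`."""
    norm = math.sqrt(sum(w * w for w in weights))
    return min(abs(decision_value(weights, bias, p)) / norm for p in points)


# Toy 2-D data and a hand-picked hyperplane (both are assumptions).
points = [(1.0, 1.0), (2.0, 1.0), (5.0, 5.0), (4.0, 5.0)]
weights, bias = (1.0, 1.0), -6.0

print([classify(weights, bias, p) for p in points])  # [-1, -1, 1, 1]
print(margin(weights, bias, points))
```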

iii) Naive Bayes Classifiers

--> Naive Bayes classifiers are a collection of classification algorithms based on Bayes’ Theorem.

--> It is not a single algorithm but a family of algorithms that share a common principle: every pair of features is assumed to be independent of each other, given the class.

--> The fundamental Naive Bayes assumption is that each feature makes an independent and equal contribution to the outcome.
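
As an illustration of the independence assumption, below is a minimal Gaussian Naive Bayes sketch in plain Python. The Gaussian variant, the function names and the toy data are my own choices for the example; the app itself may use a different variant.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(rows, labels):
    """Estimate the prior and the per-feature mean/variance for each class."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)

    model = {}
    for label, members in grouped.items():
        stats = []
        for column in zip(*members):
            mean = sum(column) / len(column)
            # A small constant keeps the variance strictly positive.
            variance = sum((v - mean) ** 2 for v in column) / len(column) + 1e-9
            stats.append((mean, variance))
        model[label] = (len(members) / len(rows), stats)
    return model


def predict_gaussian_nb(model, row):
    """Pick the class with the highest log prior plus summed log likelihoods."""
    def log_posterior(label):
        prior, stats = model[label]
        total = math.log(prior)
        # Features are treated as independent given the class, so their
        # log likelihoods are simply added together.
        for value, (mean, variance) in zip(row, stats):
            total += -0.5 * math.log(2 * math.pi * variance)
            total += -((value - mean) ** 2) / (2 * variance)
        return total

    return max(model, key=log_posterior)


# Toy data (made up): two features, classes 0 and 1.
rows = [(1.0, 2.0), (1.5, 1.8), (1.2, 2.2), (6.0, 7.0), (6.5, 6.8), (5.8, 7.2)]
labels = [0, 0, 0, 1, 1, 1]
model = fit_gaussian_nb(rows, labels)

print(predict_gaussian_nb(model, (1.1, 2.1)))  # 0
print(predict_gaussian_nb(model, (6.2, 7.1)))  # 1
```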

iv) Decision Tree

--> It builds a flowchart-like tree structure where each internal node denotes a test on an attribute, each branch represents an outcome of the test, and each leaf node (terminal node) holds a class label.

--> It is constructed by recursively splitting the training data into subsets based on the values of the attributes until a stopping criterion is met, such as the maximum depth of the tree or the minimum number of samples required to split a node.

--> The goal is to find the attribute that maximizes the information gain or the reduction in impurity after the split.
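
The search for the best split can be sketched as follows. This is a simplified illustration using entropy-based information gain on made-up rows; the helper names are hypothetical, and a full decision tree would apply best_split recursively until a stopping criterion is met.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(rows, labels, feature, threshold):
    """Entropy reduction from splitting on rows[i][feature] <= threshold."""
    left = [lab for row, lab in zip(rows, labels) if row[feature] <= threshold]
    right = [lab for row, lab in zip(rows, labels) if row[feature] > threshold]
    if not left or not right:
        return 0.0  # The split does not separate anything.
    weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
    return entropy(labels) - weighted


def best_split(rows, labels):
    """Try every feature/threshold pair and return (gain, feature, threshold)."""
    candidates = [
        (information_gain(rows, labels, f, row[f]), f, row[f])
        for row in rows
        for f in range(len(row))
    ]
    return max(candidates)


# Toy data (made up): feature 0 separates the classes, feature 1 does not.
rows = [(1.0, 7.0), (2.0, 3.0), (3.0, 8.0), (8.0, 2.0), (9.0, 9.0), (10.0, 4.0)]
labels = [0, 0, 0, 1, 1, 1]

print(best_split(rows, labels))
```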

v) Random Forest

--> It is based on the concept of ensemble learning, which is a process of combining multiple classifiers to solve a complex problem and to improve the performance of the model.

--> Instead of relying on one decision tree, the random forest takes the prediction from each tree and predicts the final output based on the majority vote of those predictions.

--> A larger number of trees generally leads to higher accuracy and reduces the risk of overfitting.
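
The majority-vote idea can be sketched with very small trees. In this illustration each tree is a one-level stump trained on a bootstrap sample of one-dimensional toy data. The function names, the number of trees and the data are all assumptions made for the example, and a real random forest also samples features at random.

```python
import random
from collections import Counter


def train_stump(samples):
    """Fit a one-level tree on (value, label) pairs by picking the best threshold."""
    best = None
    for threshold, _ in samples:
        left = [label for value, label in samples if value <= threshold]
        right = [label for value, label in samples if value > threshold]
        left_label = Counter(left).most_common(1)[0][0]
        # If nothing falls on the right side, reuse the left label.
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = sum(label != left_label for label in left)
        errors += sum(label != right_label for label in right)
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    _, threshold, left_label, right_label = best
    return lambda value: left_label if value <= threshold else right_label


def train_forest(samples, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    return [
        train_stump([rng.choice(samples) for _ in samples])
        for _ in range(n_trees)
    ]


def forest_predict(forest, value):
    """The final output is the majority vote over all trees."""
    votes = Counter(tree(value) for tree in forest)
    return votes.most_common(1)[0][0]


# Toy data (made up): small values are class 0, large values are class 1.
samples = [(1.0, 0), (1.5, 0), (2.0, 0), (2.5, 0), (3.0, 0),
           (7.0, 1), (7.5, 1), (8.0, 1), (8.5, 1), (9.0, 1)]
forest = train_forest(samples)

print(forest_predict(forest, 0.5))
print(forest_predict(forest, 9.5))
```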

vi) Logistic Regression

--> Logistic regression is a supervised machine learning algorithm mainly used for classification tasks, where the goal is to predict the probability that an instance belongs to a given class.

--> It is a statistical algorithm that analyzes the relationship between a set of independent variables and a binary dependent variable.

--> It is a powerful tool for decision-making.

--> For example, deciding whether an email is spam or not.
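
How the probability is produced can be sketched with a sigmoid and plain gradient descent. This is a minimal illustration on made-up one-feature data; the learning rate, the number of epochs and the function names are assumptions, not the settings used in the app.

```python
import math


def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, row):
    """Probability that `row` belongs to class 1."""
    return sigmoid(sum(w * x for w, x in zip(weights, row)) + bias)


def train_logistic(rows, labels, learning_rate=0.1, epochs=2000):
    """Fit weights and bias with batch gradient descent on the log loss."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for row, label in zip(rows, labels):
            # Difference between the predicted probability and the true label.
            error = predict_proba(weights, bias, row) - label
            for j, value in enumerate(row):
                grad_w[j] += error * value / n
            grad_b += error / n
        weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b
    return weights, bias


# Toy data (made up): one feature, binary label.
rows = [(0.5,), (1.0,), (1.5,), (3.5,), (4.0,), (4.5,)]
labels = [0, 0, 0, 1, 1, 1]
weights, bias = train_logistic(rows, labels)

print(predict_proba(weights, bias, (1.0,)))
print(predict_proba(weights, bias, (4.0,)))
```

A probability above 0.5 would be read as class 1 and a probability below 0.5 as class 0.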

Dataset Used :

Credit Card Fraud Dataset

--> Dataset is taken from:

--> Contains fraud data for classification.

--> The dataset has 31 columns.

--> The dataset is already cleaned, so no preprocessing is required.

Libraries Used 📚 💻

A short description of all the libraries used.

To install a Python library, use this command:

pip install library_name

NumPy (Numerical Python) - Provides a collection of mathematical functions that operate on arrays and matrices.
Pandas (Panel Data / Python Data Analysis) - This library is mostly used for analyzing, cleaning, exploring, and manipulating data.
Matplotlib - A data visualization and graphical plotting library.
Scikit-learn - A machine learning library that provides tools for tasks such as classification and prediction.
Seaborn - A library built on top of Matplotlib, used to create more attractive and informative statistical graphics.

Additional Resources 🧮📚📓🌐

To explore a broader range of my machine learning models, crafted during my internship, please visit my dedicated repository: https://github.com/madhurimarawat/Machine-Learning-Using-Python

Thanks for Visiting 😄

Drop a 🌟 if you find this repository useful.

If you have any doubts or suggestions, feel free to reach out to me.

📫 How to reach me:
