Summary

Capstone Project: Venue-Based Similarity Measures

Compared two areas, East Central London and Birmingham, based on venue categories
Performed data cleaning, visualisation and machine learning on the obtained data
- Data Cleaning using Pandas
- Data Visualisation using Matplotlib
- Machine Learning using Scikit-Learn
Used the unsupervised K-means clustering algorithm to form clusters within each city (ECL: 6, Birmingham: 2)
Discussed the application of clustering with venue categories for each neighbourhood
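
A minimal sketch of how venue categories can be turned into cluster labels, assuming one row per venue. The toy data, the neighbourhood names A to D, and the choice of k=2 are all hypothetical and are not taken from the project itself.

```python
import pandas as pd
from sklearn.cluster import KMeans

# Hypothetical toy data: one row per venue with its neighbourhood and category.
venues = pd.DataFrame({
    "neighbourhood": ["A", "A", "A", "B", "B", "B", "C", "C", "C", "D", "D", "D"],
    "category": ["Pub", "Pub", "Cafe", "Pub", "Pub", "Pub",
                 "Park", "Museum", "Park", "Museum", "Park", "Park"],
})

# One-hot encode the categories, then average per neighbourhood to get the
# share of each category among that neighbourhood's venues.
onehot = pd.get_dummies(venues["category"], dtype=float)
onehot["neighbourhood"] = venues["neighbourhood"]
profile = onehot.groupby("neighbourhood").mean()

# Cluster the neighbourhood profiles. k=2 is an arbitrary choice for the toy data.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(profile)
profile["cluster"] = labels

print(profile)
```

In this toy example the pub-heavy neighbourhoods A and B should land in one cluster and the park-heavy neighbourhoods C and D in the other.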

Venue-Based Similarity Measures Pitch.pptx

Venue-Based Similarity Measures in East-Central London & Birmingham Pitch

Used a range of Python libraries to perform exploratory data analysis and visualisation
Developed a multiple linear regression model that predicts house prices in King County, USA, with an accuracy of 75%.
Used a train, cross-validation and test split and changed the model from linear to polynomial regression, increasing the accuracy of the model to 94%
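
The move from a linear to a polynomial model can be sketched as below. This is an illustration on synthetic data, not the King County dataset, and it assumes "accuracy" means the R² score, which the summary above does not state. The 60/20/20 split and degree 2 are also assumptions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data with a quadratic relationship (hypothetical, for illustration).
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(300, 1))
y = 2.0 * X[:, 0] ** 2 + X[:, 0] + rng.normal(0, 0.5, size=300)

# Split into 60% train, 20% validation, 20% test.
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, test_size=0.4, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(
    X_rest, y_rest, test_size=0.5, random_state=0)

# Fit a plain linear model and a degree-2 polynomial model on the training set.
linear = LinearRegression().fit(X_train, y_train)
poly = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
poly.fit(X_train, y_train)

# Compare on the validation set, then report the chosen model on the test set.
linear_val_r2 = linear.score(X_val, y_val)
poly_val_r2 = poly.score(X_val, y_val)
best = poly if poly_val_r2 > linear_val_r2 else linear
test_r2 = best.score(X_test, y_test)

print(f"linear R^2 (val): {linear_val_r2:.2f}")
print(f"poly   R^2 (val): {poly_val_r2:.2f}")
print(f"chosen R^2 (test): {test_r2:.2f}")
```

Choosing the model on the validation set and scoring it once on the test set keeps the reported figure from being tuned to the test data.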

Performed data visualisation, preprocessing and normalization to ensure the data is suitable for analysis
Developed different classifier models for comparison, using K-nearest neighbour, decision tree, support vector machine (SVM), and logistic regression algorithms.
Compared the models using F1 score, Jaccard index, and log loss, and identified the best model.
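
A rough sketch of such a comparison, assuming a binary classification task. The synthetic dataset, the hyperparameters, and the rule of picking the best model by F1 score are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, jaccard_score, log_loss
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Hypothetical binary classification data standing in for the real dataset.
X, y = make_classification(n_samples=400, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)

# Normalise features using statistics from the training set only.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

models = {
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "Decision tree": DecisionTreeClassifier(max_depth=4, random_state=0),
    # probability=True lets SVC produce the probabilities that log loss needs.
    "SVM": SVC(probability=True, random_state=0),
    "Logistic regression": LogisticRegression(),
}

results = {}
for name, model in models.items():
    model.fit(X_train_s, y_train)
    pred = model.predict(X_test_s)
    proba = model.predict_proba(X_test_s)
    results[name] = {
        "f1": f1_score(y_test, pred),
        "jaccard": jaccard_score(y_test, pred),
        "log_loss": log_loss(y_test, proba),
    }

# One possible selection rule: highest F1 score on the held-out data.
best = max(results, key=lambda n: results[n]["f1"])
for name, scores in results.items():
    print(name, {k: round(v, 3) for k, v in scores.items()})
print("best by F1:", best)
```

Higher is better for F1 and Jaccard, while lower is better for log loss, so the three metrics may not agree on a single winner.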
