Python
Pandas
NumPy
Matplotlib
Seaborn
Folium
Foursquare API
Geolocator API
Compared two neighbourhoods, East Central London (ECL) and Birmingham, based on venue categories
Performed Data Cleaning, Visualisation & Machine learning on the obtained Data
Data Cleaning using Pandas
Data Visualisation using Matplotlib
Machine Learning using Scikit-Learn
Used the unsupervised K-means clustering algorithm to form clusters within each city (ECL: 6, Birmingham: 2)
Discussed the application of clustering with venue categories for each neighbourhood
Venue-Based Similarity Measures in East-Central London & Birmingham Pitch
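The clustering step in this project can be illustrated with a minimal sketch. It assumes venues are one-hot encoded by category and averaged per neighbourhood before K-means is applied; the venue records, neighbourhood names and cluster count below are hypothetical placeholders, not the project's data.

```python
import pandas as pd
from sklearn.cluster import KMeans

# Hypothetical venue records: one row per (neighbourhood, venue category).
venues = pd.DataFrame({
    "neighbourhood": ["A", "A", "A", "B", "B", "C", "C", "C", "D", "D"],
    "category": ["Pub", "Pub", "Cafe", "Pub", "Cafe",
                 "Park", "Gym", "Park", "Gym", "Park"],
})

# One-hot encode the categories, then average per neighbourhood to get
# the share of each category among that neighbourhood's venues.
onehot = pd.get_dummies(venues["category"], dtype=float)
onehot["neighbourhood"] = venues["neighbourhood"]
profile = onehot.groupby("neighbourhood").mean()

# Cluster neighbourhoods by their category profile (k=2 is an assumption).
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
profile["cluster"] = kmeans.fit_predict(profile)

print(profile)
```

Neighbourhoods with similar venue mixes receive the same cluster label, which is what makes the clusters comparable across cities.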
Python
Pandas
NumPy
Beautiful Soup
Scikit-Learn
Used a wide range of Python libraries to perform Exploratory Data Analysis & Visualisation
Developed a multiple linear regression model with an accuracy of 75% for predicting house prices in King County, USA.
Used a train, cross-validation and test split and changed the model from linear to polynomial, increasing the accuracy of the model to 94%
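The move from a linear to a polynomial model can be sketched as follows. This is an illustration on synthetic data, not the King County dataset; the feature count, polynomial degree and split sizes are assumptions chosen only to show the workflow of a held-out test split plus cross-validation on the training portion.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic data with a quadratic relationship (stand-in for real features).
rng = np.random.default_rng(0)
X = rng.uniform(0, 5, size=(300, 2))
y = 3.0 * X[:, 0] ** 2 + 2.0 * X[:, 0] * X[:, 1] + rng.normal(0, 1.0, size=300)

# Hold out a test set; cross-validate on the training portion only.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

linear = make_pipeline(StandardScaler(), LinearRegression())
poly = make_pipeline(
    StandardScaler(), PolynomialFeatures(degree=2), LinearRegression()
)

# Mean 5-fold cross-validated R^2 for each candidate model.
cv_scores = {}
for name, model in {"linear": linear, "polynomial": poly}.items():
    cv_scores[name] = cross_val_score(
        model, X_train, y_train, cv=5, scoring="r2"
    ).mean()

# Fit the polynomial model and score it once on the held-out test set.
poly.fit(X_train, y_train)
test_r2 = poly.score(X_test, y_test)

print(cv_scores, test_r2)
```

Selecting the model on cross-validation scores and touching the test set only once keeps the final score an honest estimate of performance on unseen data.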
Python
Pandas
NumPy
Scikit-Learn
Performed data visualisation, preprocessing and normalisation to ensure the data is suitable for analysis
Developed different classifier models for comparison using K-nearest neighbour, decision tree, Support Vector Machine (SVM) and logistic regression algorithms.
Compared the models using F1 score, Jaccard index and log loss, and identified the best model.
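A comparison of this kind can be sketched as below. The dataset is synthetic and the hyperparameters are assumptions, not the project's settings. Log loss needs predicted probabilities, so this sketch computes it only for models that expose `predict_proba`.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, jaccard_score, log_loss
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data (placeholder for the real dataset).
X, y = make_classification(n_samples=400, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

models = {
    "knn": KNeighborsClassifier(n_neighbors=5),
    "tree": DecisionTreeClassifier(max_depth=4, random_state=0),
    "svm": SVC(),
    "logreg": LogisticRegression(max_iter=1000),
}

results = {}
for name, clf in models.items():
    # Normalise features inside the pipeline so scaling is fit on train only.
    model = make_pipeline(StandardScaler(), clf)
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    scores = {
        "f1": f1_score(y_test, pred),
        "jaccard": jaccard_score(y_test, pred),
    }
    # Log loss is only defined for models that output probabilities.
    if hasattr(model, "predict_proba"):
        scores["log_loss"] = log_loss(y_test, model.predict_proba(X_test))
    results[name] = scores

for name, scores in results.items():
    print(name, scores)
```

Higher F1 and Jaccard values are better, while lower log loss is better, so the metrics should be read separately rather than averaged together.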