Hi, I'm Yaseen

Machine Learning Engineer @ Techverx, BS CS @ Malakand University

Paddling through the ocean that is Machine Learning.

Contact Me

About Me

My Introduction

An undergraduate CS student of the class of 2023, with 25,000+ views on technical articles about AI and ML on Medium.

8 Machine Learning Projects Completed

8 Articles Written

2 Published Papers

Download CV

Skills

My Technical Level

Development

Python: 90%
Java: 80%
SQL: 85%
PySpark: 75%
R: 70%
JavaScript: 70%
Android: 85%
MS Excel: 70%
Photoshop: 70%
InDesign: 90%

Frameworks

NumPy: 80%
pandas: 90%
matplotlib: 70%
scikit-learn: 85%
OpenCV: 75%
TensorFlow: 80%
Pillow: 65%
Spark MLlib: 70%
Looker: 75%
Streamlit: 80%
PyTorch: 85%
seaborn: 70%
Flask: 40%

Machine Learning

Linear and Logistic Regression: 95%
Decision Trees: 95%
Ensemble Models: 90%
Clustering: 65%
Convolutional Neural Networks: 80%
Natural Language Processing: 65%
Exploratory Data Analysis: 90%
Multi-modal Learning: 70%
Time Series: 55%

Cloud Services

AWS SageMaker: 65%
AWS EMR: 75%
AWS Lambda: 70%
BigQuery: 40%

Qualification

My Personal Journey

Education

Bachelor of Science in Computer Science

3.73 out of 4 CGPA
Malakand University, KPK, Pakistan

2019-2023

Higher Secondary in Science

Govt Post Graduate Jahanzeb College, KPK, Pakistan

2017-2019

Secondary

Hira School & College, Mingora, KPK, Pakistan

2005-2017

Work

Data Scientist Intern

Eluvio

June 2022 - August 2022

Teaching Assistant

Rutgers University School of Graduate Studies

September 2021 - May 2022

Taught R, SQL and Amazon Redshift, and graded weekly assignments and exams for 78 students across two courses: “Data 101” and “Database Systems for Data Science”

Business Analyst

Quantiphi

October 2020 - August 2021

Researched and presented highlights of the US stimulus bills to internal stakeholders that informed Quantiphi’s Public Sector business strategy
Performed market research on 200 organizations in the US Education industry and devised an effective go-to-market strategy that converted four cold leads
Presented solution decks to four leads showcasing how machine learning can be incorporated into their existing processes, converting two of them
Analyzed and reported quarterly revenue figures to internal stakeholders using Looker dashboards
Initiated and led the creation of an internal repository to keep track of research advancements in machine learning; this was leveraged by 230 people in the organization including founders

Freelance Android Developer

IPLit Solutions LLP

February 2020 - March 2020

Developed and deployed an Android application for hands-free token printing for use in hospitals and clinics
Currently in use in two hospitals across the city

Project Intern

Fractal Analytics

June 2019 - July 2019

Built a model for classifying 50 products with 80% accuracy that was delivered as part of the consumer behavior analysis project for a Fortune 500 company
Coded a script for scraping images of representative products from e-commerce websites using Selenium and annotated 3500 images from the scraped data to create a dataset for model training

Portfolio

My Projects

Food AI

Cross-Modal Representation Learning

Beat the baseline retrieval performance on the Recipe1M cross-modal food recipe retrieval task by 80% by improving the feature extraction pipeline

Improved retrieval performance by learning shared multi-modal representations using CCA and non-linear neural networks trained using Triplet Loss

Enhanced the explainability of the system by incorporating Vision Transformers and cross-modal attention when learning shared representations
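As a rough illustration of the Triplet Loss idea mentioned above (not the project's actual code), the sketch below computes the loss for a single image/recipe triplet using plain Python lists as embeddings. The function names, the Euclidean distance, the margin value, and the toy vectors are all assumptions made for this example.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=0.3):
    """Hinge-style triplet loss: max(0, d(a, p) - d(a, n) + margin).

    The loss is zero once the matching pair is closer than the
    non-matching pair by at least `margin` (an assumed value here).
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Toy 2-D embeddings (hypothetical): an image, its recipe, and an unrelated recipe.
image_emb = [1.0, 0.0]
matching_recipe_emb = [1.0, 0.0]
other_recipe_emb = [0.0, 1.0]

# Well-separated triplet: no penalty.
loss_easy = triplet_loss(image_emb, matching_recipe_emb, other_recipe_emb)
# Swapped roles: the "positive" is far away, so the loss is large.
loss_hard = triplet_loss(image_emb, other_recipe_emb, matching_recipe_emb)
print(loss_easy, loss_hard)
```

In a real training loop the embeddings would come from the image and recipe encoders, and the loss would be averaged over many sampled triplets.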

Tech Stack

Research Papers Referred

Learning Cross-Modal Embeddings for Cooking Recipes and Food Images

Cross-Modal Retrieval and Synthesis (X-MRS): Closing the Modality Gap in Shared Representation Learning

Recipe1M+: A Dataset for Learning Cross-Modal Embeddings for Cooking Recipes and Food Images

Transformer Decoders with MultiModal Regularization for Cross-Modal Food Retrieval

MCEN: Bridging Cross-Modal Gap between Cooking Recipes and Dish Images with Latent Variable Model

View Code View Report View Presentation

Movie Recommendation from Conversational Data

Natural Language Processing

Obtained a 3% improvement on existing results by implementing the referenced paper from scratch and tuning hyperparameters for all three collaborative filtering (CF) approaches: KNN, SVD and SVDpp.

Experimented with neural CF approaches employing Neural Matrix Factorization as an extension of the paper and obtained comparable results of RMSE=1.232 and MAE=0.9569
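For readers unfamiliar with the two metrics quoted above, here is a minimal sketch of how RMSE and MAE are computed from true and predicted ratings. The helper names and the sample ratings are made up for illustration and are not taken from the project.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error: penalizes large errors more heavily."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error: the average size of the error."""
    absolute = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(absolute) / len(absolute)


# Hypothetical movie ratings on a 1-5 scale.
true_ratings = [4.0, 3.0, 5.0, 1.0]
predicted_ratings = [3.0, 3.0, 4.0, 3.0]

rmse_value = rmse(true_ratings, predicted_ratings)
mae_value = mae(true_ratings, predicted_ratings)
print(round(rmse_value, 4), mae_value)
```

Because RMSE squares each error before averaging, it is always at least as large as MAE on the same predictions, which is consistent with the two figures reported above.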

Tech Stack

Research Papers Referred

You Sound Like Someone Who Watches Drama Movies: Towards Predicting Movie Preferences from Conversational Interactions

View Code View Report View Presentation

Logo Detection

Convolutional Neural Networks

Reproduced the open-set and closed-set logo detection results from the referenced paper and improved them by 12% using a YOLOv5 detector

Obtained a classification accuracy of 22.56% for 47 classes of the Flickr-47 dataset using a logo classification architecture consisting of YOLOv5 and template matching focused on both abstract and textual logos

Tech Stack

Research Papers Referred

Open Set Logo Detection and Retrieval

View Code View Report View Presentation

Autoencoder Image Colorization

Convolutional Neural Networks

Built an 11-layer deep autoencoder neural network using residual connections that colorizes black and white images

Trained the network on 10,000 images from FloydHub and deployed online via Streamlit

Tech Stack

View Code

New York Taxi Fare Prediction

Big Data

Analyzed a 55-million-row dataset on the cloud to determine varying trends in taxi fares across both location and time

Augmented the data with features that help analyze trips to and from airports and across different boroughs of NY City

Predicted taxi fares to an RMSE score of 4.28 by training a Random Forest model on the augmented dataset
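The airport-related features described above could be derived along the following lines. This is only a sketch under stated assumptions: the field names (`pickup_lat`, `pickup_lon`, and so on), the 2 km radius, and the approximate JFK coordinates are illustrative choices, not details taken from the project.

```python
import math

EARTH_RADIUS_KM = 6371.0
# Approximate coordinates of JFK airport, used only for illustration.
JFK = (40.6413, -73.7781)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def add_airport_features(trip, airport=JFK, radius_km=2.0):
    """Return a copy of `trip` with trip distance and airport flags added."""
    enriched = dict(trip)
    pickup_dist = haversine_km(trip["pickup_lat"], trip["pickup_lon"], *airport)
    dropoff_dist = haversine_km(trip["dropoff_lat"], trip["dropoff_lon"], *airport)
    enriched["pickup_near_airport"] = pickup_dist <= radius_km
    enriched["dropoff_near_airport"] = dropoff_dist <= radius_km
    enriched["trip_km"] = haversine_km(
        trip["pickup_lat"], trip["pickup_lon"],
        trip["dropoff_lat"], trip["dropoff_lon"],
    )
    return enriched


# Hypothetical trip from midtown Manhattan to JFK.
trip = {
    "pickup_lat": 40.7580, "pickup_lon": -73.9855,
    "dropoff_lat": 40.6413, "dropoff_lon": -73.7781,
}
features = add_airport_features(trip)
print(features["dropoff_near_airport"], round(features["trip_km"], 1))
```

On a dataset of this size the same logic would normally be expressed as column operations in PySpark rather than per-row Python, but the arithmetic is the same.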

Tech Stack

View Code

FPL Team-Maker

Exploratory Data Analysis

Developed and deployed a customizable application that uses pandas and Exploratory Data Analysis to suggest an optimal team to be entered into the Fantasy Premier League fantasy soccer game

Ranked in the top 2% in worldwide ranking among 8.2 million players by leveraging this application

Tech Stack

View Code

Undergrad Final Year Project

Natural Language Processing

Built a text simplification system that simplifies input text by removing difficult-to-understand words

Designed and trained Transformer models that learned the semantics of the input and recognized the complex words in it

Improved the performance of the application by preceding the transformer architecture with a Complex Word Identification model (90.23% accuracy) that flagged the complex words beforehand

Tech Stack

View Code

Abalone Age Prediction

Machine Learning - Regression

Predicted the ages of abalones (sea snails) from their physical characteristics using classification techniques

Improved the accuracy of age prediction using regression techniques and obtained an MAE of 0.936

Concluded that the dataset is not large enough to reach the desired MAE of 0.5, which would imply correct age prediction

Tech Stack

View Code

Alien Shooter

Python Game Development

Expanded the ‘Space Invader’ game to include three modes of play: Arcade, Timed and Survival

Tech Stack

View Code

Reminder - Todo List

Android Development

Developed an Android application that acts as a combination of a reminder app and a notes app

Published the app on the Google Play Store, where it currently has 50+ installs and a rating of 4.6

Tech Stack

View Code

Research

My Publications

International Journal of Computer Applications

Vol. 178, No. 50 (43-49)

Abstract

Abalones are sea snails or molluscs, commonly called ear shells or sea ears. Because of the economic importance of the age of the abalone and the cumbersome process that is involved in calculating it, much research has been done to solve the problem of abalone age prediction using its physical measurements available in the UCI dataset. This paper reviews the various methods like decision trees, clustering, SVM using Tomek links, CGANs and CasCor used in an attempt to solve it. Furthermore, in contrast to previous research that saw this as a classification problem, this paper approaches it as a linear regression problem and analyses the results.

Read it!

International Journal of Computer Sciences and Engineering

Vol. 8, Issue 6 (1-5)

Abstract

Natural Language Processing is an active and emerging field of research in the computer sciences. Within it is the subfield of text simplification, which aims to teach the computer the so far primarily manual task of simplifying text efficiently. While handcrafted systems using syntactic techniques were the first simplification systems, Recurrent Neural Networks and Long Short-Term Memory networks employed in seq2seq models with attention were considered state-of-the-art until very recently, when the transformer architecture did away with the computational problems that plagued them. This paper presents our work on simplification using the transformer architecture in the process of making an end-to-end simplification system for linguistically complex reference books written in English, and our findings on the drawbacks and limitations of the transformer during the same. We call these drawbacks the Fact Illusion Induction, Named Entity Problem and Deep Network Problem, and try to theorize the possible reasons for them.

Read it!

Certifications

Extra Courses I have Undertaken

Certified Cloud Practitioner

Expiry Date: July 17, 2024

View Certificate

LookML Developer

Expiry Date: March 28, 2022

View Certificate

AWS Machine Learning Engineer Nanodegree

Expiry Date: Does not expire

View Certificate

IBM AI Engineering

Expiry Date: Does not expire

View Certificate

Image and Video Processing: From Mars to Hollywood with a Stop at the Hospital

Expiry Date: Does not expire

View Certificate

Deep Learning Specialization

Expiry Date: Does not expire

View Certificate

Blog

My Technical Articles

How I get financial aid on coursera?

Show your writing skills and try on your own… 🔥 By reading my answers, you will get a rough idea. (I got the Introduction to Artificial Intelligence course by IBM for free by writing this.)

Read it!

A Philosophical Look at Climate Change

… And why it's here to stay

Read it!

10 Points to Make it Big in the Data Industry

People want to make careers here. But they are often deafened by the noise that surrounds them.

Read it!

What Mainstream AI is (Not) Doing

The pandemic accelerated AI adoption — and made Big Tech richer — but did AI adoption happen in the places where it was needed?

Read it!

Introduction to PySpark via AWS EMR and Hands-on EDA

Performing EDA on NY Taxi Fare Dataset to see PySpark in action — because cloud computing is the next big thing!

Read it!

Fantasy Premier League x Data Analysis: Being Among the Top 2%

A brief overview of the application I built, in which I have employed data analysis to power my FPL team up the charts

Read it!

Kernel Regression from Scratch in Python

Everyone knows Linear Regression, but do you know Kernel Regression?

Read it!

Intro to Machine Learning via the Abalone Age Prediction Problem

The best way to dive into ML is to see it in action. Here it is!

Read it!

Contact Me

Get in Touch

Call Me

+92 (302) 9770-128

Email

yaseenuom6@gmail.com

Location

Mingora Swat, KPK, Pakistan
