Projects

Here are some projects I have worked on for school, work, or fun.

Multi-modal Detection via Language Queries

Trained multi-modal text-vision models for detecting previously unseen objects in video frames based on natural language queries.

LLMs for Code-Mixed Hate Speech Detection

Fine-tuned 7B-parameter LLMs (LLaMA) and BERT-based models (XLM-R), and used zero-shot LLM prompting, for Hindi-English hate speech detection.

Video Frame Prediction and Segmentation with SSL

Video prediction with convolution-based SimVP and semantic segmentation with U-Net to model the physics of colliding objects.

Causal Counterfactual Forecasting (Causal ML)

Combined causal inference with machine learning to forecast counterfactual outcomes and answer what-if questions.

Implicit Feedback Recommender System (Big Data)

Collaborative Filtering Recommender System in PySpark on a dataset with >179M user interactions.

Data Science to Unravel the Data Science Job Market

Statistical analysis of job data (hypothesis testing), salary prediction (regression), and job title prediction (classification).

ML Pipeline for COVID-19 Case Prediction

A novel method that uses language models and machine learning to predict COVID-19 cases and emerging variants.

Statistical Modeling of Social Interactions

Statistical analysis of contact mixing patterns to devise interventions that curb the spread of infectious diseases.

Document Information Extraction & Generation

CV- and NLP-based pipeline to extract regulatory inquiries from image-based PDFs and generate automated responses. Reduced the manual workload of typing over 1,200 responses each month.

Tweet Surges & COVID-19 Cases

Studied the relationship between tweets and COVID-19 cases in India and analyzed the significant themes associated with the latent dimensions.

Email Fraud Surveillance

Python service that scans the text content and metadata of emails to identify Business Email Compromise (BEC). We identified a 90-year-old customer who had been compromised for over a year.