Hey, I'm Sarvar, an AI engineer and data scientist based in NYC (open to relocating).
Most of my career has been in pharma and biotech. At AbbVie I built agentic systems for statistical workflow automation — LLM pipelines, document Q&A, and a full-stack analysis platform used by biostatisticians day-to-day.
Before that, at NYU Langone I built modeling pipelines for wearable physiological time-series and developed early-signal classifiers for cognitive decline. At Regeneron I built diagnostic software for clinical data quality review.
I've also done ML research at the Feng Lab, evaluating CNNs and multimodal models like CLIP and Qwen-VL on image classification benchmarks.
Outside of those roles, I founded AcadyLearn — an AI app that turns uploaded documents into quizzes and flashcards, built end-to-end with LLM workflows and AWS infrastructure.
I also built an agentic workspace for public health data exploration at a hackathon — LLM-driven loops for profiling, cleaning, and analysis with CDC/Socrata integration. More projects on GitHub.
Visual Science QA via QLoRA Fine-Tuning of SmolVLM-500M
Can a 500M-parameter vision-language model be meaningfully fine-tuned for K–8 science MCQ under strict hardware constraints (5M trainable params, free-tier T4 GPU)? Applied QLoRA with four staged ablations covering rank, alpha, target modules, and scoring method. Reached 0.875 public leaderboard accuracy, up from a 0.819 baseline — with a key finding that LoRA alpha interacts with module scope: raising alpha hurts attention-only adapters but helps when MLP projections are also included.
Python
QLoRA
Vision-Language
Fine-Tuning
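The alpha finding is easier to reason about with the standard LoRA arithmetic in view. This is a minimal sketch, assuming the conventional scaling factor alpha / r (not a rank-stabilized variant) and hypothetical layer dimensions; none of the numbers come from the project's actual configuration.

```python
def lora_scaling(alpha: float, rank: int) -> float:
    """Factor applied to the low-rank update B @ A in standard LoRA."""
    return alpha / rank


def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters added by one adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


# Hypothetical example: a square 1024x1024 projection adapted at rank 8.
per_layer = lora_trainable_params(1024, 1024, rank=8)
print(per_layer)                        # 16384
print(lora_scaling(alpha=16, rank=8))   # 2.0
print(lora_scaling(alpha=32, rank=8))   # 4.0
```

At a fixed rank, doubling alpha doubles the weight of every adapted module's update, so the number of modules being adapted plausibly changes how much total perturbation a given alpha introduces.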
HealthLab Agent
An agentic workspace for public health data exploration built at a hackathon. Users upload CSVs or pull datasets directly from the CDC/Socrata catalog, then run LLM-driven loops for profiling, cleaning, and analysis — generating charts, statistics, and Markdown reports. Integrates PubMed to surface relevant literature alongside the data findings.
Python
FastAPI
Next.js
Pydantic-AI
Agentic
Automated Fetal Health Classification from Cardiotocography
Can ML models reliably flag high-risk pregnancies from CTG signals, reducing reliance on operator interpretation? Compared logistic regression, Lasso, and random forest on 2,126 CTG records (UCI), classifying fetal health as Normal, Suspect, or Pathological. Random forest achieved 94.6% overall accuracy and 96.9% balanced accuracy on the critical Pathological class, with abnormal short-term heart rate variability as the top predictive feature.
R
Classification
Random Forest
Lasso
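Per-class balanced accuracy is commonly computed one-vs-rest as the mean of sensitivity and specificity for that class. The sketch below assumes that definition and uses made-up counts, not the project's confusion matrix (it is written in Python for illustration; the project itself used R).

```python
def balanced_accuracy_one_vs_rest(tp: int, fn: int, tn: int, fp: int) -> float:
    """Mean of sensitivity and specificity for a single class."""
    sensitivity = tp / (tp + fn)  # share of true Pathological cases caught
    specificity = tn / (tn + fp)  # share of non-Pathological cases cleared
    return (sensitivity + specificity) / 2


# Hypothetical counts for illustration only.
score = balanced_accuracy_one_vs_rest(tp=45, fn=15, tn=350, fp=50)
print(score)  # 0.8125
```

Unlike plain accuracy, this metric is not inflated by the large Normal class, which is why it suits a rare but critical class.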
Depression, Antihypertensives, and Uncontrolled Hypertension
Does antihypertensive medication use change how depression affects blood pressure control? Modeled the interaction using multivariable logistic regression and random forest on 39,467 NHANES participants (2005–2020) with survey-weighted analyses. Among medicated patients, each unit increase in depression score raised the odds of uncontrolled hypertension (aOR: +0.04, 95% CI: 0.02–0.06); logistic regression slightly outperformed random forest on AUC (0.79 vs. 0.78).
R
Epidemiology
Logistic Regression
NHANES
Physical Inactivity and Obesity Across U.S. States: A Longitudinal Analysis
Do states with higher inactivity rates see faster obesity growth over time? Applied linear mixed-effects models to 709 state-year observations from CDC BRFSS and Census ACS data (2011–2024). Obesity rose ~0.58 pp/year on average; 67% of total variance was attributable to stable between-state differences (ICC = 0.67). Physical inactivity and poverty both independently predicted higher obesity prevalence after controlling for time trends and clustering.
R
Mixed-Effects Models
Longitudinal
Public Health
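The ICC in this project is a ratio of variance components from a random-intercept model: between-state variance divided by total variance. This small sketch uses hypothetical components chosen to keep the arithmetic easy to follow; they are not the fitted values (Python for illustration; the analysis itself was done in R).

```python
def intraclass_correlation(var_between: float, var_within: float) -> float:
    """Share of total variance due to stable between-group differences."""
    return var_between / (var_between + var_within)


# Hypothetical components: state-level intercept variance 6.0, residual 2.0.
icc = intraclass_correlation(var_between=6.0, var_within=2.0)
print(icc)  # 0.75
```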
Large-Scale Psychometric Evaluation of the Big Five Personality Measure
Does the open-source 50-item Big Five test hold up at scale? Evaluated reliability and validity using a 100,000-observation random sample from 1M+ international respondents (2016–2018). Cronbach's alpha ranged from 0.79 to 0.89 across the five traits, exploratory factor analysis with parallel analysis reproduced the expected five-factor structure, and low inter-trait correlations confirmed divergent validity.
Stata
Psychometrics
Factor Analysis
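For reference, Cronbach's alpha for a k-item scale is k / (k - 1) times one minus the ratio of summed item variances to the variance of the total score. This is a minimal sketch on a tiny invented response matrix, not the survey data (Python for illustration; the project used Stata).

```python
from statistics import pvariance


def cronbach_alpha(items: list[list[float]]) -> float:
    """items[i][j] is respondent j's score on item i of one scale."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # per-respondent sum
    item_variance_sum = sum(pvariance(item) for item in items)
    return (k / (k - 1)) * (1 - item_variance_sum / pvariance(totals))


# Hypothetical 3-item scale answered by 4 respondents.
responses = [
    [1, 2, 3, 4],
    [2, 3, 4, 5],
    [1, 3, 3, 5],
]
print(round(cronbach_alpha(responses), 3))  # 0.981
```

Population variance is used throughout; sample variance gives the same alpha as long as it is applied consistently to both the items and the totals.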
COVID-19 Case Fatality Rate Analysis
How did COVID-19 mortality risk vary across U.S. states over time? Calculated monthly CFR for all 50 states using NYT surveillance data, then visualized trends with faceted line plots and choropleth maps. CFR declined nationally across 2020–2023, with northeastern states peaking highest early in the pandemic.
R
ggplot2
dplyr
Maps
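Monthly CFR is deaths divided by confirmed cases over the same month, expressed as a percentage. The sketch below assumes cumulative month-end totals (the NYT state files report cumulative counts) that are differenced into monthly increments. Whether the project differenced the counts or used cumulative ratios is not stated, and the figures here are invented (Python for illustration; the project used R).

```python
def monthly_cfr(cum_cases: list[int], cum_deaths: list[int]) -> list[float]:
    """Percent CFR per month from cumulative month-end totals."""
    rates = []
    prev_cases, prev_deaths = 0, 0
    for cases, deaths in zip(cum_cases, cum_deaths):
        new_cases = cases - prev_cases
        new_deaths = deaths - prev_deaths
        # Guard against months with no new cases.
        rates.append(100 * new_deaths / new_cases if new_cases > 0 else float("nan"))
        prev_cases, prev_deaths = cases, deaths
    return rates


# Hypothetical cumulative totals for three consecutive months.
print(monthly_cfr([1000, 3000, 7000], [50, 110, 150]))  # [5.0, 3.0, 1.0]
```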
Music Listening Behaviors as Predictors of Depression
Do the genres you listen to predict how depressed you feel? Built a multiple linear regression model on 728 respondents from the MxMH Survey (Kaggle) using genre frequencies and demographics as predictors. The model explained 18.2% of variance in depression scores; age was protective, Classical listening was positively associated with depression, and Country showed a significant negative association.
R
Linear Regression
Mental Health