Work history
Data Analyst Intern
Sustainability data & ESG analytics
Engineered NLP pipelines to convert unstructured ESG disclosures (500+ retailers) into structured features for benchmarking models.
Built LLM normalization + embedding system (Instructor, 1024-d) with Qdrant-backed semantic retrieval.
Reduced embedding dimensionality (1024→15) with UMAP, then clustered topics with HDBSCAN for retrieval-augmented ranking.
Optimized text preprocessing (regex filtering), reducing pipeline latency by 20% and manual review by 30%.
Data Science Intern
Fintech · MSME credit scoring
Built multimodal ML pipeline (speech, vision, transaction data) for MSME credit scoring, increasing ETL throughput by 40%.
Used OWLv2 for prompt-based object detection with non-maximum suppression to extract structured category signals.
Fine-tuned Whisper-small on regional speech data, lowering word error rate (WER) by 31.5%.
Developed anomaly detection dashboards (Tableau) and enforced data integrity via SQL-based validation.
Software Engineer Intern
Cloud & software quality assurance
Led backend/API reliability testing (200+ tests, BrowserStack), identifying and resolving 15+ production vulnerabilities.
Reduced latency by 30% (5 s → 3.5 s) by optimizing UI testing workflows and request handling.