(Merit Scholarship, Experiential Learning Award) (GPA – 3.9/4)
Relevant Coursework: Databases; Advanced Statistics; Time Series; Data Science I, II & III; Data Visualization; Social Network Analysis
Electives: Econometrics; Probability; Discrete Mathematical Structures; Operations Research; Financial Management
Work & Experience
* Implemented classification of bill summaries by fine-tuning BERT with the PyTorch framework on data extracted from the Congress.gov API.
* Analyzed data on vocational schools by geocoding addresses, mapping them in QGIS, and visualizing the data with geopandas and ggplot in Python.
* Obituaries Project: Assessed obituary texts using Named Entity Recognition, fuzzy matching, and record linkage techniques. (Git)
* Evaluated name-based race prediction models using classification metrics and developed an ensemble model that improved accuracy.
* Gestational Diabetes Project: Combined and transformed datasets in R, and analyzed biomarker data through visualizations.
* Medicaid Project (ongoing): Developing an ETL pipeline to process claims of 90 million patients on Databricks with Spark SQL.
* Predicted health risks by developing a machine learning pipeline for XGBoost and Random Forest models using the scikit-learn framework.
* Constructed panel datasets from the claims data of multiple health providers using R and generated insights through analysis.
* Enhanced FDR’s 2021 Universal Healthcare proposal with data visualizations supporting comparative analysis, built using Plotly in Python.
* Built a Covid-19 surveillance dashboard in Tableau to analyze Covid case counts, deaths, and oxygen supply requirements in India.
* Streamlined data flow, created a heat map tool using SQL and Tableau, and identified 4 high-risk chemicals, preventing a loss of $2.5MM p.a.
* Created a SARIMA model on sales volume time series data using R to forecast polyethylene sales, improving accuracy by 8%.
* Developed a customer segmentation tool based on K-means clustering and a dynamic pricing tool using large data sets on costs.
* Established a framework to analyze the risk-contributing factors of high-risk assets; finalist in the EM Global Analytics challenge.
* Led a cross-functional team of 12 and managed automation of tools, reducing manpower by an estimated 220 hrs/mth at IAC.
* Identified gaps in fertilizer packaging and provided leads for EMCIPL’s entry into an estimated 160k-ton polymer market.