Results-driven Data Engineer specialising in cloud-based distributed systems, real-time streaming pipelines, and scalable ETL architectures. Passionate about transforming raw data into intelligence that drives decisions.
Orchestrated scalable data infrastructure using Azure Data Factory and Databricks, automating ETL workflows that ingest 500M+ records daily with 99.97% uptime and full compliance.
Engineered real-time streaming pipelines using Azure Event Hubs and Databricks Structured Streaming, processing 15 TB+ of telemetry data daily for immediate fault detection.
Designed end-to-end data pipelines on Azure Synapse Analytics ensuring alignment with scalable architecture standards.
Documented pipeline workflows and best practices in Confluence as the subject-matter expert (SME), improving Agile team onboarding and operational consistency.
Machine Learning Engineer
Sayaam for All
JAN – MAY 2025
San Jose, CA
Optimised Snowflake architecture, reducing compute credits by 47% ($15K monthly savings) through advanced clustering and micro-partitioning.
Designed scalable backend data engines in Python and SQL to power recommendation systems, boosting user engagement by 31%.
Implemented CI/CD pipelines with Azure DevOps and Kubernetes, cutting deployment time from 6 days to 4 hours.
Software Engineer
University of Michigan
OCT 2023 – DEC 2024
Michigan, US
Automated ingestion pipelines using Azure Functions and Event Grid, consolidating 14 disparate sources into a unified data lake that processes 2.5M records daily, with a 25% improvement in accuracy.
Built PySpark transformation scripts in Databricks Notebooks ensuring high data quality for downstream analytics.
Delivered Power BI dashboards on Azure SQL enabling trend identification 4 weeks ahead of legacy methods.
Software Engineer
TietoEvry India Pvt Ltd
JAN – DEC 2022
Pune, India
Optimised SQL queries and schemas in Snowflake and Azure SQL, cutting report generation time by 71% and enabling real-time BI.
Championed IaC with Terraform across 8 microservices, reducing provisioning errors by 83%.
Engineered microservices on Azure Kubernetes Service handling 1M+ API requests daily with high availability.
ACADEMIC BACKGROUND
Education
Master of Science in Data Science
University of Michigan — Dearborn
JAN 2023 – DEC 2024
GPA: 4.0 / 4.0
Coursework: Artificial Intelligence · Deep Learning · Cloud Computing · Big Data Visualisation · Intelligent Systems
Bachelor of Engineering in Computer Science
Nagpur University
JUL 2018 – JUN 2022
GPA: 3.77 / 4.0
Coursework: Data Structures · Cloud Computing · System Design · Operating Systems
TECHNICAL ARSENAL
Skills & Technologies
Cloud Platforms
Azure: 95%
AWS: 88%
Google Cloud: 85%
Salesforce: 74%
Data Engineering
Snowflake: 93%
Databricks: 90%
Apache Airflow: 88%
Apache Kafka: 85%
Programming
Python / PySpark: 96%
SQL: 94%
C++ / Java: 80%
Bash / Shell: 84%
DevOps & Tools
Docker / Kubernetes: 88%
Terraform: 86%
Azure DevOps / CI-CD: 90%
Power BI / Tableau: 84%
Python
Azure
GCP
Docker
Kubernetes
Terraform
TensorFlow
Airflow
MongoDB
MySQL
Git
Flask
React
Node.js
FEATURED WORK
Projects
IEEE · IAVVC GERMANY
99.2% DETECTION ACCURACY · 0.35 ms / FRAME
// 01 — FEATURED
DeepCANvas — Vehicle CAN Bus Intrusion Detection
High-throughput ingestion via Azure Event Hubs absorbing 10K msg/s from the vehicle CAN bus. A dual-layer neural network achieves 99.2% accuracy at 0.35 ms per frame on a Raspberry Pi. Selected for IEEE IAVVC Germany.
Azure Event Hubs · TensorFlow · CAN Bus · Deep Learning · Raspberry Pi
Hybrid Lakehouse integrating Azure Blob Storage for raw LiDAR point clouds and camera images with Snowflake for high-performance metadata querying — petabyte-scale storage with fast SQL analytics access.
Azure Blob Storage · Snowflake · LiDAR · Lakehouse
// 03
Personalised Graduation System
Real-time name pronunciation via Cloud Dataflow + TTS at 147 ms latency. 15+ languages. 3 universities. 5,000+ students. 98% satisfaction rate.
Cloud Dataflow · TTS API · Flask
// 04
Payment Risk Alerting
Stream processing on AWS handling 10K+ transactions/second with sub-second suspicious activity alerting. Kinesis → Lambda → Redshift with Spark SQL pattern detection.
Amazon Kinesis · Lambda · Redshift
// 05
Driver Maneuver Recognition
LSTM model classifying 14 driving maneuvers at 95% accuracy with a 500 ms early-prediction window. Deployed on Vertex AI with BigQuery fleet analytics.