DATA ENGINEER · ML ENGINEER · CLOUD ARCHITECT

Harshit Arvind Barde

Results-driven Data Engineer specialising in cloud-based distributed systems, real-time streaming pipelines, and scalable ETL architectures. Passionate about transforming raw data into intelligence that drives decisions.

500M+ Records / Day
15 TB+ Data Processed
99.97% Uptime Achieved
4.0 MS GPA
CAREER TRAJECTORY

Work Experience

Data Engineer
HCL Tech Company
MAY 2025 – PRESENT
Michigan, US
  • Orchestrated scalable data infrastructure using Azure Data Factory and Databricks, automating ETL workflows that ingest 500M+ records daily with 99.97% uptime and full compliance.
  • Engineered real-time streaming pipelines using Azure Event Hubs and Databricks Structured Streaming, processing 15 TB+ of telemetry data daily for immediate fault detection.
  • Designed end-to-end data pipelines on Azure Synapse Analytics ensuring alignment with scalable architecture standards.
  • Documented pipeline workflows and best practices in Confluence as the subject-matter expert (SME), improving Agile team onboarding and operational consistency.
Machine Learning Engineer
Sayaam for All
JAN – MAY 2025
San Jose, CA
  • Optimised Snowflake architecture, reducing compute credits by 47% ($15K monthly savings) through advanced clustering and micro-partitioning.
  • Designed scalable backend data engines in Python and SQL to power recommendation systems, boosting user engagement by 31%.
  • Implemented CI/CD pipelines with Azure DevOps and Kubernetes, cutting deployment time from 6 days to 4 hours.
Software Engineer
University of Michigan
OCT 2023 – DEC 2024
Michigan, US
  • Automated ingestion pipelines using Azure Functions and Event Grid, consolidating 14 disparate sources into a unified data lake that processes 2.5M records daily, with a 25% accuracy improvement.
  • Built PySpark transformation scripts in Databricks Notebooks ensuring high data quality for downstream analytics.
  • Delivered Power BI dashboards on Azure SQL enabling trend identification 4 weeks ahead of legacy methods.
Software Engineer
TietoEvry India Pvt Ltd
JAN – DEC 2022
Pune, India
  • Optimised SQL queries and schemas in Snowflake and Azure SQL, cutting report generation time by 71% and enabling real-time BI.
  • Championed IaC with Terraform across 8 microservices, reducing provisioning errors by 83%.
  • Engineered microservices on Azure Kubernetes Service handling 1M+ API requests daily with high availability.
ACADEMIC BACKGROUND

Education

Master of Science in Data Science
University of Michigan — Dearborn
JAN 2023 – DEC 2024
GPA: 4.0 / 4.0
Coursework: Artificial Intelligence · Deep Learning · Cloud Computing · Big Data Visualisation · Intelligent Systems
Bachelor of Engineering in Computer Science
Nagpur University
JUL 2018 – JUN 2022
GPA: 3.77 / 4.0
Coursework: Data Structures · Cloud Computing · System Design · Operating Systems
TECHNICAL ARSENAL

Skills & Technologies

Cloud Platforms
Azure: 95%
AWS: 88%
Google Cloud: 85%
Salesforce: 74%
Data Engineering
Snowflake: 93%
Databricks: 90%
Apache Airflow: 88%
Apache Kafka: 85%
Programming
Python / PySpark: 96%
SQL: 94%
C++ / Java: 80%
Bash / Shell: 84%
DevOps & Tools
Docker / Kubernetes: 88%
Terraform: 86%
Azure DevOps / CI-CD: 90%
Power BI / Tableau: 84%
Python · Azure · GCP · Docker · Kubernetes · Terraform · TensorFlow · Airflow · MongoDB · MySQL · Git · Flask · React · Node.js
FEATURED WORK

Projects

IEEE · IAVVC GERMANY
99.2% DETECTION ACCURACY · 0.35 ms / FRAME
// 01 — FEATURED
DeepCANvas — Vehicle CAN Bus Intrusion Detection
High-throughput ingestion via Azure Event Hubs, absorbing 10K msg/s from the vehicle CAN bus. A dual-layer neural network achieves 99.2% accuracy at 0.35 ms per frame on a Raspberry Pi. Selected for IEEE IAVVC Germany.
Azure Event Hubs · TensorFlow · CAN Bus · Deep Learning · Raspberry Pi
// 02
Autonomous Vehicle Sensor Fusion Platform
Hybrid Lakehouse integrating Azure Blob Storage for raw LiDAR point clouds and camera images with Snowflake for high-performance metadata querying, combining petabyte-scale storage with fast SQL analytics access.
Azure Blob Storage · Snowflake · LiDAR · Lakehouse
// 03
Personalised Graduation System
Real-time name pronunciation via Cloud Dataflow and text-to-speech (TTS) at 147 ms latency. Covers 15+ languages, 3 universities, and 5,000+ students, with a 98% satisfaction rate.
Cloud Dataflow · TTS API · Flask
// 04
Payment Risk Alerting
Stream processing on AWS handling 10K+ transactions/second with sub-second suspicious activity alerting. Kinesis → Lambda → Redshift with Spark SQL pattern detection.
Amazon Kinesis · Lambda · Redshift
// 05
Driver Maneuver Recognition
LSTM model classifying 14 driving maneuvers at 95% accuracy with a 500 ms early-prediction window. Deployed on Vertex AI with BigQuery fleet analytics.
LSTM · Vertex AI · BigQuery
RESEARCH

Publications

IEEE SELECTED
DeepCANvas — Neural System for Vehicle CAN Data Anomaly Detection
IEEE IAVVC Germany · 2024
PUBLISHED
iMee: Implementation of Customer-Oriented ERP
IJISRT · Vol 7 · Issue 4 · April 2022 · ID: IJISRT22APR1501
GET IN TOUCH

Contact Me

LOCATION
Michigan, United States