KAUSHIK SONGARWALA

Data Engineer & Analytics Expert

Seasoned AI data engineer focused on building AI- and LLM-powered data pipelines and cloud-native generative AI platforms. Delivered a $1M+ claims reconciliation pipeline with 99.9% uptime, integrating LLMs and generative AI workflows to reduce manual processing by 40%. Skilled in MLOps, feature engineering, real-time analytics, and scalable AI/ML architectures.

Experience

Data Engineer

Intellirev

May 2025 - Present
  • Designed end-to-end ETL pipelines using Python (Pandas, PySpark) and MongoDB, reconciling $1M+ in annual claims with 99.9% uptime
  • Integrated ML models into workflows, reducing manual reconciliation time by 40% and detecting 95% of billing anomalies
  • Architected HIPAA-compliant AWS environment handling 500+ monthly transactions, optimizing throughput by 30%
  • Collaborated with data scientists and Power BI developers to boost collected revenue by 15%

Data Engineer

Spectrum

Mar 2024 - Apr 2025
  • Built ETL pipelines and SQL transformations, improving throughput by 40% and reducing reporting delays by 2+ days
  • Created Tableau dashboards integrated with data lakes for executive-level visibility
  • Developed containerized FastAPI microservices with CI/CD, cutting deployment cycles by 50%
  • Standardized data models, improving reporting accuracy to 99%

Data Engineer

BNY Mellon

Jul 2023 - Mar 2024
  • Engineered data integration pipelines across Snowflake, MySQL, and SAP, increasing efficiency by 30%
  • Delivered Tableau and Power BI dashboards leveraging Databricks, cutting latency by 35%
  • Implemented data governance with AWS IAM and audit logs, ensuring compliance

Data Analyst

Awaywegoo

Jul 2019 - Jul 2021
  • Developed PostgreSQL analytics platform supporting 10K+ monthly queries
  • Built modular ETL pipelines using Alteryx and MySQL, reducing latency by 40%
  • Created reusable dashboard templates promoting analytics self-service adoption

Skills & Technologies

Python
SQL
AWS
Snowflake
Databricks
Airflow
PySpark
Pandas
MySQL
PostgreSQL
MongoDB
Tableau
Power BI
Machine Learning
ETL Pipelines
Git
Docker
FastAPI
SAP
GraphQL
Agile/SCRUM
Data Governance

Featured Projects

Fraud Detection System

Interactive Streamlit application for real-time fraud detection using advanced machine learning algorithms. Features comprehensive data analysis, model training, and predictive capabilities with an intuitive dashboard interface.

View Live Demo →

Synthetic Data Generation using Generative AI

Comprehensive data analysis and ETL pipeline built in Google Colab. Implements advanced data processing techniques, feature engineering, and visualization methods for extracting actionable insights from complex datasets.

Open in Colab →

Document Analysis using LLMs

Advanced machine learning model development and experimentation notebook. Includes model selection, hyperparameter tuning, cross-validation, and performance optimization workflows for production deployment.

Open in Colab →

Education

Stevens Institute of Technology

Master of Science in Information Systems

Aug 2021 - May 2023

GPA: 3.819 / 4.0

Get In Touch

I'm always interested in hearing about new opportunities, collaborations, or just connecting with fellow data enthusiasts. Feel free to reach out!

📍 Jersey City, NJ, USA 📱 +1 (551) 689-6916 📧 songarwala@gmail.com 💼 LinkedIn

Summary

Seasoned data engineer with a strong track record in designing and maintaining end-to-end ETL pipelines and cloud-native data platforms. Delivered high-impact solutions such as a $1M+ claims reconciliation pipeline on AWS that achieved 99.9% uptime and integrated machine-learning models to cut manual processing time by 40%. Also built scalable data integration and CI/CD workflows that boosted reporting throughput by 40% and cut deployment cycles by 50%. Seeking to apply this expertise to accelerate data-driven decision-making and improve operational efficiency.

Professional Experience

Data Engineer
Intellirev
May 2025 - Present
  • Designed and maintained end-to-end ETL pipelines using Python (Pandas, PySpark) and MongoDB, reconciling $1M+ in annual claims and payments with 99.9% data uptime, enabling real-time visibility across financial and supply chain functions.
  • Integrated machine learning models into ETL workflows, reducing manual reconciliation time by 40% and detecting 95% of billing anomalies before submission, improving automation accuracy and compliance.
  • Architected a HIPAA-compliant cloud environment on AWS (EC2, S3, Lambda) handling 500+ monthly data transactions, optimizing throughput by 30% while ensuring regulatory-grade data security.
  • Collaborated with data scientists, Power BI developers, and SAP stakeholders to operationalize ML-driven claim analysis, boosting collected revenue by 15% and surfacing actionable insights for supply chain and risk mitigation strategies.
Data Engineer
Spectrum
Mar 2024 - Apr 2025
  • Built and deployed ETL pipelines and SQL transformations in MySQL, improving throughput by 40% and reducing reporting delays by 2+ days across business intelligence workflows.
  • Created Tableau dashboards integrated with data lakes, enhancing executive-level visibility into marketing and procurement performance.
  • Developed containerized FastAPI microservices and implemented CI/CD with GitHub Actions, cutting deployment cycles by 50% and ensuring consistent delivery of analytics applications.
  • Partnered with cross-functional teams to standardize data models, Power Automate flows, and SQL templates, improving reporting accuracy to 99% and reducing repetitive manual processes by 35%.
Data Engineer
BNY Mellon
Jul 2023 - Mar 2024
  • Engineered and automated data integration pipelines across Snowflake, MySQL, and SAP, increasing financial reporting efficiency by 30% and streamlining compliance workflows.
  • Delivered Tableau and Power BI dashboards leveraging Databricks real-time data streams, cutting latency of market risk insights by 35% and enabling faster executive response.
  • Implemented data governance and lineage tracking with AWS IAM, audit logs, and Power Automate notifications, ensuring secure, traceable, and compliant data flows.
Data Analyst
Awaywegoo
Jul 2019 - Jul 2021
  • Developed a PostgreSQL analytics platform supporting 10K+ monthly queries, accelerating GTM and supply chain decision-making processes.
  • Built modular ETL pipelines using Alteryx and MySQL, reducing batch processing latency by 40% and increasing data reliability for executive reporting.
  • Created reusable dashboard templates for investment teams, reducing repetitive requests and promoting analytics self-service adoption across teams.

Education

Master of Science in Information Systems
Stevens Institute of Technology
Aug 2021 - May 2023

GPA: 3.819 / 4.0

Technical Skills

Python, SQL, AWS, Snowflake, Databricks, Airflow, PySpark, Pandas, MySQL, PostgreSQL, MongoDB, NoSQL, Oracle, Tableau, Power BI, Machine Learning, Git, Docker, FastAPI, SAP, GraphQL, Alteryx, R, Google Analytics, Agile/SCRUM, Linux/Unix, Data Governance, Business Analytics, Operations Research, Financial Analysis