Santhosh Reddy Mudiyala

Software Development Engineer | AI/ML Specialist | Distributed Systems Architect
New York, US

About

Software Development Engineer and AI/ML Specialist with a Master's in Computer Science, experienced in designing, building, and optimizing high-performance distributed systems and scalable AI/ML solutions. Track record of improving system reliability, reducing latency, and raising operational efficiency on platforms including AWS and Azure. Seeking to apply this expertise and strong problem-solving skills to building robust software in challenging technical environments.

Work

Amazon Web Services

Software Development Engineer Intern

Seattle, WA, US

Summary

Engineered core components of the AWS Glue Job Execution Service, focusing on compute provisioning and infrastructure scalability for serverless data workloads.

Highlights

Developed core components of AWS Glue Job Execution Service (JES) in Java, optimizing compute provisioning for serverless data workloads across distributed systems at scale.

Designed and implemented a telemetry-driven Availability Zone (AZ) selection service using AWS Lambda, DynamoDB, and CloudWatch, reducing Warm Pool refresh overhead by 20% and enhancing infrastructure provisioning scalability.

Implemented a fault-tolerant dynamic AZ picker with weighted round-robin routing and multi-level failover, reducing fallback frequency by 40% and strengthening service reliability.

Built a real-time load-testing framework to simulate burst traffic across 100K+ EC2 instances, identifying critical bottlenecks and validating system performance under peak production load.

Stony Brook University

Graduate Research Assistant

Stony Brook, NY, US

Summary

Developed and deployed scalable AI/ML systems for healthcare, focusing on opioid risk prediction and clinical decision support with measurable clinical impact.

Highlights

Built a scalable opioid risk prediction system by training deep learning models on 5M+ EHR records, achieving 0.940 AUROC and 0.825 F1 for high-risk patient identification.

Designed a multi-agent AI system using LangChain, GPT-4, and Azure AI Foundry to retrieve patient history and drug interaction data, generating opioid risk explanations and intervention recommendations with sub-second latency.

Deployed a large-scale ML inference pipeline at Stony Brook University Hospital, integrating predictive models with real-time clinical dashboards via Flask APIs, enabling 50+ clinicians to identify high-risk patients 3x faster than manual review.

ITSS Global

Senior Software Developer

Bangalore, Karnataka, India

Summary

Led the development and optimization of a high-throughput distributed payment processing service handling 5M+ daily transactions, improving its performance, stability, and reliability.

Highlights

Built a distributed payment processing service in C++ using gRPC and Oat++, processing 5M+ daily transactions and reducing P99 latency from 50ms to under 10ms through hot-path optimization and asynchronous I/O.

Optimized payment processing service by introducing memory pools and custom allocators, reducing heap fragmentation and cutting timeout-related payment failures by 25% under peak load.

Engineered a fault-tolerant integration layer using PostgreSQL and Apache Kafka across 5 services, resolving message-ordering issues and reducing failures by 40% with zero message loss.

Diagnosed and fixed 15+ latent C++ memory defects using AddressSanitizer and Valgrind, improving production stability and achieving 99.9% uptime with Docker, Kubernetes, and GitHub Actions CI/CD.

Temenos

Software Development Engineer

Hyderabad, Telangana, India

Summary

Designed and implemented event-driven microservices for payment processing and account management, serving 200K+ users across a distributed fintech platform while ensuring high availability and zero-downtime releases.

Highlights

Designed event-driven microservices for payment processing and account management using Java, Spring Boot, and Apache Kafka, serving 200K+ users across a distributed fintech platform.

Engineered a multi-tier data layer using PostgreSQL for ACID-compliant transactions, MongoDB with sharding, and Redis caching, achieving sub-100ms response times.

Provisioned and managed AWS infrastructure (EC2, Lambda, API Gateway) alongside Docker and Kubernetes, ensuring high availability through auto-scaling and CloudWatch monitoring.

Automated CI/CD pipelines using Jenkins, Docker, and Kubernetes across 20+ microservices, integrating unit/integration tests with 90%+ coverage, reducing deployment time by 40% and enabling zero-downtime releases.

Education

Stony Brook University
Stony Brook, NY, US

Master of Science

Computer Science

Courses

Distributed Systems

Operating Systems

Theory of Database Systems

AI

Machine Learning

Data Science

NLP

Languages

English

Skills

ML & GenAI

PyTorch, Transformers, LLM Fine-Tuning, RAG, LangChain, MLOps, TensorFlow, Hugging Face, Google ADK.

Languages

C, C++, Java, Python, Go, JavaScript, TypeScript, Kotlin, SQL.

Frameworks

Spring Boot, Flask, Django, React.js, Node.js, Express, Pandas, NumPy, JUnit, Mockito.

Databases

PostgreSQL, MySQL, MongoDB, Cassandra, Redis, Spark, FAISS.

Technologies

Docker, Kubernetes, Terraform, Jenkins, Git, Kafka, RabbitMQ, GraphQL, Linux, Grafana, CUDA, CI/CD.

Cloud

AWS (EC2, Lambda, S3, RDS, DynamoDB, Bedrock, SQS, CloudWatch, CloudFormation), GCP, Azure (AI Foundry, AKS).

Projects

Distributed Transaction System

Summary

Engineered a high-throughput transaction system in C++ using Multi-Paxos for consensus and Two-Phase Commit over gRPC, achieving 4,000 TPS with consistent state machine replication. Implemented log-based recovery to guarantee zero data loss and 99.99% transaction consistency, validated against node crashes with fault-injection testing.