

Senior Software Engineer and Associate Lead with a proven track record at Multicoreware in kernel optimization and deep learning frameworks. Demonstrated leadership in managing teams, optimizing workflows, and improving project outcomes. Proficient in C++ and Python and skilled in CI/CD practices, driving efficiency and stability in complex systems. Passionate about advancing AI technologies through innovative solutions.
Support and bug fixes for customer NN engine
Project Goal: Fix CNN layer mismatch bugs and write a test suite covering sanity and regression tests for the layers and kernels.
Roles and Responsibilities:
• Understanding C and VecC kernels, load/store, and other basic instructions
• Fixed kernel mismatch bugs and pipeline issues
• Worked with the ONNX framework for the reference pipeline
• Test suite development
• Tested failure cases and fixed them
Programming Languages: C, C++. IDE: Eclipse
Kernel Development for ONNX ops
Project Goal: Develop and test ONNX ops, implemented in the Halide language, on an AI accelerator.
Roles and Responsibilities:
• Implemented elementwise binary ops such as add, mul, and sub in Halide for 5D inputs and tested their accuracy with pytest
• Included quantization (scale per channel) for the s8 dtype for the above ops
• Implemented pytest tests for conv transpose, including scale per channel for the s8 dtype
Programming Languages: Python, C++. IDE: Visual Studio
Porting kernels to DSP
Project Goal: Build and optimize various CNN algorithms for a specific DSP with real-time performance.
Roles and Responsibilities:
• Implemented various kernels, including 16-bit Maxpool and Gaussian kernels
• Optimized the kernels to perform close to the theoretical estimate
• VLIW/SIMD-based optimization
Programming Languages: C, C++. IDE: Visual Studio
Kernel development for PyTorch ops
Project Goal: Develop and test PyTorch 1.10 single ops on an AI accelerator.
Roles and Responsibilities:
• Handled multiple dimensions for logsumexp
• Worked on subgraph generation and pass validation (force fallback and control edge)
• Used Autocode for the mm op
• Implemented Nearest1D forward and backward variants for the upsample op
• Wrote Gtest tests for single ops - Tanh, Hardsigmoid, Exp, Addcmul, CrossEntropy, and BinaryCrossEntropy with Logits
Programming Languages: Python, C++. IDE: Visual Studio
Title: Associate Lead