* Lead Face Tracking, enabling AR masks and try-ons across Instagram, Messenger, and FB camera that reach billions of people, and avatar machine learning solutions that ship on our Oculus platforms and Workrooms. Drive initiatives across AR/VR to evolve and productize facial animation, synthetic faces, and body tracking capabilities in partnership with FRL-Research, FAIR, Workrooms, Avatars Product teams, Messenger, Instagram, and Oculus Hardware.
* Led the Autonomous Driving Tracking team, delivering keynote demos, technology showcases, and software solutions
* Architected, defined, and shipped the VisionWorks CUDA library with a team spread across the world. VisionWorks is designed to enable autonomous cars, drones, and other embedded systems and is the first library to fully support OpenVX. Developed and shipped Structure From Motion (SFM), Stereo Depth Extraction, and Object Tracking built on VisionWorks (gave various talks at Embedded Vision Alliance, SIGGRAPH, and the GTC Conference)
* Drove architecture benchmarking of CV solutions across the industry to define future GPU architectures
* Led to key business wins through vehicle and pedestrian detection technology, structure from motion, and accelerated SLAM primitives; supported key customers including Tesla, Audi, BMW, Baidu, and Google with Computer Vision Libraries (VisionWorks, OpenCV4Tegra, OpenCV with GPUs) and GPU architecture and optimizations.
* Delivered innovative computational camera features on Tegra platforms, including panoramic stitching, HDR, video stabilization, face detection, and tap-to-track autofocus on any object, delivering significant performance scalability over the alternatives.
* Drove OpenCV4Tegra and OpenCV on GPU for over 6 years, contributing NEON optimizations, soft cascades face detection and other GPU acceleration solutions back to open source
* Initiated the very first synthetic-data-driven autonomous car testing for the full-system computer vision/machine learning pipeline
- Defined and delivered CUDA FFT libraries, supporting key customers including Matlab and Oil & Gas industries
- Taught a 3-day CUDA training workshop at the ITU Supercomputing Center, Istanbul.
- Implemented and shipped improved accuracy, better batching schemes, and higher performance for key transform sizes, plus Bluestein for larger prime transform sizes, and built an extensive automated test framework for robustness.
- Designed multi-GPU distributed FFT software architecture using Peer-to-peer and MPI.
- Designed and developed encoder/decoder hardware modules. Designed motion-adaptive quantization block architecture for the encoder.
Implemented reference picture buffer management for the H.264 encoder.
- Implemented low-level software for decryption of audio/video signals.
- Implemented H.264 reference decoder (incomplete)
- Designed interfaces and specifications for a GPU-based transcoder system targeted at mobile devices.
- Designed a block-based Motion Estimation algorithm and implemented the algorithm and its applications on CUDA.
- Architected error resiliency and recovery C model. Guided hardware team in implementing this design.
- Designed and implemented error recovery and resiliency schemes for the Video Processing unit. With these schemes, concurrent engines synchronize and recover from bitstream errors and/or deadlocks.
- Implemented MPEG2 and H.264 codec real-time programming on the video hardware core.
- Designed and implemented inverse transform and scaling, and weighted prediction on second generation SIMD architecture (involved SIMD assembly/intrinsic programming, macroplayer assembly, Tensilica core C programming). Maintained and optimized H.264 decoder.
* Received Sony Outstanding Achievement Award in 2002.
- Designed and implemented H.264 codec on Equator’s VLIW architecture.
- Led a team of 6 people in designing and implementing firmware for a DVD-MPEG decoder chip
* Implemented support for DVD-VR (video recording) format, Extended Data Services (XDS)
* Led a team of 5 people in specifying, designing, and implementing E-STD support for Sony DVD recorders. As a team, successfully brought the product from specification to production on time while maintaining continuous communication with the internal customer in Japan.
- Implemented and verified in hardware the Audio State Machine and Sub-picture control, DVD-Audio format Audio-Still-Video (ASV) download and playback, and audio playback control
- Implemented messaging and “partial” thread task switching in nanoOS real-time kernel. Designed and implemented a test suite to verify the functionality of the features.