cv

Experience

Mirage (FKA Captions)

Research Engineer, Member of Technical Staff July, 2023 - Present

  • Contributed to the development of Mirage, a 13B parameter text+audio+image-to-video generation model. Designed novel architectural components, implemented a distributed training pipeline on 1000+ H100 GPUs achieving 40% MFU, and developed context-parallel training for sequence lengths up to 150k tokens.
  • Optimized Mirage Video inference achieving 4x faster startup times and 2x overall speedup through fp8 quantization, custom attention kernels, torch compilation, and inference-time caching with minimal quality degradation.
  • Optimized Mirage Audio model, a 10B parameter text+audio-to-audio model, achieving 30% training throughput improvement and 4x inference speedup through efficient kernels, torch compilation, and HSDP while maintaining quality.
  • Designed and implemented a novel audio-to-landmarks architecture for lip-sync generation through extensive architecture exploration and ablation studies.
  • Designed and implemented an automated evaluation framework for continuous checkpoint assessment using a pub/sub architecture, integrating metrics computation, Weights & Biases logging, and automated video artifact uploads to cloud storage.
  • Evaluated GPU providers and made key infrastructure decisions enabling efficient multi-node training at scale.

Yandex School of Data Analysis

ML course tutor assistant February, 2022 - June, 2023

  • Served as an ML course tutor at Yandex School of Data Analysis, giving lectures on NLP and computer vision and designing coursework on distributed LLM training (GPT-2 XL, OPT-6.7B).

Gradient & Persona: AI Photo & Video mobile editors

Computer Vision Engineer August, 2022 - May, 2023

  • Designed and implemented a novel image encoding method for personalized Stable Diffusion generation, replacing expensive DreamBooth fine-tuning with single-shot encoding from a few images. The approach enabled identity-preserving generation without model fine-tuning; similar methods were later published and widely adopted in 2024.
  • Developed real-time GANs running on mobile devices at HD quality and 60 fps while keeping model size under 2 MB through aggressive quantization and architecture optimization.
  • Optimized Stable Diffusion inference achieving 30% speedup through architectural modifications, custom modules, and efficient sampling strategies.
  • Trained production models for re-aging, body reshaping (images and video), and body segmentation, iterating rapidly on novel architectures and dataset curation strategies.
  • Deployed models to iOS using CoreML and server infrastructure using TorchScript, optimizing for both on-device and cloud inference.

iTechArt

Machine Learning Engineer February, 2022 - August, 2022

  • Designed and implemented an image classification service using a gRPC client/server architecture and the FastAPI framework.
  • Used Uvicorn and Prometheus together with Docker and Supervisord to make the service robust and scalable.
  • Developed and implemented custom model architectures using C++, resulting in up to a 45% reduction in model latency.
  • Generated synthetic datasets to supplement real data, leading to an increase in model accuracy of up to 10%.
  • Distilled the CNN model into a model 3 times smaller while maintaining nearly identical evaluation metrics.
  • Created and curated custom datasets from unstructured client data using Pandas and SQL.
  • Trained numerous time-series models for demand forecasting, reducing forecast MAPE by 20%.
  • Constructed production pipelines with Airflow to convert raw data into feature vectors, feed them into the model, and forecast product demand.

Yandex

Software Engineer Intern May, 2021 - November, 2021

  • Developed rule-based and NLP-based solutions for affiliation parsing.
  • Developed a data annotation service with the Flask framework, packaged it in Docker, and deployed it to the server.
  • Developed similarity metrics for searching for similar logs.

Publications

  • Seeing Voices: Generating A-Roll Video from Audio with Mirage - A. Sundararaman, A. Adishesha, A. Jaegle, D. Bigioi, H. Song, J. Kyl, J. Mao, K. Lan, M. Komeili, S. Athar, S. Babayan, S. Beliasau, W. Buchwalter. arXiv:2506.08279, 2025.
  • How to Fine-Tune Very Large Model if It Doesn’t Fit on Your GPU - Technical article on distributed training techniques for large language models. 600+ reactions, 2022.

Education

Yandex School of Data Analysis

Master's-level academic program, Machine Learning Developer track September, 2020 - June, 2022

  • Relevant coursework: Efficient Deep Learning Systems, Reinforcement Learning, Computer Vision, NLP, Recommendation Systems.

Belarusian State University

Bachelor of Computer Science September, 2018 - August, 2022

  • Awarded a full government scholarship and stipend based on entrance exam results.