cv

Experience

Yandex School of Data Analysis

ML course tutor assistant May, 2021 - Present

  • Give lectures on machine learning, including NLP and CV.
  • Conduct seminars and check homework.
  • Implement distributed training and inference pipelines for LLMs (GPT-2XL, OPT-6.7B): model parallelism, data parallelism, pipeline parallelism, memory offloading.

Gradient & Persona: AI Photo & Video mobile editors

Middle Computer Vision Engineer August, 2022 - May, 2023

  • Conducted extensive research on image generation, especially with the Stable Diffusion model. Played a key role in developing a new method of encoding images into its latent space.
  • Ran experiments on Stable Diffusion downstream tasks: custom fine-tuning, introducing new modules, curating task-specific datasets, and implementing papers. Accelerated image generation by 30%.
  • Generated and curated custom datasets, using CLIP, BLIP, StyleGAN, and pix2pix models for processing. The resulting datasets were used to train new models.
  • Trained new re-aging img2img filters for both server and real-time use.
  • Trained new versions of image-warping body-tune models for both images and videos, as well as lightweight body segmentation models, which improved post-processing at inference.
  • Trained dozens of new beauty filters and developed new loss functions for training, which improved the quality of model outputs.
  • Deployed models on both iOS and server using torch.jit and coreml.
  • Participated in regular learning meetups, sharing insights from recent ML papers.

iTechArt

Machine Learning Engineer February, 2022 - August, 2022

  • Designed and implemented an image classification service with a gRPC client/server architecture and the FastAPI framework.
  • Utilized Uvicorn and Prometheus in conjunction with Docker and Supervisord to create a robust and scalable solution.
  • Developed and implemented custom model architectures in C++, resulting in up to a 45% reduction in model latency.
  • Generated synthetic datasets to supplement real data, leading to an increase in model accuracy of up to 10%.
  • Successfully distilled the CNN model into a model that was 3 times smaller while maintaining nearly identical evaluation metrics.
  • Created and curated custom datasets from unstructured client data using Pandas and SQL.
  • Trained numerous time-series models for demand forecasting, reducing forecast MAPE by 20%.
  • Built production pipelines with Airflow that convert raw data into feature vectors, feed them into the model, and forecast product demand.

Yandex

Software Engineer Intern May, 2021 - November, 2021

  • Developed rule-based and NLP-based solutions for affiliation parsing.
  • Developed a data annotation service with the Flask framework, packaged it in Docker, and deployed it to the server.
  • Developed similarity metrics for searching for similar logs.

Education

Yandex School of Data Analysis

Master's-level degree, Machine Learning Developer academic program September, 2020 - June, 2022

  • The two-year Yandex program was created in 2007 and has become Russia’s leading data analysis program.
  • Passed courses: Python, C++, Algorithms and Data Structures, Probability and Statistics, Machine Learning, Computer Vision, Natural Language Processing, Deep Learning, Reinforcement Learning, Efficient DL systems.

Belarusian State University

Bachelor's degree, Faculty of Applied Mathematics and Computer Science September, 2018 - June, 2022

  • Took part in the Educational-Scientific Conference of Students on Recent Methods of ML and Data Analysis.
  • Took part in the Annual Belarusian State University Conference of Students.
  • Prepared educational tasks for the SIRIUS school center.
  • Was a member of the faculty volleyball team.

Skills

Programming languages: C++, Python.

Frameworks: PyTorch, torch.distributed, PyTorch Lightning, Hugging Face, FastAPI, Flask, scikit-learn, OpenCV, numpy, pandas, catboost, xgboost, coreml, bitsandbytes.

Machine Learning: Computer Vision, NLP, classical ML, distributed training, data-parallel training, model-parallel training, model deployment, model distillation, model compression, memory footprint reduction.

Tools: docker, docker-compose, git, Kubernetes, Terraform, Airflow, Prometheus, Postgres, Elasticsearch, Kibana, Logstash, gRPC, TensorRT, ONNX, Hadoop, Spark, Flink, Kafka.