Software engineer focused on data-intensive backend systems, ML tooling, and open-source scientific software. MSc Data Science student at Warsaw University of Technology.

Projects

Payment Event Processing Pipeline — Scala Streaming Backend

Scala 3, Cats Effect, FS2, Redpanda, PostgreSQL, MongoDB

Scala 3 backend for deterministic JSONL replay, Redpanda-backed stream ingestion, PostgreSQL enrichment, explainable risk decisions, and idempotent MongoDB persistence.

  • Built a backend system with JSONL, paced replay, and Redpanda input modes behind one source abstraction.
  • Implemented parsing, validation, normalization, customer enrichment, eligibility checks, and deterministic risk scoring.
  • Used PostgreSQL for customer profiles and MongoDB for processed transactions, eligibility violations, alerts, and risk history.
  • Structured the code around hexagonal architecture, narrow ports, functional streaming, MUnit tests, Docker Compose, and GitHub Actions CI.

Aegis AI — GCP SRE/ChatOps Platform

Google Cloud, Terraform, Cloud Run, GKE, Pub/Sub, BigQuery, Firestore, Slack, Gemini

GCP prototype for cross-project incident detection, Slack alerting, metric-backed follow-up answers, and auditable incident storage.

  • Built a cross-project SRE/ChatOps workflow connecting GKE workload logs, Cloud Logging sinks, Pub/Sub, Cloud Run services, and Slack.
  • Used Firestore for incident sessions and BigQuery for incident lifecycle events, reporting views, and SLO evidence.
  • Integrated Cloud Monitoring and Gemini so engineers could ask Slack follow-up questions with real incident context and metrics.
  • Provisioned Hub and Client infrastructure with separate Terraform stacks, least-privilege IAM, Secret Manager, and documented demo runbooks.

NMAR — R package for estimation under nonignorable nonresponse

CRAN, R, CI/tests, Docs, Simulation studies

R package on CRAN: a unified nmar() API for estimation under nonignorable nonresponse, with method comparisons in simulation studies.

  • Implemented estimators from the literature behind a unified nmar() API.
  • Built reproducible simulation studies for method comparison and validation.
  • Packaged for CRAN with documentation, vignettes, CI, and tests.

Mamut — AutoML toolkit for tabular classification

Python, PyPI, scikit-learn, Optuna, Ensembles, Reports

AutoML workflow for tabular classification: preprocessing, hyperparameter optimization, model comparison, ensemble search, and generated reports.

  • Built preprocessing pipelines for imputation, scaling, encoding, skew correction, outliers, and optional feature reduction.
  • Supported model search across common classifiers with Bayesian or grid search.
  • Added dynamic ensemble search with hard/soft voting, HTML reports, notebook plots, and optional SHAP.

Other projects

  • Real-Time Finance Pipeline — Dockerized big-data stack with NiFi/Kafka, HDFS, Spark, Hive, and HBase (GitHub)
  • QuantumRAG — RAG benchmarking prototype with FAISS, Qiskit, and SQuAD evaluation (GitHub)
  • DermNet — DINOv2 embeddings for clustering (GitHub)
  • DoomRL — PPO/A2C agents for ViZDoom (GitHub)

Leadership

President, Data Science Club (WUT)

2024–2025

  • Organized talks and workshops; hosted guests from Google, ING, and Allegro.
  • Worked with a student team on outreach and events.

Co-organizer, ensembleAI hackathon

2024–2026

  • Coordinated sponsors, logistics, venue, and on-site operations.

Capitalize (student venture, Enactus WUT)

Demo app shipped to Google Play (testing track)

  • Built backend features and APIs with FastAPI.
  • Wrote Python scripts for basic telemetry analysis of Amplitude exports.

Awards

  • 2nd place - Enactus Poland National Competition (Capitalize), 2023
  • Finalist - Consult IT business/technology hackathon (SGH Warsaw School of Economics), 2023
  • Laureate - AGH “Diamond Index” Olympiad in Physics, 2022
  • Finalist - National Technical Knowledge Olympiad (OWT), 2022

Skills

Backend / Systems

  • Scala, Java, Python, R
  • Cats Effect, FS2, FastAPI, Spring Boot
  • Git, Linux, CI/testing, Docker

Data / Streaming

  • SQL, Spark
  • Kafka/Redpanda, NiFi
  • Hive, HDFS, HBase

ML / Evaluation

  • PyTorch, scikit-learn, Transformers
  • Optuna, NumPy, Pandas
  • Model evaluation, reporting, experiment workflows

Languages

  • Polish - native
  • English - C2 (CAE Grade A)
  • German - basic

Contact

Open to software engineering roles in data-intensive backend, data engineering, and machine learning systems.