About

Karim Naguib

Economist and applied Bayesian statistician with 10+ years building hierarchical models for longitudinal and survival data. Current focus: patient-level digital-twin models of oncology tumor dynamics and real-world evidence integration at AstraZeneca. Prior work spans field experiments in East Africa and South Asia, real-estate market design, and health-behavior interventions.

Skills

Methods

Bayesian Modeling. Hierarchical modeling, state-space and longitudinal models, Gaussian processes, MCMC diagnostics and prior sensitivity

Survival Analysis. Multistate and illness-death models, competing risks, joint longitudinal–survival (“digital twin”) models

Validation. Cross-validation and predictive assessment (LFO-CV, PSIS-LOO)

Causal Inference. Experimental design, instrumental variables, difference-in-differences, regression discontinuity

Machine Learning. Deep reinforcement learning, particle filters, Monte Carlo methods, Bayesian neural networks

Programming

Expert. R, Tidyverse, Stan, Julia, Flux.jl, Turing.jl

Proficient. C/C++, Ruby, Python, SQL

Agentic Workflows for Statistical Modeling

Built and maintain an agentic-workflow layer on Claude Code, shared across the AstraZeneca modeling team via an internal plugin marketplace. Components include:

  • Collaborative mathematical derivation: working through statistical mathematics with Claude as a derivation partner before code is written.
  • Persistent memory for model-specific context.
  • Domain-specific sub-agents for Stan diagnostic triage, prior sensitivity review, and targets pipeline exploration.
  • MCP-integrated skills for pipeline operations and Domino job management.
  • Reusable skill and plugin templates codifying team conventions for Bayesian workflow.

Tools

RStudio · VS Code · cmdstanr · targets workflow management · AWS EC2/S3/ECR · Azure · Docker · SLURM · Databricks · Domino · Git & GitHub · Snowflake · PostgreSQL

Education

Boston University — Ph.D. in Economics

Boston, MA · 2014

Fields: development, health, and experimental economics

The American University in Cairo — B.S. in Computer Science (Minor: Mathematics)

Cairo, Egypt · 1999

Experience

Senior Data Scientist — AstraZeneca

Aug 2023 — present

Bayesian digital-twin modeling for oncology clinical trials.

  • Patient-level forecasting of Phase 3 outcomes from emerging Phase 2 evidence, historical trial data, and real-world data, supporting Phase 3 initiation decisions (Ph3ID).
  • Hierarchical state-space tumor-dynamics models (Stein-Fojo framework) with population/trial/patient hierarchy, non-centered parameterization, and heavy-tailed Student-t priors to handle prior-data conflict in sparse cohorts; validated with leave-future-out cross-validation.
  • Multistate (illness-death) survival models decomposing progression-free survival, jointly fit with the tumor-dynamics model so patient-level latent burden drives progression hazard mechanistically.
  • Modular Stan architecture and targets-based pipeline for reproducible trial analyses; routine convergence and sampler diagnostics (R-hat, ESS, E-BFMI, divergences) and prior sensitivity checks.
  • Agentic-workflow layer — collaborative mathematical derivation, diagnostic sub-agents, MCP-integrated pipelines — adopted by the modeling team for deriving and debugging hierarchical Bayesian models.

Independent Researcher

Nov 2022 — Aug 2023

Reinforcement learning for intervention optimization.

  • Built a Julia package that solves a partially observable Markov decision process (POMDP) with Monte Carlo tree search to optimize decisions about which interventions to implement and evaluate.
  • Bayesian hierarchical models for future-state prediction inside the POMDP; particle filters for faster online planning.
  • Project writeup: The Funder’s Meta-Problem.
  • Implemented deep RL agents and models in Julia with Flux.jl.

Senior Data Scientist — Opendoor

May 2021 — May 2022

  • Designed randomized experiments to estimate demand and supply elasticities for pricing policy across regional real-estate markets.
  • Hierarchical Bayesian models and Gaussian-process priors over spatial/temporal structure to pool across markets while preserving local heterogeneity.
  • Worked across experiment design, inference, and decision-facing deployment.

Independent Researcher / Consultant

May 2019 — May 2021

  • Built a structural Bayesian model of behavior using data from a large-scale social experiment conducted in Kenya.
  • Built a hierarchical model to evaluate a scaled economic intervention program in Bangladesh.
  • Gaussian processes to model correlation in intervention effects by spatial proximity.
  • Hierarchical model of COVID-19 transmission heterogeneity under stay-at-home restrictions.
  • R package for bounded identification of counterfactuals in DAGs under imperfect compliance.

Economist — Evidence Action

May 2014 — May 2019

  • Designed and analyzed large-scale randomized evaluations for Evidence Action’s flagship programs: deworming across East Africa and the at-scale replication of the No Lean Season seasonal-migration subsidy in Bangladesh.
  • On No Lean Season, contributed to the analysis that failed to replicate the earlier pilot’s effect on migration and consumption; the finding informed Evidence Action’s decision to close the program — an instance of rigorous evidence driving a hard organizational decision.
  • Study design, power analysis, field data-collection supervision, and Bayesian and frequentist analysis for operational and policy audiences.

Software Design Engineer in Test — Microsoft

Apr 1999 — Mar 2007

  • Testing framework for the Windows Debugging Tools and command-line utilities; C/C++, systems-level testing.