shivamk3r.com

Projects

Selected systems work.

Representative projects across benchmark design and applied computer vision.

Computer vision benchmark

ShapeCodeBench

A synthetic benchmark for evaluating whether multimodal models can reconstruct executable drawing programs from rendered geometric scenes.

  • Designed a deterministic image-to-program task with a small Python-like drawing DSL and render-based scoring.
  • Released a frozen eval_v1 split, baseline model results, paper source, and reproducibility artifacts.
  • Published the project as an arXiv preprint with a permanent Zenodo archive.
Python, Computer vision, Program synthesis, Benchmarking, arXiv
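
As a rough illustration of what render-based scoring can look like, the sketch below renders a predicted program and a reference program onto a small grid and compares the two canvases by intersection-over-union. Everything here is assumed for illustration: the single `rect(x, y, w, h)` command, the 8x8 canvas, and the IoU metric are hypothetical stand-ins, not the actual ShapeCodeBench DSL or scoring rule.

```python
import re

# Hypothetical one-command DSL: rect(x, y, w, h). Not the real benchmark DSL.
RECT = re.compile(r"\s*rect\((\d+),\s*(\d+),\s*(\d+),\s*(\d+)\)\s*")


def render(program: str, size: int = 8) -> list[list[int]]:
    """Deterministically rasterize a program onto a size x size binary canvas."""
    canvas = [[0] * size for _ in range(size)]
    for line in program.strip().splitlines():
        match = RECT.fullmatch(line)
        if match is None:
            raise ValueError(f"unsupported statement: {line!r}")
        x, y, w, h = map(int, match.groups())
        # Clip the rectangle to the canvas bounds.
        for row in range(y, min(y + h, size)):
            for col in range(x, min(x + w, size)):
                canvas[row][col] = 1
    return canvas


def iou(a: list[list[int]], b: list[list[int]]) -> float:
    """Intersection-over-union of two binary canvases of equal size."""
    pairs = [(pa, pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb)]
    inter = sum(pa & pb for pa, pb in pairs)
    union = sum(pa | pb for pa, pb in pairs)
    # Two empty canvases are treated as a perfect match.
    return 1.0 if union == 0 else inter / union


def score(predicted: str, reference: str) -> float:
    """Score a predicted program by comparing renders, not source text."""
    return iou(render(predicted), render(reference))
```

Because the comparison happens on rendered output, two programs that draw the same scene with different statements would receive the same score under this sketch; for example, `score("rect(0, 0, 4, 2)", "rect(0, 0, 4, 4)")` compares an 8-pixel render against a 16-pixel one.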

Multimodal benchmarking

VLMBench / multimodal model evaluation

An evaluation tool for comparing multimodal language models with a consistent prediction-to-ground-truth methodology.

  • Evaluated model outputs across image and text tasks using repeatable scoring flows.
  • Built around provider APIs, cloud storage, structured samples, and result analysis.
  • Focused on making model comparisons auditable and easier for teams to reason about.
Python, GCP, OpenAI API, Gemini API, Evaluation design
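
A minimal sketch of a prediction-to-ground-truth scoring flow is shown below, assuming structured samples and exact-match scoring after light normalization. The `Sample` fields, the `normalize` rule, and the choice of exact match are all hypothetical; the tool's real schema and metrics are not described here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """Hypothetical structured record: one model output paired with its label."""
    model: str
    sample_id: str
    prediction: str
    ground_truth: str


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting does not affect scores."""
    return " ".join(text.lower().split())


def accuracy_by_model(samples: list[Sample]) -> dict[str, float]:
    """Apply the same scoring rule to every model and report per-model accuracy."""
    correct: dict[str, int] = defaultdict(int)
    total: dict[str, int] = defaultdict(int)
    for s in samples:
        total[s.model] += 1
        if normalize(s.prediction) == normalize(s.ground_truth):
            correct[s.model] += 1
    return {model: correct[model] / total[model] for model in total}
```

Keeping each prediction next to its ground truth in one record means any reported number can be traced back to the individual samples that produced it, which is one way to keep comparisons auditable.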

Computer vision and applied ML

DSM reconstruction ML system

A patented machine-learning system for generating Digital Surface Models from imagery, reducing reliance on external elevation sources.

  • Built data collection and preprocessing workflows for large-scale model training.
  • Worked across computer vision modeling, evaluation, and production integration.
  • Extended operational coverage for geospatial workflows where elevation data was limited.
Python, PyTorch, OpenCV, GCP, Computer vision