
AI Engineer (Computer Vision & Multimodal)

We’re seeking an AI Engineer with strong hands-on experience in computer vision and multimodal (vision + voice) systems—from data preparation and model training to scalable inference and production deployment. You’ll build and ship end-to-end AI solutions: curate datasets, fine-tune and evaluate models, optimize inference pipelines, and operationalize services on cloud infrastructure.

Job Requirements

  • Bachelor’s degree in Computer Engineering / Software Engineering / Electronics / Computer Science or equivalent (Master’s preferred).

  • 0-3 years building production AI systems, with a focus on computer vision (PyTorch/TensorFlow).

  • Strong Python engineering skills; experience with REST/JSON APIs (FastAPI/Flask) and service design (a minimal serving sketch follows this requirements list).

  • Hands-on with cloud deployment for AI (AWS/GCP/Azure), containers (Docker), orchestration (Kubernetes/Cloud Run/ECS), and GPUs.

  • Experience training and serving CV models (ResNet/EfficientNet, YOLO/RetinaNet, U-Net/nnU-Net, Vision Transformers), and familiarity with ASR/TTS pipelines (e.g., Whisper, torchaudio).

  • Practical knowledge of model optimization (ONNX/TensorRT, quantization), and data tooling (Pandas, NumPy, OpenCV, ffmpeg).

  • Proficiency with MLOps tooling: experiment tracking (MLflow/W&B), model registry, CI/CD, monitoring/logging (Prometheus/Grafana/Cloud Monitoring).

  • Strong grounding in evaluation methodology: metrics, ablation studies, error/bias analysis, and reproducible research practices.

  • Comfort with cloud storage, queues, and databases; experience integrating AI services into existing systems.

  • 2+ years of hands-on computer vision experience preferred.

  • Nice-to-Have:

    • Experience with multimodal (vision + voice) workflows; familiarity with the Hugging Face ecosystem.

    • Knowledge of real-time/edge inference, TensorRT-LLM, vLLM.

    • Background in signal processing or audio engineering; familiarity with speaker diarization and the ethics of voice cloning.

    • Experience in regulated domains with human-in-the-loop (HITL) workflows and documentation.
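
To give candidates a feel for the day-to-day stack, here is a minimal sketch of the kind of REST/JSON inference service described above, assuming FastAPI and a pretrained torchvision ResNet-50; the /predict route and top-3 output format are illustrative, not a specification of our services.

    # Minimal FastAPI service wrapping a pretrained torchvision classifier.
    import io

    import torch
    from fastapi import FastAPI, File, UploadFile
    from PIL import Image
    from torchvision import models, transforms

    app = FastAPI()
    model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
    model.eval()

    # Standard ImageNet preprocessing for the ResNet backbone.
    preprocess = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225]),
    ])

    @app.post("/predict")
    async def predict(file: UploadFile = File(...)):
        # Decode the uploaded image and run a single forward pass.
        image = Image.open(io.BytesIO(await file.read())).convert("RGB")
        batch = preprocess(image).unsqueeze(0)
        with torch.no_grad():
            probs = model(batch).softmax(dim=1)
        top = torch.topk(probs, k=3)
        return {"class_ids": top.indices[0].tolist(),
                "scores": top.values[0].tolist()}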

Main Job Duties

  • Model development and inference

    • Implement, fine-tune, and evaluate CV models (classification, detection, segmentation, tracking) using PyTorch/TensorFlow.

    • Build robust inference services (REST/JSON) with batching, streaming, and hardware acceleration (GPU, TensorRT/ONNX).

    • Train and adapt voice/ASR/TTS models and integrate with vision pipelines (multimodal workflows).
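
A minimal sketch of such a multimodal (vision + voice) workflow, assuming Hugging Face pipelines with Whisper for ASR and a ViT image classifier; the model names and file paths are placeholders, not project choices.

    # Toy multimodal flow: transcribe speech and classify an image, then fuse.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model="openai/whisper-small")
    classifier = pipeline("image-classification", model="google/vit-base-patch16-224")

    def analyze(audio_path: str, image_path: str) -> dict:
        # Run each modality independently; audio decoding relies on ffmpeg.
        transcript = asr(audio_path)["text"]
        detections = classifier(image_path)  # list of {"label", "score"} dicts
        return {"transcript": transcript, "vision": detections}

    print(analyze("clip.wav", "frame.jpg"))  # placeholder inputs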

  • Data and evaluation

    • Own dataset lifecycle: collection, cleaning, labeling/specs, augmentation, and versioning.

    • Define metrics and test sets; run offline/online evaluations (accuracy, latency, throughput, calibration) and error analysis (see the evaluation sketch below).

    • Develop data transformation and feature pipelines; maintain data quality checks and bias/fairness assessments.
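
A minimal sketch of the offline-evaluation step referenced above, using scikit-learn metrics plus a rough latency measurement; the label arrays are stand-ins for real model outputs.

    # Offline evaluation: quality metrics and a rough per-call latency figure.
    import time

    import numpy as np
    from sklearn.metrics import classification_report, confusion_matrix

    y_true = np.array([0, 1, 1, 2, 2, 2])   # ground-truth labels (stand-in)
    y_pred = np.array([0, 1, 2, 2, 2, 1])   # model predictions (stand-in)

    print(classification_report(y_true, y_pred))  # per-class precision/recall/F1
    print(confusion_matrix(y_true, y_pred))       # where the errors concentrate

    def mean_latency_ms(fn, batch, n_runs: int = 50) -> float:
        # Average wall-clock time per call, in milliseconds.
        start = time.perf_counter()
        for _ in range(n_runs):
            fn(batch)
        return (time.perf_counter() - start) / n_runs * 1000.0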

  • Production engineering

    • Containerize and deploy models to cloud (AWS/GCP/Azure) using Docker/Kubernetes/Cloud Run/ECS.

    • Implement CI/CD, experiment tracking, a model registry, A/B canary releases, and rollout/rollback strategies (see the tracking sketch below).

    • Build monitoring for drift, performance, and cost; automate retraining or active learning loops.
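
A minimal sketch of the experiment-tracking duty above, assuming MLflow; the experiment name, parameters, and metric values are illustrative placeholders.

    # Log one training run to MLflow so results stay comparable over time.
    import mlflow

    mlflow.set_experiment("cv-detector-baseline")  # illustrative name

    with mlflow.start_run():
        mlflow.log_params({"lr": 3e-4, "epochs": 20, "backbone": "resnet50"})
        # ... training and evaluation would run here ...
        val_map = 0.0  # placeholder; the real value comes from the evaluation step
        mlflow.log_metric("val_mAP", val_map)
        mlflow.log_artifact("confusion_matrix.png")  # assumes this plot was saved earlier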

  • Systems integration

    • Design APIs and modules, integrate with upstream/downstream systems, and ensure reliable contracts and observability (see the schema sketch below).

    • Collaborate with product, design, and backend teams to turn requirements into measurable deliverables.
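
A minimal sketch of an explicit request/response contract, assuming Pydantic models; the field names are illustrative, not an existing API.

    # Explicit request/response schemas keep integration contracts unambiguous.
    from pydantic import BaseModel, Field

    class DetectionRequest(BaseModel):
        image_url: str
        min_confidence: float = Field(0.5, ge=0.0, le=1.0)

    class Detection(BaseModel):
        label: str
        score: float
        box: list[float]  # [x_min, y_min, x_max, y_max] in pixels

    class DetectionResponse(BaseModel):
        request_id: str
        detections: list[Detection]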

  • Performance and optimization

    • Optimize training/inference (quantization, pruning, distillation, mixed precision); leverage ONNX/TensorRT/torch.compile (see the export sketch below).

    • Profile and tune data loaders, GPU utilization, caching, and I/O.
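
A minimal sketch of two of the optimization levers above, assuming PyTorch's ONNX export and dynamic quantization utilities; shapes and file names are illustrative.

    # Export to ONNX, then apply dynamic int8 quantization for CPU inference.
    import torch
    from torchvision import models

    model = models.resnet50(weights=None)
    model.eval()

    # Fixed-shape export; dynamic axes can be added for variable batch sizes.
    dummy = torch.randn(1, 3, 224, 224)
    torch.onnx.export(model, dummy, "resnet50.onnx", opset_version=17,
                      input_names=["image"], output_names=["logits"])

    # Dynamic quantization of Linear layers to int8.
    quantized = torch.ao.quantization.quantize_dynamic(
        model, {torch.nn.Linear}, dtype=torch.qint8
    )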

  • Compliance and safety

    • Implement data governance, privacy, and security best practices; maintain audit trails and documentation.

    • Establish HITL workflows and guardrails for clinical or safety-critical contexts where applicable.

Your Next Step Starts Here

A space to grow, learn, and contribute to purposeful products.
