AI Engineer (Computer Vision & Multimodal)
We’re seeking an AI Engineer with strong hands-on experience in computer vision and multimodal (vision + voice) systems, from data preparation and model training to scalable inference and production deployment. You’ll build and ship end-to-end AI solutions: curate datasets, fine-tune and evaluate models, optimize inference pipelines, and operationalize services on cloud infrastructure.
Job Requirements
-
Bachelor’s degree in Computer Engineering / Software Engineering / Electronics / Computer Science or equivalent (Master’s preferred).
-
0-3 years building production AI systems, with a focus on computer vision (PyTorch/TensorFlow).
-
Strong Python engineering skills; experience with REST/JSON APIs (FastAPI/Flask) and service design.
-
Hands-on experience with cloud deployment for AI (AWS/GCP/Azure), containers (Docker), orchestration (Kubernetes/Cloud Run/ECS), and GPUs.
-
Experience training and serving CV models (ResNet/EfficientNet, YOLO/RetinaNet, U-Net/nnU-Net, Vision Transformers), and familiarity with ASR/TTS pipelines (e.g., Whisper, torchaudio).
-
Practical knowledge of model optimization (ONNX/TensorRT, quantization) and data tooling (Pandas, NumPy, OpenCV, ffmpeg).
-
Proficiency with MLOps tooling: experiment tracking (MLflow/W&B), model registry, CI/CD, monitoring/logging (Prometheus/Grafana/Cloud Monitoring).
-
Strong grounding in evaluation methodology: metrics, ablation studies, error/bias analysis, and reproducible research practices.
-
Comfort with cloud storage, queues, and databases; experience integrating AI services into existing systems.
-
Computer vision experience: 2 years (preferred).
-
Nice-to-Have:
-
Experience with multimodal (vision + voice) workflows; Hugging Face ecosystem.
-
Knowledge of real-time/edge inference, TensorRT-LLM, vLLM.
-
Background in signal processing or audio engineering; speaker diarization, voice cloning ethics.
-
Experience in regulated domains with HITL workflows and documentation.
Main Job Duties
-
Model development and inference
-
Implement, fine-tune, and evaluate CV models (classification, detection, segmentation, tracking) using PyTorch/TensorFlow.
-
Build robust inference services (REST/JSON) with batching, streaming, and hardware acceleration (GPU, TensorRT/ONNX).
-
Train and adapt voice/ASR/TTS models and integrate with vision pipelines (multimodal workflows).
-
Data and evaluation
-
Own dataset lifecycle: collection, cleaning, labeling/specs, augmentation, and versioning.
-
Define metrics and test sets; run offline/online evaluations (accuracy, latency, throughput, calibration) and error analysis.
-
Develop data transformation and feature pipelines; maintain data quality checks and bias/fairness assessments.
-
Production engineering
-
Containerize and deploy models to cloud (AWS/GCP/Azure) using Docker/Kubernetes/Cloud Run/ECS.
-
Implement CI/CD, experiment tracking, a model registry, A/B tests and canary releases, and rollout/rollback strategies.
-
Build monitoring for drift, performance, and cost; automate retraining or active learning loops.
-
Systems integration
-
Design APIs and modules, integrate with upstream/downstream systems, and ensure reliable contracts and observability.
-
Collaborate with product, design, and backend teams to turn requirements into measurable deliverables.
-
Performance and optimization
-
Optimize training/inference (quantization, pruning, distillation, mixed precision); leverage ONNX/TensorRT/torch.compile.
-
Profile and tune data loaders, GPU utilization, caching, and I/O.
-
Compliance and safety
-
Implement data governance, privacy, and security best practices; maintain audit trails and documentation.
-
Establish HITL workflows and guardrails for clinical or safety-critical contexts where applicable.
Your Next Step Starts Here
A space to grow, learn, and contribute to purposeful products.