We’re building production-grade NLP systems and need someone who can take a model from research to reliable, scalable deployment. You’ll own the full deployment lifecycle, from containerisation to live inference endpoints.
What you’ll do
• Package, serve, and monitor small language models on Amazon SageMaker Serverless Inference endpoints, tuning them for fast cold starts
• Build slim multi-stage Docker images, push to ECR, and keep inference images under tight size budgets
• Own the build → test → push → deploy CI/CD pipeline for ML services
• Configure IAM roles and manage secrets via AWS Secrets Manager following least-privilege principles
• Version datasets, models, and experiments; instrument latency, throughput, and accuracy in production
• Work with NLP and ML libraries (spaCy, Hugging Face Transformers, FAISS, PyTorch) to build and iterate on NLP pipelines