- with enterprise-grade controls, observability, and human-in-the-loop patterns. This role sits within a cross-functional team... are met. Production Readiness and Operations Implement observability for agents (latency, cost, tool failures, drift, quality signals...
) MLOps & Observability: Implement MLOps best practices, including model versioning, monitoring, logging, and alerting. Leverage... observability tools such as Azure Monitor, Prometheus, and MLflow to ensure reliable, production-grade deployments. Operations...
and observability using logging/monitoring tools. 5. Governance & Security Implement and manage data governance using Unity Catalog...
, testing, observability, and operational readiness. Lead design reviews, establish engineering standards, and mentor engineers...). Experience with observability (monitoring, logging, tracing), incident response, and root-cause remediation. AI Demonstrated...
, etc. and MLOps/LLMOps practices. Ensure platform security, compliance, logging, observability, and performance tuning...
integrations - with enterprise-grade controls, observability, and human-in-the-loop patterns. This role sits within a cross... Readiness and Operations Implement observability for agents (latency, cost, tool failures, drift, quality signals, escalation...
, evangelize DevOps culture and demonstrate best practices including observability/metrics and security. Do you have a passion...
moderation domains Experience with data observability tools and building comprehensive monitoring systems Prior experience...
standards. - Drive engineering excellence across CI/CD, observability, automation. 3) Infrastructure as Code (Terraform... observability, cost optimization, and compliance. - Maintain documentation: architecture diagrams, SOPs, playbooks. Requirements...
and observability validations post-deploy. Payments Domain Validation Author automation for end-to-end payment journeys: initiation.... Familiarity with observability (Grafana, Splunk) to correlate test runs with system telemetry. Understanding of SWIFT/ISO 20022...
maintainability. Promote best practices in testing, observability, reliability, and operational readiness. Cross-Functional Work...
pipelines, troubleshoot failures, and maintain high reliability. Implement logging, monitoring, and observability...
AI deployments using Kubernetes, Docker, and workflow orchestration tools (Airflow, Prefect, Dagster, Temporal). Observability... & Observability: Familiarity with LangSmith, MLflow, Weights & Biases, Arize, or Evidently AI for monitoring, evaluation...
policies, NACL/SG strategy; multi-region HA/DR. Observability & Operations: Implement CloudWatch/OTel, metric/trace/log..., dependency mapping, perf baselines, blue/green cutover. Observability & Ops: CloudWatch, metrics/logging, runbooks, chaos...
understanding of CI/CD, automated testing, and observability practices. Good communication skills and a collaborative, team...
, observability, and human-in-the-loop patterns. This role sits within a cross-functional team with Product, Operations, Technology... observability for agents (latency, cost, tool failures, drift, quality signals, escalation rates). Support CI/CD for agent prompts...