LLMs and Agents in DevOps Workflows Training Course
Large language models (LLMs) and autonomous agent frameworks such as AutoGen and CrewAI are transforming the way DevOps teams automate tasks like change tracking, test generation, and alert triage by mimicking human-like collaboration and decision-making processes.
This instructor-led, live training (available online or on-site) is designed for advanced-level engineers who want to design and implement DevOps automation workflows using large language models (LLMs) and multi-agent systems.
By the end of this training, participants will be able to:
- Integrate LLM-based agents into CI/CD workflows for intelligent automation.
- Automate test generation, commit analysis, and change summaries using these agents.
- Coordinate multiple agents for alert triaging, response generation, and DevOps recommendations.
- Create secure and maintainable agent-powered workflows using open-source frameworks.
Format of the Course
- Interactive lecture and discussion sessions.
- Ample exercises and practice opportunities.
- Hands-on implementation in a live-lab environment.
Course Customization Options
- To request customized training for this course, please contact us.
Course Outline
Introduction to LLMs and Agent Frameworks
- Overview of large language models in infrastructure automation
- Key concepts in multi-agent workflows
- AutoGen, CrewAI, and LangChain: use cases in DevOps
Setting Up LLM Agents for DevOps Tasks
- Installing AutoGen and configuring agent profiles
- Using OpenAI API and other LLM providers
- Setting up workspaces and CI/CD-compatible environments
Automating Test and Code Quality Workflows
- Prompting LLMs to generate unit and integration tests
- Using agents to enforce linting, commit rules, and code review guidelines
- Automated pull request summarization and tagging
LLM Agents for Alert Handling and Change Detection
- Designing responder agents for pipeline failure alerts
- Analyzing logs and traces using language models
- Proactive detection of high-risk changes or misconfigurations
Multi-Agent Coordination in DevOps
- Role-based agent orchestration (planner, executor, reviewer)
- Agent messaging loops and memory management
- Human-in-the-loop design for critical systems
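As an illustration of the planner, executor, and reviewer roles named above, the following is a minimal sketch in plain Python rather than in AutoGen or CrewAI. Every name in it (`Step`, `RunLog`, `plan`, `review`, `run`, and the `approve` callback) is hypothetical and not taken from the course material; a real planner would call an LLM, and a real executor would act on infrastructure. The sketch only shows how a human approval gate can sit in front of critical steps.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Step:
    description: str
    critical: bool = False  # critical steps require human approval


@dataclass
class RunLog:
    executed: List[str] = field(default_factory=list)
    skipped: List[str] = field(default_factory=list)


def plan(task: str) -> List[Step]:
    # Hypothetical planner: a real one would ask an LLM to break down the task.
    return [
        Step(f"collect logs for: {task}"),
        Step(f"restart service for: {task}", critical=True),
    ]


def review(step: Step) -> bool:
    # Hypothetical reviewer: rejects steps with an empty description.
    return bool(step.description.strip())


def run(task: str, approve: Callable[[Step], bool]) -> RunLog:
    log = RunLog()
    for step in plan(task):
        # A step is skipped if the reviewer rejects it, or if it is
        # critical and the human approver declines it.
        if not review(step) or (step.critical and not approve(step)):
            log.skipped.append(step.description)
            continue
        # Executor stand-in: record the step instead of touching real systems.
        log.executed.append(step.description)
    return log


if __name__ == "__main__":
    result = run("failed deploy", approve=lambda step: False)
    print("executed:", result.executed)
    print("skipped:", result.skipped)
```

Passing `approve` as a callback keeps the gate replaceable: in a lab it can be a console prompt, while in a pipeline it could wait on a chat or ticket approval.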
Security, Governance, and Observability
- Handling data exposure and LLM safety in infrastructure
- Auditing agent actions and restricting scope
- Tracking pipeline behavior and model feedback
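One possible way to picture "auditing agent actions and restricting scope" is an allowlist checked before every action, with each decision written to an audit trail. The sketch below is an assumption-laden illustration, not the course's implementation: `ScopedAgentGateway`, `ALLOWED_ACTIONS`, and the action names are all hypothetical.

```python
import json
from datetime import datetime, timezone
from typing import Dict, Iterable, List

# Hypothetical allowlist of actions an agent may perform.
ALLOWED_ACTIONS = {"read_logs", "summarize_diff", "open_ticket"}


class ScopedAgentGateway:
    """Checks each requested action against an allowlist and records it."""

    def __init__(self, allowed: Iterable[str]) -> None:
        self.allowed = set(allowed)
        self.audit_trail: List[Dict[str, str]] = []

    def request(self, agent: str, action: str) -> bool:
        permitted = action in self.allowed
        # Every request is logged, whether it was allowed or denied.
        self.audit_trail.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "agent": agent,
            "action": action,
            "decision": "allowed" if permitted else "denied",
        })
        return permitted

    def export(self) -> str:
        # One JSON object per line, a common shape for log shipping.
        return "\n".join(json.dumps(entry) for entry in self.audit_trail)


if __name__ == "__main__":
    gateway = ScopedAgentGateway(ALLOWED_ACTIONS)
    gateway.request("triage-agent", "read_logs")
    gateway.request("triage-agent", "delete_cluster")
    print(gateway.export())
```

Logging denied requests as well as allowed ones matters here: the denied entries are often the most useful signal when reviewing what an agent attempted.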
Real-World Use Cases and Custom Scenarios
- Designing agent workflows for incident response
- Integrating agents with GitHub Actions, Slack, or Jira
- Best practices for scaling LLM integration in DevOps
Summary and Next Steps
Requirements
- Experience with DevOps tooling and pipeline automation
- Working knowledge of Python and Git-based workflows
- Understanding of LLMs or exposure to prompt engineering
Audience
- Innovation engineers and AI-integrated platform leads
- LLM developers working in DevOps or automation
- DevOps professionals exploring intelligent agent frameworks
Related Courses
Agentic Development with Gemini 3 and Google Antigravity
21 Hours
Google Antigravity is a sophisticated development environment designed for creating autonomous agents that can plan, reason, code, and act using Gemini 3’s advanced multimodal capabilities.
This instructor-led, live training (available both online and onsite) is targeted at advanced-level technical professionals who are interested in designing, building, and deploying autonomous agents with the help of Gemini 3 and the Antigravity environment.
Upon completing this training, participants will be equipped to:
- Develop autonomous workflows that leverage Gemini 3 for reasoning, planning, and execution.
- Create agents in Antigravity that can analyze tasks, write code, and interact with various tools.
- Integrate Gemini-driven agents into enterprise systems and APIs.
- Enhance the behavior, safety, and reliability of agents in complex environments.
Format of the Course
- Expert-led demonstrations paired with interactive discussions.
- Hands-on experience in developing autonomous agents.
- Practical implementation using Antigravity, Gemini 3, and associated cloud tools.
Course Customization Options
- If your team needs specific agent behaviors or custom integrations for a particular domain, please contact us to customize the program accordingly.
Advanced Antigravity: Feedback Loops, Learning & Long-Term Agent Memory
14 Hours
Google Antigravity is an advanced framework designed for experimenting with long-lived agents and emergent interactive behaviors.
This instructor-led, live training (available both online and onsite) is targeted at advanced-level professionals who aim to design, analyze, and optimize agents that can retain memories, improve through feedback, and evolve over extended operational periods.
Upon completing this course, participants will acquire the skills to:
- Develop long-term memory structures for agent persistence.
- Implement effective feedback mechanisms to influence agent behavior.
- Assess learning trajectories and model drift.
- Integrate memory systems into complex multi-agent environments.
Format of the Course
- Expert-led discussions combined with technical demonstrations.
- Hands-on exploration through structured design challenges.
- Application of concepts in simulated agent environments.
Course Customization Options
- If your organization requires customized content or case-specific examples, please contact us to tailor this training to your needs.
AIOps Foundation – Accredited Training
35 Hours
AIOps is a rapidly evolving field that addresses the needs of modern, complex IT environments, especially those operating within cloud architectures. The AIOps Foundation course provides a comprehensive introduction to the concepts, technologies, and practices related to using artificial intelligence in IT operations.
The program delves into the background of AIOps, its core principles, tools, and the organizational challenges faced by IT teams when adopting these approaches.
The training concludes with an exam. Successfully passing it earns you the globally recognized AIOps Foundation certification, which remains valid for three years.
Who is it for?
This course is designed for professionals and managers involved in:
- IT operations
- DevOps and Site Reliability Engineering (SRE)
- Cloud architecture
- Data analysis and Data Science
- Software development
- IT security
- Product and project management
AIOps in Action: Incident Prediction and Root Cause Automation
14 Hours
AIOps (Artificial Intelligence for IT Operations) is increasingly being utilized to predict and prevent incidents before they occur, as well as to automate root cause analysis (RCA) in order to minimize downtime and speed up resolution.
This instructor-led, live training (available both online and onsite) is designed for advanced-level IT professionals who are interested in implementing predictive analytics, automating remediation processes, and designing intelligent RCA workflows using AIOps tools and machine learning models.
By the end of this training, participants will be able to:
- Develop and train machine learning models to identify patterns that lead to system failures.
- Automate RCA workflows by correlating data from multiple log sources and metrics.
- Integrate alerting and remediation processes into existing platforms.
- Deploy and scale intelligent AIOps pipelines in production environments.
Format of the Course
- Interactive lectures and discussions.
- Numerous exercises and practical activities.
- Hands-on implementation in a live-lab environment.
Course Customization Options
- To request customized training for this course, please contact us.
AIOps Fundamentals: Monitoring, Correlation, and Intelligent Alerting
14 Hours
AIOps (Artificial Intelligence for IT Operations) is a practice that leverages machine learning and analytics to automate and enhance IT operations, particularly in monitoring, incident detection, and response.
This instructor-led, live training (available online or on-site) is designed for intermediate-level IT operations professionals who want to implement AIOps techniques. The goal is to correlate metrics and logs, reduce alert noise, and improve observability through intelligent automation.
By the end of this training, participants will be able to:
- Understand the principles and architecture of AIOps platforms.
- Correlate data from logs, metrics, and traces to identify root causes.
- Minimize alert fatigue through intelligent filtering and noise reduction.
- Use open-source or commercial tools to monitor and automatically respond to incidents.
Format of the Course
- Interactive lectures and discussions.
- Numerous exercises and practical sessions.
- Hands-on implementation in a live-lab environment.
Course Customization Options
- To request customized training for this course, please contact us.
Building an AIOps Pipeline with Open Source Tools
14 Hours
An AIOps pipeline constructed entirely with open-source tools enables teams to develop cost-effective and flexible solutions for observability, anomaly detection, and intelligent alerting in production environments.
This instructor-led, live training (available online or on-site) is designed for advanced-level engineers who want to build and deploy a comprehensive AIOps pipeline using tools such as Prometheus, ELK, Grafana, and custom machine learning models.
By the end of this training, participants will be able to:
- Design an AIOps architecture using only open-source components.
- Collect and normalize data from logs, metrics, and traces.
- Apply machine learning models to detect anomalies and predict incidents.
- Automate alerting and remediation processes using open tooling.
Format of the Course
- Interactive lectures and discussions.
- Numerous exercises and practice sessions.
- Hands-on implementation in a live-lab environment.
Course Customization Options
- To request customized training for this course, please contact us.
Antigravity for Developers: Building Agent-First Applications
21 Hours
Antigravity is a development platform designed to build AI-driven, agent-first applications.
This instructor-led, live training (online or onsite) is aimed at intermediate-level developers who wish to create real-world applications using autonomous AI agents within the Antigravity environment.
After completing this training, participants will be equipped to:
- Develop applications that utilize autonomous and coordinated AI agents.
- Use the Antigravity IDE, editor, terminal, and browser for comprehensive development processes.
- Manage multi-agent workflows using the Agent Manager.
- Integrate agent capabilities into production-grade software systems.
Format of the Course
- A combination of presentations with detailed demonstrations.
- Extensive hands-on practice and guided exercises.
- Real implementation work within the Antigravity live environment.
Course Customization Options
- For content tailored to your specific development stack, please contact us to arrange a customized version of this training.
Getting Started with Antigravity: An Introduction to Agent-First IDEs
14 Hours
Google Antigravity is an advanced development environment that prioritizes the use of agents to automate and simplify engineering workflows.
This instructor-led, live training (available online or on-site) is designed for beginners who want to gain a foundational understanding of Antigravity and learn how agent-driven coding environments can boost productivity.
Upon completing this training, participants will be able to:
- Install and configure Google Antigravity effectively.
- Navigate and comprehend both the Editor View and Manager View with ease.
- Collaborate efficiently with agents to automate routine development tasks.
- Utilize Antigravity to create, refine, and manage project files seamlessly.
Format of the Course
- Instructor-led explanations complemented by real-time demonstrations.
- Guided exercises focusing on practical use of agents.
- Hands-on exploration of core Antigravity features in a controlled lab setting.
Course Customization Options
- For a customized version of this training, please contact us to arrange a tailored program.
Antigravity for Web Automation & Browser-Based Tasks
21 Hours
Google Antigravity is a platform designed for creating agents that can interact with web applications, browser environments, and multi-surface workflows.
This instructor-led, live training (available online or on-site) is targeted at intermediate-level professionals who are interested in building, automating, and testing browser-based workflows using Google Antigravity.
By the end of the training, participants will be able to:
- Develop agents that can engage with web applications within a browser environment.
- Automate comprehensive workflows across various browser contexts.
- Validate and troubleshoot agent behavior in user interface-driven environments.
- Implement cross-surface automation strategies using Google Antigravity.
Format of the Course
- Guided instruction complemented by demonstrations.
- Practical, hands-on activities and scenario-based exercises.
- Implementation of agent workflows in an interactive lab setting.
Course Customization Options
- For customized training needs, please contact us to tailor the course to your specific objectives.
Enterprise AIOps with Splunk, Moogsoft, and Dynatrace
14 Hours
Enterprise AIOps platforms such as Splunk, Moogsoft, and Dynatrace offer robust capabilities for detecting anomalies, correlating alerts, and automating responses in large-scale IT environments.
This instructor-led, live training (available online or on-site) is designed for intermediate-level enterprise IT teams who want to integrate AIOps tools into their existing observability stack and operational workflows.
By the end of this training, participants will be able to:
- Configure and integrate Splunk, Moogsoft, and Dynatrace into a unified AIOps architecture.
- Correlate metrics, logs, and events across distributed systems using AI-driven analysis.
- Automate incident detection, prioritization, and response with both built-in and custom workflows.
- Optimize performance, reduce mean time to resolution (MTTR), and enhance operational efficiency at an enterprise level.
Format of the Course
- Interactive lectures and discussions.
- Numerous exercises and practice sessions.
- Hands-on implementation in a live-lab environment.
Course Customization Options
- To request customized training for this course, please contact us.
Implementing AIOps with Prometheus, Grafana, and ML
14 Hours
Prometheus and Grafana are widely used tools for observability in modern infrastructure. Machine learning enhances these tools by providing predictive and intelligent insights, which help automate operational decisions.
This instructor-led, live training (available online or on-site) is designed for intermediate-level observability professionals who wish to modernize their monitoring systems by integrating AIOps practices using Prometheus, Grafana, and machine learning techniques.
By the end of this training, participants will be able to:
- Configure Prometheus and Grafana for comprehensive observability across various systems and services.
- Collect, store, and visualize high-quality time series data effectively.
- Apply machine learning models to detect anomalies and make forecasts.
- Create intelligent alerting rules based on predictive insights.
Format of the Course
- Interactive lectures and discussions.
- Numerous exercises and practical activities.
- Hands-on implementation in a live-lab environment.
Course Customization Options
- To request customized training for this course, please contact us.
AI Agent Development with Mastra
14 Hours
This instructor-led, live training (online or onsite) is aimed at intermediate-level software developers and engineering teams who wish to build scalable, observable AI systems using Mastra.
By the end of this training, participants will be able to:
- Understand Mastra’s architecture and how it integrates with LLMs and external APIs.
- Design and implement AI agents and workflows using TypeScript.
- Use Mastra’s observability and memory tools to monitor and improve agent performance.
- Deploy production-ready AI applications leveraging Mastra’s framework features.
Mastra Ops & Production Engineering: Deploying and Scaling AI Agents
21 Hours
Mastra is an operational framework designed to streamline the deployment, scaling, and lifecycle management of AI agents in production environments.
This instructor-led, live training (online or onsite) is aimed at intermediate to advanced technical professionals who need to operationalize AI agents reliably and efficiently across production systems.
Upon completing this training, attendees will be equipped to:
- Deploy Mastra-based AI agents into controlled, production-grade environments.
- Scale agents horizontally and vertically using platform-native features.
- Implement observability pipelines to monitor agent behavior and performance.
- Optimize runtime configurations to minimize latency, costs, and operational risks.
Format of the Course
- Interactive lecture and discussion.
- Hands-on exercises focused on real-world deployment scenarios.
- Live-lab implementation using containerized and orchestrated environments.
Course Customization Options
- Customization of topics, hands-on labs, or industry-specific scenarios is available upon request.
Managing Agent Workflows in Google Antigravity: Orchestration, Planning and Artifacts
14 Hours
Google Antigravity is an agent-centric development platform designed to orchestrate, supervise, and coordinate AI-driven coding and automation workflows.
This instructor-led, live training (available online or on-site) is targeted at intermediate-level professionals who want to design, manage, and optimize multi-agent workflows within Google Antigravity.
Upon completing this training, participants will acquire the skills to:
- Set up agent responsibilities and orchestration pipelines using the Manager interface.
- Create and interpret Antigravity artifacts such as task lists, plans, logs, and browser recordings.
- Implement verification strategies to ensure that agent actions are transparent and auditable.
- Optimize multi-agent collaboration for complex development and operational tasks.
Format of the Course
- Guided presentations and practical demonstrations.
- Scenario-based exercises focused on real-world workflow challenges.
- Hands-on experimentation within a live Antigravity workspace.
Course Customization Options
- If you need a customized version of this course, please contact us to discuss your specific requirements.
Testing & Verifying Agent-Driven Code: Quality Assurance in Antigravity
14 Hours
Antigravity is a framework that embodies advanced, agent-driven development processes.
This instructor-led, live training (available online or on-site) is designed for intermediate to advanced professionals who aim to verify, validate, and secure the output generated by AI agents operating within Antigravity environments.
Upon completing this training, participants will be able to:
- Evaluate the precision and safety of code artifacts produced by agents.
- Employ structured methods to verify tasks executed by agents.
- Analyze browser recordings and trace agent activities effectively.
- Apply QA and security principles to ensure the reliability of agent workflows.
Format of the Course
- Instructor-guided technical briefings and discussions.
- Practical exercises focused on verifying actual agent workflows.
- Hands-on testing and validation in a controlled lab setting.
Course Customization Options
- Scenarios, workflows, and testing examples can be adapted upon request.