David “Dave” Vigil (he/him)
Experienced DevOps Engineer with a demonstrated history of pioneering solutions in IaC (Infrastructure as Code), Kubernetes, and Docker deployments across diverse infrastructures, including sensitive environments in the private and government sectors. Also experienced in Artificial Intelligence Operations (AIOps) and Machine Learning Operations (MLOps), having worked on projects that integrated AI/ML models with CI/CD pipelines and infrastructure management. Adept at optimizing CI/CD pipelines to automate application deployments and operations on Kubernetes, AWS (GovCloud), and GCP. Known for a passion for exploring innovative approaches to enhance efficiency and reliability in complex systems.
Experience
Web AI / Oct 2024 - Present
Lead Software Engineer
- Led and contributed to the architecture of a complex product, overseeing the creation and management of the Architectural Decision Records (ADR) process to ensure robust documentation and informed decision-making.
- Designed, deployed, and maintained Kubernetes infrastructure on Azure, implementing best practices for scalability, security, and high availability to support mission-critical applications.
- Partnered with customers to architect and implement tailored solutions for AI/ML training and inference pipelines, leveraging MLOps principles to optimize performance, scalability, and resource utilization.
ServiceNow / Feb 2023 - Oct 2024
Systems Engineer
- Created a CI pipeline in GitLab to automate OS regression tests, enabling regular updates to all company hardware SKUs. Turned a 30-day manual process into a 3-day tested process requiring minimal human interaction.
- Created and maintained various Puppet modules to better manage the many facets of both the Dev and Prod environments. Helped other teams solve issues in an automated and repeatable manner.
SUSE Rancher Government Solutions / Feb 2021 - Feb 2023
Senior Solutions Architect Feb 2022 - Feb 2023
- Provided Rancher, Kubernetes, and other CNCF and open source software and consulting solutions to federal entities looking to adopt DevSecOps, GitOps, and other modern workflows.
Senior Support Engineer Feb 2021 - Feb 2022
- Assisted US federal government customers with technical issues, troubleshooting and resolving problems in container orchestration, Rancher MCM, and the general Kubernetes space. Wrote Ansible code to provide customers with a configuration management solution to easily build RKE2 clusters (https://github.com/rancherfederal/rke2-ansible). Answered customer-initiated tickets and questions in private customer Slack channels. Provided support for all SUSE Rancher products, including the Rancher Multi-cluster manager, Longhorn, RKE, RKE2 (RKE Government), and K3s.
Sandia National Laboratories / Sept 2019 - Feb 2021
Senior Solutions Architect
- Lead Architect for expanding configuration management (Ansible) throughout the Labs. Built CI pipelines for all Ansible code to allow other departments to contribute to code. The CI pipeline used GitLab runners hosted in multiple Kubernetes clusters that I built with Rancher. Wrote a custom Molecule driver which used Terraform to build VMs in on-prem Azure Stack and on-prem OpenStack clusters to run tests. Worked with CyberSecurity and Common Operating Environment teams to establish a code base. Built Python Flask applications to automate the addition of Linux hosts running node_exporter to Prometheus/Grafana monitoring via GitLab and Helm. Wrote Terraform templates to help multiple teams adopt Infrastructure as Code.
Rackspace / June 2018 - Sept 2019
Senior DevOps Engineer
- Led a DevOps team that migrated a monolithic codebase to microservices deployed on multiple clusters on Google Kubernetes Engine. The fully automated CI/CD pipeline was initiated by a developer commit and ended in full deployments on the clusters. Managed the development of Helm charts that maintained the infrastructure-as-code deployments for microservices to Kubernetes. Developed and managed processes for day-two maintenance of the existing infrastructure, consisting of over 500 Ubuntu servers across five data centers distributed around the world.
Science Applications International Corporation / May 2015 - May 2018
Applications Developer / DevOps Team Lead
- Coded, deployed, and maintained internal enterprise Ruby on Rails applications. Migrated applications from a VM infrastructure into Docker containers. Created a fully automated CI/CD infrastructure for multiple projects using GitLab CI. Built a full CI/CD pipeline using Puppet. Automated everything from git commit to deployment on a brownfield farm using GitLab CI and Jenkins. Spearheaded legacy environment automation by scripting (Bash/Ruby/Python). Built and maintained OpenStack environments deployed via both VMware Integrated OpenStack (VIO) and Mirantis. Provisioned, automated, and maintained 800+ Linux servers and workstations. Built Hadoop Data Platform environments in Docker containers and OpenStack instances. Managed multiple Kubernetes clusters with GPU support for Deep Learning capability (NVIDIA DGX and IBM PowerAI). As SAIC Team Lead, kept a team of 15 on task, tracked performance, and oversaw annual reviews.
ZeroLag Communications / May 2012 - May 2015
Linux System Administrator / Security Administrator
- Automated tasks to detect and assist compromised customer servers and remedied security vulnerabilities. Resolved requests and issues escalated by L1 Admins and mentored the support staff. Assisted customers via ticket system/phone/chat and led the Quality Assurance group for client response. Monitored servers and oversaw Security and Compliance for internal machines.
HostGator.com LLC / Sept 2009 - May 2012
Linux System Administrator / Migrations Quality Assurance
- Used knowledge of the LAMP stack and Windows/IIS web servers to migrate customer websites from competing web hosting companies to HostGator servers with zero downtime. Inspected migrated content for malware. Developed scripts (Bash/Python/Perl) to automate repetitive tasks and correct common configuration issues.
Skills
Languages: Python, Bash, Ruby (Rails)
Linux Systems: OpenSUSE, RHEL, CentOS, Ubuntu, Debian
Configuration Management: Puppet, Chef, Ansible
Cloud: IaaS, IaC, GCP, AWS (GovCloud), Kubernetes (kubeadm, RKE2, K3s), OpenStack, Terraform, OpenTofu, Packer
Automation/Build tools: Jenkins, Docker, Puppet, Artifactory, Vagrant, GitLab CI
Open Source Contributions
Kubernetes, Docker, HashiCorp Vagrant, RKE2, Spinnaker