
About the job
The Apache Software Foundation (ASF) is seeking an experienced Infrastructure Systems Administrator who will contribute to the stability, security, and growth of its globally-distributed infrastructure. The successful candidate will demonstrate strong system administration skills in diverse environments, proven experience with open source tools and methodologies, and a deep understanding of F/OSS project dynamics. This role is 100% remote, but availability for once-yearly travel to the United States for a multi-day Team Meetup is encouraged.
The candidate is based in Europe.
The position involves working in a self-directed manner as part of a small, highly-distributed Infrastructure team, known as 'ASF Infra'. You will collaborate with a knowledgeable and highly-engaged community of open source contributors and ensure the smooth operation of services powering hundreds of ASF projects worldwide.
Key Responsibilities
Infra works together to handle these tasks, making sure that knowledge is not siloed with any one team member.
System Administration
- Administer Ubuntu-based Linux/Unix systems in production environments (installation, upgrades, patches, security, performance tuning, etc.)
- Monitor, maintain, and troubleshoot critical infrastructure services (DNS, email, backups, mailing lists, CI/CD, etc.)
- Take responsibility at regular intervals for evaluating and managing responses to alerts and urgent messages ("Infra On Call").
Backups & Recovery
- Maintain production-scale backup and restore tools and processes, including JIT data restoration and Disaster Recovery, using tools such as Kopia and BackupPC.
- Develop monitoring and compliance frameworks to validate backups.
Automation & Configuration Management
- Maintain and expand existing Puppet-based Infrastructure-As-Code, and build out an Ansible footprint with a goal of converting from Puppet to Ansible where feasible.
- Develop and maintain Puppet-based modules and Hiera-based YAML orchestration for a large Puppet deployment.
- Develop and maintain Ansible playbooks for system configuration and orchestration.
Internal Tooling and Self-Developed Services
- Manage and maintain commercial on-prem services such as Atlassian Jira and Confluence.
- Manage and maintain self-developed services based primarily on Python/gunicorn.
Performance Monitoring and Alerting
- Manage, maintain, and expand the existing OpenSearch logging and analytics platform.
- Manage, maintain, and expand the existing Datadog deployment.
Software Development & Scripting
- Maintain Python code for automation and internal tooling.
- Manage Java webapp servers and internal tooling related to Apache Maven, Sonatype Nexus, JFrog Artifactory and their integration with Jenkins-based CI.
Version Control & Collaboration
- Administer, maintain, and optimize Git, GitHub, and Subversion services, and maintain related internally developed automation tools.
Security & Compliance
- Implement and maintain security best practices across systems and services.
- Conduct regular audits and respond to incidents in a timely manner.
Community Engagement
- Interact with ASF project communities, addressing support tickets and queries promptly.
- Maintain and support mailing lists or other community discussion platforms.
- Work to improve and extend community-based self-service tooling.
Collaboration & Communication
- Communicate effectively in written and spoken English; must write clear, concise documentation.
- Work closely with ASF communities, provide guidance on best practices, and apply effective customer support skills and patience in community settings.
Qualifications
- Professional Sysadmin Experience
- Required: Minimum 3-5 years of experience with Linux/Unix system administration.
- Required: Solid background in writing and using Python tooling related to Systems Administration.
- Nice to have: Proven success working remotely with minimal supervision.
- Nice to have: Professional experience managing Windows servers.
- Automation & CI/CD
- Required: Experience with configuration automation/"Infrastructure-as-Code".
- Required: Experience configuring and maintaining continuous integration systems.
- Nice to have: Direct experience with Jenkins, Buildbot, GitHub Actions.
- Nice to have: General familiarity with CI/CD pipelines and container-based deployments of Infrastructure-as-a-Service (Docker, LXD, Kubernetes, etc.).
- Nice to have: Exposure to a variety of programming languages and paradigms (e.g., Java webapps, Python gunicorn apps).
- Other Relevant Experience (Nice to have)
- Experience with Vault or other secrets management systems.
- Experience with environmental isolation/containerization for services using Python venv, Docker, etc.
- Experience with Elastic/OpenSearch ("ELK") stack or similar.
- Experience with performance monitoring/alerting and metric management (PagerDuty/Datadog/Grafana/Prometheus/etc.).
- Bonus Points
- Java Services
- Experience with writing, maintaining, upgrading, and troubleshooting Java-based applications.
- Cloud
- AWS, Azure, or other production cloud infrastructure experience.
- Mailing List Management
- Familiarity with ezmlm or other email list administration and/or community engagement tools such as Discourse.
- Community Development
- Insight into consensus-building, meritocracy, and open communication related to the collaborative "Apache Way" (https://apache.org/theapacheway/).
- Technical Leadership
- Team lead, architecture, project management, or other relevant experience.
Why Work with the ASF Infrastructure Team?
- Work within a globally recognized open source foundation.
- Engage with an active developer and user community spanning hundreds of projects.
- Enjoy the flexibility of a fully remote position.
- Contribute to open source projects and shape best practices that impact millions of users.