Job Description
Global
Remote
Full-Time Job
Senior Infrastructure Engineer
The Position
Want to make a big impact at a fast-moving startup? Omnesoft is
on a mission to revolutionize ERP. We're building a cutting-edge,
intuitive, and scalable solution, and we need a Senior
Infrastructure Engineer. If you're a proactive problem-solver
who thrives in a small, high-impact, fully remote team and you're
equally adept in cloud and bare-metal environments, we want to
talk to you.
How You'll Work
You'll work closely with our CTO to maintain Omnesoft's full
infrastructure footprint — our production cloud environment,
virtual server environments, SaaS products and solutions,
company laptops, and accounts across the various platforms we
rely on. Standard daily hours should reasonably overlap with
U.S. Eastern Time, with occasional off-hours support expected
for outages, performance degradation, and upgrades or patching.
You'll also work directly with our engineering team to support
their builds and deployments through our CI/CD process.
Key Responsibilities
- Design and manage robust, cost-efficient data architectures:
Own data structures, storage, performance, and migration
strategies to ensure scalability and speed.
- Administer and optimize our multi-vendor cloud infrastructure
and SaaS solutions: Proactively manage and secure all cloud
resources and critical SaaS tools to ensure maximum
availability and performance with minimal overhead.
- Architect and implement highly available on-premises
solutions: Design and deploy efficient server solutions for
specific storage and processing needs, prioritizing automation
and reliability.
- Design, implement, and automate robust Backup, Recovery, and
Disaster Recovery (BDR) strategies: Ensure the integrity and
availability of all critical data and systems through
comprehensive BDR planning, testing, and automation,
minimizing downtime and data loss.
- Drive automation across all infrastructure operations:
Identify and automate repetitive, error-prone tasks, including
deployments, system maintenance, and reporting, to accelerate
delivery, enhance efficiency, and reduce manual effort.
- Collaborate closely with development teams: Ensure seamless
integration between application functionality and backend
infrastructure, providing guidance on performance optimization
and efficient resource utilization.
- Ensure continuous system availability, performance, and
security: Implement timely system updates, patching, and
security measures, including participation in off-hours
support as needed to maintain 24/7 operational excellence.
- Manage internal IT infrastructure: Oversee company laptops,
virtual machines, and related tooling to keep employees
productive.
- Own and improve CI/CD pipelines: Continuously enhance our
deployment processes to enable faster, more reliable software
releases.
- Provide technical leadership and mentorship: Guide other team
members in infrastructure and security best practices, and
stay current with emerging technologies to drive innovation
and efficiency.
Required Skills
- Expert-level proficiency with Azure services (e.g., Container
Apps, Key Vault, Storage, Managed Identity, Networking).
Demonstrated ability to design, deploy, and optimize cloud
infrastructure for performance and cost-efficiency.
- Deep expertise in Linux systems administration, including
advanced networking, file systems, security hardening, and
performance optimization.
- Production-grade experience with Docker and container
orchestration (e.g., Docker Compose, Kubernetes fundamentals).
Proven ability to manage and troubleshoot containerized
applications in high-availability environments.
- Strong automation and scripting skills. Proficient with
GitHub Actions and general-purpose scripting languages (Bash,
PowerShell, Python) for infrastructure as code, CI/CD, and
operational automation.
- Solid PostgreSQL administration and performance tuning
experience, including high availability, backup/recovery
strategies, and query optimization.
- Expertise in web infrastructure components, including reverse
proxies (e.g., Nginx, Envoy), SSL certificate automation, and
implementing zero-downtime deployment strategies.
- Experience with performance testing methodologies and tools
(e.g., k6, JMeter, Artillery).
Additional Preferred Skills
- Experience with on-premises virtualization and container
platforms (e.g., Proxmox VE clusters, container/VM
management).
- Familiarity with .NET ecosystem technologies (e.g., .NET
Aspire, C#).
- Familiarity with JavaScript ecosystem technologies (React,
TypeScript, etc.).
- Understanding of event-driven architectures and streaming
platforms (e.g., Kafka, data streaming, event-sourced
systems).
- Background in robust monitoring, logging, and alerting
solutions (e.g., Prometheus, Grafana, Loki).
- Prior experience supporting QA automation or test
infrastructure, working closely with development teams to
build and maintain efficient testing environments.
Why Join Us?
- Be a core part of a fast-growing startup shaping the future
of ERP systems.
- Work remotely with flexible hours that support a healthy
work-life balance.
- Competitive compensation and growth opportunities in a
rapidly expanding company.
- Collaborate with a passionate, creative, and innovative
team.