Lead Site Reliability Engineer Job at Zeektek, Portland, OR

  • Zeektek
  • Portland, OR

Job Description

We have a Sr. Lead SRE role for a candidate who is not only hands-on but will also be involved in strategy, architecture, innovation, and leadership: someone heavily influential in technical direction, willing to challenge and see the blind spots, and not afraid to bring in new ideas.

Qualifications:

  • Bachelor’s degree in Computer Science or equivalent years of experience
  • Datadog, Java, Python, AWS (EC2, CloudFormation, RDS, VPC, Lambda, S3, ECS, IAM), Docker, MySQL, Neo4j, REST API
  • Minimum 5 years' experience working with Java
  • Expert-level proficiency with 5+ years of experience in AWS components like EC2, CloudFormation, RDS/Aurora, IAM Roles, etc.
  • Expert-level proficiency with 5+ years of experience in operating high-availability, fault-tolerant, scalable, distributed software in production: building monitoring into your code, tweaking dashboards, defining alerts, writing runbooks, etc.
  • 4+ years of systems engineering/administration and/or support experience with web applications, especially J2EE technologies and technologies like Docker, Tomcat, Nginx
  • Understanding of network fundamentals (TCP/IP, VPN, DNS, SMTP, etc.)
  • Experience with programmatic manipulation of cloud infrastructure such as AWS
  • Scripting and automation skills using common scripting languages like Python and Bash
  • Experience with network and web application monitoring tools; Datadog is preferred
  • Experience with DBMS (e.g. MySQL, MS SQL, Postgres, RDS), as well as graph databases (Neo4j, ArangoDB)
  • Experience with REST

The company is focused on innovation success in multidisciplinary engineering organizations. Numerous firsts for humanity in fields such as fuel cells, electrification, space, software-defined vehicles, surgical robotics, and more all rely on requirements management software to minimize the risk of defects, rework, cost overruns, and recalls. This allows engineering organizations to intelligently manage the development process by leveraging their tools to measurably improve outcomes.

We need this candidate to bring a deep understanding of modern cloud infrastructure, programming expertise, operational experience, and a desire to change the status quo. We're looking for an engineer who can analyze and help improve our services and processes to get us to an even higher level of reliability, performance, scalability, and cost efficiency. Success will come through crossing team and functional boundaries to advocate for reliability methodologies. The candidate will work with a variety of platform and product teams to both build reliability into our platform and drive adoption of those practices in our products.

Responsibilities:

  • Architect, build, and maintain highly available, fault-tolerant systems using AWS and other services
  • Use Terraform to define infrastructure as code, enabling scalable, repeatable, and secure deployments
  • Continuously review and make recommendations for the design, maintenance, development, and implementation, including deployment and support, of our SaaS production platform solution using Docker and other modern web technologies
  • Set up and enforce guardrails for databases, infrastructure, and applications, ensuring consistency and adherence to best practices
  • Support operationally critical environments using monitoring tools, scripts, and logging
  • Document designs and implementations
  • Design and manage secure networking solutions, including AWS VPCs and firewalls
  • Partner with SRE and Engineering teams to embed reliability and security best practices into the application life-cycle
  • Collaborate with fellow Engineers, Product Managers, and Quality Assurance Engineers to develop and deliver services that meet or exceed enterprise customer reliability and quality expectations
  • Participate effectively in pair/mob programming and code reviews, both giving and receiving feedback
