TraceLink, Inc. Cloud Engineer II in North Reading, Massachusetts
As the leader in pharmaceutical track & trace serialization software, capturing approximately 60 to 70 percent of the global market, TraceLink continues to innovate and develop bold solutions to the complex problems that keep medicines and medical products from the patients and hospitals who need them. In a global market rife with counterfeit drugs, shortages, recalls, and distribution gridlocks, TraceLink’s team has responded by creating the Opus Digital Network Platform. The mission of Opus is to create an ecosystem of customers and partners to build digital networks through our multi-enterprise applications, seamlessly linking companies, people, processes, and systems to create an effortlessly intelligent and agile supply chain that saves time, resources, and effort in an industry where lives depend on it. Our team of talented software professionals continues to bring to life this next stage in TraceLink’s mission of creating a network for the greater good. Are you ready to join?
We're looking for an enthusiastic, driven, and passionate engineering team member with a background in programming, distributed systems, and Kubernetes to help our SRE team improve its Service Mesh and Kubernetes architecture. The SRE group is building and expanding on the critical need to maintain visibility and provide scalability of the TraceLink global platform. Within SRE, you'll have plenty of opportunities to share your strengths, help us build a scalable platform, and collaborate closely with various engineering stakeholders.
You will work on a global team, in an inclusive environment, with AWS cloud-based deployments. You will focus on ensuring services run smoothly, continuously assess opportunities to reduce toil, help improve service availability and reliability, and optimize AWS resource usage across multiple environments to deliver cost-effective services to the engineering organization.
As a member of the SRE core team, ensure the high availability and reliability expected by our customers and deliver on defined OKRs.
Collaborate with engineering and business stakeholders to maintain and refine the backlog of user epics for prioritized opportunities.
Design, build, document, and test new tools and technologies as part of an Agile development team. Maintain and improve these to eliminate bugs, increase performance/efficiency, or extend capabilities.
Participate in code reviews, systems design and architectural sessions to ensure that our platform and supporting services are developed/deployed using best practices.
Test, diagnose, and correct defects as part of a CI/CD pipeline and test automation, always expanding and improving test coverage. Help design and implement self-healing and resiliency patterns.
Play an active role in the development process: deliver on commitments, communicate issues, and work with others both within the team and on other teams.
Offer suggestions on how to improve tools and/or processes, and help define our sprint epics and stories based on business priorities.
Help drive planned releases and any outage triage that impacts the infrastructure, and collaborate closely with CloudOps. Be familiar with blameless postmortems and refine playbooks to reduce MTTR.
2+ years of experience as an SRE, DevOps engineer, or systems engineer
Strong understanding of cloud deployment and management practices
Hands-on experience with Terraform, Helm, Docker, Kubernetes, Prometheus
Hands-on experience with tools and techniques to diagnose and uncover container performance issues
Skilled with AWS services from both technology and cost perspectives
Skilled in DevOps/SRE practices and build/release pipelines
Experience working with mature development practices and tools for source control, security, and deployment
Excellent communication skills, written and verbal
Strong analytical and problem-solving skills
External Company Name: TraceLink, Inc.
External Company URL: http://www.tracelink.com