I was once in a small debate with one of the SRE managers at MediaMath about how to do an application gut check on deploy. I liked making a "sanity" endpoint that caused the app to run a dependency check and ack the results. He preferred sticking to vanilla health checks and monitoring. Naturally I went with what he said, since he was more experienced and had lived in that world longer than I had. However, on more than one occasion, pinging each node in a cluster and finding the one that acked "err: no db" saved my bacon in a clutch. Now that we live in a world of pods of transient nodes built on managed clusters, I think the "gut check" endpoint is much less valuable. Would I first hit up kubectl to see what was running, then hit that endpoint only to find out that the problematic server had already been vacuumed away? I'd do better to spend my efforts building in the right kind of monitoring and telemetry so that the magic could happen without me.
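For anyone who has not built one, here is a minimal sketch of what a "sanity" endpoint of this kind might look like, using only the Python standard library. Everything named here is an assumption for illustration: the `/sanity` path, the `check_db` and `check_cache` stubs, and the 503-on-failure convention are hypothetical, not what actually ran at MediaMath.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Callable, Dict


def check_db() -> None:
    """Hypothetical probe; a real one might run 'SELECT 1' against the database."""
    raise ConnectionError("no db")  # simulate the failure case from the post


def check_cache() -> None:
    """Hypothetical probe that succeeds (returns without raising)."""
    return None


# Assumed registry of dependency probes: name -> callable that raises on failure.
CHECKS: Dict[str, Callable[[], None]] = {"db": check_db, "cache": check_cache}


def run_sanity_checks(checks: Dict[str, Callable[[], None]]) -> Dict[str, str]:
    """Run every probe and ack 'ok' or 'err: <reason>' per dependency."""
    results: Dict[str, str] = {}
    for name, probe in checks.items():
        try:
            probe()
            results[name] = "ok"
        except Exception as exc:  # a failing probe must not take the endpoint down
            results[name] = f"err: {exc}"
    return results


class SanityHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        if self.path != "/sanity":
            self.send_error(404)
            return
        results = run_sanity_checks(CHECKS)
        healthy = all(status == "ok" for status in results.values())
        body = json.dumps(results).encode("utf-8")
        # 503 lets a script polling each node spot the bad one quickly.
        self.send_response(200 if healthy else 503)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8080) -> None:
    """Start the server; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), SanityHandler).serve_forever()
```

Calling `serve()` and then requesting `/sanity` on each node would return a small JSON ack per dependency. In a managed cluster, the same probe functions could feed metrics or readiness checks instead, which is closer to the monitoring-first approach argued for above.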
Owein Reese’s Post
More Relevant Posts
-
Owning a mission-critical platform that processes half a million users daily for 80 independent customer-facing products in the digital ecosystem can be challenging. It was a casual Friday, mid-sprint. The week's hard work was behind me, and I headed to the local pub with a clear conscience to meet friends. We chatted and shared stories, with nothing hinting at work-related issues, especially after a few beers. Suddenly, I received a panicked call from an internal stakeholder, a product manager for a popular digital entertainment service. She was yelling that my platform had crashed, leaving all her users without access during prime time. The evening quickly devolved into a series of desperate attempts to connect remotely to the virtual PC from my phone (battling a glitchy RDP) to access platform logs while trying to reach my DevOps lead. It all culminated in a rushed dash home to oversee a hotfix deployment. So much fun for a Friday night, and I can only imagine the frustration of the users as well. So, may your Fridays be different: genuinely entertaining and incident-free. Here's wishing everyone a splendid weekend! #TGIF #productlife #memories
-
Is it smarter to adopt Backstage or buy a proprietary platform? There's a vibrant discussion among platform engineers about the true costs and benefits of building internal developer platforms (IDPs), especially with tools like Backstage. On Reddit, some have expressed concerns: “We stood it up, but I decided the operational effort of maintaining it was too much.” This raises a crucial question for organizations everywhere: Is the future about building customized solutions internally or purchasing ready-made platforms that can deliver immediate value? I'd love to hear your thoughts or experiences with different IDP approaches. Are you team build or team buy? Why? #IDP #backstage #platformengineering https://lnkd.in/dbvfeFgr
-
#CaseStudy #VIDEOSTREAMING - South Africa’s leading #entertainment company @MultiChoiceGRP is modernizing its #videoondemand platform with LSD Open. Its new environment uses #RedHat #OpenShiftVirtualization and Red Hat #Ansible to speed up development, get cloud-ready, and automate patch management. #DevOps
-
How to Implement Server-Sent Events in Go: Discover how to implement Server-Sent Events (SSE) in Go for real-time data streaming from your server to the client. Perfect for live notifications and more! 🔥 #Golang #ServerSentEvents #BackendDevelopment #DevOps 🔗 https://packagemain.tech/
-
Today marks my Day 30 since I joined the Bubble team. It’s a big onboarding milestone and it usually means the person is getting comfortable enough to start making an impact. Plot twist: it didn’t go that way 1️⃣ We had 3 outages that caused platform downtime. My second-week introduction to our Community Forum was reactive messaging to a community that’s highly invested in our platform and holds us to a very high standard (we do too, btw. Things happen). Pretty heated to say the least. And before my intro call to engineering 🤣 2️⃣ I’ve also been working on a net-new initiative for the community team. It’s cross-functional (a lot of stakeholders) with a lot of calls and approvals. There was a lot of “we haven’t met yet, nice to meet you” while trying to drive an agenda. 3️⃣ Forum “wrangling”: every community professional knows it takes some time to know who’s who in a community, get a feel for the vibe, learn historical context, and know enough about the product to be effective. I’ve also been doing that while working on a forums revamp initiative to improve pain points. 4️⃣ Bubble also has a unique onboarding where new joiners are expected to learn and build their first Bubble app in their first week (I did a basic quiz app) and do a rotation with the success team in week 2 (solved 5 tickets, literally didn’t have time for more with everything else) 5️⃣ I’m also now playing support for a live event in June, helping PMs with comms in the forums, acting as the point person in engineering for outages that need reactive messaging, and sharing user feedback internally every day to make it more of a part of our roadmaps. This may sound like a lot for my first 4 weeks, but honestly I’m lovin’ it. I didn’t join the team because I was looking for a place to get cozy. It’s all about learning. About being uncomfortable enough to grow. And this ride is an absolute thrill! 🚀 Closing this with a shoutout to my new director of community Jayvee Nava Brady.
If you think my days are nuts, you should see hers. Mega champ. We are going to build some cool stuff together 🤝 Back to work.
-
Join us on buff.ly/3ZXK9pc as we dive into setting up Consul configurations for seamless client integration through Vault. Whether you’re a DevOps pro or just getting started, this is a session you don’t want to miss! 🔐💻 #DevOps #Consul #Vault #TheDevCastOps #LiveCoding #TechTalk
thedevcastops - Twitch
twitch.tv
-
That's right. The never-ending list of questions that ops teams get asked frequently :) Shout out to a good friend of mine for suggesting this one. It was an interesting exercise to distill the laundry list of q's into a short list of 20, but I think we got it. First one drops next week. #AdOps #MediaOperations #WhatDoesOpsDo #AskAdOps
Month 1 of 2025 is almost done. And it flew by. For the #AdOps/#MediaOps folks in the room, month-end falls gloriously on a Friday (instead of the weekend). So, with a couple of days left, don't forget to check that campaign pacing and performance today. Next week, Chris Quinn is kicking off a series of quick answers/comments/rants from a list of Top 20 Questions that the ops team routinely gets asked. Stay tuned. #ProOpsConsulting #AskAdOps #MediaOperations
-
Thanks for tuning in! We successfully set up Consul configurations to allow our clients to join through Vault. If you missed it, don’t worry—the replay will be available on buff.ly/3ZXK9pc. Catch up on the latest DevOps insights and stay tuned for our next session! 🌟 #DevOps #Consul #Vault #TheDevCastOps #TechCommunity #StreamRecap
-
We have all been there :)

Friday deployment:
You learn the hard way not to deploy on Friday,
When bugs creep in and wreak dismay.
Your weekend plans are at stake
Your head is down debugging fast
Your PR approvers will not last.
The bug is found you sigh in relief
all fixed is now, and it's time to brief.
A lesson learned for the next release.
Deploy with care and ensure your lead.
For in the world of code and screen,
It's wisdom born from moments keen.
So plan ahead, let caution sway,
do not deploy on this day.

Happy Friday. #FridayDeployemnt #DevLife