TensorFuse (YC W24) is now one of the most popular serverless GPU offerings on YC Deals! We're the trusted infra partner for many fast-growing YC companies. Here's what we do for our early users:
1) A private Slack channel for any MLOps-related support.
2) Help with tasks ranging from downloading models from Hugging Face or open-source repos to writing inference code and deploying it on autoscaling infra.
3) Help with containerising your models (if you haven't done it already).
4) Expedited approval of GPU quota increases (we have a dedicated support channel with cloud providers).
If you want to deploy custom ML models on auto-scaling infra without the hassle of managing it, get started with our docs: https://lnkd.in/gS7KAMHN
-
#AWS #machinelearning #MLOps 👉 𝐓𝐫𝐚𝐢𝐧𝐢𝐧𝐠 𝐚 𝐦𝐨𝐝𝐞𝐥 𝐰𝐢𝐭𝐡 𝐀𝐖𝐒 𝐄𝐂𝟐 Since the demo training code doesn't use a GPU, I launched a t2.xlarge (CPU-only) EC2 instance for training. SageMaker is a fully managed service that spares you the hassle of installing GPU drivers, CUDA, Python dependencies, and more; however, managing the resources ourselves can reduce costs.
◾ Image: Amazon Linux 2023 AMI
◾ Instance Type: t2.xlarge (no GPU)
◾ VPC: Default VPC (same as Step 1)
◾ Security Group: launch-wizard-1 (all inbound/outbound traffic allowed)
◾ Role Name: udacity-p4-ec2 (permissions: AmazonElasticMapReduceforEC2Role, SageMaker execution role, and S3 full access)
◾ Dependencies: torch, torchvision, Pillow (includes NumPy), tqdm
◾ demo video - https://lnkd.in/gKKBjcFy
◾ code - https://lnkd.in/gRtjFzgX
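For reference, here's a minimal sketch of the kind of CPU-only PyTorch training loop such an instance can run once the dependencies are installed. The model, dataset, and hyperparameters are illustrative assumptions, not the project's actual code:

```python
# Minimal CPU-only PyTorch training loop, the kind of demo script
# you might run on a t2.xlarge after pip-installing torch/torchvision.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

device = torch.device("cpu")  # t2.xlarge has no GPU

# Illustrative dataset; downloaded on first run.
train_data = datasets.MNIST(
    root="data", train=True, download=True,
    transform=transforms.ToTensor(),
)
loader = DataLoader(train_data, batch_size=64, shuffle=True)

# Tiny illustrative classifier.
model = nn.Sequential(
    nn.Flatten(), nn.Linear(28 * 28, 128), nn.ReLU(), nn.Linear(128, 10)
).to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(2):
    for images, labels in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(images.to(device)), labels.to(device))
        loss.backward()
        optimizer.step()
    print(f"epoch {epoch}: loss={loss.item():.4f}")
```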
-
As a developer, you want to build scalable applications without the hassle of managing infrastructure. That's where Azure Functions comes in: a serverless compute service that lets you run event-driven code effortlessly. #FunctionApp #Serverless #CloudComputing
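A minimal sketch of what that looks like, using the Azure Functions Python v2 programming model; the route name and handler logic here are illustrative, not from the post:

```python
# HTTP-triggered Azure Function (Python v2 programming model).
# Deploy it into a Function App and Azure runs it on demand;
# no servers to provision or manage.
import azure.functions as func

app = func.FunctionApp()

@app.route(route="hello", auth_level=func.AuthLevel.ANONYMOUS)
def hello(req: func.HttpRequest) -> func.HttpResponse:
    # Event-driven: this code only runs when the HTTP trigger fires.
    name = req.params.get("name", "world")
    return func.HttpResponse(f"Hello, {name}!")
```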
-
🔍 S3 Multipart Uploads: What Happens When They Fail at 98% Completion? 💡
Did you know that when a multipart upload to Amazon S3 fails (even at 98% completion), you're still billed for the storage of the parts already uploaded? 🧐 These "hidden" parts won't appear in the console unless you specifically check for them. Here's what you need to know:
Tracking failed uploads: You can't view these parts directly in the S3 console, but by enabling an S3 Storage Lens dashboard and filtering for incomplete multipart uploads, you can get a detailed view of them.
Managing upload IDs: To clean up these hidden parts, you'll need the multipart upload ID. With that, you can either complete the upload or abort it.
Using Boto3: If you're using Python, the boto3 library is your friend. It lets you track and manage multipart uploads, so you can query and delete those orphaned parts before they rack up unnecessary costs (see the sketch below).
Efficient cloud storage management is critical, and knowing how to handle failed uploads can save you money in the long run! 💰🚀 #AWS #S3 #CloudStorage #TechTips #Boto3 #CloudOptimization #MultipartUpload
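A minimal sketch of that cleanup with boto3. The bucket name is a placeholder, and aborting is destructive, so list first and review before deleting:

```python
# List and abort incomplete multipart uploads so their parts stop billing.
import boto3

s3 = boto3.client("s3")
bucket = "my-example-bucket"  # placeholder: use your bucket name

# list_multipart_uploads returns up to 1,000 uploads per call;
# use get_paginator("list_multipart_uploads") for large buckets.
resp = s3.list_multipart_uploads(Bucket=bucket)
for upload in resp.get("Uploads", []):
    print(f"orphaned parts: key={upload['Key']} upload_id={upload['UploadId']}")
    # Aborting deletes all uploaded parts for this upload ID.
    s3.abort_multipart_upload(
        Bucket=bucket, Key=upload["Key"], UploadId=upload["UploadId"]
    )
```

A lifecycle rule can also expire incomplete uploads automatically, but the explicit abort above matches the boto3 approach the post describes.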
-
Last blog post in what henceforth shall be known as 'the in-flight trilogy'! For a recent project I needed to create autoscaler rules for #Azure Container Apps that use KEDA to scale based on messages in a Service Bus queue. There are lots of samples for scaling rules, but almost all use metrics like CPU utilisation. My thanks to my good friend Tom Kerkhove ☁️ for the ARM samples in his GitHub profile. I've written up a worked example in Bicep: https://lnkd.in/eDgt_zEp
-
Serverless is an incredible paradigm, but performance tuning sometimes feels like a black box. Our latest blog post shares 5 actionable optimization strategies from serverless expert @theburningmonk to help you unlock your app’s full potential. https://bit.ly/3YOGpaL #Serverless #CloudPerformance #AWS #Optimization
-
Profile-Guided Optimization (PGO) in Go applications!
Profile-Guided Optimization (PGO), also known as feedback-directed optimization (FDO), was introduced in Go 1.20. With PGO, we can provide the Go compiler with a runtime profile of our application, enabling it to make smarter, more targeted optimizations on the next build.
At Google and Uber, teams have been leveraging PGO to boost compute efficiency and cut costs. Uber's fleet-wide rollout of PGO has already led to a significant reduction in CPU utilization across many services.
Want to harness the power of PGO in your Go projects?
1. Enable profiling: Import the `net/http/pprof` package in your main package.
2. Collect a profile: Run the application in a production-like environment and download a CPU profile.
3. Optimize your build: Feed the collected profile to the next build to activate PGO (see the sketch below).
4. Measure your gains: Compare CPU usage with and without PGO using `go tool pprof`.
As of Go 1.22, benchmarks for a representative set of Go programs show that building with PGO improves performance by around 2-14%.
Also, check out this beautiful blog by Cameron Balahan - https://lnkd.in/gBW8Nkf4 - and the official docs - https://go.dev/doc/pgo
#GoLang #PGO #PerformanceOptimization #SoftwareDevelopment #TechInnovation #GoogleCloud #UberTech
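A minimal sketch of step 1 and what follows; the port and the trivial handler are illustrative assumptions, not a production setup:

```go
// Expose pprof endpoints so a CPU profile can be collected from a running service.
package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers /debug/pprof/* handlers on http.DefaultServeMux
)

func main() {
	http.HandleFunc("/work", func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("ok")) // stand-in for real work worth profiling
	})
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```

Then collect a profile under representative load, e.g. `curl -o default.pgo "http://localhost:8080/debug/pprof/profile?seconds=30"`, and rebuild with `go build -pgo=auto`, which picks up a `default.pgo` file in the main package directory (and has been the default since Go 1.21).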
-
In a nutshell, we predict:
– More Rust in Kubernetes
– Increased experimentation with Serverless
– An uptick in WASM adoption
– Stronger focus on security
– Bridging the gap between IaC and software developers
– AI/LLMs maturing beyond GPUs
– Kubernetes (hopefully) moving towards being more user-friendly
#cloudnative #cloud2025 #cloudprovider #kubernetes
Starting the week with some spicy takes to warm up the cold first month of the year! What do we reckon is going to unfold in the cloud ecosystem in 2025?
🔥 More Rust in Kubernetes
🔥 Increased experimentation with Serverless
🔥 An uptick in WASM adoption
🔥 Stronger focus on security
🔥 Bridging the gap between IaC and software developers
🔥 AI/LLMs maturing beyond GPUs
🔥 Kubernetes (finally?!) becoming more user-friendly
Are you nodding along or screaming at your laptop? Follow the link to the full article on our blog and share your thoughts with us - what do you think 2025 is going to bring for cloud native?
#cloudnative #spicytakes #techpredictions #cloud2025 #cloudpredictions
-
We keep adding features and enhancements to Mountpoint for Amazon S3. The CSI driver is a high-focus area because many customers run compute-intensive workloads in Kubernetes. Since launch, adoption has been nothing short of phenomenal, given the opportunities it opens up for file-centric workloads that need to interact with S3 objects. In STG406 at the upcoming re:Invent, you will see hands-on how we train machine learning models at scale using Mountpoint and EKS. https://lnkd.in/eXMJb26j #AWS #S3 #EKS #K8S #CSI #GENAI
-
AWS Lambda Power Tuning is a very useful tool for finding the best trade-off between execution time and memory size (i.e. execution cost). Here's the link: https://lnkd.in/dA94Pvvx #AWS #Lambda
Your Lambda functions are running 4x slower than they should, and your IaC tools are to blame.

IaC tools such as CDK, SAM and Terraform default to 128MB of memory for Lambda. This is well-intentioned - they want to help you save money. But most of the time it won't, and it creates a big performance problem for many customers. One that's entirely avoidable!

Unless you're using Rust, 128MB is not enough for decent performance, even for simple functions. This under-allocation hits you in two ways:

1) Lambda allocates CPU and network bandwidth in proportion to the memory allocation. At 128MB of memory, you don't have a lot of CPU power to play with.

2) More significantly, even if your function doesn't use the full 128MB, you will likely experience OS paging (data copied back and forth between main memory and disk) as you approach 60-70% memory utilization. This can trash the performance of your code!

As you can see from the chart below, there's a big difference in both cold and warm starts for the same function running at 128MB vs. 1024MB.

Yes, a function running at 1024MB costs more per ms of execution time than a 128MB function, but the practical difference is negligible for 99.9% of functions out there. Take the following example: a function runs 1 million times a month, with an average execution time of 100ms. At 1024MB, it costs $1.87 a month. At 128MB, it costs $0.41. (In practice, the 128MB function will likely take longer to run, so it'll cost somewhat more than that, but the difference is negligible either way.)

The important thing to remember is that it's better to overprovision a little by default so you don't suffer unnecessary performance hits. 1024MB is a good default unless a function runs millions of times a month. For those frequently executed functions, use the Lambda Power Tuning tool to right-size the memory setting (see the cost sketch below). For everything else, stick with the 1024MB default, because the savings won't even cover the engineering time you'd put into power tuning.

To learn more practical tips for building serverless applications for the real world, subscribe to the Master Serverless newsletter: https://lnkd.in/eStnFnfF
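To make the arithmetic concrete, here is a small sketch that reproduces the numbers above. The prices are an assumption (us-east-1 x86 rates: $0.0000166667 per GB-second plus $0.20 per million requests):

```python
# Reproduce the post's Lambda cost comparison at 128MB vs. 1024MB.
PRICE_PER_GB_SECOND = 0.0000166667  # assumed us-east-1 x86 rate
PRICE_PER_REQUEST = 0.20 / 1_000_000  # assumed request charge

def monthly_cost(memory_mb: int, invocations: int, avg_ms: float) -> float:
    """Compute + request charges for one month of invocations."""
    gb_seconds = invocations * (avg_ms / 1000) * (memory_mb / 1024)
    return gb_seconds * PRICE_PER_GB_SECOND + invocations * PRICE_PER_REQUEST

for mb in (128, 1024):
    print(f"{mb}MB: ${monthly_cost(mb, 1_000_000, 100):.2f}")
# Output: 128MB: $0.41, 1024MB: $1.87 -- matching the post's figures.
```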
-
The Karpenter autoscaler for Kubernetes is really serving me well. Moving away from the Cluster Autoscaler to Karpenter almost 18 months ago was one of the best decisions I've made.

I have a mix of node pools in my cluster for various workloads: GPU nodes based on AWS g4dn, p3, and p2 instances, plus amd64 and arm64 nodes.

I recently embarked on a journey to build all my microservices to run on ARM, using docker buildx to produce cross-platform container images (e.g. `docker buildx build --platform linux/amd64,linux/arm64 -t <image> --push .`).

ARM64-based nodes are low-cost, high-performance compute, thanks to AWS Graviton3 ARM-based processors. I have been running most of my workloads on Graviton3 recently - RDS, OpenSearch, in-memory caching with ElastiCache for Redis, and EC2 node pools. Graviton3 offers high performance at low cost for compute-intensive workloads.

Having experimented with different hardware architectures, I can say ARM-based processors genuinely add compute-optimized performance gains to your workloads.

#cloudengineering #cloudcomputing #cloudnative #softwareengineering #softwarearchitecture #awscloud #coding #programming