
Cutting your AWS EC2 Spend by up to 60% with Cristian Magherusan-Stanciu from AutoSpotting

This interview is part of the Simplyblock Cloud Commute Podcast, available on YouTube, Spotify, iTunes/Apple Podcasts, Pandora, Samsung Podcasts, and our show site.

In this installment, we're talking to Cristian Magherusan-Stanciu from AutoSpotting, a company that helps customers cost-optimize their AWS EC2 spend by automatically moving suitable workloads to spot instances. Cristian talks about how spot instances work, how you can use them to save up to 60% of your EC2 cost, and how tools like ChatGPT, Copilot, and AI Assistant help you write (better) code.

Chris Engelbert: I'm really happy to have you all back and I'm really happy to have my guest for today. Cris, I'm really sorry, I won't try to pronounce your last name. I can only fail. I mean, I guess, location-wise, we're not too far off from each other, but I will not try that. Maybe just introduce yourself. Who are you? Where are you from? What do you do?

Cristian Magherusan-Stanciu: Yeah. My name is Cristian. I'm based in Berlin, but I'm originally from Romania. I came to Berlin more than 12 years ago. Enough to be local, but not quite, especially with my name. It's hard for everybody to pronounce it, but don't worry. As for what I do, I have a lot of background in this IT space. I've been working with AWS for many years. Eventually, I entered this area of cost optimization and have built some new stuff in that space over the last eight years now. It's called AutoSpotting. It's a tool for cost optimization in AWS. For a while, I actually worked at AWS itself and then quit AWS one and a half years ago to work full-time on this. I'm also helping customers with cost optimization as a service. It's just myself, so I'm like a solopreneur. I have an offering that revolves around optimizing costs for startups and medium-sized companies. I try to focus on people who are, let's say, more agile, more my state of mind as well. I build tools as I go and offer the entire thing as a service.

Chris Engelbert: Right. You mentioned AutoSpotting, and I think that is your company and also the name of the tool. Maybe you can elaborate a little bit. You mentioned it's in the cost optimization space, so how does that tool work?

Cristian Magherusan-Stanciu: Yeah. It's meant to orchestrate spot instances in AWS. I'm not sure if your audience is familiar with this topic, but spot is just unused capacity in the cloud. It's not very far from the idea of going on a last-minute trip somewhere. Imagine that the hotel has fixed capacity. Basically, they can take a number of people, but they're not always at full capacity. What they do is give this free capacity to other people at a lower price. They get it at a discount. That's also what I do in the cloud. With spot instances, you actually get this unused capacity at a discount. But unlike the hotel, where you would just be told there's no more free capacity once it gets full, the cloud providers try to maintain this illusion of unlimited capacity. From the cloud providers' perspective, it's a big problem if a customer fails to provision some of the capacity they need. What they do is always try to have some spare capacity, keeping it in a way that allows anybody who wants to spin up something to be able to do it. That gets them in a situation where they need to have spare capacity at all times. And what also happens is that you have a lot of options to choose from. So if you get back to the analogy of the hotel space, it's like having all the hotels in a big city in a single chain. And let's say there's an event and you want to go there. If one of them is full, you can always be redirected to another one. So that's kind of how it works with spot instances: when a certain capacity pool gets full, you can go and get capacity from a different one, as if you would go to a different hotel in case the first one is full. So you can still get this illusion of unlimited capacity, but by distributing over multiple capacity pools.
And the tool that I have is actually meant to orchestrate this. It started back in the days when the spot service was still in its beginnings. But basically it orchestrates this diversification across multiple capacity pools in a way that's reliable, so that you always get that capacity. Now, what happens in the cloud, and doesn't happen in the real world, is that the capacity can be reclaimed if somebody else needs it. So imagine you would go on this trip and you would be told by the hotel that somebody wants to pay the full price for your room. And then the next day you would have to vacate your room and go somewhere else.

Chris Engelbert: Right, right.

Cristian Magherusan-Stanciu: Yeah, it's not something that can happen in the real world, but in the cloud, they do this. So they give that capacity to somebody else who pays the full price. And then you're outside with your luggage. But my tooling would actually find replacement spare capacity for you. And if nothing is available across all the options, then it would fall back to a full-price capacity pool. So in the worst case, you still get an expensive room. But yeah, you get somewhere to spend your day.
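
The fallback described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not AutoSpotting's actual code: the `CapacityPool` type, the `pick_pool` function, and all prices and availability flags are hypothetical. It prefers the cheapest spot pool that still has room and falls back to a full-price pool only when every spot pool is full.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CapacityPool:
    """One (instance type, availability zone) pool; all values are made up."""
    instance_type: str
    zone: str
    hourly_price: float
    is_spot: bool
    has_capacity: bool


def pick_pool(pools: list[CapacityPool]) -> Optional[CapacityPool]:
    """Prefer the cheapest spot pool with capacity, else the cheapest full-price pool."""
    available = [p for p in pools if p.has_capacity]
    spot = [p for p in available if p.is_spot]
    # Only consider full-price pools when no spot pool has room.
    candidates = spot or [p for p in available if not p.is_spot]
    return min(candidates, key=lambda p: p.hourly_price, default=None)


pools = [
    CapacityPool("m5.large", "eu-central-1a", 0.040, True, False),   # spot pool, full
    CapacityPool("m5a.large", "eu-central-1b", 0.045, True, True),   # spot pool, has room
    CapacityPool("m5.large", "eu-central-1a", 0.115, False, True),   # full-price fallback
]
choice = pick_pool(pools)
print(choice.instance_type, choice.zone, "spot" if choice.is_spot else "full price")
```

In the hotel analogy, the first hotel is full, so the guest is redirected to the second one; the expensive room is only booked when every discounted one is taken.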

Chris Engelbert: So that means, if I understood you correctly, in your tool, I say I need a capacity of, I don't know, like 50 CPUs and that amount of RAM over that many machines. And you would try really hard to find spot instances. And if for some reason AWS says, I need like 10 of those spot instances back right now, because somebody is paying the full price or something, you would just spin up other instances somewhere else, so we still get our spot instances.

Cristian Magherusan-Stanciu: Yeah, that's kind of how it works. It's just that it connects to your existing capacity groups. You know, in AWS this is called auto scaling. So you have these groups of typically identical instances. You install AutoSpotting in the account and it can look at all your groups. As long as a group has a certain tag attached to it, the group will be considered for this action. So it happens entirely without you having to configure anything except for the tag that tells it to look at the group. It will go to the group and replace the instances one by one with spot clone instances. So basically, if it just runs over a few hours, the instances within the groups will be replaced one by one with the cheaper instances.
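
As a rough illustration of this opt-in-by-tag idea, the following Python sketch filters a list of groups down to those carrying a marker tag. The tag key and value (`spot-enabled` / `true`) and the dictionary layout are assumptions made for the example; the interview only says "a certain tag".

```python
# Hypothetical opt-in tag; the interview does not name the real key or value.
OPT_IN_TAG = ("spot-enabled", "true")


def opted_in_groups(groups: list[dict]) -> list[str]:
    """Return the names of the groups that carry the opt-in tag."""
    key, value = OPT_IN_TAG
    return [g["name"] for g in groups if g.get("tags", {}).get(key) == value]


groups = [
    {"name": "web", "tags": {"spot-enabled": "true", "team": "shop"}},
    {"name": "legacy-db", "tags": {"team": "core"}},          # no tag, left alone
    {"name": "workers", "tags": {"spot-enabled": "true"}},
]
print(opted_in_groups(groups))
```

Groups without the tag are simply never looked at, which is what makes the approach configuration-free apart from the tag itself.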

Chris Engelbert: So how do I have to think about that? If AWS says I need those instances, do they disappear in like seconds?

Cristian Magherusan-Stanciu: Yeah. So when the capacity is claimed back, there is an event coming. And basically when that event comes, you have two minutes to vacate your room.

Chris Engelbert: Oh, wow.

Cristian Magherusan-Stanciu: And what I do in those two minutes is find you a different one and start a new instance with new capacity. And then the old one will be terminated after those two minutes pass. You will be evicted from it.
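
A minimal sketch of reacting to that two-minute warning, assuming the notice arrives as an EventBridge event with the detail type `EC2 Spot Instance Interruption Warning` (the shape AWS uses for these notices). The `handle_event` function and the `launch_replacement` callback are hypothetical stand-ins; how AutoSpotting itself handles the event is not described in the interview.

```python
from typing import Callable, Optional

INTERRUPTION_DETAIL_TYPE = "EC2 Spot Instance Interruption Warning"


def handle_event(event: dict, launch_replacement: Callable[[str], str]) -> Optional[str]:
    """Start a replacement as soon as an interruption warning arrives.

    Returns the replacement's id, or None if the event is something else.
    """
    if event.get("detail-type") != INTERRUPTION_DETAIL_TYPE:
        return None
    doomed_instance = event["detail"]["instance-id"]
    # The old instance keeps running for about two minutes, so the
    # replacement is requested immediately rather than after termination.
    return launch_replacement(doomed_instance)


def fake_launcher(instance_id: str) -> str:
    """Stand-in for real provisioning; only builds a label."""
    return f"replacement-for-{instance_id}"


warning = {
    "source": "aws.ec2",
    "detail-type": INTERRUPTION_DETAIL_TYPE,
    "detail": {"instance-id": "i-0123456789abcdef0", "instance-action": "terminate"},
}
print(handle_event(warning, fake_launcher))
```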

Chris Engelbert: Right. Right.

Cristian Magherusan-Stanciu: So your application needs to be able to sustain these interruptions. Like if you think of it, that the application is running on multiple identical instances in a group. So those groups may scale up and down based on the load. And then when they scale up and down, it means that it has to be somewhat flexible with the capacity. So it's like stateless, basically, it should be stateless. And if the group is stateless and many of them actually are, then this entire process is seamless for the user. So like there is a load balancer in front of the group that sends your traffic. And then the users will not notice anything, because you still get some replacement.

Chris Engelbert: Right. And that means if I see that correctly, as a developer building an application, I don't have to do anything special. As long as my application is like cloud-native and basically scalable, I'm good. Right?

Cristian Magherusan-Stanciu: Yeah, that's the way it works. And I mean, that's the whole idea of how people can actually use this reliably. And I have people who use it in production. As long as the application is built in this way, it's a great way to save money for companies.

Chris Engelbert: And from what I heard, it also works together with the automatic scaling that AWS gives you. It's just that in the worst case, your tool steps in, and interferes, and says, I don't want that instance. I want that. Is that correct?

Cristian Magherusan-Stanciu: Yeah, that's exactly the way it works. It works with the API calls to attach and detach capacity to the group. So what I do is spin up an instance, attach it to the group, and then detach and terminate the existing one.
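
The attach-then-detach sequence can be simulated without touching AWS at all. In this Python sketch the `Group` class, `launch_spot_clone`, and `replace_one` are hypothetical names; in a real account the attach and detach steps would map onto the Auto Scaling attach and detach instance operations, which are not called here.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Group:
    """In-memory stand-in for a scaling group: instance id -> purchase model."""
    name: str
    instances: dict[str, str] = field(default_factory=dict)


_ids = count(1)


def launch_spot_clone(source_id: str) -> str:
    """Stand-in for launching a spot instance configured like `source_id`."""
    return f"spot-{next(_ids)}"


def replace_one(group: Group) -> bool:
    """Swap a single on-demand instance for a spot clone.

    Returns False when no on-demand instance is left to replace.
    """
    old_id = next((i for i, kind in group.instances.items() if kind == "on-demand"), None)
    if old_id is None:
        return False
    new_id = launch_spot_clone(old_id)
    group.instances[new_id] = "spot"   # attach the new instance first
    del group.instances[old_id]        # then detach and terminate the old one
    return True


group = Group("web", {"i-a": "on-demand", "i-b": "on-demand", "i-c": "on-demand"})
while replace_one(group):
    pass
print(sorted(group.instances.items()))
```

The group ends up with the same number of instances, and nothing about the group's own configuration was edited, only its membership.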

Chris Engelbert: Right.

Cristian Magherusan-Stanciu: Yeah, it allows me to keep the group configuration untouched, which is important for customers where maybe they have a legacy application they don't want to touch. Or, for whatever reason, it could be easier to just work from outside the configuration rather than maybe reconfiguring 100 groups one by one, because the entire configuration is automated.

Chris Engelbert: But does AutoSpotting also do like the actual scaling or is that still the scaling group from AWS and you're basically just exchanging?

Cristian Magherusan-Stanciu: That's the group's job, so I don't interfere with that. But when the group scales out and spins up a new instance, I immediately replace it.

Chris Engelbert: Interesting. Yeah. Okay, cool. That is very cool. So I guess, boldly said, the biggest thing is the cost efficiency for the customer, because I guess the spot instances are way cheaper than regular instances.

Cristian Magherusan-Stanciu: Yeah, I mean, it depends a lot on the instance type. The prices go up and down based on supply and demand. There are times during the year when people tend to need more capacity and then the prices go up. And then other times they go back down. It's like a marketplace. So yeah, typically the savings are nowadays around 60-ish percent. 50-60% depending on the instance type and region.

Chris Engelbert: So that means it basically gives you the same cost cut as a three year reservation without the actual reservation. That's actually really cool. Wow.

Cristian Magherusan-Stanciu: And the benefit is that if you don't need it, you just stop it and you don't pay for it. Whereas the reservation is a flat fee per hour. So I mean, I have a customer right now who's working in the stock market and they have this daily traffic pattern. During the stock exchange open hours, they have a lot of traffic and they spin up capacity, and then at the end of the day, they can shut it down. Whereas if you're paying for a savings plan, you would provision the savings plan for the whole 24 hours and you would definitely get savings. But if you consider that maybe you only need that capacity for the eight hours of the stock market, it would probably be better to just run it on demand rather than buying a savings plan, if you cannot use spot for that workload.
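
The trade-off in this answer is easy to check with a little arithmetic. The Python sketch below compares one day of cost for a workload that only runs eight hours. The hourly price of 1.00 and the 50% commitment discount are made-up numbers for illustration; the 60% spot discount is the rough figure quoted in the interview.

```python
HOURS_PER_DAY = 24


def daily_cost_on_demand(hourly: float, hours_used: float) -> float:
    """Pay full price, but only for the hours actually used."""
    return hourly * hours_used


def daily_cost_commitment(hourly: float, discount: float) -> float:
    """Pay the discounted rate for all 24 hours, used or not."""
    return hourly * (1 - discount) * HOURS_PER_DAY


def daily_cost_spot(hourly: float, discount: float, hours_used: float) -> float:
    """Pay the discounted spot rate only for the hours actually used."""
    return hourly * (1 - discount) * hours_used


def break_even_hours(discount: float) -> float:
    """Daily usage above which a 24-hour commitment beats on-demand."""
    return HOURS_PER_DAY * (1 - discount)


hourly, hours_used = 1.00, 8           # hypothetical price, trading-hours workload
print(f"on-demand:  {daily_cost_on_demand(hourly, hours_used):.2f}")
print(f"commitment: {daily_cost_commitment(hourly, 0.50):.2f}")
print(f"spot:       {daily_cost_spot(hourly, 0.60, hours_used):.2f}")
print(f"break-even: {break_even_hours(0.50):.1f} hours/day")
```

Under these assumptions the commitment only pays off once the capacity runs more than 12 hours a day, so the eight-hour workload is cheaper on demand, and cheaper still on spot.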

Chris Engelbert: So that means from, from a customer's perspective or from a customer's group perspective, I think there is like the group that is perfectly fine with reservations. They have a fairly consistent kind of load on the system capacity. And for them, it's great. And they can look into the future like, yeah, we need those systems or we need that capacity in a year and in three years. It's all good. We only scale up. And then there's the kind of people or the kind of companies, like you said, like the stock market or any kind of a very regional online shop or whatever where you have like high hours or high load hours, low load hours. And I think those are perfect candidates for AutoSpotting. Is that correct?

Cristian Magherusan-Stanciu: Yeah. I mean, I have customers from all sorts of verticals, but it's very important if you can follow the traffic pattern of the users. If it's flat capacity, I mean, you can as well just purchase the three year all upfront and you get a bit better savings than this. But yeah, this, this doesn't have any commitment. So, if you want to do something different you have the flexibility to just stop altogether.

Chris Engelbert: And I think that flexibility, not having this fixed commitment, is really interesting not only for production. I could also see that it's interesting during a research phase, a development phase, stuff like that, where you have a time when you really need more capacity because now you're having this new research project or you have a bunch of developers working on something else, but then you have times when you really don't need that capacity. That's cool. That's really cool. So you said 60% or roughly 60%. Is that virtual machines only or does that also work for other things?

Cristian Magherusan-Stanciu: It's just the virtual machines.

Chris Engelbert: Okay, just the VMs. That's fair. Still, I think it's a good chunk of money.

Cristian Magherusan-Stanciu: For storage and databases, I have different tools. For storage, I have something that converts between the storage volume types, gp2 to gp3. And for databases, I have a tool that converts the instance type to the ARM Graviton instance types and also does right sizing in the process of conversion. So if you have overprovisioned capacity, we look at the metrics and say, okay, this is maybe too big, let's use a smaller one.
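
The right-sizing idea, looking at the metrics and picking a smaller size when there is headroom, could look roughly like the Python sketch below. It is a toy model, not the tool mentioned in the interview: the size ladder, the assumption that each step down halves capacity, and the 60% target utilization are all hypothetical.

```python
# Hypothetical size ladder; each step up is assumed to double capacity.
SIZES = ["large", "xlarge", "2xlarge", "4xlarge"]


def right_size(current: str, peak_utilization: float, target: float = 0.6) -> str:
    """Step down while the projected peak utilization stays within the target.

    `peak_utilization` is a fraction (0.12 means 12%) measured on `current`.
    """
    idx = SIZES.index(current)
    # Halving capacity doubles utilization, so check the doubled value first.
    while idx > 0 and peak_utilization * 2 <= target:
        idx -= 1
        peak_utilization *= 2
    return SIZES[idx]


print(right_size("4xlarge", 0.12))
```

With a 12% peak on a `4xlarge`, two steps down would land at a projected 48%, while a third step would reach 96% and is therefore rejected.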

Chris Engelbert: Okay, so that certainly sounds like you have to come back talking about those kinds of things. That's really cool. And I think you install the tool into your own AWS account. I think that is what you said, right? Do you do this from the marketplace?

Cristian Magherusan-Stanciu: Yeah, it's available on the marketplace. Previously, it was open source, so you could just get it from GitHub. But after I left AWS, I'm kind of full time on this and trying to make a living.

Chris Engelbert: That's fair.

Cristian Magherusan-Stanciu: Yeah, I try to get some revenue out of it, but I'm not charging. I mean, the current version is charging 5% of the savings. I'm going to increase the pricing a bit in the next version, but I'm just trying to get some affiliates to do marketing for me. And with 5%, there's not much to share.

Chris Engelbert: That is true.

Cristian Magherusan-Stanciu: Yeah, but other than that, I'm trying to be as low cost as possible.

Chris Engelbert: Right, right. So you said a new version. Anything you want to share about that, anything exciting that you're really looking forward to having implemented?

Cristian Magherusan-Stanciu: I mean, the last version was like six months ago, and over these last six months, every time I have a customer reporting something, I implement it. And basically, I added a bunch of new interesting things, with a lot of work on the efficiency of the software. So it uses less memory, which matters because it's running in Lambda. There was also something when it came to deployment. So when you have a deployment, sometimes you spin up instances during the deployment and you don't want the tool to interfere with the deployment until it's over. So there is a way to do that. That's deployment in the sense of the customer's application, like if I deploy a whole new application.

Chris Engelbert: Yeah, okay, that makes sense. So if the application is deployed on newly launched instances, sometimes they are like, "Yeah, I don't want this to be touched until the deployment is over." So it's some kind of a cool-off phase or whatever you want to call that.

Cristian Magherusan-Stanciu: Yeah.

Chris Engelbert: All right. Or maybe a warm-up phase, depending on what the application does. Okay. Cool. Yeah, I can see how this is very useful for a lot of customers or potential customers. We're almost running out of time, but I have one last question, something I've really asked everyone so far. Like, what do you think is the new trend? What do you see as the big future we're heading to?

Cristian Magherusan-Stanciu: I mean, over the last year or so, it's pretty clear it's AI, and everybody's doing something in that space. I mean, it could as well be just like the dot-com bubble. The situation could also implode at some point, but I see huge gains from using AI in my own work. On a daily basis, I use ChatGPT for software development. So pretty much all my code over the last year is generated by the AI. And yeah, I'm using it, not offering AI tooling. Like, you know, every application now has an AI feature. I don't have an AI feature in AutoSpotting, but the way I build AutoSpotting evolved using the AI. So yeah, that's kind of the trend that I see. And I think that makes a lot of sense.

Chris Engelbert: Yeah, that makes a lot of sense. It's interesting that you say you build with ChatGPT. I have mixed experiences with that. What I found for all of those tools, no matter if it's ChatGPT or, what is it, Copilot or any of those tools, is that they really work great when you give them some source code and ask them, like, hey, do you have recommendations? Like having a virtual pair programming partner. That is absolutely amazing. Not everything is correct. And sometimes you're like, yeah, okay, it's maybe not what you want. But it's much better. And I got some pretty decent results with that kind of use case. Anyway, as I said, we are unfortunately out of time. 20 minutes is super short. But really, really interesting. As I said, you probably have to come back for the other tools when you want to talk more about those, because they also sound super interesting. And yeah, thank you for being here.

Cristian Magherusan-Stanciu: It was a pleasure. Thanks for having me.

Chris Engelbert: And it was also a pleasure for me.

