Resoto: Graph-powered cloud asset inventory
A cloud infrastructure asset inventory organized as a directed graph in ArangoDB
Results:
- Empower customers to reduce cloud costs, improve security compliance, and prevent outages that can impact customer experiences.
- View dependencies between all cloud infrastructure assets to improve reliability.
- Allow non-experts to search their cloud infrastructure to reduce cloud spend and increase security.
The Scenario: Untracked cloud infrastructure leading to inventory debt
Resoto is a product offered by Some Engineering, whose mission is to make cloud infrastructure searchable and accessible. Resoto was created to provide a complete overview of all the cloud resources running in an organization, automate their documentation, and reduce spend.
The need for a product like Resoto arises from several shifts in cloud computing:
- Monolithic to modular: Application architectures have evolved from an on-premise, monolithic stack with a single application deployed on a single server to a cloud-native infrastructure serving hundreds of resources, such as compute instances, storage buckets, Kubernetes pods, and microservices.
- Centralized to dispersed spend: Cloud spending has moved from being controlled by central IT to being self-procured by developers.
- Unmanaged to managed configurations: Cloud deployment has gone from being driven by consoles, such as the AWS Management Console, to infrastructure-as-code products like Terraform and Pulumi. These products reduce configuration drift, where changes to software and hardware are made ad hoc and are not recorded or tracked comprehensively and systematically. This change is described as a move from mutable updating to immutable updating.
Cloud-native infrastructure increases complexity by orders of magnitude.
These shifts have reduced friction and accelerated innovation – good things for organizations – but have the downside of increased complexity. This complexity has created a new type of technical debt that Lars Kamp, CEO of Some Engineering, calls “inventory debt.” Elaborates Kamp, “Organizations are accumulating this technical debt where they lose track of the assets running in their infrastructure, and so getting an overview of all the cloud resources is no longer a trivial task. The inventory is not only huge and growing, but also rapidly changing.”
This limitless aspect of cloud computing can lead to problems: security vulnerabilities, performance issues, and perhaps most importantly, cost explosion. As the saying goes, with cloud computing, you never run out of resources; you only run out of budget.
The Requirements: A graph database to capture metadata and dependencies
The knee-jerk reaction to building an inventory is to list all the resources. But that’s just the beginning. Explains Kamp, “You start to expand the requirements of what you want to track. First, it’s the unique properties of each resource. Then it’s how they relate to each other.”
Creating this list of resources, their properties, and how they are connected in a relational database would get complicated quickly. One would need to create different tables for all the various resources and relationships, resulting in a complex data model and, in turn, complex queries to get insights. Such queries would be impractical on a relational database, where joins more than a couple of levels deep quickly become slow and hard to maintain. Resoto could write simpler queries to compensate, but that would lead to missed insights. “It becomes a trade-off between insight and simplicity, and for Resoto, we wanted both,” Kamp states.
Therefore, Some Engineering decided to build Resoto as a directed graph. This approach would allow them to capture metadata and dependencies as part of their data model. Resoto’s asset inventory needed to hold two kinds of data: resource data and dependency data. Resource data includes general information like resource name, ID, creation timestamp, hierarchy information, and the region or cloud account in which the resource sits. Dependency data represents the different relationships between those resources. “And when it comes to dependencies, there’s not just one,” Kamp explains. “People typically think about logical dependencies, but next to logical dependencies are delete dependencies – there’s a distinct order in which resources need to be cleaned up or deleted. There are also start dependencies – in which order resources need to be started to work properly. And stop dependencies – in which order they need to be shut down.”
Driven by the desire to capture dependencies and the context along with them, Some Engineering considered a graph database.
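To make the data model concrete, here is a minimal Python sketch of resources as nodes and typed dependencies as directed edges. It is an illustration only, not Resoto’s actual schema: the class names (`Resource`, `EdgeType`, `InfrastructureGraph`), the field names, and the sample resources are all assumptions.

```python
from dataclasses import dataclass, field
from enum import Enum


class EdgeType(Enum):
    # The four dependency kinds named above; the enum itself is hypothetical.
    LOGICAL = "logical"
    DELETE = "delete"
    START = "start"
    STOP = "stop"


@dataclass(frozen=True)
class Resource:
    # Resource data: general information about one cloud asset.
    id: str
    name: str
    kind: str
    region: str
    account: str
    created_at: str  # ISO 8601 timestamp


@dataclass
class InfrastructureGraph:
    nodes: dict[str, Resource] = field(default_factory=dict)
    # edges[edge_type][source_id] is the set of target ids for that dependency kind.
    edges: dict[EdgeType, dict[str, set[str]]] = field(default_factory=dict)

    def add_resource(self, resource: Resource) -> None:
        self.nodes[resource.id] = resource

    def add_dependency(self, source_id: str, target_id: str, edge_type: EdgeType) -> None:
        if source_id not in self.nodes or target_id not in self.nodes:
            raise KeyError("both endpoints must be added before connecting them")
        self.edges.setdefault(edge_type, {}).setdefault(source_id, set()).add(target_id)

    def successors(self, source_id: str, edge_type: EdgeType) -> set[str]:
        # Return a copy so callers cannot mutate the graph by accident.
        return set(self.edges.get(edge_type, {}).get(source_id, set()))


graph = InfrastructureGraph()
graph.add_resource(Resource("vpc-1", "main-vpc", "vpc", "eu-central-1", "prod", "2022-01-10T08:00:00Z"))
graph.add_resource(Resource("i-1", "web-1", "instance", "eu-central-1", "prod", "2022-03-02T12:30:00Z"))

# Logically the instance lives inside the VPC; for deletion the order is reversed.
graph.add_dependency("vpc-1", "i-1", EdgeType.LOGICAL)
graph.add_dependency("i-1", "vpc-1", EdgeType.DELETE)
```

Keeping one edge set per dependency type means the same pair of resources can be connected in different directions depending on the question being asked, which is the point Kamp makes about logical versus delete dependencies.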
Why ArangoDB: Search, simplicity, and flexibility
1. Full-text search: With ArangoSearch, ArangoDB’s full-text search and ranking engine, Resoto can offer an easy way for non-expert developers to search their cloud infrastructure, which was typically only accessible to engineers with deep expertise in cloud computing. This capability is essential since modern cloud service providers such as AWS, Microsoft Azure, and Google Cloud have hundreds of service offerings in dozens of regions worldwide.
2. Powerful query language: ArangoDB Query Language (AQL) allowed Resoto to build a simplified search syntax, so that non-expert developers can more easily ask questions about their infrastructure and ensure that their applications run fast, reliably, and securely.
3. Schemaless: Although ArangoDB is schemaless, it also provides schema validation so that Resoto can enforce a statically typed data model on top, one that it can update on the fly. “This is important for us because quality is paramount regarding data. By enforcing our statically-typed model and testing it every time we import a new data set, we offer robust data quality, which means better insights.”
Data quality improves with a statically typed model.
4. Smooth experience: With a runtime of less than 100 MB, a Kubernetes operator, and an open-source core with a clear upgrade path to additional features via an Enterprise Edition, ArangoDB is easily embedded into Resoto, allowing Some Engineering to offer its customers a smooth and straightforward experience.
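The idea in point 3, checking every imported document against a statically typed model, can be sketched in a few lines of plain Python. This is an assumption-laden illustration: it does not use ArangoDB’s schema validation feature or Resoto’s real model, and `KindModel`, `validate`, and the sample `instance` kind are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KindModel:
    # Hypothetical typed model for one kind of resource: property name -> expected type.
    name: str
    properties: dict[str, type]


def validate(document: dict, model: KindModel) -> list[str]:
    """Return a list of human-readable violations; an empty list means the document conforms."""
    errors = []
    for prop, expected in model.properties.items():
        if prop not in document:
            errors.append(f"missing property: {prop}")
        elif not isinstance(document[prop], expected):
            actual = type(document[prop]).__name__
            errors.append(f"{prop}: expected {expected.__name__}, got {actual}")
    for prop in document:
        if prop not in model.properties:
            errors.append(f"unknown property: {prop}")
    return errors


instance_model = KindModel("instance", {"id": str, "name": str, "cores": int})

good = {"id": "i-1", "name": "web-1", "cores": 4}
bad = {"id": "i-2", "name": "web-2", "cores": "four"}

good_errors = validate(good, instance_model)
bad_errors = validate(bad, instance_model)
```

Running a check like this on every import rejects malformed documents before they reach the graph, while the model itself stays ordinary data that can be changed without a database migration.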
The Implementation: A digital twin of an organization’s cloud assets
Resoto is a cloud asset inventory – essentially, a digital twin – of an organization’s cloud infrastructure. It takes regular, timestamped snapshots of cloud inventory, providing a complete representation of cloud infrastructure at any specific point in time. It’s complete, up-to-date, and queryable. As Kamp puts it, “It’s a meta layer that allows platform teams and engineers to analyze infrastructure and automatically perform remediation.”
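As a rough sketch of what timestamped snapshots make possible, the following Python keeps snapshots in chronological order and looks up the inventory as of any moment. The `SnapshotStore` class and the sample resource IDs are hypothetical, and each snapshot is reduced to a set of IDs for brevity; how Resoto actually stores snapshots is not described here.

```python
from bisect import bisect_right
from datetime import datetime, timezone


class SnapshotStore:
    """Hypothetical in-memory store of timestamped inventory snapshots."""

    def __init__(self) -> None:
        self._timestamps: list[datetime] = []
        self._snapshots: list[frozenset[str]] = []

    def add(self, taken_at: datetime, resource_ids: set[str]) -> None:
        if self._timestamps and taken_at <= self._timestamps[-1]:
            raise ValueError("snapshots must be added in chronological order")
        self._timestamps.append(taken_at)
        self._snapshots.append(frozenset(resource_ids))

    def at(self, moment: datetime) -> frozenset[str]:
        # Find the latest snapshot taken at or before the requested moment.
        index = bisect_right(self._timestamps, moment) - 1
        if index < 0:
            raise LookupError("no snapshot taken at or before that time")
        return self._snapshots[index]


store = SnapshotStore()
store.add(datetime(2022, 6, 1, tzinfo=timezone.utc), {"i-1", "bucket-1"})
store.add(datetime(2022, 6, 2, tzinfo=timezone.utc), {"i-1", "i-2"})

before = store.at(datetime(2022, 6, 1, 12, tzinfo=timezone.utc))
after = store.at(datetime(2022, 6, 2, tzinfo=timezone.utc))

# Set differences show what appeared and disappeared between the two points in time.
added = after - before
removed = before - after
```

Comparing two snapshots this way is one simple form of the analysis the quote refers to: it shows which resources were created or removed between two points in time.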
Each node in ArangoDB’s graph represents a cloud resource, which includes JSON metadata. The edges in the graph describe the relationships between these resources. With all of an organization’s cloud assets organized as a graph in ArangoDB, Resoto has also been able to build additional features to assist with automation and intelligence:
- Resoto Jobs, which trigger commands based on a schedule or event in an organization’s infrastructure. This feature automates tedious tasks like updating or deleting resources, reducing toil and freeing up engineering time for non-repeatable tasks that often deliver more value to the organization and its customers.
- Resoto Notebooks, built on Jupyter Notebooks, which support interactive analysis of an organization’s infrastructure graph in Python, so that companies can run models of cloud computing capacity and costs and thus forecast and plan more intelligently.
The Results: Transforming inventory debt into strategic assets
By enabling customers to build their cloud asset inventories on ArangoDB, Resoto provides an infrastructure graph: a strategic data asset that can yield insights.
Infrastructure graphs are a new strategic asset for organizations, on par with data from SaaS systems and production data.
An infrastructure graph allows teams to ask questions about their infrastructure without impacting test and production, such as:
- Improved customer experience through increased application reliability: What is the blast radius of any of my cloud assets? In other words, which resources and applications will be impacted if I delete this instance? Knowing this helps organizations run cloud-based applications with increased uptime and thus deliver a better customer experience.
- Increased security: What resources sit behind this IP address? Knowing this lets cloud engineers correlate IP addresses listed in a SIEM (security information and event management) system to the actual resources being accessed, speeding up security investigations. It also allows Resoto to perform security benchmarks against your infrastructure.
- Cost reduction: In what order do we need to shut down and remove resources to reduce cloud infrastructure costs without impacting our applications and disrupting the customer experience?
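Two of these questions map onto standard graph algorithms. The sketch below, in plain Python rather than AQL or Resoto’s own search syntax, computes a blast radius with a breadth-first traversal and a safe shutdown order with a topological sort. The `dependents` graph and both function names are hypothetical.

```python
from collections import deque
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical infrastructure graph: each key maps to the resources that depend on it.
dependents = {
    "vpc-1": {"subnet-1"},
    "subnet-1": {"i-1", "lb-1"},
    "i-1": {"volume-1"},
    "lb-1": set(),
    "volume-1": set(),
}


def blast_radius(graph: dict[str, set[str]], start: str) -> set[str]:
    """Every resource reachable from `start`, i.e. everything affected if it is deleted."""
    seen: set[str] = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for dependent in graph.get(node, ()):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen


def shutdown_order(graph: dict[str, set[str]]) -> list[str]:
    """An order in which every resource is shut down only after all of its dependents.

    TopologicalSorter reads the mapping as node -> nodes that must come first,
    so passing the dependents mapping yields dependents before what they rely on.
    """
    return list(TopologicalSorter(graph).static_order())


affected = blast_radius(dependents, "subnet-1")
order = shutdown_order(dependents)
```

With this sample graph, deleting `subnet-1` affects `i-1`, `lb-1`, and `volume-1`, and any valid shutdown order removes `vpc-1` last. Ties between unrelated resources can be resolved in any order, so the exact sequence may vary between runs.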
Resoto allows organizations to move from inventory debt, where they had to worry about cost, security, and vulnerabilities, to a new strategic data asset.
For more details, watch Lars Kamp’s presentation from ArangoDB Summit 2022.