How ArangoGraphML Leverages Intel’s PyG Optimizations
ArangoGraphML + Intel: Next-level Machine Learning Accelerated
ArangoDB and Intel have announced a groundbreaking partnership to enhance Graph Machine Learning (GraphML) using Intel's high-performance processors. This collaboration, part of the Intel Disruptor Program, aims to integrate ArangoDB's graph database solutions with Intel's Xeon CPUs. This synergy promises to revolutionize data analytics and pattern recognition in complex graph structures, marking a new era in database technology and GraphML advancements.
ArangoGraphML
ArangoGraphML, part of ArangoDB's suite, is an advanced graph machine learning platform designed for efficient data analysis and pattern recognition in complex graph structures, leveraging graph database technology to drive innovation in data intelligence and analytics.
Machine Learning Performance Challenge
The quest for speed in machine learning platforms is unending. By delving into Intel’s PyG optimizations, we aim to harness CPU performance enhancements specifically tailored for Graph Neural Network and PyG workloads. Because ArangoGraphML uses PyG (PyTorch Geometric), any performance improvement there is relevant for us and our customers. This exploration is about benchmarking Intel’s PyG optimizations and also about testing internally to measure their impact on our platform.
PyG benchmark
Our focus lies on gauging the performance of GraphML algorithms within our platform using torch.compile. Comparing compiled runs against the default eager mode lets us assess the efficiency gains that Intel’s PyG optimizations bring during training and inference, and shows the tangible benefits for our users.
Benchmark methodology
To ensure a robust evaluation, we conducted tests under controlled conditions:
- System Specifications: We used an AWS EC2 t2.2xlarge instance with 8 vCPUs and 32 GiB RAM.
- Dataset: We used the ogbn-products dataset, a large-scale undirected and unweighted graph representing an Amazon product co-purchasing network. The task is to predict the category of a product in a multi-class classification setup, where the 47 top-level categories are used as target labels. Because it is drawn from real co-purchasing data, the dataset is relevant to real-world scenarios.
- Batch Size, Hidden Channels, and Number of Layers: We varied these essential hyper-parameters when evaluating the performance of GraphML algorithms.
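The measurement itself can be sketched with a small timing harness. This is a minimal sketch under stated assumptions, not the harness we used: `median_epoch_time`, its `warmup` parameter, and `dummy_epoch` are hypothetical names introduced for illustration. In a real run, the callable would execute one training epoch with the model either in eager mode or wrapped by torch.compile. Untimed warm-up runs are included because a compiled model pays a one-off compilation cost on its first call.

```python
import statistics
import time
from typing import Callable, List


def median_epoch_time(run_epoch: Callable[[], None],
                      epochs: int = 5,
                      warmup: int = 1) -> float:
    """Return the median wall-clock time in seconds over `epochs` timed calls."""
    if epochs < 1:
        raise ValueError("epochs must be at least 1")

    # Warm-up calls are executed but not timed, so one-off costs
    # (for example, compilation on the first call) do not skew the median.
    for _ in range(warmup):
        run_epoch()

    timings: List[float] = []
    for _ in range(epochs):
        start = time.perf_counter()
        run_epoch()
        timings.append(time.perf_counter() - start)

    return statistics.median(timings)


def dummy_epoch() -> None:
    """Hypothetical stand-in for one training epoch (assumption for this sketch)."""
    sum(i * i for i in range(10_000))


if __name__ == "__main__":
    seconds = median_epoch_time(dummy_epoch, epochs=5, warmup=1)
    print(f"median time per epoch: {seconds:.6f} s")
```

Reporting the median rather than the mean keeps a single slow epoch, such as one disturbed by other activity on the instance, from dominating the result.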
The outcomes
In our preliminary assessments, we observed a noteworthy increase in performance, achieving a speedup of up to 20%. The gains were evident when comparing the execution times of GraphML algorithms with and without Intel’s PyG optimizations. The results are presented graphically in the chart below and summarized in the accompanying table.
Batch Size | Hidden Channels | Layers | Mode | Median Time per Epoch (in seconds) | Speed up |
---|---|---|---|---|---|
1024 | 256 | 2 | Eager | 153.803 | |
1024 | 256 | 2 | Compile | 134.106 | 1.15x |
512 | 64 | 2 | Eager | 89.039 | |
512 | 64 | 2 | Compile | 98.714 | 1.11x |
512 | 128 | 3 | Eager | | |
512 | 128 | 3 | Compile | | 1.12x |
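As a worked example of how a speed-up figure relates to the epoch times, the sketch below assumes that speed-up is defined as the eager median time divided by the compiled median time. That definition is an assumption on our part, although it is consistent with the 1.15x reported for the first configuration. The helper names `speedup` and `time_saved_percent` are hypothetical.

```python
def speedup(eager_seconds: float, compiled_seconds: float) -> float:
    """Speed-up factor, assumed here to be eager time divided by compiled time."""
    if eager_seconds <= 0 or compiled_seconds <= 0:
        raise ValueError("times must be positive")
    return eager_seconds / compiled_seconds


def time_saved_percent(eager_seconds: float, compiled_seconds: float) -> float:
    """Reduction in time per epoch, as a percentage of the eager time."""
    if eager_seconds <= 0:
        raise ValueError("eager time must be positive")
    return (eager_seconds - compiled_seconds) / eager_seconds * 100.0


if __name__ == "__main__":
    # Median times per epoch for batch size 1024, 256 hidden channels, 2 layers.
    eager, compiled = 153.803, 134.106
    print(f"speed-up: {speedup(eager, compiled):.2f}x")                  # speed-up: 1.15x
    print(f"time saved: {time_saved_percent(eager, compiled):.1f}%")    # time saved: 12.8%
```

Note that a speed-up factor and a percentage of time saved are different measures: a 1.15x speed-up corresponds to roughly 12.8% less time per epoch.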
Conclusion
With a demonstrated performance boost, we are now leveraging Intel’s PyG optimizations across our platform. This commitment aligns with our dedication to providing users with cutting-edge technology and optimized algorithms for their Graph Neural Network workflows.
As the field of machine learning continues to evolve, ArangoGraphML remains at the forefront, leveraging Intel’s PyTorch Geometric optimizations to ensure our users experience the fastest and most efficient ML platform available.
Stay tuned for further updates on our journey toward excellence in Graph Machine Learning!
ArangoDB’s Exciting Updates: Introducing Our Developer Hub and GenAI Bots!
Estimated reading time: 3 minutes
At ArangoDB, our commitment to empowering developers and data enthusiasts with cutting-edge tools and resources is unwavering. In line with our commitment to “Graph Done Simple,” we are thrilled to unveil two groundbreaking additions to our arsenal that promise to revolutionize your experience with our multi-model graph database.
Developer Hub: Where Knowledge Meets Accessibility
We’ve always believed in the power of community-driven knowledge sharing, and we are proud to present our brand-new Developer Hub, accessible at developer.arangodb.com. This hub is a testament to our dedication to creating an ecosystem that empowers you with the knowledge and resources you need.
Evolving ArangoDB’s Licensing Model for a Sustainable Future
Estimated reading time: 3 minutes
ArangoDB as a company is firmly grounded in Open Source. The first commit was made in October 2011, and today, we are very proud of having over 13,000 stargazers on GitHub. We believe that the ArangoDB community should be able to enjoy all of the benefits of using ArangoDB, and we have always offered a completely free community edition in addition to our paid enterprise offering.
With the evolving landscape of database technologies and the imperative need to ensure ArangoDB remains sustainable, innovative, and competitive, we’re introducing some changes to our licensing model. These alterations will help us continue our commitment to the community, fuel further development, and assist businesses in obtaining the best from our platform.
These alterations are based on changes in the broader database market.
ArangoGraph Now Available on AWS Marketplace
Estimated reading time: 1 minute
Today we are excited to announce that ArangoGraph, the ArangoDB Managed Service, is available for purchase in the AWS Marketplace. With this announcement, ArangoGraph can now be purchased directly via both AWS and GCP.
The AWS Marketplace provides an extensive catalog of software solutions for users to easily explore, test, buy, and deploy on AWS. If you’re an AWS customer, here’s what this announcement means for you:
Bridging Knowledge and Language: ArangoDB Empowers Large Language Models for Real-World Applications
Estimated reading time: 5 minutes
Understanding Large Language Models (LLMs) and Knowledge Graphs
Today, two very different technology concepts have become prominent in data analysis and predictive analytics: Knowledge Graphs and Large Language Models (LLMs). These domains each have their unique benefits, and influence the ways that we engage with and derive meaningful insights from constantly expanding and complex datasets. They are like the Odd Couple – better together than on their own!
Three Ways to Scale your Graph
Estimated reading time: 10 minutes
As businesses grow and their data needs increase, they often face the challenge of scaling their database systems to keep up with the increasing demand.
What happens when your single server machine is no longer sufficient to store your graph that has grown too large? Or when your instance can no longer cope with the increasing amount of user requests coming in?
May 2023: What’s the Latest with ArangoDB?
Estimated reading time: 4 minutes
Welcome to the May ArangoDB newsletter. Thank you for reading! 📖
Here are some of the things we’re excited to share with you this month:
- Our upcoming webinar on ArangoDB 3.11
- Combatting fraud with graph
- How Finite State uses ArangoDB to address cyber threats
- The latest case study with Global Relay
- Our five new driver tutorials in ArangoDB University
- ArangoGraph, our cloud-based graph data and analytics platform
- Rewarding you with a $25 Amazon gift card
Graph and Entity Resolution Against Cyber Fraud
Estimated reading time: 4 minutes
With the growing prevalence of the internet in our daily lives, the risks of malware, ransomware, and other cyber fraud are rising. The digital nature of these attacks makes it very easy for fraudsters to scale by creating thousands of accounts, so even if one is identified, they can continue their attacks.
In this blog post, we will discuss how graph and entity resolution (ER) can help us battle these risks across different industries such as healthcare, finance, and e-commerce (for example, the US healthcare system alone can save $300 billion a year with entity resolution). You will also receive hands-on experience with entity resolution on ArangoDB.
Combat Fraud with Graph
Estimated reading time: 5 minutes
Fraud is one of the most significant issues facing businesses today. While companies have always faced fraud, detecting fraudulent activity has become even more challenging due to increased online transactions. Globally, fraud results in more than $3.7 trillion in annual losses (Murphy, 2022). Fraud comes in numerous forms, including but not limited to money laundering, identity theft, account takeover, and payment fraud. Due to the variety of ways companies can face fraud, they must have a system to protect themselves and their customers.
February 2023: What’s the Latest with ArangoDB?
Estimated reading time: 4 minutes
Welcome to the ArangoDB newsletter for February 2023. Thank you for reading! 📖
Here are the things we’re most excited about this month:
- Our upcoming webinar on fastgraphML with ArangoDB
- We are now SOC 2 compliant
- Our latest case study with Orange detailing how ArangoDB serves as the core of its digital twin platform
- Our newest course in ArangoDB University: Coming from SQL
- ArangoGraph, our cloud-based graph data and analytics platform
- Our avocado grove is growing (we are hiring!)