Simplicity vs hyped innovation

Nowadays, everyone tries to resolve problems in different ways and accepts the burden of conducting research, which may take days, weeks, months, or even years, in pursuit of the best working solution. It's not really that difficult to find viable yet different approaches to solving our problems or mitigating the issues we've faced thus far. Many people generously share their opinions and solutions openly on the internet, allowing others who face similar problems to benefit from them. It's also essential to explore various approaches, compare them, and determine which one works best for you.

So far, this post has painted a rosy, gleaming picture of the world. You encounter a problem, search for solutions, choose one of them, address your problem, and voila! Does it really work that way? We can say with one hundred percent certainty that it does not.

Back to Reality

In general, you often find yourself caught between multiple solutions, each of which differs from the others yet appears, at first glance, more promising than the rest. This is because there isn't a single, definitive way to resolve issues in our domain. Each solution comes with its own set of trade-offs, advantages, and disadvantages that you must consider from your company's perspective. After all, not every company shares the same position or faces identical circumstances.

Even if solution A seems more promising than solution B, you might lack the resources needed to implement it: the number of colleagues, equipment, knowledge base, computer memory, time, and other elements relevant to your specific situation. In such cases, turning to solution B to quickly gain advantages is a valid choice. This doesn't imply that you're on the wrong path; rather, you're opting for a route that allows for eventual success.

Imagine that you want to cool down on a scorching hot day in the middle of summer. You have several options: you can go to the Maldives, take a bus to the nearest beach in your current province, or simply have a shower in the bathroom next to your room. Going to the Maldives may seem like the dream solution, but a daily trip there is unlikely to be time- or cost-efficient, unless you have a private jet waiting for you downstairs. In this case, a local beach can be a practical choice, provided one is available and you have the time and the necessary equipment, such as a swimsuit. If you lack these resources, you can always opt for a refreshing ice-cold shower in your own bathroom.

Each of these solutions will fulfill your needs in different ways. However, choosing the most suitable one for your circumstances is crucial; otherwise, you might not achieve your goal and end up stuck, for instance, at the airport in the Maldives without tickets, a budget, or even plans, drenched in sweat from the scorching heat outside. In such situations, you'll be forced to fall back on the last option, but only after significant effort. Always evaluate your situation thoroughly before making any decisions!

Reflecting on our journey...

At Priva, we have also encountered numerous challenges that needed resolution, just like everyone else. I'd like to share a story that relates to what I've tried to illustrate above. We are a significant IoT company specializing in the Horticulture and Building domains, providing various cloud and edge solutions tailored to them. Our cloud solutions primarily revolve around Microsoft Azure.

When it comes to IoT, the concept of digital twins immediately comes to mind. You have real-world devices, objects, and sensors that need to be represented in software to provide solutions. This is where it all begins. Many discussions revolve around how and where to store your digital twins. Robust communities and academic studies exist on this topic, offering multiple solutions to choose from. You can opt for a graph data structure, where each of your objects forms a network with others, or a flat data structure, where strong relationships are absent but all attributes are stored together. Alternatively, you can choose the most conventional route: a traditional relational data structure that lets you establish relationships between objects.

Kerem Aytaç

Tech Lead Priva

From a broader perspective, employing graph data structures adds significant value to your data and offers flexibility, because real-world objects typically have complex and irregular relationships with one another. Trying to fit these relationships neatly into templates can be challenging, and graph capabilities can help here. Alternatively, your objects might have hundreds of unique attributes, each distinct from the others. To maintain the flexibility to query any attribute at any time, a flat data structure, similar to a document, can be a useful choice. For those who prefer to play it safe, the most familiar option is a traditional relational (SQL) database.
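To make the three options concrete, here is a minimal sketch of the same tiny twin model expressed in each shape. Everything in it is a hypothetical illustration rather than Priva's actual model: the identifiers (`gh1`, `c1`, `s1`), the type names, the relationship labels, and the helper functions are all assumptions, and plain Python containers stand in for real databases.

```python
# Hypothetical twin data: one greenhouse, one compartment, one sensor.

# 1) Graph: nodes plus typed edges; relationships are first-class data.
graph_nodes = {"gh1": "Greenhouse", "c1": "Compartment", "s1": "TemperatureSensor"}
graph_edges = [("gh1", "hasPart", "c1"), ("c1", "hasPoint", "s1")]

# 2) Flat document: all attributes stored together, no strong relationships.
flat_docs = [
    {"id": "s1", "type": "TemperatureSensor", "unit": "degC", "value": 21.5,
     "compartment": "c1", "greenhouse": "gh1"},
]

# 3) Relational: rows of (id, type, parent_id) linked by a foreign key.
relational_rows = [
    ("gh1", "Greenhouse", None),
    ("c1", "Compartment", "gh1"),
    ("s1", "TemperatureSensor", "c1"),
]


def neighbors(node_id):
    """Graph-style query: follow the outgoing edges of a node."""
    return [(rel, dst) for src, rel, dst in graph_edges if src == node_id]


def find_docs(**criteria):
    """Flat-style query: filter on any combination of attributes."""
    return [d for d in flat_docs
            if all(d.get(key) == value for key, value in criteria.items())]


def children(parent_id):
    """Relational-style query: select rows by their foreign key."""
    return [row_id for row_id, _type, parent in relational_rows
            if parent == parent_id]
```

The graph form can express arbitrary relationship types, the flat form makes any attribute queryable without joins, and the relational form encodes one fixed kind of relationship through a key. Which of these matters most depends on the queries you actually need.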

We initially adopted a flat data structure in our cloud solutions because we dealt with a multitude of attributes and wanted easy querying. Furthermore, presenting them in a flat structure was straightforward, as the data from devices, which we processed to create digital twins, was easily parsed into a flat structure. Additionally, we didn't need to establish complex relationships or perform joins. For this purpose, we used Azure Cosmos DB, a NoSQL product. It served us well for some time, and we had ample opportunities to observe how our database was used by various stakeholders. However, as circumstances changed within our company, we began focusing more on edge solutions on top of our cloud offerings. Additionally, the domains using our solutions became more demanding, requiring extensive information from our digital twins due to the proliferation of vertical solutions built upon them. Therefore, we needed a solution that worked seamlessly in both cloud and edge environments, and unfortunately, Azure Cosmos DB did not meet this criterion. Another essential requirement was a flexible database that supported attribute querying together with inter-object relationships.

The initial requirements from our business teams seemed quite sophisticated, leading us to consider a graph database. Many digital twin solutions worldwide were gravitating toward this approach, and graph databases were receiving considerable attention. Moreover, other domain teams developing solutions on top of ours were also exploring graph-capable databases to address similar issues. We decided to turn our focus to graph databases and embarked on an extensive, months-long research journey, experimenting with various options. Initially, we set technical requirements, such as compatibility with SPARQL and the ability to accommodate the Brick ontology, all while remaining environment-agnostic. Some of these databases were managed services on Azure, while others were not. So, we decided to set up a Kubernetes cluster in Azure to host those solutions. This would also help us set up the same environment at the edge.

We conducted numerous load, stress, and performance tests on our databases, but we were unable to achieve the desired results or benefits. Additionally, setting up a Kubernetes cluster for a database introduced various dependencies and required expertise in Kubernetes maintenance. Furthermore, hosting a large, stateful database in Kubernetes for production purposes is not a recommended approach. It took us several months to realize that we were heading in a direction that was not suitable for our requirements and available resources. However, we never abandoned our search for alternatives and continued to explore them throughout our journey with graph databases.

It was time to take a step back and consider our next move. We faced a choice: either continue pursuing a trend that was unlikely to align with our needs, or reevaluate our requirements and chart a different course that was more suitable for us. We revisited our discussions with the business teams to gain a deeper understanding of the requirements and reconvened our technical meetings to explore simpler solutions. We found that we did not need such a complex solution. All of the requirements that came from the domains could be represented hierarchically, rather than in a graph-like structure. Additionally, the queries we derived from the business meetings were not that complex either, and we did not expect them to become complex in the near future. Therefore, we could trade flexibility for a simpler solution.
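As an illustration of why hierarchical requirements fit a relational model, the sketch below stores a small tree of twins in a single table with a self-referencing `parent_id` column and walks it with a recursive common table expression. The table name `twin`, its columns, and the sample rows are hypothetical and not taken from our actual schema, and SQLite (from the Python standard library) is used only as a convenient stand-in for a full SQL server.

```python
import sqlite3

# In-memory database with a hypothetical, minimal twin hierarchy.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE twin (
    id        TEXT PRIMARY KEY,
    kind      TEXT NOT NULL,
    parent_id TEXT REFERENCES twin(id)  -- NULL marks the root of the tree
);
INSERT INTO twin VALUES
    ('site1', 'Site',        NULL),
    ('gh1',   'Greenhouse',  'site1'),
    ('c1',    'Compartment', 'gh1'),
    ('s1',    'Sensor',      'c1'),
    ('s2',    'Sensor',      'c1');
""")


def descendants(root_id):
    """Return (id, kind, depth) for every twin below root_id."""
    return conn.execute(
        """
        WITH RECURSIVE subtree(id, kind, depth) AS (
            -- Anchor row: the starting twin itself, at depth 0.
            SELECT id, kind, 0 FROM twin WHERE id = ?
            UNION ALL
            -- Recursive step: children of rows already in the subtree.
            SELECT t.id, t.kind, s.depth + 1
            FROM twin AS t
            JOIN subtree AS s ON t.parent_id = s.id
        )
        SELECT id, kind, depth FROM subtree
        WHERE depth > 0
        ORDER BY depth, id
        """,
        (root_id,),
    ).fetchall()


for row in descendants("gh1"):
    print(row)
```

A query like "everything under this greenhouse" needs no graph engine as long as each object has a single parent. PostgreSQL accepts the same `WITH RECURSIVE` form, while SQL Server supports recursive CTEs without the `RECURSIVE` keyword.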

A traditional relational database was a good candidate for us, and we were all familiar with how to use one. So, "hello" to SQL! However, we were still unsure whether we would need at least some basic graph capabilities, because some of the business requirements were unpredictable and could still turn out to require a graph database. That's why, to be on the safe side, we first switched to MSSQL, which offers graph tables alongside relational ones as a hybrid solution. Choosing SQL would also give us the advantage of being environment-agnostic, and Azure has many managed services for SQL.

This was the time to abandon our beloved graph database work (rest in peace!), switch to MSSQL, and start over. We successfully implemented our work and created relational tables according to our domain requirements. We also added graph tables to be able to query graph-like data. The first place we needed to try this solution was at the edge. Some of our gateways were configured for this purpose, and they were already running PostgreSQL instances for some of the existing modules. So, this would be the second database inside the gateway. Guess what happened? We exceeded the resource limits. For the sake of having some graph capabilities, we were putting the gateways at risk. Additionally, MSSQL required more resources than PostgreSQL just to start up on a machine.

This was the time to say a real goodbye to graph capabilities altogether. We could not afford to keep them merely because they might be used sometime in the future, or to follow a hype trend that did not fit us at all. Now, we have a pure relational structure in our database and are free to use any SQL database, which for the edge gateways is definitely PostgreSQL. The metrics are looking good, and the simplicity is really promising. We already meet all the business requirements from all domains.

Lessons learnt…

Of course, the graph approach, semantics, and ontologies are really important when it comes to device twins. If you have an unlimited number of domains, have entangled queries, and are unable to foresee the future, graph databases are a viable candidate. However, from our perspective, everything could be designed from its basics; even in the real world, our objects sit in good harmony with simplicity. So, staying away from the hype and preferring simplicity in our digital twins really did the trick for us. We lost some time, but at least it was dedicated to research, and no real work or development had been started on top of it. Otherwise, reverting would have cost us even more.

Sorry for leaving, but we need to enjoy the simplicity for a while with some drinks. Cheers!