Flexify.IO: Building Multi-Cloud Storage with Large-Scale Data Migrations

Flexify.IO
May 22, 2021

In an exclusive interview, Flexify.IO Founder Sergey Kandaurov explains how a startup grew into a powerful multi-cloud solution in three years.

1) How did you come up with the idea to create Flexify.IO?

I worked as a director of product management for a large software company, and thus knew the industry and the trends well. At some point, it became obvious that cloud is the next big thing, and that there will be many competing cloud providers. And once cloud storage commoditizes, data owners will want to move their data and applications from Amazon to other cloud providers, and between them. And that will be challenging.

So, together with good friends from university, I decided to build a product that would help data owners manage and move their data around in a multi-cloud world.

2) Why did you choose the data migration process as the main specialization of your company?

We go where the money is.

Flexify’s focus is on helping data owners take advantage of multiple clouds, and of the competition between them, by building a storage system on top of multiple cloud providers: multi-cloud storage. This requires a virtualization layer on top of cloud storage, API translation to support each cloud’s unique storage API, and the ability to move data between clouds for data migration.

At the moment, true multi-cloud is still in its infancy for a number of reasons, but competitive solutions already exist. The whole idea requires moving petabytes of data from one cloud to another, and that is what we do best.

Now, most customers only want to migrate their data to a new cloud once or twice. However, in the future, we will see more of them looking for true multi-cloud deployment, where they would not need to choose a single cloud to store their data, but could rely on multiple cloud providers at the same time.

3) How did Flexify.IO develop a fast, affordable way to transfer large amounts of data?

We defined our priorities. First, we decided to focus exclusively on object storage. File storage, block storage, and structured data are not natively supported by Flexify. This allowed us to take full advantage of the simplicity and scalability of object storage in a cloud-native design.

We also realized that extra-large migrations differ greatly from smaller ones, which can be done manually or with simple scripts. When you are migrating petabytes of data or billions of objects, even a very low probability of error per object adds up and becomes significant. That is why we designed the system from the ground up to be reliable, to always calculate and verify checksums, and to properly handle different kinds of rare errors.
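To make the "it adds up" point concrete, here is a rough sketch. The per-object error rate of one in a billion is an illustrative assumption, not a figure from Flexify.IO, and the `verify_copy` helper and the choice of SHA-256 are hypothetical stand-ins for whatever checksum the storage APIs involved actually provide.

```python
import hashlib
import math


def expected_failures(num_objects: int, p_error: float) -> float:
    """Expected number of objects hit by an error during a migration."""
    return num_objects * p_error


def prob_at_least_one_failure(num_objects: int, p_error: float) -> float:
    """1 - (1 - p)^n, computed in a numerically stable way for tiny p."""
    return -math.expm1(num_objects * math.log1p(-p_error))


def checksum(data: bytes) -> str:
    """Hex digest of the object body (SHA-256 chosen only for illustration)."""
    return hashlib.sha256(data).hexdigest()


def verify_copy(source: bytes, destination: bytes) -> bool:
    """Return True only if the copied bytes match the original exactly."""
    return checksum(source) == checksum(destination)


if __name__ == "__main__":
    n = 2_000_000_000   # objects in a very large bucket
    p = 1e-9            # assumed per-object error probability
    print(expected_failures(n, p))
    print(prob_at_least_one_failure(n, p))
```

Under these assumed numbers, a migration of two billion objects should expect about two damaged objects, which is why verifying every object rather than sampling is a reasonable design choice at this scale.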

Sending such volumes between clouds also requires very high speeds, in the tens or hundreds of Gbps. This cannot practically be achieved by a single machine, especially when you want to do all the checksum calculations. So, we invented an algorithm to distribute those billions of objects between a number of stateless engines, each independently doing its part of the migration. This allowed us to achieve the full benefits of a cloud-native design.
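The interview does not describe Flexify.IO's actual algorithm. As a minimal sketch of one common way to split work among stateless workers, the example below assigns each object key to an engine by hashing the key. The function names and the use of SHA-256 are assumptions made purely for illustration.

```python
import hashlib
from collections import defaultdict
from typing import Dict, Iterable, List


def engine_for_key(key: str, num_engines: int) -> int:
    """Map an object key to an engine index using a stable hash.

    A cryptographic hash is used instead of Python's built-in hash(),
    which is randomized per process and so would not be stable across engines.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_engines


def partition(keys: Iterable[str], num_engines: int) -> Dict[int, List[str]]:
    """Group keys by the engine responsible for migrating them."""
    shards: Dict[int, List[str]] = defaultdict(list)
    for key in keys:
        shards[engine_for_key(key, num_engines)].append(key)
    return dict(shards)


if __name__ == "__main__":
    keys = [f"photos/2021/{i:06d}.jpg" for i in range(10)]
    for engine, assigned in sorted(partition(keys, 3).items()):
        print(engine, assigned)
```

Because the assignment depends only on the key and the number of engines, each engine in this sketch can work out its own share without coordinating with the others. A production system would also need to handle engines joining or failing mid-migration, which this example does not attempt.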

4) Who was the first client of Flexify.IO? How did you attract them?

Our first customer migrated 60 GB from Amazon S3 to Wasabi in 2018 using our free tier. I still do not know how they found us. Most likely, it was through organic search.

The first significant paid migration was for a large convenience store chain at the end of 2018. I am sure you know them, but we cannot disclose the name without the customer’s permission.

Also, around that same time, a large Korean conglomerate reached out to us looking for a solution to move from Amazon S3 to Azure, including both migration and API translation. They ran extensive tests of our technology to confirm speed and reliability.

5) How does Flexify.IO enable data migration from large vendors, such as Amazon S3, Azure, or Google Cloud Storage?

Flexify.IO is a multi-cloud solution in itself. We integrate all major clouds to offer efficient migrations between them, or from them to smaller dedicated cloud providers like Backblaze or Wasabi.

Running our engines dynamically at strategically chosen locations within different cloud providers allows us to optimize individual migrations, either for performance or costs, sometimes for both. It also helps our customers better plan their migrations, because we can read their data locally, so they would not need to pay egress traffic fees to a provider separately.

And, of course, we support all major object storage APIs. For example, for data in AWS, Flexify.IO uses the S3 API; for data in Microsoft Azure, the Azure Blob Storage API; for data in Alibaba Cloud, their OSS API; and so on. All of this is completely transparent to the end user.

6) What is the largest amount of data migration done by Flexify.IO?

We have customers migrating multiple petabytes and tens of billions of objects with Flexify.IO. The largest single migration was around 1.5 PB and 2 billion objects. Just listing a bucket of that size can take days.
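A back-of-the-envelope estimate shows why listing alone can take days. The S3 ListObjectsV2 call returns at most 1,000 keys per response; the 0.1 second round-trip time per call is an assumed figure for illustration, and the function name is hypothetical.

```python
import math

MAX_KEYS_PER_LIST_CALL = 1000  # upper bound per S3 ListObjectsV2 response
SECONDS_PER_DAY = 86_400


def sequential_listing_days(
    num_objects: int,
    seconds_per_call: float,
    keys_per_call: int = MAX_KEYS_PER_LIST_CALL,
) -> float:
    """Estimate days needed to list a bucket one page at a time."""
    calls = math.ceil(num_objects / keys_per_call)
    return calls * seconds_per_call / SECONDS_PER_DAY


if __name__ == "__main__":
    # 2 billion objects, assumed 0.1 s per list request
    print(round(sequential_listing_days(2_000_000_000, 0.1), 2))
```

With these assumptions, 2 billion objects need 2 million list requests, or roughly 2.3 days if the pages are fetched strictly one after another.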

7) What are the benefits that Flexify.IO offers to its users?

We give data owners the flexibility to choose where to store their data and to move it between storage systems at will. Everything is designed and tested for extremely large datasets.

This includes migration that is fast, efficient, reliable, and secure; transparent API translation that allows S3-compatible applications to work with data in Azure or Alibaba; and storage virtualization that combines multiple public or private cloud storage systems into a single virtual namespace. All of it is completely transparent from an application standpoint.

Such virtualization is also important when migrating actively changing data, since it effectively makes migrations transparent and eliminates downtime that would be otherwise necessary to synchronize the changes.

8) How does Flexify.IO manage to transfer large amounts of data to users much cheaper than other providers?

Indeed, the main factor preventing data owners from moving more data around is the extraordinarily high fees for egress traffic from major clouds. In the beginning, those costs had to be either paid by the data owner or included in Flexify.IO’s rates.

We have since established a dedicated infrastructure that allows us to minimize migration costs. In fact, when migrating data with Flexify, you would pay less than if you had sent the data from one cloud to another via the internet. This increased the demand for migration a lot. Traffic fees are indeed a major lock-in factor.

9) What are the prospects for the development of multi-cloud storage solutions that you foresee in the coming years?

Multi-cloud is a hot trend and will only grow hotter in the coming years. Currently, the adoption of true multi-cloud storage is slowed by cost-prohibitive egress traffic fees, but there are initiatives like the Bandwidth Alliance aiming to significantly reduce or eliminate those fees.

Competition forces cloud providers to open up for interoperability in the multi-cloud world, since multi-cloud storage is going to be a must-have, and Flexify has the best technology to build it for users.

10) What are the goals that Flexify.IO aims to achieve in 2021?

We plan to add support for smaller cloud providers, including a remarkably interesting case of storing unstructured data in blockchain, as well as to extend cooperation with our existing partners, helping them get more business.

Feature-wise, we are seriously looking at implementing near-real-time continuous synchronization, as there is growing demand from our customers to keep two or more storage systems in sync.


Flexify.IO

Flexify.IO is the world’s first cloud storage virtualization and migration solution. We help businesses build cloud-agnostic solutions by simplifying migration