The cloud has allowed data teams to collect vast quantities of data and store it at reasonable cost, opening the door to new analytics use cases that leverage data lakes, data mesh, and other modern architectures. But for very large volumes of data, generic cloud storage also presents challenges and limitations in how that data can be accessed, managed, and used.
Typical cloud blob storage systems lack the information needed to show how files relate to one another or how they correspond to a table, which makes the job of query engines that much harder. Files by themselves also do not make it easy to change a table's schema or to “time travel” over it, that is, to query the table as it existed at an earlier point. Each query engine must maintain its own view of how to query the files. All of a sudden, what seemed like an easy-to-implement data architecture becomes more difficult than expected.
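To make the gap concrete, here is a minimal sketch in Python of the kind of bookkeeping a flat listing of objects does not provide: which files make up a table, under which schema, at which point in its history. Everything in it is an assumption made for illustration (the object keys, the `Snapshot` and `TableLog` classes, and their methods are hypothetical); it does not describe any particular storage service or table format.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Sequence, Tuple

# Hypothetical object keys, as a blob store would list them: names only,
# with nothing that says which files form a table or what columns they hold.
OBJECT_KEYS = [
    "sales/part-0001.parquet",
    "sales/part-0002.parquet",
    "sales/part-0003.parquet",
]


@dataclass(frozen=True)
class Snapshot:
    """One immutable version of a table: its schema and its data files."""
    snapshot_id: int
    schema: Tuple[str, ...]
    files: Tuple[str, ...]


@dataclass
class TableLog:
    """Illustrative append-only history of table versions (hypothetical)."""
    snapshots: List[Snapshot] = field(default_factory=list)

    def commit(self, schema: Sequence[str], files: Sequence[str]) -> Snapshot:
        # Each commit records the full schema and file list for that version.
        snap = Snapshot(len(self.snapshots) + 1, tuple(schema), tuple(files))
        self.snapshots.append(snap)
        return snap

    def as_of(self, snapshot_id: Optional[int] = None) -> Snapshot:
        # No id means "latest"; an id means "time travel" to that version.
        if not self.snapshots:
            raise LookupError("table has no snapshots")
        if snapshot_id is None:
            return self.snapshots[-1]
        for snap in self.snapshots:
            if snap.snapshot_id == snapshot_id:
                return snap
        raise LookupError(f"unknown snapshot {snapshot_id}")


log = TableLog()
log.commit(["id", "amount"], OBJECT_KEYS[:2])        # version 1
log.commit(["id", "amount", "region"], OBJECT_KEYS)  # version 2 adds a column

earlier = log.as_of(1)
latest = log.as_of()
```

With a shared record like this, any reader can resolve the same file list and schema for a given version. Without it, each query engine has to reconstruct that view from object names alone.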
InfoWorld Cloud Computing