5 things on our data and AI radar for 2021

By mullaned2002

June 4, 2021

Here are some of the most significant themes we see as we look toward 2021. Some of these are emerging topics and others are developments on existing concepts, but all of them will inform our thinking in the coming year.

MLOps FTW

MLOps attempts to bridge the gap between machine learning (ML) applications and the CI/CD pipelines that have become standard practice. ML presents a problem for CI/CD for several reasons: the data that powers ML applications is as important as the code, making version control difficult; outputs are probabilistic rather than deterministic, making testing difficult; and training a model is processor-intensive and time-consuming, making rapid build/deploy cycles difficult. None of these problems is unsolvable, but developing solutions will require substantial effort over the coming years.
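
Two of these difficulties can be illustrated with a minimal Python sketch that uses only the standard library. It is not a description of any particular MLOps tool: `dataset_fingerprint`, `noisy_model`, and the 0.5 error threshold are hypothetical names and values chosen for illustration. The sketch fingerprints a dataset so that a data change is as visible as a code change, and it tests a noisy model against a tolerance on an aggregate metric rather than against exact outputs.

```python
import hashlib
import random
import statistics


def dataset_fingerprint(rows):
    """Return a SHA-256 hex digest that changes whenever the data changes."""
    digest = hashlib.sha256()
    for row in rows:
        digest.update(repr(row).encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()


def noisy_model(x, rng):
    """Hypothetical stand-in for a trained model: 2x plus Gaussian noise."""
    return 2.0 * x + rng.gauss(0.0, 0.1)


def mean_abs_error(rows, rng):
    """Average absolute error of the model over (input, target) pairs."""
    return statistics.mean(abs(noisy_model(x, rng) - y) for x, y in rows)


# Toy dataset of (input, target) pairs; the target is exactly 2x.
rows = [(float(x), 2.0 * x) for x in range(100)]

# Record this digest next to the code revision so the data is versioned too.
fingerprint = dataset_fingerprint(rows)

# Seed the random source so the check is repeatable, then assert on a
# tolerance instead of on exact predictions.
rng = random.Random(0)
error = mean_abs_error(rows, rng)
assert error < 0.5, "model quality regressed beyond the allowed tolerance"
```

A pipeline could store the fingerprint alongside the commit hash and fail the build when the error check does not pass. The third difficulty, the cost of training, is not addressed by this sketch.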

The Time Is Now to Adopt Responsible Machine Learning

The era in which tech companies had a regulatory “free ride” has come to an end. Data use is no longer a “wild west” in which anything goes; there are legal and reputational consequences for using data improperly. Responsible Machine Learning (ML) is a movement to make AI systems accountable for the results they produce. Responsible ML includes explainable AI (systems that can explain why a decision was made), human-centered machine learning, regulatory compliance, ethics, interpretability, fairness, and building secure AI. Until now, corporate adoption of responsible ML has been lukewarm and reactive at best. In the next year, increased regulation (such as GDPR and CCPA), antitrust action, and other legal pressures will force companies to adopt responsible ML practices.

The Right Solution for Your Data: Cloud Data Lakes and Data Lakehouses

Data lakes, and cloud data lakes in particular, have experienced a robust resurgence over the last few years. With more businesses migrating their data infrastructure to the cloud, and a growing number of open source projects driving innovation in cloud data lakes, these will remain on the radar in 2021. Similarly, the data lakehouse, an architecture that features attributes of both the data lake and the data warehouse, gained traction in 2020 and will continue to grow in prominence in 2021. Cloud data warehouse engineering is becoming a particular focus as database solutions increasingly move to the cloud.

A Wave of Cloud-Native, Distributed Data Frameworks

Data science grew up with Hadoop and its vast ecosystem. Hadoop is now last decade’s news, and momentum has shifted to Spark, which now dominates the way Hadoop used to. But there are new challengers out there. New distributed computing frameworks like Ray and Dask are more flexible, and are cloud-native: they make it very simple to move workloads to the cloud. Both are seeing strong growth. What’s the next platform on the horizon? We’ll see in the coming year.

Natural Language Processing Advances Significantly

This year, the biggest story in AI was GPT-3, and its ability to generate almost human-sounding prose. What will that lead to in 2021? There are many possibilities, ranging from interactive assistants and automated customer service to automated fake news. Looking at GPT-3 more closely, here are the questions you should be asking.

GPT-3 is being delivered via an API, not by incorporating the model directly into applications. Is “Language-as-a-service” the future?

GPT-3 is great at creating English text, but has no concept of common sense or even facts; for example, it has recommended suicide as a cure for depression. Can more sophisticated language models overcome those limitations?

GPT-3 reflects the biases and prejudices that are built into languages. How are those to be overcome, and is that the responsibility of the model or of the application developers?

GPT-3 is the most exciting development to appear during the last year; in 2021, our attention will remain focused on it and its successors. We can’t help but be excited (and maybe a little scared) by GPT-4.

