Gradient Flow Trend 2023 Report Final
“Data, Machine Learning, and AI: 2023 Opportunities and Trends” is an annual comprehensive
look at emerging developments in data infrastructure and engineering, machine learning (ML),
and artificial intelligence. The report is divided into 10 sections, each focusing on a different
aspect of AI and ML.
Section IV: Increased efforts to democratize machine learning and make it more
accessible to non-experts
Section V: Data processing and data management tools for unstructured data
including text, visual data, speech and audio
Overall, this report provides a comprehensive overview of the latest developments and
trends in the field of artificial intelligence and machine learning, and offers insights into the
opportunities and challenges that lie ahead in 2023 and beyond.
Section I: From research to real-world applications
Discriminative model: Focus on learning boundaries between classes
Generative model: Focus on learning classes
For more, see this section on the generative model page on Wikipedia.
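To make the distinction concrete, here is a minimal sketch on made-up one-dimensional data. It is an illustration only, not taken from the report: the data, the function names, and the choice of a per-class Gaussian (generative) versus logistic regression (discriminative) are all assumptions. The generative side estimates how each class is distributed and can therefore sample new points; the discriminative side learns only where the boundary lies.

```python
import math
import random

# Toy 1-D data (made-up): class 0 clusters near 0.0, class 1 near 4.0.
random.seed(0)
data = [(random.gauss(0.0, 1.0), 0) for _ in range(200)]
data += [(random.gauss(4.0, 1.0), 1) for _ in range(200)]


def fit_generative(samples):
    """Model each class: a Gaussian for p(x | y) plus the class prior p(y)."""
    params = {}
    for label in (0, 1):
        xs = [x for x, y in samples if y == label]
        mean = sum(xs) / len(xs)
        var = sum((x - mean) ** 2 for x in xs) / len(xs)
        params[label] = (mean, var, len(xs) / len(samples))
    return params


def predict_generative(params, x):
    """Classify with Bayes' rule: maximize log p(x | y) + log p(y)."""
    def log_joint(label):
        mean, var, prior = params[label]
        log_lik = -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)
        return log_lik + math.log(prior)
    return max((0, 1), key=log_joint)


def sample_generative(params, label):
    """Because the class itself is modeled, new examples can be drawn from it."""
    mean, var, _ = params[label]
    return random.gauss(mean, math.sqrt(var))


def fit_discriminative(samples, lr=0.1, epochs=1000):
    """Model only the boundary: p(y=1 | x) = sigmoid(w * x + b), via gradient descent."""
    w, b = 0.0, 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in samples:
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            grad_w += (p - y) * x
            grad_b += p - y
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def predict_discriminative(model, x):
    w, b = model
    return 1 if w * x + b > 0 else 0


gen = fit_generative(data)
disc = fit_discriminative(data)
for x in (-1.0, 5.0):
    print(x, predict_generative(gen, x), predict_discriminative(disc, x))
print("sampled class-1 point:", sample_generative(gen, 1))
```

Both models can classify, but only the generative one has a `sample_generative` counterpart, which is the property that generative AI applications build on.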
Generative AI in the real world
Section II: Tools for understanding, testing, and evaluating large (language) models
As general-purpose models become more prevalent, there’s a growing need for tools to
help developers select models appropriate for their use case and, more importantly, to
help them understand the limitations of these models. Along those lines, the startup Hugging
Face recently released low-code tools, which make it simple to assess the performance of
a set of models along an axis such as FLOPS and model size, and to assess how well a set of
models performs in comparison to another.
Researchers at Stanford’s Center for Research on Foundation Models just unveiled the results
of a study that evaluated the strengths and weaknesses of 30 well-known large language
models. In the process, they developed a new benchmarking framework, Holistic Evaluation
of Language Models (HELM), which can be described as follows:
• They organize the space of scenarios (use cases) and metrics (desiderata).
• They then select a subset of scenarios and metrics based on societal relevance
(e.g., user-facing applications), coverage (e.g., different English dialects/varieties),
and feasibility (i.e., amount of compute).
More broadly, we expect more tools for testing models prior to release:
• Why Meta’s latest large language model survived only three days online
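The two HELM steps described in this section (organizing scenarios and metrics, then selecting a subset by relevance, coverage, and feasibility) can be sketched roughly as follows. This is a hypothetical illustration, not HELM's actual code or API: the `Scenario` fields, the scenario names, the metric names, the compute costs, and the greedy selection rule are all assumptions made for the example.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Scenario:
    name: str
    user_facing: bool   # assumed proxy for societal relevance
    variety: str        # assumed proxy for coverage (which English variety)
    compute_cost: int   # assumed proxy for feasibility, in arbitrary units


# Step 1: organize the space of scenarios (use cases) and metrics (desiderata).
# All entries below are placeholders.
SCENARIOS = [
    Scenario("question_answering", True, "american_english", 3),
    Scenario("summarization", True, "american_english", 5),
    Scenario("dialect_qa", True, "indian_english", 4),
    Scenario("synthetic_reasoning", False, "american_english", 2),
]
METRICS = ["accuracy", "robustness", "efficiency"]


def select_scenarios(scenarios, budget):
    """Step 2: greedily pick scenarios that are relevant, add coverage, and fit the budget."""
    chosen, covered, remaining = [], set(), list(scenarios)
    while remaining:
        affordable = [s for s in remaining if s.compute_cost <= budget]
        if not affordable:
            break
        # Prefer user-facing scenarios, then unseen varieties, then cheaper ones.
        best = min(
            affordable,
            key=lambda s: (not s.user_facing, s.variety in covered, s.compute_cost),
        )
        chosen.append(best)
        covered.add(best.variety)
        budget -= best.compute_cost
        remaining.remove(best)
    return chosen


chosen = select_scenarios(SCENARIOS, budget=8)
# Every selected scenario is paired with every metric to form the evaluation grid.
grid = list(product([s.name for s in chosen], METRICS))
for scenario_name, metric in grid:
    print(scenario_name, metric)
```

The point of the sketch is the shape of the process: a fixed taxonomy first, then an explicit and inspectable rule for which scenario and metric pairs are actually run.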
Section III: Training *and* maintaining models puts the focus on efficiency and sustainability
As models (for speech, vision, and text) get more widely deployed
and used, the seemingly positive correlation between model size
and accuracy has prompted research into less resource-intensive
methods that can produce comparable results. These research
initiatives are beginning to inspire real-world deployments.
Sustainable AI
Organizations like Allen AI and Meta are
devoting resources to green/sustainable AI, a
collection of tools and processes that explores
the environmental impact of AI from a holistic
perspective. The goal is to develop and deploy
AI systems that yield novel results while
considering computational and environmental
costs, thereby reducing resource usage.
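One common way to put a number on computational and environmental cost is a back-of-the-envelope estimate: energy is hardware power times runtime times a data-center overhead factor (PUE), and emissions are energy times the carbon intensity of the local grid. The sketch below assumes that simple model; the function name and every input value are hypothetical, not figures from the report or from any named organization.

```python
def training_footprint(num_gpus, avg_power_watts, hours, pue, grid_kg_co2_per_kwh):
    """Return (energy in kWh, emissions in kg CO2e) for a training run.

    Assumes constant average power draw per GPU and a single grid intensity.
    """
    energy_kwh = num_gpus * avg_power_watts / 1000.0 * hours * pue
    emissions_kg = energy_kwh * grid_kg_co2_per_kwh
    return energy_kwh, emissions_kg


# Hypothetical run: 8 GPUs at 300 W average for 24 hours,
# PUE of 1.5, grid intensity of 0.4 kg CO2e per kWh.
energy, co2 = training_footprint(8, 300, 24, 1.5, 0.4)
print(f"{energy:.1f} kWh, {co2:.2f} kg CO2e")
```

Even a rough estimate like this makes it possible to compare two training recipes on cost as well as accuracy.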
Section IV: Increased efforts to democratize machine learning and make it more accessible to non-experts
Section V: Tools for unstructured data
Section VI: Renewed focus on streaming (and data integration)
Section VII: Data engineers are focusing more on operational tasks
The rise of cloud warehouses and lakehouses means data engineers and data platform teams can get more done—at scale—
compared to a few years ago, when teams had to piece together and manage a variety of tools. Their role is shifting from
infrastructure development to operational tasks.
Section VIII: AI and data teams will be caught flat-footed by the coming wave of regulations
Section IX: Applications will continue to lead to more pegacorns
Independent: Privately held standalone company and/or was recently acquired by a larger public
company within the past year and operates as a standalone company
ML: Has machine learning as a key component of their product offering
Section X: Other trends to watch
Data and cybersecurity
The emergence of Twitter alternatives