What to Expect in 2024

Increasingly Capable Foundation Models

In the coming year, we expect notable advancements in generative AI foundation models to continue. Models like Claude 3 have demonstrated improved performance on various benchmarks, such as scoring 86.8% on the MMLU dataset and 95.0% on the GSM8K math problem set, indicating enhanced capabilities in reasoning and problem-solving. We also expect to see the emergence of more sophisticated multimodal models that can seamlessly integrate and generate content across various modalities, including text, images, audio, and video as both inputs and outputs.

Scaling Improvements

As researchers continue to refine these models, we can also anticipate improvements in accuracy and reduced latency, making models more reliable and efficient. The size of these foundation models is also likely to grow, allowing them to capture and leverage even more knowledge and nuance from the vast amounts of data they are trained on.

Expert Insight Will Power Performance

Human experts will play an increasingly crucial role in model advancements and evaluation. As models start to exhaust the corpus of general information widely available on the internet, they will require additional data to improve their capabilities. While some organizations may look to replace human-generated data with synthetic data for training, models reliant on synthetic data can be susceptible to model collapse. A hybrid human and synthetic data approach can mitigate biases from synthetic data and still reflect nuanced human preferences. The domain-specific knowledge of experts allows them to provide data that captures the nuance, complexity, and diversity needed to supplement model training. Experts are also critical for testing and evaluation alongside reinforcement learning from human feedback, with the knowledge to identify subtle errors, inconsistencies, or biases and to provide reliable guidance toward preferred model outputs.

While experts are necessary to improve model capabilities, we anticipate organizations defining new roles centered around generative AI. Prompt engineers, machine learning researchers, and generative AI experts will collaborate with subject matter experts to ensure AI initiatives are successful. Generative AI will fundamentally change the nature of work.

Evolving Proof-of-Concepts to Production Deployments

Improvements in model performance and capabilities will motivate leaders to iterate quickly from proof-of-concepts to pilots to production deployments. More user-friendly retrieval-augmented generation (RAG) and fine-tuning solutions will emerge as on-ramps to improve adoption so that organizations can more easily customize models. As start-up costs taper, model effectiveness improves, and more robust evaluation strategies emerge, organizations will be able to more clearly capture and define return on investment.

Increasing Emphasis on Test & Evaluation Practices

Nearly every major model release displaces a different leading model on various benchmarks. Enterprises will want to create their own evaluation methodology, consisting of industry benchmarks, automated model metrics, and measures of return on investment, to continuously evaluate their preferred model. As model capabilities grow, model builders will place more importance on guardrails, steerability, safety, security, and transparency. Public sector institutions must now consider the White House's OMB policy and test and evaluate AI systems to ensure that AI is safe.

[Figure: "Evolution of generative AI capabilities: domain and functional capabilities are rapidly growing." Example capabilities shown by domain: Math (multivariate calculus, applying the gradient theorem); Creative Writing (metaphorical stories, lyrical sonnets); Coding (debugging, code optimization); Science (biology, genetic expression).]
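The RAG on-ramp mentioned in the discussion of production deployments above can be sketched in a few lines. This is a minimal illustrative toy, not any vendor's implementation: the corpus, the bag-of-words retriever, and the `build_prompt` helper are all assumptions for demonstration. A production system would use dense embeddings, a vector database, and an actual model call in place of the final `print`.

```python
"""Toy sketch of retrieval-augmented generation (RAG), stdlib only."""
from collections import Counter
import math

# Hypothetical document store; a real deployment would index enterprise data.
CORPUS = [
    "The agency's travel policy requires approval for trips over $500.",
    "Fine-tuning adapts a base model's weights to domain data.",
    "RAG injects retrieved documents into the prompt at query time.",
]

def _vec(text):
    """Bag-of-words term counts (a stand-in for an embedding)."""
    return Counter(text.lower().split())

def _cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, corpus, k=1):
    """Rank corpus passages by similarity to the query; return the top k."""
    qv = _vec(query)
    ranked = sorted(corpus, key=lambda doc: _cosine(qv, _vec(doc)), reverse=True)
    return ranked[:k]

def build_prompt(query, corpus):
    """Assemble a grounded prompt; a real system would send this to a model."""
    context = "\n".join(retrieve(query, corpus, k=1))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(build_prompt("How does RAG use retrieved documents?", CORPUS))
```

The design point the sketch illustrates is why RAG is an "on-ramp": the base model is untouched, and customization lives entirely in the document store and prompt assembly, which is cheaper to iterate on than fine-tuning.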

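The continuous-evaluation methodology described in the Test & Evaluation section above can be sketched as a small harness that scores a candidate model against a fixed benchmark. Everything here is a hypothetical stand-in: `toy_model` plays the role of a real model API call, the three-item `BENCHMARK` stands in for an industry benchmark, and exact-match accuracy stands in for a fuller suite of automated metrics.

```python
"""Toy sketch of a continuous-evaluation harness for a preferred model."""

def exact_match_accuracy(model, benchmark):
    """Fraction of (prompt, expected) pairs the model answers exactly."""
    hits = sum(1 for prompt, expected in benchmark
               if model(prompt).strip() == expected)
    return hits / len(benchmark)

def toy_model(prompt):
    """Hypothetical stand-in for an API call to a candidate model."""
    canned = {"2 + 2 = ?": "4", "Capital of France?": "Paris"}
    return canned.get(prompt, "unknown")

# Placeholder benchmark; an enterprise would track several such suites
# plus cost and latency metrics per model release.
BENCHMARK = [
    ("2 + 2 = ?", "4"),
    ("Capital of France?", "Paris"),
    ("5 * 3 = ?", "15"),
]

score = exact_match_accuracy(toy_model, BENCHMARK)
print(f"exact-match accuracy: {score:.2f}")  # prints "exact-match accuracy: 0.67"
```

Rerunning the same harness against each new model release is what lets an enterprise notice when a leading model has been displaced on the benchmarks it actually cares about, rather than relying on vendor-reported scores.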