Table of Contents

AI Year in Review
Apply AI
Build AI
Evaluate AI
Conclusion
Methodology

Introduction

The hype for generative AI has reached its peak. Developers continue to push the limits, exploring new frontiers with increasingly sophisticated models. At the same time, without a standardized blueprint, enterprises and governments are grappling with the risks vs. rewards that come with adopting AI. That’s why in our third edition of Scale Zeitgeist: AI Readiness Report, we focused on what it takes to transition from merely adopting AI to actively optimizing and evaluating it. To understand the state of AI development and adoption today, we surveyed more than 1,800 ML practitioners and leaders directly involved in building or applying AI solutions and interviewed dozens more. In other words, we removed responses from business leaders or executives who are not equipped to know or understand the challenges of AI adoption first-hand.

Our findings show that of the 60% of respondents who have not yet adopted AI, security concerns and lack of expertise were the top two reasons holding them back. This finding seems to validate the “AI safety” narrative that dominates today’s news. Among survey respondents who have adopted AI, many feel they lack the appropriate benchmarks to effectively evaluate models. Specifically, 48% of respondents referenced lacking security benchmarks, and 50% desired industry-specific benchmarks. Additionally, while 79% of respondents cited improving operational efficiency as the key reason for adopting AI, only half are measuring the business impact of their AI initiatives. And while performance and reliability (each at 69%) were indicated as the top reasons for evaluating models, safety ranked lower (55%), running counter to popular narratives.

This report presents expert insights from Scale and its partners across the ecosystem, including frontier AI companies, enterprises, and governments. Whether you are developing your own models (building AI), leveraging existing foundation models (applying AI), or testing models (evaluating AI), there are actionable insights and best practices for everyone.