Change the World 2025
The promise of generative AI comes with a major drawback: the cost and energy demands of running large language models. Multiverse Computing has developed a compression algorithm, CompactifAI, that sharply reduces the energy and data requirements of running an LLM with only a 2% to 3% loss in accuracy. The company’s compressed Slim models, based on versions of Meta’s Llama, were 84% more energy efficient and 40% faster at inference than the Llama originals, and cut compute costs by 50%. Other businesses are intrigued: the company now counts more than 120 customers across defense, finance, manufacturing, and other industries.
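The trade-off described above, shrinking a model dramatically while giving up only a little accuracy, is the core idea behind most LLM compression work. CompactifAI itself is proprietary, so the sketch below is not Multiverse Computing's method; it illustrates the general principle with one well-known technique, truncated low-rank factorization of a weight matrix, using NumPy and a toy matrix.

```python
import numpy as np

# Illustrative sketch only: this is NOT CompactifAI, which is proprietary.
# It shows the general idea behind LLM weight compression: replace a dense
# weight matrix with smaller factors, accepting a small reconstruction
# error in exchange for a large reduction in parameters.

rng = np.random.default_rng(0)

# A toy "weight matrix" that is approximately low-rank, mimicking the
# redundancy in real neural-network weights that compression exploits.
U = rng.standard_normal((512, 64))
V = rng.standard_normal((64, 512))
W = U @ V + 0.01 * rng.standard_normal((512, 512))

def compress(W, rank):
    """Truncated SVD: keep only the top-`rank` singular components."""
    u, s, vt = np.linalg.svd(W, full_matrices=False)
    return u[:, :rank] * s[:rank], vt[:rank]  # two thin factor matrices

A, B = compress(W, rank=64)
W_approx = A @ B

params_before = W.size
params_after = A.size + B.size
rel_error = np.linalg.norm(W - W_approx) / np.linalg.norm(W)

print(f"parameters: {params_before} -> {params_after} "
      f"({100 * (1 - params_after / params_before):.0f}% smaller)")
print(f"relative reconstruction error: {rel_error:.4f}")
```

Fewer parameters mean fewer memory reads and multiply-accumulate operations per token, which is where the energy and inference-speed gains in compressed models come from; the small reconstruction error is the analogue of the reported 2% to 3% accuracy loss.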