GSI Technology

Generative AI Overview

GSI’s in-memory computing APU technology provides flexible bit processing capable of rapidly searching billions of documents to retrieve relevant information within milliseconds.

It extends this flexibility to large language model (LLM) researchers, enabling them to explore innovative data formats.

The current production GSI APU is well suited to GenAI retrieval functions: it refines broad search with specialized, domain-specific vector databases and supplies LLMs with pertinent data to minimize hallucinations.

The APU delivers these functions with technology that outpaces traditional GPU resources while also being more energy efficient.

The GSI APU’s compute-in-memory design addresses the “AI bottleneck” dataflow problem. The architecture has inherently large memory bandwidth and is a natural fit for 3D memory components that address very large memory requirements, so it can significantly increase performance and memory capacity for next-generation GenAI.

Personalized Answers with Your Data—At a Lower Cost

LLMs struggle with domain-specific knowledge because they were trained for general-purpose text generation.

The APU powers our OpenSearch neural search plugin, which provides a cost-effective way to search billions of items in milliseconds. With the plugin, you can retrieve domain-specific information from your large, private vector database and use it as part of the context for an LLM prompt.

That means you do not need to go through the time-consuming and costly process of fine-tuning an LLM to get it to provide reliable answers for domain-specific information.
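
As a rough illustration of the retrieval step described above, the following minimal Python sketch ranks a handful of stored vectors by cosine similarity to a query vector and returns the closest matches. It is not GSI's plugin or API: the `top_k` helper, the `INDEX` contents, and the three-dimensional toy embeddings are all hypothetical, and a production system would delegate this search to the vector database rather than loop in Python.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def top_k(query_vec, index, k=2):
    """Return the ids of the k entries in `index` most similar to `query_vec`.

    `index` is a list of (doc_id, vector) pairs. This brute-force scan stands
    in for the vector search a real deployment would run inside the database.
    """
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]

# Hypothetical toy embeddings; real ones come from an embedding model.
INDEX = [
    ("warranty-policy", [0.9, 0.1, 0.0]),
    ("holiday-schedule", [0.0, 0.2, 0.9]),
    ("return-process", [0.8, 0.3, 0.1]),
]

results = top_k([1.0, 0.2, 0.0], INDEX, k=2)
print(results)
```

The retrieved document ids would then be used to look up the passages that are passed to the LLM as context.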

Reduced Hallucinations

LLMs are known to “hallucinate,” generating text that appears plausible but is inaccurate.

Through our OpenSearch neural search plugin, the APU provides access to a knowledge base of relevant information that can be added as context to an LLM prompt.

This means that the LLM can base its answers on the relevant information it retrieves using the plugin—greatly reducing hallucinations.
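
One common way to add retrieved information as context is to place the passages ahead of the question and instruct the model to answer only from them. The sketch below shows that prompt-assembly step in Python. The `build_prompt` function, the instruction wording, and the sample passages are illustrative assumptions, not part of GSI's plugin or any specific LLM API.

```python
def build_prompt(question, passages):
    """Assemble an LLM prompt that grounds the answer in retrieved passages.

    `passages` is a list of text snippets returned by a retrieval step.
    The instruction wording here is a hypothetical example.
    """
    # Number each passage so the model (and a reader) can refer back to it.
    context = "\n".join(f"[{i}] {text}" for i, text in enumerate(passages, start=1))
    return (
        "Answer using only the context below. "
        "If the context does not contain the answer, say you do not know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

prompt = build_prompt(
    "How long is the warranty?",
    ["Products carry a two-year warranty.", "Returns are accepted within 30 days."],
)
print(prompt)
```

Telling the model to admit when the context lacks an answer is a design choice aimed at the same goal as the retrieval itself: keeping answers tied to the supplied knowledge base.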

Want to reduce your LLM energy consumption and cost-effectively provide reliable answers for domain-specific topics?

Contact Us to learn how GSI’s APU bit-level processor can help you do that.

Generative AI Products

Leda-E PCIe card	Contact to Purchase
1-8 Leda-E 2U Server	Contact for Configurations
16 Leda-S 1U Server	Contact for Configurations
Hosted Service	Contact for Details