Microsoft has updated its Azure AI Search service to increase storage capacity and vector index size at no additional cost, a move it said will make it more economical for enterprises to run generative AI-based applications.
Formerly known as Azure Cognitive Search, the Azure AI Search service connects external data stores containing unindexed data with an application that sends queries or requests to a search index. It consists of three components: a query engine, indexes, and the indexing engine. It is most commonly used to retrieve information that grounds generative AI applications, a process known as retrieval-augmented generation (RAG).
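For illustration, the retrieval step of a RAG flow against an Azure AI Search index can be driven from the azure-search-documents Python SDK. The sketch below assumes a simple keyword query; the endpoint, index name, API key, and "content" field are placeholders rather than values from the announcement.

```python
# Minimal RAG-style retrieval sketch using the azure-search-documents SDK.
# Endpoint, index name, key, and the "content" field are placeholders.
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

client = SearchClient(
    endpoint="https://<your-service>.search.windows.net",  # placeholder service endpoint
    index_name="docs-index",                                # placeholder index name
    credential=AzureKeyCredential("<api-key>"),             # placeholder query key
)

def retrieve_context(question: str, top: int = 3) -> str:
    """Fetch the top-matching documents to ground a generative model's prompt."""
    results = client.search(search_text=question, top=top)
    return "\n\n".join(doc["content"] for doc in results)   # assumes a 'content' field

# The retrieved text is then stitched into the prompt sent to the generative model.
prompt = (
    "Answer using only the context below.\n\n"
    + retrieve_context("What regions is the service available in?")
)
```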
The free expanded limits will only apply to new services created after April 3, 2024, the company said, adding that there is no way to upgrade existing services, so enterprises will need to create new ones to benefit from the increased capacities.
Compared with services created before that date, new services will get a 3x to 6x increase in total storage per partition and a 5x to 11x increase in vector index size per partition. The additional compute backing the service also supports more vectors at high performance and delivers up to a 2x improvement in indexing and query throughput.
The upgrade reduces the cost per vector by an average of 85% and saves up to 75% in total storage costs, Pablo Castro, an engineer at Azure AI, wrote in a blog post.
The basic tier of the service, according to Castro, will get an additional 13 GB of storage per partition following the update, on top of the 2 GB per partition available before.
The S1, S2, and S3 tiers of the service will get an additional 135 GB, 250 GB, and 500 GB storage per partition respectively.
The L1 and L2 tiers will see no change, the company said.
On vector index size, the basic, S1, S2, and S3 tiers will get an additional 4 GB, 32 GB, 88 GB, and 164 GB of capacity per partition, respectively. Again, the L1 and L2 tiers will see no change.
The updated offering will be available across most US and UK regions, alongside other regions such as Switzerland West, Sweden Central, Poland Central, Norway East, Korea South, Korea Central, Japan East, Japan West, Italy North, Central India, Jio India West, France Central, North Europe, Canada Central, Canada East, Brazil South, East Asia, and Southeast Asia.
More features to optimize vector storage
Apart from updating the storage and vector index sizes, the company is working on bringing more features to optimize vector storage.
These features, which are currently in preview, include quantization and narrow numeric types for vectors, among other tweaks.
Microsoft is using quantization and oversampling to compress and optimize vector data storage, Castro said, adding that this reduces vector index size by 75% and vector storage on disk by up to 25%.
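Conceptually, scalar quantization stores each vector component in a narrower integer range and, at query time, oversamples candidates from the compressed index before rescoring them against the original full-precision vectors. The sketch below is a generic NumPy illustration of that idea, not Azure AI Search's internal implementation.

```python
import numpy as np

# Generic illustration of scalar quantization plus oversampled rescoring;
# not Azure AI Search's internal implementation.

def quantize_int8(vecs: np.ndarray):
    """Linearly map float32 components into the int8 range (roughly 4x smaller)."""
    lo, hi = float(vecs.min()), float(vecs.max())
    scale = (hi - lo) / 255.0 or 1.0
    q = np.clip(np.round((vecs - lo) / scale) - 128, -128, 127).astype(np.int8)
    return q, lo, scale

def dequantize(q: np.ndarray, lo: float, scale: float) -> np.ndarray:
    return (q.astype(np.float32) + 128.0) * scale + lo

def search(query, q_index, lo, scale, originals, k=5, oversampling=4):
    # 1) Cheap approximate scoring against the compressed vectors.
    approx_scores = dequantize(q_index, lo, scale) @ query
    candidates = np.argsort(-approx_scores)[: k * oversampling]
    # 2) Rescore only the oversampled candidates with the original vectors.
    exact_scores = originals[candidates] @ query
    return candidates[np.argsort(-exact_scores)][:k]

rng = np.random.default_rng(0)
docs = rng.standard_normal((1000, 128)).astype(np.float32)
q_index, lo, scale = quantize_int8(docs)
query = rng.standard_normal(128).astype(np.float32)
print(search(query, q_index, lo, scale, docs))
```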
Further, the engineer said that enterprises could use narrow primitive types for vector fields, such as int8, int16, or float16, to reduce vector index size and vector storage on disk by up to 75%.
Other optimization techniques include setting the stored property on vector fields to reduce storage overhead.
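As a rough sketch of how those two options could surface in an index definition, the field below uses a narrow float16 element type and turns the stored property off. The EDM type name (Collection(Edm.Half)), the field and profile names, and the exact property set are assumptions to verify against the current preview documentation, not a definitive payload.

```python
import json

# Hypothetical vector field for a 'Create Index' REST payload. The EDM type
# name for float16 elements and the other names/values shown here are
# assumptions; check them against the current preview API documentation.
vector_field = {
    "name": "contentVector",              # placeholder field name
    "type": "Collection(Edm.Half)",       # float16 elements instead of float32 (assumed type name)
    "dimensions": 1536,
    "vectorSearchProfile": "my-profile",  # placeholder vector search profile
    "searchable": True,
    "stored": False,                      # drop the retrievable on-disk copy of the raw vectors
}

print(json.dumps(vector_field, indent=2))
```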