Amazon Bedrock General Manager: A Rich Mix of Large Models Should Be Offered to Customers

2024-04-25 Source: Sohu Fashion

AsianFin-- Many cloud computing companies are actively engaged in developing their in-house trained foundational large models. This approach is understandable from a business perspective, but it may not be fully accepted by customers as the choice of large models may be restricted.

On one hand, innovation in large models has yet to peak, and the capabilities of different model providers vary. On the other hand, customer scenarios differ, and no single model can serve them all. Customers therefore need more than one or two models to cover their different use cases.

In the past, more than 90% of Amazon Web Services (AWS) products were derived from customer needs. AWS's generative artificial intelligence (AI) strategy also follows this path.

AWS also released its own foundational large model, Amazon Titan, in April 2023. It builds on AWS's accumulated AI expertise in products such as the widely known voice assistant Alexa, the drone delivery service Prime Air, and the cashierless Amazon Go stores, all of which rely heavily on speech, semantic, and visual machine learning technologies.

Atul Deo, General Manager of Amazon Bedrock, pointed out that if AWS doesn't have its own model, it means it must rely entirely on partners. Starting from scratch to build models also provides a "hands-on" approach to solving problems.

As a result, there is an interesting phenomenon. Amazon Bedrock provides the range of capabilities enterprises need to build generative AI applications, simplifying development while ensuring privacy and security. On Amazon Bedrock, customers can find Amazon Titan alongside today's mainstream large models, including models from Anthropic, Stability AI, AI21 Labs, Meta, Cohere, and Mistral AI. The list goes on.

On Tuesday evening, AWS announced multiple feature updates for Amazon Bedrock, which together make it more efficient for customers to develop generative AI applications.

In addition to feature updates, AWS is also adding a range of new models to Amazon Bedrock, including the now generally available Amazon Titan Image Generator for image generation, Meta Llama 3, and a preview version of Amazon Titan Text Embeddings V2. Two models from Cohere, Command R and Command R+, are also set to be released.

In particular, the preview version of Amazon Titan Text Embeddings V2 is optimized for applications that use RAG (retrieval-augmented generation), such as information retrieval, question-answering chatbots, and personalized recommendations. Many enterprises adopt RAG to improve the output of base models by connecting them to knowledge sources, but running these operations can consume a lot of computing and storage resources. Amazon Titan Text Embeddings V2 reduces storage and computing costs while maintaining the accuracy of RAG retrieval results.
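To make the retrieval step and its storage cost concrete, here is a minimal sketch in plain Python. It is not Amazon Bedrock's API and does not describe how Titan Text Embeddings V2 achieves its savings, which the article does not explain. The three-dimensional vectors, the document names, and the 1024 and 256 dimension counts are all illustrative assumptions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vec, index, top_k=2):
    """Return the top_k (doc_id, score) pairs most similar to the query."""
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

def index_size_bytes(num_vectors, dimensions, bytes_per_value=4):
    """Raw storage for float32 vectors, ignoring metadata and index overhead."""
    return num_vectors * dimensions * bytes_per_value

# Hypothetical 3-dimensional embeddings; real models emit far more dimensions.
index = {
    "refund-policy": [0.9, 0.1, 0.0],
    "shipping-times": [0.1, 0.9, 0.1],
    "privacy-notice": [0.0, 0.2, 0.9],
}
# Pretend embedding of the question "How do I get my money back?"
query = [0.8, 0.2, 0.1]

top = retrieve(query, index, top_k=2)
print(top)

# Illustrative dimension counts (assumptions, not figures from the article).
large = index_size_bytes(1_000_000, 1024)
small = index_size_bytes(1_000_000, 256)
print(large, small)
```

In a RAG application, the retrieved documents would then be added to the prompt sent to the base model. The storage function shows why vector size matters: raw index size grows linearly with the number of dimensions, so a quarter of the dimensions means a quarter of the bytes.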

Generative AI requires not only large models but also support from acceleration chips, databases, data analysis, and data security services. From the bottom layer of acceleration chips and storage optimization to the middle layer of model construction tools and services, and finally to the top layer of generative AI-related applications, it can be seen that AWS is attempting to provide an end-to-end technology stack for customers to build generative AI.

On the eve of the release, Atul shared with TMTPost his views on generative AI, technical methodology, and how Amazon Bedrock aids customer success.

The following is the transcript of the dialogue, edited by TMTPost for clarity and brevity:

TMTPost: What different advantages do large companies and small, focused teams have in achieving AI technology innovation and industry empowerment?

Atul: Regarding application deployment for customers, I don't think there are any significant differences between large companies and small businesses; they have many similarities. Both want to try different models. Data hygiene, however, is currently demanding work. For smaller companies, keeping the private data required for model training high quality and consistent is relatively easy. Larger companies hold far more data, which is more varied and more dispersed, so managing it is more difficult. On the other hand, startups can act faster because they are less risk-averse. They don't have an existing customer base like large companies do, so they can afford to make mistakes and improve quickly through trial and error.

TMTPost: What problem does AWS want to address with generative AI?

Atul: We are actively exploring new possibilities. Whether customers want to build models themselves or deeply customize existing models, we hope to build a generative AI stack that gives them rich, first-class tools. In addition to Amazon SageMaker and a wide range of NVIDIA-based instance types, we are also actively developing custom chips for both training and inference to meet more specialized needs.

Through a series of innovations from the bottom layer to the middle layer, our goal is to allow any developer in the enterprise to freely build generative AI applications without worrying about complex machine learning or underlying infrastructure. We firmly believe that the tools provided will be industry-leading and help them achieve innovation breakthroughs in applications.

Currently, we have launched two versions of Amazon Q: Amazon Q Business and Amazon Q Developer. Amazon Q Business aims to equip every employee in the enterprise with a professional consultant so they can quickly get answers and complete tasks efficiently, while Amazon Q Developer focuses on improving the efficiency of developers, providing them with instant answers so they can complete their specific tasks smoothly. This is the ultimate goal of Amazon Q and the direction we are tirelessly pursuing.

TMTPost: How long will it take for AWS to truly change its product and business structure? How will AWS establish its leadership in this field?

Atul: Actually, everything depends on customers and the specific problems we are trying to solve. We have seen tens of thousands of customers using SageMaker to change their customer experiences. Some changes have already happened, while others will take some time. Therefore, there is no fixed answer as to when significant changes can be expected.

For example, the New York Stock Exchange is using Bedrock to analyze and process numerous regulatory files and transform complex regulatory content into easy-to-understand language, which will have a profound impact on end-users; meanwhile, electronic health record technology provider Netsmart has successfully reduced the time for managing patient health records by 50% through the application of relevant technologies, undoubtedly freeing up more time for doctors to care for more patients.

Today we have seen some positive impacts on end-users, but I believe it is still a process that needs time to gradually develop and popularize. However, the pace of progress is relatively fast and the momentum has been building up. Therefore, I cannot predict with certainty whether generative artificial intelligence will become very common by the end of this year or next year. However, what can be certain is that it is gradually changing our world, bringing more convenience and possibilities.

TMTPost: RAG, for example, is used to address hallucination problems, but some papers have noted that RAG alone cannot solve hallucinations. In enterprise-level applications, how should the degree of hallucination and its impact be assessed for a specific application?

Atul: Although we cannot completely eliminate this problem, I believe that more and more cutting-edge research will emerge to help with it. You will see customers make more progress and improvements in dealing with hallucinations. To be clear, although the problem cannot be completely solved, the work we are doing does help reduce its impact.

TMTPost: Regarding collaboration between models, what solutions does AWS offer customers who use multiple models?

Atul: This is important for customers. For this purpose, we launched a feature called model evaluation, which was first released in December last year and is scheduled to become fully available tomorrow. Essentially, this feature is designed to help customers compare the performance of different models on a given set of prompts so that they can choose the model that best suits their specific use cases.

To achieve this goal, customers have three options. First, they can compare the performance of different models on given prompts in the console. Second, they can use the automated evaluation feature to run different models on their own datasets or on standard industry datasets to see which models perform well. Finally, they can have their internal expert teams evaluate models in different ways and determine which model meets their expectations. Ultimately, customers receive a detailed report from Bedrock that shows how the models performed and helps them decide which models make sense for them.
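The idea of automated evaluation, running several models over the same prompts and scoring their answers, can be sketched in a few lines of plain Python. This is a rough illustration under stated assumptions, not Bedrock's model evaluation feature or its API. The two "models" are hypothetical stand-in functions with canned answers, the dataset is invented, and token-overlap F1 is just one simple metric chosen for the example.

```python
def token_f1(prediction, reference):
    """Token-overlap F1 between two strings (a simple text-matching metric)."""
    pred = prediction.lower().split()
    ref = reference.lower().split()
    remaining = list(ref)
    overlap = 0
    for tok in pred:
        if tok in remaining:
            remaining.remove(tok)  # count each reference token at most once
            overlap += 1
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)

def evaluate(models, dataset):
    """Average each model's score over (prompt, reference) pairs."""
    report = {}
    for name, generate in models.items():
        scores = [token_f1(generate(prompt), ref) for prompt, ref in dataset]
        report[name] = sum(scores) / len(scores)
    return report

# Stand-in "models": plain functions returning canned answers (hypothetical).
def model_a(prompt):
    return {"Capital of France?": "Paris", "2 + 2?": "4"}.get(prompt, "")

def model_b(prompt):
    return {"Capital of France?": "Paris", "2 + 2?": "5"}.get(prompt, "")

# Invented two-item dataset of (prompt, reference answer) pairs.
dataset = [("Capital of France?", "Paris"), ("2 + 2?", "4")]

report = evaluate({"model-a": model_a, "model-b": model_b}, dataset)
best = max(report, key=report.get)
print(report, best)
```

In practice the stand-in functions would be replaced by calls to real models, and the metric would be chosen to fit the use case. Human review, the third option Atul describes, covers qualities that a string-matching score cannot capture.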

TMTPost: What initiatives has AWS taken in AI ethics?

Atul: We are working closely with multiple government organizations. Take our Titan Image Generator, for example. It adds invisible watermarks that help customers determine whether an image was generated by artificial intelligence. In addition, we are also cooperating with a series of other organizations to ensure the responsible use of artificial intelligence.

TMTPost: What is AWS's experience in self-developed chips?

Atul: We have been investing in chip development for years and acquired the chip design company Annapurna Labs as early as 2015. Although our initial focus was on virtualization and general-purpose computing chips, we later turned to developing AI chips specifically for machine learning. Examples include Amazon Trainium and Amazon Inferentia, two dedicated chips for AI training and inference, respectively.

Thanks to years of continuous investment in chip development, we have more opportunities to iterate and improve these chips to ensure their performance and stability. These improvements come at the right time because the demand for computing power in generative AI is growing.

TMTPost: There are many models on Bedrock. Have you observed which models are most popular among customers, such as those from Meta or Anthropic?

Atul: Currently, we will not disclose the specific performance of various model providers. But what I want to say is that these models are favored by a large number of users. This is mainly because the choice of models depends on specific application scenarios, and people will choose different models according to different needs. Therefore, it is too early to identify which models are widely used.
