A few days ago, Bill Gates published an article saying that AI Agents will not only change the way people interact with computers, but may also upend the software industry and bring about the biggest revolution in computing since the shift from typing commands to clicking icons.
Amid the wave of digitalization and technological innovation, AI Agents are opening up a wide range of applications and opportunities. These opportunities lie not only in improving work efficiency and automating business processes, but also in delivering personalized services and improving customer experience.
As generative AI technology continues to develop and mature, AI Agents are playing an increasingly important role in promoting the innovation of new products and services and exploring new business models.
Developments related to intelligent agent technology
On March 16, 2023, Microsoft announced Microsoft 365 Copilot. The announcement reverberated across the industry. It marked the rise of an application development paradigm based on large language models (LLMs) and was a milestone on the way to the Agent concept on which the industry has since converged.
It is worth noting that the concept of the Agent is not new; it dates back to the 1950s. In 1995, Wooldridge and Jennings defined an AI Agent as a computer system that is situated in a specific environment and can act autonomously to achieve its design goals. They proposed that an AI Agent should have four attributes: autonomy, reactivity, social ability, and initiative (proactiveness).
After 2010, large models began to take shape. In 2016, AlphaGo defeated a world champion at Go. In 2018, Google launched BERT, based on the Transformer architecture. In 2019, AlphaStar reached grandmaster level in the e-sports game “StarCraft 2”, ranking above 99.8% of players. Then came GPT-3 in 2020 and GPT-3.5 in 2022, and the subsequent popularity of ChatGPT provided new opportunities for the development of AI Agents in the era of large models.
The prospects for large language models in intelligent agent applications are attracting attention. With the rapid advancement of AI technology, LLMs not only perform well at understanding and generating natural language, but also show great potential as intelligent agents in decision support, automated task processing, and personalized services. These developments push the boundaries of human-computer interaction, provide innovative solutions for various industries, and open up new business opportunities and research directions.
In 2023, the development of large models has shown explosive growth. Since January, many LLMs have been launched or adopted around the world, including LLaMA, BLOOM, StableLM, ChatGLM and many other open-source large models. Building on these models, various autonomous agents such as AutoGPT and MetaGPT were born.
In June 2023, Lilian Weng, head of the OpenAI Safety team, published an article titled “LLM Powered Autonomous Agents”, proposing a new definition of an Agent: large model + memory + planning + tool use. On November 6, OpenAI released its official Agent development framework, the Assistants API, at the DevDay event, aiming to help developers build agents on GPT models more efficiently and conveniently.
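To make the "large model + memory + planning + tool use" formula more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not the design from Lilian Weng's article or the Assistants API: the names `fake_llm`, `Agent`, and `calculator` are hypothetical, and the "model" is a stub that returns a fixed plan so the example runs without any external service.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM call; always returns a fixed plan."""
    return "calculator: 2 + 3\necho: done"


def calculator(expr: str) -> str:
    """Toy tool (assumption): only handles 'a + b' with integers."""
    left, right = expr.split("+")
    return str(int(left) + int(right))


@dataclass
class Agent:
    llm: Callable[[str], str]                  # the large model
    tools: Dict[str, Callable[[str], str]]     # tool use
    memory: List[str] = field(default_factory=list)  # memory of past steps

    def plan(self, goal: str) -> List[str]:
        # Planning: ask the model to break the goal into "tool: input" steps,
        # giving it the remembered history as context.
        history = "\n".join(self.memory)
        reply = self.llm(f"Goal: {goal}\nHistory:\n{history}")
        return [line for line in reply.splitlines() if ":" in line]

    def run(self, goal: str) -> List[str]:
        results = []
        for step in self.plan(goal):
            name, _, arg = step.partition(":")
            tool = self.tools.get(name.strip())
            if tool is None:
                outcome = f"unknown tool {name.strip()!r}"
            else:
                outcome = tool(arg.strip())
            # Record each step and its outcome so later plans can see it.
            self.memory.append(f"{step} -> {outcome}")
            results.append(outcome)
        return results


agent = Agent(llm=fake_llm, tools={"calculator": calculator, "echo": lambda s: s})
outputs = agent.run("add two and three")
print(outputs)
```

In a real system the stub would be replaced by a call to an actual model, and the plan format would need to be far more robust than splitting on a colon; the point here is only how the four components fit together.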
Data source: Digital China’s “Generative AI Enterprise Application Implementation Technology White Paper”
From an entrepreneurial perspective, Agent development can be roughly divided into two categories. One is to provide a reusable Agent framework that reduces the complexity of future development, focusing on optimizing modularity, adaptability and collaboration capabilities. The other is to go deep into vertical fields, become a domain expert, and use industry-specific data and processes to provide more accurate and effective services.
At present, progress in Agent development is concentrated mainly in the United States, which has mature technical infrastructure and ample high-end chip resources; companies represented by OpenAI hold a leading position in the technology. In contrast, large-model applications in other regions such as the EU, the UK, Canada and Japan are still at the experimental stage.
In China, some technology companies have produced several well-known large models, and the Agent applications built on them have gradually entered the public eye. For example, Baidu applies its Wenxin model to intelligent search and autonomous driving; Alibaba applies its Tongyi Qianwen model to Amap, Youku, Hema and other products; and Huawei applies its Pangu model to weather forecasting, speech recognition and other areas. A startup called Face Wall Intelligence has launched an AI Agent product, ChatDev, which can develop a piece of software or a small game in a short time. All the user needs to do is give it a requirement.
AI Agent applications
For enterprises, the core of a successful Agent product is to improve work efficiency, which not only means improving work quality, but also saving time and costs. Analyzing the existing Agent products on the market, we found that their applicable scenarios in enterprise environments mainly include:
- Simplify daily workflow: The connection between enterprise departments often involves the production of a large amount of documents. Although this does not require complex technical support, it consumes a lot of time. If a conversational agent is introduced to understand department needs and automatically generate corresponding documents, it can greatly reduce the burden on the team and allow them to focus more on their core work.
- Database access optimization: Enterprises can leverage the text interpretation capabilities of large models to integrate data and extract key information from it. Enterprises then no longer need to piece together fragmented information by hand, which greatly improves the efficiency of data retrieval.
- Programming assistance: Agents can help programmers quickly set up a project framework and write basic function templates, allowing programmers to move straight to more detailed programming work and significantly reducing their workload.
For ordinary consumers, Agents bring more convenience, much like Apple’s Siri and Microsoft’s Cortana. These tools can independently search for and invoke a variety of information and applications based on the user’s needs. Although such agents currently handle mainly simple tasks, with the support of large language models they will become more capable in the future, solving various problems in daily life and serving as each person’s customized personal assistant.
AI Agent challenges
At the current stage, Agent development still faces many challenges. While large language models perform impressively in conversation, they often come across as “artificial stupidity” rather than artificial intelligence when applied to specific work tasks. This shows that the key to commercializing large models is understanding business requirements and addressing them accurately.
In B2B (to-business) scenarios, the application of AI Agents is held back by poor API quality and an immature ecosystem, especially in the Chinese market. Missing and low-quality APIs lead to a significant gap between actual results and expectations. Furthermore, trying to solve problems in every domain with a single model often falls short in depth of understanding.
How well an AI Agent works in practice is also limited by how closed the application scenario is. In closed scenarios (such as travel booking), AI Agents perform well thanks to rich APIs and a set of problems that can be enumerated exhaustively. In open scenarios (such as legal assistants), practical applications face more challenges because new knowledge emerges frequently and APIs are incomplete. The ideal application scenario is therefore one that is closed, has rich vertical-domain data, and has an enumerable set of problems.
When it comes to training, one of the main issues is the lack of high-quality data. The training data for large models comes mainly from online text, but in the commercial field much case data is never made fully public. Successful cases become trade secrets, while failures are rarely shared by companies. Much industry experience has not even been written down yet. In addition, to adapt well to enterprise operations, a model must be trained on a large amount of process information, and the many standards contained in that information vary across industries, which makes training more difficult.
Therefore, establishing vertical industry models for specific fields is urgent. In highly specialized, data-rich fields such as law, medical care, and finance, building these industry models is the key to putting AI into practice. Companies that can build and master such vertical industry models will gain a strong competitive advantage.
Artificial intelligence trust, risk and security management (TRiSM) faces a series of challenges. One of them is that Agents may come into contact with sensitive information and critical infrastructure, so effective protection measures are required. At the same time, in order to ensure transparency and explainability of the decision-making process, it becomes particularly important to adopt a clear decision-making process. Additionally, a lack of human oversight may reduce the ability to mitigate or correct AI errors. This is because without human participation, the decisions of the AI system may not be corrected or monitored in a timely manner.
On the other hand, regulatory policy on artificial intelligence has also become a hot topic. On the issue of agency in particular, early regulatory proposals have tended to impose strict rules and responsibilities on autonomous actors. Such changes in the regulatory environment may have a significant impact on the development and application of AI. At the same time, resistance to Agents within organizations cannot be ignored. It stems mainly from employees’ fear of being replaced by AI.
The development and evolution of AI Agents
Agents are developing towards multi-Agent cooperation frameworks. Large models can discuss an issue from multiple perspectives. If a different identity is defined for each Agent, such as manager, programmer, or tester, professional content can be drawn out more effectively. This multi-Agent combination enables large models to reason more deeply and to solve complex tasks better. At the same time, using different large models for different roles in the team can combine the strengths of each.
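The role-based cooperation described above can be sketched in a few lines of Python. This is a simplified illustration under assumptions, not how ChatDev, MetaGPT or any specific framework is implemented: `stub_model`, `RoleAgent`, and `run_team` are hypothetical names, and the model call is a stub that just tags the message with the role.

```python
from dataclasses import dataclass
from typing import Callable, List


def stub_model(role: str, message: str) -> str:
    """Hypothetical stand-in for an LLM call conditioned on a role prompt."""
    return f"[{role}] handled: {message}"


@dataclass
class RoleAgent:
    role: str                          # identity, e.g. manager, programmer, tester
    model: Callable[[str, str], str]   # each role may be backed by a different model

    def act(self, message: str) -> str:
        return self.model(self.role, message)


def run_team(agents: List[RoleAgent], requirement: str) -> List[str]:
    # Each agent receives the previous agent's output, like a hand-off chain.
    transcript = []
    message = requirement
    for agent in agents:
        message = agent.act(message)
        transcript.append(message)
    return transcript


team = [
    RoleAgent("manager", stub_model),
    RoleAgent("programmer", stub_model),
    RoleAgent("tester", stub_model),
]
transcript = run_team(team, "build a tic-tac-toe game")
for line in transcript:
    print(line)
```

Because each `RoleAgent` takes its own `model` callable, swapping in a different large model for a particular role only means passing a different function for that agent. Real frameworks typically add multi-turn discussion between roles rather than a single linear hand-off.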
In terms of multimodality, large models are developing towards understanding non-text inputs. This ability mainly involves parsing visual information, which would otherwise require a large amount of text to describe. Agents with multimodal processing capabilities can perceive their environment better, which is crucial for applications that interact with the real world, such as autonomous driving and robotics. Currently, however, encoders for non-text modalities lag far behind language models in both capability and scale. In the future, breakthroughs may come either from large multimodal models trained on multimodal corpora from the start, or from combining visual encoders with large language models once the encoders’ capabilities catch up.
In the future, Agents may also evolve by themselves, as large models are expected to. Just as humans evolved their own system of division of labor, perhaps Agents can design for themselves an organizational structure better suited to Agent collaboration, in order to complete complex tasks more effectively.
Conclusion
In the long run, AI Agents will form deeper intelligent connections. Current AI Agent technology is not yet mature and will take some time to develop. But if the Agent era is indeed arriving, then in the next few years it will completely change the way we live, and we look forward to it together.