With the rapid development of artificial intelligence technology, the application of large language models (such as GPT) is becoming more and more common. These models have stronger emergence ability than previous artificial intelligence, can generate more natural and fluent text, and perform well in multiple business fields. However, for most enterprises, training a large model from scratch requires high and unattainable time and resources. Therefore, in the present, how to use the GPT model in the commercial version of Azure OpenAI services that have been listed, to build a GPT intelligent application for your own enterprise, has become a question that every decision-maker must consider.


As we know, GPT is a large language model based on the Transformer structure. After evolving to GPT-3, GPT-3.5, and GPT-4, it has made significant progress in terms of intelligent ability compared to previous artificial intelligence models.

Firstly, compared with previous artificial intelligence technology, GPT has emerged with many abilities that NLP or machine learning models did not have in the past. This is because GPT is trained on a large scale of text data and has the ability to learn rules from the data. This enables GPT to generate more natural and fluent text without the need to write all possible outputs in advance.

Secondly, GPT has shown remarkable performance in generalization. It can learn from specific prompt data and transfer and reason when accepting new tasks, thus improving the model's generalization ability. It has performed well in many different application areas, including natural language processing, text generation, machine translation, and more.

Moreover, more importantly, language models provide a new way to easily extract knowledge from unstructured text and effectively reason based on that knowledge without the need for predefined models. These are the foundation of GPT models in reasoning, thinking chains, and other capabilities.

Therefore, the importance of prompt engineering in the field of large-scale model applications is gradually becoming apparent. Prompt engineering is a technology based on human language intuition that can help enterprises quickly achieve specific functions when building large language models. Prompt engineering is based on the formal representation of natural language questions and can effectively guide the learning process of large language models, thereby improving their performance. Some papers have shown that in certain complex reasoning and thinking chain contexts, prompts can even perform better than fine-tuning.


Based on the above research, we have proposed a set of application technologies based on prompt engineering for the rapid implementation of enterprise-level GPT intelligent applications, as shown in the figure below. This can allow enterprises to build their own large-scale intelligent applications in a very short time without modifying the model itself, and immediately enjoy the productivity dividend brought by the new generation of artificial intelligence.

This architecture, as shown in the diagram above, utilizes a flexible prompt computation engine, combined with various deployment forms such as PaaS, Serverless, and containers, allowing enterprises to optimize the prompt engine using existing development technology stacks and scale computing power elastically based on real-time business needs. The session and token data service layer based on Redis and CosmosDB can add capabilities such as context caching, session persistence, and prompt persistence to applications, leaving room for future model or engine optimization based on prompts. The API encapsulation, load balancing, and gateway in the frontend further enhance the security and reliability of the application, allowing the intelligent entity to access a variety of frontend apps in a more secure and stable way. Most importantly, this architecture is designed with a dual-engine architecture based on Azure cognitive search and embedding technologies. On one hand, Azure cognitive search can quickly index unstructured data such as PDF and WORD files, making existing data immediately usable. On the other hand, by leveraging Azure PostgreSQL's vector storage and processing capabilities, combined with Azure OpenAI's embedding vector generation model, the enterprise's existing structured knowledge base can be combined with the prompt engine to generate more accurate, stable, and reliable results from the GPT model.


In summary, the intelligent application built using the Prompt Engineering method and Azure PaaS services has the technical characteristics of high scalability and broad adaptability, making it an ideal choice for helping enterprises quickly build large-scale intelligent applications. The Prompt Engineering method can improve the accuracy and efficiency of the model, Azure PaaS services can provide support for multiple languages and pre-built templates and tools, and Azure's enterprise-level security can ensure the security and compliance of the enterprise. This enables enterprises to achieve business transformation, gain competitive advantages, and long-term success in the new wave of artificial intelligence.


GitHub Project Site:  xuhaoruins/Azure-OpenAI-App-Innovation-Workshop (github.com)