Getting Started with Azure OpenAI Service

The Azure OpenAI Service gives developers access to OpenAI's advanced models, such as GPT-4 for language tasks and DALL·E for image generation, within Microsoft Azure's secure, scalable cloud environment. Designed for enterprise-ready AI solutions, it provides a full suite of tools for provisioning, deploying, and managing models through infrastructure-as-code (such as Bicep), SDKs, and the Azure CLI.

In this tutorial, we’re going to walk through the steps to get started, from setting up your Azure environment to deploying and interacting with your first AI model.

To build a generative AI solution using Azure OpenAI, the first step is to provision an Azure OpenAI resource within your Azure subscription.

If you are not the owner of the resource, you will need specific role-based access control (RBAC) permissions to interact with the Azure OpenAI service:

  1. Cognitive Services OpenAI User
    Grants permission to view resources and access the Chat Playground.

  2. Cognitive Services OpenAI Contributor
    Allows the creation and management of new deployments.


Pricing

Azure OpenAI Service pricing is based on two main factors:

  • The number of tokens processed (input and output combined).
  • The type of model used.

You can find detailed pricing information on the official Azure website:
Azure OpenAI Service Pricing
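To make token-based billing concrete, here is a small sketch that estimates the cost of a single request from its token counts. The per-1,000-token prices below are placeholder values for illustration only, not actual Azure rates; always check the pricing page for your model and region.

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price_per_1k: float, output_price_per_1k: float) -> float:
    """Estimate the cost of one request from its input and output token counts.

    The prices are passed in explicitly because they vary by model and
    region -- the values used below are made up for the example.
    """
    return (input_tokens / 1000) * input_price_per_1k \
         + (output_tokens / 1000) * output_price_per_1k

# Example: 1,500 input tokens and 500 output tokens at hypothetical rates
cost = estimate_cost(1500, 500, input_price_per_1k=0.01, output_price_per_1k=0.03)
print(f"Estimated cost: ${cost:.4f}")  # Estimated cost: $0.0300
```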


Available OpenAI Models

Azure OpenAI provides access to a variety of powerful models from OpenAI, including:

  • GPT-4 Models
    Advanced language models for chat, completion, and reasoning tasks.

  • GPT-3.5 Models
    Cost-effective and fast models suitable for many general-purpose applications.

  • Embeddings Models
    Generate vector representations of text for search, similarity, and clustering.

  • DALL·E Models
    Generate images from natural language prompts.

  • Whisper Models
    Perform automatic speech recognition (ASR) from audio inputs.

  • Text-to-Speech Models
    Convert text into natural-sounding audio.

  • Other Models
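To make the embeddings use case from the list above concrete: once two texts have been converted to vectors, their similarity is typically measured with cosine similarity. The vectors below are tiny made-up stand-ins; real Azure OpenAI embeddings have 1,536 or more dimensions.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Tiny hypothetical vectors standing in for real embedding outputs
v1 = [0.1, 0.3, 0.5]
v2 = [0.2, 0.1, 0.4]
print(round(cosine_similarity(v1, v2), 4))
```

In a search or clustering scenario you would embed a query and all documents, then rank documents by this score.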

Using Azure AI Foundry

Azure AI Foundry offers a robust environment for managing and deploying AI models. It includes tools and resources for:

  • Model management
  • Deployment and scaling
  • Experimentation and testing
  • Customization of models
  • Learning and onboarding materials

You can define and provision an Azure OpenAI resource using Bicep, Azure’s infrastructure-as-code language.

Example: Bicep Template to Create an Azure OpenAI Resource

// Parameters referenced by the resource definition below
param aiServicesName string
param location string = resourceGroup().location
param sku string = 'S0'

resource openAIAccount 'Microsoft.CognitiveServices/accounts@2023-05-01' = {
  name: aiServicesName
  location: location
  identity: {
    type: 'SystemAssigned'
  }
  sku: {
    name: sku
  }
  kind: 'AIServices'
  properties: {
    publicNetworkAccess: 'Enabled'
    networkAcls: {
      defaultAction: 'Allow'
    }
    disableLocalAuth: false
  }
}

Deploying OpenAI Models

After provisioning your Azure OpenAI resource, the next essential step is to deploy a model. This deployment enables you to interact with the model via the API or tools like the Chat Playground.

Deployment Requirements

When deploying a model, you must specify the following:

  • Base model: For example, GPT-4 or GPT-3.5
  • Model version: The specific version you want to use (e.g., 0613)
  • Capacity and quota settings: Define how many tokens per minute (TPM) and how much compute power the deployment can use

You can have multiple deployments per resource, as long as they collectively stay within the allocated TPM quota.
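As a quick sanity check of that shared-quota rule, a deployment plan fits only if the per-deployment TPM allocations sum to no more than the resource's quota. The quota and capacity numbers below are hypothetical.

```python
def fits_quota(deployment_tpm: list[int], quota_tpm: int) -> bool:
    """Return True if the deployments' combined TPM stays within the resource quota."""
    return sum(deployment_tpm) <= quota_tpm

# Hypothetical regional quota of 240K TPM shared by three deployments
print(fits_quota([100_000, 80_000, 40_000], quota_tpm=240_000))  # True
print(fits_quota([200_000, 100_000], quota_tpm=240_000))         # False
```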


Example: Bicep Template to Deploy GPT-4

resource gpt4 'Microsoft.CognitiveServices/accounts/deployments@2024-10-01' = {
  parent: openAIAccount
  name: '${aiServicesName}GPT4'
  sku: {
    name: 'Standard'
    capacity: 10
  }
  properties: {
    model: {
      format: 'OpenAI'
      name: 'gpt-4'
      version: '0613'
    }
    versionUpgradeOption: 'OnceNewDefaultVersionAvailable'
    currentCapacity: 10
    raiPolicyName: 'Microsoft.Default'
  }
}

Exploring OpenAI Prompts

Once your model has been deployed in Azure OpenAI, you can begin testing it by sending prompts.

A prompt is the text input that you send to the model via its completions (or, for chat models, chat completions) endpoint.
The response generated by the model is known as a completion, which can be returned as:

  • Natural language text
  • Programming code
  • Structured data
  • Other AI-generated formats

You can interact with the model using the Azure OpenAI Studio, the SDKs, or via REST APIs.
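For the REST route, the request is a POST to the deployment's chat completions endpoint. The sketch below only constructs the URL and JSON payload, without sending anything over the network; the endpoint, deployment name, and API version are placeholders to replace with your own values.

```python
def build_chat_request(endpoint: str, deployment: str, api_version: str,
                       prompt: str) -> tuple[str, dict]:
    """Build the URL and JSON body for an Azure OpenAI chat completions call."""
    url = (f"{endpoint.rstrip('/')}/openai/deployments/{deployment}"
           f"/chat/completions?api-version={api_version}")
    body = {
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt},
        ],
        "max_tokens": 4096,
    }
    return url, body

# Placeholder values -- substitute your own resource details
url, body = build_chat_request(
    "https://eastus.api.cognitive.microsoft.com/",
    "my-gpt4-deployment",
    "2024-12-01-preview",
    "Give me the list of French departments in the Île-de-France region.")
print(url)
```

Sending it then amounts to an HTTP POST with the `api-key` header set to your resource key.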


Deploying Infrastructure with Azure CLI

Below is an example of how to set up a resource group and deploy your Bicep infrastructure using the Azure CLI:

# Log in to Azure
az login

# Show the current subscription ID
az account show --query id --output tsv

# Set the active subscription (replace <subscription-id> with your own)
az account set --subscription <subscription-id>

# Create a new resource group in the East US region
az group create --name RG-DATASYNC-AI --location eastus

# Deploy the Bicep template to the new resource group
az deployment group create \
  --name datasync-deploy-open-ai \
  --resource-group RG-DATASYNC-AI \
  --template-file main.bicep

Using the Azure OpenAI SDK

Azure OpenAI provides both language-specific SDKs and a REST API that developers can use to integrate AI capabilities into their applications.

Each model family within the Azure OpenAI service offers different strengths, such as natural language understanding, code generation, embeddings, or image creation.
To use any of these models, you must first deploy them through the Azure OpenAI resource.


Installation

Create a .env file on your local machine to securely store sensitive configuration values. To prevent accidentally sharing secrets, it’s recommended to add the .env file to your .gitignore.

Include the following variables:

  • The endpoint for your Azure OpenAI resource.
  • Your Azure OpenAI API key.
  • The name of your Azure OpenAI deployment.

.env file content

AZURE_OAI_ENDPOINT=https://eastus.api.cognitive.microsoft.com/
AZURE_OAI_KEY=<<your-open-ai-api-key>>
AZURE_OAI_DEPLOYMENT=your-open-ai-deployment-name

Install the official OpenAI Python SDK using pip:

pip install openai

Example: Calling a Deployed Model Using the Azure SDK
The following Python example demonstrates how to interact with a deployed model in Azure OpenAI using the SDK:

import os
from openai import AzureOpenAI
from dotenv import load_dotenv

# Load environment variables from a .env file
load_dotenv()

# Load Azure OpenAI configuration
azure_oai_endpoint = os.getenv("AZURE_OAI_ENDPOINT")
azure_oai_key = os.getenv("AZURE_OAI_KEY")
azure_oai_deployment = os.getenv("AZURE_OAI_DEPLOYMENT")

# Define model and API version (model_name is informational here;
# the request below addresses the model by its deployment name)
model_name = "gpt-35-turbo-16k"
api_version = "2024-12-01-preview"

# Initialize the Azure OpenAI client
client = AzureOpenAI(
    api_version=api_version,
    azure_endpoint=azure_oai_endpoint,
    api_key=azure_oai_key,
)

# Send a prompt to the deployed model
response = client.chat.completions.create(
    messages=[
        {
            "role": "system",
            "content": "You are a helpful assistant.",
        },
        {
            "role": "user",
            "content": "Give me the list of French departments in the Île-de-France region.",
        }
    ],
    max_tokens=4096,
    temperature=1.0,
    top_p=1.0,
    model=azure_oai_deployment
)

# Print the response
print(response.choices[0].message.content)

Python Code Explanation: Azure OpenAI Integration

This Python code demonstrates how to interact with a deployed model using the Azure OpenAI SDK. The code retrieves responses from a model based on user input.


Import Required Libraries

import os
from openai import AzureOpenAI
from dotenv import load_dotenv

os: Allows interaction with the operating system, such as reading environment variables.

AzureOpenAI: The client used to interact with the Azure OpenAI API.

load_dotenv: Loads environment variables from a .env file.

Load Environment Variables

load_dotenv()

Loads environment variables from the .env file to securely handle sensitive data like API keys.

azure_oai_endpoint = os.getenv("AZURE_OAI_ENDPOINT")
azure_oai_key = os.getenv("AZURE_OAI_KEY")
azure_oai_deployment = os.getenv("AZURE_OAI_DEPLOYMENT")

Here, the code reads the following environment variables:

Azure OpenAI Endpoint (AZURE_OAI_ENDPOINT)

Azure OpenAI API Key (AZURE_OAI_KEY)

Azure OpenAI Deployment Name (AZURE_OAI_DEPLOYMENT)

Define Model and API Version

  • model_name = "gpt-35-turbo-16k": Specifies the model to use (in this case, GPT-3.5 Turbo with a 16k-token context window).
  • api_version = "2024-12-01-preview": The version of the Azure OpenAI API being used.

Initialize Azure OpenAI Client

client = AzureOpenAI(
    api_version=api_version,
    azure_endpoint=azure_oai_endpoint,
    api_key=azure_oai_key,
)

The AzureOpenAI client is initialized with the following parameters:

api_version: The API version.

azure_endpoint: The endpoint of the Azure OpenAI resource.

api_key: The key used to authenticate API requests.

Send a Request to the Deployed Model

response = client.chat.completions.create(
    messages=[
        {
            "role": "system",
            "content": "You are a helpful assistant.",
        },
        {
            "role": "user",
            "content": "Give me the list of French departments in the Île-de-France region.",
        }
    ],
    max_tokens=4096,
    temperature=1.0,
    top_p=1.0,
    model=azure_oai_deployment
)

messages: Contains the conversation history. Here, the system message provides the assistant’s behavior, and the user message contains the query.

max_tokens: The maximum number of tokens allowed in the response.

temperature: Controls the randomness of the response (1.0 for maximum creativity).

top_p: Limits the selection of tokens to the top probability mass (higher values allow more diverse responses).

model: Refers to the deployed model that will handle the request.

Print the Model’s Response

print(response.choices[0].message.content)

This line prints the content of the response returned by the model. It extracts the assistant's reply from the first choice via response.choices[0].message.content.

GitHub repository:

https://github.com/azurecorner/Azure-OpenAI-Service-Getting-Started.git

Support us

Buy me a coffee