Feature Request: Add async support for Generative AI Inference client #836

@fede-kamel

Feature Request

Description

Add native async/await support for the OCI Generative AI Inference client to enable non-blocking concurrent requests in async applications.

Problem Statement

The current SDK uses synchronous HTTP requests via the requests library. This causes issues in async applications:

  1. Event loop blocking: Sync calls block the event loop in FastAPI, async agents, and other async frameworks
  2. Limited concurrency: Cannot efficiently make concurrent API calls
  3. Performance bottleneck: Sequential requests are significantly slower than concurrent alternatives
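To illustrate the first point: today, an async application has to push each blocking SDK call onto a worker thread to keep the event loop responsive. A minimal sketch of that workaround (sync_chat is a stand-in for a real blocking call such as client.chat; the 0.2 s sleep simulates network latency):

```python
import asyncio
import time

def sync_chat(prompt: str) -> str:
    # Stand-in for a blocking SDK call such as client.chat();
    # time.sleep simulates ~0.2 s of network latency.
    time.sleep(0.2)
    return f"response to {prompt}"

async def main() -> list[str]:
    start = time.perf_counter()
    # asyncio.to_thread moves each blocking call onto a worker thread,
    # so the event loop stays free and the three calls overlap.
    results = await asyncio.gather(
        asyncio.to_thread(sync_chat, "a"),
        asyncio.to_thread(sync_chat, "b"),
        asyncio.to_thread(sync_chat, "c"),
    )
    elapsed = time.perf_counter() - start
    print(f"3 calls in {elapsed:.2f}s")  # overlapped, not 3 x 0.2 s sequential
    return results

results = asyncio.run(main())
```

This works, but it burns a thread per in-flight request and still serializes nothing at the HTTP layer; a native async client avoids both costs.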

Proposed Solution

Add an AsyncGenerativeAiInferenceClient class that:

  • Uses aiohttp for true async HTTP requests
  • Reuses the existing OCI Signer for authentication
  • Provides async versions of all GenAI operations (chat, streaming, embeddings, etc.)
  • Supports async context manager pattern
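One possible shape for such a class (everything beyond the proposed class name is illustrative; the transport is injectable here so the sketch runs without aiohttp or a live endpoint, whereas the real client would create an aiohttp.ClientSession in __aenter__ and sign each request with the existing OCI Signer):

```python
import asyncio
from typing import Any, Awaitable, Callable

class AsyncGenerativeAiInferenceClient:
    """Sketch of the proposed client. The real implementation would sign
    requests with oci.signer.Signer and send them over aiohttp; here the
    transport is a pluggable coroutine so the shape can be exercised offline."""

    def __init__(self, config: dict,
                 transport: Callable[[str, dict], Awaitable[dict]]):
        self.config = config
        self._transport = transport  # stands in for a signed aiohttp request

    async def __aenter__(self):
        # Real version: open the aiohttp.ClientSession here.
        return self

    async def __aexit__(self, *exc):
        # Real version: close the session.
        return None

    async def chat(self, details: dict) -> dict:
        # Real version: serialize `details`, sign, POST to the chat endpoint.
        return await self._transport("/actions/chat", details)

async def fake_transport(path: str, body: dict) -> dict:
    await asyncio.sleep(0)  # yield once, like a real network round trip would
    return {"path": path, "echo": body}

async def demo() -> dict:
    async with AsyncGenerativeAiInferenceClient({}, fake_transport) as client:
        return await client.chat({"prompt": "hi"})

out = asyncio.run(demo())
print(out)
```

The injectable transport is only a device for this sketch; the point is the async context manager lifecycle and per-operation coroutines mirroring the existing sync surface.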

Example Usage

import asyncio

import oci
from oci.generative_ai_inference import AsyncGenerativeAiInferenceClient

config = oci.config.from_file()  # standard OCI config (~/.oci/config)

async def main():
    async with AsyncGenerativeAiInferenceClient(config) as client:
        # details1..details3 are ChatDetails payloads built beforehand.
        # Concurrent requests - 2-3.5x faster than sequential (see table below)
        results = await asyncio.gather(
            client.chat(details1),
            client.chat(details2),
            client.chat(details3),
        )

asyncio.run(main())

Performance Impact

Testing shows 2-3.5x throughput improvement for concurrent workloads:

Scenario                 Sequential   Concurrent   Speedup
3 requests (Llama 3.3)   1.30 s       0.64 s       2.01x
3 requests (Llama 3.2)   1.40 s       0.44 s       3.18x
3 requests (Cohere)      0.50 s       0.14 s       3.54x
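The shape of these numbers follows directly from overlapping I/O waits: sequential latencies add up, concurrent ones overlap. A self-contained sketch of the measurement pattern (fake_chat simulates one model round trip with asyncio.sleep; the real benchmark would call the async client):

```python
import asyncio
import time

async def fake_chat(latency: float = 0.1) -> str:
    # Stand-in for one model round trip; a real run would await client.chat().
    await asyncio.sleep(latency)
    return "ok"

async def benchmark(n: int = 3) -> tuple:
    start = time.perf_counter()
    for _ in range(n):                       # sequential: latencies add up
        await fake_chat()
    sequential = time.perf_counter() - start

    start = time.perf_counter()
    await asyncio.gather(*(fake_chat() for _ in range(n)))  # concurrent: overlap
    concurrent = time.perf_counter() - start
    return sequential, concurrent

seq, conc = asyncio.run(benchmark())
print(f"sequential {seq:.2f}s, concurrent {conc:.2f}s, speedup {seq / conc:.1f}x")
```

With uniform latencies the ideal speedup approaches n; the measured 2-3.5x reflects real per-model variance and request overhead.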

Use Cases

  1. FastAPI/async web frameworks: Non-blocking GenAI calls in async endpoints
  2. LangChain agents: Concurrent tool calls and chain execution
  3. Batch processing: Parallel processing of multiple prompts
  4. Real-time applications: Low-latency streaming responses
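For the batch-processing case, an async client composes naturally with a semaphore to cap in-flight requests. A hedged sketch (process_prompts and fake_chat are illustrative names; `chat` would be e.g. client.chat on the proposed client):

```python
import asyncio

async def process_prompts(prompts, chat, max_concurrency: int = 4):
    """Run `chat` over all prompts with at most `max_concurrency` in flight."""
    sem = asyncio.Semaphore(max_concurrency)

    async def one(prompt):
        async with sem:          # blocks only this task, not the event loop
            return await chat(prompt)

    # gather preserves input order regardless of completion order
    return await asyncio.gather(*(one(p) for p in prompts))

async def fake_chat(prompt: str) -> str:
    await asyncio.sleep(0.01)    # stand-in for a model round trip
    return prompt.upper()

batch = asyncio.run(process_prompts(["a", "b", "c"], fake_chat))
print(batch)
```

The concurrency cap matters in practice: service rate limits make unbounded gather over large prompt batches a bad idea.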

Implementation

A reference implementation is provided in PR #835 with:

  • Full async client implementation
  • 15 unit tests
  • 7 integration tests
  • Tested on Python 3.9, 3.12, 3.13, 3.14
  • Tested with 6 different models
