---
title: Advanced Usage
description: Advanced usage patterns, CI/CD integration, and provider filtering
---

# Advanced Usage

Advanced patterns for using llm-discovery in production environments.

## CI/CD Integration

### GitHub Actions

Complete workflow for fetching and exporting models on a schedule.

```yaml
name: Update LLM Models

on:
  schedule:
    # Run every 6 hours
    - cron: '0 */6 * * *'
  workflow_dispatch:  # Allow manual triggers

jobs:
  update-models:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.13'

      - name: Install llm-discovery
        run: pip install llm-discovery

      - name: Fetch models from all providers
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
          GOOGLE_API_KEY: ${{ secrets.GOOGLE_API_KEY }}
        run: |
          llm-discovery update --detect-changes

      - name: Export to multiple formats
        run: |
          mkdir -p exports
          llm-discovery export --format json --output exports/models.json
          llm-discovery export --format csv --output exports/models.csv
          llm-discovery export --format markdown --output exports/models.md

      - name: Upload artifacts
        uses: actions/upload-artifact@v4
        with:
          name: llm-models
          path: exports/
          retention-days: 30

      - name: Commit changes (if any)
        run: |
          git config user.name "GitHub Actions"
          git config user.email "actions@github.com"
          git add exports/
          git diff --quiet && git diff --staged --quiet || \
            (git commit -m "chore: update LLM model data [skip ci]" && git push)
```

:::{tip}
Use GitHub Actions secrets to store API keys securely.
Never commit API keys directly to the repository.
:::

### GitLab CI

GitLab CI pipeline for model updates.

```yaml
stages:
  - fetch
  - export

variables:
  PYTHON_VERSION: "3.13"

fetch-models:
  stage: fetch
  image: python:${PYTHON_VERSION}
  script:
    - pip install llm-discovery
    - llm-discovery update --detect-changes
  artifacts:
    paths:
      - ~/.cache/llm-discovery/
    expire_in: 1 day
  only:
    - schedules
  variables:
    OPENAI_API_KEY: $OPENAI_API_KEY
    GOOGLE_API_KEY: $GOOGLE_API_KEY

export-models:
  stage: export
  image: python:${PYTHON_VERSION}
  dependencies:
    - fetch-models
  script:
    - pip install llm-discovery
    - mkdir -p exports
    - llm-discovery export --format json --output exports/models.json
    - llm-discovery export --format csv --output exports/models.csv
  artifacts:
    paths:
      - exports/
    expire_in: 30 days
  only:
    - schedules
```

**Schedule Configuration** (GitLab UI):
- Go to CI/CD → Schedules
- Add new schedule: `0 */6 * * *` (every 6 hours)
- Set variables: `OPENAI_API_KEY`, `GOOGLE_API_KEY`

## Provider Filtering

Filter models by provider using Python API.

### Filter by Single Provider

```python
import asyncio
from llm_discovery import DiscoveryClient

async def fetch_openai_only():
    client = DiscoveryClient()
    all_models = await client.fetch_models()

    # Filter OpenAI models only
    openai_models = [
        model for model in all_models
        if model.provider_name == "openai"
    ]

    print(f"OpenAI models: {len(openai_models)}")
    for model in openai_models:
        print(f"  {model.model_id}: {model.model_name}")

asyncio.run(fetch_openai_only())
```

### Filter by Multiple Providers

```python
import asyncio
from llm_discovery import DiscoveryClient

async def fetch_specific_providers():
    client = DiscoveryClient()
    all_models = await client.fetch_models()

    # Filter by provider list
    allowed_providers = {"openai", "google"}
    filtered_models = [
        model for model in all_models
        if model.provider_name in allowed_providers
    ]

    # Group by provider
    by_provider = {}
    for model in filtered_models:
        if model.provider_name not in by_provider:
            by_provider[model.provider_name] = []
        by_provider[model.provider_name].append(model)

    # Display results
    for provider, models in by_provider.items():
        print(f"\n{provider}: {len(models)} models")
        for model in models:
            print(f"  {model.model_id}")

asyncio.run(fetch_specific_providers())
```

### Filter by Capabilities

```python
import asyncio
from llm_discovery import DiscoveryClient

async def fetch_chat_models():
    client = DiscoveryClient()
    all_models = await client.fetch_models()

    # Filter models with chat capability
    chat_models = [
        model for model in all_models
        if "chat" in model.capabilities
    ]

    print(f"Chat-capable models: {len(chat_models)}")
    for model in chat_models:
        print(f"  {model.provider_name}/{model.model_id}")
        print(f"    Capabilities: {', '.join(model.capabilities)}")

asyncio.run(fetch_chat_models())
```

## Custom Error Handling

Implement custom error handling for production use.

### Retry Logic with Exponential Backoff

```python
import asyncio
import time
from llm_discovery import DiscoveryClient
from llm_discovery.exceptions import ProviderFetchError, NetworkError

async def fetch_with_retry(max_retries=3, base_delay=1):
    client = DiscoveryClient()

    for attempt in range(max_retries):
        try:
            models = await client.fetch_models()
            print(f"✓ Successfully fetched {len(models)} models")
            return models

        except NetworkError as e:
            if attempt < max_retries - 1:
                delay = base_delay * (2 ** attempt)  # Exponential backoff
                print(f"✗ Network error on attempt {attempt + 1}/{max_retries}")
                print(f"  Retrying in {delay} seconds...")
                time.sleep(delay)
            else:
                print(f"✗ Failed after {max_retries} attempts")
                raise

        except ProviderFetchError as e:
            print(f"✗ Provider fetch error: {e}")
            raise

asyncio.run(fetch_with_retry())
```

### Fallback to Cache on API Failure

```python
import asyncio
from llm_discovery import DiscoveryClient
from llm_discovery.exceptions import (
    ProviderFetchError,
    CacheNotFoundError
)

async def fetch_with_cache_fallback():
    client = DiscoveryClient()

    try:
        # Try fetching from APIs
        models = await client.fetch_models()
        print(f"✓ Fetched {len(models)} models from APIs")
        return models

    except ProviderFetchError as e:
        print(f"✗ API fetch failed: {e}")
        print("  Attempting to load from cache...")

        try:
            models = client.get_cached_models()
            print(f"✓ Loaded {len(models)} models from cache")
            print("  (Note: Data may be outdated)")
            return models

        except CacheNotFoundError:
            print("✗ No cache available")
            print("  Cannot proceed without data")
            raise

asyncio.run(fetch_with_cache_fallback())
```

### Partial Success Handling

```python
import asyncio
from llm_discovery import DiscoveryClient
from llm_discovery.exceptions import PartialFetchError

async def handle_partial_failure():
    client = DiscoveryClient()

    try:
        models = await client.fetch_models()
        print(f"✓ All providers successful: {len(models)} models")

    except PartialFetchError as e:
        print(f"⚠ Partial failure detected")
        print(f"  Successful: {', '.join(e.successful_providers)}")
        print(f"  Failed: {', '.join(e.failed_providers)}")

        # Decision: Accept partial data or abort?
        if len(e.successful_providers) >= 2:
            print("  Proceeding with partial data (2+ providers successful)")
            # Use e.models for partial data if needed
        else:
            print("  Aborting (less than 2 providers successful)")
            raise

asyncio.run(handle_partial_failure())
```

## Google Vertex AI Setup

Configure Google Vertex AI for production environments.

### Prerequisites

1. **Create GCP Project**:
   - Go to [Google Cloud Console](https://console.cloud.google.com/)
   - Create new project or select existing project

2. **Enable Vertex AI API**:
   ```bash
   gcloud services enable aiplatform.googleapis.com
   ```

3. **Create Service Account**:
   ```bash
   gcloud iam service-accounts create llm-discovery-sa \
     --display-name="LLM Discovery Service Account"
   ```

4. **Grant Permissions**:
   ```bash
   gcloud projects add-iam-policy-binding PROJECT_ID \
     --member="serviceAccount:llm-discovery-sa@PROJECT_ID.iam.gserviceaccount.com" \
     --role="roles/aiplatform.user"
   ```

5. **Download Service Account Key**:
   ```bash
   gcloud iam service-accounts keys create ~/llm-discovery-key.json \
     --iam-account=llm-discovery-sa@PROJECT_ID.iam.gserviceaccount.com
   ```

### Environment Configuration

**Local Development**:

```bash
export GOOGLE_GENAI_USE_VERTEXAI=true
export GOOGLE_APPLICATION_CREDENTIALS="$HOME/llm-discovery-key.json"
export GOOGLE_CLOUD_PROJECT="your-project-id"
export GOOGLE_CLOUD_LOCATION="us-central1"
```

**GitHub Actions**:

```yaml
- name: Setup Google Cloud credentials
  env:
    GCP_SA_KEY: ${{ secrets.GCP_SA_KEY }}
  run: |
    echo "$GCP_SA_KEY" > $HOME/gcp-key.json
    export GOOGLE_APPLICATION_CREDENTIALS="$HOME/gcp-key.json"
    export GOOGLE_GENAI_USE_VERTEXAI=true

- name: Fetch models
  run: llm-discovery update
```

**GitLab CI**:

```yaml
variables:
  GOOGLE_GENAI_USE_VERTEXAI: "true"
  GOOGLE_APPLICATION_CREDENTIALS: "/tmp/gcp-key.json"

before_script:
  - echo "$GCP_SA_KEY" > /tmp/gcp-key.json
```

:::{caution}
Service account keys are sensitive credentials.
Store them securely using CI/CD secret management.
Never commit service account keys to version control.
:::

### Verify Setup

```python
import asyncio
from llm_discovery import DiscoveryClient
from llm_discovery.models.config import Config

async def verify_vertexai():
    # Verify configuration
    config = Config.from_env()
    print(f"Vertex AI enabled: {config.google_genai_use_vertexai}")
    print(f"Credentials path: {config.google_application_credentials}")

    # Fetch models
    client = DiscoveryClient(config=config)
    try:
        models = await client.fetch_models()
        google_models = [m for m in models if m.provider_name == "google"]
        print(f"✓ Successfully fetched {len(google_models)} Google models")
    except Exception as e:
        print(f"✗ Vertex AI setup error: {e}")
        raise

asyncio.run(verify_vertexai())
```

## Production Deployment Checklist

- [ ] API keys stored in secure secret management (not in code)
- [ ] Rate limiting configured (max 1 request per minute per provider)
- [ ] Caching strategy implemented (update every 6-24 hours)
- [ ] Error monitoring and alerting configured
- [ ] Retry logic with exponential backoff implemented
- [ ] Fallback to cache on API failure tested
- [ ] CI/CD pipeline tested in staging environment
- [ ] Log aggregation configured for debugging
- [ ] Backup strategy for cache data defined
- [ ] Documentation for runbook procedures created

## Performance Optimization

### Minimize API Calls

```python
import asyncio
from llm_discovery import DiscoveryClient

async def optimize_api_calls():
    client = DiscoveryClient()

    # Fetch once, use multiple times
    models = await client.fetch_models()

    # Filter without additional API calls
    openai_models = [m for m in models if m.provider_name == "openai"]
    google_models = [m for m in models if m.provider_name == "google"]
    chat_models = [m for m in models if "chat" in m.capabilities]

    # Export to multiple formats from same data
    from llm_discovery.services.exporters import (
        JSONExporter, CSVExporter, MarkdownExporter
    )

    json_exporter = JSONExporter()
    csv_exporter = CSVExporter()
    md_exporter = MarkdownExporter()

    json_data = json_exporter.export(models)
    csv_data = csv_exporter.export(models)
    md_data = md_exporter.export(models)

    # Save all formats
    with open("models.json", "w") as f:
        f.write(json_data)
    with open("models.csv", "w") as f:
        f.write(csv_data)
    with open("models.md", "w") as f:
        f.write(md_data)

    print("✓ Exported to 3 formats from single API call")

asyncio.run(optimize_api_calls())
```

### Cache Management

```python
from pathlib import Path
import shutil

def manage_cache():
    cache_dir = Path.home() / ".cache" / "llm-discovery"

    # Check cache size
    if cache_dir.exists():
        cache_size = sum(f.stat().st_size for f in cache_dir.rglob('*') if f.is_file())
        print(f"Cache size: {cache_size / 1024:.2f} KB")

    # Clear old snapshots (keep last 30 days)
    snapshots_dir = cache_dir / "snapshots"
    if snapshots_dir.exists():
        from datetime import datetime, timedelta, UTC
        cutoff = datetime.now(UTC) - timedelta(days=30)

        for snapshot in snapshots_dir.glob("*.toml"):
            if snapshot.stat().st_mtime < cutoff.timestamp():
                snapshot.unlink()
                print(f"Deleted old snapshot: {snapshot.name}")

manage_cache()
```

## Next Steps

- **Troubleshooting**: See [Troubleshooting Guide](troubleshooting.md)
- **API Reference**: See [Python API Reference](api-reference.md)
- **CLI Reference**: See [CLI Reference](cli-reference.md)