Google Search Tools

Overview

The Google Search Tools give AI assistants web search by integrating Google’s search API with Vertex AI’s Gemini models, which return search results grounded in cited sources. With these tools an assistant can access real-time web information, format search results for display, and show users the cited sources and confidence scores.

Location: backend/tools/google_search.py

Core Functions

1. `create_google_search_component_string()`

Converts Gemini API responses containing search grounding metadata into React component strings for frontend display.

Signature:

def create_google_search_component_string(gemini_response: GenerateContentResponse) -> str

Purpose:

Extracts search entry point content from Gemini responses
Formats grounding metadata into structured JSON
Creates React component markup with search results
Applies CSS class name replacements for styling compatibility

Key Features:

Source Attribution: Extracts URI and title information from grounding chunks
Confidence Scores: Includes confidence ratings for search result segments
Search Query Tracking: Captures the original search queries used
CSS Safety: Removes style tags and applies namespaced CSS classes

Example Output:

<div class="google-search-container">
  <!-- Search entry point HTML -->
</div>

<googlesearch
  sources='[{"uri": "https://example.com", "title": "Example Title"}]'
  segments='[{"text": "Search result text", "confidence": 0.95, "sources": ["https://example.com"]}]'
  queries='["search query"]'
/>
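
The shape of this output can be sketched in a few lines. This is an illustration under stated assumptions, not the code in backend/tools/google_search.py: `build_googlesearch_tag` and `_attr` are hypothetical names, and the helper takes plain Python lists rather than a real Gemini response. Each prop is serialized with `json.dumps` and then escaped so that a quote or angle bracket inside a title cannot break out of the single-quoted attribute.

```python
import html
import json
from typing import Any


def _attr(value: Any) -> str:
    """Serialize a value to JSON and make it safe inside a single-quoted attribute."""
    text = json.dumps(value)
    # Escape &, <, > first, then the single quote that delimits the attribute.
    return html.escape(text, quote=False).replace("'", "&#39;")


def build_googlesearch_tag(
    sources: list[dict[str, str]],
    segments: list[dict[str, Any]],
    queries: list[str],
) -> str:
    """Hypothetical helper: render the <googlesearch> tag from already-extracted data."""
    return (
        "<googlesearch\n"
        f"  sources='{_attr(sources)}'\n"
        f"  segments='{_attr(segments)}'\n"
        f"  queries='{_attr(queries)}'\n"
        "/>"
    )
```

Double quotes are left as-is because the attributes are single-quoted, which keeps the props readable as JSON, as in the example above.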

2. `create_google_search_markdown()`

Generates markdown-formatted search results that group sources with their findings for clean display in chat interfaces.

Signature:

def create_google_search_markdown(gemini_response: GenerateContentResponse) -> str

Purpose:

Creates human-readable search result summaries
Groups findings by source for better organization
Includes confidence percentages for transparency
Adds Google branding with inline SVG logo

Example Output:

## Google Search Results
🔍 Google

- *[Example Title](https://example.com)*
  Climate change is a pressing global issue affecting millions.
  *(Confidence: 95.3%)*

- *[Another Source](https://example2.com)*
  Renewable energy solutions are becoming more cost-effective.
  *(Confidence: 87.1%)*
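
The grouping step can be sketched as follows. This is a minimal illustration rather than the real function: `render_search_markdown` and the `Segment` shape are assumptions, the input is a flat list of findings instead of a Gemini response, and the Google branding line is omitted.

```python
from collections import defaultdict
from typing import TypedDict


class Segment(TypedDict):
    text: str
    confidence: float  # 0.0 to 1.0
    title: str
    uri: str


def render_search_markdown(segments: list[Segment]) -> str:
    """Hypothetical helper: group findings by source and render them as markdown."""
    if not segments:
        return ""
    # dicts preserve insertion order, so sources appear in first-seen order.
    grouped: dict[tuple[str, str], list[Segment]] = defaultdict(list)
    for seg in segments:
        grouped[(seg["title"], seg["uri"])].append(seg)

    lines = ["## Google Search Results", ""]
    for (title, uri), found in grouped.items():
        lines.append(f"- *[{title}]({uri})*")
        for seg in found:
            lines.append(f"  {seg['text']}")
            lines.append(f"  *(Confidence: {seg['confidence']:.1%})*")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"
```

The `.1%` format specifier turns a score such as 0.953 into `95.3%`, which matches the percentages shown in the example output.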

3. `google_search_retrieval()`

Orchestrates multiple parallel search queries using the AsyncTaskRunner for efficient bulk searching.

Signature:

async def google_search_retrieval(
    input_list_dict: List[Dict], 
    fallback_question: str = "",
    callback=None,
    trace=None, 
    parent_observation_id=None
) -> str

Parameters:

input_list_dict: List of search configurations, one dict per query
fallback_question: Query used when a configuration does not supply one
callback: Streaming callback for real-time updates
trace: Langfuse trace object for observability
parent_observation_id: Parent observation ID for nested tracking

Key Features:

Parallel Execution: Runs multiple searches simultaneously
Error Resilience: Continues processing even if individual searches fail
Streaming Support: Provides real-time updates through callbacks
Comprehensive Logging: Full observability through Langfuse integration
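
The parallel execution and error resilience described above can be approximated with the standard library. The sketch below uses `asyncio.gather` as a stand-in for AsyncTaskRunner, whose actual API is not shown in this document; `run_searches` and its `search_one` parameter are hypothetical names.

```python
import asyncio
from typing import Awaitable, Callable


async def run_searches(
    queries: list[str],
    search_one: Callable[[str], Awaitable[str]],
) -> str:
    """Run every query concurrently; a failed query becomes a note, not a crash."""
    outcomes = await asyncio.gather(
        *(search_one(q) for q in queries),
        return_exceptions=True,  # collect exceptions instead of aborting the batch
    )
    parts = []
    for query, outcome in zip(queries, outcomes):
        if isinstance(outcome, BaseException):
            parts.append(f"Search for '{query}' failed: {outcome}")
        else:
            parts.append(outcome)
    return "\n\n".join(parts)
```

With `return_exceptions=True`, one failing search does not prevent the others from completing, and results come back in the same order as the input queries.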

4. `google_search_retrieval_one()`

Executes a single search query using Gemini’s grounding capabilities with comprehensive error handling.

Signature:

async def google_search_retrieval_one(
    query: str, 
    trace: Optional[StatefulTraceClient] = None, 
    parent_observation_id=None
) -> str

Purpose:

Performs individual search queries
Integrates with model tools for search execution
Provides detailed error handling and logging
Returns formatted search results

System Instruction:

“You are an Aitana assistant dedicated to providing the best search results to answer people’s questions”

5. `google_search_old()` (Legacy)

Legacy implementation, kept for backward compatibility, that calls the Vertex AI GenerativeModel directly.

Signature:

async def google_search_old(
    question: str, 
    config: ConfigManager, 
    trace: Optional[StatefulTraceClient] = None, 
    parent_observation_id=None
) -> Dict[str, str]

Returns: Dictionary containing:

markdown_text: Formatted search results
search_entry_point: Raw HTML entry point content

Integration with Tool Orchestrator

Registration

The Google Search tools are integrated into the Tool Orchestrator System under the google_search_retrieval tool name:

# In tool orchestrator
'google_search_retrieval': google_search_retrieval

Configuration Options

Tool Configuration Example:

toolConfigs = {
    'google_search_retrieval': {
        'max_results': 10,
        'safe_search': True,
        'region': 'US',
        'language': 'en'
    }
}

AI Tool Selection Example:

first_response_tools = [
    {
        'name': 'google_search_retrieval',
        'config': [
            {'parameter': 'query', 'value': 'latest climate research 2024'},
            {'parameter': 'hardcode-query', 'value': 'renewable energy trends'}
        ]
    }
]
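
How the `config` list is turned into search inputs is not spelled out here. One plausible reading, sketched below, is that each parameter/value pair becomes one search configuration and that a `hardcode-query` value takes precedence over `query`, with the fallback question used last. Both helper names and the precedence order are assumptions made for illustration.

```python
def config_to_search_inputs(config: list[dict[str, str]]) -> list[dict[str, str]]:
    """Assumed mapping: one {'parameter', 'value'} pair -> one {parameter: value} dict."""
    return [{item["parameter"]: item["value"]} for item in config]


def resolve_query(entry: dict[str, str], fallback_question: str = "") -> str:
    """Assumed precedence: 'hardcode-query', then 'query', then the fallback."""
    return entry.get("hardcode-query") or entry.get("query") or fallback_question
```

Under these assumptions, the example configuration above yields two search inputs, one per parameter entry.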

Anthropic Web Tools Integration

Complementary Capabilities

The Google Search tools work alongside Anthropic’s Claude web browsing capabilities to provide comprehensive web research:

Claude’s Web Browsing:

Real-time web page content extraction
Interactive web navigation
Dynamic content handling
JavaScript-rendered page support

Google Search Tools:

Structured search result aggregation
Multiple source comparison
Confidence-scored information
Search query optimization

Integration Patterns

1. Search-Then-Browse Pattern

# AI workflow example
first_response_tools = [
    {
        'name': 'google_search_retrieval',
        'config': [{'parameter': 'query', 'value': 'best machine learning frameworks 2024'}]
    }
    # Claude can then use web browsing to dive deeper into specific results
]

2. Parallel Research Pattern

# Combine multiple search approaches
research_tools = [
    {'name': 'google_search_retrieval', 'config': [{'parameter': 'query', 'value': 'topic overview'}]},
    # Claude web browsing for specific authoritative sources
    # Vertex AI search for internal documentation
]

3. Verification Pattern

Use Google Search for initial information gathering
Use Claude web browsing to verify claims by visiting original sources
Cross-reference findings for accuracy

Best Practices for Combined Usage

When to Use Google Search Tools:

Initial topic exploration
Finding recent news and updates
Gathering multiple perspectives
Quick fact-checking

When to Use Claude Web Browsing:

Deep diving into specific sources
Accessing paywalled or registration-required content
Navigating complex websites
Real-time data extraction

When to Use Both:

Comprehensive research projects
Fact verification workflows
Content creation with multiple sources
Academic or professional research

Frontend Integration

React Component Integration

The Google Search tools emit a `<googlesearch>` component string that the chat interface renders as a custom React component:

Component Structure:

interface GoogleSearchProps {
  sources: Array<{uri: string, title: string}>;
  segments: Array<{
    text: string, 
    confidence: number, 
    sources: string[]
  }>;
  queries: string[];
}

Styling and Display

CSS Classes Applied:

.google-search-container: Main container styling
.google-search-chip: Search query chips
.google-search-carousel: Result carousel display
.google-search-headline: Result headlines
.google-search-gradient-container: Visual effects
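
The style-tag removal and class namespacing could be done along these lines. This is a sketch, not the project's implementation: it assumes the incoming entry point HTML uses un-prefixed class names (`container`, `chip`, and so on, inferred from the namespaced names above), and `namespace_entry_point` is a hypothetical helper.

```python
import re

# Assumed un-namespaced class names in the search entry point HTML.
CLASS_NAMES = frozenset({"container", "chip", "carousel", "headline", "gradient-container"})

STYLE_TAG = re.compile(r"<style\b[^>]*>.*?</style>", re.IGNORECASE | re.DOTALL)
CLASS_ATTR = re.compile(r'class="([^"]*)"')


def namespace_entry_point(html_text: str, prefix: str = "google-search-") -> str:
    """Drop <style> blocks and prefix known class names inside class attributes."""
    without_styles = STYLE_TAG.sub("", html_text)

    def rename(match: re.Match) -> str:
        # Rename whole class tokens only, so "gradient-container" is not
        # mistaken for "container" and unknown classes pass through unchanged.
        classes = [
            prefix + name if name in CLASS_NAMES else name
            for name in match.group(1).split()
        ]
        return 'class="' + " ".join(classes) + '"'

    return CLASS_ATTR.sub(rename, without_styles)
```

Matching whole tokens inside the `class` attribute, rather than doing a plain string replace, avoids rewriting text content that happens to contain a word such as "container".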

User Experience Features

Visual Source Attribution: Clear links to original sources
Confidence Indicators: Percentage confidence for each finding
Search Query Display: Shows what queries were actually executed
Responsive Design: Adapts to different screen sizes

Configuration and Deployment

Environment Variables

Required environment variables for Google Search integration:

# Google Cloud Project Configuration
GOOGLE_CLOUD_PROJECT=your-project-id
GOOGLE_CLOUD_LOCATION=us-central1

# Vertex AI Configuration
GOOGLE_GENAI_USE_VERTEXAI=true

# Search API Configuration (if using custom search)
GOOGLE_SEARCH_API_KEY=your-search-api-key
GOOGLE_SEARCH_ENGINE_ID=your-search-engine-id
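
A startup check can catch a missing variable before the first search fails. The sketch below is an assumption about how such a check might look; `missing_search_env` is a hypothetical helper, and it treats only the first three variables as required, since the search API key and engine ID apply only to custom search.

```python
import os
from typing import Mapping, Optional

REQUIRED_VARS = (
    "GOOGLE_CLOUD_PROJECT",
    "GOOGLE_CLOUD_LOCATION",
    "GOOGLE_GENAI_USE_VERTEXAI",
)


def missing_search_env(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return the names of required variables that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
```

Calling `missing_search_env()` with no arguments inspects the real process environment; passing a mapping makes the check easy to unit test.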

Model Configuration

Gemini Model Settings:

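from google.genai import types

# `tools` is assumed to be the list of search grounding tools built elsewhere in the module
gen_config = types.GenerateContentConfig(
    system_instruction="You are an Aitana assistant dedicated to providing the best search results",
    tools=tools,
    max_output_tokens=8192,
)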

Error Handling and Resilience

Common Error Scenarios

1. API Rate Limiting

# Retries with exponential backoff are delegated to AsyncTaskRunner
try:
    response = await call_gemini_async(contents, gen_config=gen_config)
except RateLimitError:
    # Re-raise so the AsyncTaskRunner retry mechanism sees the failure
    raise
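
For a call site that is not wrapped by AsyncTaskRunner, exponential backoff can be written directly. The following is a generic sketch: `with_backoff` is a hypothetical helper, and the `RateLimitError` class defined here is a placeholder for whatever exception the client actually raises when it is rate limited.

```python
import asyncio
import random
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Placeholder for the exception the client raises when rate limited."""


async def with_backoff(
    call: Callable[[], Awaitable[T]],
    max_attempts: int = 4,
    base_delay: float = 0.5,
) -> T:
    """Retry `call` on RateLimitError, doubling the wait after each failure."""
    for attempt in range(max_attempts):
        try:
            return await call()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            delay = base_delay * (2 ** attempt)
            # Jitter spreads out retries from concurrent searches.
            await asyncio.sleep(delay + random.uniform(0, delay / 2))
    raise RuntimeError("max_attempts must be at least 1")
```

With the defaults, the waits are roughly 0.5, 1, and 2 seconds plus jitter before the fourth and final attempt.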

2. No Search Results

if not grounding_metadata:
    log.info("No grounding metadata found")
    return ''  # Graceful degradation

3. Malformed Responses

try:
    ...  # build the component string from the grounding metadata
except Exception as e:
    log.error(f"Error creating Google Search component: {e}")
    return f'<googlesearch error={json.dumps(str(e))} />'

Monitoring and Observability

Langfuse Integration:

Search query tracking
Response time monitoring
Error rate analysis
Token usage tracking

Logging Levels:

INFO: Successful searches and metadata extraction
WARNING: Partial failures or fallback scenarios
ERROR: Complete failures with full traceback

Testing and Development

Unit Testing

Test Structure:

# backend/tests/tools/test_google_search.py
class TestGoogleSearch:
    async def test_search_component_creation(self):
        # Test component string generation
        pass
    
    async def test_markdown_generation(self):
        # Test markdown formatting
        pass
    
    async def test_parallel_search_execution(self):
        # Test multi-query handling
        pass

Integration Testing

API Integration:

async def test_live_search():
    result = await google_search_retrieval_one("test query")
    assert "Google was queried" in result
    assert "found this text" in result

Performance Testing

Metrics to Monitor:

Search response time (target: <2 seconds)
Parallel execution efficiency
Memory usage during large searches
Error recovery time

Usage Examples

Basic Search Query

# Single search execution
result = await google_search_retrieval_one(
    query="climate change solutions 2024",
    trace=trace_obj
)
print(result)
# Output: "Google was queried with climate change solutions 2024 and found this text: <google_search_query>...</google_search_query>"

Multiple Parallel Searches

# Multiple search queries
search_configs = [
    {'query': 'renewable energy trends'},
    {'query': 'solar panel efficiency 2024'},
    {'hardcode-query': 'wind power innovations'}
]

results = await google_search_retrieval(
    input_list_dict=search_configs,
    fallback_question="energy technologies",
    trace=trace_obj
)
print(results)  # Combined results from all searches

Component Generation

# Generate React component from search response
gemini_response = await call_gemini_async(contents, gen_config=gen_config)
component_string = create_google_search_component_string(gemini_response)

# Result can be directly rendered in React frontend
print(component_string)
# Output: HTML + <googlesearch> component with JSON props

Markdown Formatting

# Generate markdown summary
markdown_results = create_google_search_markdown(gemini_response)
print(markdown_results)
# Output: Formatted markdown with sources and confidence scores

Security Considerations

Input Validation

Search queries are sanitized before execution
Parameter validation prevents injection attacks
Rate limiting prevents abuse
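
The sanitization rules themselves are not listed in this document. As a hedged illustration of what query cleaning might involve, the sketch below strips control characters, collapses whitespace, and caps the length; `clean_query` and the 500-character limit are assumptions, not values taken from the codebase.

```python
import re

MAX_QUERY_LENGTH = 500  # assumed limit, not taken from the source
_CONTROL_CHARS = re.compile(r"[\x00-\x1f\x7f]")


def clean_query(raw: str) -> str:
    """Strip control characters, collapse whitespace, and enforce a length cap."""
    if not isinstance(raw, str):
        raise TypeError("query must be a string")
    text = _CONTROL_CHARS.sub(" ", raw)
    text = " ".join(text.split())
    if not text:
        raise ValueError("query is empty")
    return text[:MAX_QUERY_LENGTH]
```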

Output Sanitization

HTML content is stripped of dangerous elements
CSS classes are namespaced to prevent conflicts
JSON output is properly escaped

API Security

Google Cloud authentication via service accounts
Environment variable configuration for sensitive data
Audit logging through Langfuse integration

Performance Optimization

Caching Strategies

Search results can be cached by query hash
Component strings cached for repeated queries
Markdown formatting cached separately
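
Caching by query hash could look like the sketch below. It is an in-memory illustration only: `query_key` and `SearchCache` are hypothetical names, the five-minute default time-to-live is an assumption, and a deployed service would more likely use a shared store.

```python
import hashlib
import time
from typing import Optional


def query_key(query: str) -> str:
    """Hash a normalized query so equivalent spellings share a cache entry."""
    normalized = " ".join(query.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class SearchCache:
    """In-memory cache with a time-to-live; not shared across processes."""

    def __init__(self, ttl_seconds: float = 300.0) -> None:
        self._ttl = ttl_seconds
        self._entries: dict[str, tuple[float, str]] = {}

    def get(self, query: str) -> Optional[str]:
        key = query_key(query)
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if time.monotonic() - stored_at > self._ttl:
            del self._entries[key]  # expired: drop it and report a miss
            return None
        return value

    def put(self, query: str, value: str) -> None:
        self._entries[query_key(query)] = (time.monotonic(), value)
```

Separate `SearchCache` instances can hold raw results, component strings, and markdown, which keeps the three kinds of cached output independent.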

Parallel Processing

Multiple searches execute simultaneously via AsyncTaskRunner
No blocking between independent search queries
Efficient resource utilization

Resource Management

Memory-efficient streaming for large results
Automatic garbage collection of completed tasks
Configurable timeout handling
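
A configurable timeout can be layered on top of any single-query coroutine with `asyncio.wait_for`. In this sketch `search_with_timeout` is a hypothetical wrapper, and returning an empty string on timeout is an assumption chosen to mirror the graceful degradation used when no grounding metadata is found.

```python
import asyncio
from typing import Awaitable, Callable


async def search_with_timeout(
    search_one: Callable[[str], Awaitable[str]],
    query: str,
    timeout_seconds: float = 10.0,
) -> str:
    """Return the search result, or an empty string if the search takes too long."""
    try:
        return await asyncio.wait_for(search_one(query), timeout=timeout_seconds)
    except asyncio.TimeoutError:
        # wait_for cancels the pending search before raising.
        return ""
```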

Troubleshooting

Common Issues

1. No Search Results Returned

# Check logs for:
log.info("No grounding metadata found")
log.info("No search_entry_point found")

2. Component Rendering Issues

Verify CSS class name replacements
Check JSON prop formatting
Ensure React component registration

3. API Authentication Errors

Verify GOOGLE_CLOUD_PROJECT environment variable
Check service account permissions
Validate Vertex AI API enablement

Debug Mode

Enable detailed logging:

import logging
logging.getLogger('google_search').setLevel(logging.DEBUG)

Performance Issues

Monitor Langfuse traces for bottlenecks
Check AsyncTaskRunner execution times
Verify network connectivity to Google APIs

Related Documentation

Core Integration

Tool Orchestrator System - Backend tool coordination
How to Add Tools to Aitana - Tool development guide
VAC Service Architecture - Overall system design

Search and AI Tools

Vertex AI Search Widget - Internal search capabilities
Document Search Agent - Document-specific search
Assistant Calling Agent - Cross-assistant coordination

Frontend Components

Artifact System - Component rendering system
Chat Interface Component - Chat integration
Message Content Component - Content display

Development and Testing

Backend API How-To - API development
API Integration Tests - Testing strategies
Local CI Consistency - Development setup

Anthropic Integration Resources

Anthropic Web Browsing Documentation - Claude’s web browsing capabilities
Tool Use Guide - General tool integration patterns
Model Context Protocol - Advanced tool integration
