Google Search Tools
Overview
The Google Search Tools give AI assistants web search by combining Google's search API with Vertex AI's Gemini models, so that answers are grounded in search results. With these tools an assistant can retrieve real-time web information, format search results for display, and show users cited sources with confidence scores.
Location: backend/tools/google_search.py
Core Functions
1. create_google_search_component_string()
Converts Gemini API responses containing search grounding metadata into React component strings for frontend display.
Signature:
def create_google_search_component_string(gemini_response: GenerateContentResponse) -> str
Purpose:
- Extracts search entry point content from Gemini responses
- Formats grounding metadata into structured JSON
- Creates React component markup with search results
- Applies CSS class name replacements for styling compatibility
Key Features:
- Source Attribution: Extracts URI and title information from grounding chunks
- Confidence Scores: Includes confidence ratings for search result segments
- Search Query Tracking: Captures the original search queries used
- CSS Safety: Removes style tags and applies namespaced CSS classes
Example Output:
<div class="google-search-container">
<!-- Search entry point HTML -->
</div>
<googlesearch
sources='[{"uri": "https://example.com", "title": "Example Title"}]'
segments='[{"text": "Search result text", "confidence": 0.95, "sources": ["https://example.com"]}]'
queries='["search query"]'
/>
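To make the output shape above concrete, here is a minimal sketch of how sources, segments, and queries could be serialized into a googlesearch tag. The helper names build_googlesearch_tag and _attr and the escaping strategy are assumptions for illustration; this is not the code in backend/tools/google_search.py.

```python
import html
import json


def _attr(value):
    # JSON-encode, then escape &, <, > and single quotes so the value
    # cannot break out of a single-quoted attribute.
    encoded = json.dumps(value)
    return html.escape(encoded, quote=False).replace("'", "&#x27;")


def build_googlesearch_tag(sources, segments, queries):
    """Hypothetical helper: serialize grounding data into a <googlesearch /> tag."""
    return (
        "<googlesearch "
        f"sources='{_attr(sources)}' "
        f"segments='{_attr(segments)}' "
        f"queries='{_attr(queries)}' />"
    )


tag = build_googlesearch_tag(
    sources=[{"uri": "https://example.com", "title": "Example Title"}],
    segments=[{"text": "Search result text", "confidence": 0.95,
               "sources": ["https://example.com"]}],
    queries=["search query"],
)
print(tag)
```

Escaping single quotes matters because the props are wrapped in single-quoted attributes; a title such as "It's here" would otherwise end the attribute early.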
2. create_google_search_markdown()
Generates markdown-formatted search results that group sources with their findings for clean display in chat interfaces.
Signature:
def create_google_search_markdown(gemini_response: GenerateContentResponse) -> str
Purpose:
- Creates human-readable search result summaries
- Groups findings by source for better organization
- Includes confidence percentages for transparency
- Adds Google branding with inline SVG logo
Example Output:
## Google Search Results
🔍 Google
- *[Example Title](https://example.com)*
Climate change is a pressing global issue affecting millions.
*(Confidence: 95.3%)*
- *[Another Source](https://example2.com)*
Renewable energy solutions are becoming more cost-effective.
*(Confidence: 87.1%)*
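The grouping described above can be sketched as follows. This is an illustrative approximation that works on plain dictionaries shaped like the component props; the function name render_grouped_markdown is hypothetical, and the real function reads a GenerateContentResponse and also adds the Google logo.

```python
from collections import defaultdict


def render_grouped_markdown(sources, segments):
    """Hypothetical helper: group segments under the source they cite."""
    titles = {s["uri"]: s["title"] for s in sources}
    grouped = defaultdict(list)
    for segment in segments:
        for uri in segment["sources"]:
            grouped[uri].append(segment)

    lines = ["## Google Search Results"]
    for uri, cited in grouped.items():
        # Fall back to the URI when no title is known for a source.
        lines.append(f"- *[{titles.get(uri, uri)}]({uri})*")
        for segment in cited:
            lines.append(f"  {segment['text']}")
            lines.append(f"  *(Confidence: {segment['confidence']:.1%})*")
    return "\n".join(lines)


markdown = render_grouped_markdown(
    sources=[{"uri": "https://example.com", "title": "Example Title"}],
    segments=[{"text": "Climate change is a pressing global issue.",
               "confidence": 0.953, "sources": ["https://example.com"]}],
)
print(markdown)
```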
3. google_search_retrieval()
Orchestrates multiple parallel search queries using the AsyncTaskRunner for efficient bulk searching.
Signature:
async def google_search_retrieval(
    input_list_dict: List[Dict],
    fallback_question: str = "",
    callback=None,
    trace=None,
    parent_observation_id=None
) -> str
Parameters:
- input_list_dict: List of search configurations with query parameters
- fallback_question: Default query if no specific query provided
- callback: Streaming callback for real-time updates
- trace: Langfuse trace object for observability
- parent_observation_id: Parent observation ID for nested tracking
Key Features:
- Parallel Execution: Runs multiple searches simultaneously
- Error Resilience: Continues processing even if individual searches fail
- Streaming Support: Provides real-time updates through callbacks
- Comprehensive Logging: Full observability through Langfuse integration
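The parallel execution and error resilience listed above can be illustrated with plain asyncio. Note that this sketch swaps in asyncio.gather for AsyncTaskRunner, whose interface is not shown here; run_searches and fake_search are hypothetical names.

```python
import asyncio


async def run_searches(queries, search_one):
    """Run all searches concurrently; a failed search does not stop the others."""
    results = await asyncio.gather(
        *(search_one(q) for q in queries), return_exceptions=True
    )
    parts = []
    for query, result in zip(queries, results):
        if isinstance(result, Exception):
            parts.append(f"Search for {query!r} failed: {result}")
        else:
            parts.append(result)
    return "\n".join(parts)


async def fake_search(query):
    # Stand-in for a real single-query search; fails for one input on purpose.
    if query == "bad":
        raise RuntimeError("simulated failure")
    return f"results for {query}"


combined = asyncio.run(run_searches(["solar", "bad", "wind"], fake_search))
print(combined)
```

Passing return_exceptions=True is what keeps one failing query from cancelling the rest; each exception is returned in place of a result and can be reported per query.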
4. google_search_retrieval_one()
Executes a single search query using Gemini’s grounding capabilities with comprehensive error handling.
Signature:
async def google_search_retrieval_one(
    query: str,
    trace: Optional[StatefulTraceClient] = None,
    parent_observation_id=None
) -> str
Purpose:
- Performs individual search queries
- Integrates with model tools for search execution
- Provides detailed error handling and logging
- Returns formatted search results
System Instruction:
“You are an Aitana assistant dedicated to providing the best search results to answer people’s questions”
5. google_search_old() (Legacy)
Legacy implementation that calls the Vertex AI GenerativeModel directly. It is kept for backward compatibility.
Signature:
async def google_search_old(
    question: str,
    config: ConfigManager,
    trace: Optional[StatefulTraceClient] = None,
    parent_observation_id=None
) -> Dict[str, str]
Returns: Dictionary containing:
- markdown_text: Formatted search results
- search_entry_point: Raw HTML entry point content
Integration with Tool Orchestrator
Registration
The Google Search tools are integrated into the Tool Orchestrator System under the google_search_retrieval tool name:
# In tool orchestrator
'google_search_retrieval': google_search_retrieval
Configuration Options
Tool Configuration Example:
toolConfigs = {
    'google_search_retrieval': {
        'max_results': 10,
        'safe_search': True,
        'region': 'US',
        'language': 'en'
    }
}
AI Tool Selection Example:
first_response_tools = [
    {
        'name': 'google_search_retrieval',
        'config': [
            {'parameter': 'query', 'value': 'latest climate research 2024'},
            {'parameter': 'hardcode-query', 'value': 'renewable energy trends'}
        ]
    }
]
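One way the parameter/value pairs above might be turned into per-search dictionaries is sketched below. Both helper names are hypothetical, and the precedence between 'hardcode-query', 'query', and the fallback question is an assumption; the orchestrator's actual rules are not documented here.

```python
def config_to_dicts(config):
    """Turn [{'parameter': k, 'value': v}, ...] into [{k: v}, ...]."""
    return [{item["parameter"]: item["value"]} for item in config]


def resolve_query(entry, fallback_question=""):
    # Assumed precedence: 'hardcode-query', then 'query', then the fallback.
    return entry.get("hardcode-query") or entry.get("query") or fallback_question


config = [
    {"parameter": "query", "value": "latest climate research 2024"},
    {"parameter": "hardcode-query", "value": "renewable energy trends"},
]
entries = config_to_dicts(config)
# The trailing empty entry shows the fallback question being used.
queries = [resolve_query(e, "energy technologies") for e in entries + [{}]]
print(queries)
```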
Anthropic Web Tools Integration
Complementary Capabilities
The Google Search tools work alongside Anthropic’s Claude web browsing capabilities to provide comprehensive web research:
Claude’s Web Browsing:
- Real-time web page content extraction
- Interactive web navigation
- Dynamic content handling
- JavaScript-rendered page support
Google Search Tools:
- Structured search result aggregation
- Multiple source comparison
- Confidence-scored information
- Search query optimization
Integration Patterns
1. Search-Then-Browse Pattern
# AI workflow example
first_response_tools = [
    {
        'name': 'google_search_retrieval',
        'config': [{'parameter': 'query', 'value': 'best machine learning frameworks 2024'}]
    }
    # Claude can then use web browsing to dive deeper into specific results
]
2. Parallel Research Pattern
# Combine multiple search approaches
research_tools = [
    {'name': 'google_search_retrieval', 'config': [{'parameter': 'query', 'value': 'topic overview'}]},
    # Claude web browsing for specific authoritative sources
    # Vertex AI search for internal documentation
]
3. Verification Pattern
- Use Google Search for initial information gathering
- Use Claude web browsing to verify claims by visiting original sources
- Cross-reference findings for accuracy
Best Practices for Combined Usage
When to Use Google Search Tools:
- Initial topic exploration
- Finding recent news and updates
- Gathering multiple perspectives
- Quick fact-checking
When to Use Claude Web Browsing:
- Deep diving into specific sources
- Accessing paywalled or registration-required content
- Navigating complex websites
- Real-time data extraction
When to Use Both:
- Comprehensive research projects
- Fact verification workflows
- Content creation with multiple sources
- Academic or professional research
Frontend Integration
React Component Integration
The Google Search tools generate custom React components that can be rendered in the chat interface:
Component Structure:
interface GoogleSearchProps {
  sources: Array<{uri: string, title: string}>;
  segments: Array<{
    text: string,
    confidence: number,
    sources: string[]
  }>;
  queries: string[];
}
Styling and Display
CSS Classes Applied:
- .google-search-container: Main container styling
- .google-search-chip: Search query chips
- .google-search-carousel: Result carousel display
- .google-search-headline: Result headlines
- .google-search-gradient-container: Visual effects
User Experience Features
- Visual Source Attribution: Clear links to original sources
- Confidence Indicators: Percentage confidence for each finding
- Search Query Display: Shows what queries were actually executed
- Responsive Design: Adapts to different screen sizes
Configuration and Deployment
Environment Variables
Required environment variables for Google Search integration:
# Google Cloud Project Configuration
GOOGLE_CLOUD_PROJECT=your-project-id
GOOGLE_CLOUD_LOCATION=us-central1
# Vertex AI Configuration
GOOGLE_GENAI_USE_VERTEXAI=true
# Search API Configuration (if using custom search)
GOOGLE_SEARCH_API_KEY=your-search-api-key
GOOGLE_SEARCH_ENGINE_ID=your-search-engine-id
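A startup check along these lines can catch missing configuration early. This is a sketch: the helper name missing_env_vars is hypothetical, and treating only the first three variables as mandatory is an assumption based on the custom-search keys being marked optional above.

```python
import os

# Assumed to be mandatory; the custom-search keys are treated as optional.
REQUIRED_VARS = (
    "GOOGLE_CLOUD_PROJECT",
    "GOOGLE_CLOUD_LOCATION",
    "GOOGLE_GENAI_USE_VERTEXAI",
)


def missing_env_vars(environ=os.environ, required=REQUIRED_VARS):
    """Return the names of required variables that are unset or empty."""
    return [name for name in required if not environ.get(name)]


example_env = {"GOOGLE_CLOUD_PROJECT": "your-project-id", "GOOGLE_CLOUD_LOCATION": ""}
print(missing_env_vars(example_env))
```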
Model Configuration
Gemini Model Settings:
gen_config = types.GenerateContentConfig(
    system_instruction="You are an Aitana assistant dedicated to providing the best search results",
    tools=tools,
    max_output_tokens=8192,
)
Error Handling and Resilience
Common Error Scenarios
1. API Rate Limiting
# Automatic retry with exponential backoff
try:
    response = await call_gemini_async(contents, gen_config=gen_config)
except RateLimitError:
    # Handled by AsyncTaskRunner retry mechanism
    pass
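For readers unfamiliar with the retry behaviour mentioned above, a generic exponential backoff loop looks roughly like this. It is a standalone sketch and not the AsyncTaskRunner implementation; the RateLimitError class here is a local stand-in, and retry_with_backoff, flaky_call, and record_sleep are hypothetical names.

```python
import asyncio
import random


class RateLimitError(Exception):
    """Local stand-in for whatever the client raises when rate limited."""


async def retry_with_backoff(call, max_attempts=4, base_delay=0.5, sleep=asyncio.sleep):
    """Retry `call` on RateLimitError, doubling the wait after each failure."""
    for attempt in range(max_attempts):
        try:
            return await call()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Exponential delay plus jitter so parallel callers do not retry in lockstep.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            await sleep(delay)


attempts = {"count": 0}
delays = []


async def flaky_call():
    # Fails twice with a rate limit error, then succeeds.
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RateLimitError("429")
    return "ok"


async def record_sleep(seconds):
    # Records the requested delay instead of actually waiting.
    delays.append(seconds)


outcome = asyncio.run(retry_with_backoff(flaky_call, sleep=record_sleep))
print(outcome, delays)
```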
2. No Search Results
if not grounding_metadata:
    log.info("No grounding metadata found")
    return ''  # Graceful degradation
3. Malformed Responses
except Exception as e:
    log.error(f"Error creating Google Search component: {str(e)}")
    return f'<googlesearch error={json.dumps(str(e))} />'
Monitoring and Observability
Langfuse Integration:
- Search query tracking
- Response time monitoring
- Error rate analysis
- Token usage tracking
Logging Levels:
- INFO: Successful searches and metadata extraction
- WARN: Partial failures or fallback scenarios
- ERROR: Complete failures with full traceback
Testing and Development
Unit Testing
Test Structure:
# backend/tests/tools/test_google_search.py
class TestGoogleSearch:
    async def test_search_component_creation(self):
        # Test component string generation
        pass

    async def test_markdown_generation(self):
        # Test markdown formatting
        pass

    async def test_parallel_search_execution(self):
        # Test multi-query handling
        pass
Integration Testing
API Integration:
async def test_live_search():
    result = await google_search_retrieval_one("test query")
    assert "Google was queried" in result
    assert "found this text" in result
Performance Testing
Metrics to Monitor:
- Search response time (target: <2 seconds)
- Parallel execution efficiency
- Memory usage during large searches
- Error recovery time
Usage Examples
Basic Search Query
# Single search execution
result = await google_search_retrieval_one(
    query="climate change solutions 2024",
    trace=trace_obj
)
print(result)
# Output: "Google was queried with climate change solutions 2024 and found this text: <google_search_query>...</google_search_query>"
Multiple Parallel Searches
# Multiple search queries
search_configs = [
    {'query': 'renewable energy trends'},
    {'query': 'solar panel efficiency 2024'},
    {'hardcode-query': 'wind power innovations'}
]
results = await google_search_retrieval(
    input_list_dict=search_configs,
    fallback_question="energy technologies",
    trace=trace_obj
)
print(results) # Combined results from all searches
Component Generation
# Generate React component from search response
gemini_response = await call_gemini_async(contents, gen_config=gen_config)
component_string = create_google_search_component_string(gemini_response)
# Result can be directly rendered in React frontend
print(component_string)
# Output: HTML + <googlesearch> component with JSON props
Markdown Formatting
# Generate markdown summary
markdown_results = create_google_search_markdown(gemini_response)
print(markdown_results)
# Output: Formatted markdown with sources and confidence scores
Security Considerations
Input Validation
- Search queries are sanitized before execution
- Parameter validation prevents injection attacks
- Rate limiting prevents abuse
Output Sanitization
- HTML content is stripped of dangerous elements
- CSS classes are namespaced to prevent conflicts
- JSON output is properly escaped
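The style stripping and class namespacing described above could look roughly like the sketch below. The function name sanitize_entry_point and the original class names in CLASS_MAP are hypothetical. Regular expressions are used only to keep the example short; they are not a substitute for a real HTML sanitizer when handling untrusted markup.

```python
import re

STYLE_BLOCK = re.compile(r"<style\b[^>]*>.*?</style\s*>", re.IGNORECASE | re.DOTALL)
CLASS_ATTR = re.compile(r'(class=")([^"]*)"')

# Hypothetical mapping from original class names to namespaced ones.
CLASS_MAP = {"container": "google-search-container", "chip": "google-search-chip"}


def sanitize_entry_point(html_text, class_map=CLASS_MAP):
    """Drop <style> blocks and rename known classes to their namespaced form."""
    without_styles = STYLE_BLOCK.sub("", html_text)

    def rename(match):
        names = match.group(2).split()
        renamed = " ".join(class_map.get(name, name) for name in names)
        return f'{match.group(1)}{renamed}"'

    return CLASS_ATTR.sub(rename, without_styles)


raw = '<style>.chip{color:red}</style><div class="container"><a class="chip extra">q</a></div>'
cleaned = sanitize_entry_point(raw)
print(cleaned)
```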
API Security
- Google Cloud authentication via service accounts
- Environment variable configuration for sensitive data
- Audit logging through Langfuse integration
Performance Optimization
Caching Strategies
- Search results can be cached by query hash
- Component strings cached for repeated queries
- Markdown formatting cached separately
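Caching by query hash, as suggested above, can be sketched with an in-memory dictionary. The SearchCache class is hypothetical, and normalizing case and whitespace before hashing is an assumption about what should count as the same query; a production cache would also need expiry, since search results go stale.

```python
import hashlib


class SearchCache:
    """Hypothetical in-memory cache keyed by a hash of the normalized query."""

    def __init__(self):
        self._store = {}

    @staticmethod
    def key_for(query):
        # Lowercase and collapse whitespace so trivially different queries share a key.
        normalized = " ".join(query.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def get(self, query):
        return self._store.get(self.key_for(query))

    def put(self, query, result):
        self._store[self.key_for(query)] = result


cache = SearchCache()
cache.put("Solar  Panel efficiency", "cached result")
hit = cache.get("solar panel efficiency")
miss = cache.get("wind power")
print(hit, miss)
```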
Parallel Processing
- Multiple searches execute simultaneously via AsyncTaskRunner
- No blocking between independent search queries
- Efficient resource utilization
Resource Management
- Memory-efficient streaming for large results
- Automatic garbage collection of completed tasks
- Configurable timeout handling
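Configurable timeout handling can be illustrated with asyncio.wait_for. This is a sketch under assumptions: search_with_timeout, slow_search, and fast_search are hypothetical, and returning an empty string on timeout simply mirrors the graceful degradation described for missing results.

```python
import asyncio


async def search_with_timeout(search_one, query, timeout_seconds=2.0):
    """Await a single search, giving up after `timeout_seconds`."""
    try:
        return await asyncio.wait_for(search_one(query), timeout=timeout_seconds)
    except asyncio.TimeoutError:
        return ""  # degrade gracefully instead of failing the whole batch


async def slow_search(query):
    # Stand-in for a search that takes longer than the allowed time.
    await asyncio.sleep(0.2)
    return f"results for {query}"


async def fast_search(query):
    return f"results for {query}"


timed_out = asyncio.run(search_with_timeout(slow_search, "solar", timeout_seconds=0.01))
completed = asyncio.run(search_with_timeout(fast_search, "solar"))
print(repr(timed_out), repr(completed))
```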
Troubleshooting
Common Issues
1. No Search Results Returned
# Check logs for:
log.info("No grounding metadata found")
log.info("No search_entry_point found")
2. Component Rendering Issues
- Verify CSS class name replacements
- Check JSON prop formatting
- Ensure React component registration
3. API Authentication Errors
- Verify the GOOGLE_CLOUD_PROJECT environment variable
- Check service account permissions
- Validate Vertex AI API enablement
Debug Mode
Enable detailed logging:
import logging
logging.getLogger('google_search').setLevel(logging.DEBUG)
Performance Issues
- Monitor Langfuse traces for bottlenecks
- Check AsyncTaskRunner execution times
- Verify network connectivity to Google APIs
Related Documentation
Core Integration
- Tool Orchestrator System - Backend tool coordination
- How to Add Tools to Aitana - Tool development guide
- VAC Service Architecture - Overall system design
Search and AI Tools
- Vertex AI Search Widget - Internal search capabilities
- Document Search Agent - Document-specific search
- Assistant Calling Agent - Cross-assistant coordination
Frontend Components
- Artifact System - Component rendering system
- Chat Interface Component - Chat integration
- Message Content Component - Content display
Development and Testing
- Backend API How-To - API development
- API Integration Tests - Testing strategies
- Local CI Consistency - Development setup
Anthropic Integration Resources
- Anthropic Web Browsing Documentation - Claude’s web browsing capabilities
- Tool Use Guide - General tool integration patterns
- Model Context Protocol - Advanced tool integration