Lens

Python SDK

Build sophisticated research workflows directly in your Python applications.

The BaseCollector Contract

All collectors must inherit from the BaseCollector abstract class. This defines the standard interface and automated workflow for every data source in Lens.

python
from abc import ABC, abstractmethod

class BaseCollector(ABC):
    def __init__(self, name: str, config: dict):
        self.name = name
        self.config = config
        self.data = []

    @abstractmethod
    def fetch(self) -> dict:
        """Retrieve raw data from the source."""
        pass

    def validate(self) -> bool:
        """Validate response integrity."""
        return True

    def format(self) -> dict:
        """Standardize raw response for the ResearchGraph."""
        pass

    def run(self) -> dict:
        """Core lifecycle: fetch -> validate -> format."""
        data = self.fetch()
        if self.validate():
            return self.format()
        return None

Framework Lifecycle

1. Initialization

Framework injects global config and credentials from the registry into the collector instance.

2. Execution

The run() method orchestrates the internal pipeline while managing retries and logging.

3. Normalization

Standardizes heterogeneous API responses into a unified ResearchItem object.

Working with the Result

The SDK provides structured access to all data retrieved during the research flow.

python
# Access the ResearchGraph
graph = result.graph

for node in graph.nodes:
    print(f"[{node.source}] {node.title}")

# Export to various formats
result.to_markdown("research_report.md")
result.to_json("data_dump.json")

Async Support

Lens fully supports asyncio. For high-throughput applications, use await lens.aquery() to execute queries without blocking the event loop.

Error Handling

Always wrap your research tasks in a try-except block to handle rate limits and API failures gracefully.

python
from lens.exceptions import RateLimitError, APIError

try:
    result = lens.query("...")
except RateLimitError as e:
    print(f"Hit rate limit for {e.source}. Retrying in {e.retry_after}s.")
except APIError:
    print("General research failure.")