Lens

arXiv

The arXiv collector specializes in academic research, providing access to more than 2 million preprints in physics, mathematics, computer science, and related fields.

Capabilities

Paper Search: Search by title, author, or category
PDF Processing: Extract text from academic PDFs
Metadata: Citations and references
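
As a rough illustration of what a title, author, or category search looks like at the arXiv level, the sketch below builds a query URL for the public arXiv API using its ti:, au:, and cat: field prefixes. The function name build_query_url and its parameters are made up for this example; how Lens composes its queries internally is not described here, so treat this as an assumption rather than the collector's actual code.

```python
from urllib.parse import urlencode

# Public arXiv API endpoint; not a Lens-specific URL.
ARXIV_API_URL = "http://export.arxiv.org/api/query"


def build_query_url(title=None, author=None, category=None, max_results=10):
    """Build an arXiv API query URL from optional title, author, and category filters.

    Hypothetical helper for illustration only; it is not part of Lens.
    """
    clauses = []
    if title:
        clauses.append(f'ti:"{title}"')    # ti: matches the paper title
    if author:
        clauses.append(f'au:"{author}"')   # au: matches an author name
    if category:
        clauses.append(f"cat:{category}")  # cat: matches a subject category, e.g. cs.LG
    if not clauses:
        raise ValueError("at least one of title, author, or category is required")

    params = {
        "search_query": " AND ".join(clauses),
        "start": 0,
        "max_results": max_results,
    }
    return f"{ARXIV_API_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_query_url(author="Hinton", category="cs.LG", max_results=5))
```

The clauses are joined with AND, so supplying both an author and a category narrows the result set rather than widening it.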

How Configuration Works

All collectors in Lens adhere to a unified configuration pattern. Instead of hardcoding values, you inject parameters at runtime using the lens.config.yaml file or directly via the ResearchAgent constructor.

Dynamic Configuration

When you initialize a collector, it inspects the configuration object passed to it. This lets you adjust behavior, such as target languages, result limits, or API-specific flags, without modifying the core collector logic. Refer to the central Configuration Reference for the full list of parameters this collector supports.
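
The pattern described above can be sketched as follows: defaults live in one place, runtime overrides are merged on top, and the collector only reads the resulting object. Every name here (ArxivConfig, load_config, ArxivCollectorSketch, and the parameters max_results, language, include_pdf_text) is hypothetical and chosen for illustration; the real parameter names are the ones listed in the Configuration Reference.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ArxivConfig:
    """Hypothetical parameter set; not the actual Lens schema."""
    max_results: int = 10            # assumed result limit
    language: str = "en"             # assumed target language
    include_pdf_text: bool = False   # assumed API-specific flag


def load_config(overrides=None):
    """Merge runtime overrides onto the defaults, rejecting unknown keys."""
    overrides = overrides or {}
    known = {f.name for f in fields(ArxivConfig)}
    unknown = set(overrides) - known
    if unknown:
        raise KeyError(f"unsupported parameters: {sorted(unknown)}")
    return ArxivConfig(**overrides)


class ArxivCollectorSketch:
    """Stand-in collector that inspects its configuration at construction time."""

    def __init__(self, config=None):
        self.config = load_config(config)

    def describe(self):
        # Behavior is derived from the config rather than hardcoded values.
        mode = "with PDF text" if self.config.include_pdf_text else "metadata only"
        return f"up to {self.config.max_results} results, {mode}"


if __name__ == "__main__":
    collector = ArxivCollectorSketch({"max_results": 25})
    print(collector.describe())
```

Rejecting unknown keys up front is a design choice of this sketch: a misspelled parameter fails loudly at initialization instead of being silently ignored.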

Schema Mapping

Source Field    Lens Field        Type
title           title             string
summary         abstract          text
id              doi_or_id         string
published       published_date    datetime
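
The mapping in the table can be expressed as a small translation step. The sketch below assumes the arXiv entry has already been parsed into a plain dictionary and that published uses the UTC timestamp format arXiv returns in its Atom feed (for example 2020-01-01T00:00:00Z). The names FIELD_MAP and to_lens_record are invented for this example, and the sample entry is placeholder data, not a real paper.

```python
from datetime import datetime, timezone

# Source field -> Lens field, taken from the schema mapping table.
FIELD_MAP = {
    "title": "title",
    "summary": "abstract",
    "id": "doi_or_id",
    "published": "published_date",
}


def to_lens_record(entry):
    """Translate a parsed arXiv entry (a dict) into a Lens-style record.

    Hypothetical helper; the actual Lens mapping code is not shown in this page.
    """
    record = {}
    for source_field, lens_field in FIELD_MAP.items():
        value = entry[source_field]
        if source_field == "published":
            # Parse the UTC timestamp into a timezone-aware datetime.
            value = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(
                tzinfo=timezone.utc
            )
        else:
            # Collapse line breaks and repeated spaces inside text fields.
            value = " ".join(value.split())
        record[lens_field] = value
    return record


if __name__ == "__main__":
    sample = {  # placeholder values for illustration
        "title": "An Example\n  Title",
        "summary": "A short example abstract.",
        "id": "http://arxiv.org/abs/0000.00000v1",
        "published": "2020-01-01T00:00:00Z",
    }
    print(to_lens_record(sample))
```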

Rate Limiting

arXiv requires a 3-second delay between API requests. The Lens engine handles this automatically through its built-in throttling mechanism.
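
To make the idea concrete, here is a minimal sketch of a throttle that enforces a minimum interval between calls. It is not the Lens engine's implementation, which is not shown here; the class name Throttle and its parameters are assumptions for illustration. The clock and sleep functions are injectable so the logic can be exercised without actually waiting.

```python
import time


class Throttle:
    """Enforce a minimum interval between consecutive calls to wait().

    Illustrative only; not the built-in Lens throttling mechanism.
    """

    def __init__(self, min_interval=3.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self):
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


if __name__ == "__main__":
    throttle = Throttle(min_interval=3.0)
    for page in range(3):
        throttle.wait()  # the first call returns immediately
        print(f"requesting page {page}")
```

Using a monotonic clock rather than wall-clock time keeps the interval correct even if the system clock is adjusted while the collector is running.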