Lens

Architecture

Lens is built as a modular research engine. It separates data collection from synthesis and output.

Collectors (Wikipedia, GitHub...) → Lens Engine (Dedupe & Synth) → Output (JSON, MD)

Core Data Flow Pipeline

Module Interaction Flow

1. Entry: main.py starts and initializes the Config singleton.
2. Registry: The engine loads the collector registry from core/config.py.
3. Factory: Instantiates collectors that inherit from the BaseCollector abstract class.
   → run(): Executes the fetch() → validate() → format() lifecycle.
4. Output: OutputHandler manages persistence to JSON/CSV/Parquet.
5. Report: Final synthesis and generation of summary_report.json.
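
The registry, factory, and run() steps above can be sketched roughly as follows. This is a minimal illustration, not the actual Lens source: COLLECTOR_REGISTRY, create_collectors, run_pipeline, and DemoCollector are hypothetical names, and the validate() and format() rules shown are assumptions chosen only to make the lifecycle visible.

```python
from abc import ABC, abstractmethod


class BaseCollector(ABC):
    """Stand-in for the BaseCollector contract (details are assumed)."""

    @abstractmethod
    def fetch(self):
        """Return a list of raw record dicts from the source."""

    def validate(self, records):
        # Assumed rule: keep only records that carry a non-empty title.
        return [r for r in records if r.get("title")]

    def format(self, records):
        # Assumed output shape: tag each record with its collector name.
        return [{"source": type(self).__name__, **r} for r in records]

    def run(self):
        # The lifecycle from step 3: fetch() -> validate() -> format().
        return self.format(self.validate(self.fetch()))


class DemoCollector(BaseCollector):
    """Hypothetical collector that returns canned data instead of calling an API."""

    def fetch(self):
        return [{"title": "Lens"}, {"title": ""}]


# Hypothetical registry; in Lens this would be loaded from core/config.py.
COLLECTOR_REGISTRY = {"demo": DemoCollector}


def create_collectors(registry):
    # Factory step: instantiate every registered collector class.
    return [cls() for cls in registry.values()]


def run_pipeline(registry):
    # Run each collector and gather the formatted records for the output stage.
    results = []
    for collector in create_collectors(registry):
        results.extend(collector.run())
    return results
```

Calling run_pipeline(COLLECTOR_REGISTRY) here yields a single formatted record, because the record with the empty title is dropped by the assumed validate() rule.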

Core Framework

Lens follows an Abstract Factory pattern in which BaseCollector defines a strict contract for all data sources. Adding a new collector therefore only requires implementing the source-specific fetch() logic; error handling, retries, and logging are handled by the core framework.
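
To illustrate that division of labor, here is a minimal sketch of a base class that owns retries and logging while a subclass supplies only fetch(). The max_retries attribute, the logger name, and FlakyCollector are assumptions made for this example; the real BaseCollector may handle errors differently.

```python
import logging
from abc import ABC, abstractmethod

logger = logging.getLogger("lens.sketch")  # assumed logger name


class BaseCollector(ABC):
    max_retries = 3  # assumed default, not taken from the Lens source

    @abstractmethod
    def fetch(self):
        """Source-specific logic: the only method a new collector must write."""

    def run(self):
        # The framework side: retry fetch() and log each failure.
        last_error = None
        for attempt in range(1, self.max_retries + 1):
            try:
                return self.fetch()
            except Exception as exc:
                last_error = exc
                logger.warning(
                    "%s attempt %d/%d failed: %s",
                    type(self).__name__, attempt, self.max_retries, exc,
                )
        raise RuntimeError(
            f"{type(self).__name__} failed after {self.max_retries} attempts"
        ) from last_error


class FlakyCollector(BaseCollector):
    """Hypothetical collector that fails twice before succeeding."""

    def __init__(self):
        self.calls = 0

    def fetch(self):
        self.calls += 1
        if self.calls < 3:
            raise ConnectionError("temporary failure")
        return [{"title": "Lens"}]
```

FlakyCollector contains no retry or logging code of its own; FlakyCollector().run() succeeds on the third attempt because the base class handles the two earlier failures.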