Architecture
Lens is built as a modular research engine. It separates data collection from synthesis and output.
Collectors (Wikipedia, GitHub...) → Lens Engine (Dedupe & Synth) → Output (JSON, MD)
Core Data Flow Pipeline
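The three stages in the diagram can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration, not Lens's actual code: the record shape, the choice of "url" as the dedupe key, and the names dedupe, run_pipeline, wikipedia, and github are all hypothetical stand-ins.

```python
import json
from typing import Callable, Iterable

# Hypothetical record shape: {"source": ..., "url": ..., "title": ...}
Record = dict


def dedupe(records: Iterable[Record]) -> list[Record]:
    """Keep the first record seen for each URL (assumed dedupe key)."""
    seen: set[str] = set()
    unique: list[Record] = []
    for record in records:
        key = record["url"]
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


def run_pipeline(collectors: list[Callable[[], list[Record]]]) -> str:
    """Collect from every source, dedupe, and serialize the result to JSON."""
    collected = [record for collect in collectors for record in collect()]
    return json.dumps(dedupe(collected), indent=2)


# Stand-in collectors that return canned data instead of calling real APIs.
def wikipedia() -> list[Record]:
    return [{"source": "wikipedia", "url": "https://example.org/a", "title": "A"}]


def github() -> list[Record]:
    return [
        {"source": "github", "url": "https://example.org/a", "title": "A (mirror)"},
        {"source": "github", "url": "https://example.org/b", "title": "B"},
    ]


result = json.loads(run_pipeline([wikipedia, github]))
```

Because the stages only exchange plain lists of records, a collector or an output format could be swapped without touching the dedupe step, which is the separation the diagram describes.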
Module Interaction Flow
1. Entry: main.py starts and initializes the Config singleton.
2. Registry: Engine loads the collector registry from core/config.py.
3. Factory: Instantiates collectors inheriting from the BaseCollector abstract class.
   → run(): Executes fetch() → validate() → format() lifecycle.
4. Output: OutputHandler manages persistence to JSON/CSV/Parquet.
5. Report: Final synthesis and generation of the summary_report.json.
Core Framework
Lens follows an Abstract Factory pattern in which BaseCollector defines a strict contract for all data sources. Adding a new collector therefore requires implementing only the source-specific fetch() logic; error handling, retries, and logging are managed by the core framework.
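A minimal Python sketch of how such a contract might look, assuming a dictionary-based registry and a fixed retry count. Only the names BaseCollector, fetch(), validate(), format(), and run() come from the description above; COLLECTOR_REGISTRY, create_collector, max_retries, FlakyCollector, and the validation and formatting rules are hypothetical.

```python
import logging
from abc import ABC, abstractmethod
from typing import Any

logger = logging.getLogger("lens")


class BaseCollector(ABC):
    """Subclasses supply fetch(); run() owns retries, logging, and the lifecycle."""

    max_retries = 3  # assumed default, not taken from Lens

    @abstractmethod
    def fetch(self) -> list[dict[str, Any]]:
        """Source-specific retrieval: the only method a new collector must write."""

    def validate(self, records: list[dict[str, Any]]) -> list[dict[str, Any]]:
        # Assumed rule: drop records that lack a non-empty "title".
        return [r for r in records if r.get("title")]

    def format(self, records: list[dict[str, Any]]) -> list[dict[str, Any]]:
        # Assumed output shape: tag each record with the collector's class name.
        return [{"source": type(self).__name__, **r} for r in records]

    def run(self) -> list[dict[str, Any]]:
        """Execute fetch() -> validate() -> format(), retrying a failed fetch()."""
        name = type(self).__name__
        last_error = None
        for attempt in range(1, self.max_retries + 1):
            try:
                records = self.fetch()
            except Exception as exc:
                last_error = exc
                logger.warning("%s: attempt %d failed: %s", name, attempt, exc)
            else:
                return self.format(self.validate(records))
        raise RuntimeError(f"{name} failed after {self.max_retries} attempts") from last_error


class FlakyCollector(BaseCollector):
    """Hypothetical source that fails once, then returns two records."""

    def __init__(self) -> None:
        self.calls = 0

    def fetch(self) -> list[dict[str, Any]]:
        self.calls += 1
        if self.calls == 1:
            raise ConnectionError("transient failure")
        return [{"title": "Lens"}, {"title": ""}]


# Hypothetical registry: maps a configured name to a collector class.
COLLECTOR_REGISTRY: dict[str, type[BaseCollector]] = {"flaky": FlakyCollector}


def create_collector(name: str) -> BaseCollector:
    """Factory step: look up the class by name and instantiate it."""
    return COLLECTOR_REGISTRY[name]()


collector = create_collector("flaky")
records = collector.run()
```

In this sketch FlakyCollector contains no retry or logging code of its own: the first fetch() raises, run() logs the failure and retries, and the second attempt passes through validate() and format().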