lezzin/report-processor

Report Generation System (Laravel + Redis + SSE)

A high-performance, asynchronous report generation engine built with Laravel. It is designed to handle complex SQL queries and large datasets, and to give users real-time feedback through a decoupled SSE (Server-Sent Events) architecture.

🏗 Architecture Overview

The system follows a Domain-Driven Pipe & Filter architecture, focusing on memory efficiency and scalability.

graph TD
    Client[VueJS Frontend] <--> NestJS[NestJS SSE Bridge]
    Client -- "POST /api/reports/{type}" --> Laravel[Laravel API]
    Laravel -- "Dispatch Job" --> RedisQ[Redis Queue]
    RedisQ -- "Execute Pipeline" --> Worker[Laravel Worker]
    Worker -- "SQL Cursor" --> DB[(Database)]
    Worker -- "Flush Buffers" --> S3[Minio/S3 Storage]
    Worker -- "Publish Events" --> RedisPub[Redis Pub/Sub]
    RedisPub -- "Subscribe" --> NestJS

Core Components

  • Laravel API: Entry point for report requests, validation, and configuration.
  • Pipe & Filter Engine: A sequential pipeline that processes reports in stages (Start, Build, Count, Process, Zip, Finish).
  • Redis: Serves two roles, as the job queue and as the Pub/Sub message broker for progress events.
  • NestJS Bridge (External): Handles long-lived SSE connections, offloading concurrency from the main PHP application.
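
The Pipe & Filter idea can be illustrated with a minimal, framework-free sketch. The Pipe interface, the NamedPipe class, and the runPipeline helper are hypothetical names used only for illustration; the real pipes in app/Pipelines/CSV/Pipes may have a different contract. Only the stage names come from this README.

```php
<?php
declare(strict_types=1);

// Hypothetical stage contract: take the shared context, return it for the next stage.
interface Pipe
{
    public function handle(array $context): array;
}

// Placeholder stage that only records its name in the context.
final class NamedPipe implements Pipe
{
    public function __construct(private string $name) {}

    public function handle(array $context): array
    {
        // A real stage would build the query, count rows, write the CSV, and so on.
        $context['trace'][] = $this->name;
        return $context;
    }
}

// Runs the stages strictly in order, feeding each one the previous result.
function runPipeline(array $pipes, array $context): array
{
    foreach ($pipes as $pipe) {
        $context = $pipe->handle($context);
    }
    return $context;
}

$stages = array_map(
    fn (string $name): Pipe => new NamedPipe($name),
    ['StartReport', 'BuildQuery', 'CountRows', 'ProcessRows', 'ZipCsv', 'FinishReport']
);
$result = runPipeline($stages, ['process_id' => 'demo', 'trace' => []]);

echo implode(' -> ', $result['trace']), PHP_EOL;
// prints "StartReport -> BuildQuery -> CountRows -> ProcessRows -> ZipCsv -> FinishReport"
```

Because every stage shares one signature, a stage can be added, removed, or reordered without touching the others.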

📂 Project Structure Mapping

The codebase is organized by domain-specific responsibility:

app/
├── Actions/
│   ├── Files/          # Low-level file operations (Multipart S3 Uploads, Expiration)
│   └── Reports/        # Business actions (SSE Publishing, Status Changes)
├── Reports/            # The "Report Engines" (Mappers, Queries, DTOs)
│   ├── Abstract/       # Base contracts for ReportQuery, ReportMapper, and Filters
│   ├── ProposalGeneration/
│   └── ResumeGeneration/
├── Pipelines/
│   └── CSV/            # Pipe & Filter implementation for CSV generation
│       └── Pipes/      # Granular steps: BuildQuery, CountRows, ProcessRows, etc.
├── Services/
│   ├── Csv/            # CsvExportService (Memory-safe buffering & Streaming)
│   ├── ReportDispatch/ # Orchestrates request -> background job
│   └── Pagination/     # Handles "Preview" mode using the same Report Engines
├── Support/
│   └── Reports/        # ReportProcessManager (Heartbeats, Cancellation, Monitoring)
├── Jobs/               # Background processing & Event debouncing
├── ValueObjects/       # Immutable configuration objects (ReportConfiguration)
└── DTOs/               # Strongly-typed data transfer objects

🔄 System Flows

1. Request Dispatch Flow

User requests are normalized into a ReportConfiguration ValueObject before being queued.

sequenceDiagram
    participant U as Client
    participant C as ReportController
    participant S as ReportDispatchService
    participant J as ProcessReportQueryJob

    U->>C: POST /api/reports/proposals
    C->>C: Create ReportConfiguration
    C->>S: dispatch(Configuration)
    S->>S: Validate Duplicate Processes
    S->>S: Persist ExportProgress (Status: WAITING)
    S->>J: Dispatch(ReportProcessorData)
    S-->>U: 202 Accepted (process_id)
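
The duplicate check in this flow could be implemented by fingerprinting the normalized configuration. The sketch below is an assumption about how that might look, not the project's code: this ReportConfiguration is a simplified stand-in, the $active array stands in for the ExportProgress lookup, and the 409 response for duplicates is a guess (the flow above only documents 202 Accepted).

```php
<?php
declare(strict_types=1);

// Hypothetical, simplified stand-in for the project's ReportConfiguration value object.
final class ReportConfiguration
{
    private array $filters;

    public function __construct(
        private int $userId,
        private string $reportType,
        array $filters
    ) {
        // Normalize key order so logically identical filters compare equal.
        $this->filters = self::sortKeys($filters);
    }

    public function fingerprint(): string
    {
        $json = json_encode(
            [$this->userId, $this->reportType, $this->filters],
            JSON_THROW_ON_ERROR
        );
        return hash('sha256', $json);
    }

    private static function sortKeys(array $value): array
    {
        foreach ($value as $key => $item) {
            if (is_array($item)) {
                $value[$key] = self::sortKeys($item);
            }
        }
        ksort($value);
        return $value;
    }
}

// $active maps fingerprint => process_id for reports that are still running.
function dispatchReport(ReportConfiguration $config, array &$active): array
{
    $key = $config->fingerprint();
    if (isset($active[$key])) {
        // Assumed behavior: point the caller at the report that is already running.
        return ['status' => 409, 'process_id' => $active[$key]];
    }
    $active[$key] = bin2hex(random_bytes(8));
    return ['status' => 202, 'process_id' => $active[$key]];
}

$active = [];
$first  = dispatchReport(new ReportConfiguration(1, 'proposals', ['status' => 'open', 'year' => 2024]), $active);
$second = dispatchReport(new ReportConfiguration(1, 'proposals', ['year' => 2024, 'status' => 'open']), $active);
$other  = dispatchReport(new ReportConfiguration(2, 'proposals', ['status' => 'open', 'year' => 2024]), $active);

printf("%d %d %d\n", $first['status'], $second['status'], $other['status']);
// prints "202 409 202"
```

Sorting keys before hashing means the same filters sent in a different order are still treated as a duplicate, while a different user always gets a separate report.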

2. The CSV Pipeline (Pipe & Filter)

The heavy lifting is handled by a sequential pipeline where each stage has a single responsibility.

graph LR
    Start[StartReport] --> Build[BuildQuery]
    Build --> Count[CountRows]
    Count --> Process[ProcessRows]
    Process --> Zip[ZipCsv]
    Zip --> Finish[FinishReport]

    subgraph "ProcessRows Core Loop"
        Cursor[Eloquent Cursor] -- "Row-by-row" --> Buffer[Service Buffer]
        Buffer -- "Chunk reached?" --> Disk[Append to Local Disk]
        Disk -- "Every 3s" --> SSE[Queue SSE Progress]
        Disk -- "Every 10s" --> Beat[Redis Heartbeat]
    end
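
The ProcessRows core loop can be sketched in plain PHP as follows. This is a minimal illustration under stated assumptions, not the CsvExportService itself: processRows, its parameters, and the injected $clock are hypothetical, a generator stands in for the Eloquent cursor, and the throttle is only checked after a flush, as the diagram suggests.

```php
<?php
declare(strict_types=1);

// Streams rows into a CSV file, flushing the buffer every $chunkSize rows.
// Progress and heartbeat callbacks are throttled by elapsed time.
function processRows(
    iterable $rows,
    string $path,
    int $chunkSize,
    callable $onProgress,
    callable $onHeartbeat,
    float $progressEvery = 3.0,
    float $heartbeatEvery = 10.0,
    ?callable $clock = null
): int {
    $clock ??= static fn (): float => microtime(true);
    $handle = fopen($path, 'ab');
    if ($handle === false) {
        throw new RuntimeException("Cannot open {$path}");
    }

    $buffer = [];
    $written = 0;
    $lastProgress = $lastBeat = $clock();

    $flush = static function () use (&$buffer, &$written, $handle): void {
        foreach ($buffer as $row) {
            fputcsv($handle, $row, ',', '"', '');
        }
        $written += count($buffer);
        $buffer = [];
    };

    try {
        foreach ($rows as $row) {
            $buffer[] = $row;
            if (count($buffer) < $chunkSize) {
                continue;
            }
            $flush();
            $now = $clock();
            if ($now - $lastProgress >= $progressEvery) {
                $onProgress($written);
                $lastProgress = $now;
            }
            if ($now - $lastBeat >= $heartbeatEvery) {
                $onHeartbeat();
                $lastBeat = $now;
            }
        }
        $flush(); // write the final partial chunk
    } finally {
        fclose($handle);
    }
    return $written;
}

// Generator standing in for a database cursor: one row in memory at a time.
function fakeCursor(int $count): Generator
{
    for ($i = 1; $i <= $count; $i++) {
        yield [$i, "row {$i}"];
    }
}

// Fake clock that advances 2 seconds per call, so the demo is deterministic.
$tick = 0.0;
$clock = function () use (&$tick): float {
    $tick += 2.0;
    return $tick;
};

$path = tempnam(sys_get_temp_dir(), 'report_');
$progress = [];
$beats = 0;
$total = processRows(
    fakeCursor(12),
    $path,
    5,
    function (int $written) use (&$progress): void { $progress[] = $written; },
    function () use (&$beats): void { $beats++; },
    3.0,
    10.0,
    $clock
);

echo $total, PHP_EOL; // prints "12"
```

Memory stays bounded by the chunk size because at most $chunkSize rows are held in the buffer at once, regardless of how many rows the cursor yields.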

3. Real-time Communication (SSE)

Laravel workers publish progress events to Redis Pub/Sub; the NestJS bridge subscribes to them and pushes each event to the client over SSE. This keeps PHP workers from being blocked by slow SSE clients.

sequenceDiagram
    participant W as Laravel Worker
    participant R as Redis Pub/Sub
    participant N as NestJS Bridge
    participant C as Client

    W->>W: Processed 5000 rows
    W->>R: PUBLISH reports:events:user:{id} { "progress": 50% }
    R-->>N: Trigger Subscriber
    N->>C: Push SSE: progress
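
As a rough illustration of the publishing side, the sketch below builds the channel name and JSON payload for a progress event. Only the channel pattern reports:events:user:{id} comes from this README; buildProgressEvent and the payload fields (event, process_id, progress) are assumptions, and the real message format may differ.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: returns the Redis channel and the JSON payload to publish.
function buildProgressEvent(int $userId, string $processId, int $processed, int $total): array
{
    // Guard against division by zero when the row count is not known yet.
    $percent = $total > 0 ? (int) floor($processed * 100 / $total) : 0;

    return [
        'channel' => "reports:events:user:{$userId}",
        'payload' => json_encode([
            'event' => 'progress',
            'process_id' => $processId,
            'progress' => min(100, $percent),
        ], JSON_THROW_ON_ERROR),
    ];
}

$event = buildProgressEvent(42, 'abc123', 5000, 10000);
echo $event['channel'], ' ', $event['payload'], PHP_EOL;
// prints: reports:events:user:42 {"event":"progress","process_id":"abc123","progress":50}
```

Inside a Laravel worker, the pair would typically be handed to the Redis facade, for example Redis::publish($event['channel'], $event['payload']).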

🚀 Key Performance & Reliability Features

  1. Memory Management:
    • Eloquent Cursors: Streams database results without loading the entire collection.
    • Chunked Buffering: Rows are buffered in memory and flushed to disk in configurable sizes (e.g., 5000 rows) to keep memory usage flat.
  2. Concurrency Control:
    • Duplicate Prevention: Checks for active reports with identical filters for the same user before dispatching.
    • Heartbeat Monitoring: Workers send a heartbeat to Redis (every 10 seconds during row processing). A scheduled command (VerifyReportHeartbeatsCommand) detects "zombie" jobs whose heartbeat has stopped and marks them as failed.
  3. Resiliency:
    • SSE Debouncing: SendSseEventJob can be configured to drop intermediate updates if the queue is backed up, ensuring the UI always receives the latest state.
    • Multipart Uploads: Large zip files are streamed to S3 via multipart upload in 10 MB parts.
  4. Extensibility:
    • Boilerplate Generation: A custom Artisan command (make:report) scaffolds all necessary files (Mapper, Query, Controller, Request) so that every report type follows the same structure.
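
The heartbeat check described above can be sketched as a pure function. This is an assumed shape, not the logic of VerifyReportHeartbeatsCommand: findZombies is a hypothetical name, the heartbeats are passed in as a plain array instead of being read from Redis, and the 30-second timeout is an example value.

```php
<?php
declare(strict_types=1);

/**
 * @param array<string,int> $heartbeats process_id => Unix timestamp of the last heartbeat
 * @return string[] ids of processes whose last heartbeat is older than $timeoutSeconds
 */
function findZombies(array $heartbeats, int $now, int $timeoutSeconds): array
{
    $stale = array_filter(
        $heartbeats,
        static fn (int $lastBeat): bool => $now - $lastBeat > $timeoutSeconds
    );
    return array_keys($stale);
}

$heartbeats = [
    'report-a' => 1000, // 60 seconds old
    'report-b' => 1055, // 5 seconds old
    'report-c' => 970,  // 90 seconds old
];
$zombies = findZombies($heartbeats, 1060, 30);

echo implode(', ', $zombies), PHP_EOL; // prints "report-a, report-c"
```

A timeout of a few heartbeat intervals tolerates one or two missed beats before a job is declared dead.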

🛠 Usage & Extension

Adding a New Report Type

  1. Generate Boilerplate:
    php artisan make:report {DomainName} --report_name="Friendly Name" --queue="queue_name"
  2. Define Query: Implement formulateTables, formulateConditions, and formulateColumns in app/Reports/{DomainName}/{DomainName}Query.php.
  3. Define Mapper: Implement the map method in app/Reports/{DomainName}/{DomainName}Mapper.php to transform raw DB rows into CSV columns.
  4. Register Route: Add the route to routes/api.php as suggested by the command output:
    Route::match(['GET', 'POST'], 'your-report-name', YourReportController::class);
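
For step 3, a mapper might look roughly like the sketch below. Everything here is illustrative: the ReportMapper interface is a hypothetical stand-in for the base contract in app/Reports/Abstract, and the column names (id, customer_name, amount_cents, created_at) are invented for the example.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for the base mapper contract.
interface ReportMapper
{
    /** @return string[] ordered CSV cells for one database row */
    public function map(object $row): array;
}

final class ProposalMapper implements ReportMapper
{
    public function map(object $row): array
    {
        return [
            (string) $row->id,
            trim((string) $row->customer_name),
            // Convert integer cents to a fixed two-decimal amount.
            number_format($row->amount_cents / 100, 2, '.', ''),
            // Keep only the date part of a "Y-m-d H:i:s" timestamp.
            substr((string) $row->created_at, 0, 10),
        ];
    }
}

$row = (object) [
    'id' => 7,
    'customer_name' => ' Acme ',
    'amount_cents' => 123450,
    'created_at' => '2024-03-01 10:15:00',
];
$cells = (new ProposalMapper())->map($row);

echo implode(',', $cells), PHP_EOL; // prints "7,Acme,1234.50,2024-03-01"
```

Keeping formatting in the mapper lets the query class return raw column values and keeps the CSV layout in one place.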

Maintained by the Core Engineering Team.
