NAGIX AI Technical Architecture v1.5

KB Type: Technical Architecture Article

Source: NAGIX-AI Technical Architecture v1.5 PDF

Scope: SaaS architecture, deployment model, core services, storage, identity, external AI services, secrets, monitoring, and operational responsibility.

Status: Draft article prepared from a full PDF source for KB review.

NAGIX AI, AWS, EKS, MongoDB, Amazon S3, Keycloak, Azure OCR, OpenAI, PDF/UA, WCAG

💡
Purpose: This article converts the source PDF into a KB-ready technical summary. It does not replace a security design review, contractual Data Processing Agreement, cloud configuration evidence, or vendor compliance documentation.

1. Overview

NAGIX AI is described as a SaaS system hosted on AWS that uses AI to convert documents automatically into accessible formats, including PDF/UA. It is intended to process individual documents, including long and complex ones.

The platform supports common office/document formats such as PDF, Word, PowerPoint, and Excel. It is designed to identify document structure elements such as headings, tables, images, reading order, and other accessibility-related elements required for accessible output.

Business value: The target use case is large-scale document remediation where organizations need to reduce manual accessibility work and improve alignment with accessibility standards and regulations such as PDF/UA and WCAG.

2. Architecture at a Glance

The architecture is built around an AWS-hosted Kubernetes environment. The diagram in the source PDF shows Cloudflare, AWS WAF, NGINX ingress, Amazon EKS, an internal Kubernetes cluster, internal services, customer-specific storage, and integrations with Azure and OpenAI services.

[Figure: NAGIX AI architecture diagram, extracted from source PDF page 2.]

2.1 High-Level Layers

Layer | Purpose | Main Elements
Edge and ingress | Protect and route incoming access to internal services. | Cloudflare, AWS WAF, NGINX ingress
Compute and orchestration | Run the platform services in a managed Kubernetes environment. | Amazon EKS / Kubernetes Cluster
Core application services | Receive requests, coordinate processing, expose secure APIs, and manage operational workflows. | NAGIX-AI service, Backoffice API, OCR service, Shield Proxy
Identity and administration | Authenticate users, manage roles, groups, companies, and authorized access. | Keycloak, Backoffice Front Application
Data and file storage | Store technical metadata and document files with tenant separation. | MongoDB, Amazon S3 customer buckets
External AI/OCR services | Perform OCR, document analysis, and image analysis when required. | Azure Content Understanding, Azure Document Intelligence, OpenAI

3. Deployment Model

The source document separates deployment responsibilities into four numbered areas shown in the architecture diagram.

Diagram Mark | Deployment Meaning | KB Interpretation
1 | Managed in the supplier environment. Can also be deployed in the customer's organizational EKS environment. | The main platform stack is supplier-managed by default, with a possible customer-hosted EKS model.
2-3 | Third-party services operated and managed only in the supplier's infrastructure and not intended for customer-side installation. | Azure OCR/document services and OpenAI image analysis remain external service dependencies.
4 | Can be installed either in the customer environment or in the supplier environment, depending on target architecture and project requirements. | The portal/application access layer has flexible deployment placement.

ℹ️
Architecture choice: The same logical product can be consumed as supplier-managed SaaS or integrated into a customer-hosted model for selected components. The final ownership model must be defined during implementation.
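
The placement rules in the deployment table can be captured as a small validation step when drafting a target architecture. The following is a minimal sketch, not part of the source PDF: the `Placement` enum, the `ALLOWED_PLACEMENTS` mapping, and the `validate_plan` helper are hypothetical names introduced for illustration only.

```python
from enum import Enum


class Placement(Enum):
    SUPPLIER = "supplier"
    CUSTOMER = "customer"


# Placements each diagram mark allows, as summarized in the deployment table.
# The dictionary keys mirror the diagram marks; the structure is an assumption.
ALLOWED_PLACEMENTS = {
    "1": {Placement.SUPPLIER, Placement.CUSTOMER},   # main platform stack
    "2-3": {Placement.SUPPLIER},                     # third-party AI/OCR services
    "4": {Placement.SUPPLIER, Placement.CUSTOMER},   # portal/application access layer
}


def validate_plan(plan):
    """Return a list of problems; an empty list means the plan fits the table."""
    problems = []
    for mark, allowed in ALLOWED_PLACEMENTS.items():
        chosen = plan.get(mark)
        if chosen is None:
            problems.append(f"mark {mark}: no placement chosen")
        elif chosen not in allowed:
            problems.append(f"mark {mark}: {chosen.value} placement is not allowed")
    return problems


plan = {"1": Placement.SUPPLIER, "2-3": Placement.SUPPLIER, "4": Placement.CUSTOMER}
print(validate_plan(plan))  # prints []
```

A check like this only confirms that a proposed plan is consistent with the table; the final ownership model still has to be agreed during implementation.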

4. Component Map

4.1 NAGIX-AI Core Service

The NAGIX-AI service is the central service layer. It exposes secure APIs to system consumers and integrations, receives requests, manages processing coordination, and orchestrates the platform components involved in document conversion.

4.2 Internal OCR Service

The architecture includes an internal OCR service running inside the Kubernetes cluster. Its role is document processing and information extraction as part of the overall conversion workflow.

4.3 Shield Proxy

The diagram includes an internal Shield Proxy layer inside the Kubernetes cluster. In the KB context, this should be treated as an internal mediation/protection layer between platform services unless more detailed design documentation defines its exact routing and enforcement responsibilities.

4.4 Keycloak

Keycloak is the identity and access management component for the Backoffice. It handles authentication, authorization, roles, and user groups. The source document states support for OpenID Connect, OAuth 2.0, SSO, and integration with enterprise identity systems such as Microsoft Entra ID or Active Directory.

4.5 Backoffice Front Application

The Backoffice Front Application is an Angular-based web application used as the administration interface. It can be installed in the supplier environment or customer environment. It supports management of companies, authorized users, roles, permissions, and administrative/operational actions.

4.6 Backoffice API

The Backoffice API is the backend service for the administration layer. It exposes secure APIs for company, user, role, and permission management. It is also responsible for request handling, access enforcement, data validation, and saving/updating management information in the system databases.

4.7 NAGIX-AI Portal

The NAGIX-AI Portal is the user-facing access layer. It communicates with backend services through secure APIs. Users can create and manage business processes, view processed accessible documents, retrieve document lists, perform management/control actions, and track workflow status according to their authorization level.

5. Data Storage, Retention, and Tenant Isolation

5.1 MongoDB

The system uses MongoDB databases managed in AWS. According to the source document, each customer receives a dedicated and separate database instance to support full tenant isolation. MongoDB stores technical and operational metadata only, not the document content itself.

Stored in MongoDB:
  • Request IDs
  • Intake and processing times
  • Processing status
  • Services and components involved in processing
  • Monitoring, troubleshooting, and audit information

Not stored in MongoDB:
  • Full document content
  • Full file payloads
  • Unnecessary document data beyond operational metadata
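
To make the metadata-only rule concrete, the sketch below models one operational record and refuses binary values before the record is written. The source PDF does not describe the actual schema, so the `ProcessingRecord` fields and the `to_metadata_document` helper are assumptions chosen to match the categories listed above.

```python
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class ProcessingRecord:
    """Hypothetical metadata record; field names are illustrative only."""
    request_id: str
    received_at: datetime
    status: str
    services_involved: List[str] = field(default_factory=list)
    completed_at: Optional[datetime] = None
    audit_notes: List[str] = field(default_factory=list)


def to_metadata_document(record):
    """Convert a record to a plain dict, rejecting any binary value."""
    doc = asdict(record)
    for key, value in doc.items():
        if isinstance(value, (bytes, bytearray)):
            raise ValueError(f"{key}: binary payloads do not belong in the metadata store")
    return doc


record = ProcessingRecord(
    request_id="req-0001",
    received_at=datetime(2024, 1, 1, tzinfo=timezone.utc),
    status="processing",
    services_involved=["nagix-ai", "ocr"],
)
document = to_metadata_document(record)
```

The resulting dictionary holds identifiers, timestamps, and status only, which is the shape a document store such as MongoDB would receive under this assumption.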

5.2 Amazon S3

Amazon S3 is used for secure document and file storage. Each customer receives a dedicated and separate bucket, enabling separation between customer data. The default retention period is up to 30 days, and the retention duration can be adjusted according to customer requirements.
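
One common way to enforce a retention window on an S3 bucket is a lifecycle expiration rule. The source PDF does not say which mechanism NAGIX AI uses, so the following is only a sketch under that assumption; `build_retention_lifecycle` is a hypothetical helper, and the dictionary follows the shape accepted by the S3 lifecycle configuration API.

```python
def build_retention_lifecycle(days=30, prefix=""):
    """Build an S3 lifecycle configuration that expires objects after `days` days.

    The returned dict could be passed as the LifecycleConfiguration argument of
    boto3's put_bucket_lifecycle_configuration; that call is not made here.
    """
    if days < 1:
        raise ValueError("retention must be at least one day")
    return {
        "Rules": [
            {
                "ID": f"expire-after-{days}-days",
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},   # empty prefix applies to the whole bucket
                "Expiration": {"Days": days},
            }
        ]
    }


default_config = build_retention_lifecycle()        # 30-day default from the source
custom_config = build_retention_lifecycle(days=7)   # customer-specific override
```

Keeping the retention period as a parameter mirrors the statement that the duration can be adjusted per customer.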

The source document also states that, where required, the system can be configured to work directly with the customer's S3 environment so that files do not need to be stored in the supplier's storage environment.

⚠️
Review point: For production onboarding, retention, deletion timing, bucket ownership, encryption keys, access logs, and restore procedures must be explicitly agreed and verified. The article summarizes the architecture; it does not prove the actual live cloud configuration.

6. External AI and OCR Services

The source architecture consumes external services for document OCR, document analysis, and image analysis.

External Service | Purpose | Data Handling Notes from Source
Azure Content Understanding and Azure Document Intelligence | OCR, document analysis, and structured data extraction. | Communication is performed over TLS/HTTPS. The source document states use of Private Endpoint / Private Channel and temporary storage in Microsoft Azure West Europe during processing.
OpenAI | Image analysis only. | Only the images required for processing are sent, not the full document, following a data minimization approach.

6.1 Azure Services

Azure Content Understanding and Azure Document Intelligence are used for OCR, document analysis, and extraction of structured information. Data transfer between AWS and Azure is described as encrypted using TLS/HTTPS. The document states that access to these services is protected through Private Endpoint and Private Channel so the services are not exposed to the public internet and are accessible only from authorized addresses and networks.

During processing, documents are stored temporarily in the Microsoft Azure service environment in the West Europe region. The source document states that documents are deleted according to the service policy after processing and are not used by Microsoft to train, improve, or adapt AI models.

6.2 OpenAI Image Analysis

OpenAI is used only for image analysis. The source document emphasizes data minimization: only required images are sent to the service, without sending the complete document or unrelated information.
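
The data minimization approach can be illustrated as a filtering step that runs before any external call. The sketch below assumes a parsed document represented as a list of element dictionaries with `id`, `type`, and `data` keys; that structure, the `needs_description` flag, and the `select_images_for_analysis` function are hypothetical and not taken from the source PDF.

```python
def select_images_for_analysis(elements):
    """Return only the image elements that still need analysis.

    Text, tables, and other element types are never included in the result,
    so they cannot be sent to the image analysis service by accident.
    """
    return [
        {"element_id": element["id"], "image_bytes": element["data"]}
        for element in elements
        if element.get("type") == "image" and element.get("needs_description", True)
    ]


parsed_document = [
    {"id": "h1", "type": "heading", "data": "Annual Report"},
    {"id": "img1", "type": "image", "data": b"\x89PNG..."},
    {"id": "t1", "type": "table", "data": [["a", "b"]]},
    {"id": "img2", "type": "image", "data": b"\x89PNG...", "needs_description": False},
]
outbound = select_images_for_analysis(parsed_document)
```

In this example only `img1` is selected; the heading, the table, and the image that already has a description stay inside the platform.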

🔧
Implementation validation: For regulated customers, confirm the exact vendor terms, region, retention policy, data usage policy, private connectivity design, and secret ownership model before go-live.

7. Security Controls

7.1 Access Control

Access to the Backoffice and Portal is limited to authorized users. User visibility and available actions are controlled by authentication and authorization mechanisms, so each user can access only the information and operations allowed by their role.
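
Role-based visibility of this kind usually reduces to a lookup from roles to permitted actions. The following is a minimal sketch with invented role and permission names; the source PDF does not list the actual roles, and in a real deployment the roles would come from the identity provider rather than a hard-coded table.

```python
# Hypothetical role-to-permission mapping, for illustration only.
ROLE_PERMISSIONS = {
    "viewer": {"documents:list", "documents:view"},
    "operator": {"documents:list", "documents:view", "processes:manage"},
    "admin": {"documents:list", "documents:view", "processes:manage", "users:manage"},
}


def is_allowed(user_roles, permission):
    """Return True if any of the user's roles grants the permission."""
    return any(permission in ROLE_PERMISSIONS.get(role, set()) for role in user_roles)


can_view = is_allowed(["viewer"], "documents:view")
can_manage_users = is_allowed(["viewer", "operator"], "users:manage")
```

Unknown roles grant nothing, so a user whose roles are missing from the mapping is denied by default.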

7.2 Tenant Isolation

The architecture describes tenant isolation at both metadata and file-storage levels: customer-specific MongoDB instances and customer-specific Amazon S3 buckets. The design goal is to prevent sharing or access between customer data environments.
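
Per-tenant resources are often derived from a validated tenant identifier so that one customer's request can never resolve to another customer's database or bucket. The sketch below assumes a naming convention (`nagix-` and `nagix-docs-` prefixes) that does not appear in the source PDF; `tenant_resources` is a hypothetical helper.

```python
import re

# Lowercase letters, digits, and inner hyphens only, 1 to 32 characters.
# This keeps the derived bucket name within S3 naming rules.
TENANT_ID = re.compile(r"[a-z0-9](?:[a-z0-9-]{0,30}[a-z0-9])?")


def tenant_resources(tenant_id):
    """Map a tenant identifier to its dedicated database and bucket names."""
    if not TENANT_ID.fullmatch(tenant_id):
        raise ValueError(f"invalid tenant id: {tenant_id!r}")
    return {
        "mongo_instance": f"nagix-{tenant_id}",       # assumed naming convention
        "s3_bucket": f"nagix-docs-{tenant_id}",       # assumed naming convention
    }


acme = tenant_resources("acme")
globex = tenant_resources("globex")
```

Rejecting malformed identifiers up front avoids path or name tricks that could otherwise point a request at a different tenant's resources.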

7.3 Least Privilege

The source document refers to the principle of Least Privilege. Access to databases, S3 storage, internal services, and service-specific data should be limited to only those components and services that need it to perform their role.
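
For S3, least privilege typically means a policy scoped to a single bucket and a short list of actions. The following sketch builds such a policy document in the standard IAM JSON format; the bucket name, the statement IDs, and the chosen action list are assumptions, not values from the source PDF.

```python
import json


def bucket_scoped_policy(bucket):
    """Build an IAM policy that allows object access to one bucket only."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ListTenantBucket",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                "Sid": "ReadWriteTenantObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


policy = bucket_scoped_policy("example-tenant-bucket")  # hypothetical bucket name
policy_json = json.dumps(policy, indent=2)
```

Because no wildcard bucket or wildcard action appears, a service holding this policy cannot read or write any other customer's bucket.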

7.4 Encryption

Area | Control
Files at rest in S3 | Automatic AWS encryption at rest is described in the source document.
Data in transit | TLS/HTTPS is used for communication, including AWS-to-Azure communication.
Azure processing | The source document states data is encrypted during storage and processing according to Azure security mechanisms.

7.5 Secrets Management

The system uses secure secrets management for access keys, tokens, external service credentials, and cloud permissions. The current model described in the PDF manages required secrets in Kubernetes with access restricted to authorized services. The architecture also supports AWS Secrets Manager for secure storage, permission management, and centralized control.

The document defines a separation between customer-owned secrets and supplier-owned secrets. Customer-related secrets can remain under customer management and control, while supplier-service secrets are managed in the supplier's secured environment. Integration with customer enterprise secret management systems is also supported.

💡
Ownership model: During implementation, all secret types should be mapped and assigned to an ownership model: customer-owned, supplier-owned, or shared/integration-specific.
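
The ownership mapping can be kept as a simple inventory that is checked for gaps before go-live. This is a sketch only: the secret names below are invented examples, and the `Owner` enum and `unassigned_secrets` helper are hypothetical.

```python
from enum import Enum


class Owner(Enum):
    CUSTOMER = "customer-owned"
    SUPPLIER = "supplier-owned"
    SHARED = "shared/integration-specific"


def unassigned_secrets(inventory, ownership):
    """Return the secrets in the inventory that have no assigned owner."""
    return sorted(name for name in inventory if name not in ownership)


# Invented example inventory; real secret names must come from the implementation.
inventory = ["customer-s3-access-key", "azure-ocr-api-key", "sso-client-secret", "openai-api-key"]
ownership = {
    "customer-s3-access-key": Owner.CUSTOMER,
    "azure-ocr-api-key": Owner.SUPPLIER,
    "sso-client-secret": Owner.SHARED,
}
missing = unassigned_secrets(inventory, ownership)
print(missing)  # prints ['openai-api-key']
```

Any name returned by the check is a secret that still needs an explicit owner in the implementation plan.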

8. Monitoring and Operational Responsibility

The system includes monitoring, control, and alerting capabilities for service availability, performance, processing health, and abnormal system events. Operational responsibility depends on the deployment model.

Deployment Model | Operational Responsibility
Supplier-managed SaaS | The supplier is responsible for monitoring, maintenance, incident handling, alert management, and availability control.
Customer-hosted environment | The customer is responsible for monitoring infrastructure, services, and resources using the customer's tools, procedures, and organizational controls.

In a customer-hosted model, controlled information sharing, alerts, or support access can be defined for troubleshooting and support, subject to the customer's information security policy.

9. QA / Architecture Review Checklist

Use this checklist when reviewing a real deployment, implementation proposal, or customer-facing architecture response.

Area | Question | Expected Evidence
Deployment model | Which components are supplier-hosted, customer-hosted, or third-party? | Approved target architecture diagram and responsibility matrix.
Tenant isolation | Does each customer have separate MongoDB and S3 resources? | Cloud resource mapping, naming convention, IAM policy evidence.
Retention | Is the default 30-day file retention accepted or changed? | Customer-approved retention setting and deletion verification process.
External services | Which data is sent to Azure and OpenAI? | Data flow diagram, payload examples, vendor policy references.
Private connectivity | Are Azure services reachable only through authorized private paths? | Private Endpoint / network configuration evidence.
Identity | Is Keycloak integrated with customer SSO, Entra ID, or Active Directory? | OIDC/OAuth configuration, MFA policy, role mapping.
Secrets | Which secrets are customer-owned and which are supplier-owned? | Secrets ownership matrix and access policy.
Monitoring | Who monitors service health and handles alerts? | Runbook, SLA/SLO agreement, escalation path.
Accessibility output | Which standards are used to validate output? | PDF/UA, WCAG, PAC/validator results, remediation reports.

⚠️
Do not assume: This article should not be used as proof that a specific production environment is configured correctly. Treat it as an architecture summary until validated against live cloud settings, logs, policies, and customer contracts.

10. Source Mapping

PDF Page | Used For
Page 1 | Title, product name, version 1.5, NAGIX AI branding.
Page 2 | Cloud architecture overview and architecture diagram.
Page 3 | Deployment model, NAGIX-AI core, OCR service, MongoDB, Amazon S3, Keycloak, Backoffice components.
Page 4 | S3 storage, retention, encryption, Keycloak, Backoffice Front Application, Backoffice API.
Page 5 | NAGIX-AI Portal, Azure OCR/document analysis services, OpenAI image analysis, data minimization.
Page 6 | Secrets management, monitoring, operational responsibility for supplier-managed SaaS and customer-hosted models.
Page 7 | Product summary, supported file types, accessibility targets, PDF/UA and WCAG context.

Recommended KB Tags

NAGIX-AI, Technical Architecture, AWS, EKS, Kubernetes, MongoDB, S3, Keycloak, Azure Document Intelligence, OpenAI, PDF/UA, WCAG, Accessibility, SaaS, Security

📥
KB Import Note: This file is a standalone Custom HTML article. Replace this block with an internal KB source link after upload.