
YeboVerify Verification Pipeline

Deep dive into the face comparison and OCR pipeline.

Pipeline Overview

┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Upload    │────►│   Storage   │────►│  Processing │
│   Images    │     │   (R2)      │     │   Queue     │
└─────────────┘     └─────────────┘     └──────┬──────┘
                                               │
                    ┌──────────────────────────┤
                    │                          │
              ┌─────▼─────┐              ┌─────▼─────┐
              │Rekognition│              │  Gemini   │
              │  (Face)   │              │  (OCR)    │
              └─────┬─────┘              └─────┬─────┘
                    │                          │
                    └──────────┬───────────────┘
                               │
                         ┌─────▼─────┐
                         │ Decision  │
                         │  Engine   │
                         └─────┬─────┘
                               │
                         ┌─────▼─────┐
                         │  Webhook  │
                         │  Delivery │
                         └───────────┘

Stage 1: Image Upload

Accepted Formats

  • JPEG/JPG
  • PNG
  • WebP
  • HEIC/HEIF

Size Limits

  • Maximum: 10MB per file
  • Recommended: 1-5MB for optimal processing

Quality Requirements

  • ID Document: Clear, all text legible
  • Selfie: Face clearly visible, good lighting
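
The format and size rules above could be checked before a file is accepted. The following is a minimal sketch, assuming the client reports a MIME type and that 10MB means binary megabytes; `validateUpload` and `UploadCheck` are hypothetical names, not part of the YeboVerify API.

```typescript
// Hypothetical pre-upload check based on the accepted formats and size limit above.
const ACCEPTED_MIME_TYPES = new Set([
  'image/jpeg',
  'image/png',
  'image/webp',
  'image/heic',
  'image/heif',
]);

// Assumption: "10MB" is read as 10 * 1024 * 1024 bytes.
const MAX_FILE_BYTES = 10 * 1024 * 1024;

interface UploadCheck {
  ok: boolean;
  reason?: string;
}

function validateUpload(mimeType: string, sizeBytes: number): UploadCheck {
  if (!ACCEPTED_MIME_TYPES.has(mimeType.toLowerCase())) {
    return { ok: false, reason: `Unsupported format: ${mimeType}` };
  }
  if (sizeBytes <= 0) {
    return { ok: false, reason: 'File is empty' };
  }
  if (sizeBytes > MAX_FILE_BYTES) {
    return { ok: false, reason: 'File exceeds the 10MB limit' };
  }
  return { ok: true };
}
```

A check like this only covers format and size. The quality requirements (legible text, visible face) can only be judged later in the pipeline.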

Stage 2: Storage

Images are stored in Cloudflare R2:

verifications/
└── {verificationId}/
    ├── id-front-{timestamp}
    ├── id-back-{timestamp}   (optional)
    └── selfie-{timestamp}

  • Private bucket - No public access
  • Encrypted at rest
  • Region: Auto (Cloudflare's global network)
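
The key layout above can be produced by a small helper. This is a sketch only: `buildObjectKey` is a hypothetical name, and the timestamp format (epoch milliseconds) is an assumption, since the layout does not specify it.

```typescript
type ImageKind = 'id-front' | 'id-back' | 'selfie';

// Builds an object key following the verifications/{verificationId}/... layout.
// Assumption: {timestamp} is epoch milliseconds; the real format may differ.
function buildObjectKey(
  verificationId: string,
  kind: ImageKind,
  timestamp: number = Date.now(),
): string {
  return `verifications/${verificationId}/${kind}-${timestamp}`;
}
```

For example, `buildObjectKey('vrf_abc123', 'selfie', 1705312800000)` yields `verifications/vrf_abc123/selfie-1705312800000`.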

Stage 3: Face Comparison (Rekognition)

AWS Rekognition CompareFaces

typescript
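import { RekognitionClient, CompareFacesCommand } from '@aws-sdk/client-rekognition';

// Region and credentials are read from the environment
const rekognitionClient = new RekognitionClient({});

// Rekognition accepts JPEG and PNG bytes only, so WebP/HEIC uploads
// must be converted before this call.
const command = new CompareFacesCommand({
  SourceImage: { Bytes: idFrontBuffer }, // ID photo
  TargetImage: { Bytes: selfieBuffer },  // Selfie
  SimilarityThreshold: 0,                // Return every match; the decision engine applies the cut-offs
  QualityFilter: 'AUTO'                  // Let Rekognition filter out low-quality faces
});

const response = await rekognitionClient.send(command);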

Response Analysis

typescript
if (response.FaceMatches && response.FaceMatches.length > 0) {
  const match = response.FaceMatches[0];

  // Face is optional in the SDK types, hence the optional chaining
  return {
    similarity: match.Similarity,         // 0-100
    confidence: match.Face?.Confidence,   // Detection confidence
    boundingBox: match.Face?.BoundingBox, // Face location in the selfie (target image)
    quality: match.Face?.Quality          // Brightness, sharpness
  };
}

Similarity Thresholds

Score    Result        Description
90-100   Strong match  Very high confidence
80-89    Good match    Acceptable for approval
70-79    Borderline    May need review
60-69    Weak match    Likely different people
<60      No match      Immediate rejection
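
The bands in the table translate directly into a lookup. A minimal sketch, assuming each band's lower bound is inclusive so that fractional scores such as 89.5 fall into the band below the next whole-number boundary; `classifySimilarity` is a hypothetical helper name.

```typescript
type MatchBand =
  | 'Strong match'
  | 'Good match'
  | 'Borderline'
  | 'Weak match'
  | 'No match';

// Maps a Rekognition similarity score (0-100) to the bands in the table above.
function classifySimilarity(score: number): MatchBand {
  if (score >= 90) return 'Strong match';
  if (score >= 80) return 'Good match';
  if (score >= 70) return 'Borderline';
  if (score >= 60) return 'Weak match';
  return 'No match';
}
```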

Quality Checks

Rekognition also returns face quality metrics:

typescript
interface FaceQuality {
  Brightness: number;  // 0-100
  Sharpness: number;   // 0-100
}

If quality is poor, the verification may fail or need review.

Stage 4: Document OCR (Gemini)

Gemini Vision API

typescript
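import { GoogleGenerativeAI } from '@google/generative-ai';

// GEMINI_API_KEY is an assumed environment variable name
const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY ?? '');
const model = genAI.getGenerativeModel({ model: 'gemini-1.5-pro' });

const prompt = `Analyze this ID document and extract:
- surname
- names (first and middle)
- dateOfBirth (YYYY-MM-DD format)
- idNumber
- documentType
- nationality (if visible)
- expiryDate (YYYY-MM-DD format, if visible)

Return confidence score (0-100).
Return ONLY valid JSON.`;

const result = await model.generateContent([
  { text: prompt },
  // mimeType must match the uploaded file; JPEG is shown here
  { inlineData: { mimeType: 'image/jpeg', data: base64Image } }
]);

// The JSON arrives as text and still has to be parsed
const text = result.response.text();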

Extracted Fields

Field         Description     Example
surname       Last name       "Doe"
names         First + middle  "John James"
dateOfBirth   Birth date      "1990-05-15"
idNumber      Document number "123456789"
documentType  Document type   "National ID"
nationality   Country         "South Africa"
expiryDate    Expiry date     "2025-12-31"
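
Gemini returns its answer as text, so the fields above have to be parsed and checked before use. The sketch below assumes the reply contains a single JSON object, possibly wrapped in extra text or a Markdown fence; `parseOcrJson`, `ExtractedData`, and the choice of which fields are required are illustrative assumptions, not the service's actual schema.

```typescript
interface ExtractedData {
  surname: string;
  names: string;
  dateOfBirth: string;
  idNumber: string;
  documentType: string;
  nationality?: string;
  expiryDate?: string;
}

// Assumption: these five fields are mandatory; nationality and expiryDate are optional.
const REQUIRED_FIELDS = ['surname', 'names', 'dateOfBirth', 'idNumber', 'documentType'] as const;

function parseOcrJson(raw: string): ExtractedData | null {
  // Keep only the outermost {...} so surrounding text or fences are ignored
  const start = raw.indexOf('{');
  const end = raw.lastIndexOf('}');
  if (start === -1 || end < start) return null;

  let parsed: unknown;
  try {
    parsed = JSON.parse(raw.slice(start, end + 1));
  } catch {
    return null;
  }
  if (typeof parsed !== 'object' || parsed === null) return null;

  const record = parsed as Record<string, unknown>;
  for (const field of REQUIRED_FIELDS) {
    if (typeof record[field] !== 'string' || record[field] === '') return null;
  }
  // dateOfBirth must use the YYYY-MM-DD format requested in the prompt
  if (!/^\d{4}-\d{2}-\d{2}$/.test(record.dateOfBirth as string)) return null;

  return record as unknown as ExtractedData;
}
```

Returning `null` rather than throwing lets the caller treat an unparseable reply like any other OCR failure.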

Confidence Scoring

Because the prompt asks for it, Gemini self-reports a confidence score (0-100). The score is intended to reflect:

  • Text clarity
  • Field completeness
  • Format consistency

Stage 5: Decision Engine

Decision Matrix

typescript
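interface Decision {
  decision: 'APPROVED' | 'NEEDS_REVIEW' | 'REJECTED';
  confidence: 'high' | 'medium' | 'low';
  reason: string;
}

// Rules are checked in order; the first one that matches wins.
function makeDecision(faceScore: number, ocrConfidence: number): Decision {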
  // High confidence approval
  if (faceScore >= 85 && ocrConfidence >= 70) {
    return {
      decision: 'APPROVED',
      confidence: 'high',
      reason: 'High face match and OCR confidence'
    };
  }
  
  // Medium confidence approval
  if (faceScore >= 80 && ocrConfidence >= 60) {
    return {
      decision: 'APPROVED',
      confidence: 'medium',
      reason: 'Acceptable face match and OCR confidence'
    };
  }
  
  // Needs review
  if (faceScore >= 70 && ocrConfidence >= 50) {
    return {
      decision: 'NEEDS_REVIEW',
      confidence: 'medium',
      reason: 'Borderline scores require manual review'
    };
  }
  
  // Rejection
  return {
    decision: 'REJECTED',
    confidence: 'low',
    reason: 'Scores below minimum thresholds'
  };
}

Rejection Reasons

Reason                     Cause
Face similarity below 60%  Selfie doesn't match ID photo
OCR confidence below 50%   Document unreadable
No face detected           ID photo or selfie missing face
Poor image quality         Blurry, dark, or obscured

Stage 6: Webhook Delivery

Payload

json
{
  "event": "verification.completed",
  "verificationId": "vrf_abc123",
  "status": "completed",
  "decision": "approved",
  "confidence": "high",
  "faceScore": 92.5,
  "ocrConfidence": 85.0,
  "extractedData": {
    "surname": "Doe",
    "names": "John"
  },
  "timestamp": "2024-01-15T10:00:03Z"
}
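
On the receiving side, the payload can be given a type and a runtime check. This is a sketch based only on the example above: the interface name, the guard, and the choice of which fields to check are assumptions, and payloads sent after a service failure may omit some of these fields.

```typescript
// Shape inferred from the example payload; not an official schema.
interface VerificationWebhook {
  event: string;
  verificationId: string;
  status: string;
  decision: string;
  confidence: string;
  faceScore: number;
  ocrConfidence: number;
  extractedData: Record<string, string>;
  timestamp: string; // ISO 8601, e.g. "2024-01-15T10:00:03Z"
}

// Checks the fields a consumer is most likely to rely on.
function isVerificationWebhook(value: unknown): value is VerificationWebhook {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.event === 'string' &&
    typeof v.verificationId === 'string' &&
    typeof v.decision === 'string' &&
    typeof v.faceScore === 'number' &&
    typeof v.ocrConfidence === 'number' &&
    typeof v.timestamp === 'string' &&
    !Number.isNaN(Date.parse(v.timestamp))
  );
}
```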

Signature Verification

typescript
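import crypto from 'node:crypto';
import express from 'express';

const app = express();
const WEBHOOK_SECRET = process.env.WEBHOOK_SECRET ?? ''; // assumed env variable name

// Your webhook handler. express.raw() keeps the exact bytes that were signed;
// re-serializing a parsed body with JSON.stringify can change them.
app.post('/webhook', express.raw({ type: 'application/json' }), (req, res) => {
  const signature = String(req.headers['x-yeboverify-signature'] ?? '');
  const expected = `sha256=${crypto
    .createHmac('sha256', WEBHOOK_SECRET)
    .update(req.body)
    .digest('hex')}`;

  // Constant-time comparison; timingSafeEqual throws if lengths differ
  const received = Buffer.from(signature);
  const wanted = Buffer.from(expected);
  if (received.length !== wanted.length || !crypto.timingSafeEqual(received, wanted)) {
    return res.status(401).send('Invalid signature');
  }

  const event = JSON.parse(req.body.toString('utf8'));
  // Process event...
  res.status(200).send('OK');
});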
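
To exercise a handler like the one above locally, it helps to produce a valid signature yourself. The sketch below assumes the scheme the handler implies: a hex-encoded HMAC-SHA256 of the request body with a `sha256=` prefix. `signPayload` and `isValidSignature` are hypothetical helper names.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

// Computes the header value for a given raw request body.
function signPayload(rawBody: string, secret: string): string {
  const digest = createHmac('sha256', secret).update(rawBody).digest('hex');
  return `sha256=${digest}`;
}

// Compares in constant time; the length check avoids timingSafeEqual throwing.
function isValidSignature(rawBody: string, secret: string, header: string): boolean {
  const expected = Buffer.from(signPayload(rawBody, secret));
  const received = Buffer.from(header);
  return expected.length === received.length && timingSafeEqual(expected, received);
}
```

Signing the exact string that goes on the wire matters: the same object serialized with different key order or whitespace produces a different digest.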

Processing Time

Typical end-to-end: 2-5 seconds

Stage            Typical Time
Upload           500ms
Face comparison  1-2s
OCR extraction   1-2s
Decision         <100ms
Webhook          500ms
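
The stage figures can be added up to see how they relate to the 2-5 second total. The sketch below is arithmetic only and assumes two things not stated in the table: that "<100ms" can be treated as a 0-100ms range, and that face comparison and OCR may run either in parallel (as the pipeline diagram suggests) or one after the other.

```typescript
interface Range {
  min: number; // milliseconds
  max: number; // milliseconds
}

// Values copied from the table; decision is treated as 0-100ms (assumption).
const STAGES = {
  upload: { min: 500, max: 500 },
  face: { min: 1000, max: 2000 },
  ocr: { min: 1000, max: 2000 },
  decision: { min: 0, max: 100 },
  webhook: { min: 500, max: 500 },
};

function estimateTotal(parallelChecks: boolean): Range {
  const { upload, face, ocr, decision, webhook } = STAGES;
  // In parallel the slower check dominates; sequentially the times add up
  const checks: Range = parallelChecks
    ? { min: Math.max(face.min, ocr.min), max: Math.max(face.max, ocr.max) }
    : { min: face.min + ocr.min, max: face.max + ocr.max };
  return {
    min: upload.min + checks.min + decision.min + webhook.min,
    max: upload.max + checks.max + decision.max + webhook.max,
  };
}
```

Under these assumptions, parallel checks give roughly 2.0-3.1 seconds and sequential checks roughly 3.0-5.1 seconds, so the quoted 2-5 second range covers both cases.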

Error Handling

Rekognition Errors

  • No face detected in source
  • No face detected in target
  • Multiple faces detected
  • Image quality too low

OCR Errors

  • Document not recognized
  • Text not legible
  • Unsupported document type

Fallback Behavior

If either service fails:

  • Mark verification as NEEDS_REVIEW
  • Include error details in response
  • Still fire webhook with partial data
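
The fallback rules above can be sketched with `Promise.allSettled`, which waits for both checks even when one of them fails. All names here (`resolveOutcome`, `runChecks`, the `READY` state) are hypothetical, and the two check functions are stand-ins for the Rekognition and Gemini calls.

```typescript
type Outcome =
  | { decision: 'READY'; faceScore: number; ocrConfidence: number }
  | { decision: 'NEEDS_REVIEW'; faceScore?: number; ocrConfidence?: number; errors: string[] };

// Pure step: turns the two settled results into an outcome.
function resolveOutcome(
  face: PromiseSettledResult<number>,
  ocr: PromiseSettledResult<number>,
): Outcome {
  if (face.status === 'fulfilled' && ocr.status === 'fulfilled') {
    // Both services answered; scores can go to the decision engine
    return { decision: 'READY', faceScore: face.value, ocrConfidence: ocr.value };
  }
  const errors: string[] = [];
  if (face.status === 'rejected') errors.push(`face: ${String(face.reason)}`);
  if (ocr.status === 'rejected') errors.push(`ocr: ${String(ocr.reason)}`);
  // Keep whatever partial data exists so the webhook can still carry it
  return {
    decision: 'NEEDS_REVIEW',
    faceScore: face.status === 'fulfilled' ? face.value : undefined,
    ocrConfidence: ocr.status === 'fulfilled' ? ocr.value : undefined,
    errors,
  };
}

// Runs both checks concurrently; neither failure prevents the other from finishing.
async function runChecks(
  compareFaces: () => Promise<number>,
  extractText: () => Promise<number>,
): Promise<Outcome> {
  const [face, ocr] = await Promise.allSettled([compareFaces(), extractText()]);
  return resolveOutcome(face, ocr);
}
```

Keeping `resolveOutcome` free of I/O makes the fallback rule easy to unit test without calling either service.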
