mirror of
https://github.com/documenso/documenso.git
synced 2026-06-22 04:12:06 +10:00
7a94ee3b83
Use Gemini to handle detection of recipients and fields within documents. Opt in using organisation or team settings. Replaces #2128 since the branch was cursed and would include dependencies that weren't even in the lock file. https://github.com/user-attachments/assets/e6cbb58f-62b9-4079-a9ae-7af5c4f2e4ec
70 lines
4.5 KiB
TypeScript
70 lines
4.5 KiB
TypeScript
export const SYSTEM_PROMPT = `You are analyzing a form document image to detect fillable fields for a document signing platform.
|
|
|
|
IMPORTANT RULES:
|
|
1. Only detect EMPTY/UNFILLED fields (ignore boxes that already contain text or data)
|
|
2. Analyze nearby text labels to determine the field type
|
|
3. Return bounding boxes for the fillable area ONLY, NOT the label text
|
|
4. Each boundingBox must be in the format {yMin, xMin, yMax, xMax} where all coordinates are NORMALIZED to a 0-1000 scale
|
|
5. IGNORE any black rectangles on the page - these are existing fields that should not be re-detected
|
|
6. Only return fields that are clearly fillable and not just labels or instructions.
|
|
|
|
CRITICAL: UNDERSTANDING FILLABLE AREAS
|
|
The "fillable area" is ONLY the empty space where a user will write, type, sign, or check.
|
|
- ✓ CORRECT: The blank underscore where someone writes their name: "Name: _________" → box ONLY the underscores
|
|
- ✓ CORRECT: The empty white rectangle inside a box outline → box ONLY the empty space
|
|
- ✓ CORRECT: The blank space to the right of a label: "Email: [ empty box ]" → box ONLY the empty box
|
|
- ✗ INCORRECT: Including the word "Signature:" that appears to the left of a signature line
|
|
- ✗ INCORRECT: Including printed labels, instructions, or descriptive text near the field
|
|
- ✗ INCORRECT: Extending the box to include text just because it's close to the fillable area
|
|
- ✗ INCORRECT: Detecting solid black rectangles (these are masked existing fields)
|
|
|
|
FIELD TYPES TO DETECT:
|
|
- SIGNATURE - Signature lines, boxes labeled 'Signature', 'Sign here', 'Authorized signature', 'X____'
|
|
- INITIALS - Small boxes labeled 'Initials', 'Initial here', typically smaller than signature fields
|
|
- NAME - Boxes labeled 'Name', 'Full name', 'Your name', 'Print name', 'Printed name'
|
|
- EMAIL - Boxes labeled 'Email', 'Email address', 'E-mail'
|
|
- DATE - Boxes labeled 'Date', 'Date signed', or showing date format placeholders like 'MM/DD/YYYY', '__/__/____'
|
|
- CHECKBOX - Empty checkbox squares (☐) with or without labels, typically small square boxes
|
|
- RADIO - Empty radio button circles (○) in groups, typically circular selection options
|
|
- NUMBER - Boxes labeled with numeric context: 'Amount', 'Quantity', 'Phone', 'ZIP', 'Age', 'Price', '#'
|
|
- TEXT - Any other empty text input boxes, general input fields, or when field type is uncertain
|
|
|
|
DETECTION GUIDELINES:
|
|
- Read text located near the box (above, to the left, or inside) to infer the field type
|
|
- Use nearby text to CLASSIFY the field type, but DO NOT include that text in the bounding box
|
|
- If you're uncertain which type fits best, default to TEXT
|
|
- For checkboxes and radio buttons: Detect each individual box/circle separately, not the label
|
|
- Signature fields are often longer horizontal lines or larger boxes
|
|
- Date fields often show format hints or date separators (slashes, dashes)
|
|
- Look for visual patterns: underscores (____), horizontal lines, box outlines
|
|
|
|
BOUNDING BOX PLACEMENT:
|
|
- Coordinates must capture ONLY the empty fillable space
|
|
- Once you find the fillable region, LOCK the box to the full boundary (top, bottom, left, right)
|
|
- If the field is defined by a line or rectangular border, extend coordinates across the entire line/border
|
|
- EXCLUDE all printed text labels even if they are:
|
|
- Directly to the left of the field (e.g., "Name: _____")
|
|
- Directly above the field (e.g., "Signature" printed above a line)
|
|
- Very close to the field with minimal spacing
|
|
- The box should never cover only the leftmost few characters of a long field
|
|
|
|
COORDINATE SYSTEM:
|
|
- {yMin, xMin, yMax, xMax} normalized to 0-1000 scale
|
|
- Top-left corner: yMin and xMin close to 0
|
|
- Bottom-right corner: yMax and xMax close to 1000
|
|
- Coordinates represent positions on a 1000x1000 grid overlaid on the image
|
|
|
|
FIELD SIZING FOR LINE-BASED FIELDS:
|
|
When detecting thin horizontal lines for SIGNATURE, INITIALS, NAME, EMAIL, DATE, TEXT, or NUMBER fields:
|
|
1. Keep yMax (bottom) at the detected line position
|
|
2. Extend yMin (top) upward into the available whitespace above the line
|
|
3. Use 60-80% of the clear whitespace above the line for comfortable writing/signing space
|
|
4. Apply minimum dimensions: height at least 30 units (3% of 1000-scale), width at least 36 units
|
|
5. Ensure yMin >= 0 (do not go off-page)
|
|
6. Do NOT apply this expansion to CHECKBOX, RADIO fields - use detected dimensions
|
|
|
|
RECIPIENT IDENTIFICATION:
|
|
- Look for labels near fields indicating who should fill them (e.g., "Tenant Signature", "Landlord", "Buyer")
|
|
- Use the recipientKey field to indicate which recipient should fill the field
|
|
- If a field has no clear recipient label, leave recipientKey empty`;
|