Documentation

Welcome to the AI Phishing Detector documentation. This guide explains how each feature of the platform works and how to use it.

Layered Defense Strategy

AI Phishing Detector uses a four-layer security model that covers the threat lifecycle from first detection through user education.

🔍 Detection → 🔬 Investigation → 🛡️ Protection → 🎓 Education

1. Detection
Identify threats using ML, heuristics, and visual signals.

2. Investigation
Deep-dive into threat infrastructure and risk factors.

3. Protection
Take action with automated reporting and real-time blocking.

4. Education
Empower users with simulations and interactive AI guidance.

Core Detection Features

Random Forest Pipeline

Scan URL

The primary entry point for analyzing individual links. It provides a real-time verdict on whether a URL is legitimate or a phishing attempt.

How to Use

Navigate to the Scan page.
Enter the suspicious URL into the input field.
Click Analyze URL.
Review the verdict, confidence score, and red flags.

How it Works

Scan URL uses a Random Forest classifier trained on a dataset of more than 1 million URLs. The Advanced Feature Extractor derives 37 signals from each URL, including URL structure, domain-age proxies, and keyword anomalies, and the classifier turns them into a phishing probability score.
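To make the feature-extraction step concrete, here is a minimal sketch that turns a URL into a small numeric feature vector. The eight features and the keyword list are illustrative assumptions, not the actual 37 signals used by the Advanced Feature Extractor.

```python
import re
from urllib.parse import urlparse

# Hypothetical keyword list; the real extractor's vocabulary is not documented here.
SUSPICIOUS_KEYWORDS = ("login", "verify", "secure", "account", "update")


def extract_features(url: str) -> dict[str, float]:
    """Turn a URL into a small numeric feature vector (illustrative subset)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    lowered = url.lower()
    return {
        "url_length": float(len(url)),
        "num_dots": float(host.count(".")),
        "num_hyphens": float(host.count("-")),
        "num_digits": float(sum(ch.isdigit() for ch in url)),
        # Raw IPv4 address used in place of a domain name
        "has_ip_host": float(bool(re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host))),
        "has_at_symbol": float("@" in url),
        "uses_https": float(parsed.scheme == "https"),
        "keyword_hits": float(sum(kw in lowered for kw in SUSPICIOUS_KEYWORDS)),
    }
```

A vector like this would then be passed to the trained classifier (for example, the predict_proba method of a scikit-learn RandomForestClassifier) to obtain the probability score.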

High-Throughput Async

Batch Checker

Designed for security researchers and SOC analysts to triage large volumes of suspicious links simultaneously.

How to Use

Go to the Batch Checker.
Paste a list of URLs (one per line) into the text area.
Select Analyze Batch.
View the summary table with aggregate stats and individual reports.

How it Works

The Batch Checker runs the prediction pipeline asynchronously so that many URLs are analyzed concurrently. It adds summary counts to the results and highlights patterns shared across the batch, which helps identify campaign trends.
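The concurrent fan-out described above might look like the following sketch. The predict coroutine is a hypothetical placeholder for the real pipeline, and max_concurrency is an assumed tuning parameter.

```python
import asyncio
from collections import Counter


async def predict(url: str) -> str:
    """Placeholder for the real prediction pipeline (hypothetical rule)."""
    await asyncio.sleep(0)  # yield control, standing in for I/O or model latency
    return "phishing" if "login" in url.lower() else "legitimate"


async def analyze_batch(urls: list[str], max_concurrency: int = 10) -> dict:
    """Analyze all URLs concurrently and attach aggregate verdict counts."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(url: str) -> tuple[str, str]:
        # The semaphore caps how many predictions run at the same time
        async with semaphore:
            return url, await predict(url)

    results = await asyncio.gather(*(bounded(u) for u in urls))
    return {
        "reports": dict(results),
        "summary": Counter(verdict for _, verdict in results),
    }
```

Calling asyncio.run(analyze_batch(urls)) from synchronous code returns the per-URL verdicts together with the summary counts.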

Structural Heuristics

Zero-Day Detection

Detects phishing attempts that have never been seen before and are not yet on any blocklists.

How to Use

Access Zero-Day Detection from the Research menu.
Input a fresh or unknown URL.
The system ignores reputation data and analyzes only the structure of the URL.

How it Works

Calculates the Shannon entropy of URLs and domains to detect obfuscation. It also scores structural anomalies such as Punycode/IDN homographs, suspicious TLDs (.xyz, .top), and brand spoofing patterns, where a brand name appears in a domain the brand does not own.
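The heuristics above can be sketched in a few lines. The entropy function follows the standard Shannon definition; the TLD set contains only the two examples named above, and the BRANDS table is a hypothetical stand-in for the platform's brand list.

```python
import math
from collections import Counter
from urllib.parse import urlparse

SUSPICIOUS_TLDS = {"xyz", "top"}    # the two TLDs named above
BRANDS = {"paypal": "paypal.com"}   # hypothetical brand -> legitimate domain


def shannon_entropy(text: str) -> float:
    """Bits per character; higher values suggest random-looking strings."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def structural_flags(url: str) -> list[str]:
    """Return the structural anomalies found in the URL's hostname."""
    host = (urlparse(url).hostname or "").lower()
    flags = []
    # Punycode-encoded labels start with "xn--" and can hide IDN homographs
    if any(label.startswith("xn--") for label in host.split(".")):
        flags.append("punycode")
    if host.rsplit(".", 1)[-1] in SUSPICIOUS_TLDS:
        flags.append("suspicious_tld")
    for brand, legit in BRANDS.items():
        # Brand name present, but the host is not the brand's domain or a subdomain of it
        if brand in host and host != legit and not host.endswith("." + legit):
            flags.append(f"brand_spoof:{brand}")
    return flags
```

How the entropy value and the flags are weighted into a final score is not specified here, so the sketch returns them separately.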

Visual Fingerprinting

Visual Detect

Identifies sites that visually impersonate trusted brands (like PayPal, Google, or Microsoft) even if the URL looks relatively safe.

How to Use

Launch Visual Detect.
The engine scans the page content for visual brand markers.
Review the "Matched Brand" and "Spoofing Confidence" metrics.

How it Works

Extracts DOM signals including favicon hashes, CSS color palettes, and image alt attributes, then matches them against a database of known brand profiles. If a site looks like "Netflix" but is hosted on a domain that Netflix does not own, it triggers a visual spoofing alert.
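A rough sketch of the matching step follows. The BrandProfile fields, the equal weighting of the three signals, the SHA-256 favicon hash, and the 0.6 threshold are all assumptions made for illustration; the real engine's scoring is not documented here.

```python
import hashlib
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class BrandProfile:
    """Hypothetical shape of one entry in the brand profile database."""
    name: str
    domains: frozenset[str]
    favicon_hashes: frozenset[str]
    palette: frozenset[str]
    alt_keywords: frozenset[str]


def favicon_hash(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def spoofing_confidence(profile: BrandProfile, favicon: bytes,
                        colors: list[str], alt_texts: list[str]) -> float:
    """Average three signals (favicon, palette overlap, alt text) into a 0..1 score."""
    favicon_match = favicon_hash(favicon) in profile.favicon_hashes
    seen_colors = {c.lower() for c in colors}
    palette_overlap = len(profile.palette & seen_colors) / max(len(profile.palette), 1)
    alt_match = any(kw in alt.lower() for alt in alt_texts for kw in profile.alt_keywords)
    return (favicon_match + palette_overlap + alt_match) / 3


def is_visual_spoof(url: str, profile: BrandProfile,
                    confidence: float, threshold: float = 0.6) -> bool:
    """Alert only when the page looks like the brand but is not on its domains."""
    host = (urlparse(url).hostname or "").lower()
    on_brand_domain = any(host == d or host.endswith("." + d) for d in profile.domains)
    return confidence >= threshold and not on_brand_domain
```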

Analysis & Investigation

Model Interpretability

Risk Heatmap

Explains the "Why" behind a detection, showing which specific features pushed the ML model toward a phishing verdict.

How to Use

After a scan, click View Heatmap.
Hover over different feature blocks to see their weights.
Red blocks mark features that pushed the verdict toward phishing; green blocks mark features that pushed it toward legitimate.

How it Works

Uses a Local Interpretable Model-agnostic Explanations (LIME) approach to decompose the Random Forest prediction into individual feature contributions, which the heatmap then visualizes.
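The idea of per-feature contributions can be illustrated with a much simpler technique than LIME. The sketch below uses occlusion-style attribution: it resets one feature at a time to a baseline value and records how much the score drops. It is not the LIME algorithm, and toy_score is a hypothetical stand-in for the Random Forest's probability output.

```python
from typing import Callable

Features = dict[str, float]


def feature_contributions(score: Callable[[Features], float],
                          instance: Features,
                          baseline: Features) -> dict[str, float]:
    """Contribution of each feature = score drop when it is reset to its baseline."""
    full_score = score(instance)
    contributions = {}
    for name in instance:
        perturbed = dict(instance)
        perturbed[name] = baseline[name]
        contributions[name] = full_score - score(perturbed)
    return contributions


def toy_score(features: Features) -> float:
    """Hypothetical scoring function standing in for the trained model."""
    raw = 0.1 + 0.5 * features["has_ip_host"] + 0.2 * features["keyword_hits"]
    return min(1.0, raw)
```

In a heatmap, positive contributions would map to red blocks and zero or negative contributions to green blocks.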

Graph Analysis

Campaign Clustering

Identifies relationships between multiple phishing URLs to reveal broad attack campaigns.

How to Use

Enter multiple URLs or a target domain.
Run Cluster Analysis.
Examine the graph to find shared infrastructure (IPs, MX records, Nameservers).

How it Works

Uses a distance metric based on Jaccard similarity between URL features and shared threat intelligence markers. It groups URLs that share hosting providers, registration patterns, or landing page structures.
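As a sketch of the grouping step, the code below computes the Jaccard distance between sets of infrastructure markers and merges URLs with a simple single-linkage rule. The marker format, the 0.5 threshold, and the linkage rule are assumptions; the platform's actual clustering method is not described beyond the distance metric.

```python
def jaccard_distance(a: set, b: set) -> float:
    """1 - |A ∩ B| / |A ∪ B|; 0.0 means identical sets, 1.0 means disjoint."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def cluster(markers: dict[str, set[str]], max_distance: float = 0.5) -> list[set[str]]:
    """Single-linkage grouping: a URL joins every cluster that has a member within max_distance."""
    clusters: list[set[str]] = []
    for url in markers:
        linked = [
            c for c in clusters
            if any(jaccard_distance(markers[url], markers[m]) <= max_distance for m in c)
        ]
        # Merge the new URL with all clusters it links to
        merged = {url}.union(*linked)
        clusters = [c for c in clusters if c not in linked] + [merged]
    return clusters
```

Each key in markers is a URL and each value is the set of markers observed for it, for example {"ip:203.0.113.5", "ns:ns1.bad.example"}.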

NLP Analysis

Email & Scam Detector

Moves beyond URLs to analyze the text content of suspicious emails, SMS, or chat messages for social engineering signals.

How to Use

Paste the content of a suspicious message.
Select Analyze Content.
Check for urgency triggers, suspicious links, and sender anomalies.

How it Works

Employs Natural Language Processing (NLP) to detect "Psychological Triggers" like false urgency, fear-inducing language, and credential-harvesting patterns in the message body.
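The trigger categories above can be illustrated with a rule-based stand-in. The sketch uses regular expressions rather than a trained NLP model, and every pattern in it is a hypothetical example.

```python
import re

# Hypothetical patterns; a production system would use a trained model instead.
TRIGGER_PATTERNS = {
    "false_urgency": r"\b(urgent|immediately|act now|within \d+ hours?)\b",
    "fear": r"\b(suspended|locked|unauthori[sz]ed|legal action)\b",
    "credential_harvesting": r"\b(verify|confirm|update) your (password|account|card|identity)\b",
}
URL_PATTERN = re.compile(r"https?://\S+", re.IGNORECASE)


def analyze_message(text: str) -> dict[str, list[str]]:
    """Report which trigger categories fire and which links the message contains."""
    triggers = sorted(
        name for name, pattern in TRIGGER_PATTERNS.items()
        if re.search(pattern, text, re.IGNORECASE)
    )
    return {"triggers": triggers, "links": URL_PATTERN.findall(text)}
```

The extracted links could then be passed to Scan URL for a verdict on each one.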

Response & Protection

Automated Takedown

Auto Report

Generates professional takedown reports for detected phishing URLs to be sent to registrars and hosting providers.

How to Use

After a positive phishing detection, click Generate Report.
Review the automatically drafted email.
Add any custom notes and send to the abuse contact.

How it Works

Queries WHOIS data to identify the abuse contact for the domain and hosting provider. It constructs a formal report containing the analysis timestamp, ML evidence, and visual spoofing data to accelerate the takedown process.
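The report-assembly step could be sketched as follows. The function name, its parameters, and the wording of the email are illustrative assumptions; the WHOIS lookup itself is left out, so the abuse contact is passed in as an argument.

```python
from datetime import datetime, timezone
from typing import Optional


def build_takedown_report(
    url: str,
    abuse_contact: str,
    phishing_probability: float,
    matched_brand: Optional[str] = None,
    analyzed_at: Optional[datetime] = None,
    notes: str = "",
) -> dict[str, str]:
    """Assemble a draft abuse email from the analysis evidence."""
    analyzed_at = analyzed_at or datetime.now(timezone.utc)
    lines = [
        "To whom it may concern,",
        "",
        f"The following URL was classified as phishing: {url}",
        f"Analysis timestamp (UTC): {analyzed_at.isoformat()}",
        f"ML phishing probability: {phishing_probability:.1%}",
    ]
    # Optional evidence is included only when it is available
    if matched_brand:
        lines.append(f"Visually impersonated brand: {matched_brand}")
    if notes:
        lines.append(f"Analyst notes: {notes}")
    lines += ["", "Please investigate and take the content offline."]
    return {
        "to": abuse_contact,
        "subject": f"Phishing report: {url}",
        "body": "\n".join(lines),
    }
```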

k-Anonymity Privacy

Breach Check

Checks if your passwords or accounts have been leaked in known data breaches without ever seeing your full credentials.

How to Use

Enter a password or email address.
Click Check Exposure.
See how many times (if any) the credential appeared in breaches.

How it Works

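Uses a k-anonymity scheme. The system hashes your input locally and sends only the first 5 characters of the hash to the breach database. The database returns every known breached hash that starts with that prefix, and the final comparison is performed on your device. Your plaintext input and its full hash never leave your device.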
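The prefix lookup described above can be sketched as follows. The sketch assumes SHA-1 hashing and a "SUFFIX:COUNT" response format, as used by the public Pwned Passwords range API; the source does not name the hash function or the breach database. The fetch_range callable is a hypothetical hook for the network request.

```python
import hashlib
from typing import Callable

PREFIX_LENGTH = 5  # number of hash characters sent to the breach database


def split_hash(secret: str) -> tuple[str, str]:
    """Hash locally and split into the shared prefix and the private suffix."""
    digest = hashlib.sha1(secret.encode("utf-8")).hexdigest().upper()
    return digest[:PREFIX_LENGTH], digest[PREFIX_LENGTH:]


def count_exposures(secret: str, fetch_range: Callable[[str], str]) -> int:
    """Return how often the secret appears in breaches, or 0 if it is not found.

    fetch_range receives only the 5-character prefix and returns the
    database's response as text, one "SUFFIX:COUNT" entry per line.
    """
    prefix, suffix = split_hash(secret)
    for line in fetch_range(prefix).splitlines():
        candidate, _, count = line.partition(":")
        # The comparison happens locally; the suffix is never transmitted
        if candidate.strip().upper() == suffix:
            return int(count)
    return 0
```

Because many unrelated hashes share any given prefix, the database cannot tell which of the returned entries, if any, the user was checking.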

Interactive Security AI

Hook Assistant

Hook is the interactive "guardian" layer of the platform, providing proactive advice and guided security workflows.

How to Use

Look for the Hook icon in the bottom right corner.
Ask questions like "Is this link safe?" or "Help me report a scam."
Follow Hook's interactive prompts for deep investigation.

How it Works

Uses Intent Classification to understand user requests. Hook can trigger any of the platform's detection APIs (Scan, Zero-Day, Breach Check) and format the results into actionable conversation. It acts as an orchestrator across all defense layers.
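The routing behavior could be sketched as below. Keyword overlap stands in for the real intent classifier, and the intent names, keyword lists, and handler registry are all hypothetical.

```python
import re
from typing import Callable

# Hypothetical intents and keywords; the real classifier is not documented here.
INTENT_KEYWORDS = {
    "scan": ("link", "url", "safe"),
    "breach_check": ("breach", "leaked", "password"),
    "report": ("report", "takedown"),
}


def classify_intent(message: str) -> str:
    """Pick the intent whose keywords overlap most with the message."""
    words = set(re.findall(r"[a-z]+", message.lower()))
    scores = {intent: len(words & set(kws)) for intent, kws in INTENT_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"


def handle(message: str, handlers: dict[str, Callable[[str], str]]) -> str:
    """Dispatch the message to the handler registered for its intent."""
    handler = handlers.get(classify_intent(message))
    if handler is None:
        return "Sorry, I did not understand. Try asking about a link, a breach, or a report."
    return handler(message)
```

Each handler would wrap one of the platform's detection APIs and format its result as a reply.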