SCOPE - Scan, Classify, Organize, Present & Explore. Self-hosted PDF document management with OCR, AI summaries, and NAS integration.
  • PHP 81.5%
  • CSS 8.9%
  • Shell 5.4%
  • JavaScript 2.6%
  • Python 1.6%
Find a file
gdadkisson c2e4a85597 Security fixes from codex-review.md
Addresses all six review findings plus one additional same-class XSS:

- Critical: thumbnails no longer served via unauthenticated static Nginx
  alias; new authenticated /documents/{id}/thumbnail endpoint enforces
  login and category permissions.
- High: default deployment is now HTTPS (TLS + HTTP->HTTPS redirect + HSTS,
  self-signed fallback); session cookies set SameSite=Lax and Secure.
- High: migrate.php adds the missing watched_folders columns
  (default_category_id, exclusion_patterns) and collections.user_id, with
  idempotent ALTER-based migrations for existing installs.
- High: stored XSS via collection names / filenames / emails fixed by
  replacing inline onclick handlers with data-* + addEventListener.
- High: broad www-data sudoers replaced with a single strictly-validated
  root helper (scope-mount-helper); removes "mount -t cifs *" and
  "tee /etc/fstab".
- Medium: tags/stats/activity scoped to permitted categories; collections
  gain ownership and delete authorization.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_016Yp96UhXo7DqQD8oDACQfm
2026-06-20 18:43:05 -04:00
opt Security fixes from codex-review.md 2026-06-20 18:43:05 -04:00
public Security fixes from codex-review.md 2026-06-20 18:43:05 -04:00
scripts Security fixes from codex-review.md 2026-06-20 18:43:05 -04:00
src Security fixes from codex-review.md 2026-06-20 18:43:05 -04:00
templates Security fixes from codex-review.md 2026-06-20 18:43:05 -04:00
.env.example Initial commit: SCOPE v.0.74 2026-05-22 15:06:02 -04:00
.gitignore Security fixes from codex-review.md 2026-06-20 18:43:05 -04:00
composer.json Initial commit: SCOPE v.0.74 2026-05-22 15:06:02 -04:00
composer.lock Initial commit: SCOPE v.0.74 2026-05-22 15:06:02 -04:00
install.conf.example Security fixes from codex-review.md 2026-06-20 18:43:05 -04:00
install.sh Security fixes from codex-review.md 2026-06-20 18:43:05 -04:00
README.md Update README: replace manual install steps with install.sh workflow 2026-05-26 15:59:38 -04:00

SCOPE

Scan, Classify, Organize, Present & Explore

A self-hosted PDF document management system with full-text OCR search, AI-generated summaries, NAS folder monitoring, and a clean web UI. Built for personal/internal home network use.


Features

  • Upload PDFs via web UI or auto-ingest from watched NAS folders
  • Full-text search across OCR content, tags, and filenames
  • First-page thumbnails in search results
  • In-browser PDF viewer
  • AI summaries and tag suggestions (Ollama, Anthropic Claude, or Google Gemini)
  • Document categories with color coding and per-user permissions
  • Collections (saved filter sets)
  • Activity dashboard with live OCR and summarization status
  • Multi-user support with super-admin and regular user roles
  • Email registration with SMTP verification

Requirements

  • Debian 13 (Trixie) or later
  • A VM or bare-metal host (privileged LXC containers restrict SMB share mounting)
  • Network access to an LLM backend: Ollama, Anthropic API key, or Google Gemini API key

All other dependencies (Nginx, PHP 8.4, MariaDB, Tesseract, Poppler, Python 3, Composer, etc.) are installed automatically by the installer.


Installation

1. Download the config template and fill it in

curl -fsSL https://git.knowthenerds.com/gdadkisson/scope/raw/branch/main/install.conf.example \
     -o install.conf

Edit install.conf and set at minimum:

Key Description
APP_URL Full URL the app will be served at (e.g. http://scope.example.com)
APP_HOSTNAME Hostname for the Nginx server_name directive
DB_ROOT_PASS Password to set for MariaDB root
DB_PASS Password for the app's database user
ADMIN_USERNAME First super-admin login name
ADMIN_PASSWORD Leave blank to auto-generate and print at the end

LLM and SMTP settings can be left blank and configured later at /settings.

2. Run the installer

curl -fsSL https://git.knowthenerds.com/gdadkisson/scope/raw/branch/main/install.sh \
     -o install.sh
sudo bash install.sh install.conf

The installer will:

  • Install all system packages (Nginx, PHP 8.4 + extensions, MariaDB, Tesseract, Poppler, cifs-utils, Python 3 venv)
  • Clone this repo to /var/www/scope and run composer install
  • Write /var/www/scope/.env with a freshly generated APP_SECRET
  • Create the MariaDB database and run the schema migration
  • Configure Nginx and PHP-FPM
  • Create /var/scope/{thumbnails,logs,tmp} with correct permissions
  • Install the NAS folder watcher as a systemd service (scope-watcher)
  • Write /etc/sudoers.d/scope-mount so the app can mount/unmount SMB shares
  • Enable and start all services

On completion it prints the app URL, admin username, and (if auto-generated) the admin password. Save the password -- it is only shown once.

Re-running / upgrading

The installer is safe to re-run. On an existing installation it does git pull instead of a fresh clone and leaves the existing .env and database intact.


First Login

Navigate to APP_URL. Log in with the admin credentials shown at the end of the install. Go to Settings > Users to create additional accounts.


LLM Configuration

In Settings > LLM, choose a provider:

Provider Notes
Ollama Self-hosted; set endpoint to http://your-ollama-host:11434; uses native /api/chat
Anthropic Requires API key; uses Claude models
Google Gemini Requires API key

Recommended Ollama model: qwen2.5:7b


NAS Folder Monitoring

Add watched folders at Settings > Folders. SCOPE mounts SMB shares via CIFS and monitors them with the scope-watcher service. New PDF files are automatically ingested, OCR'd, and indexed.

If a NAS account password changes, update the folder credentials in Settings > Folders and use the remount action. The app detects and recovers from stale CIFS sessions automatically.


Background Scripts

Script Purpose
scripts/process.php OCR pipeline (Tesseract)
scripts/summarize_batch.php Batch AI summarization
scripts/suggest_tags_batch.php Batch AI tag suggestions
scripts/import.php Bulk import a watched folder
scripts/ingest.php Ingest files detected by the watcher
scripts/migrate.php Database schema migration

These are invoked automatically by the web app or manually for bulk operations.


Key Paths

Path Contents
/var/www/scope/ Application source
/var/www/scope/.env Environment config (credentials)
/var/scope/thumbnails/ Generated PDF thumbnails
/var/scope/logs/ OCR, summarization, watcher, and Nginx logs
/etc/scope/*.creds SMB credentials files (root-readable only)
/opt/scope-watcher/ Python venv + watcher script

Environment File Reference

See .env.example for a template with all supported variables.


License

Private/personal use. Not licensed for redistribution.