Design-Database

Technical documentation
24/03/2026

Collector Database Design

The Collector database will have a collections table.
The collections table will have the following mandatory fields.

  • id (primary key)
  • domain
  • apikey
  • srcmd5 - varchar(32)
  • cleanmd5
  • srctxt
  • cleantxt
  • cur_url
  • scriptpath
  • parameters
  • seqn
  • remote_addr
  • user_agent
  • createdat
  • lastmodified
  • length
  • words
  • handle
    |Field |Type |Index |
    |--|--|--|
    |cleanmd5 |varchar(32) |index |
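
The field list above could be turned into a table definition along the following lines. This is only a sketch using SQLite through Python's standard library, not the project's actual schema: the design specifies varchar(32) only for srcmd5 and cleanmd5, so every other column type and length here is an assumption, and the helper name create_collections_table is hypothetical. All columns are declared NOT NULL because the design calls the fields mandatory.

```python
import sqlite3

# Only srcmd5 and cleanmd5 have a type (varchar(32)) in the design.
# Every other type and length below is an assumption for illustration.
COLLECTIONS_DDL = """
CREATE TABLE collections (
    id            INTEGER PRIMARY KEY,
    domain        VARCHAR(255) NOT NULL,
    apikey        VARCHAR(64)  NOT NULL,
    srcmd5        VARCHAR(32)  NOT NULL,
    cleanmd5      VARCHAR(32)  NOT NULL,
    srctxt        TEXT         NOT NULL,
    cleantxt      TEXT         NOT NULL,
    cur_url       TEXT         NOT NULL,
    scriptpath    TEXT         NOT NULL,
    parameters    TEXT         NOT NULL,
    seqn          INTEGER      NOT NULL,
    remote_addr   VARCHAR(45)  NOT NULL,
    user_agent    TEXT         NOT NULL,
    createdat     TIMESTAMP    NOT NULL,
    lastmodified  TIMESTAMP    NOT NULL,
    length        INTEGER      NOT NULL,
    words         INTEGER      NOT NULL,
    handle        VARCHAR(255) NOT NULL
)
"""


def create_collections_table(conn):
    """Create the table and return its column names in declaration order."""
    conn.execute(COLLECTIONS_DDL)
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    return [row[1] for row in conn.execute("PRAGMA table_info(collections)")]


conn = sqlite3.connect(":memory:")
columns = create_collections_table(conn)
print(columns)
```

An in-memory SQLite database is used so the sketch runs without any setup; a production deployment would likely target a different engine and adjust the types accordingly.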

The collections table should be indexed on the following columns.

  1. srcmd5
  2. cleanmd5

The collections table will have unique keys on the following:

  • id
  • srcmd5, domain, cur_url
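
To illustrate how the indexes and unique keys listed above behave, here is a minimal sketch, again assuming SQLite via Python's standard library. The table is reduced to the columns involved in the keys; the index names, column types, the helper insert_row, and the sample domain and URLs are all hypothetical.

```python
import hashlib
import sqlite3

conn = sqlite3.connect(":memory:")

# Reduced table: only the columns that take part in the keys and indexes.
# Types other than varchar(32) for the md5 columns are assumptions.
conn.execute("""
CREATE TABLE collections (
    id       INTEGER PRIMARY KEY,
    domain   VARCHAR(255) NOT NULL,
    srcmd5   VARCHAR(32)  NOT NULL,
    cleanmd5 VARCHAR(32)  NOT NULL,
    cur_url  TEXT         NOT NULL,
    UNIQUE (srcmd5, domain, cur_url)
)
""")
# Index names are hypothetical; the design only names the indexed columns.
conn.execute("CREATE INDEX idx_collections_srcmd5 ON collections (srcmd5)")
conn.execute("CREATE INDEX idx_collections_cleanmd5 ON collections (cleanmd5)")


def insert_row(conn, domain, srctxt, cleantxt, cur_url):
    """Insert one row; return False if the unique key rejects it."""
    srcmd5 = hashlib.md5(srctxt.encode("utf-8")).hexdigest()
    cleanmd5 = hashlib.md5(cleantxt.encode("utf-8")).hexdigest()
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO collections (domain, srcmd5, cleanmd5, cur_url) "
                "VALUES (?, ?, ?, ?)",
                (domain, srcmd5, cleanmd5, cur_url),
            )
        return True
    except sqlite3.IntegrityError:
        return False


first = insert_row(conn, "example.com", "<p>Hi</p>", "Hi", "http://example.com/a")
duplicate = insert_row(conn, "example.com", "<p>Hi</p>", "Hi", "http://example.com/a")
other_url = insert_row(conn, "example.com", "<p>Hi</p>", "Hi", "http://example.com/b")
print(first, duplicate, other_url)
```

The second insert repeats the same (srcmd5, domain, cur_url) combination and is rejected, while the third differs only in cur_url and is accepted, which is the behavior the composite unique key is meant to enforce.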
