Email sorting tool

Problem: the archiving functions and rule features in email clients cannot handle the massive volume of email that many people have accumulated, so finding individual emails and email conversations is often difficult.

Solution: a cross-platform, cross-client plug-in to automatically sort emails based on multiple, user-configurable parameters.

Parameters:

  1. Sender of the email
    1. If the sender has multiple email addresses, there must be a way to make sure all of that sender's emails go to the same destination
    2. If the user wants to put more than one sender into the same folder, there must be a way to do this (e.g., everyone on the user’s casual softball team)
    3. Some email listservs use the FROM field in a tricky way. If the user wants all listserv email to go to the same folder, there must be a way to deal with that
  2. Recipient
    1. Ability to put email from the user to a particular recipient in a specific folder, including the same folder as received email. Keep in mind the problem of a recipient with multiple email addresses
    2. If the user is one of multiple recipients, the ability to copy the email to the folders of the other recipients
    3. If the user sends the email to multiple people, the option to place copies in all folders or only one folder
    4. If the email has CC or BCC, option to place copies in folders
    5. If the user has multiple email addresses that go to the same client, then the ability to either make rules that apply to all accounts or only specific accounts
  3. Date
    1. Folders based on dates
    2. Subfolders based on dates (e.g., Archive/Bob/2012)
    3. Only archive if older than a specific age
  4. Flags
    1. Archive by flag (e.g., Red flags go to a red folder)
    2. Ability to use a flag to prevent archiving until flag is cleared (especially useful with auto-archival of old emails)
    3. Only archive when a flag is set (especially if the flag is the “completed” flag)
    4. Subfolders based on flags (e.g., Archive/Bob/HR-Flag)
  5. Locate duplicates within a folder
  6. De-duplicate threads (advanced feature)
    1. Most email threads have massive amounts of duplicate data because of quoting prior emails
    2. The ability to reduce the amount of archived data would be very useful
    3. But, must be careful to preserve the metadata of the original emails
    4. Must also be careful not to delete quoted emails that were modified
    5. Some threads get forked into multiple threads; be careful not to delete the forks
    6. This technology already exists at companies that specialize in discovery for lawsuits
  7. Clients:
    1. Outlook
    2. Thunderbird
    3. Gmail
    4. Hotmail
    5. Yahoo
    6. Other
  8. Ability to synchronize across clients
    1. De-duplication is especially important with this feature
    2. If some clients have storage-size limitations, ability to manage this
      1. Allow for incomplete synchronization (e.g., if Gmail is at the user-defined storage limit, then delete the largest files or attachments only from Gmail and leave them intact in other places)
      2. Deep-archive (see below)
  9. Deep-archive
    1. Based on parameters, can create multiple archive files
    2. Keeps file sizes smaller
    3. Especially useful for extremely old emails
    4. Useful for senders who are no longer active (e.g., a defunct listserv or former client)
    5. Ability to compress files
    6. Less overhead when loading client programs
    7. Can help avoid file storage size problems with synchronization
  10. Auto backup
    1. Not the same as synchronization
    2. If synchronization is used, deleting files/emails in one location could cause them to be deleted in all locations
    3. Backup is essential to preventing data loss from user error
    4. Backing up some files, such as Outlook data files, can be difficult because the client program locks the files and because the files can change during the backup process
    5. Build in features to back up to multiple locations on schedules
  11. Client portability, or import/export, or ease of migration
    1. The same technology used for synchronization could be adapted to let users migrate their data from one client to another
  12. Category of sender
    1. Most clients have address books and most address books allow the user to assign categories to the address book entries
    2. Allow archival based on category (e.g., Archive/Family/Mom/2012)
    3. If multiple categories, allow multiple copies to be stored
  13. Tagging of copies
    1. If an email is copied to multiple locations, add information to the email metadata identifying all of the locations of the copies
    2. From time to time, run a check to make sure the copies still exist
    3. If the check finds a copy, verify that the metadata on both copies matches; if not, update it
    4. From time to time, scan for copies that were not recorded (this could be resource intensive)
  14. Retroactive reassignment of parameters and sorting
    1. If the user changes the name of Category ABC to Category XYZ, allow the user to rename all relevant folders and re-sort
    2. If the user reassigns an address book contact from Category P to Category Q, allow the user to rename and re-sort
    3. If the user changes the folder scheme, allow re-sorting (e.g., if the structure was Archive/Category/User/Year and it changes to Archive/User/Flag)
  15. Importation of rules
    1. Many clients already have a “rules” feature
    2. Allow the user to import the rules, as best as possible, to create rules within the plug-in
  16. Allow the plug-in to run automatically, manually, or a combination of the two
  17. All (or at least most) other options available in the “rule” features of other clients
  18. Special problems
    1. Encrypted emails make it difficult to sort based on the contents of the email
    2. Some legitimate senders intentionally obfuscate their identity
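
As a rough illustration of the sender, date, and category parameters above, the following Python sketch resolves several addresses to one contact and builds a destination path such as Archive/Family/Mom/2012. It is only a sketch under stated assumptions: the `ALIASES` and `CATEGORIES` tables, the function names, and the example addresses are all hypothetical, and a real plug-in would read this data from the client's address book instead.

```python
from email.message import EmailMessage
from email.utils import parseaddr, parsedate_to_datetime

# Hypothetical alias table: every known address maps to one contact name,
# so a sender with multiple addresses lands in a single folder.
ALIASES = {
    "bob@example.com": "Bob",
    "bob.smith@example.org": "Bob",
    "mom@example.net": "Mom",
}

# Hypothetical address-book categories, keyed by contact name.
CATEGORIES = {"Mom": "Family"}


def resolve_contact(address: str) -> str:
    """Map an address to a contact name; fall back to the bare address."""
    normalized = address.strip().lower()
    return ALIASES.get(normalized, normalized)


def destination_folder(msg: EmailMessage, root: str = "Archive") -> str:
    """Build a path like Archive/Category/Contact/Year from sender and date."""
    _, address = parseaddr(msg.get("From", ""))
    contact = resolve_contact(address)
    year = parsedate_to_datetime(msg["Date"]).year
    parts = [root]
    category = CATEGORIES.get(contact)
    if category:
        # Only contacts with an assigned category get the extra level.
        parts.append(category)
    parts.extend([contact, str(year)])
    return "/".join(parts)
```

Keeping the folder scheme in one function like this would also make the retroactive re-sorting item easier, since changing the scheme means changing one place and re-running it over the archive.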

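The age, flag, and duplicate-detection items in the list could be prototyped along the following lines. This is a minimal Python sketch, not a definitive design: the `hold` flag name, the 365-day default, and the function names are assumptions made for illustration, and the duplicate check here is a simple fingerprint comparison, not the thread-level de-duplication described as an advanced feature.

```python
import hashlib
from datetime import datetime, timedelta, timezone
from email.message import EmailMessage
from email.utils import parsedate_to_datetime


def should_archive(msg, flags, now, min_age_days=365, hold_flag="hold"):
    """Archive only if the message is old enough and carries no hold flag.

    `now` must be timezone-aware, matching Date headers that carry an offset.
    """
    if hold_flag in flags:
        return False
    sent = parsedate_to_datetime(msg["Date"])
    return now - sent >= timedelta(days=min_age_days)


def fingerprint(msg):
    """Prefer Message-ID; otherwise hash sender, date, subject, and body."""
    message_id = msg.get("Message-ID")
    if message_id:
        return message_id.strip()
    body = msg.get_body(preferencelist=("plain",))
    has_text = body is not None and body.get_payload() is not None
    text = body.get_content() if has_text else ""
    key = "\n".join(
        [msg.get("From", ""), msg.get("Date", ""), msg.get("Subject", ""), text]
    )
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def find_duplicates(messages):
    """Return groups of list indexes whose messages share a fingerprint."""
    groups = {}
    for index, msg in enumerate(messages):
        groups.setdefault(fingerprint(msg), []).append(index)
    return [indexes for indexes in groups.values() if len(indexes) > 1]
```

Returning groups of indexes, without deleting anything, leaves the decision about which copy to keep to the user or to a later step, which fits the caution about preserving the metadata of original emails.
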