Untitled Synchronization and Backup Project
From WiredWings Wiki
Participate
This is supposed to be a collaborative effort towards a better world. ;-) So, if you have any ideas, suggestions, or comments, don't hesitate to contact me. Or just register an account here and start editing!
Idea
Everything needed is already available and only has to be "plugged together"!
Requirements/Goals
- synchronize and/or backup file sets
- unlimited revisions (version history)
- support central servers and/or decentralized client exchange ("peer2peer")
- support backup to external drives, including the full file history
- local revision history is automatically distributed to all participating clients
- transparent
- automatically monitors/tracks changes and syncs in the background
- if connected to a network and other clients are available, syncs to them (including file history)
- alternatively, e.g. for files that are constantly changing (email database), does scheduled syncs
- safe encryption on all clients/servers (both storage and transfers)
- open source
- platform independent
- independently scalable to unlimited clients
- "traveller mode":
  - web interface to download individual files from any revision
  - sync complete set of files from a specific date/revision without admin rights
What?
To get an idea of what this looks like, check out existing solutions.
- TeamDrive - most promising; commercial
- PowerFolder - no versioning; commercial
- Syncplicity, Dropbox, SugarSync, BeInSync - commercial; cannot run your own server, no storage encryption
- Yosemite Filekeeper - commercial, Windows only
How?
Build this on top of existing Open Source projects, with minimal effort!
Components
Synchronization and Versioning
Distributed revision control systems (DRCS) already contain most of the required features. They can work for both centralized and decentralized models. There are web interfaces available to browse/download files from repositories. The client does not require admin rights. Transfers can be encrypted. Most are platform independent.
Currently, I'm in favor of either Bazaar or Mercurial. Todo: Find out/test how well the different version control systems handle large repositories/files.
"Monitor"
A small tool will take care of the rest. All it has to do is:
- monitor selected folders for changes
- depending on the configuration or file type, execute a command to immediately auto-commit the changes into the local repository, or schedule a commit for later
- monitor availability of remote repositories
  - if available, synchronize with other clients directly, or with a specified central repository
- monitor availability of external drives
  - if available, synchronize to the repository copy on the external drive
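The first step in the list above, detecting changes in selected folders, can be sketched as follows. This is only a minimal illustration and not the planned implementation: it polls by comparing two directory snapshots, whereas a real monitor would more likely rely on OS-level notifications (which is what JNotify provides). The function names `take_snapshot` and `diff_snapshots` and the ignore list are hypothetical.

```python
import os

# Assumption: repository metadata folders should not trigger commits.
IGNORED_DIRS = (".hg", ".bzr")


def take_snapshot(root, ignored_dirs=IGNORED_DIRS):
    """Map every file path under root to its (mtime_ns, size)."""
    snapshot = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune ignored directories in place so os.walk does not descend into them.
        dirnames[:] = [d for d in dirnames if d not in ignored_dirs]
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file vanished between listing and stat
            snapshot[path] = (st.st_mtime_ns, st.st_size)
    return snapshot


def diff_snapshots(old, new):
    """Return sorted lists of (added, removed, modified) paths."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    modified = sorted(p for p in set(old) & set(new) if old[p] != new[p])
    return added, removed, modified
```

A monitor loop would call `take_snapshot` periodically and trigger a commit whenever `diff_snapshots` reports any change. Comparing modification time and size is cheap but can miss an edit that preserves both; hashing file contents would be more reliable at a higher cost.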
The commands for each action can be preconfigured, but also flexibly changed, so there is no dependency on any specific DRCS, and the DRCS itself doesn't have to be adapted/changed.
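As a rough sketch of what "preconfigured but changeable" commands could look like, the example below keeps one command template per action in a plain dictionary and fills in the parameters at run time. The templates shown use Mercurial's `hg commit --addremove`, `hg push` and `hg pull --update`; swapping in another DRCS would only mean editing the dictionary. The names `COMMANDS`, `SCHEDULED_PATTERNS`, `commit_policy`, `build_command` and `run_action`, as well as the file patterns, are assumptions made for illustration.

```python
import fnmatch
import os
import subprocess

# Hypothetical command templates, filled in here for Mercurial.
COMMANDS = {
    "commit": ["hg", "--cwd", "{repo}", "commit", "--addremove", "-m", "{message}"],
    "push": ["hg", "--cwd", "{repo}", "push", "{remote}"],
    "pull": ["hg", "--cwd", "{repo}", "pull", "--update", "{remote}"],
}

# Hypothetical patterns for constantly changing files (e.g. mail stores)
# that should be committed on a schedule rather than immediately.
SCHEDULED_PATTERNS = ["*.pst", "*.mbox"]


def commit_policy(path, scheduled_patterns=SCHEDULED_PATTERNS):
    """Return "scheduled" or "immediate" depending on the file name."""
    name = os.path.basename(path)
    if any(fnmatch.fnmatch(name, pattern) for pattern in scheduled_patterns):
        return "scheduled"
    return "immediate"


def build_command(action, commands=COMMANDS, **params):
    """Fill the template for an action and return an argument list."""
    return [part.format(**params) for part in commands[action]]


def run_action(action, **params):
    """Run the configured command and return its exit code."""
    return subprocess.run(build_command(action, **params), check=False).returncode
```

For example, `build_command("push", repo="/data/docs", remote="/mnt/usb/docs")` yields the argument list for pushing to a repository copy on an external drive. Passing an argument list rather than a shell string avoids quoting problems with file names and commit messages.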
Encryption
For example, the repository could be placed inside any encrypted container (e.g. TrueCrypt). Maybe we can find something more suitable.
Existing Libraries
- JNotify monitors for file changes efficiently on both Linux and Windows
- The Volume Shadow Copy Service can be used to commit locked files on Windows clients
Why not use existing solutions?
I'd be happy to find one that meets all my requirements! Please, tell me if you do!
If you're looking for something similar, but not exactly this, you might want to consider the following:
Tools
Commercial
Open Source
- BackupPC
- rdiff-backup
- rsync / DeltaCopy for Windows
- FSVS - Subversion-based backup/restore/versioning, without the annoying .svn directories everywhere
- Duplicity - Encrypted bandwidth-efficient backup using the rsync algorithm
- Box Backup - encrypted online backup
Services/Online Storages (commercial)
- JungleDisk - backup to Amazon.com's S3 Storage Service, no sync
- SpiderOak
- Syncplicity
- Dropbox
Friend2Friend ("Social Backup")
All of these solutions only back up; no sync is possible.
- CrashPlan - $60 one time
- Cucku - Windows only, uses Skype for transfers; supports only ONE partner for backup (no "swarming"); free
- Wuala - very interesting concept, cheap/free; no versioning
Research Projects/Decentralized File Systems
Worth looking into.
- Farsite is a serverless, distributed file system that does not assume mutual trust among the client computers on which it runs
- OceanStore is an Internet storage system that allows multiple independent storage-service providers to collaboratively provide highly reliable, location-transparent storage for Internet devices
- Tahoe is a secure, decentralized, fault-tolerant filesystem
