| Authors | Wilkinson, Shawn; Boshevski, Tome; et al. (Storj Labs) |
|---|---|
| Year | 2016 |
| Project | Storj |
| License | MIT |
| Official Source | https://storj.io/storj.pdf |
This page is an educational summary and analysis of an official whitepaper or technical paper, written for reference purposes. It is not a verbatim reproduction. CryptoGloss does not claim authorship of the original work. All intellectual property rights remain with the original author(s). The official document is linked above.
“Storj: A Peer-to-Peer Cloud Storage Network” is the Storj Labs whitepaper on a decentralized cloud storage marketplace. It was published in several versions between 2016 and 2018; the most recent describes the Storj V3 architecture. In Storj:
- Files are encrypted client-side before leaving the uploader’s device
- Files are split into erasure-coded shards (redundant pieces where any 29 of 80 pieces are sufficient to reconstruct the file)
- Shards are distributed across a global network of independent Storage Node Operators (SNOs)
- Cryptographic audits verify that SNOs actually retain the shards they claim to store
- Users pay in STORJ tokens (or credit-card-based fiat) based on actual GB-months stored and GB transferred
> Whitepaper: Available at storj.io/storj.pdf.
Publication and Context
In 2016, cloud storage was dominated by AWS S3, Google Cloud Storage, and Dropbox — all centralized, subject to data breaches, price hikes, and geographic restriction. Storj proposed a marketplace that:
- Replaced single-provider trust with cryptographic verification
- Offered lower cost by utilizing unused storage on SNO machines worldwide
- Preserved privacy by encrypting all data before it left the user
The current whitepaper describes Storj V3 (c. 2018), a fully rewritten architecture from founding engineer JT Olio’s team that replaced the earlier Storj V1 and V2 designs.
System Architecture
Three-layer architecture:
1. Clients (Uploaders/Downloaders):
- Generate a unique encryption key (derived from a master passphrase)
- Encrypt files client-side using AES-256-GCM before any data leaves the device
- Apply Reed-Solomon erasure coding to create 80 pieces where any 29 can reconstruct the file
- Upload pieces to SNOs via direct TCP connections (not through a central server)
2. Satellites (Metadata Coordinators):
- Trusted (but not storage-holding) nodes that store file metadata: which pieces exist, which SNOs hold them, piece hashes
- Coordinate uploader-to-SNO connections
- Conduct audits to verify SNOs are retaining data
- Handle payments to SNOs based on audit results and delivered bandwidth
- Satellites are a centralization point — Storj Labs runs official satellites; third-party satellites are possible but rare
3. Storage Node Operators (SNOs):
- Individuals or businesses with spare disk capacity who run Storj Node software
- Accept piece uploads, serve piece downloads, respond to audits
- Earn STORJ tokens for storage-months retained (verified by audit) and GB transferred
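The client-side key handling in layer 1 can be sketched with the Python standard library. This is a minimal illustration, not Storj's actual scheme: the function names, the PBKDF2 parameters, and the idea of deriving a per-file key by HMAC over the object path are all assumptions made for the example. The AES-256-GCM encryption step itself is omitted because it needs a third-party library.

```python
import hashlib
import hmac
import os

def derive_master_key(passphrase: str, salt: bytes) -> bytes:
    """Stretch a passphrase into a 32-byte master key (PBKDF2 parameters are illustrative)."""
    return hashlib.pbkdf2_hmac("sha256", passphrase.encode(), salt, 200_000, dklen=32)

def derive_file_key(master_key: bytes, object_path: str) -> bytes:
    """Derive a per-file 256-bit key from the master key (hypothetical path-based scheme)."""
    return hmac.new(master_key, object_path.encode(), hashlib.sha256).digest()

salt = os.urandom(16)  # stored alongside the account, not secret
master = derive_master_key("correct horse battery staple", salt)
key_a = derive_file_key(master, "bucket/photos/a.jpg")
key_b = derive_file_key(master, "bucket/photos/b.jpg")
```

Each 32-byte output is the right size for an AES-256 key. Because derivation is deterministic, the client can regenerate any file key from the passphrase and salt, so neither the Satellite nor any SNO ever needs to hold it.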
Erasure Coding
Storj uses Reed-Solomon erasure coding (the same family of codes used in RAID 6 storage and on Blu-ray discs):
For a file of size S:
- Split into k=29 data pieces of size S/29
- Generate n=80 total pieces (51 parity pieces)
- Any 29 of the 80 pieces can reconstruct the original file
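The "any 29 of 80" property can be demonstrated with a toy Reed-Solomon code. The sketch below treats one 29-byte stripe as the values of a polynomial at x = 0..28 and produces extra pieces by evaluating that polynomial at x = 29..79. Working in the prime field GF(257) is an assumption chosen to keep the arithmetic short (production codecs typically use GF(2^8)), and all function names are hypothetical.

```python
import random

P = 257  # smallest prime above 255, so every byte value is a field element

def lagrange_eval(points, x):
    """Evaluate at x the unique polynomial of degree < len(points) through points, mod P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total

def encode(data, n):
    """Return n (x, value) pieces; the first len(data) pieces carry the data itself."""
    k = len(data)
    points = list(enumerate(data))
    return [(x, data[x] if x < k else lagrange_eval(points, x)) for x in range(n)]

def decode(pieces, k):
    """Rebuild the k data symbols from any k distinct pieces."""
    chosen = pieces[:k]
    return [lagrange_eval(chosen, x) for x in range(k)]

stripe = [random.randrange(256) for _ in range(29)]
pieces = encode(stripe, 80)
survivors = random.sample(pieces, 29)   # lose any 51 pieces
assert decode(survivors, 29) == stripe
```

Parity values here can reach 256, which does not fit in one byte. That is a limitation of the prime-field shortcut, not of Reed-Solomon coding itself.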
Consequence: Even if the SNOs holding up to 51 of the 80 pieces go offline simultaneously, the remaining 29 pieces are enough to retrieve the file in full. This provides 99.95%+ durability without requiring SNOs to be individually reliable.
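A rough way to see why individually unreliable nodes still give a reliable file is to compute the chance that fewer than 29 of 80 pieces are reachable. The sketch below assumes every node is online independently with the same probability; both that independence assumption and the sample probabilities are illustrative and are not figures from the whitepaper.

```python
from math import comb

def loss_probability(k: int, n: int, p_online: float) -> float:
    """Probability that fewer than k of n pieces are reachable, assuming independent nodes."""
    return sum(
        comb(n, i) * p_online**i * (1 - p_online) ** (n - i)
        for i in range(k)
    )

K, N = 29, 80
expansion = N / K  # bytes stored on the network per byte of original data
print(f"expansion factor: {expansion:.2f}")  # 2.76

for p in (0.5, 0.7, 0.9):  # hypothetical per-node availability
    print(f"p_online={p}: P(unrecoverable) = {loss_probability(K, N, p):.3e}")
```

The trade-off is storage overhead: the network stores about 2.76 bytes for every byte uploaded, in exchange for tolerating the loss of 51 pieces.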
Geographic distribution: The Satellite distributes pieces across SNOs in diverse geographic regions and on diverse ISPs, reducing correlated failure risk.
Audit Protocol
How does the Satellite verify a SNO actually stores a piece (without downloading the whole piece every time)?
Merkle audit protocol:
- When a SNO receives a piece, it creates a Merkle tree over the piece data and sends the Merkle root to the Satellite
- Periodically, the Satellite sends a challenge: provide the Merkle proof for leaf i
- The SNO must return the data at position i and the Merkle proof showing it matches the committed root
- A SNO that has deleted the piece cannot return the challenged data, so it cannot produce a proof that matches the committed root
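The challenge-response steps above can be sketched with SHA-256 from the standard library. The chunking of a piece into leaves, the rule of duplicating the last node on odd-sized levels, and the function names are assumptions made for this example; a production design would also domain-separate leaf and interior hashes.

```python
import hashlib

def sha(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_levels(chunks):
    """Return all tree levels, leaf hashes first; the root is levels[-1][0]."""
    levels = [[sha(c) for c in chunks]]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2:
            cur.append(cur[-1])  # pad in place so every node has a sibling
        levels.append([sha(cur[i] + cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels

def prove(levels, index):
    """SNO side: collect the sibling hashes on the path from leaf `index` to the root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof

def verify(root, chunk, proof):
    """Satellite side: recompute the root from the returned chunk and proof."""
    node = sha(chunk)
    for sibling, sibling_is_left in proof:
        node = sha(sibling + node) if sibling_is_left else sha(node + sibling)
    return node == root

chunks = [bytes([i]) * 64 for i in range(5)]  # a piece split into 5 leaves
levels = build_levels(chunks)
root = levels[-1][0]                          # the only value the Satellite keeps
assert verify(root, chunks[3], prove(levels, 3))
```

The Satellite stores only the 32-byte root, and each audit transfers one chunk plus a logarithmic number of hashes rather than the whole piece.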
SNOs that fail audits have their reputation score reduced; repeated failures result in removal from the network and forfeiture of escrow-held STORJ tokens.
Cryptographic Macaroons
Storj uses macaroons (bearer credentials, introduced by Google Research, that generalize cookies and bearer tokens by supporting embedded restrictions) for access control:
- An uploader mints a root macaroon for a file/bucket with full permissions
- They can attenuate the macaroon: create a derived macaroon granting only read access, or read access until a specific date, or from a specific IP range
- Derived macaroons cannot exceed the permissions of the parent
- The Satellite verifies macaroon validity without storing user secret keys
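The attenuation rule can be illustrated with the chained-HMAC construction from the macaroons paper. This is a simplified sketch, not Storj's implementation: the caveat syntax (`op = read`, `before = YYYY-MM-DD`), the context dictionary, and all names are hypothetical. The verifier here holds only the macaroon root key, which is separate from any user encryption key.

```python
import hashlib
import hmac
from dataclasses import dataclass

def _mac(key: bytes, msg: bytes) -> bytes:
    return hmac.new(key, msg, hashlib.sha256).digest()

@dataclass(frozen=True)
class Macaroon:
    identifier: bytes
    caveats: tuple
    signature: bytes

def mint(root_key: bytes, identifier: bytes) -> Macaroon:
    return Macaroon(identifier, (), _mac(root_key, identifier))

def attenuate(m: Macaroon, caveat: bytes) -> Macaroon:
    """Anyone holding m can add a caveat; the old signature becomes the new HMAC key."""
    return Macaroon(m.identifier, m.caveats + (caveat,), _mac(m.signature, caveat))

def caveat_holds(caveat: bytes, context: dict) -> bool:
    key, _, value = caveat.decode().partition(" = ")
    if key == "op":
        return context.get("op") == value
    if key == "before":  # ISO dates compare correctly as strings
        return context.get("date", "9999-12-31") < value
    return False  # unknown caveats fail closed

def verify(root_key: bytes, m: Macaroon, context: dict) -> bool:
    sig = _mac(root_key, m.identifier)
    for c in m.caveats:
        sig = _mac(sig, c)
    return hmac.compare_digest(sig, m.signature) and all(
        caveat_holds(c, context) for c in m.caveats
    )

root_key = b"satellite-side secret (example only)"
root = mint(root_key, b"bucket-42")
read_only = attenuate(attenuate(root, b"op = read"), b"before = 2030-01-01")
assert verify(root_key, read_only, {"op": "read", "date": "2025-06-01"})
assert not verify(root_key, read_only, {"op": "write", "date": "2025-06-01"})
```

Because each signature is the HMAC key for the next caveat, a holder can add restrictions but cannot strip them without knowing the root key.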
This enables fine-grained, delegatable access control for shared storage without requiring a centralized identity system.
Reality Check
Storj’s architecture is technically sound and operates as a real S3-compatible decentralized storage service:
- Storj is S3-compatible: standard AWS S3 clients (boto3, rclone, the AWS CLI) work with Storj endpoints
- Real-world pricing undercuts AWS S3 (~$4 per TB per month vs. ~$23 per TB per month for S3 Standard storage)
- Throughput from multiple SNOs often exceeds a single origin server via parallel downloads
Caveats:
- Satellite centralization: Storj Labs’ Satellite is a central coordinator. If Storj Labs shuts down their Satellite, files become inaccessible (though users can run their own Satellite).
- SNO quality variability: Performance depends on the uptime and bandwidth of individual SNOs. Erasure coding handles SNO failures gracefully, but retrieval latency is less predictable than with CDN-backed S3.
- Not censorship-resistant by default: Client-side encryption prevents Storj from reading content, but the Satellite can disable access to specific files or accounts.
Legacy
Storj is one of the few decentralized storage protocols that achieved genuine product-market fit — it is used by real enterprises for production file storage. Its audit design (Merkle challenges vs. Filecoin’s PoRep/PoSt) is simpler but sufficient for real-world durability requirements. The erasure coding approach to distributed storage reliability has influenced later decentralized storage designs.
Research
- Wilkinson, S., Boshevski, T., et al. (2018). Storj: A Peer-to-Peer Cloud Storage Network (v3.0). Storj Labs.
— Primary whitepaper. Section 3 describes the node architecture; Section 5 covers the audit protocol.
- Plank, J.S. (2009). A Tutorial on Reed-Solomon Coding for Fault-Tolerance in RAID-like Systems. Software—Practice & Experience.
— Reed-Solomon erasure coding tutorial; the algorithm underlying Storj’s piece redundancy.
- Birgisson, A., Politz, J.G., Erlingsson, Ú., et al. (2014). Macaroons: Cookies with Contextual Caveats for Decentralized Authorization in the Cloud. NDSS 2014.
— Macaroon construction paper; Storj adopted macaroons for delegatable access control.