Choosing the Right Cold Storage for Your Archival Use Cases
Every production system generates data that must be kept but is rarely accessed: compliance logs, audit trails, historical transaction records, old user uploads, database backups, surveillance archives. Regulatory or business requirements mean this data can't be deleted, but it doesn't need millisecond access either. The question is: where do you put it, and how much should you pay?
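The difference between choosing the right and wrong storage tier can exceed 20x in storage cost: at the prices below, S3 Standard costs about 23 times as much per GB as Glacier Deep Archive. Let's break down the options and build a decision framework.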
Understanding Storage Tiers
AWS offers a spectrum of storage tiers, each trading access speed for cost:
S3 Standard — Hot Storage
- Cost: ~$0.023/GB/month
- Retrieval: Instant (milliseconds)
- Use case: Frequently accessed data — application assets, active datasets, recent logs
- Minimum storage duration: None
This is your default. If data is accessed more than once a month, it belongs here.
S3 Infrequent Access (IA) — Warm Storage
- Cost: ~$0.0125/GB/month + $0.01/GB retrieval
- Retrieval: Instant (milliseconds)
- Use case: Data accessed a few times per quarter — older logs, previous month's reports, backup copies
- Minimum storage duration: 30 days
Same speed as Standard but ~46% cheaper for storage. The trade-off is a per-GB retrieval charge, so IA only saves money if you rarely read the data: at these prices the break-even is roughly one full read of the data per month.
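To put a number on "rarely", the comparison can be sketched in a few lines. This is a minimal illustration that uses only the approximate per-GB prices quoted above and ignores request charges and the 30-day minimum; the function names are hypothetical.

```python
# Approximate prices from the tier descriptions above (USD).
STANDARD_STORAGE = 0.023  # per GB-month
IA_STORAGE = 0.0125       # per GB-month
IA_RETRIEVAL = 0.01       # per GB read


def monthly_cost_standard(gb: float) -> float:
    """Storage-only monthly cost in S3 Standard (reads carry no per-GB fee)."""
    return gb * STANDARD_STORAGE


def monthly_cost_ia(gb: float, full_reads_per_month: float) -> float:
    """Monthly cost in S3 IA when the whole dataset is read N times a month."""
    return gb * IA_STORAGE + gb * full_reads_per_month * IA_RETRIEVAL


def break_even_reads_per_month() -> float:
    """Number of full-dataset reads per month at which IA matches Standard."""
    return (STANDARD_STORAGE - IA_STORAGE) / IA_RETRIEVAL


if __name__ == "__main__":
    print(f"Break-even: {break_even_reads_per_month():.2f} full reads per month")
    for reads in (0.0, 0.25, 1.0, 2.0):
        print(reads, monthly_cost_standard(1000), monthly_cost_ia(1000, reads))
```

Under these assumptions IA stops being cheaper once the data is read a little more than once a month, which lines up with the rule of thumb given for S3 Standard.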
S3 Glacier Instant Retrieval — Cool Storage
- Cost: ~$0.004/GB/month + ~$0.03/GB retrieval
- Retrieval: Milliseconds (same as Standard)
- Use case: Quarterly access patterns — compliance archives that need to be available for audits, medical images accessed once a year
- Minimum storage duration: 90 days
~82% cheaper than Standard for storage. Instant retrieval but expensive to read. Ideal for data you must keep and can access quickly, but rarely do.
S3 Glacier Flexible Retrieval — Cold Storage
- Cost: ~$0.0036/GB/month
- Retrieval: 1–5 minutes (expedited), 3–5 hours (standard), 5–12 hours (bulk)
- Use case: Annual access — backup archives, regulatory records, old project data
- Minimum storage duration: 90 days
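This is the classic "cold storage" tier. You can't read data immediately: you submit a retrieval request and wait for the object to be restored. The three retrieval speeds are priced very differently. Expedited is the most expensive per GB (roughly 3x standard), while bulk is the cheapest and carries little or no per-GB charge.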
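The request-and-wait flow can be sketched with boto3's `restore_object` and `head_object` calls. This is an illustrative sketch, not a production implementation: the helper names, the 7-day default for how long the restored copy stays available, and the idea of polling with `head_object` are assumptions, and credentials and error handling are left out.

```python
from typing import Any

# Retrieval tiers accepted by the S3 restore API.
VALID_TIERS = {"Expedited", "Standard", "Bulk"}


def build_restore_request(days_available: int, tier: str = "Standard") -> dict[str, Any]:
    """Build the RestoreRequest payload for an archived object."""
    if tier not in VALID_TIERS:
        raise ValueError(f"tier must be one of {sorted(VALID_TIERS)}")
    if days_available < 1:
        raise ValueError("days_available must be at least 1")
    return {"Days": days_available, "GlacierJobParameters": {"Tier": tier}}


def request_restore(bucket: str, key: str, days_available: int = 7,
                    tier: str = "Standard") -> None:
    """Submit the restore; the object becomes readable only after it completes."""
    import boto3  # imported lazily so the payload helper works without the SDK

    s3 = boto3.client("s3")
    s3.restore_object(
        Bucket=bucket,
        Key=key,
        RestoreRequest=build_restore_request(days_available, tier),
    )


def is_restore_complete(bucket: str, key: str) -> bool:
    """Check the Restore header returned by head_object for a finished restore."""
    import boto3

    head = boto3.client("s3").head_object(Bucket=bucket, Key=key)
    return 'ongoing-request="false"' in head.get("Restore", "")
```

A caller would typically submit the request once, then check completion on a schedule that matches the chosen tier (minutes for expedited, hours for standard or bulk) rather than polling tightly.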
S3 Glacier Deep Archive — Frozen Storage
- Cost: ~$0.00099/GB/month
- Retrieval: Within 12 hours (standard), within 48 hours (bulk)
- Use case: 7–10 year retention — regulatory compliance archives (HIPAA, SOX, PCI), legal hold data, surveillance footage
- Minimum storage duration: 180 days
The cheapest storage on AWS: ~96% cheaper than Standard. Even a standard retrieval can take up to half a day. This is for data you legally must keep but realistically will never read unless an auditor or regulator asks for it.
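The savings percentages quoted for each tier follow directly from the per-GB prices. A small sketch that recomputes them, assuming the approximate prices listed above and a decimal terabyte (1 TB = 1000 GB):

```python
# Approximate storage prices from the tier descriptions (USD per GB-month).
PRICES = {
    "S3 Standard": 0.023,
    "S3 Infrequent Access": 0.0125,
    "Glacier Instant Retrieval": 0.004,
    "Glacier Flexible Retrieval": 0.0036,
    "Glacier Deep Archive": 0.00099,
}


def savings_vs_standard(price_per_gb: float) -> float:
    """Percentage saved on storage relative to S3 Standard."""
    return (1 - price_per_gb / PRICES["S3 Standard"]) * 100


def tier_summary() -> list[tuple[str, float, float]]:
    """Return (tier, USD per TB-month, percent saved vs Standard) per tier."""
    return [
        (name, price * 1000, savings_vs_standard(price))
        for name, price in PRICES.items()
    ]


if __name__ == "__main__":
    for name, per_tb, saved in tier_summary():
        print(f"{name:<28} ${per_tb:>6.2f}/TB-month  {saved:5.1f}% cheaper")
```

These figures cover storage only; retrieval fees and minimum storage durations shift the real comparison for data that is read or deleted early.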
Real-World Archival Scenarios
Scenario 1: Financial Transaction Archives (7-year retention)
A bank must retain all transaction records for 7 years per regulatory requirements. Records older than 90 days are never accessed unless there's a fraud investigation or audit.
Recommended approach:
- 0–30 days: S3 Standard (active queries and reports)
- 30–90 days: S3 IA (occasional lookups)
- 90 days–1 year: Glacier Instant Retrieval (audit-ready, fast access if needed)
- 1–7 years: Glacier Deep Archive (regulatory retention, 12h retrieval acceptable)
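For 10 TB of new transaction data per month, the archive holds about 840 TB once it reaches the full 7 years. At the per-GB prices above, storing that with the tiered approach costs roughly $1,550/month, versus roughly $19,300/month if everything stayed in S3 Standard (storage only, excluding request, transition, and retrieval charges).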
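The comparison for this scenario can be reproduced with a simple steady-state model. The sketch below assumes 10 TB arrives every month, each month's batch moves through the tiers on the schedule above, 1 TB = 1000 GB, and only storage charges count. All names are illustrative.

```python
GB_PER_TB = 1000
MONTHLY_INGEST_TB = 10
RETENTION_MONTHS = 84  # 7 years

# (tier, months each batch spends in the tier, USD per GB-month)
SCHEDULE = [
    ("S3 Standard", 1, 0.023),                # 0-30 days
    ("S3 IA", 2, 0.0125),                     # 30-90 days
    ("Glacier Instant Retrieval", 9, 0.004),  # 90 days-1 year
    ("Glacier Deep Archive", 72, 0.00099),    # 1-7 years
]


def tiered_monthly_cost(ingest_tb: float, schedule) -> float:
    """Monthly storage bill once every tier is full (steady state)."""
    return sum(ingest_tb * GB_PER_TB * months * price
               for _, months, price in schedule)


def flat_monthly_cost(ingest_tb: float, months: int, price: float) -> float:
    """Monthly storage bill if every batch stays in one tier for its lifetime."""
    return ingest_tb * GB_PER_TB * months * price


if __name__ == "__main__":
    tiered = tiered_monthly_cost(MONTHLY_INGEST_TB, SCHEDULE)
    flat = flat_monthly_cost(MONTHLY_INGEST_TB, RETENTION_MONTHS, 0.023)
    print(f"Tiered: ${tiered:,.0f}/month  Standard only: ${flat:,.0f}/month")
```

Under these assumptions the totals come out near $1,550 and $19,300 per month, a ratio of about 12x. The exact figures depend on how month boundaries, volume discounts, and request charges are modelled.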
Scenario 2: Compliance Communication Archives (Petabyte-scale)
An enterprise compliance platform captures and archives all corporate communication data — emails, chats, voice recordings — for regulatory supervision. Data volumes are in petabytes, retention is 7–10 years, but most data is only accessed during investigations.
Recommended approach:
- 0–90 days: S3 Standard + Elasticsearch for active supervision and search
- 90 days–2 years: Glacier Flexible Retrieval (investigation access in 3–5 hours)
- 2–10 years: Glacier Deep Archive (regulatory hold, 12h retrieval)
At petabyte scale, Deep Archive at ~$1/TB/month is the difference between a viable product and bankruptcy-level storage bills.
Scenario 3: Surveillance Video Archives
A retail chain stores CCTV footage. Local regulations require 90 days of retention. Corporate policy extends to 1 year for loss prevention analysis.
Recommended approach:
- 0–7 days: S3 Standard (active review by security team)
- 7–90 days: Glacier Instant Retrieval (quick access for incident investigation)
- 90 days–1 year: Glacier Deep Archive (long-term retention at minimal cost)
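Before committing to a schedule like this, it is worth checking how long objects actually stay in each tier against that tier's minimum storage duration. The sketch below uses the minimums listed earlier and assumes, as I understand AWS billing, that leaving a tier early (by deletion or transition) is charged for the remaining days; that assumption should be confirmed against the current pricing page. The helper names are hypothetical.

```python
# Minimum storage durations in days, keyed by S3 storage class name.
MIN_DAYS = {
    "STANDARD": 0,
    "STANDARD_IA": 30,
    "GLACIER_IR": 90,
    "GLACIER": 90,        # Glacier Flexible Retrieval
    "DEEP_ARCHIVE": 180,
}

# Scenario 3 as (day the object enters the class, storage class).
SURVEILLANCE = [(0, "STANDARD"), (7, "GLACIER_IR"), (90, "DEEP_ARCHIVE")]


def dwell_times(transitions, expiration_days):
    """Return [(storage_class, days spent in that class)] for a schedule."""
    ordered = sorted(transitions)
    # Each class ends when the next one starts; the last ends at expiration.
    ends = [day for day, _ in ordered[1:]] + [expiration_days]
    return [(cls, end - start) for (start, cls), end in zip(ordered, ends)]


def short_stays(transitions, expiration_days):
    """Return [(storage_class, days_spent, days_short)] for stays below the minimum."""
    return [
        (cls, days, MIN_DAYS[cls] - days)
        for cls, days in dwell_times(transitions, expiration_days)
        if days < MIN_DAYS[cls]
    ]


if __name__ == "__main__":
    print(dwell_times(SURVEILLANCE, 365))
    print(short_stays(SURVEILLANCE, 365))
```

For the surveillance schedule this flags Glacier Instant Retrieval: footage spends 83 days there, 7 short of the 90-day minimum. Whether that matters is a cost judgment; the shortfall is small, but moving the first transition a week earlier or the second a week later would avoid it.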
Automating the Tiering: S3 Lifecycle Policies
You don't manually move objects between tiers. S3 Lifecycle Policies automate transitions based on object age:
{
  "Rules": [
    {
      "ID": "ArchivalTiering",
      "Status": "Enabled",
      "Filter": { "Prefix": "" },
      "Transitions": [
        { "Days": 30, "StorageClass": "STANDARD_IA" },
        { "Days": 90, "StorageClass": "GLACIER_IR" },
        { "Days": 365, "StorageClass": "DEEP_ARCHIVE" }
      ],
      "Expiration": { "Days": 2555 }
    }
  ]
}
This single policy handles the entire lifecycle: data flows from hot to warm to cool to frozen, and is automatically deleted after 7 years (2555 days). Set it once and S3 applies it to every matching object as it ages, so the cost savings compound every month.
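One way to apply such a policy is through boto3's `put_bucket_lifecycle_configuration`. The sketch below is an assumption-laden illustration: the builder function is hypothetical, the empty-prefix filter is my choice for "apply to every object", and the bucket name in the usage note is a placeholder.

```python
def build_archival_config(retention_years: int = 7) -> dict:
    """Build a lifecycle configuration mirroring the policy above."""
    return {
        "Rules": [
            {
                "ID": "ArchivalTiering",
                "Status": "Enabled",
                # Assumption: an empty prefix makes the rule match all objects.
                "Filter": {"Prefix": ""},
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    {"Days": 90, "StorageClass": "GLACIER_IR"},
                    {"Days": 365, "StorageClass": "DEEP_ARCHIVE"},
                ],
                # 7 years x 365 days = 2555 days (leap days ignored).
                "Expiration": {"Days": retention_years * 365},
            }
        ]
    }


def apply_archival_config(bucket: str, retention_years: int = 7) -> None:
    """Replace the bucket's lifecycle configuration with the archival rule."""
    import boto3  # imported lazily so the builder works without the SDK

    boto3.client("s3").put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration=build_archival_config(retention_years),
    )
```

Calling `apply_archival_config("example-archive-bucket")` would overwrite any lifecycle rules already on that bucket, so existing rules need to be merged into the configuration first.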
Key Decision Factors
When choosing a cold storage strategy, evaluate these factors:
- Retrieval SLA — How fast must you be able to access the data? Instant? Hours? Days? This is the primary differentiator between tiers.
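- Retrieval frequency — How often will you actually access the data? Once a month? Once a year? Never? IA and Glacier Instant Retrieval charge per GB retrieved; Glacier Flexible Retrieval and Deep Archive charge per restore request and per GB, at a rate that depends on the retrieval speed you choose.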
- Data volume — At terabyte scale, the cost difference between tiers is significant. At petabyte scale, it's the difference between viable and not viable.
- Compliance requirements — Some regulations specify minimum retention periods. Some require data to be "reasonably accessible." Define what that means in SLA terms before choosing a tier.
- Minimum storage duration — Glacier Deep Archive has a 180-day minimum (the other Glacier tiers have 90 days, IA has 30). If you delete data before that, you still pay for the full minimum period. Don't use these tiers for data with unpredictable lifespans.
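These factors can be folded into a rough first-pass heuristic. The function below is only a sketch: the thresholds are taken from the access patterns and minimum durations described in this article, the function name and parameters are hypothetical, and compliance constraints and data volume still need a human decision.

```python
def recommend_tier(max_wait_hours: float, reads_per_year: float,
                   retention_days: int) -> str:
    """Suggest an S3 storage class from retrieval SLA, access rate, and lifespan."""
    # Read more than once a month, or too short-lived for any minimum duration.
    if reads_per_year > 12 or retention_days < 30:
        return "S3 Standard"
    # Can wait half a day, lives past the 180-day minimum, read at most yearly.
    if max_wait_hours >= 12 and retention_days >= 180 and reads_per_year <= 1:
        return "S3 Glacier Deep Archive"
    # Can wait for a 3-5 hour standard restore and lives past 90 days.
    if max_wait_hours >= 5 and retention_days >= 90 and reads_per_year <= 1:
        return "S3 Glacier Flexible Retrieval"
    # Needs instant access, but is read about quarterly or less.
    if retention_days >= 90 and reads_per_year <= 4:
        return "S3 Glacier Instant Retrieval"
    # Instant access, read a few times per quarter.
    return "S3 Infrequent Access"


if __name__ == "__main__":
    print(recommend_tier(max_wait_hours=24, reads_per_year=0, retention_days=2555))
    print(recommend_tier(max_wait_hours=0, reads_per_year=2, retention_days=365))
```

The order of the checks matters: the cheapest tier whose constraints are satisfied wins, and anything that fails every archive test falls back to Infrequent Access.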
Beyond AWS: When to Consider Alternatives
AWS isn't the only option for cold storage:
- Azure Blob Archive — Similar pricing to Glacier Deep Archive. Consider if you're already in the Azure ecosystem.
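- Google Cloud Archive — Comparable to Glacier Deep Archive on storage price (slightly higher), but data is readable in milliseconds with no restore step. It does charge a per-GB retrieval fee and has a 365-day minimum storage duration.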
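- Backblaze B2 — Significantly cheaper than S3 Standard for straightforward backup/archive use cases that need instant access, though its per-GB storage price is higher than the Glacier tiers. No minimum storage duration.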
- Physical tape (Iron Mountain, etc.) — Still relevant for extremely long retention (20+ years), air-gapped security requirements, or regulatory mandates that require physical media.
Conclusion
Cold storage is not a one-size-fits-all decision. The right choice depends on retrieval speed requirements, access frequency, data volume, and compliance constraints. The most cost-effective approach is almost always a tiered lifecycle — automated transitions from hot to warm to cold to frozen — rather than keeping everything in a single tier.
At TechTrailCamp, data lifecycle and storage architecture is a key topic in our consulting practice and training tracks. Whether you're archiving terabytes of financial records or petabytes of compliance data, we help you design storage strategies that meet regulatory requirements without burning budget.