Traditional HDFS relies on replication (typically 3x) to provide data durability. While effective, this approach leads to high storage overhead — for every 1 TB of data, you need 3 TB of disk.

Erasure Coding (EC) offers a more efficient alternative. Introduced in Hadoop 3.0, EC brings storage overhead down from 200% to 50% with the default policy, roughly halving raw disk usage without sacrificing fault tolerance.

In this blog, we’ll explore how erasure coding in HDFS works, when to use it, and how to configure it for storage-efficient and resilient big data infrastructure.


What is Erasure Coding?

Erasure coding is a method of data protection where files are split into data blocks and parity blocks, which can be used to reconstruct lost data.

  • In HDFS, the default EC policy is RS-6-3-1024k:
    • 6 data blocks
    • 3 parity blocks
    • Cell size (the striping unit, not the HDFS block size): 1024 KB

Compared to 3x replication (200% overhead), RS-6-3 requires only 50% overhead.

Original Data = 6 blocks
Parity Blocks = 3
Total Blocks Stored = 9
Overhead = 3/6 = 50%
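
To make the savings concrete, here is the same arithmetic applied to a hypothetical 6 TB dataset:

3x Replication = 6 TB x 3 = 18 TB on disk
RS-6-3 EC = 6 TB x 1.5 = 9 TB on disk
Savings = 9 TB of raw storage (50% less)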

How Erasure Coding Works in HDFS

  1. A file is split into data stripes.
  2. Each stripe is encoded into data + parity blocks.
  3. Blocks are distributed across DataNodes.
  4. If any blocks are lost, they can be reconstructed from the remaining blocks.

This allows HDFS to tolerate the loss of up to 3 blocks per block group; 3x replication, by comparison, can survive the loss of only 2 replicas.
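
As a simplified illustration of steps 1-3 (assuming the default RS-6-3-1024k policy), the client writes the file in 1 MB cells, round-robin across the 6 data blocks of a block group, and computes 3 parity cells for each full stripe:

Stripe 1: cells 1-6 go to data blocks 1-6; parity cells p1-p3 go to parity blocks 1-3
Stripe 2: cells 7-12 go to data blocks 1-6 again, and so on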


Enabling Erasure Coding in HDFS

EC is disabled by default. To enable it:

  1. List the supported policies:
hdfs ec -listPolicies
  2. Enable the policy you want to use:
hdfs ec -enablePolicy -policy RS-6-3-1024k
  3. Apply the EC policy to a directory:
hdfs ec -setPolicy -path /data/warehouse -policy RS-6-3-1024k

Only new files written to /data/warehouse will use EC. Existing files remain unchanged.
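
To convert existing data, it has to be rewritten under the new policy. One common approach (a sketch; the /data/warehouse_old source path is hypothetical) is to copy the old data into the EC directory with distcp, so the copies are written as erasure-coded files:

hadoop distcp /data/warehouse_old /data/warehouse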


Verifying Erasure-Coded Files

To confirm EC is applied:

hdfs ec -getPolicy -path /data/warehouse
hdfs fsck /data/warehouse -files -blocks -racks

In the output, erasure-coded files are reported as block groups rather than individual replicated blocks; look for the Erasure Coded Block Groups section in the fsck summary.
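
Another rough sanity check (assuming the directory already contains files written under the EC policy) is to compare logical size with raw disk usage; under RS-6-3 the space consumed should be closer to 1.5x the file size than the 3x you would see with replication:

hdfs dfs -du -s -h /data/warehouse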


When to Use Erasure Coding

Use EC for:

  • Cold or archival data
  • Historical logs
  • Large datasets with low read/write frequency

Avoid EC for:

  • Small files (smaller than a full stripe, where parity overhead can match or exceed replication)
  • Hot or frequently accessed data
  • Random writes (not supported)

EC incurs higher CPU and network overhead during read/write and recovery, so it’s best suited for cold storage.
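
If a hot subdirectory lives inside an otherwise cold EC tree, you can keep it on replication. A minimal sketch (the /data/warehouse/hot path is hypothetical; -replicate forces ordinary replication for that directory):

hdfs ec -setPolicy -path /data/warehouse/hot -replicate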


Performance Considerations

Factor              Replication   Erasure Coding
Storage Overhead    200%          50% (RS-6-3)
Write Speed         Fast          Slower (more CPU/network)
Read Speed          Fast          Slower (decode cost)
Use Case            Hot data      Cold/archive data

To improve performance:

  • Use EC on large files only
  • Avoid using EC for files smaller than 100MB
  • Use SSDs or tune memory buffers for encoding/decoding
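
On the CPU side, Reed-Solomon encoding and decoding runs much faster with the native Intel ISA-L coder than with the pure-Java fallback. On Hadoop 3 builds you can check whether ISA-L support is loaded with:

hadoop checknative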

Monitoring and Maintenance

Monitor EC-related health with:

hdfs dfsadmin -report
hdfs fsck /

Watch for:

  • Block group health
  • Corruption or uncorrectable failures
  • Network load due to EC reconstruction

Use HDFS storage policies to place EC data on lower-cost hardware (HDDs or archive-class volumes).
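
A minimal sketch of pairing EC with a storage policy (the path and the COLD policy are just examples; COLD requires DataNode volumes tagged as ARCHIVE storage):

hdfs storagepolicies -setStoragePolicy -path /data/warehouse -policy COLD
hdfs storagepolicies -getStoragePolicy -path /data/warehouse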


Best Practices for Using Erasure Coding

  • Apply EC only to specific directories
  • Benchmark read/write performance before deploying to production
  • Enable Kerberos and access control to protect EC metadata
  • Use HDFS snapshots for critical EC directories (example below)
  • Regularly audit and validate EC health via fsck
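
For the snapshot recommendation above, a minimal example (the path and snapshot name are placeholders):

hdfs dfsadmin -allowSnapshot /data/warehouse
hdfs dfs -createSnapshot /data/warehouse before-ec-audit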

Conclusion

Erasure coding in HDFS is a powerful feature that enables cost-efficient storage without compromising on fault tolerance. By understanding its benefits and trade-offs, you can apply EC strategically to reduce storage costs while ensuring data durability in your Hadoop ecosystem.

Used properly, EC can extend the lifespan of your data infrastructure and help meet the demands of ever-growing data volumes.