XenonStack

A Stack Innovator

Tuesday 22 May 2018

Persistent Storage Solution For Containers - Docker & Kubernetes


Overview of Persistent Storage

Persistent storage is essential for running stateful containers. Kubernetes is an open-source system for automating the deployment and management of containerized applications.
Many storage options are available. In this blog, we discuss the most widely used ones, which can run on-premises or in the cloud: GlusterFS, CephFS, Ceph RBD, OpenEBS, NFS, GCE Persistent Disk, AWS EBS, and Azure Disk.

Prerequisites for Implementing Storage Solutions

To follow this guide, you need:
  • Kubernetes
  • kubectl
  • Dockerfile
  • Container Registry
  • The storage technologies that will be used:
    • OpenEBS
    • CephFS
    • Ceph RBD
    • GlusterFS
    • AWS EBS
    • GCE Persistent Disk
    • Azure Disk
    • NFS

Kubernetes

Kubernetes is a widely used open-source orchestration platform for deploying, autoscaling, and managing containerized applications.

Kubectl

kubectl is the command-line utility for managing Kubernetes clusters, either remotely or locally. To configure kubectl, see the installation guide in the Kubernetes documentation.

Container Registry

A container registry is the repository where we store and distribute Docker images. Several hosted registries are available, including Docker Hub, the registries offered by Google Cloud and Microsoft Azure, and AWS Elastic Container Registry (ECR).
Container storage is ephemeral: all data written inside a container is lost when the container crashes or is restarted. Stateful applications such as MySQL, Apache, or PostgreSQL therefore need persistent storage so that we don't lose our data when a container stops.
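As a rough illustration of how a stateful container keeps its data, the sketch below builds a PersistentVolumeClaim and a MySQL pod that mounts it, then prints both as JSON. The names `mysql-data` and `mysql`, the 5Gi size, the image tag, and the password value are assumptions chosen for illustration, and the claim relies on the cluster having a default StorageClass or a matching PersistentVolume.

```python
import json

# Hypothetical claim: name and size are placeholders.
pvc = {
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {"name": "mysql-data"},
    "spec": {
        "accessModes": ["ReadWriteOnce"],
        "resources": {"requests": {"storage": "5Gi"}},
    },
}

# Pod that mounts the claim at MySQL's data directory, so the
# database files outlive any single container.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "mysql"},
    "spec": {
        "containers": [{
            "name": "mysql",
            "image": "mysql:5.7",
            "env": [{"name": "MYSQL_ROOT_PASSWORD", "value": "change-me"}],
            "volumeMounts": [{"name": "data", "mountPath": "/var/lib/mysql"}],
        }],
        "volumes": [{
            "name": "data",
            "persistentVolumeClaim": {"claimName": pvc["metadata"]["name"]},
        }],
    },
}

# A v1 List bundles both objects into a single document.
manifest = {"apiVersion": "v1", "kind": "List", "items": [pvc, pod]}
print(json.dumps(manifest, indent=2))
```

Saving the output to a file and running `kubectl apply -f` on it should create both objects, since kubectl accepts JSON as well as YAML.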

3 Types of Storage 

  • Block Storage - The most commonly used type of storage, and a very flexible one. Block storage stores data in chunks called blocks, and each block is identified only by its address. It is mostly used for databases because of its performance.
  • File Storage - Stores data as files; each file is referenced by a filename and has attributes associated with it. NFS is one of the most commonly used file storage protocols. File storage is a good fit when we want to share data among multiple containers.
  • Object Storage - Differs from both file storage and block storage: data is stored as objects, each referenced by an object ID. It is massively scalable and offers more flexibility than block storage, but its performance is lower. Commonly used object stores include Amazon S3, Swift, and Ceph Object Storage.
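The three addressing schemes can be contrasted with a toy in-memory model. This is only a sketch: the 4-byte block size, the helper names, and the use of a SHA-256 content hash as the object ID are assumptions made for illustration, and real object stores usually let the client choose the key.

```python
import hashlib

BLOCK_SIZE = 4  # tiny block size so the example stays readable

# Block storage: a flat device addressed only by block number.
device = bytearray(BLOCK_SIZE * 4)

def write_block(addr: int, data: bytes) -> None:
    if len(data) != BLOCK_SIZE:
        raise ValueError("data must be exactly one block long")
    device[addr * BLOCK_SIZE:(addr + 1) * BLOCK_SIZE] = data

def read_block(addr: int) -> bytes:
    return bytes(device[addr * BLOCK_SIZE:(addr + 1) * BLOCK_SIZE])

# File storage: data referenced by a filename, with attributes attached.
files = {}

def write_file(name: str, data: bytes, owner: str) -> None:
    files[name] = {"data": data, "owner": owner, "size": len(data)}

# Object storage: data referenced by an opaque object ID.
objects = {}

def put_object(data: bytes) -> str:
    object_id = hashlib.sha256(data).hexdigest()
    objects[object_id] = data
    return object_id

write_block(2, b"ABCD")
write_file("/data/report.txt", b"hello", owner="app")
oid = put_object(b"hello")

print(read_block(2))
print(files["/data/report.txt"]["owner"], files["/data/report.txt"]["size"])
print(oid[:12], objects[oid])
```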

7 Emerging Storage Technologies

OpenEBS Container Storage

OpenEBS is a purely container-based storage platform for Kubernetes. With OpenEBS, we can easily provide persistent storage to stateful containers, and disk provisioning is automated.
It is a scalable storage solution that can run anywhere, from the cloud to on-premises hardware.
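Automated provisioning in Kubernetes is typically requested through a PersistentVolumeClaim that names a StorageClass. The sketch below shows the shape of such a claim; the class name `openebs-standard` is an assumption about how OpenEBS was installed in your cluster, and `demo-vol` and the size are placeholders.

```python
import json

def dynamic_claim(name: str, storage_class: str, size: str) -> dict:
    """Build a PVC that asks the named StorageClass to provision a volume."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "storageClassName": storage_class,
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": size}},
        },
    }

# "openebs-standard" is assumed; list the real classes with
# `kubectl get storageclass` and substitute one of them.
claim = dynamic_claim("demo-vol", "openebs-standard", "5Gi")
print(json.dumps(claim, indent=2))
```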

Ceph Storage Cluster

Ceph is an advanced, scalable software-defined storage system that provides object storage, block storage, and a file system on a single platform.
Ceph can also be used with Kubernetes. We can use either CephFS or Ceph RBD as persistent storage for Kubernetes pods.
  • Ceph RBD is block storage that we assign to a pod. A Ceph RBD volume cannot be shared by two pods at the same time in read-write mode.
  • CephFS is a POSIX-compliant file system that stores its data on top of a Ceph cluster. We can share a CephFS volume among multiple pods at the same time. CephFS has been declared stable since the Ceph Jewel release.
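The sharing difference between the two shows up in the access modes of a PersistentVolume. The sketch below uses the in-tree `rbd` and `cephfs` volume types; the monitor address, pool, image name, user, and Secret name are all hypothetical placeholders.

```python
import json

MONITORS = ["10.0.0.1:6789"]       # hypothetical Ceph monitor address
SECRET = {"name": "ceph-secret"}   # hypothetical Secret holding the Ceph key

def persistent_volume(name: str, access_modes: list, source: dict) -> dict:
    """Wrap a volume source in a 10Gi PersistentVolume."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": name},
        "spec": {
            "capacity": {"storage": "10Gi"},
            "accessModes": access_modes,
            **source,
        },
    }

# Ceph RBD: block device, read-write from a single node at a time.
rbd_pv = persistent_volume("ceph-rbd-pv", ["ReadWriteOnce"], {
    "rbd": {"monitors": MONITORS, "pool": "rbd", "image": "demo-image",
            "user": "admin", "secretRef": SECRET, "fsType": "ext4"},
})

# CephFS: shared file system, read-write from many pods at once.
cephfs_pv = persistent_volume("cephfs-pv", ["ReadWriteMany"], {
    "cephfs": {"monitors": MONITORS, "path": "/", "user": "admin",
               "secretRef": SECRET},
})

print(json.dumps([rbd_pv, cephfs_pv], indent=2))
```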

GlusterFS Storage Cluster

GlusterFS is a scalable network file system suitable for cloud storage. Like Ceph, it is software-defined storage that runs on commodity hardware, but it provides only a file system, which makes it comparable to CephFS.
GlusterFS is reported to be faster than Ceph for some workloads because it uses a larger block size: 128 KB for GlusterFS compared with 64 KB for Ceph.
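With the in-tree `glusterfs` volume type, a pod reaches the Gluster cluster through an Endpoints object that lists the storage nodes. A minimal sketch follows; the node IPs, the Gluster volume name `demo-volume`, and the object names are assumptions.

```python
import json

GLUSTER_NODES = ["10.0.0.11", "10.0.0.12"]  # hypothetical Gluster node IPs

# Endpoints object listing the Gluster nodes. The port value is a
# placeholder: the field is required but any legal port is accepted.
endpoints = {
    "apiVersion": "v1",
    "kind": "Endpoints",
    "metadata": {"name": "glusterfs-cluster"},
    "subsets": [{
        "addresses": [{"ip": ip} for ip in GLUSTER_NODES],
        "ports": [{"port": 1}],
    }],
}

pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "gluster-demo"},
    "spec": {
        "containers": [{
            "name": "app",
            "image": "nginx",
            "volumeMounts": [{"name": "shared", "mountPath": "/mnt/gluster"}],
        }],
        "volumes": [{
            "name": "shared",
            "glusterfs": {
                "endpoints": endpoints["metadata"]["name"],
                "path": "demo-volume",  # name of the Gluster volume
                "readOnly": False,
            },
        }],
    },
}

print(json.dumps([endpoints, pod], indent=2))
```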

AWS EBS Block Storage

Amazon EBS provides persistent block storage volumes that are attached to EC2 instances. AWS offers several EBS volume types, so we can choose storage according to requirements such as the number of IOPS and the storage medium (SSD or HDD).
We mount EBS volumes into Kubernetes pods as persistent block storage using the awsElasticBlockStore volume type. Each EBS volume is automatically replicated within its Availability Zone for durability and high availability, and it can only be attached to an instance in that same zone.
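Because an EBS volume can only be attached to an instance in its own Availability Zone, a pod that mounts one has to land on a node in that zone. The sketch below pins the pod with a zone node selector; the volume ID and zone are made-up placeholders, and the volume is assumed to exist already.

```python
import json

# Hypothetical values: substitute a real volume ID and its zone.
VOLUME_ID = "vol-0123456789abcdef0"
VOLUME_ZONE = "us-east-1a"

pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "ebs-demo"},
    "spec": {
        # Schedule the pod onto a node in the same zone as the volume.
        "nodeSelector": {"topology.kubernetes.io/zone": VOLUME_ZONE},
        "containers": [{
            "name": "app",
            "image": "nginx",
            "volumeMounts": [{"name": "data", "mountPath": "/data"}],
        }],
        "volumes": [{
            "name": "data",
            "awsElasticBlockStore": {"volumeID": VOLUME_ID, "fsType": "ext4"},
        }],
    },
}

print(json.dumps(pod, indent=2))
```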

GCEPersistentDisk Storage

GCEPersistentDisk is durable, high-performance block storage for Google Cloud Platform. We can use it with either Google Compute Engine or Google Kubernetes Engine (formerly Google Container Engine).
We can choose between HDD and SSD and can increase the size of the disk as our needs grow. Persistent disk data is stored redundantly for durability and high availability.
We mount a persistent disk into Kubernetes pods as persistent block storage using the gcePersistentDisk volume type.
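The HDD-or-SSD choice and the ability to grow a disk can both be expressed in a StorageClass. The sketch below uses the in-tree `kubernetes.io/gce-pd` provisioner; the class names `fast` and `standard` are assumptions.

```python
import json

def gce_pd_storage_class(name: str, disk_type: str) -> dict:
    """Build a StorageClass for GCE persistent disks of the given type."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": "kubernetes.io/gce-pd",
        "parameters": {"type": disk_type},
        # Lets claims using this class be resized after creation.
        "allowVolumeExpansion": True,
    }

ssd_class = gce_pd_storage_class("fast", "pd-ssd")           # SSD-backed
hdd_class = gce_pd_storage_class("standard", "pd-standard")  # HDD-backed

print(json.dumps([ssd_class, hdd_class], indent=2))
```

A PersistentVolumeClaim then selects one of the two by setting `storageClassName` to the class name.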

Azure Disk Storage

An Azure Disk is also durable, high-performance block storage, like AWS EBS and GCEPersistentDisk. It offers a choice of SSD or HDD for your environment, along with features such as point-in-time backup and easy migration.
An azureDisk volume is used to mount an Azure Data Disk into a pod. Azure keeps multiple replicas of each disk for high availability and durability.
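A pod that mounts an existing managed disk through the in-tree `azureDisk` volume type might look like the sketch below. The disk name is hypothetical, and the angle-bracket parts of the disk URI are placeholders to be filled in with your own subscription and resource group.

```python
import json

DISK_NAME = "demo-disk"  # hypothetical managed disk name
# Placeholder resource ID: fill in the angle-bracket parts yourself.
DISK_URI = (
    "/subscriptions/<subscription-id>/resourceGroups/<resource-group>"
    "/providers/Microsoft.Compute/disks/" + DISK_NAME
)

pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "azure-disk-demo"},
    "spec": {
        "containers": [{
            "name": "app",
            "image": "nginx",
            "volumeMounts": [{"name": "data", "mountPath": "/data"}],
        }],
        "volumes": [{
            "name": "data",
            "azureDisk": {
                "kind": "Managed",
                "diskName": DISK_NAME,
                "diskURI": DISK_URI,
            },
        }],
    },
}

print(json.dumps(pod, indent=2))
```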

Network File System Storage

NFS is one of the oldest network file systems in use. It lets multiple machines share a single file system over the network.
Several NAS devices are available for high performance, or we can configure our own system to act as a NAS. We use NFS as persistent storage for pods, and the data can be shared among multiple instances.
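To show data being shared among multiple instances, the sketch below builds a Deployment whose three replicas all mount the same NFS export. The server hostname and export path are hypothetical, and the NFS server is assumed to be reachable from every node.

```python
import json

NFS_SERVER = "nfs.example.internal"  # hypothetical NFS server
NFS_EXPORT = "/exports/shared"       # hypothetical exported path

LABELS = {"app": "nfs-web"}

deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "nfs-web"},
    "spec": {
        "replicas": 3,  # every replica sees the same files
        "selector": {"matchLabels": LABELS},
        "template": {
            "metadata": {"labels": LABELS},
            "spec": {
                "containers": [{
                    "name": "web",
                    "image": "nginx",
                    "volumeMounts": [{
                        "name": "shared",
                        "mountPath": "/usr/share/nginx/html",
                    }],
                }],
                "volumes": [{
                    "name": "shared",
                    "nfs": {"server": NFS_SERVER, "path": NFS_EXPORT},
                }],
            },
        },
    },
}

print(json.dumps(deployment, indent=2))
```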

Deployment of Storage Solutions

We will now walk through deploying the storage solutions described above, starting with Ceph.
