Basic Concept of Docker



SERVER MANAGEMENT

 
Server management is very complicated and difficult

In general, server management is complicated and difficult. Just a few years ago, one of a developer's main tasks was installing various programs on the operating system in order to deploy source code. The installation manuals were long and complicated, and installations would fail for unknown reasons even though everything worked properly on my local machine. If something did not install correctly, I would reinstall from the OS up, again and again.
For example, the OS and database server versions used by the company were fixed, and updating them was a huge risk, so it was best to leave the server untouched as much as possible after installation.
Setting up new servers meant sleepless nights, and I became a master of configuration and installation. Just when you think you have become accustomed to it, the Linux distribution or the environment changes and the problems start all over again.
Installing several programs on one server was also a problem. If they required different versions of the same libraries, or used the same port, installation became very difficult.
Nowadays, deployments are more frequent, microservice architecture is popular, and programs have become more fragmented, which means server management has become even more complex. New tools keep coming out, and with the evolution of the cloud, hundreds or even thousands of servers may need to be set up.
In this situation, Docker changed the whole approach to server management.


HISTORY OF DOCKER 

The Future of Linux Containers
At the PyCon conference in Santa Clara, California, in March 2013, Docker was introduced to the world by dotCloud founder Solomon Hykes during a session called "The Future of Linux Containers".
After that, Docker quickly became popular. The company was renamed Docker, Inc. and announced Docker 1.0 in June 2014. In August 2014 it sold the dotCloud platform to focus on Docker, and by 2015 it had already raised about $95M in funding. Docker is still growing fast.

The Evolution of the Modern Software Supply Chain - The Docker Survey, 2016

This is a 2016 survey by Docker. It is pretty old, but it shows how popular Docker is and how fast it spread. According to the results, 90% of respondents use Docker for development, 80% plan to use it for DevOps, and 58% use it in production.


WHAT IS DOCKER?


Docker is a platform to manage containers

Docker is an open-source project that automates the deployment of software applications inside containers by providing an additional layer of abstraction and automation of OS-level virtualization on Linux.

In simpler words, Docker is a tool that allows developers, sysadmins, and others to easily deploy their applications in a sandbox (called a container) that runs on the host operating system, i.e. Linux.
The key benefit of Docker is that it allows users to package an application with all of its dependencies into a standardized unit for software development. Unlike virtual machines, containers do not have high overhead and therefore enable more efficient use of the underlying system and its resources.
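
As a minimal sketch of what this looks like in practice (assuming the Docker CLI is installed and the Docker daemon is running), a container can be pulled and run with two commands:

    # download a small public image from Docker Hub
    docker pull hello-world

    # create and start a container from that image
    docker run hello-world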


CONTAINER



We can imagine a container as the square cargo box that is loaded onto ships. Each container can hold a variety of cargo such as clothes, shoes, electronics, liquor, or fruit, and can be easily moved by various means of transport such as container ships and trailers.
A Docker container is similar. Docker abstracts various programs and their execution environments into containers and gives them all the same interface, which simplifies program distribution and management.
Any program, such as a backend application, a database server, or a message queue, can be packaged into a container and run anywhere: on AWS, Azure, Google Cloud, or your personal PC.
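
For example (a sketch, assuming Docker is installed; the container names and the password are placeholders), very different programs are all started through the same interface:

    # a web server, a database, and a message queue,
    # all started with the same command shape
    # ("web", "db", "queue", and "secret" are placeholder values)
    docker run -d --name web nginx
    docker run -d --name db -e MYSQL_ROOT_PASSWORD=secret mysql:8.0
    docker run -d --name queue rabbitmq:3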


VIRTUAL MACHINE VS. DOCKER


Virtual Machine and Docker

Containers provide most of the isolation of virtual machines at a fraction of the computing cost. They are a form of virtualization, but with a different approach, and comparing virtual machines with Docker is a good way to start understanding Docker.

Traditional virtualization virtualizes the OS. Virtual machine tools such as VMware or VirtualBox run an entire guest OS on top of the host OS.
They can virtualize different operating systems (such as running Windows on Linux), which is simple to use but heavy and slow, so this approach is rarely a good fit for production environments.
Installing and virtualizing an additional OS always carries a performance cost, and process-isolation solutions emerged to improve on it. On Linux this is LXC (Linux Containers): it simply isolates processes, which makes it much lighter and faster. CPU and memory are used only as much as the processes need, so there is very little performance loss.

If you run multiple containers on the same server, they run independently without affecting each other, and it feels like using a lightweight virtual machine.
You can run commands in a running container, install packages with apt-get or yum, add users, and run multiple processes in the background. You can limit CPU or memory usage, map a container port to a specific port on the host, or mount a host directory as if it were an internal directory.
Creating a new container takes only one or two seconds, far faster than booting a virtual machine.
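
A sketch of those capabilities (the container name, host directory, and password are placeholder assumptions):

    # start a MySQL container with resource limits, a host port mapping,
    # and a host directory mounted as the data directory
    # ("mysql-demo", "/srv/mysql-data", and "secret" are placeholders)
    docker run -d --name mysql-demo \
      --memory 512m --cpus 1 \
      -p 3306:3306 \
      -v /srv/mysql-data:/var/lib/mysql \
      -e MYSQL_ROOT_PASSWORD=secret \
      mysql:8.0

    # open a shell inside the running container, e.g. to install packages
    docker exec -it mysql-demo bash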

Docker originally used LXC, replaced it with its own libcontainer technology from version 0.9, and libcontainer later became the basis of runC.


WHY USE CONTAINERS?

  • Decoupling allows container-based applications to be deployed easily and consistently, regardless of the target environment.
  • Containers give developers the ability to create predictable environments that are isolated from the rest of the applications and can be run anywhere.
  • From an operations standpoint, apart from portability, containers also give more granular control over resources, improving the efficiency of your infrastructure and the utilization of your compute resources.


IMAGE


Docker Image

An image contains the files and settings needed to run a container; it has no state and never changes (it is immutable).
A container is what you get by running an image: anything added or changed at runtime is stored in the container. You can create multiple containers from the same image, and the image does not change even when a container's state changes or the container is deleted.
The ubuntu image contains all the files needed to run Ubuntu, and the MySQL image contains the files, commands, and port information needed to run MySQL. For a more complicated example, the GitLab image is based on CentOS and contains Ruby, Go, a database, Redis, the GitLab source, and nginx.
Images literally hold all the information needed to run their containers, so you no longer need to compile sources or install dependencies on the server.
Now, when a new server is added, you just download the pre-built image and create a container from it. You can run multiple containers on one server, or across thousands of servers.
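
A short sketch of the relationship between image and container (the image tag is a public one; the container names are made up for illustration):

    # download the image once
    docker pull redis:7

    # create two independent containers from the same image
    # ("cache-1" and "cache-2" are placeholder names)
    docker run -d --name cache-1 redis:7
    docker run -d --name cache-2 redis:7

    # removing a container does not touch the image
    docker rm -f cache-1
    docker images redis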


DOCKER HUB


Docker Store

You can publish your Docker images to Docker Hub, or create and manage your own images with a private Docker Registry. More than 2 million images are now available, Docker Hub serves billions of image downloads, and anyone can easily create and distribute images.
An image is usually several hundred megabytes in size. Storing and managing these large images on your own servers is not easy, but Docker hosts public images for free on Docker Hub.
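
The basic workflow for using and publishing images looks roughly like this (the myaccount account name and myapp image are placeholders):

    # find and download public images
    docker search nginx
    docker pull nginx

    # publish your own image under your Docker Hub account
    # ("myaccount" and "myapp" are placeholder names)
    docker login
    docker tag myapp:1.0 myaccount/myapp:1.0
    docker push myaccount/myapp:1.0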
Docker is not a completely new technology; it packages existing technologies very well. Before Docker, there was no easy way to combine containers, overlay networks, union file systems, and other existing technologies. Docker tied them together with simple but revolutionary ideas.

DOCKER LAYER


Docker Layer
  
Because an image holds everything needed to run its container, its size is usually hundreds of megabytes or even gigabytes. Downloading an image for the first time is not a big burden, but downloading the whole thing again just because a single file was added would be very inefficient.
To solve this problem, Docker uses the concept of layers, together with a union file system that presents multiple layers as a single file system. An image consists of several read-only layers, and when a file is added or modified, a new layer is created.

If the ubuntu image is the set A + B + C, then an nginx image based on it is A + B + C + nginx. If you build a webapp image on top of the nginx image, it consists of the layers A + B + C + nginx + source. When you modify the webapp source, only the new source layer needs to be downloaded, not the A, B, C, and nginx layers, so images can be managed very efficiently.
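
You can see this layering yourself with docker history, which lists the layers an image is built from (shown here for the public nginx image; the exact output depends on the image version):

    # list the layers that make up the image, newest first,
    # together with the size each layer adds
    docker history nginx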

Creating a container uses the same layer mechanism: a read-write layer is added on top of the existing image layers. Since the image layers are preserved, any files the container creates or changes during execution are stored in that read-write layer, so creating multiple containers uses only minimal extra space.
This is a very simple but incredibly clever design compared with full virtualization.
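
A quick way to observe this is docker ps with the --size flag, which shows the size of each container's writable layer separately from the virtual size that includes the shared image layers:

    # SIZE shows each container's writable layer;
    # the "virtual" size includes the shared, read-only image layers
    docker ps --size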


IMAGE PATH

Docker Image URL

Images are named and tagged in a URL-like manner.
The Ubuntu 14.04 image is docker.io/library/ubuntu:14.04 or docker.io/library/ubuntu:trusty, and the docker.io/library prefix is optional, so you can simply use ubuntu:14.04. This approach is easy to understand and easy to use, and tagging makes it easy to test and roll back.
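
For example (the myapp image and its version tags are placeholders for illustration):

    # these two commands refer to the same image
    docker pull docker.io/library/ubuntu:14.04
    docker pull ubuntu:14.04

    # tagging makes promotion and rollback simple
    # ("myapp" and its version numbers are placeholders)
    docker tag myapp:1.1.0 myapp:latest   # promote the new version
    docker tag myapp:1.0.0 myapp:latest   # roll back by re-tagging the old one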


DOCKERFILE


  
To create an image, Docker describes the image-build process in a file called a Dockerfile, written in its own domain-specific language (DSL).
This is a very simple but useful idea. When you install a program on your server, install its dependency packages, and create configuration files, you no longer need to write the steps down in a notepad; you manage them in a Dockerfile instead. The file is versioned together with the source, and anyone can view and modify the image creation process.
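
As a minimal sketch (the Python base image, app.py, and requirements.txt are illustrative assumptions, not part of the original text), a Dockerfile might look like this:

    # Dockerfile: build an image for a small Python web app
    # (app.py and requirements.txt are assumed to exist in the build context)
    FROM python:3.12-slim
    WORKDIR /app
    COPY requirements.txt .
    RUN pip install -r requirements.txt
    COPY . .
    EXPOSE 8000
    CMD ["python", "app.py"]

Building and running it is then a matter of two commands:

    # build the image from the Dockerfile in the current directory,
    # then start a container from it ("myapp" is a placeholder name)
    docker build -t myapp:1.0 .
    docker run -d -p 8000:8000 myapp:1.0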
