XenonStack

A Stack Innovator

Wednesday, 22 March 2017

3/22/2017 03:04:00 pm

How To Deploy PostgreSQL on Kubernetes


What is PostgreSQL?


PostgreSQL is a powerful, open source Relational Database Management System.

PostgreSQL is not controlled by any organization or any individual. Its source code is available free of charge. It is pronounced as "post-gress-Q-L".

PostgreSQL has earned a strong reputation for its reliability, data integrity, and correctness.
  • It runs on all major operating systems, including Linux, UNIX (AIX, BSD, HP-UX, SGI IRIX, macOS, Solaris, Tru64), and Windows.
  • It is fully ACID compliant and has full support for foreign keys, joins, views, triggers, and stored procedures (in multiple languages).
  • It includes most SQL:2008 data types, including INTEGER, NUMERIC, BOOLEAN, CHAR, VARCHAR, DATE, INTERVAL, and TIMESTAMP.
  • It also supports storage of binary large objects, including pictures, sounds, or video.
  • It has native programming interfaces for C/C++, Java, .Net, Perl, Python, Ruby, Tcl, ODBC, among others, and exceptional documentation.


Prerequisites


To follow this guide you need -


Step 1 - Create a PostgreSQL Container Image

Create a file named “Dockerfile” for PostgreSQL. It builds an image with our custom configuration and looks like this -

FROM ubuntu:16.04
MAINTAINER XenonStack

RUN apt-key adv --keyserver hkp://p80.pool.sks-keyservers.net:80 --recv-keys B97B0AFCAA1A47F044F244A07FCC7D46ACCC4CF8

RUN echo "deb http://apt.postgresql.org/pub/repos/apt/ xenial-pgdg main" > /etc/apt/sources.list.d/pgdg.list

RUN apt-get update && apt-get install -y python-software-properties software-properties-common postgresql-9.6 postgresql-client-9.6 postgresql-contrib-9.6

# Run the following commands as the postgres system user created by the package
USER postgres

RUN /etc/init.d/postgresql start &&\
 psql --command "CREATE USER root WITH SUPERUSER PASSWORD 'xenonstack';" &&\
 createdb -O root xenonstack

RUN echo "host all  all 0.0.0.0/0  md5" >> /etc/postgresql/9.6/main/pg_hba.conf

RUN echo "listen_addresses='*'" >> /etc/postgresql/9.6/main/postgresql.conf

# Expose the PostgreSQL port
EXPOSE 5432

# Add VOLUMEs to allow backup of databases
VOLUME  ["/var/lib/postgresql"]

# Set the default command to run when starting the container
CMD ["/usr/lib/postgresql/9.6/bin/postgres", "-D", "/var/lib/postgresql/9.6/main", "-c", "config_file=/etc/postgresql/9.6/main/postgresql.conf"]

This Postgres image uses Ubuntu Xenial as its base image. On top of that, we create a superuser and a default database. Exposing port 5432 lets external systems connect to the PostgreSQL server.

Step 2 - Build PostgreSQL Docker Image


$ docker build -t dr.xenonstack.com:5050/postgres:v9.6 .

Step 3 - Create a Storage Volume (Using GlusterFS)

Use the commands below to create a volume in GlusterFS for PostgreSQL and start it.

We don’t want to lose our PostgreSQL data just because one Gluster server in the cluster dies, so we use a replica count of 2 or more for higher availability of data.


$ gluster volume create postgres-disk replica 2 transport tcp k8-master:/mnt/brick1/postgres-disk  k8-1:/mnt/brick1/postgres-disk
$ gluster volume start postgres-disk
$ gluster volume info postgres-disk
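
Kubernetes pods reach a GlusterFS volume through an Endpoints object that lists the Gluster servers, and the deployment in Step 4 refers to one named glusterfs-cluster. That object is not shown in this article, so the following is only a minimal sketch: the file name glusterfs-endpoints.yml and the two IP addresses are placeholders (assumptions) that would have to be replaced with the real addresses of k8-master and k8-1, and the namespace is assumed to match the one used by the deployment.

```yaml
# glusterfs-endpoints.yml (hypothetical file name)
apiVersion: v1
kind: Endpoints
metadata:
  name: glusterfs-cluster      # name the deployment's glusterfs volume refers to
  namespace: production        # assumed: same namespace as the PostgreSQL pod
subsets:
- addresses:
  - ip: 192.0.2.10             # placeholder for the IP address of k8-master
  ports:
  - port: 1                    # any valid port number; the field is required
- addresses:
  - ip: 192.0.2.11             # placeholder for the IP address of k8-1
  ports:
  - port: 1
```

The addresses have to be IP addresses rather than host names, and the port only needs to be a legal value between 1 and 65535.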





Step 4 - Deploy PostgreSQL on Kubernetes

Deploying PostgreSQL on Kubernetes has the following prerequisites -
  • Docker Image: We have created a Docker Image for Postgres in Step 2
  • Persistent Shared Storage Volume: We have created a Persistent Shared Storage Volume in Step 3
  • Deployment & Service Files: Next, we will create Deployment & Service Files
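
The deployment file in this step places PostgreSQL in a namespace called production, which has to exist before anything can be created in it. If the cluster does not have it yet, a manifest along these lines would create it; the file name namespace.yml is an assumption, not something taken from the original setup.

```yaml
# namespace.yml (hypothetical file name)
apiVersion: v1
kind: Namespace
metadata:
  name: production   # must match the namespace used in deployment.yml
```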

Create a file named “deployment.yml” for PostgreSQL. The deployment file looks like this -

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: postgres
  namespace: production
spec:
  replicas: 1
  template:
    metadata:
      labels:
        k8s-app: postgres
    spec:
      containers:
      - name: postgres
        image: dr.xenonstack.com:5050/postgres:v9.6
        imagePullPolicy: "IfNotPresent"
        ports:
        - containerPort: 5432
        env:
        - name: POSTGRES_USER
          value: postgres
        - name: POSTGRES_PASSWORD
          value: superpostgres
        - name: PGDATA
          value: /var/lib/postgresql/data/pgdata
        volumeMounts:
        - mountPath: /var/lib/postgresql/data
          name: postgredb
      volumes:
      - name: postgredb
        glusterfs:
          endpoints: glusterfs-cluster
          path: postgres-disk
          readOnly: false
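
The prerequisites above also mention a Service file, which is not shown in this excerpt. As a rough sketch of what it might contain, the manifest below exposes the pod inside the cluster on port 5432 by selecting the k8s-app: postgres label from the deployment. The file name service.yml, the Service name, and the default ClusterIP type are assumptions.

```yaml
# service.yml (hypothetical file name)
apiVersion: v1
kind: Service
metadata:
  name: postgres               # assumed Service name
  namespace: production        # same namespace as the deployment
  labels:
    k8s-app: postgres
spec:
  selector:
    k8s-app: postgres          # matches the pod label set in deployment.yml
  ports:
  - name: postgres
    port: 5432                 # port offered by the Service
    targetPort: 5432           # containerPort of the PostgreSQL pod
```

With such a Service in place, and assuming cluster DNS is enabled, other pods in the production namespace could reach the database at the host name postgres on port 5432.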

Continue Reading The Full Article At - XenonStack.com/Blog

Tuesday, 14 March 2017

3/14/2017 12:48:00 pm

Why We Need Modern Big Data Integration Platform




Data is everywhere, and we generate it from many different sources, such as social media, sensors, APIs, and databases.

Healthcare, Insurance, Finance, Banking, Energy, Telecom, Manufacturing, Retail, IoT, and M2M are the leading domains for data generation. Governments are using BigData to improve their efficiency and the distribution of services to the people.

The biggest challenge for enterprises is to create business value from the data coming from existing systems and from new sources. Enterprises are looking for a Modern Data Integration platform for Aggregation, Migration, Broadcast, Correlation, Data Management, and Security.

Traditional ETL is undergoing a paradigm shift toward business agility, and the need for a Modern Data Integration Platform is growing. Enterprises need Modern Data Integration for agility and for end-to-end operations and decision-making, which involves integrating data from different sources and processing it in batch, streaming, and real time, together with BigData Management, BigData Governance, and Security.


BigData Type Includes:
  • What type of data it is
  • The format of the content required
  • Whether the data is transactional, historical, or master data
  • The speed or frequency at which the data must be made available
  • How the data is processed, i.e. in real time or in batch mode


5 V’s to Define BigData



[Figure: The 5 V’s of BigData]










 

Additional 5V’s to Define BigData


[Figure: Additional 5 V’s of BigData]



 

Data Ingestion and Data Transformation


Data Ingestion consists of bringing structured and unstructured data from where it originates into a system where it can be stored and analyzed for making business decisions. Data Ingestion may be continuous or asynchronous, real-time or batched, or both.

Defining the BigData Characteristics: Using the different BigData types helps us define the BigData characteristics, i.e. how the BigData is collected, processed, and analyzed, and how we deploy that data: on-premises or in a public or hybrid cloud.

  • Data type: Type of data
    • Transactional
    • Historical
    • Master Data and others

  • Data Content Format: Format of data
    • Structured (RDBMS)
    • Unstructured (audio, video, and images)
    • Semi-Structured

  • Data Sizes: The size of the data, such as Small, Medium, Large, and Extra Large, which means we can receive data with sizes in bytes, KBs, MBs, or even GBs.

  • Data Throughput and Latency: How much data is expected and how frequently it arrives. Data throughput and latency depend on data sources:
    • On demand, as with Social Media Data
    • Continuous feed, Real-Time (Weather Data, Transactional Data)
    • Time series (Time-Based Data)

  • Processing Methodology: The type of technique to be applied for processing data (e.g. Predictive Analytics, Ad-Hoc Query and Reporting).

  • Data Sources: Where the data is generated
    • The Web and Social Media
    • Machine-Generated
    • Human-Generated, etc.

  • Data Consumers: A list of all possible consumers of the processed data:
    • Business processes
    • Business users
    • Enterprise applications
    • Individual people in various business roles
    • Part of the process flows
    • Other data repositories or enterprise applications

[Figure: Modern BigData integration platform]

 

Major Industries Impacted with BigData



[Figure: Industries impacted by BigData]

 

What is Data Integration?


Data Integration is the process of Data Ingestion, i.e. integrating data from different sources such as RDBMS, social media, sensors, and M2M, and then using Data Mapping, Schema Definition, and Data Transformation to build a data platform for analytics and further reporting. You need to deliver the right data in the right format within the right timeframe.

BigData integration provides a unified view of data for Business Agility and Decision Making and it involves:

  • Discovering the Data
  • Profiling the Data
  • Understanding the Data
  • Improving the Data
  • Transforming the Data

A Data Integration project usually involves the following steps:

  • Ingest Data: Collect data from the different sources where it resides in multiple formats.
  • Transform Data: Convert the data into a single format so that problems can be handled easily with unified data records. The Data Pipeline is the main component used for integration or transformation.
  • MetaData Management: Centralized Data Collection.
  • Store Transformed Data: Store it so that analysts can get exactly what they need when the business needs it, whether in batch or in real time.

[Figure: Modern BigData integration platform]

 

Why Data Integration is required


  • Make Data Records Centralized: Data is stored in different forms, such as tabular, graphical, hierarchical, structured, and unstructured. To make a business decision, a user has to go through all these formats before reaching a conclusion. That is why a single image that combines the different formats helps in making better decisions.
  • Format Selecting Freedom: Every user has a different way or style of solving a problem. Users are free to use data in whatever system and whatever format suits them best.
  • Reduce Data Complexity: When data resides in different formats, complexity grows as the data size grows. This degrades decision-making capability, and much more time is spent understanding how to proceed with the data.
  • Prioritize the Data: With a single image of all the data records, it is easy to prioritize the data and find out what is very useful and what is not required for the business.
  • Better Understanding of Information: A single image of the data also helps non-technical users understand how effectively the data records can be used. When solving any problem, one can win the game only if a non-technical person is able to understand what is being said.
  • Keeping Information Up to Date: Data keeps increasing on a daily basis, and many new things arrive that need to be added to the existing data. Data Integration makes it easy to keep the information up to date.


Thursday, 9 March 2017

3/09/2017 11:23:00 am

Healthcare is Drowning in Data, Thirst For Knowledge



The amount of data in healthcare is increasing at an astonishing rate. However, in general, the industry has not deployed the level of data management and analysis necessary to make use of those data.

As a result, healthcare executives face the risk of being overwhelmed by a flood of unusable data.

Consider the many sources of data. Current medical technology makes it possible to scan a single organ in 1 second and complete a full-body scan in roughly 60 seconds. The result is nearly 10 GB of raw image data delivered to a hospital’s Picture Archive and Communications System (PACS).

Clinical areas in their digital infancy, such as pathology, proteomics, and genomics, which are the key to personalized medicine, can generate over 2TB of data per patient.

Add to that the research and development of advanced medical compounds and devices, which generate terabytes over their lengthy development, testing and approval processes.


 

Doctors Are Drowning In Data


Technology isn't enough to improve healthcare. Doctors must be able to distinguish between valuable data and information overload.

One of the hopes of Electronic Health Records (EHRs) is that they will revolutionize medicine by collecting information that can be used to improve how we provide care. Good data can come out of EHRs only if good data is put in.

This doesn't always happen. To see patients, document encounters, enter smoking status, create coded problem lists, update medication lists, e-prescribe medications, order tests, find, open, and review multiple prior notes, schedule follow-up appointments, search for SNOMED codes, search for ICD-9 codes, and find CPT codes to bill encounters (tasks previously delegated to a number of people), and still interact compassionately with patients, providers have to take shortcuts.

But we have to say that healthcare is drowning in data elements that are not yet interoperable on one platform.

First, the Data Exchange and Interoperability between EMRs, HIEs, Hospitals, Nursing Homes, Home care, ERs, portals, etc., must be addressed and industry standards need to emerge on the technology, but also the costs need to be defined. Who is going to pay for what and when?

It seems like the deepest pockets in the industry – pharmaceuticals and insurance – have not put a dime into technology solutions or Big Data. Yet they have the most to gain. This is a huge disconnect because physicians and hospitals cannot afford to capitalize this start-up by ourselves.

I believe that they will need to be influenced to contribute to this effort, in kind or with cash, for this system to be made whole and meaningful.

HIT industry leaders need to sit down with busy clinicians to create a workflow of automated Big Data in a way that provides all the stakeholders with the data to improve all levels of efficiencies and outcomes.











Decisions Through Data


Small data, predictive modeling expansion, and real-time analytics are three forms of data analytics.

Healthcare data will continue to accumulate rapidly. If practices, hospitals, and healthcare systems do not actively respond to the flood of unstructured data, they risk forgoing the opportunity to use these data in managing their operations.

Small data and Real-Time Analytics are two methods of data analytics that allow practices, hospitals, and healthcare organizations to extract meaningful information.

Predictive Modeling is best suited for organizations managing large patient populations. With all three methods, the applicable information mined from raw data supports improvements in the quality of care and cost efficiency.

The use of Small Data, Real-Time Analytics, and Predictive Modeling will revolutionize the healthcare field by increasing those opportunities beyond reacting to emerging problems.





 

About RayCare:

RayCare is an integrated healthcare platform that ranges from connecting doctors, labs, medicine, and dieticians and getting healthy-life tips, to creating a health profile, medical reports, and daily health tracking, to predictive diagnostic analytics and second-opinion consultation and recommendations.
