XenonStack

A Stack Innovator


Monday, 4 June 2018

6/04/2018 03:53:00 pm

Unit Testing, TDD and BDD in Machine Learning and Deep Learning


Introduction to Test Driven Development (TDD)

Test Driven Development (TDD) is a development pattern in which tests are written before the code they exercise. It lets developers state the intended behavior of the application first and then write code that meets it.
The Test Driven Development process aims to provide the following -
  • Detection of any change in intended behavior.
  • A rapid iteration cycle that produces working software after each iteration.
  • Identification of bugs. If no test is failing but unexpected behavior is still found, it is not treated as a bug; it is treated as a new feature that needs its own test.
Tests can be written for functions and methods, whole classes, programs, web services, whole machine learning pipelines, neural networks, random forests, mathematical implementations and many more.

Test Driven Development Lifecycle

The TDD cycle lets the programmer build functionality in small increments. Each increment passes through the three stages described below -
  • Failed Test (RED) - The first step of TDD is to write a test that fails. In Machine Learning, a failing test might check the output of an algorithm that always predicts the same thing. It serves as a kind of baseline test for Machine Learning algorithms.
  • Pass the Failed Test (GREEN) - After writing the failing test, the next move is to make it pass. The failing test is divided into a number of smaller failing tests, which are then exercised by passing random values and dummy objects.
  • Refactoring the Code - After the test passes, the code needs to be refactored. While refactoring, one must keep in mind that changes to the code should not affect its behavior.
If the developer adds special handling to the code, such as an if statement, the change is no longer refactoring. If a previously passing test changes its result during refactoring, the code has to go through the test cycle again.
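The baseline idea from the RED stage can be sketched as a test that requires a model to beat a predictor that always returns the same label. This is a minimal illustration, not the post's own code: `majority_baseline`, `threshold_model`, `accuracy` and the tiny dataset are all hypothetical names and values invented for the example.

```python
import unittest
from collections import Counter


def majority_baseline(train_labels):
    """Return a predictor that always outputs the most common training label."""
    most_common = Counter(train_labels).most_common(1)[0][0]
    return lambda features: most_common


def threshold_model(features):
    """Hypothetical 'real' model: predicts 1 when the single feature exceeds 0.5."""
    return 1 if features[0] > 0.5 else 0


def accuracy(predict, rows, labels):
    """Fraction of rows for which the predictor returns the expected label."""
    hits = sum(1 for row, label in zip(rows, labels) if predict(row) == label)
    return hits / len(labels)


class TestModelBeatsBaseline(unittest.TestCase):
    def setUp(self):
        # Tiny hand-made dataset, for illustration only.
        self.rows = [[0.1], [0.2], [0.4], [0.6], [0.7], [0.9]]
        self.labels = [0, 0, 0, 1, 1, 1]

    def test_model_beats_constant_prediction(self):
        baseline = majority_baseline(self.labels)
        baseline_acc = accuracy(baseline, self.rows, self.labels)
        model_acc = accuracy(threshold_model, self.rows, self.labels)
        self.assertGreater(model_acc, baseline_acc)


# Run the test case explicitly so the sketch works in a script or a notebook.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestModelBeatsBaseline)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

If `threshold_model` were replaced by a model that always predicts one class, the test would fail, which is the RED signal the cycle starts from.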

Acceptance Test Driven Development (ATDD)

ATDD stands for Acceptance Test Driven Development. This technique is applied before development starts and brings customers, testers, and developers into the loop.
Together they decide the acceptance criteria and work accordingly to meet the requirements.
ATDD helps to ensure that all project members understand what needs to be done and implemented. A failing test gives quick feedback that a requirement is not being met.

Advantages of Acceptance Test Driven Development

  • Because ATDD happens at the very start, it helps to reduce defects and bug-fixing effort as the project progresses.
  • ATDD focuses only on ‘What’ and not ‘How’, which makes it easier to meet the customer’s requirements.
  • ATDD makes developers, testers, and customers work together, which helps everyone understand what is required from the system.

 

Importance of Test Driven Development in Machine and Deep Learning

Often the code does not raise an error, yet the result is not what we expected or the output is not exactly what we wanted.
Let us assume that we want to use a package and we start by importing it. There is a chance that the package has already been imported and we are importing it again.
To avoid such a situation, we want to test whether the package is already imported. So, when we submit the whole code to the test case, the test case should be able to find out whether the package is already imported. This avoids duplication.
Similarly, when we use pre-trained models for predictions, the models are sometimes huge and we want to load a model only once. If we load it multiple times, processing slows down because the copies occupy memory that is not actually required. In this case too, duplication has to be avoided.
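Both kinds of duplication can be checked with a short test. The sketch below relies on two standard-library facts: Python keeps imported modules in `sys.modules`, and `functools.lru_cache` returns a cached result for repeated calls. The `load_model` function, the `model.bin` path and the fake weights are hypothetical stand-ins for a real, expensive model load.

```python
import functools
import importlib
import sys

# Importing the same module twice returns the cached module object.
json_first = importlib.import_module("json")
json_second = importlib.import_module("json")

load_calls = 0  # counts how many times the expensive load really runs


@functools.lru_cache(maxsize=None)
def load_model(path):
    """Stand-in for an expensive model load; `path` is a hypothetical file name."""
    global load_calls
    load_calls += 1
    return {"path": path, "weights": [0.1, 0.2, 0.3]}


first = load_model("model.bin")
second = load_model("model.bin")  # served from the cache, no second load

assert json_first is json_second is sys.modules["json"]
assert first is second
assert load_calls == 1
```

Caching by path is one possible design; a module-level singleton would serve the same purpose when only one model is ever used.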
Other cases that we could look at involve necessary and sufficient conditions. A function takes an input and returns an output. We are interested in knowing a sufficient condition for saying that the function works properly. As an example of a necessary condition, each step in the function should be error free.
If we create a function and it raises an error on a given input, for example an indentation error, the function is not well defined. So, one of the necessary conditions is that every step is error free. But if the function runs successfully and gives an output, does that mean we have the correct answer?
Let’s say we have two functions in a package, addition and multiplication, but the developer has swapped their code by mistake, so that addition multiplies and multiplication adds.
If we use the function directly, we will get a result, but not the expected one. So, we could create test cases with known inputs and known outputs, using several examples rather than just one, and set the condition that if all the test cases pass, the function is treated as correct.
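The swapped-function scenario can be sketched with a table of known inputs and outputs. The `add` and `multiply` functions here are hypothetical examples, not code from the post. Several cases are used because a single pair such as (2, 2) gives 4 for both operations and would not catch the swap.

```python
import unittest


def add(a, b):
    return a + b


def multiply(a, b):
    return a * b


class TestArithmetic(unittest.TestCase):
    # (a, b, expected sum, expected product); chosen so that sum != product
    cases = [(2, 3, 5, 6), (0, 7, 7, 0), (-1, 4, 3, -4)]

    def test_add(self):
        for a, b, expected_sum, _ in self.cases:
            self.assertEqual(add(a, b), expected_sum)

    def test_multiply(self):
        for a, b, _, expected_product in self.cases:
            self.assertEqual(multiply(a, b), expected_product)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestArithmetic)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

If the two function bodies were swapped, both tests would fail on the first case.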

Simple Testing Module in Python

First, a simple testing module implemented in Python is described; it is then used for TDD in Machine Learning and Deep Learning.
To start writing tests, one first has to write a failing test. A simple failing test is described below -
In the above example, a NumGues object is instantiated. Before running the testing script, save it under a name that ends with _tests.py. Then move to that directory and run the following command -


Wednesday, 30 May 2018

5/30/2018 04:50:00 pm

Continuous Delivery Pipeline for Deploying Microservices Based Scala Application on Kubernetes



Overview

Running containers at any real-world scale requires a container orchestration and scheduling platform such as Docker Swarm, Apache Mesos, or AWS ECS, but the most popular of these is Kubernetes. Kubernetes is an open source system for automating deployment and management of containerised applications.
In this post, we’ll share how you can develop and deploy a microservices-based Scala application on a container environment (Docker and Kubernetes) and adopt DevOps in existing Scala applications.

Prerequisites For Deploying Scala Application on Kubernetes

To follow this guide you need -
  • Kubernetes

Kubernetes is an open source platform that automates container operations. Minikube is a convenient way to test Kubernetes locally.
  • Kubectl

Kubectl is a command line interface for managing a Kubernetes cluster either remotely or locally. To configure kubectl on your machine follow this link.
  • Shared Persistent Storage

Shared Persistent Storage is permanent storage that we can attach to a Kubernetes container so that we don’t lose our data even when the container dies. We will be using GlusterFS as the persistent data store for Kubernetes container applications.
  • Scala Application Source Code

Scala Application Source Code is the source code that we want to run inside a Kubernetes container.
  • Dockerfile

The Dockerfile contains the commands used to build the Scala application image.
  • Container Registry

The Registry is an online store for container images.
The options mentioned below are a few of the most popular registries.


Tuesday, 22 May 2018

5/22/2018 05:25:00 pm

Persistent Storage Solution For Containers - Docker & Kubernetes



Overview of Persistent Storage

Persistent storage is critical for running stateful containers. Kubernetes is an open source system for automating deployment and management of containerized applications.
There are a lot of options available for storage. In this blog, we are going to discuss the most widely used storage options, which can be used on-premises or in the cloud: GlusterFS, CephFS, Ceph RBD, OpenEBS, NFS, GCE Persistent Storage, AWS EBS & Azure Disk.

Prerequisites for Implementing Storage Solutions

To follow this guide you need -
  • Kubernetes
  • Kubectl
  • Dockerfile
  • Container Registry
  • Storage technologies that will be used -
    • OpenEBS
    • CephFS
    • GlusterFS
    • AWS EBS
    • Azure Disk
    • GCE persistent storage
    • CephRBD

Kubernetes

Kubernetes is one of the best open-source orchestration platforms for deployment, autoscaling and management of containerized applications.

Kubectl

Kubectl is a command line utility used to manage Kubernetes clusters either remotely or locally. To configure kubectl use this link.

Container Registry

Container Registry is the repository where we store and distribute Docker images. Several registries are available online, including DockerHub, Google Cloud, Microsoft Azure, and AWS Elastic Container Registry (ECR).
Container storage is ephemeral, meaning all the data in the container is removed when it crashes or is restarted. Persistent storage is necessary for stateful containers running applications like MySQL, Apache, PostgreSQL etc., so that we don’t lose our data when a container stops.

3 Types of Storage 

  • Block Storage - It is the most commonly used storage and is very flexible. Block storage stores chunks of data in blocks. A block is identified only by its address. It is mostly used for databases because of its performance.
  • File Storage - It stores data as files; each file is referenced by a filename and has attributes associated with it. NFS is one of the most commonly used file systems. We can use file storage where we want to share data with multiple containers.
  • Object Storage - Object storage is different from file storage and block storage. In object storage, data is stored as an object and is referenced by an object ID. It is massively scalable and provides more flexibility than block storage, but its performance is slower. The most commonly used object stores are Amazon S3, Swift and Ceph Object Storage.

7 Emerging Storage Technologies

OpenEBS Container Storage

OpenEBS is a pure container based storage platform available for Kubernetes. Using OpenEBS, we can easily use persistent storage for stateful containers and the process of provisioning of a disk is automated.
It is a scalable storage solution which can run anywhere, from cloud to on-premises hardware.

Ceph Storage Cluster

Ceph is an advanced and scalable software-defined storage system that fits today’s requirements well, providing Object Storage, Block Storage and File System on a single platform.
Ceph can also be used with Kubernetes. We can use either CephFS or CephRBD as persistent storage for Kubernetes pods.
  • Ceph RBD is the block storage which we assign to a pod. CephRBD can’t be shared by two pods at a time in read-write mode.
  • CephFS is a POSIX-compliant file system service which stores data on top of a Ceph cluster. We can share CephFS with multiple pods at the same time. CephFS has been announced as stable in recent Ceph releases.

GlusterFS Storage Cluster

GlusterFS is a scalable network file system suitable for cloud storage. Like Ceph, it is software-defined storage that runs on commodity hardware, but it provides only a file system, which makes it similar to CephFS.
GlusterFS provides more speed than Ceph as it uses a larger block size, i.e. GlusterFS uses a block size of 128 KB whereas Ceph uses a block size of 64 KB.

AWS EBS Block Storage

Amazon EBS provides persistent block storage volumes which are attached to EC2 instances. AWS provides various options for EBS, so we can choose the storage according to requirements such as the number of IOPS, the storage type (SSD/HDD) etc.
We mount AWS EBS volumes in Kubernetes pods for persistent block storage using AWSElasticBlockStore. EBS volumes are automatically replicated within their Availability Zone for durability and high availability.

GCEPersistentDisk Storage

GCEPersistentDisk is durable, high-performance block storage used with Google Cloud Platform. We can use it either with Google Compute Engine or with Google Container Engine (now Google Kubernetes Engine).
We can choose between HDD and SSD and can increase the size of the volume as the need grows. GCEPersistentDisks are automatically replicated across multiple data centres for durability and high availability.
We mount GCEPersistentDisk in Kubernetes pods for persistent block storage using GCEPersistentDisk.

Azure Disk Storage

An Azure Disk is also durable, high-performance block storage like AWS EBS and GCEPersistentDisk. It offers a choice of SSD or HDD for your environment and features like point-in-time backup, easy migration etc.
An AzureDiskVolume is used to mount an Azure Data Disk into a Pod. Azure Disks are replicated within multiple data centres for high availability and durability.

Network File System Storage

NFS is one of the oldest file systems in use, providing the facility to share a single file system on the network with multiple machines.
There are several NAS devices available for high performance, or we can set up our own system to be used as a NAS. We use NFS as persistent storage for pods, and data can be shared with multiple instances.

Deployment of Storage Solutions

Now we are going to walk through the deployment of the storage solutions described above. We are going to start with Ceph.

Tuesday, 8 May 2018

5/08/2018 06:51:00 pm

Test Driven & Behavior Driven Development in Python


Overview

Test Driven Development (TDD) is a great approach for software development. TDD simply means writing tests before adding a feature to the code.
This approach is based on the principle that we should write code in small pieces rather than in long stretches. In TDD, whenever we want to add more functionality to our code, we first have to write a test for it. After that, we add the new functionality in a few small lines of code and then check it against our test. This approach helps us to reduce the risk of encountering significant problems at the production level.

Test Driven Development (TDD)

Test Driven Development is an approach in which we write a test first, watch it fail, and then write or refactor our code until the test passes.

Test Driven Development (TDD) Approach

As the name suggests, we should add the test before adding the functionality to our code. Our target is then to make the test pass by adding new code to our program, and afterwards to refactor that code while the test keeps passing. This uses the following process -
  • Write a failing unit test
  • Make the unit test pass
  • Repeat

Test Driven Development (TDD) Process Cycle

As shown in the flow -
  • First, add tests for the functionality.
  • Next, run the tests and watch them fail.
  • Next, write code according to the error we received.
  • Then run the tests again to see whether they fail or pass.
  • Then refactor the code and follow the process again.
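The cycle above can be sketched by running the same test against a stub and then against a real implementation. The `is_even` function and the `make_suite` helper are hypothetical names chosen for this illustration; in day-to-day work there is one function that evolves over time rather than two side by side.

```python
import unittest


def is_even_stub(n):
    # Steps 1-2: the test exists but the code does not, so the test must fail.
    raise NotImplementedError


def is_even(n):
    # Step 3: the smallest code that satisfies the test.
    return n % 2 == 0


def make_suite(func):
    """Build a test suite that checks the given implementation."""

    class TestIsEven(unittest.TestCase):
        def test_even_and_odd(self):
            self.assertTrue(func(4))
            self.assertFalse(func(7))

    return unittest.defaultTestLoader.loadTestsFromTestCase(TestIsEven)


runner = unittest.TextTestRunner(verbosity=0)
red = runner.run(make_suite(is_even_stub))  # fails: nothing implemented yet
green = runner.run(make_suite(is_even))     # passes: ready for refactoring
```

The first run reports an error and the second run succeeds, which mirrors the fail-then-pass order of the list.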

Benefits of Test Driven Development (TDD)

Now the question arises: why should one opt for the TDD approach? Practicing TDD brings lots of benefits. Some of the benefits are listed below -
  • In TDD we write the test before adding any new feature, which means that our entire code is covered by tests. That’s a great benefit of TDD compared to code which has no test coverage.
  • In TDD one should have a specific target before adding new functionality. This means that before adding any new functionality one should be clear about its outcome.
  • In an application, one method depends on another. When we write tests before the method, we need clear thoughts about the interfaces between the methods. That allows us to integrate our method with the entire application efficiently and helps in making our application modular too.
  • As the entire code is covered by tests, our final application will be less buggy. This is a big advantage of the TDD approach.

Acceptance Test Driven Development (ATDD)

ATDD is short for Acceptance Test Driven Development. In this process, a user, a business manager and a developer are all involved.
First, they discuss what the user wants in the product; then the business manager creates sprint stories for the developer. After that, the developer writes tests before starting the project and then starts coding the product.
Every product is divided into small modules, so the developer codes the very first module, tests it, and sees the test fail. If the test passes and the code works as per the user requirements, the developer moves on to the next user story; otherwise, changes are made to the code to make the test pass.
This process is called Acceptance Test Driven Development.

Behavior Driven Development (BDD)

Behavior Driven Development is similar to Test Driven Development in that tests are written first and run, and then more code is added to make them pass.
The major area where the two differ is that tests in BDD are written in a plain, descriptive, English-like grammar.
Tests in BDD aim at explaining the behaviour of the application and are also more user-focused. These tests use examples to clarify the user requirements in a better way.
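A behaviour-style test can be approximated with the standard `unittest` module by naming and structuring the test as Given / When / Then. This is only a sketch of the style: the `Cart` class and its methods are hypothetical, and dedicated BDD tools keep the plain-English scenario in a separate feature file rather than in a docstring.

```python
import unittest


class Cart:
    """Hypothetical shopping cart used only to illustrate the scenario."""

    def __init__(self):
        self.items = []

    def add(self, name, price):
        self.items.append((name, price))

    def total(self):
        return sum(price for _, price in self.items)


class TestCartBehaviour(unittest.TestCase):
    def test_total_reflects_added_items(self):
        """
        Scenario: a customer adds two items to an empty cart
          Given an empty cart
          When the customer adds a book for 12 and a pen for 3
          Then the cart total is 15
        """
        # Given
        cart = Cart()
        # When
        cart.add("book", 12)
        cart.add("pen", 3)
        # Then
        self.assertEqual(cart.total(), 15)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestCartBehaviour)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

The scenario text describes what the user observes and says nothing about how the cart stores its items.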

Features of Behavior Driven Development (BDD)

  • The major change is in the thought process, which shifts from thinking in terms of tests to thinking in terms of behaviour.
  • A ubiquitous language is used, so the tests are easy to explain.
  • The BDD approach is driven by business value.
  • It can be seen as an extension of TDD; it uses natural language which is easy for non-technical stakeholders to understand.

Behavior Driven Development (BDD) Approach

We believe that testing and test automation through TDD (Test Driven Development) are essential to the success of any BDD initiative. Testers have to write tests that validate the behaviour of the product or system being built.
The test results are readable by non-technical users as well. For Behavior Driven Development to be successful, it is crucial to identify and verify only those behaviours that contribute directly to business outcomes.
A developer in a BDD environment has to identify what to test and what not to test, and to understand why a test failed. Much like Test Driven Development, BDD recommends that tests be written first and describe the functionality of the product in terms of the requirements.
Behavior Driven Development helps greatly when building detailed automated unit tests because it focuses on testing behaviour instead of testing implementation. The developer thus has to write test cases with the scenario, rather than the code implementation, in mind.
By doing this, even when the implementation changes, the developer does not have to change the test, input and output to support it. That makes unit testing automation much faster and more reliable.
Though BDD has its own set of advantages, it can sometimes fall prey to oversimplification. Development teams and testers, therefore, need to accept that while a failing test guarantees that the product is not ready to be delivered to the client, a passing test does not by itself mean that the product is ready for launch.
This gap closes when development, testing and business teams give updates and progress reports on time. Since the testing efforts move towards automation and cover all business features and use cases, this framework ensures a high defect detection rate due to higher test coverage, faster changes, and timely releases.

Benefits of Behavior Driven Development (BDD)

Teams and developers are strongly encouraged to adopt BDD for several reasons, some of which are listed below -
  • BDD provides clear guidance on how to organize communication between all the stakeholders of a project, whether technical or non-technical.
  • BDD enables early testing in the development process; early testing means fewer bugs later.
  • By using a language that is understood by all, rather than a programming language, better visibility of the project is achieved.
  • The developers feel more confident that their code won’t break, which ensures better predictability.

Test Driven Development (TDD) with Python

Here we explore the Test Driven Development approach with Python. The official Python interpreter ships with the unittest module.

Python Unit Testing

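These are the main methods which we use with Python unit testing -
Method                        Checks That
assertEqual(a, b)             a == b
assertNotEqual(a, b)          a != b
assertTrue(x)                 bool(x) is True
assertFalse(x)                bool(x) is False
assertIs(a, b)                a is b
assertIsNot(a, b)             a is not b
assertIsNone(x)               x is None
assertIsNotNone(x)            x is not None
assertIn(a, b)                a in b
assertNotIn(a, b)             a not in b
assertIsInstance(a, b)        isinstance(a, b)
assertNotIsInstance(a, b)     not isinstance(a, b)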
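Each check listed above corresponds to an assert method on `unittest.TestCase`. The short sketch below exercises all twelve of them on a small list; the variable names are arbitrary.

```python
import unittest


class TestAssertMethods(unittest.TestCase):
    def test_examples(self):
        values = [1, 2, 3]
        alias = values  # second name for the same list object

        self.assertEqual(sum(values), 6)          # a == b
        self.assertNotEqual(values, [3, 2, 1])    # a != b
        self.assertTrue(values)                   # bool(x) is True
        self.assertFalse([])                      # bool(x) is False
        self.assertIs(alias, values)              # a is b
        self.assertIsNot(list(values), values)    # a is not b (a copy)
        self.assertIsNone({}.get("missing"))      # x is None
        self.assertIsNotNone(values)              # x is not None
        self.assertIn(2, values)                  # a in b
        self.assertNotIn(9, values)               # a not in b
        self.assertIsInstance(values, list)       # isinstance(a, b)
        self.assertNotIsInstance(values, tuple)   # not isinstance(a, b)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestAssertMethods)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```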

Test Driven Development (TDD) in Python with Examples

We are going to work on an example problem from banking. Let's say that our banking system has introduced a new facility for crediting money, so we have to add this to our program.
Following the TDD approach, before adding this credit feature we first write a test for the functionality.
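A test for such a credit feature might look like the sketch below. The post's own code is not shown in this excerpt, so everything here is an assumption made for illustration: the `Account` class, its `balance` attribute, the `credit` method and the rule that non-positive amounts are rejected.

```python
import unittest


class Account:
    """Hypothetical bank account used only for this illustration."""

    def __init__(self, balance=0):
        self.balance = balance

    def credit(self, amount):
        """Add money to the account; non-positive amounts are rejected."""
        if amount <= 0:
            raise ValueError("credit amount must be positive")
        self.balance += amount
        return self.balance


class TestCredit(unittest.TestCase):
    def test_credit_increases_balance(self):
        account = Account(balance=100)
        account.credit(50)
        self.assertEqual(account.balance, 150)

    def test_credit_rejects_non_positive_amount(self):
        account = Account(balance=100)
        with self.assertRaises(ValueError):
            account.credit(0)
        self.assertEqual(account.balance, 100)  # balance is left unchanged


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestCredit)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

In a TDD workflow the two test methods would be written first, fail because `credit` does not exist yet, and only then would the method be added.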

Setting Up Environment For Test Driven Development (TDD)

This is our directory structure
Environmental Setup for Test Driven Development in Python


Saturday, 28 April 2018

4/28/2018 03:29:00 pm

Time Series Analysis & Forecasting Using Machine Learning & Deep Learning



Time Series Analysis For Business Forecasting

Time is the only moving thing in the world which never stops. When it comes to forecasting, the human mind tends to be more curious as we know that things change with time.
Hence we are interested in making predictions ahead of time. Wherever time is an influencing factor, there is a potentially valuable thing that can be predicted.
So here we are going to discuss Time Series Forecasting and the different types of forecasting that we can make a machine do in real life. The name ‘time-series’ itself suggests that the data varies with time.
The primary motive in time series problems is forecasting. Time Series Analysis For Business Forecasting helps to predict the future values of a critical field which has potential business value in the industry. The same techniques can predict the health condition of a person, or the results of a sport and the performance parameters of a player based on previous performances and previous data.

Time Series Forecasting Methods

Univariate Time Series Forecasting

A univariate time series forecasting problem has only two variables. One is date-time, and the other is the field which we are forecasting.
For example, if we want to predict a particular weather field like tomorrow’s average temperature, we’ll consider the temperatures of all the previous dates and use them in the model to predict tomorrow’s value.

Multivariate Time Series Forecasting

In the multivariate case, the target is the same. Taking the example above, our goal is still to predict the average temperature for tomorrow. The difference is that the model also uses the other factors which affect the temperature: Is there a chance of rainfall? If yes, how long will the rain last? What is the wind speed at various times? What are the humidity, atmospheric pressure, precipitation, solar radiation and so on?
All these factors are intuitively relevant to temperature. The primary point when comparing univariate and multivariate forecasting is that multivariate is better suited to practical scenarios.

Time Series Forecasting Models

ARIMA Model

ARIMA means Autoregressive Integrated Moving Average. The AR part of ARIMA indicates that the evolving variable of interest is regressed on its own lagged (i.e., prior) values.
The MA part suggests that the regression error is a linear combination of error terms whose values occurred contemporaneously and at various times in the past.
The I (for "integrated") indicates that the data values have been replaced with the difference between their values and the previous values (and this differencing process may have been performed more than once).
The purpose of each of these features is to make the model fit the data as well as possible. Its advantage is accuracy over a broad domain of time series, despite the model being more complicated.
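Putting the three parts together, an ARIMA(p, d, q) model is commonly written as shown below. The symbols are standard textbook notation rather than anything defined in this post: y'_t is the series after differencing d times, c is a constant, \phi_i are the AR coefficients, \theta_j are the MA coefficients and \varepsilon_t is the error term at time t.

```latex
y'_t = c + \sum_{i=1}^{p} \phi_i \, y'_{t-i} + \sum_{j=1}^{q} \theta_j \, \varepsilon_{t-j} + \varepsilon_t
```

The first sum is the AR part (regression on lagged values), the second sum is the MA part (a linear combination of past errors), and the prime on y marks the I part.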

ARCH/GARCH Model

The most important volatility models for time series are the Autoregressive Conditional Heteroscedasticity (ARCH) model and its generalized version, the GARCH model.
These models are well suited to capturing the dynamics of volatility in a time series.
Univariate GARCH models are well established among volatility models, but multivariate GARCH is still very challenging to implement for time series.

Vector Autoregression (VAR) Model

VAR is an abbreviation for Vector Autoregression.
The VAR model captures the interdependencies among the various time series in the data. This model is the generalization of the univariate autoregression model.

LSTM Model

LSTM stands for long short-term memory, and it is a Deep Learning Model.
LSTM is a type of Recurrent Neural Network(RNN), and RNNs are designed to capture the sequence dependence.
LSTM is capable of handling large architectures during training.

ELMAN and JORDAN Neural Network

Elman and Jordan neural networks are two types of Recurrent Neural Network (RNN) architecture. These networks combine the past values of the context units with the current input to obtain the final output.
The Jordan context unit acts as a low pass filter, which creates an output that is the average value of some of its most recent past outputs.
In these networks, the weighting over time is inflexible since we can only control the time decay. Also, a small change in time is reflected as a significant change in the weighting.
An Elman network is a three-layer network with added context units. The hidden layer is connected to these context units with a fixed weight of one. At each step, the input is fed forward and a learning rule is applied.
A Jordan network is the same as an Elman network, but the context units are fed from the output layer instead of the hidden layer.
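The difference between the two architectures comes down to what is copied into the context units after each step. The sketch below is a simplified forward pass only, with no training and no time decay on the context; the function names, the tanh activation and all weight values are assumptions chosen for illustration.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def vector_sum(left, right):
    return [a + b for a, b in zip(left, right)]


def hidden_state(x, context, w_in, w_ctx):
    """Combine the current input with the context units."""
    pre_activation = vector_sum(matvec(w_in, x), matvec(w_ctx, context))
    return [math.tanh(v) for v in pre_activation]


def elman_step(x, context, w_in, w_ctx, w_out):
    """Elman: the next context is a copy of the hidden layer."""
    hidden = hidden_state(x, context, w_in, w_ctx)
    output = matvec(w_out, hidden)
    return output, hidden


def jordan_step(x, context, w_in, w_ctx, w_out):
    """Jordan: the next context is a copy of the output layer."""
    hidden = hidden_state(x, context, w_in, w_ctx)
    output = matvec(w_out, hidden)
    return output, output


# Hypothetical weights: 2 inputs, 2 hidden units, 1 output.
w_in = [[0.5, -0.5], [0.25, 0.75]]
w_out = [[1.0, -1.0]]
elman_w_ctx = [[0.1, 0.0], [0.0, 0.1]]  # hidden x hidden
jordan_w_ctx = [[0.1], [0.2]]           # hidden x output

x = [1.0, 2.0]
elman_out, elman_ctx = elman_step(x, [0.0, 0.0], w_in, elman_w_ctx, w_out)
jordan_out, jordan_ctx = jordan_step(x, [0.0], w_in, jordan_w_ctx, w_out)
```

With an all-zero starting context both networks give the same first output; they diverge from the second step onward because the Elman context has one value per hidden unit while the Jordan context has one value per output unit.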

Approaches To Time Series Analysis

Let us assume a dataset with a mixture of continuous and categorical columns, in which we have to forecast a continuous column named ‘value’.
Let the number of columns in the dataset be 100, named ‘col1’, ‘col2’, ‘col3’ ... ‘col100’. Along with this, let there be a categorical column ‘cat’ with ten different categories.
Let us assume that each unique item in this ‘cat’ column represents a unique sensor. So, there are ten sensors, and we are getting data from all ten in real time, each producing data for the other 100 columns.
This is now a multivariate time series problem, and we have to forecast values for the ‘value’ column. The following sections describe various approaches to a dataset of this kind.
Questions to be asked about Data
  • How fast are we getting the data (once a second, minute, hour)?
  • Are all the sensors giving data at the same time, or are different sensors giving data at different times?
  • Are the values of the sensors related or independent?
  • To predict a future value, should we consider all the past data, or is the latest subset enough? If we consider just a subset, how much data is good enough for future predictions?

Approach 1

Univariate Model

In this approach, we just consider the time and the field which we are forecasting.
Before modelling, we have to do some data cleaning and transformation. Data cleaning includes missing value imputation, outlier detection, and replacement.
Data transformation is required to convert a non-stationary time series into a stationary one.
A stationary time series is one whose statistical measures, such as mean, variance and autocorrelation, are constant over time.
The data shouldn’t be heteroscedastic; that is, its variance should not change between different intervals of time.
In general, the data won’t be this way, so data transformation techniques like differencing, log transformation, etc. can be used to make the time series stationary.

How to determine if Time Series is Stationary?

The two things that we can visualize are trend and seasonality. A mean that stays the same as time varies denotes a constant trend. Variations that recur at specific time intervals denote seasonality.
To capture the trend, we can create a rolling mean for the data and see how it varies with time. The rolling mean at a given point is the average of the previous few values.
To get the rolling mean value at n with a window of 10, we take the average of the time series values from n-1 to n-10. If the rolling mean stays constant, the trend is consistent with stationarity.
To reduce trend, we can use log transformation. Along with the rolling mean, we could also look at the rolling standard deviation. To test the stationarity of a time series, we can use the Dickey-Fuller test.
Another method that can be used to eliminate both trend and seasonality is differencing.
Differencing is simply taking the difference between the present value and the previous value. This is also called first-order differencing.
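The rolling mean and first-order differencing described above can be sketched in a few lines of plain Python. The function names and the example series are made up for illustration; the rolling mean uses the previous `window` values, excluding the current one, to match the n-1 to n-10 description.

```python
def rolling_mean(series, window):
    """Mean of the `window` values before each position; None until enough history exists."""
    means = []
    for n in range(len(series)):
        if n < window:
            means.append(None)
        else:
            means.append(sum(series[n - window:n]) / window)
    return means


def first_difference(series):
    """Difference between each value and the one before it."""
    return [current - previous for previous, current in zip(series, series[1:])]


# A series with a steady upward trend (illustrative values).
trending = [10, 12, 14, 16, 18, 20]

means = rolling_mean(trending, window=3)
differences = first_difference(trending)
```

For this series the rolling mean keeps rising, which signals a trend, while the differenced series is constant, which is what a stationary series would look like.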

Time Series Forecasting

We could use ARIMA (Autoregressive Integrated Moving Average) for forecasting.
ARIMA has three parameters: ‘p’, ‘d’ and ‘q’. ‘p’ is for AR, ‘d’ is the number of times the series is differenced (the I part), and ‘q’ is for MA. ‘p’ is the number of lags of the dependent variable. If ‘p’ is 10, then to predict the value at time t we use the values at t-1 to t-10 as predictors.
‘q’ is similar to ‘p’. The only difference is that instead of past values, we use past forecast errors as predictors. We can model AR and MA separately and combine them for forecasting. After the modelling is finished, the values are back-transformed, or inversely transformed, to bring the predictions to the original scale.
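The back-transformation step can be illustrated for one common choice of transformation, a log transform followed by first-order differencing. This is a sketch under that assumption; the function names and the sample values are hypothetical, and a real model would apply the inverse to forecasted differences rather than to the original ones.

```python
import math


def log_difference(series):
    """Apply a log transform, then first-order differencing."""
    logs = [math.log(value) for value in series]
    return [current - previous for previous, current in zip(logs, logs[1:])]


def invert_log_difference(first_value, differences):
    """Undo the transformation: cumulative sum of differences, then exponentiate."""
    level = math.log(first_value)
    restored = [first_value]
    for difference in differences:
        level += difference
        restored.append(math.exp(level))
    return restored


original = [100.0, 110.0, 121.0, 133.1]
transformed = log_difference(original)
restored = invert_log_difference(original[0], transformed)
```

The restored values match the original series up to floating-point rounding, which is a useful sanity test before trusting back-transformed predictions.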
In our case, the data has 100 columns with a mixture of continuous and categorical values, and it is collected from multiple sensors at different times.
So, using univariate analysis to forecast the ‘value’ column may not be sufficient for reasonable predictions.
However, if there are no strong relations between the dependent variable and the independent variables, and each sensor’s data is independent of the others, we could still consider using univariate analysis for each sensor separately.

Approach 2

Multivariate Model using VAR

In a Multivariate model, we consider multiple columns to have the interdependencies intact and predict the values of the required field.
VAR (Vector Autoregression) is a stochastic process model that helps in considering all the interdependencies among the various time series fields in the data.
VAR is the generalization of the univariate AR model. Each variable in the model is explained by its own lagged values and also by the lagged values of the other model variables.
VAR models are commonly tuned with the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC). These criteria are used to compare candidate models, for example with different numbers of lags, and to select the best one for the data.
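The way each variable depends on the lagged values of every variable can be seen in a one-step forecast from a VAR model with a single lag. This is a minimal sketch: the function name, the two variables (temperature and humidity) and every coefficient are hypothetical, and a real model would estimate the coefficients from data.

```python
def var1_forecast(previous, intercept, coefficients):
    """One-step VAR(1) forecast: intercept + coefficient matrix applied to the previous values."""
    return [
        constant + sum(weight * value for weight, value in zip(row, previous))
        for constant, row in zip(intercept, coefficients)
    ]


# Previous observation: [temperature, humidity] (illustrative values).
previous = [20.0, 0.5]
intercept = [1.0, 0.1]
coefficients = [
    [0.9, 2.0],  # temperature depends on lagged temperature and lagged humidity
    [0.0, 0.8],  # humidity depends only on its own lag in this toy model
]

forecast = var1_forecast(previous, intercept, coefficients)
```

The off-diagonal entry 2.0 is what makes this a vector model: setting it to zero would reduce the first equation to a univariate AR(1).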

Approach 3

Multivariate Time Series Forecasting Using Deep Learning Keras with Tensorflow Backend

We could use Deep Learning techniques for time series forecasting. In recent years, sequential models with LSTM (Long short term memory) have been used for time series forecasting.
LSTMs are a type of recurrent neural network (RNN). Recurrent neural networks take into account the order of the sequence in which the data is fed in.
Besides this, LSTM has the capability of handling huge data. Deep learning models also support learning in batches, and saving and reusing them with new data to continue the training process.
To implement an LSTM for multivariate data, we should convert the time series problem into a supervised learning problem.
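One common way to make this conversion is a sliding window: the previous few rows become the input sample and the target column of the next row becomes the label. The sketch below uses plain lists; `to_supervised`, its parameters and the sample rows are hypothetical names and values chosen for illustration.

```python
def to_supervised(rows, target_index, n_lags):
    """Turn a multivariate series into (window of past rows, next target value) pairs."""
    samples, targets = [], []
    for t in range(n_lags, len(rows)):
        window = [list(row) for row in rows[t - n_lags:t]]  # n_lags rows x features
        samples.append(window)
        targets.append(rows[t][target_index])
    return samples, targets


# Each row is one time step with two features; column 0 is the field to forecast.
rows = [[1, 10], [2, 20], [3, 30], [4, 40]]
samples, targets = to_supervised(rows, target_index=0, n_lags=2)
```

Each sample keeps its (time steps, features) layout, so the list of samples has the three-dimensional shape (samples, time steps, features) that recurrent layers typically expect once it is converted to an array.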
In this approach, to predict a future value of a given field, we require the values of the other variables too. So, in a way, we have to predict all the other values before predicting the required field.
We could employ many methods for this. One of them is to predict a future value of an independent variable as the average of its previous few values.
To predict 10 future values, we could repeat the process ten times. Another technique is to fit a univariate time series model for each variable; once we have future data for all the independent variables, we could use the LSTM for training.
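The averaging method with a loop can be sketched as follows. Each forecast is appended to the history so that the next step can use it; the function name, the window size and the sample history are assumptions made for this example.

```python
def iterative_mean_forecast(history, window, steps):
    """Forecast `steps` values, each as the mean of the latest `window` values."""
    values = list(history)
    forecasts = []
    for _ in range(steps):
        next_value = sum(values[-window:]) / window
        forecasts.append(next_value)
        values.append(next_value)  # feed the forecast back in for the next step
    return forecasts


history = [2.0, 4.0, 6.0]
forecasts = iterative_mean_forecast(history, window=3, steps=10)
```

Because every forecast is fed back in, errors accumulate over the ten steps, which is one reason to consider the per-variable univariate models mentioned above.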
Besides this, we can train on a huge dataset in batches. We can experiment with the number of hidden layers and other parameters during model fitting to get the best model.
