Machine Learning and Deep Learning Algorithmic Trading: Building Moving Average Models

In today’s financial markets, algorithmic trading has become an essential tool for many investors and traders. In particular, automated trading systems that use machine learning (ML) and deep learning (DL) are gaining attention for their efficiency and accuracy. This course explains in detail how to build a trading system with machine learning and deep learning algorithms, focusing on the Moving Average (MA) model.

1. Overview of Moving Averages (MA)

Moving averages are used to analyze price trends of assets such as stocks, commodities, and foreign exchange. They smooth out short-term price fluctuations so that longer-term trends are easier to identify. There are several types of moving averages; the two most commonly used are the Simple Moving Average (SMA) and the Exponential Moving Average (EMA).

1.1 Simple Moving Average (SMA)

SMA is the value calculated by simply averaging the prices over a specific period. For example, the 5-day SMA is the sum of the closing prices of the last 5 days divided by 5. While SMA is intuitive and easy to understand, it weights every price in the window equally, so it reacts slowly to recent price changes.
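
As a quick illustration of the 5-day calculation, the sketch below computes the SMA by hand and compares it with a pandas rolling window. The closing prices are made up for the example.

```python
import pandas as pd

# Hypothetical closing prices, chosen only for illustration
closes = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0, 15.0])

# Manual 5-day SMA for the most recent day: average of the last five closes
manual_sma = closes.iloc[-5:].sum() / 5

# The same quantity via a rolling window; the first four values are NaN
# because a full 5-day window is not yet available
sma_5 = closes.rolling(window=5).mean()

print(manual_sma)      # 13.0
print(sma_5.iloc[-1])  # 13.0
```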

1.2 Exponential Moving Average (EMA)

EMA is calculated by giving more weight to recent prices, making it more sensitive to recent price changes. This makes it a more effective indicator in rapidly changing markets. EMA is calculated using the following formula:

EMA = (Current Price * k) + (Previous EMA * (1 - k))
k = 2 / (N + 1)  // N is the period for calculating the moving average
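
The recursion can be applied step by step as in the following sketch. Seeding the first EMA value with the first price is an assumption made here (it matches pandas with adjust=False); the function name ema and the prices are hypothetical.

```python
import pandas as pd

def ema(prices, n):
    """Exponential moving average using the recursive formula above.

    Assumption: the first EMA value is seeded with the first price.
    Other conventions, such as seeding with an SMA, are also common.
    """
    k = 2 / (n + 1)
    values = [prices[0]]
    for price in prices[1:]:
        values.append(price * k + values[-1] * (1 - k))
    return values

closes = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]  # hypothetical prices
manual = ema(closes, n=5)

# pandas applies the same recursion when adjust=False
pandas_ema = pd.Series(closes).ewm(span=5, adjust=False).mean().tolist()

print(manual[-1], pandas_ema[-1])
```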

2. Building Moving Average Models with Machine Learning

Moving average models combined with machine learning can be used to predict future stock prices from historical data. The following steps prepare the dataset for this project and select a machine learning algorithm to build the model.

2.1 Data Preparation

We will use a CSV file containing stock data to build the model. Typically, stock data consists of columns such as Open, High, Low, Close, and Volume. We will load this data using the pandas library:

import pandas as pd

# Load data
data = pd.read_csv('stock_data.csv')
print(data.head())
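
Rolling averages and shifted targets depend on the rows being in chronological order, so it can help to parse and sort by date when loading. The sketch below uses a tiny in-memory CSV in place of stock_data.csv; the 'Date' column name and all values are assumptions for illustration.

```python
import io
import pandas as pd

# In-memory stand-in for stock_data.csv (hypothetical values, out of order)
csv_text = """Date,Open,High,Low,Close,Volume
2024-01-03,101,103,100,102,1200
2024-01-02,100,102,99,101,1000
2024-01-04,102,104,101,103,1100
"""

sample = pd.read_csv(io.StringIO(csv_text), parse_dates=['Date'])

# rolling() and shift() operate on row order, so sort oldest-first
sample = sample.sort_values('Date').set_index('Date')

print(sample['Close'].tolist())
```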

2.2 Data Preprocessing

Data preprocessing is a crucial step to ensure that the machine learning model can learn effectively. It includes handling missing values, removing outliers, selecting features, and scaling. Here, we fill missing values and add new columns that hold the moving averages:

# Handle missing values
data.ffill(inplace=True)  # forward-fill; fillna(method='ffill') is deprecated in recent pandas

# Add moving average columns
data['SMA_5'] = data['Close'].rolling(window=5).mean()
data['EMA_5'] = data['Close'].ewm(span=5, adjust=False).mean()
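
To see how the two columns behave, the sketch below applies the same two calls to a hypothetical series that is flat and then jumps. It shows the four warm-up rows where the SMA is undefined and how the EMA moves further toward the new price than the SMA does.

```python
import pandas as pd

# Hypothetical series: flat at 10, then a jump to 20 on the last day
close = pd.Series([10.0, 10.0, 10.0, 10.0, 10.0, 20.0])

sma_5 = close.rolling(window=5).mean()
ema_5 = close.ewm(span=5, adjust=False).mean()

# Leading rows without a full 5-day window are NaN in the SMA column
warmup_rows = int(sma_5.isna().sum())

print(warmup_rows)     # 4
print(sma_5.iloc[-1])  # 12.0
print(ema_5.iloc[-1])  # about 13.33
```

Rows with NaN in the SMA column generally need to be dropped before training, since many models do not accept missing values.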

2.3 Setting Features and Target Variables

To train the machine learning model, we need to set the input features and the target variable we want to predict. Here, the target is the next day’s ‘Close’ price:

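features = ['SMA_5', 'EMA_5', 'Volume']
dataset = data[features].copy()
dataset['Target'] = data['Close'].shift(-1)  # Predict the closing price for the next day
dataset = dataset.dropna()  # Drop the SMA warm-up rows and the last row (no next-day close)
X = dataset[features]
y = dataset['Target']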
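
The effect of shift(-1) on the target can be checked on a small example. In this sketch (prices are hypothetical) each row's Target is the following row's Close, and the last row has no target.

```python
import pandas as pd

close = pd.Series([100.0, 101.0, 103.0, 102.0])  # hypothetical closes

frame = pd.DataFrame({
    'Close': close,
    'Target': close.shift(-1),  # next row's close, aligned to the current row
})

print(frame)
```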

2.4 Choosing a Machine Learning Model

Among various machine learning algorithms, models such as Decision Tree, Random Forest, and XGBoost can be chosen. Here, we will use Random Forest as an example:

from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Split the data chronologically (no shuffling), so the test set comes after the training set in time
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, shuffle=False)

# Train the model
model = RandomForestRegressor(n_estimators=100)
model.fit(X_train, y_train)

# Predict
y_pred = model.predict(X_test)

# Evaluate performance
mse = mean_squared_error(y_test, y_pred)
print(f'Mean Squared Error: {mse}')
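
An MSE value is hard to judge on its own. One common sanity check, which is an addition here and not part of the walkthrough above, is a persistence baseline that predicts tomorrow's close to equal today's. The function name and prices below are hypothetical.

```python
import numpy as np

def persistence_mse(close):
    """MSE of predicting each next close as equal to the current close."""
    close = np.asarray(close, dtype=float)
    actual_next = close[1:]
    predicted_next = close[:-1]
    return float(np.mean((actual_next - predicted_next) ** 2))

closes = [100.0, 102.0, 101.0, 105.0]  # hypothetical closing prices
baseline = persistence_mse(closes)

print(baseline)  # 7.0
```

If the model's MSE on the test period is not clearly below this baseline computed on the same period, the model is adding little beyond repeating the last price.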

3. Building Moving Average Models with Deep Learning

We will create a moving average model that learns more complex patterns using deep learning. We will implement a simple artificial neural network (ANN), built here from LSTM layers, using the TensorFlow and Keras libraries.

3.1 Data Preparation and Preprocessing

The data is prepared in much the same way as for machine learning, but deep learning models typically require more data, so we may use data over a longer period. Additionally, the LSTM layers used below expect 3D input of shape (samples, time steps, features), so the scaled data is arranged into 60-day windows:

import numpy as np

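# Data scaling
from sklearn.preprocessing import MinMaxScaler

# Drop the warm-up rows where SMA_5 is still NaN so no window contains missing values
feature_cols = ['Close', 'SMA_5', 'EMA_5', 'Volume']
data = data.dropna(subset=feature_cols).reset_index(drop=True)

scaler = MinMaxScaler()
scaled_data = scaler.fit_transform(data[feature_cols])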

# Data reshape
X = []
y = []
for i in range(60, len(scaled_data)):
    X.append(scaled_data[i-60:i])  # Use 60 days of data as input
    y.append(scaled_data[i, 0])   # The value to predict is the closing price
X, y = np.array(X), np.array(y)
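
The windowing loop can be checked on a toy array to confirm the resulting shapes. The helper name make_windows and the toy data are hypothetical; the logic mirrors the loop above with a shorter lookback.

```python
import numpy as np

def make_windows(values, lookback, target_col=0):
    """Turn a 2D array (rows, features) into windowed samples.

    Returns X with shape (samples, lookback, features) and y holding the
    target column's value on the step right after each window.
    """
    values = np.asarray(values)
    X = np.stack([values[i - lookback:i] for i in range(lookback, len(values))])
    y = values[lookback:, target_col]
    return X, y

# 10 rows, 4 features, filled with 0..39 so rows are easy to identify
toy = np.arange(40, dtype=float).reshape(10, 4)
X_toy, y_toy = make_windows(toy, lookback=3)

print(X_toy.shape)  # (7, 3, 4)
print(y_toy[0])     # 12.0, the first column of row 3
```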

3.2 Building the ANN Model

We will construct the artificial neural network model using Keras. Here, we use two stacked LSTM layers with dropout, followed by Dense layers that output the prediction:

from keras.models import Sequential
from keras.layers import Dense, LSTM, Dropout

model = Sequential()
model.add(LSTM(50, return_sequences=True, input_shape=(X.shape[1], X.shape[2])))
model.add(Dropout(0.2))
model.add(LSTM(50, return_sequences=False))
model.add(Dropout(0.2))
model.add(Dense(25))
model.add(Dense(1))  # Predicting the closing price

# Compile the model
model.compile(optimizer='adam', loss='mean_squared_error')

# Train the model
model.fit(X, y, batch_size=32, epochs=50)
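
The fit call above trains on every window. To measure how the network does on data it has not seen, the most recent windows can be held out, as in this sketch. The helper name chronological_split and the 80/20 fraction are assumptions, and the arrays are placeholders with the same shape as the windowed data.

```python
import numpy as np

def chronological_split(X, y, train_fraction=0.8):
    """Split windowed samples so validation data is later in time than training data."""
    split = int(len(X) * train_fraction)
    return X[:split], X[split:], y[:split], y[split:]

# Placeholder arrays: 100 windows of 60 time steps with 4 features
X_demo = np.zeros((100, 60, 4))
y_demo = np.arange(100, dtype=float)

X_tr, X_val, y_tr, y_val = chronological_split(X_demo, y_demo)

print(X_tr.shape, X_val.shape)  # (80, 60, 4) (20, 60, 4)
```

The held-out part can then be passed to Keras through the validation_data argument of fit, for example model.fit(X_tr, y_tr, validation_data=(X_val, y_val), batch_size=32, epochs=50).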

3.3 Prediction and Performance Evaluation

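We will use the trained model to make predictions and compare the predicted prices with the actual closing prices. Because this example predicts on the same windows the model was trained on, the plot shows in-sample fit rather than performance on unseen data: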

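# Prediction (values are in the scaled [0, 1] range, shape (samples, 1))
predictions = model.predict(X)

# Inverse scaling: the scaler was fitted on four columns, so undo the
# min-max transform for the 'Close' column (index 0) only
predictions = predictions * scaler.data_range_[0] + scaler.data_min_[0]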

# Performance evaluation
import matplotlib.pyplot as plt

plt.plot(data['Close'].values, color='blue', label='Actual Closing Price')
plt.plot(range(60, len(predictions) + 60), predictions, color='red', label='Predicted Closing Price')
plt.legend()
plt.show()
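
A plot gives a visual impression; a numeric error makes runs comparable. The sketch below computes RMSE and MAE with NumPy on hypothetical values; the function names are illustrative.

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean squared error, in the same units as the prices."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

def mae(actual, predicted):
    """Mean absolute error, in the same units as the prices."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    return float(np.mean(np.abs(actual - predicted)))

actual = np.array([100.0, 102.0, 104.0])     # hypothetical actual closes
predicted = np.array([101.0, 101.0, 106.0])  # hypothetical predictions

print(rmse(actual, predicted))
print(mae(actual, predicted))
```

In the walkthrough above, the corresponding inputs would be data['Close'].values[60:] and predictions.ravel().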

4. Interpretation of Results and Improvement Strategies

Interpreting the prediction results obtained from the model is also an important task. The closer the predictions are to the actual prices, the better the model’s performance. If the predictions are inaccurate, the following improvement strategies can be considered:

Increasing the amount of data
Adding diverse features
Tuning the model’s hyperparameters
Trying various machine learning and deep learning algorithms
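
As one possible way to approach the hyperparameter tuning item, the sketch below runs a small grid search over a Random Forest using scikit-learn's TimeSeriesSplit, so each fold validates on rows that come after its training rows. The data is synthetic and the parameter grid is an arbitrary assumption, not a recommendation.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit

rng = np.random.default_rng(0)

# Synthetic stand-in for the feature matrix and target; not real market data
X_demo = rng.normal(size=(120, 3))
y_demo = X_demo[:, 0] * 0.5 + rng.normal(scale=0.1, size=120)

# Arbitrary small grid, chosen only to keep the example fast
param_grid = {'n_estimators': [20, 50], 'max_depth': [3, None]}

search = GridSearchCV(
    RandomForestRegressor(random_state=42),
    param_grid,
    cv=TimeSeriesSplit(n_splits=3),  # folds respect time order
    scoring='neg_mean_squared_error',
)
search.fit(X_demo, y_demo)

print(search.best_params_)
```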

5. Conclusion

In this course, we explored how to build an algorithmic trading model based on moving averages using machine learning and deep learning. Moving averages are fundamental yet useful indicators, and combining them with machine learning and deep learning makes more sophisticated trading strategies possible. Refining these models requires continued experimentation with different datasets and algorithms.
