From the course: Practical Python for Algorithmic Trading

Compute machine learning classification model - Python Tutorial

- In our investment strategy, we'll use machine learning models to predict the future. The models are not perfect, though, because on some days they get it wrong. On the 14th of March, for example, the price went up the next day, yet the machine learning model classified it as down. In this tutorial, we'll show you how to create machine learning models with the scikit-learn library so that in the future you'll be able to create better machine learning models by applying the same reasoning.
As always, we start with the historical data, the data that we pre-processed in the previous tutorial. The first question we need to ask when we are developing machine learning models is: which variable do we want to predict? That is the change tomorrow direction. Therefore, we save this column into a new variable that we'll call target. Then we need to choose which variables we'll pass to the model so that it can predict tomorrow's direction. Those are going to be Open, High, Low, Close, and Volume. So we can drop these two columns and take the rest. Drop columns. We execute. Now we can see these columns. We save them as explanatory.
Now it's time to compute the machine learning model. Where can we find the code of the machine learning algorithms within the sklearn library? We can see many options in this library, because there are many machine learning models, but we want the tree models, concretely the decision tree. Here you can see a classifier and a regressor. Since the variable that we want to predict is a category, up or down, we need to use the classifier. So we import the class, we create an instance of this class, and now we are ready to get inside the object and find the function that will calculate the mathematical equation, which is fit. We can see in the documentation that it asks for X and y. X is the explanatory variables and y is the target variable, the variable that we want to predict.
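The steps described so far can be sketched roughly as follows. This is only an illustration under assumptions: the price data is synthetic, and the column names `change_tomorrow` and `change_tomorrow_direction` are hypothetical stand-ins for the two columns that get dropped in the video.

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the pre-processed historical data (not real prices).
rng = np.random.default_rng(0)
n_days = 200
close = 100 + rng.normal(0, 1, n_days).cumsum()
df = pd.DataFrame({
    "Open": close + rng.normal(0, 0.5, n_days),
    "High": close + 1.0,
    "Low": close - 1.0,
    "Close": close,
    "Volume": rng.integers(1_000, 5_000, n_days),
})

# Hypothetical names for tomorrow's change and its direction.
df["change_tomorrow"] = df["Close"].pct_change().shift(-1) * 100
df = df.dropna()  # the last day has no "tomorrow"
df["change_tomorrow_direction"] = np.where(df["change_tomorrow"] > 0, "UP", "DOWN")

# Target: the variable we want to predict.
target = df["change_tomorrow_direction"]

# Explanatory variables: drop the two "tomorrow" columns and keep the rest.
explanatory = df.drop(columns=["change_tomorrow", "change_tomorrow_direction"])

# The target is a category (UP or DOWN), so we use the classifier.
model = DecisionTreeClassifier()
model.fit(X=explanatory, y=target)
```

Calling `fit` returns the fitted model object itself, which is why a notebook shows the model's representation rather than an equation or a tree.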
So X equals explanatory, which are these columns, and y equals target, which is this column. We execute. We would have expected to see a mathematical equation or a tree here, but we get this output. Don't worry, it's fine: we haven't gotten any error, so the mathematical equation of the machine learning model has been computed correctly. If we want to perform different operations with the machine learning model, we need to execute different functions. It's not necessary to get into the details. For example, we can visualize the model by executing the following lines of code. It will take some time, so be patient until we see the output. Here we can see this massive decision tree. In this chapter, we won't get into the details of how to interpret this visualization, so we'll stick to what we've done so far, and in future tutorials we'll talk about this. To make the example easier to understand, let's shorten this tree: we can apply max_depth and set five levels on the tree. And here we can see the shorter tree.
Now it's time to calculate the predictions. These are based on the explanatory columns. We will pass the information for each day and calculate what will happen to the price of the stock tomorrow. So we look for the function in the same object in which we computed the machine learning model: if we had a function to compute the mathematical equation, we will have another function to calculate the predictions. Within the function, we can see that it asks for the X parameter, the explanatory variables. We execute, and here we can see the predictions. We save them into a new variable, let's call it y_pred, and now let's put y_pred into a data frame to see how the prediction compares with the reality. We can observe that we have got all of them right in this case, but in the middle we'll probably have some misclassifications. So how can we evaluate the machine learning model?
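A minimal sketch of shortening the tree and calculating the predictions might look like this. The data is random and purely illustrative, the names `df_pred`, `reality`, and `prediction` are hypothetical, and `export_text` is used here as a lightweight text alternative to the plotted tree shown in the video.

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

# Random stand-in data (assumption: five explanatory columns, UP/DOWN target).
rng = np.random.default_rng(1)
explanatory = pd.DataFrame(
    rng.normal(size=(300, 5)),
    columns=["Open", "High", "Low", "Close", "Volume"],
)
target = pd.Series(rng.choice(["UP", "DOWN"], size=300),
                   name="change_tomorrow_direction")

# Limit the tree to five levels to keep it readable.
model = DecisionTreeClassifier(max_depth=5, random_state=42)
model.fit(X=explanatory, y=target)

# Text rendering of the fitted tree (first few hundred characters only).
tree_text = export_text(model, feature_names=list(explanatory.columns))
print(tree_text[:500])

# Predictions for each day, based on the explanatory columns.
y_pred = model.predict(X=explanatory)

# Put reality and prediction side by side to compare them.
df_pred = pd.DataFrame({"reality": target, "prediction": y_pred})
print(df_pred.head())
```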
We take the change direction tomorrow and check, for every row, whether it is equal to the prediction. Here we see all of them are True. In the middle we will have some Falses, so we save the results in a variable, and if we sum the rows, each True is counted as one and each False as zero. So we get about 900 right out of a total of roughly 1,500 elements, and if we calculate the ratio, it gives us the accuracy of the model, which tells us that our model got 59% of the days right.
Now I want you to reflect on one particular thing about machine learning models. We had a function to calculate the mathematical equation: fit. We also had another function to calculate the predictions with that mathematical equation. Shouldn't there be a function to compare the predictions with the reality automatically? There is, and that is score. This pattern that I have just shown you holds for any machine learning model in the scikit-learn library. So from this small example, you can apply it to any of the hundreds of machine learning models in the scikit-learn library. Now, to calculate the score, X equals explanatory and y equals target, as the documentation asks. We execute and we get the same number that we calculated by hand before.
Now there is one more thing to reflect on. We could improve the accuracy of the machine learning model, which ideally should be above 80%. How can we do it? If we go to the tree, we can see that it is short. This tree doesn't capture many patterns within the data, but if we make the tree larger, for example 15 levels, it will capture the patterns within the data better, so that it can make better predictions. And here we can see that we now have 80% accuracy. There are many more machine learning concepts to learn, but don't worry, we will explain all of them in the following chapter. Now, in the following tutorial, we will use the mathematical equation that we have just computed.
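The accuracy calculation, by hand and with `score`, can be sketched as below. The data is random and only for illustration, so the numbers will not match the ones in the video; `comp`, `accuracy_by_hand`, and the other variable names are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Random stand-in data (assumption: same shape of problem as in the tutorial).
rng = np.random.default_rng(2)
explanatory = pd.DataFrame(
    rng.normal(size=(300, 5)),
    columns=["Open", "High", "Low", "Close", "Volume"],
)
target = pd.Series(rng.choice(["UP", "DOWN"], size=300))

model = DecisionTreeClassifier(max_depth=5, random_state=42)
model.fit(X=explanatory, y=target)
y_pred = model.predict(X=explanatory)

# By hand: True counts as 1 and False as 0 when summing.
comp = target == y_pred
accuracy_by_hand = comp.sum() / len(comp)

# Automatically: score compares predictions with reality for us.
accuracy_from_score = model.score(X=explanatory, y=target)
print(accuracy_by_hand, accuracy_from_score)

# A deeper tree (15 levels) can capture more patterns in the same data.
deeper = DecisionTreeClassifier(max_depth=15, random_state=42)
deeper.fit(X=explanatory, y=target)
accuracy_deeper = deeper.score(X=explanatory, y=target)
print(accuracy_deeper)
```

As in the tutorial, the score here is measured on the same rows that were used to fit the model.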
So how can we save this object to a file so that we can use it in a different file in the following tutorial? We can create a pickle file that contains this object. First, let's create a folder called models. You will see it in this part of the screen. Now you see the models folder, and to save the pickle file, we go to the models folder and name the file model DT classification. We pass the object that holds the mathematical equation, we execute, and the mathematical equation is now saved in this file. Now let's move on to the following tutorial, where I will show you how to pass this machine learning model to an investment strategy for backtesting.
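Saving the fitted model to a pickle file, and loading it back, might look like the sketch below. The exact file name `model_dt_classification.pkl` is an assumption based on the spoken name, and a temporary directory is used here only so the example runs anywhere; in the tutorial the models folder sits next to the notebook.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Random stand-in data, only so there is a fitted model to save.
rng = np.random.default_rng(3)
X = rng.normal(size=(100, 5))
y = rng.choice(["UP", "DOWN"], size=100)
model = DecisionTreeClassifier(max_depth=15, random_state=42).fit(X, y)

# Create the "models" folder (inside a temporary directory for this sketch).
base_dir = Path(tempfile.mkdtemp())
models_dir = base_dir / "models"
models_dir.mkdir(exist_ok=True)

# Hypothetical file name for the pickled decision tree classifier.
model_path = models_dir / "model_dt_classification.pkl"

# Save the object that holds the fitted model.
with open(model_path, "wb") as f:
    pickle.dump(model, f)

# Later, in a different file, load it back and use it as before.
with open(model_path, "rb") as f:
    loaded_model = pickle.load(f)

print(loaded_model.predict(X[:5]))
```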
