Introduction to artificial intelligence: Example using Python and TensorFlow


Today's blog post provides a simple look at how AI works, aided by Python code examples and the TensorFlow software library. Strap in, and let's start this journey together.

TensorFlow, an open-source software library developed by the brilliant minds at Google Brain, forms a robust framework for constructing and designing machine learning models. Today, we will break down these complexities and navigate through a straightforward example: training an AI model to add two numbers.

You might be thinking: why not just use Python's "+" operator? That is a fair point from an efficiency standpoint, but this deliberately simple example is a good way to learn the basic setup of an AI project.

Now, let's get our hands dirty with some coding. The first and most crucial step in any AI project is data collection. The quality of the model's predictions directly hinges on the quality of the data we feed it. As the adage goes, "Garbage in, garbage out." In today's data-centric world, anyone with access to comprehensive, high-quality data is indeed sitting on a goldmine.

For this experiment, we will generate the data ourselves. We'll utilize two incredibly handy Python libraries for data science: Pandas and NumPy. Think of these as Excel on steroids, but for Python. Additionally, we'll leverage the sklearn Python library to partition our data, a concept we'll discuss later in this post.

To get started, we can effortlessly install these libraries alongside TensorFlow using pip:


  pip install numpy pandas scikit-learn tensorflow
  

We will kick off by generating a DataFrame of 1,000 rows, each holding a random integer between 1 and 10,000 in column A and another in column B:


  import pandas as pd
  import numpy as np
  import tensorflow as tf
  from sklearn.model_selection import train_test_split

  # Define the size of the dataframe
  size = 1000

  # Generate random integers for columns A and B
  A = np.random.randint(1, 10001, size)
  B = np.random.randint(1, 10001, size)

  # Create the DataFrame
  df = pd.DataFrame({
      'A': A,
      'B': B
  })
  

Now let's create a third column, C, which will be the sum of columns A and B:


  # Create column C as the sum of A and B
  df['C'] = df['A'] + df['B']
  

The next step involves splitting our dataset into training and testing sets. We use the sklearn library for this purpose. Our goal is for the model to figure out the relationship between A, B, and C (in this case, addition). Once the model grasps this relationship, it should be able to add any pair of numbers—even numbers beyond 10,000 that it has never encountered. This is a concept known as generalization. Columns A and B are our inputs; column C is the value we want the model to "predict", usually called the target.


  # Split the data into train and test sets
  train_df, test_df = train_test_split(df, test_size=0.2)

  # Separate inputs and targets
  train_inputs = train_df[['A', 'B']]
  train_targets = train_df['C']
  test_inputs = test_df[['A', 'B']]
  test_targets = test_df['C']
  

In the code snippet above, note the use of test_size = 0.2. This implies that we reserve 20% of the data for testing, while the remaining 80% is used for training. In numerical terms, 800 rows for training and 200 rows for testing.
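
You can verify the split with a quick check on the variables created above:


  # Confirm the 80/20 split: shapes should be (800, 2) and (200, 2)
  print(train_inputs.shape, test_inputs.shape)
  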

Now, onto the challenging part—model design. This requires a keen understanding of the problem at hand and often involves a fair amount of trial and error. For this demonstration, our model will consist of two Dense layers: one with a size of 4 and another with a size of 1, the latter serving as our output layer. We discuss Keras layers in more depth in this blog post.

We will also employ "adam" as our optimizer—a method used to tweak the model's parameters to minimize the error, or loss function. In this case, we will use "mean squared error" as our loss function. Here's how our model looks:


  # Define the model
  model = tf.keras.Sequential([
      tf.keras.layers.Dense(units=4, activation='linear'),
      tf.keras.layers.Dense(units=1, activation='linear')
  ])

  # Compile the model
  model.compile(optimizer='adam', loss='mean_squared_error')
  

Time to train our model:


  # Train the model
  model.fit(train_inputs, train_targets, batch_size=32, epochs=1000, verbose=1)
  

Our model will be fed with training inputs (train_inputs), which consist of the values from columns A and B. Similarly, it will be passed the training targets (train_targets), containing the values from column C. In essence, given inputs A and B, our model will aim to predict a corresponding value, C, which we will refer to as predicted_C.

The model compares predicted_C with the actual value of C found in the train_targets variable. It then calculates an error value based on a predefined loss function. This error essentially quantifies how much our prediction deviates from the actual value. For our case, we calculate this error using the mean square error (MSE), expressed mathematically as the mean of (predicted_C – C)².
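
To make the loss function concrete, here is a small NumPy example (the values are made up purely for illustration) that computes the same mean squared error by hand for three predictions:


  # Illustration only: MSE computed by hand for three made-up predictions
  actual_C = np.array([100.0, 250.0, 400.0])
  predicted_C = np.array([98.0, 255.0, 401.0])
  mse = np.mean((predicted_C - actual_C) ** 2)
  print(mse)  # (4 + 25 + 1) / 3 = 10.0
  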

The optimizer, in our case 'Adam', is then responsible for guiding the model to minimize this error. The smaller the error, the more accurate the prediction. Calculating the mean squared error for a single value doesn't provide a meaningful measure, so the fit function divides the training data into batches and averages the error over each batch. In our example, we set the batch size manually to 32, which means our 800 training rows are divided into 25 batches of 32 rows each (25 × 32 = 800).
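
As a quick sanity check, you can compute the number of batches per epoch directly:


  import math

  # Keras rounds up if the last batch is smaller; here 800 / 32 is exactly 25
  num_batches = math.ceil(len(train_inputs) / 32)
  print(num_batches)  # 25
  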

Next, let's clarify the concept of epochs. An epoch signifies a single pass through the entire dataset during training. In our case, epochs=1000 means the model will iterate over the entire dataset a thousand times. If the epoch count is too low, the model will not have enough time to learn, resulting in high error. Conversely, if it is too high, the model will memorize the training values rather than learning to generalize from them. This is called overfitting, and it can lead to poor predictions on unseen data.
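
One common safeguard against training for too many epochs is to hold out part of the training data as a validation set and stop once the validation loss stops improving. The sketch below is an optional alternative to the plain fit call above, using Keras's EarlyStopping callback:


  # Optional sketch: stop training when the validation loss stops improving
  early_stop = tf.keras.callbacks.EarlyStopping(
      monitor='val_loss', patience=20, restore_best_weights=True
  )
  model.fit(train_inputs, train_targets, batch_size=32, epochs=1000,
            validation_split=0.1, callbacks=[early_stop], verbose=1)
  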

Having trained the model, what remains now is to test it.


  # Evaluate the model
  loss = model.evaluate(test_inputs, test_targets, verbose=0)
  print(f"Test loss: {loss}")

  A=900
  B=-300
  test_data = np.array([[A, B]])
  prediction = model.predict(test_data)

  print(f"The model predicts that the sum of {A} and {B} is: {prediction[0][0]}")
  

The first line calculates the model's loss (or error) using test data, which is our remaining 200 rows unseen by the model during training. This evaluation allows us to gauge how well the model generalizes. If the training loss is significantly lower than the test loss, it's usually a sign of overfitting.
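
A quick way to check for this is to evaluate the model on both sets and compare the two numbers; they should be of a similar magnitude:


  # Compare training and test loss; a large gap suggests overfitting
  train_loss = model.evaluate(train_inputs, train_targets, verbose=0)
  test_loss = model.evaluate(test_inputs, test_targets, verbose=0)
  print(f"Train loss: {train_loss}, Test loss: {test_loss}")
  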

Subsequent lines of the code are used to test the model with arbitrary numbers of our choice. You can input any values for A and B and examine the model's output. Do keep in mind, however, that the output won't always be 100% accurate since there will invariably be a margin of error.
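
If you want to quantify that margin of error, you can compare the prediction with the exact sum, reusing the A, B, and prediction variables from the snippet above:


  # Measure how far the prediction is from the exact sum
  error = abs(prediction[0][0] - (A + B))
  print(f"Absolute error: {error}")
  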

Output Example


  Epoch 999/1000
  25/25 [==============================] - 0s 1ms/step - loss: 6.2732e-06
  Epoch 1000/1000
  25/25 [==============================] - 0s 1ms/step - loss: 4.8342e-06
  Test loss: 6.022305569786113e-06
  1/1 [==============================] - 0s 78ms/step
  The model predicts that the sum of 900 and -300 is: 599.99755859375
  

Interestingly, the value of B can be negative. Although the model was trained only with values between 1 and 10,000, it can still compute 900 + (-300). This indicates the model has indirectly learned to perform subtraction. While it's not magical, it serves as an important reminder that training can lead to unintended learning outcomes.
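
If you want to probe this further, a small, purely illustrative loop can feed the model pairs it never saw during training, including negative values and values above 10,000, and compare the predictions to the exact sums:


  # Probe generalization with values outside the training range
  for a, b in [(12000, 34000), (-500, -700), (7, 3)]:
      pred = model.predict(np.array([[a, b]]), verbose=0)[0][0]
      print(f"{a} + {b} = {a + b}, model predicts {pred:.2f}")
  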

Stay tuned for our upcoming posts, where we'll dive deeper into model design and other advanced topics. Until then, keep experimenting and remember—every piece of code is a step closer to unravelling the mysteries of AI!

Get updates directly in your mailbox by signing up for our newsletter.


You can test and play with the code yourself by going to our Google Colab.
