Markov Chains Tutorial - Part II

Today we shall code the Markov Chain. I will be using Python for this tutorial. If you are coming from another language, Python's syntax should not cause you any trouble, and you will be able to code it in the language of your choice. The Markov Chain is made from a few functions. The first is the "dictionary" function. It creates a dictionary from our data that maps each phrase to the words that follow it. The second is the "generation"/"make string" function, which generates new strings from our dictionary. The last one, the "speak" function, is optional, but it makes the project more fun, so we will use it. It speaks out the text that is generated.

Dictionary Function

First we will begin by declaring an empty dictionary and by splitting the text into individual words. Then we will make our index equal to our order (please see the previous blog for order). The index is a number that will help us move through our data and create the dictionary.

def dictionary(data, order):
    Dict = {}
    words = data.split(' ')
    index = order

Now we shall loop through the created list 'words', starting from the index and going to the end. For each word, we join the 'order' words that come just before it into a phrase, which becomes the key. Then we check whether that key is already in the dictionary. If it is, we append the word to that key's list; if it is not, we create a new list containing just that word. Finally we move the index forward by one.

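    for word in words[index:]:
        # The key is the 'order' words that come just before the current word
        key = ' '.join(words[index - order:index])
        if key in Dict:
            Dict[key].append(word)  # key seen before: add another follower
        else:
            Dict[key] = [word]  # first time we see this key
        index += 1
    return Dict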

The dictionary, when created for the string below with order 3, will look something like this.

"Hello world, I am a Coder"
{'Hello world, I': ['am'], 'world, I am': ['a'], 'I am a': ['Coder']}
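
The example above never hits the "append" branch, because no phrase repeats. Here is a small sketch that shows what happens when a key is seen twice. The name build_dictionary and the sample sentence are my own for illustration, and it uses setdefault instead of the if/else check, so treat it as one possible way to write the same idea.

```python
def build_dictionary(data, order):
    """Map each run of `order` consecutive words to the words that follow it."""
    words = data.split(' ')
    table = {}
    for index in range(order, len(words)):
        # The key is the `order` words just before the current word
        key = ' '.join(words[index - order:index])
        # setdefault creates the list the first time, then we append to it
        table.setdefault(key, []).append(words[index])
    return table

# With order 1, 'the' appears twice, so it collects two followers
example = build_dictionary("the cat sat on the mat", 1)
print(example)
# {'the': ['cat', 'mat'], 'cat': ['sat'], 'sat': ['on'], 'on': ['the']}
```

A key with more followers gives the generator more choices, which is where the randomness in the output comes from.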

Make_String Function

To create our randomized string, we will first have to import the random library. Then we randomly choose one of the dictionary's keys as the starting point for our phrase and add it to an empty string.

import random as r
def make_string(Dict, length): 
    oldWords = r.choice(list(Dict.keys())).split(' ') 
    string = ' '.join(oldWords) + ' '

Now we will define another loop that repeats for the desired length of the text. Inside the loop we add a try-except statement. This handles the KeyError that is raised when the current phrase is not a key in the dictionary. That is very common, because it typically happens whenever the chain reaches the last words of the data, which have nothing after them. In that case we simply return what we have generated so far. Next we take a variable 'key', which is equal to our current words joined together. We look that key up in the dictionary and randomly choose one of the words stored under it. Then a loop shifts every word in the list of old words one place to the left, and we make the last item of the list our newly chosen word. I call this bit the transformer. It does something like this

It makes this

['like', 'to', 'sleep.']

into 

['to', 'sleep.', 'And']

The code

    for i in range(length):
        try:
            key = ' '.join(oldWords)
            newWords = r.choice(Dict[key])
            string += newWords + ' '
            # The Transformer
            # Shift every word one place to the left...
            for word in range(len(oldWords)):
                oldWords[word] = oldWords[(word + 1) % len(oldWords)]
            # ...then put the newly chosen word in the last place
            oldWords[-1] = newWords
            # End of Transformer
        except KeyError:
            # No entry for this key, so stop early
            return string
    return string
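
If the shifting loop feels hard to follow, the transformer can also be sketched with list slicing. The helper name shift_window is my own and is not part of the code above. It builds a new list instead of changing the old one in place, but the result should be the same.

```python
def shift_window(old_words, new_word):
    """Drop the oldest word and add the newest one at the end."""
    # old_words[1:] is everything except the first word
    return old_words[1:] + [new_word]

window = ['like', 'to', 'sleep.']
window = shift_window(window, 'And')
print(window)  # ['to', 'sleep.', 'And']
```

Either way the list always holds exactly 'order' words, so joining it with spaces gives a phrase of the same shape as the dictionary's keys.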

The Speak Function

This function involves the gTTS and playsound libraries. We first call the gTTS function, then save the file and then play it. Something peculiar about this is that you first need to save the speech that was converted from the text and then play that file.

from gtts import gTTS
import playsound as ps
def speak(text):
    tts = gTTS(text = text, lang = 'en', slow = False)
    filename = 'Speech.mp3'
    tts.save(filename)
    ps.playsound(filename)

Generation

After all this, find some data and store it in a string variable called data. Now we shall see how to call all of these functions. First we will call the dictionary function, then the make_string function and finally the speak function.

data = "Any random text, from any random source"
order = 3
length = 100
markov_dict = dictionary(data, order)
string = make_string(markov_dict, length)
speak(string)
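
If you do not have gTTS and playsound installed, you can still try the whole thing by printing the text instead of speaking it. Below is a condensed sketch of the two functions, not the exact code from above: it uses .get instead of try-except and slicing for the transformer. The sample sentence, the fixed seed and the variable names are assumptions made just for this example.

```python
import random

def dictionary(data, order):
    """Map each phrase of `order` words to the words that follow it."""
    words = data.split(' ')
    table = {}
    for index in range(order, len(words)):
        key = ' '.join(words[index - order:index])
        table.setdefault(key, []).append(words[index])
    return table

def make_string(table, length):
    """Generate up to `length` new words after a random starting phrase."""
    window = random.choice(list(table.keys())).split(' ')
    output = list(window)
    for _ in range(length):
        followers = table.get(' '.join(window))
        if followers is None:
            # The phrase has no followers (end of the data), so stop early
            break
        new_word = random.choice(followers)
        output.append(new_word)
        window = window[1:] + [new_word]  # the transformer
    return ' '.join(output)

random.seed(0)  # assumption: fixed seed only so that reruns give the same text
data = ("the quick brown fox jumps over the quick brown dog "
        "and the quick brown fox sleeps")
order = 2
table = dictionary(data, order)
text = make_string(table, 20)
print(text)
```

Remove the random.seed line to get a different string on every run.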

That's it for today! Experiment with length of the data string and the order. Have fun playing around with this Markov Chain!!
