Markov Chains Tutorial - Part II
Today we will code the Markov Chain. I will be using Python for this tutorial. If you are coming from another language, Python's syntax should not give you any trouble, and you will be able to write the same thing in your language of choice. The Markov Chain is built from a few functions. The first is the "dictionary" function, which builds a dictionary from our data that maps each run of words to the words that follow it. The second is the "generation" or "make string" function, which generates new strings from that dictionary. The last one, the "speak" function, is optional, but it makes the project more fun, so we will use it. It reads the generated text aloud.
Dictionary Function
First we declare an empty dictionary and split the text into individual words. Then we set our index equal to the order (please see the previous blog for what the order is). The index is a number that helps us move through the data as we build the dictionary.
def dictionary(data, order):
    Dict = {}
    words = data.split(' ')
    index = order
Now we loop through the list 'words', starting from the index and going to the end. For each word, we join the 'order' words that come just before it into a single phrase, which becomes our key. Then we run a check: if that phrase is already in the dictionary, we append the word to its list, and if it is not, we create a new entry whose list holds just that word.
    for word in words[index:]:
        key = ' '.join(words[index - order:index])
        if key in Dict:
            Dict[key].append(word)
        else:
            Dict[key] = [word]
        index += 1
    return Dict
For the string below with order 3, the dictionary will look like this.
"Hello world, I am a Coder"
{'Hello world, I': ['am'], 'world, I am': ['a'], 'I am a': ['Coder']}
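The example above never repeats a phrase, so every list has exactly one word in it. To see a list grow, here is a small sketch of the same idea run with order 1 on a sentence that repeats a word. The name build_dictionary and the use of setdefault are my own choices for illustration, not part of the function above.

```python
def build_dictionary(data, order):
    """Map each run of `order` consecutive words to the words that follow it."""
    words = data.split(' ')
    followers = {}
    for index in range(order, len(words)):
        key = ' '.join(words[index - order:index])
        # setdefault creates the empty list the first time a key is seen
        followers.setdefault(key, []).append(words[index])
    return followers


example = build_dictionary("the cat sat on the mat", 1)
print(example)
# {'the': ['cat', 'mat'], 'cat': ['sat'], 'sat': ['on'], 'on': ['the']}
```

Notice that 'the' now has two possible followers, and that 'mat' never becomes a key because nothing comes after it in the data.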
Make_String Function
To create our randomized string, we first have to import the random library. Then we randomly choose a starting phrase from the dictionary's keys and add it to an empty string.
import random as r
def make_string(Dict, length):
    oldWords = r.choice(list(Dict.keys())).split(' ')
    string = ' '.join(oldWords) + ' '
Now we define another loop that repeats for the desired length of the text. Inside the loop we add a try-except statement. This handles the KeyError that is raised when the current phrase is not in the dictionary, which happens whenever we reach a phrase that only appears at the very end of the data. These errors are common, very common. Next we take a variable 'key', which is our current words joined into one phrase. We look the key up in the dictionary and randomly choose one of the words stored for it. Then we pass the old words through a loop that shifts each word one place to the left and makes the last item of the list our newly chosen word. I call this bit the transformer.
It makes this
['like', 'to', 'sleep.']
into
['to', 'sleep.', 'And']
The code
    for i in range(length):
        try:
            key = ' '.join(oldWords)
            newWords = r.choice(Dict[key])
            string += newWords + ' '
            # The Transformer
            # Shift every word one slot to the left...
            for word in range(len(oldWords)):
                oldWords[word] = oldWords[(word + 1) % len(oldWords)]
            # ...then put the newly chosen word in the last slot.
            oldWords[-1] = newWords
            # End of Transformer
        except KeyError:
            # The current phrase has no follower in the dictionary, so stop here.
            return string
    return string
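If the transformer loop feels hard to follow, the same sliding step can be sketched in two other ways. This is just an alternative for illustration, not what make_string does: the names window and old_words are made up here, and the deque comes from Python's standard collections module.

```python
from collections import deque

# Hypothetical stand-in for oldWords: a window that holds at most 3 words.
window = deque(['like', 'to', 'sleep.'], maxlen=3)

# Appending to a full deque drops the leftmost word automatically.
window.append('And')
print(list(window))  # ['to', 'sleep.', 'And']

# The same step with plain list slicing.
old_words = ['like', 'to', 'sleep.']
old_words = old_words[1:] + ['And']
print(old_words)  # ['to', 'sleep.', 'And']
```

Either way, joining the result with spaces gives the key for the next lookup.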
The Speak Function
This function uses the gTTS and playsound libraries. We first call gTTS to convert the text to speech, then save the result to a file, and then play it. The one peculiar thing here is that you first need to save the converted speech to a file and only then play that file.
from gtts import gTTS
import playsound as ps
def speak(text):
    tts = gTTS(text=text, lang='en', slow=False)
    filename = 'Speech.mp3'
    tts.save(filename)
    ps.playsound(filename)
Generation
After all this, find some data and store it in a string variable called data. Now let's see how to call all of these functions. First we call the dictionary function, then the make_string function, and finally the speak function.
data = "Any random text, from any random source"
order = 3
length = 100
# Named word_dict so that it does not shadow Python's built-in dict
word_dict = dictionary(data, order)
string = make_string(word_dict, length)
speak(string)
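If you do not have gTTS or playsound installed, you can still try the chain with text output only. Below is a minimal self-contained sketch of that. It is not the exact code from this post: build_dictionary, generate, the sample sentence and the seed 42 are all made up for illustration, and it uses a seeded random.Random so that repeated runs give the same text.

```python
import random


def build_dictionary(data, order):
    """Map each run of `order` consecutive words to the words that follow it."""
    words = data.split(' ')
    followers = {}
    for index in range(order, len(words)):
        key = ' '.join(words[index - order:index])
        followers.setdefault(key, []).append(words[index])
    return followers


def generate(followers, length, rng):
    """Start from a random key and add up to `length` new words."""
    current = rng.choice(list(followers)).split(' ')
    output = list(current)
    for _ in range(length):
        choices = followers.get(' '.join(current))
        if choices is None:
            # Dead end: this phrase only appears at the end of the data.
            break
        next_word = rng.choice(choices)
        output.append(next_word)
        # Slide the window: drop the oldest word, add the new one.
        current = current[1:] + [next_word]
    return ' '.join(output)


data = "I like to sleep. And I like to eat. And I like to code."
rng = random.Random(42)  # fixed seed so the run is repeatable
text = generate(build_dictionary(data, 2), 20, rng)
print(text)
```

Here the dead end is handled with followers.get instead of try-except, which is only a matter of taste. With order 2, every three words in a row in the output also appear in a row somewhere in the data.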
That's it for today! Experiment with the length of the data string and with the order. Have fun playing around with this Markov Chain!!