Learning Scala in 5 parts: Part 3

Caesar Cipher

Today we’re gonna look at a project which was part of the exam for the introductory programming course at my university last year. The project requires some knowledge about how the Caesar Cipher works. Luckily the idea can be explained swiftly:

Caesar Cipher works by creating a mapping between the alphabet (English for this tutorial) to a version of the alphabet that has been cyclically shifted by k to the right. A cipher text is created by applying this mapping to each letter in the input text. To decrypt the text the reverse mapping can be used on each letter yielding the original text.
The only information needed to decrypt a text is the key, k, with which the text was encrypted.


For more information on Caesar Ciphers please check: Caesar cipher at wikipedia. Also, this is fun 😀

Requirements

The project has been altered a bit to fit this tutorial and the requirements are now as follows:

  • Implement an encryption function using the Caesar Cipher encryption scheme
  • Implement the corresponding decryption function
  • Define at least one class and one object in separate files for the application
  • Write a program that takes the following parameters:

To test the program we are going to use a large input text which you can download here (right click and select download).

Math brushup

The modulo operation is defined as the remainder of integer division and is written as “a mod b”. In scala, % is used to denote modulo. If you fire up your REPL, you can test the operator to get a feeling of how it works:

Now, understanding the modulo operator shouldn’t be too hard, so since we’re here, take your time to make sure you understood how the for loop used in the above works!
If you’re having trouble understanding it or want the juicy details, you can google it 🙂.

Approaching the project

You should of course spend some time thinking through the task and making sure you understand everything. Once this is done it is time to design the algorithms we need. This boils down to designing the encryption and decryption functions. I’ll pretend we have some oracle to design our application and simply take you through the steps.

First up we always need to define an object which contains the main function. This is mandatory. The name of the object, however, isn’t. Let’s call it CaesarCipher. Create CaesarCipher.scala and open it in your text editor. Then type in the minimum object like in part 2:

Here’s another rule of thumb for you: Try to avoid implementing too much inside objects. This is helpful for the design of your software for reasons that are severely out of scope for this tutorial. With this in mind though, let’s create another file. Call it Cryptor.scala and open it. Put the following inside:

That’s possibly even less exciting than the object we just made. Notice that we only need one main function in an entire program and thus Cryptor doesn’t need one. We can add some simple function first:

Now let’s change the CaesarCipher object to use this functionality:

This time, compile BOTH THE OBJECT AND THE CLASS:

Things just got complicated and you probably didn’t even notice. We now have two different kinds of objects. The object we defined in the text file CaesarCipher.scala is a singleton object. This means that there is only one of this kind of object and it is automatically created when we start the program. The class Cryptor on the other hand can create many objects and no objects are automatically created. This is why we need the new Cryptor() part in the program. When we create an object from a class we can use the functions and values provided on that kind of object. Those are exactly the values and functions we defined in the class-file, namely the function printThis(…).

Encrypt, take a deep breath…

Now is the time to brew some coffee as we start with the encryption function.

The encryption function is the main challenge in this project. We could create a map for each letter showing that “a->c,b->d,c->e” and so on, depending on the key we get, but this wouldn’t be general in the sense that we would need to create a new map every time we got a new key. Of course there are only 25 different keys if we use the English alphabet, but imagine using the Chinese alphabet. The memory needed to store this map would be a bit too big.
I suggest we try to find a closed formula for encrypting a single character. Luckily letters are represented as numbers in the computer. Let’s see what REPL can tell us:

First of all, ‘a’ to ‘z’ gives us the alphabet in a list (or sequence). We just map the function that converts each character to the integer representation. So ‘a’.toInt is equivalent to 97 and ‘z’.toInt is 122. 122-97 is 25 which is what we expect. In fact, in Scala you can do mathematical operations on characters:

You should have a basic idea on how to encrypt a single letter now. If you paid attention with the modulo operation, you can even figure out how to prevent going out of the alphabet. What we need is a cyclic shift, so using the modulo operation we can get numbers between 0 and 25.
To get the position in the alphabet we can just subtract the current letter from ‘a’. If we then add k, we should get the desired position in the alphabet after the encryption. To prevent going outside the alphabet, we just use modulo with 26.

This function would give us the index in the alphabet after the encryption. However, we need the actual character so we need to add back ‘a’ to the result. Finally we convert the integer to a character:

Depending on how well you understand these two expressions you could consider combining them like so:

Let’s try it in REPL before we add it to our program:

Okay, it works…kinda! It didn’t ignore the space. Futhermore we didn’t handle the case of capital letters. If you understood this so far please pad yourself on the back as we get ready to finish the program.

Hit the ground running

Now we’re gonna go faster. Using pattern matching we can fix the problems with the function. Make your Cryptor.scala file look like this:

private simply means that this function can only be used inside this class. But what good is a class where we cant use any of the functions in our main program (i.e. in the Cryptor object)? Not very good at all! So we should add a few functions which aren’t private. It could be nice to have functions to encrypt and decrypt an entire String. In the end, that is really what we would use it for. We add the functions by noting that decrypting is the same as encrypting with 26-key:

That’s it for the Cryptor class. Now we just need to read the text from a file, create a Cryptor object from the Cryptor class, encrypt the text with the cryptor and finally write the encrypted text to a file.

The only new thing happening here is the fact that we are using imports. Import is like saying “I need to use this class from another library in this part of my program”. Simple, right? You can find out which classes, objects and fuctions you can import on the Scala api. Notice that we are importing both functions and classes and the fact that we have to call the close function on the FileWriter before it actually writes to the file. Also, fromFile is a function on an object called Source. To distinguish between objects created from a class and objects like Source and CaesarCipher, I will denote the latter as singletons from now on.
Something I have been doing a lot without explaining is calling functions in two different ways. In Scala a function is always called on an object or singleton. Functions can be called infix:

Yes, 1 and 2 are objects. + is a function being called on 1. We can also call it using the dot-notation:

pretty cool huh? And pretty weird that this example returns a Double value! I’ll try to find out why later 🙂 I like to use inifix notation when I create single-line functions and the dot-notation for multi-line functions. Let’s continue with Caesar.
Update: A few tests in REPL reveals that 1. is interpreted as 1.0, which is a double. Consider the following example for insight:

Test

Finally, let’s test the program. We encrypt bible.txt with the key 7 and then we decrypt it with the same key. If everything works right the decrypted text should be plain English, whitespace and symbols untouched. The head command shows the first few lines of a file in Ubuntu. Alternatively you can open the file in a text editor (can take a while because of the text size). Let’s go:

Advanced: Bonus knowledge

You can change the Cryptor to use currying:

By changing the caesar function to take two arguments in each their own parenthesis it allows the function to be curried. By doing this you can use a much more concise syntax in the encrypt function.

Part 4

CategoriesUncategorized

Leave a Reply

Your email address will not be published. Required fields are marked *