Binary and Hexadecimal: A Quick Java Refresher

7pm on Wednesday 7th February, 2018

Many Java developers learn the ins and outs of the data types they use every day, quickly forget them, and never think about them again. Consider this a quick refresher for anyone who suddenly discovers they're working with raw data and the message they've been sent is 0x0000FEAD rather than the regular-looking 65197 they were expecting. Hopefully this all feels incredibly familiar and easy.

Note: byte order (endianness) matters when combining bytes into larger values. The examples here read bytes most-significant first (Big Endian, which is also what Java uses internally); everything works the same way for Little Endian, just with the bytes organised in the opposite order.

Binary, Octal and Hex

Binary, Octal, Decimal and Hex are all ways of representing numbers. We often have to consider these different ways of writing the same numbers because of how computers work with numbers, and the effect this has on our numbers.

Binary

Binary numbers are numbers represented by only 2 symbols, usually 0 and 1 (we call these bits), which is why binary is also called base 2. If we're writing a number in binary we either give it a prefix of \(0b\) or a bracketed subscript with the base number, like this: \((11)_2\), so this number is not 11 but 3 when converted to decimal. Just as in decimal, we count up at each digit until we hit our largest symbol, then add a new digit as required. This gives us the following table:

Decimal 1 2 3 4 5 6 7 8 9 10 ... 100
Binary 1 10 11 100 101 110 111 1000 1001 1010 ... 1100100

That is, each digit in binary represents a power of 2, starting with \(2^0\) from the right. So a conversion from binary to decimal can be done by taking the individual digits, multiplying each by their corresponding power of 2 and adding them all together. For example if we wish to convert \((11011010)_2\) to decimal we may calculate \(1 \times 2^7 + 1 \times 2^6 + 0 \times 2^5 + 1 \times 2^4 + 1 \times 2^3 + 0 \times 2^2 + 1 \times 2^1 + 0 \times 2^0 = (218)_{10}\). We can compare this to how a decimal number, for example 335, is \(3 \times 10^2 + 3 \times 10^1 + 5 \times 10^0\).
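For a quick sanity check in Java, here's a minimal sketch of that conversion (the class and method names are just mine for illustration); Integer.parseInt with a radix of 2 does the same job:

public class BinaryToDecimal {

    // Walk the digits left to right: double the running total, then add the next bit.
    static int fromBinary(String bits) {
        int value = 0;
        for (char c : bits.toCharArray()) {
            value = value * 2 + (c - '0');
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(fromBinary("11011010"));           // 218
        System.out.println(Integer.parseInt("11011010", 2));  // 218, the built-in equivalent
    }
}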

This is fundamentally what computers use when doing all their calculations. Your typical binary number on a 32-bit computer is 32 bits in size, and similarly on a 64-bit machine a 64-bit number is the default. This means the largest number an n-bit machine can natively represent is \(2^n - 1\), although we can work with larger numbers with some tricks if we need to (which is how BigInteger and BigDecimal work). It is rare to see a 32-bit machine these days because of memory limitations: a 32-bit machine can only address \(2^{32}\) different places in memory, which equates to around 4GB of "addressable" memory. This literally means that if we tried to use any more than this limit, we couldn't recall the data from memory, because we cannot represent a number big enough to find its position in memory.

Octal

Octal is occasionally used and is written with the prefix 0o (in Java source, an octal literal is instead written with just a leading 0). It works exactly as binary does, except we now use 8 symbols, so each digit represents a power of 8. Our table now is:

Decimal 1 2 3 4 5 6 7 8 9 10 ... 100
Binary 1 10 11 100 101 110 111 1000 1001 1010 ... 1100100
Octal 1 2 3 4 5 6 7 8 10 11 ... 144

Hexadecimal

Hexadecimal numbers are numbers where each digit represents a power of 16. This means we need 16 symbols, and since we usually count in decimal we only have ten to use, so for the extra six we use A, B, C, D, E and F, where A is equivalent to 10 in decimal and F to 15. Our prefix for hex is 0x. Hexadecimal is often used because when talking about locations in memory, or large numbers in general, binary would take a lot of characters to write the same number, whereas each hex digit packs in exactly four bits. We don't use decimal for this because 10 is not a power of 2, so decimal digits don't line up neatly with bits.

Data stored in bits is grouped into sets of 8 bits, which we call a byte. If we had stored the decimal number 218 in a computer, it would store the value 0b11011010. With hex we can instead write this as 0xDA, and you should note that the maximum number represented by 8 bits (0b11111111) is also the maximum number represented by 2 hex digits (0xFF), which is 255 in decimal. When we include 0, we have 256 total values either way.
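In Java these conversions are one-liners; for example, dropped into any main method (using the 218 from above):

int value = 218;
System.out.println(Integer.toBinaryString(value)); // 11011010
System.out.println(Integer.toHexString(value));    // da
System.out.println(0xDA == value);                 // true
System.out.println(0xFF);                          // 255, the largest value two hex digits (one byte) can hold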

Our table now is:

Decimal 1 2 3 4 5 6 7 8 9 10 ... 100
Binary 1 10 11 100 101 110 111 1000 1001 1010 ... 1100100
Octal 1 2 3 4 5 6 7 8 10 11 ... 144
Hex 1 2 3 4 5 6 7 8 9 A ... 64

Numeric Java Data Types

For the sake of speeding things along, I'm only considering primitive types here, starting with the most familiar.

  1. ints are 32 bits or 4 bytes in size. The first bit determines whether the value is positive or negative, so the maximum value is \(2^{31}-1\); the \(-1\) is there because one of the non-negative bit patterns is used up by the value 0. In total there are \(2^{32}\) possible values once we include 0 and all the negative numbers. In Java we can write a literal in a different base, such as binary or hex, by using its prefix; for example, int a = 0xFE; gives the integer the value 254 (a short example of the prefixes follows below). All integral data types in Java are signed.
  2. longs are 64 bits or 8 bytes in size. They behave exactly like ints, but can store much larger values since they use 8 bytes instead of 4.
  3. shorts are also just like ints, but only use 2 bytes, giving a maximum value of \(2^{15}-1\).
  4. bytes are the smallest numeric type you can have in Java and, as the name suggests, take up a single byte. They have a maximum value of 127 and a minimum value of -128.

Lastly, floats and doubles can be used to represent floating point numbers (such as 15.23). They are 4 and 8 bytes in size respectively. Since they have little to do with integral numbers, see here for how floating point numbers are represented: https://www.doc.ic.ac.uk/~eedwards/compsys/float/
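To tie the list above together, here's a small sketch of the literal prefixes and the limits the boxed types expose (the variable names are just for illustration); it can be dropped into any main method:

int  fromHex    = 0xFE;           // 254
int  fromBinary = 0b11111110;     // 254 - binary literals need Java 7+
int  fromOctal  = 0376;           // 254 - a leading zero means octal in Java source
long big        = 9_000_000_000L; // too large for an int, hence the L suffix

System.out.println(Integer.MAX_VALUE); //  2147483647  (2^31 - 1)
System.out.println(Integer.MIN_VALUE); // -2147483648  (-2^31)
System.out.println(Short.MAX_VALUE);   //  32767       (2^15 - 1)
System.out.println(Byte.MIN_VALUE);    // -128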

Now when we say that a type has a size of n bytes, we mean that regardless of what number it holds, the same amount of memory is used to store it. For example, if we are calculating with the number 7 of type int, the bits stored in memory are actually 0b0000_0000_0000_0000_0000_0000_0000_0111 (the underscores are just for readability). We could also write this in hex as 0x00000007.
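Printing such a value won't show the leading zeros, but padding them back in makes the fixed width visible; a quick sketch, again for any main method:

int seven = 7;
// Integer.toBinaryString drops leading zeros, so pad back out to the full 32 bits.
String bits = String.format("%32s", Integer.toBinaryString(seven)).replace(' ', '0');
System.out.println(bits);                            // 00000000000000000000000000000111
System.out.println(String.format("0x%08X", seven));  // 0x00000007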

A Hex Example

Suppose we have received a message from a socket that we know is a 4 byte integer, and we have the 4 bytes in a byte[] array. If we were to inspect the array we'd see something along the lines of 4 entries all of the form 0x??. We could use ByteBuffer.wrap(myArray).getInt(); to retrieve the integer, and normally this is how we'd do it (wrap backs the buffer with the existing array rather than copying it, so it's cheap and hard to get wrong). However, this is a refresher, so we could also consider...
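Here's a minimal sketch of the ByteBuffer route, using the same four bytes as the next section (ByteBuffer defaults to Big Endian, i.e. most significant byte first, so the order call is only there to be explicit):

import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class ReadInt {
    public static void main(String[] args) {
        byte[] myArray = { (byte) 0x0E, (byte) 0x45, (byte) 0x86, (byte) 0xEF };
        int value = ByteBuffer.wrap(myArray)
                              .order(ByteOrder.BIG_ENDIAN) // the default, shown for clarity
                              .getInt();
        System.out.println(value); // 239437551, i.e. 0x0E4586EF
    }
}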

Bitwise Operations

We know we have 4 bytes, each of the form 0x??; for an example we will use (0x0E, 0x45, 0x86, 0xEF). Putting these bytes together, we can see that if we were to write int a = 0x0E4586EF; we would have an integer with decimal value 239437551. Unfortunately at runtime we can't just concatenate the bytes this way; however, using bitwise operations we can build the integer from these 4 bytes. Here are the bitwise operations:

  1. & is a logical AND, e.g. 0b10101000 & 0b11111001 = 0b10101000
  2. ^ is exclusive OR (XOR), e.g. 0b10101000 ^ 0b11111001 = 0b01010001
  3. | is inclusive OR, e.g. 0b10101000 | 0b11111001 = 0b11111001
  4. << and >> shift bits to the left and right respectively, e.g. 0b10101000 << 8 = 0b1010100000000000 (a shift to the left by 8 places). >> is an arithmetic shift, so it preserves the sign by copying the sign bit in from the left; Java also has >>>, the logical right shift, which fills with zeros instead. There's a short demonstration just after this list.
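Here's that demonstration, including the difference between the arithmetic >> and logical >>> right shifts on a negative number (drop it into any main method):

int a = 0b10101000;
int b = 0b11111001;
System.out.println(Integer.toBinaryString(a & b)); // 10101000
System.out.println(Integer.toBinaryString(a ^ b)); // 1010001 (toBinaryString drops the leading zero)
System.out.println(Integer.toBinaryString(a | b)); // 11111001

int negative = -8;                  // ...11111000 in two's complement
System.out.println(negative >> 1);  // -4          - the sign bit is copied in from the left
System.out.println(negative >>> 1); // 2147483644  - a zero is shifted in from the left instead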

So we can convert our byte array as such:

int i = (myArray[0]<<24) & 0xff000000 | 
        (myArray[1]<<16) & 0x00ff0000 | 
        (myArray[2]<< 8) & 0x0000ff00 |
        (myArray[3]<< 0) & 0x000000ff;

That is, we shift each byte into its correct position, use bitwise AND to mask off the unwanted bits (bytes are signed in Java, so the promotion to int sign-extends and can set the high bits), then use inclusive OR to combine the four bytes into one integer.
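Going the other way is just the shifts in reverse; here's a sketch of splitting an int back into four bytes (the masking to the low 8 bits happens implicitly in the cast to byte):

static byte[] toBytes(int value) {
    return new byte[] {
        (byte) (value >>> 24), // most significant byte first
        (byte) (value >>> 16),
        (byte) (value >>> 8),
        (byte) value
    };
}
// toBytes(0x0E4586EF) gives back { 0x0E, 0x45, (byte) 0x86, (byte) 0xEF }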

Permalink Java, Programming

The encryption genie is out of the bottle. - Jan Koum

Public Key Encryption, Perfect (Forward) Secrecy and Backdoors

1pm on Wednesday 18th October, 2017

Textbook RSA

The first thing discussed here is textbook RSA, to illustrate how simply a public key system can be understood. Having said that, textbook RSA as presented here is not secure on its own. In terms of mathematics, all we need to be familiar with is multiplication, since anything else we need can be covered briefly. Having some intuition into what public key encryption looks like will be helpful in some of the discussion later.

Some Quick Notation

First, we say \(a\) evenly divides \(b\) if when \(b\) is divided by \(a\) there is no remainder left. For example, \(\frac{10}{5}=2\), so \(5\) evenly divides \(10\). This is written as \(a | b\) and if \(a|b\) we say \(a\) is a divisor of \(b\).

Now, for \(5 \times 5 \times \dots \times 5\) where \(k\) fives have been multiplied together, we write \(5^k\); we will refer to this as exponentiation by \(k\). Some may also need to refamiliarise themselves with prime numbers (wiki link).

On top of this, we say two numbers are co-prime if they share no divisors other than 1. \(7\) and \(10\) are co-prime because the divisors of \(10\) (\(2\) and \(5\)) are not divisors of \(7\). We will write the number of numbers that are less than a number \(N\) and co-prime to \(N\) as \(\phi{(N)}\). So \(\phi{(7)}=6\), since the numbers co-prime to (and less than) \(7\) are \(1,2,3,4,5\textrm{ and }6\). You will probably notice that for a prime number \(p\) we have \(\phi{(p)}=p-1\), since the only divisor of \(p\) other than 1 is \(p\) itself, which obviously isn't less than \(p\). We will use the result \(\phi{(pq)} =(p-1)(q-1)\), where \(p,q\) are both prime, when we describe RSA - note that this isn't true if they aren't both prime.
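For small numbers we can compute \(\phi\) by brute force, which makes the definition concrete. A quick Java sketch (purely illustrative - nothing like this is practical at real RSA sizes):

// Count the k in 1..n-1 that share no divisor (other than 1) with n.
static int phi(int n) {
    int count = 0;
    for (int k = 1; k < n; k++) {
        if (gcd(k, n) == 1) count++;
    }
    return count;
}

static int gcd(int a, int b) {
    return b == 0 ? a : gcd(b, a % b);
}

// phi(7)  -> 6
// phi(33) -> 20, matching (3 - 1) * (11 - 1)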

Lastly, when we reduce a number \(\mod N\) we get its remainder when divided by \(N\). For example: $$10 \equiv 3 \mod 7, \\ 12 \equiv 5 \mod 7.$$

The RSA Cryptosystem

To set up RSA we need two different prime numbers, which we will call \(p\) and \(q\); in a real world setting these would be huge (typically more than 300 digits each in base 10), so that it would take an impractically long time for computers to factor their product \(N=pq\). There are easy, efficient methods to find candidates for \(p,q\) which I won't cover here.

Now for an important theorem (Euler's theorem): if \(a\) and \(N\) are co-prime, then \(a^{\phi{(N)}}\equiv1 \mod N\). I won't cover the proof here because it requires more background, but this theorem is the key to RSA encryption and decryption. We now have all the tools we need to understand how to encrypt and decrypt with RSA.

For some message \(M < N\) we encrypt to ciphertext \(c\) with \(c = M^e \mod N\), where \(e\) is some number co-prime to \(\phi{(N)}\) that we can choose freely. Once we have decided on \(e\) we can use Euclid's Algorithm to find a \(d\) such that \(e \times d \equiv 1 \mod \phi{(N)}\), where we have found \(\phi{(N)}\) by \(\phi{(N)}=(p-1)(q-1)\). We can use such a \(d\) to decrypt since (assuming for simplicity that \(M\) is co-prime to \(N\), which is overwhelmingly likely; the result still holds in general via a slightly longer argument) $$\begin{aligned} (M^e)^d &\equiv M^{1+k\phi{(N)}} &\mod (N) &\hspace{3em} \textrm{for some integer }k\\ &\equiv M(M^{k\phi{(N)}}) &\mod (N) &\\ &\equiv M(M^{\phi{(N)}})^k &\mod (N) &\\ &\equiv M \cdot 1^k &\mod (N) &\hspace{3em} \textrm{by the theorem}\\ &\equiv M &\mod (N) \end{aligned}$$

Hence exponentiation by \(d\) is the inverse of exponentiation by \(e\), so we can use these two operations to encrypt and decrypt. The key to this system is that given some \(e\), we cannot find \(d\) such that \(ed \equiv 1 \mod \phi{(N)}\) unless we know what \(\phi{(N)}\) is, and the value of \(\phi{(N)}\) cannot be calculated without first knowing \(p,q\) such that \(N=pq\). So as long as we keep \(p,q\) and \(d\) secret, we can publicly share \(N,e\) so that other people can encrypt messages, but only we can decrypt them. For anyone else to decrypt, they would need the value of \(\phi{(N)}\), and to get that they would first have to find \(p,q\) by factoring \(N\).

Example

Setup: \(p=3, q=11, N=p \times q=33,\,\,\phi{(33)} = 2 \times 10 = 20\) and we choose \(e = 3\) so by the euclidean algorithm \(d = 7\) since \(e \times d \equiv 3 \times 7 \equiv 1 \mod (\phi{(N)} = 20)\). We then publish \(N\) and \(e\) and keep \(d\) private.

Encrypt: Someone wishes to send us the message \(5\). Using the public values of \(N\) and \(e\) our encrypted message is then \(c \equiv m^e \equiv 5^3 \equiv 125 \equiv 26 \mod (N = 33)\). So they would then send the encrypted message, \(26\) which we will refer to as \(c\).

Decrypt: To retrieve the message we can then use our private value \(d\) to calculate \(c^d \equiv 26^7 \mod 33 \equiv 5 \mod 33\) and we have retrieved the secret message \(5\).

With our small values of \(p\) and \(q\) we can easily find what was encrypted by checking every value, for example if \(16\) was sent we could simply calculate every value \(1^3, 2^3, 3^3, 4^3, ...\) until we find one that equals \(16\). This doesn't require the secret value of \(d\), but when we use large enough values of \(p\) and \(q\) the number of possibilities to check becomes computationally infeasible.
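The whole toy example fits in a few lines of Java, since BigInteger has modular exponentiation and modular inverses built in; here's a sketch using the same numbers as above (the class name is just mine):

import java.math.BigInteger;

public class ToyRsa {
    public static void main(String[] args) {
        BigInteger p = BigInteger.valueOf(3), q = BigInteger.valueOf(11);
        BigInteger n   = p.multiply(q);                          // N = 33
        BigInteger phi = p.subtract(BigInteger.ONE)
                          .multiply(q.subtract(BigInteger.ONE)); // phi(N) = 20
        BigInteger e = BigInteger.valueOf(3);
        BigInteger d = e.modInverse(phi);                        // d = 7

        BigInteger m = BigInteger.valueOf(5);                    // the secret message
        BigInteger c = m.modPow(e, n);
        System.out.println(c);              // 26, the encrypted message
        System.out.println(c.modPow(d, n)); // 5, decrypted using the private d
    }
}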

Real-world RSA

I have covered RSA here in a very basic way that is hopefully accessible to those without any mathematical background. There are many nuances and improvements in RSA as it is used today, but they all hinge on the theorem and calculations above. Many of the problems with textbook RSA are solved by padding schemes such as OAEP. There are also performance improvements, such as using the Chinese Remainder Theorem for decryption, and recommendations on parameter choices that matter in real world use, but these are not covered here. The details are easy to find online or in many textbooks on the matter if one wished to implement the RSA cryptosystem themselves - although it is advised that an open source, widely vetted implementation such as the one in OpenSSL is used instead.

What is Perfect Secrecy and Perfect Forward Secrecy?

Perfect secrecy is the idea that the ciphertext conveys no information about the original message - given the ciphertext, every possible message is equally likely to have been the one sent. Perfect forward secrecy is the idea that even if a long-term private key is leaked at a later date, previously sent messages still cannot be decrypted, because a separate session key was generated for each session and then discarded.

Encryption Backdoors, Kerckhoffs's principle and Open Source Cryptography

There is much talk of governments introducing so-called 'backdoors' into cryptography, but there are a few key problems with this:

  • Mathematics is mathematics: we can't change the underlying algorithms to support this idea, and since these algorithms are widely known and published, anyone can create secure encryption using them. In the case of AES, many processors already include dedicated instructions that implement it.
  • In cryptography we have Kerckhoffs's principle - a cryptosystem should be secure even if everything about it, except the key, is public knowledge. This means that for us to be even moderately convinced that a cryptosystem is secure, all of its inner workings and details must be public. Now pair this idea with Open Source...
  • Most widely used encryption software is Open Source. Following from the last point, it's much more secure to use an Open Source encryption package than a proprietary one, because more eyes have been able to spot and fix flaws. It also means that introducing a backdoor on the software side is never going to be inconspicuous.

Considering the above, backdooring the cryptosystems themselves doesn't really make sense. If one were backdoored it would be seen and fixed, or if necessary anyone could simply go and create their own. The answer isn't to backdoor the encryption itself; rather, it's up to companies to design their systems in a way that doesn't hinder law enforcement. Ultimately, if encryption is being used by ne'er-do-wells, there is not much that can be done to stop them from having access to it.

Permalink Maths, Cryptography, Encryption

udp servers with netty...

Netty (Doesn't?) suck at UDP Servers

10pm on Monday 23rd May, 2016

I've been using Netty 4 (and the now defunct 5 alpha) at work for a UDP server that receives messages from OBD devices. So far my experience has been spotty, and here's why...

The UDP Example

Here's the UDP Quote Of The Moment example from Netty's own guide:

public final class QuoteOfTheMomentServer {

    private static final int PORT = Integer.parseInt(System.getProperty("port", "7686"));

    public static void main(String[] args) throws Exception {
        EventLoopGroup group = new NioEventLoopGroup();
        try {
            Bootstrap b = new Bootstrap();
            b.group(group)
             .channel(NioDatagramChannel.class)
             .option(ChannelOption.SO_BROADCAST, true)
             .handler(new QuoteOfTheMomentServerHandler());

            b.bind(PORT).sync().channel().closeFuture().await();
        } finally {
            group.shutdownGracefully();
        }
    }
}

With something akin to the above code on our test environment we were seeing ~250 packets/sec being handled, even when an external executor group was used for our blocking code. After a few tests had been run we noticed that running our application multiple times on different ports could easily triple our throughput, which signalled something wasn't quite right within our application.

The fact that Netty's own guide gets things wrong is more a result of the actual problem than of bad documentation - and this is probably why it's not been fixed. The problem itself is that even with an NioEventLoopGroup of 2*Cores threads, the DatagramChannel will only ever use one of them. It's a simple consequence of the way Netty handles channels - one thread per channel. UDP is connectionless, so we only ever have the one DatagramChannel. Having a boss thread delegate packets to worker threads would solve this problem (very easy to say, much harder to actually implement), but this hasn't made its way into Netty yet.

The simple answer here is to just bind to the port 2*Cores times so that all the threads get used. Whilst this sounds like a simple solution, we actually need to consider some options and compatibility issues first...

SO_REUSEADDR? No wait, it's SO_REUSEPORT...

If we try to rebind to a port that's already bound we'll, quite obviously, just get a SocketException: Address already in use. To get around this we can set the only cross-platform option, .option(ChannelOption.SO_REUSEADDR, true), and then we can bind multiple times. Okay! We're getting somewhere, but there's a small issue with how SO_REUSEADDR is defined.

SO_REUSEADDR is great for broadcast/multicast packets, but on some operating systems the OS layer will only deliver packets to one of the bound sockets and no others. SO_REUSEADDR does not have a common definition between operating systems. For example, on Windows another application can bind a socket to the port without this option set and kick all the previously bound sockets off (although there is a fix for this).

Now the actual socket option we need is SO_REUSEPORT, which will happily distribute packets between the listening sockets in a fair fashion. Of course this comes with downsides too: through Netty it's only available on Linux (kernel 3.9+) via Netty's native epoll transport. On top of this, like SO_REUSEADDR, it's not well defined across platforms, and Windows doesn't have the option at all, since on Windows SO_REUSEADDR also does this (which actually makes sense for once).

Below is a Linux only implementation which will ensure all threads are used:

public final class QuoteOfTheMomentServer {

    private static final int PORT = Integer.parseInt(System.getProperty("port", "7686"));

    private static final int THREADS = Runtime.getRuntime().availableProcessors() * 2; // Default EventLoopGroup Size

    public static void main(String[] args) throws Exception {
        EventLoopGroup group = new EpollEventLoopGroup(THREADS);
        try {
            Bootstrap b = new Bootstrap();
            b.group(group)
             .channel(EpollDatagramChannel.class)
             .option(ChannelOption.SO_BROADCAST, true)
             .option(EpollChannelOption.SO_REUSEPORT, true)
             .handler(new QuoteOfTheMomentServerHandler());

            List<ChannelFuture> futures = new ArrayList<>(THREADS);
            // Bind THREADS times
            for(int i = 0; i < THREADS; ++i) {
                futures.add(b.bind(PORT).await());
            }

            // Now wait for all to be closed (if ever)
            for (final ChannelFuture future : futures) {
                future.channel().closeFuture().await();
            }
        } finally {
            group.shutdownGracefully();
        }
    }
}

Since many of the developers I work with aren't too familiar with Linux, it was important that we fell back to the Nio implementation on other operating systems, which creates somewhat messy and confusing platform specific code - not really something you expect in Java. It also leaves a TODO for some future developer to deal with:

* TODO - Evaluate JDK9 for SO_REUSEPORT availability and remove platform specific code
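For reference, the fallback boiled down to something like the sketch below (a simplification of what we shipped, not code from the Netty guide; PORT, THREADS and the handler are as in the earlier listing, and imports are omitted): check Epoll.isAvailable() and pick the transport accordingly, only enabling SO_REUSEPORT and binding multiple times on the epoll path.

boolean epoll = Epoll.isAvailable(); // true only on Linux with the native transport on the classpath

EventLoopGroup group = epoll ? new EpollEventLoopGroup(THREADS) : new NioEventLoopGroup();

final Class<? extends DatagramChannel> channelType;
if (epoll) {
    channelType = EpollDatagramChannel.class;
} else {
    channelType = NioDatagramChannel.class;
}

Bootstrap b = new Bootstrap()
        .group(group)
        .channel(channelType)
        .option(ChannelOption.SO_BROADCAST, true)
        .handler(new QuoteOfTheMomentServerHandler());

if (epoll) {
    // Only the native transport exposes SO_REUSEPORT, so only then does binding
    // the same port several times actually spread packets across the threads.
    b.option(EpollChannelOption.SO_REUSEPORT, true);
}

int binds = epoll ? THREADS : 1; // extra binds are pointless (or fail) without SO_REUSEPORT
for (int i = 0; i < binds; i++) {
    b.bind(PORT).sync();
}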

Permalink Java, Programming, Netty

Creating this Blog

11pm on Wednesday 18th June, 2014

Note: This blog post is terribly out of date.

Why?

I started creating this blog because of a few simple reasons:

  1. WordPress is too much
  2. Nothing is more extensible than what I build myself
  3. I wanted to try out a microframework (in this case dispatch).

Getting Started

Initially I started with a very simple (and common!) page layout for inspiration:

[Image: a simple centred block layout]

This flourished into the design you see today, originally with harsh reds and blacks but now with a range of greys and a variety of header images. This was completed in Photoshop, with the full page layout you see now completely planned out.

The page was then replicated exactly from this design, which allowed me to tweak it as necessary.

Development

The entire development of the blog (so far) probably took around 8 hours, including the time spent playing about with the design and testing. I'm also using a modified version of parsedown. Most of the changes relate to code blocks, letting me create code boxes with line numbers that don't interfere with copying and pasting! Here's an example of some JavaScript that detects key presses:

document.addEventListener("DOMContentLoaded", function(){
    // vanilla document ready
    document.addEventListener("keyup", function(e){
        if(e.keyCode==83 && e.altKey==true){
            e.preventDefault();
            alert("Alt+S pressed");
        }
        if(e.keyCode==80 && e.altKey==true){
            e.preventDefault();
            alert("Alt+P pressed");
        }
        if(e.keyCode==79 && e.altKey==true){
            e.preventDefault();
            alert("Alt+O pressed");
        }
    });
}); 

I then started work on the actual blogging part of the site, created in php with dispatch (as discussed earlier).

What's Next?

There are still a lot of things I would like to add, namely

  • Comments
  • A proper user-system [1]
  • Ability to integrate with my personal projects more

For now though, this is my blog, and hopefully the ease it affords me entices me to blog some more. We'll see though.

[1]: Since writing this I've added one!

Permalink Blog, Programming, PHP, Design
