Dare mighty things – ASCII

Introduction

NASA’s Perseverance Mars rover landing was a historic moment. For the first time, we saw footage of a rover landing on Mars. The event was huge for the many people watching on their screens, and it will inspire us for years.

Every image we saw after the landing contained a new surprise, and one of them was the talk of a hidden message found on the NASA Mars rover’s parachute.

Internet sleuths claim to have decoded a hidden message displayed on the parachute that helped NASA’s Perseverance rover land safely on Mars last week. They claim that the phrase “Dare mighty things”, used as a motto by NASA’s Jet Propulsion Laboratory, was encoded on the parachute using a pattern representing letters as binary computer code, and that it can be translated (skipping some details) using, not surprisingly, ASCII encoding.

Reddit users and social media posters on Twitter noticed that the red-and-white pattern on the parachute looked deliberate, and arrived at the result by using the red to represent the figure one, and the white to represent zero.
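To make the idea concrete, here is a rough sketch of that decoding step. It assumes red = 1, white = 0 and plain 8-bit ASCII codes; the `decodeStripes` helper and the stripe sequence are made up for illustration, and the real parachute pattern involved extra details that are skipped here.

```javascript
// Hypothetical helper: turn a sequence of stripe colours into text,
// assuming red = 1, white = 0 and plain 8-bit ASCII codes.
function decodeStripes(colours) {
  const bits = colours.map((c) => (c === 'red' ? '1' : '0')).join('');
  let text = '';
  // Read the bit string eight bits at a time; each group is one character code.
  for (let i = 0; i + 8 <= bits.length; i += 8) {
    text += String.fromCharCode(parseInt(bits.slice(i, i + 8), 2));
  }
  return text;
}

// 'D' is 68 in ASCII, which is 01000100 in binary.
const stripes = ['white', 'red', 'white', 'white', 'white', 'red', 'white', 'white'];
console.log(decodeStripes(stripes)); // prints "D"
```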

The story was published in media all over the world, and with that, ASCII was in the news again.

In this post, we will discuss ASCII, cover a little bit of its history, and learn why it is so hard not to fall in love with it.

ASCII protocol

“ASCII is an encoding”. It’s not a protocol; protocols can be built on top of ASCII.

That’s true, but in a lot of documents on the internet you will find ASCII referred to as a protocol as well. I want to say this upfront, in case you see me talking about ASCII as a protocol too.

Seen that way, ASCII is the simplest communications protocol for text. It transmits only ASCII characters and uses ASCII control codes, and it implies little or no error checking.

You may not believe this but ASCII is really powerful. It’s the only data format that can be universally decoded by any computer on the planet.

Not many people know that ASCII was birthed in the 1960s when Bell Labs had a need for a standard way to send text. They reorganized telegraphic codes, sorted them, and worked with the American Standards Association (ASA) to form ASCII (American Standard Code for Information Interchange). As computers were developed in the 1960s, ASCII became THE standard for sending information.

XML, the eXtensible Markup Language, is a way of transferring computer data from one place to another, and it can be built solely with ASCII codes. Every XML element starts and ends with the angle bracket ASCII codes. Even numeric data is coded in ASCII. An ink pressure value, for example, is encoded as a series of ASCII codes such as:

<InkPressure>2.3145</InkPressure>
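A small sketch of what “numeric data coded in ASCII” means in practice: the number travels as characters, and the receiver has to parse it back. The regular expression below is only an illustration for this one element, not a general XML parser.

```javascript
// The number travels as text: each character is one ASCII code.
const xml = '<InkPressure>2.3145</InkPressure>';
const bytes = Buffer.from(xml, 'ascii');

console.log(bytes.length);          // 33 bytes, one per character
console.log(bytes[0].toString(16)); // "3c", the ASCII code for "<"

// The receiver has to parse the digits back into a number.
const match = /<InkPressure>([^<]+)<\/InkPressure>/.exec(bytes.toString('ascii'));
const pressure = parseFloat(match[1]);
console.log(pressure);              // 2.3145
```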

Why is there so much ASCII, you might ask? It’s how humans communicate. We use letters, numbers, and symbols. There are labels on products, markings on boxes, etchings on car tires, and much, much more. Computers must read all of this, and they often do so with barcode readers that translate those funny lines into ASCII characters, which then have to be moved around between systems.

You will find a common use of ASCII in command/query scenarios, where you can send ASCII commands to industrial devices and these devices perform actions or send status information back.

What is a Protocol

When you meet the queen, someone from the palace communicates the protocol and it is followed. When you type a web address in the browser address bar and press Enter, a whole bunch of TCP/HTTP communication happens behind the scenes and a web page is shown in the browser. This communication follows certain protocols.

“Network protocol” has many levels of meaning. Most commonly, it is the means by which data packets are transferred between computers. Today, however, we will ignore the means of transport and examine the contents of the transmissions, specifically the messages that application programs must send to the network interface.

Bits, Bytes and Representation of Information

OK, we talked a little bit about ASCII, and later I will show you some sample code for ASCII usage. But before that, let’s take a detour and learn a little about encoding and storage in the digital world.

I have written a post on encoding which you can read if more information is needed. Here I will just mention it briefly.

Digital representation means that everything is represented by numbers only. The usual sequence:

  • Something (sound, pictures, text, instructions….) is converted into numbers by some mechanism.
  • These numbers can be stored, retrieved, processed, transmitted.
  • The numbers might be reconstituted into a version of the original.
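The three steps above can be sketched in a few lines for the simplest case, text. The sample word is arbitrary; the same idea applies to sound or pictures, just with a different conversion mechanism.

```javascript
// Step 1: convert text into numbers (here, one ASCII code per character).
const original = 'Mars';
const numbers = Array.from(original, (ch) => ch.charCodeAt(0));
console.log(numbers); // [ 77, 97, 114, 115 ]

// Step 2: the numbers can be stored or transmitted as raw bytes.
const stored = Uint8Array.from(numbers);

// Step 3: reconstitute a version of the original from the numbers.
const restored = String.fromCharCode(...stored);
console.log(restored); // "Mars"
```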

Binary, Octal, Decimal, Hex

There are many ways to write numbers. For example:

10011111 (binary) is equal to:

237 in octal.
159 in decimal.
9F in hexadecimal

They all represent the same value, but hexadecimal is shorter and easier to read than binary. That’s why you will see hexadecimal commands in communication manuals of many industrial devices.
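If you want to check such conversions yourself, JavaScript can do them with `toString` and `parseInt`. This is just a quick sketch; the variable name is arbitrary.

```javascript
const value = 0b10011111; // binary literal

console.log(value.toString(8));                // "237"
console.log(value.toString(10));               // "159"
console.log(value.toString(16).toUpperCase()); // "9F"

// Parsing works the other way round: tell parseInt which base to use.
console.log(parseInt('9F', 16) === parseInt('237', 8)); // true
```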

To send data back and forth over the network, you will need to use bytes. A byte is a group of 8 bits. One byte can represent a decimal number between 0 and 255.

[STX][status][type][length][user data…][checksum][ETX]

Look at the command above. As you may already know, a computer cannot store “letters”, “numbers”, “pictures” or anything else. The only things it can store and work with are bits. A bit can only have two values: yes or no, true or false, 1 or 0, or whatever else you want to call these two values.

To use bits to represent anything at all besides bits, we need rules. We need to convert a sequence of bits into something like letters, numbers and pictures using an encoding scheme or encoding for short.

The encoding scheme used by the command above happens to be ASCII. In total, there are 128 characters defined in the ASCII encoding.
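Because there are only 128 characters, every ASCII code fits in 7 bits. Here is a small sketch of that; `isAscii` and `toBits` are hypothetical helper names, not part of any library.

```javascript
// ASCII defines codes 0 to 127, so every ASCII character fits in 7 bits.
function isAscii(text) {
  return Array.from(text).every((ch) => ch.codePointAt(0) <= 127);
}

// Show a character's code as a 7-bit binary string.
function toBits(ch) {
  return ch.charCodeAt(0).toString(2).padStart(7, '0');
}

console.log(toBits('A'));           // "1000001" (65 in decimal)
console.log(isAscii('Hello'));      // true
console.log(isAscii('H\u00E9llo')); // false, "é" is outside ASCII
```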

You’re never actually directly dealing with “characters” or “text”, you’re always dealing with bits as seen through several layers of abstractions.

You can convert bytes to a string and read them only if the binary code matches a Unicode character in the encoding you are using; otherwise, the conversion from bytes to string will result in corrupted characters.

Unicode is a standard that specifies, amongst other things, what characters are available.

UTF-8 is a character encoding that specifies how these characters shall be physically encoded in 1s and 0s. UTF-8 can use 1 byte for ASCII (<= 127) and up to 4 bytes to represent other Unicode characters.
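A quick way to see the variable length of UTF-8 is to encode a few characters and count the bytes. The sketch below uses the standard `TextEncoder` and `TextDecoder` classes; the sample characters are arbitrary.

```javascript
const encoder = new TextEncoder(); // always encodes to UTF-8

// "A", "é", "€" and an emoji, written as escapes to avoid editor encoding issues.
for (const ch of ['A', '\u00E9', '\u20AC', '\u{1F642}']) {
  console.log(ch, encoder.encode(ch).length);
}
// Byte counts printed: 1, 2, 3 and 4

// Bytes that are not valid UTF-8 decode to the replacement character U+FFFD.
const broken = new TextDecoder('utf-8').decode(Uint8Array.of(0x48, 0xff));
console.log(broken === 'H\uFFFD'); // true
```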

If two systems are talking to each other, they always need to specify which encoding they want to use. The simplest example is the website where you are reading this article: it tells your browser that it’s encoded in UTF-8.

STX / ETX (ASCII Protocol)

Remember our command? It uses STX and ETX, which refer to ASCII control characters.

[STX][status][type][length][user data…][checksum][ETX]

An ASCII control character can’t be displayed, so it is common to write [STX] and [ETX] to show where they go, but you have to replace them with their ASCII codes.

It means that [STX] must be replaced by a single ASCII character of value 0x02.
It means that [ETX] must be replaced by a single ASCII character of value 0x03.

STX/ETX is a simple packet protocol that can be wrapped around user data. Here are two of the reasons for doing so:

  • Packetization is needed: the receiver must be able to tell where each message starts and ends.
  • STX/ETX packets add a checksum to your data, which allows the receiver to verify that the data was received correctly and is error-free.

The two delimiters are:

  • STX: the ASCII character for start-of-text (0x02).
  • ETX: the ASCII character for end-of-text (0x03).
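Here is a minimal sketch of how such a packet could be built and verified. Device manuals differ on the details, so treat everything here as an assumption: status, type and length are taken to be one byte each, and the checksum is taken to be the XOR of all bytes between STX and the checksum. The function names are hypothetical.

```javascript
const STX = 0x02;
const ETX = 0x03;

// Assumed checksum: XOR of every byte between STX and the checksum itself.
function xorChecksum(bytes) {
  return bytes.reduce((acc, b) => acc ^ b, 0);
}

// Build [STX][status][type][length][user data...][checksum][ETX].
// Assumes status, type and length are one byte each (hypothetical layout).
function buildFrame(status, type, userData) {
  const data = Buffer.from(userData, 'ascii');
  if (data.length > 255) throw new RangeError('user data too long for a one-byte length');
  const body = Buffer.concat([Buffer.from([status, type, data.length]), data]);
  return Buffer.concat([Buffer.from([STX]), body, Buffer.from([xorChecksum(body), ETX])]);
}

// Receiver side: check the delimiters, the length and the checksum.
// Returns null when any of the checks fails.
function parseFrame(frame) {
  if (frame.length < 6 || frame[0] !== STX || frame[frame.length - 1] !== ETX) return null;
  const body = frame.subarray(1, frame.length - 2);
  if (body[2] !== body.length - 3) return null;
  if (xorChecksum(body) !== frame[frame.length - 2]) return null;
  return { status: body[0], type: body[1], userData: body.subarray(3).toString('ascii') };
}

const frame = buildFrame(0x30, 0x31, 'HELLO');
console.log(frame.toString('hex'));
console.log(parseFrame(frame));
```

If a single byte of the frame is changed in transit, the checksum no longer matches and `parseFrame` returns `null`, which is the cue for the receiver to reject the packet.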

You might already have seen these command structures for communication with devices. Typically, a device provides some sort of interface, e.g. an Ethernet or RS232 port; commands are sent to the device, and the device sends back the expected response.

What is 0X

The “0X” means the number is hex.
0x01 = 1 in decimal = 00000001 in binary
0x10 = 16 in decimal = 00010000 in binary
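JavaScript understands the same notation, so you can try it out directly. The `describeByte` helper below is a made-up name, used only to print the three forms side by side.

```javascript
// JavaScript accepts the 0x prefix directly in source code.
console.log(0x01); // 1
console.log(0x10); // 16

// Hypothetical helper: show a byte value in hex, decimal and binary.
function describeByte(n) {
  const hex = '0x' + n.toString(16).toUpperCase().padStart(2, '0');
  const bin = n.toString(2).padStart(8, '0');
  return `${hex} = ${n} in decimal = ${bin} in binary`;
}

console.log(describeByte(0x10)); // "0x10 = 16 in decimal = 00010000 in binary"
console.log(describeByte(0x02)); // "0x02 = 2 in decimal = 00000010 in binary"
```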

Code Example – Sending ASCII Commands over TCP/IP

We will see an example shortly, but here is the general flow:

Assuming you already have your communication link established, just send the appropriate string to that link. STX would be “\x02” and ETX would be “\x03”, and thus the command H would be “\x02H\x03”.
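As a small sketch, this is what such a string looks like once it is turned into bytes, using ETX (0x03) as the closing character as defined earlier. The single-letter command H is just a placeholder.

```javascript
// "\x02" and "\x03" are escape sequences for the STX and ETX control characters.
const command = '\x02H\x03';

console.log(command.length); // 3 characters: STX, "H", ETX

// What actually goes over the link is the byte form of that string.
const bytes = Buffer.from(command, 'ascii');
console.log(bytes); // <Buffer 02 48 03>
```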

OK, so we can send ASCII commands to devices using various communication channels, e.g. TCP/IP, RS232, etc. I will be using TCP for the demo, and the implementation is in C# (we will see a Node.js example as well); if you have a Java background, the code will look familiar.

What are we building?

So, the example is a simple TCP Client/Server application.

Now, the client can be anything: it can be a device, a process running on Linux, or a .NET Core console application (which is the case here).

The same goes for the server: it can be an industrial device or a rover on Mars. Remember when I mentioned that ASCII is the only data format that can be universally decoded by any computer on the planet? Well, it’s true for computers on Mars too 🙂 . However, for this demo, it will be a second .NET Core console application.

Here is the output from our TCP Client/Server application:

Wait…What??? – What’s that…?

I am glad that you asked :). I will now explain the code. The code is available on GitHub, so you can download it for your reference if you want.

First, let’s see the solution structure. The solution has a .NET Core application for the TCP Client and another .NET Core application for the TCP Server. There is also a Standard Library project for some reference code.

TCP Client

AsciiDemo.TestApp is our TCP Client, and here is the code from its Program.cs file:

The Main method is the entry point to the application:

The code is very simple, and if you have questions, feel free to ask. We are preparing some commands using ASCII encoding, converting them to bytes, and then sending them over the wire to our TCP Server. Here, again, is the console output from our TCP Client.

Notice that we are getting back an ACK or NAK from the TCP Server (we will see that part next). So our TCP Client can send commands to the server and can also receive responses from it.

You will also notice some funny characters on the console. These are the control characters we were talking about earlier; this is the kind of output you get when you try to print them on a console.
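If those funny characters bother you, one option is to replace control characters with readable names before printing. This is only a sketch and not part of the demo code; `toPrintable` and the small name table are made up for illustration.

```javascript
// Names for a few ASCII control characters (codes below 0x20 are not printable).
const CONTROL_NAMES = { 0x02: 'STX', 0x03: 'ETX', 0x06: 'ACK', 0x15: 'NAK' };

// Hypothetical helper: make a received buffer safe to print on a console.
function toPrintable(bytes) {
  return Array.from(bytes, (b) => {
    if (CONTROL_NAMES[b]) return `[${CONTROL_NAMES[b]}]`;
    // Any other non-printable byte is shown as its hex code.
    if (b < 0x20 || b > 0x7e) return `[0x${b.toString(16).padStart(2, '0')}]`;
    return String.fromCharCode(b);
  }).join('');
}

console.log(toPrintable(Buffer.from('\x02STATUS\x03', 'ascii'))); // [STX]STATUS[ETX]
console.log(toPrintable(Buffer.from([0x06])));                    // [ACK]
```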

Here is the code for Building a command using ASCII:

I am skipping the code of TCP Client SendCommand part, as this is very simple typical C# code and we will see similar code in TCP Server part. It is available on GitHub anyway.

TCP Server

The AsciiDemo.TCPListener app is our super simple TCP listener. It listens for commands on a port. Once a command is received, it simply prints it on the console (a real device would maybe perform a shutdown or read a sensor value) and then sends back a response. In this case, we are sending the bytes for ACK or NAK to simulate whether the command was successful or not, but you can implement whatever processing is needed.

Let’s see the console output from server again:

As you can see, for every command it receives from the TCP Client, the server prints it on the console and sends back an ACK or NAK, which is then received by the TCP Client, as we saw earlier.

Here is the code for TCP Server:

The code is very simple: we start up a server on an IP address and port, and then listen for data to arrive in a listening loop.

Inside the loop, when there is a connection, we read the data bytes using NetworkStream.

We translate the bytes into ASCII, print the result on the console, and then send the ACK/NAK bytes back to the TCP Client.
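For readers who do not use C#, the same flow can be sketched with Node.js and its built-in `net` module. This is not the demo’s server code, only a rough equivalent under assumptions: the rule for choosing ACK (0x06) or NAK (0x15) is invented here, the function names are hypothetical, and each received chunk is assumed to hold exactly one command.

```javascript
const net = require('net');

const STX = 0x02, ETX = 0x03, ACK = 0x06, NAK = 0x15;

// Decide the reply for one received chunk: ACK if it is framed by STX/ETX, NAK otherwise.
// (Assumed rule for this sketch; a real device would also validate the command itself.)
function replyFor(data) {
  const framed = data.length >= 2 && data[0] === STX && data[data.length - 1] === ETX;
  return Buffer.from([framed ? ACK : NAK]);
}

// Start a listener; the host and port are chosen by the caller.
function startServer(port, host = '127.0.0.1') {
  const server = net.createServer((socket) => {
    socket.on('data', (data) => {
      // Translate the bytes to ASCII text, print them, then answer.
      console.log('Received:', data.toString('ascii'));
      socket.write(replyFor(data));
    });
    socket.on('error', (err) => console.error('Socket error:', err.message));
  });
  server.listen(port, host, () => console.log(`Listening on ${host}:${port}`));
  return server;
}
```

Calling `startServer(9000)` would start the listener on an arbitrary local port. TCP is a byte stream, so a production version would buffer incoming data and split it on STX/ETX instead of trusting chunk boundaries.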

What about Node.JS Implementation

As mentioned above, you can implement this functionality in almost every programming language. Here is a basic TCP Client implementation in Node.js for your reference:
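A minimal client along these lines might look like the sketch below. It uses only the built-in `net` module; the function names, host, port and command strings are placeholders rather than the exact code from the demo, and it assumes the server answers each command with a single ACK (0x06) or NAK (0x15) byte.

```javascript
const net = require('net');

// Wrap a command in STX/ETX and return the bytes to send.
function encodeCommand(text) {
  return Buffer.from(`\x02${text}\x03`, 'ascii');
}

// Turn a one-byte reply into a readable label.
function describeReply(b) {
  return b === 0x06 ? 'ACK' : b === 0x15 ? 'NAK' : 'unknown';
}

// Connect, then send the commands one at a time, waiting for each reply
// before sending the next command.
function sendCommands(host, port, commands) {
  const queue = [...commands];
  const socket = net.createConnection({ host, port }, sendNext);

  function sendNext() {
    if (queue.length === 0) return socket.end();
    socket.write(encodeCommand(queue.shift()));
  }

  socket.on('data', (data) => {
    console.log('Reply:', describeReply(data[0]));
    sendNext();
  });
  socket.on('error', (err) => console.error('Connection error:', err.message));
  return socket;
}
```

For example, `sendCommands('127.0.0.1', 9000, ['START', 'STATUS'])` would send two made-up commands to a server listening on local port 9000.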

We ran the TCP Server (which is built using .NET Core) and the Node.js TCP Client, and the following screen shows the communication between them:

NodeJS (TCP Client) Output:

It connects to the server, sends two commands, and displays the ACK/NAK responses from the server.

.NET Core Server Output:

Here is the output from the server: it receives commands from the TCP Client (Node.js), prints them on the console, and sends an ACK/NAK back to the TCP Client:

Summary

Well, we saw how awesome ASCII is. It’s a simple, powerful, and easy-to-implement protocol. As long as humans communicate in letters and numbers, we’ll need to move ASCII data around.

ASCII usage for command/query originated in the days of the IBM mainframe, when people typed on terminals. They typed a message and then hit the return key to send a string to a mainframe computer. All communication, because it was done by humans, was done in standard ASCII.

If you think about it, anything with a label contains ASCII data. Every barcode is really just a series of ASCII characters. These characters need to be stored, they need to be printed, and they sometimes need to be converted into numeric data.

Even today, when we have modern protocols for industrial devices, ASCII is still around, and it will be for a long, long time.

If you have some questions or comments, feel free to ask. Till Next time, Happy Coding.
