SMS Length and Unicode

SMS (Short Message Service) is one of the most widely used communication channels. How an SMS message is constructed affects how it is sent and how recipients receive it. Understanding a few concepts and applying some best practices helps ensure that messages are delivered as expected and reduces the risk of unexpected costs. This article explains how message length and character encoding work so that you can compose text messages with confidence.

We start with a brief look at text and the role it plays in SMS messaging. The characters we see on our screens are symbolic representations of the data that computers store and transmit. Every character in an SMS, including the spaces between words, must be encoded before it is sent. SMS supports two main character encodings: GSM-7, the default alphabet defined in GSM 03.38, and UCS-2, a Unicode encoding.

A single SMS payload can carry at most 140 bytes. A message that needs more than 140 bytes (that is, more than 160 GSM-7 characters) is treated as a long message and is split into individual messages called segments. This is known as a concatenated or multi-part message. In a multi-part message, some of the payload bytes are used for a user data header (UDH). The UDH carries identification and ordering information so that the receiving device can reassemble the separate payloads, which may arrive out of order, into a single readable message. The UDH takes 6 bytes (48 bits), which leaves 134 bytes per segment: room for 153 GSM-7 characters or 67 UCS-2 characters.
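The limits quoted above follow directly from the byte counts. Here is a minimal sketch of that arithmetic, assuming 7 bits per GSM-7 character and 16 bits per UCS-2 character; the constant and function names are illustrative only.

```python
PAYLOAD_BYTES = 140  # size of one SMS payload, as described above
UDH_BYTES = 6        # user data header added to each part of a multi-part message


def chars_per_segment(bits_per_char: int, multipart: bool) -> int:
    """Whole characters that fit in the payload left after any UDH."""
    usable_bytes = PAYLOAD_BYTES - (UDH_BYTES if multipart else 0)
    return (usable_bytes * 8) // bits_per_char


for name, bits in (("GSM-7", 7), ("UCS-2", 16)):
    single = chars_per_segment(bits, multipart=False)
    multi = chars_per_segment(bits, multipart=True)
    print(f"{name}: {single} single, {multi} per segment")
# GSM-7: 160 single, 153 per segment
# UCS-2: 70 single, 67 per segment
```

Note that 134 bytes hold 1072 bits, which is 153 septets with one bit left over, so the GSM-7 figure is rounded down.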

Messages that cannot be encoded in GSM-7 are sent as Unicode (UCS-2) and are limited to 70 characters. Here are a few points to keep in mind as we continue:

  • GSM-7: This character encoding covers most, but not all, characters of languages that use a Latin-based alphabet, such as English, Spanish, and French. It uses 7 bits to represent each character, so one SMS message can contain up to 160 GSM-7 characters (packed into up to 140 octets). The basic GSM-7 character set is defined in GSM 03.38.
    Besides the basic GSM-7 characters, extension symbols such as € ^ { } [ ] ~ | can be used in SMS messages. The 160-character limit per message still applies, but each of these symbols takes up twice the space of a basic character.
  • Unicode encoding: Unicode allows a much greater range of characters and languages. In SMS it is carried as UCS-2, which can represent the most commonly used characters, including punctuation marks, mathematical symbols, technical symbols, arrows, characters from non-Latin scripts such as Thai, Chinese, and Arabic, and emoji, at the cost of more space per character.
    When the body of a message contains even one character that GSM-7 cannot represent, the whole message automatically switches to UCS-2 encoding, which lowers the limit to 70 characters per message.
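The two points above can be sketched as a small check that decides which encoding a message needs and how much space it uses. This is an illustration under stated assumptions, not a gateway's actual implementation: the character sets are transcribed from the GSM 03.38 default alphabet and should be verified against the specification, the fallback counts 16-bit UTF-16 units (so most emoji count as two), and `measure` is a hypothetical helper name.

```python
from typing import Tuple

# Basic GSM-7 alphabet (transcribed from GSM 03.38; verify before relying on it)
GSM7_BASIC = set(
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
# Extension characters: each is sent as an escape plus the character
GSM7_EXTENDED = set("^{}\\[~]|€\f")


def measure(text: str) -> Tuple[str, int]:
    """Return (encoding, length in character slots) for a message body."""
    if all(ch in GSM7_BASIC or ch in GSM7_EXTENDED for ch in text):
        # Extension characters take two slots, basic characters take one
        slots = sum(2 if ch in GSM7_EXTENDED else 1 for ch in text)
        return "GSM-7", slots
    # One unsupported character switches the whole message to UCS-2.
    # Assumption: length is counted in 16-bit units, as UTF-16 would encode it.
    return "UCS-2", len(text.encode("utf-16-be")) // 2


print(measure("Hello"))       # ('GSM-7', 5)
print(measure("Price: 5€"))   # ('GSM-7', 10)
print(measure("こんにちは"))    # ('UCS-2', 5)
```

The second example shows the cost of an extension character: nine visible characters occupy ten slots because the euro sign takes two.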

So the character limit for a single SMS is 160 or 70 depending on the type of characters used, and 153 or 67 per segment once a message is split. For instance, a 170-character message is split into 2 segments with GSM-7 encoding (153 + 17) or 3 segments with Unicode encoding (67 + 67 + 36). An SMS holds 160 characters only when every character belongs to the GSM-7 set. If a message contains characters outside that set, it is sent with Unicode encoding, which reduces the limit to 70 characters for a single message.
An SMS message is charged per segment, so in the example above the sender is charged for 2 or 3 segments respectively. This can be a concern for anyone who wants to communicate freely without worrying too much about cost, and Arkesel takes care of just that! Arkesel provides a platform where users can send as many messages, with as many characters, as they need at very affordable rates.
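Since billing follows the segment count, it can help to estimate that count before sending. Below is a minimal sketch using the limits from this article; `segment_count` and the `LIMITS` table are illustrative names, and the length is assumed to be already measured in character slots for the chosen encoding.

```python
import math

# encoding -> (limit for a single message, limit per segment when split)
LIMITS = {"GSM-7": (160, 153), "UCS-2": (70, 67)}


def segment_count(length: int, encoding: str) -> int:
    """Number of segments a message of the given length is split into."""
    single_limit, segment_limit = LIMITS[encoding]
    if length <= single_limit:
        return 1  # fits in one message, so no header is needed
    return math.ceil(length / segment_limit)


print(segment_count(170, "GSM-7"))  # 2
print(segment_count(170, "UCS-2"))  # 3
```

Multiplying the result by your provider's price per segment gives a rough cost estimate for the message.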
