What is character encoding and decoding?

Characters are abstract entities that can be represented in many different ways. A character encoding is a system that pairs each character in a supported character set with some value that represents that character. For example, Morse code is a character encoding that pairs each letter of the Roman alphabet with a pattern of dots and dashes that is suitable for transmission over telegraph lines. A character encoding for computers pairs each character in a supported character set with a numeric value (also known as a code point) that represents that character, and defines how that value is stored as one or more bytes. A character encoding has two distinct components:
 

  • An encoder, which translates a sequence of characters into a sequence of numeric values (bytes).
  • A decoder, which translates a sequence of bytes into a sequence of characters.
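As a rough illustration of the two components, here is a minimal sketch in Python. It assumes UTF-8 as the encoding, and the sample string is made up for the example; any other encoding supported by Python would follow the same pattern.

```python
# A minimal sketch of encoding and decoding, assuming UTF-8.
text = "h\u00e9llo"  # "héllo": the second character is U+00E9

# The numeric value (code point) assigned to each character.
code_points = [ord(ch) for ch in text]

# Encoder: characters -> bytes.
encoded = text.encode("utf-8")

# Decoder: bytes -> characters.
decoded = encoded.decode("utf-8")

print(code_points)       # [104, 233, 108, 108, 111]
print(encoded)           # b'h\xc3\xa9llo'
print(decoded == text)   # True
```

Note that the five characters become six bytes, because UTF-8 stores the code point 233 as two bytes.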
  
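A computer works only with 0s and 1s. It has no built-in understanding of human languages such as English or Hindi, so text is always stored and transmitted as bytes. Those bytes are meaningful only if the reader knows which character encoding produced them.
Hence you need to specify the correct character set whenever you render or transfer text, to prevent garbled output or data loss.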
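To see what can go wrong, here is a small Python sketch. The sample word and the choice of UTF-8, Latin-1 and ASCII are assumptions made for illustration only. It shows two common failure modes: decoding with the wrong character set, and encoding into a character set that cannot represent the text.

```python
# A small sketch of what happens when the character set is wrong.
original = "caf\u00e9"  # "café"

# The sender encodes the text as UTF-8.
data = original.encode("utf-8")

# Case 1: the receiver wrongly assumes Latin-1.
# No error is raised, but the last character turns into two wrong ones.
garbled = data.decode("latin-1")

# Case 2: the text is forced into ASCII, which has no "é".
# With errors="replace" the character is silently replaced by "?".
lossy = original.encode("ascii", errors="replace")

# Case 3: the receiver uses the correct character set.
correct = data.decode("utf-8")

print(garbled)              # cafÃ©
print(lossy)                # b'caf?'
print(correct == original)  # True
```

In the first case the bytes are still intact and can be decoded again correctly; in the second case the original character is gone for good.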



 
