A Computational Model for Studying the Characteristics of Languages using Coded Character Sets


FORTRAN programming language, Basic character set, Code presentation, Coded character sets, Languages, Computational model



Computer languages provide a structure for studying all categories of languages, because every language has a basic character set that is the fundamental building block of its syntax. On this basis, the paper presents a general theoretical model for studying the characteristics of languages. The model hinges on the data structure of the basic character set of the FORTRAN programming language, considered as a subset of three standard coded character sets that are themselves subsets of Unicode. It is built on a method for representing binary uniform digital codes, called ‘code presentation’. Although the method is also a lossless compression algorithm, the paper focuses on its suitability for representing codes. The model provides insight into, for example, whether the composition of a word in a particular language belongs to that language or to another. Further work may establish a mathematical relationship, using code presentation, that shows when two languages belong to the same family.
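
As a rough illustration of the setting only, and not of the code presentation method itself (which the abstract does not define), the following Python sketch treats a basic character set as a subset of several coded character sets and lists the uniform binary code of each character. It assumes the FORTRAN 77 basic character set (26 letters, 10 digits, 13 special characters) and uses ASCII, ISO 8859-1 and Windows-1252 as stand-ins for the three coded character sets, which the abstract does not name. The helper names `uniform_codes` and `in_charset` are hypothetical.

```python
import string

# Assumption: the FORTRAN 77 basic character set
# (26 upper-case letters, 10 digits, 13 special characters including blank).
FORTRAN_CHARSET = set(string.ascii_uppercase + string.digits + " =+-*/(),.$':")

# Assumption: stand-ins for the three coded character sets (not named in the abstract).
CODED_SETS = ["ascii", "latin-1", "cp1252"]


def uniform_codes(charset, encoding, width=8):
    """Map each character to its fixed-width binary code in the given encoding."""
    return {ch: format(ch.encode(encoding)[0], f"0{width}b") for ch in sorted(charset)}


def in_charset(word, charset=FORTRAN_CHARSET):
    """Return True if every character of the word belongs to the basic character set."""
    return all(ch in charset for ch in word)


# One code table per coded character set.
codes = {enc: uniform_codes(FORTRAN_CHARSET, enc) for enc in CODED_SETS}

if __name__ == "__main__":
    print(len(FORTRAN_CHARSET), "characters in the basic set")
    print("A ->", codes["ascii"]["A"])
    for word in ["GOTO", "goto", "naïve"]:
        print(word, in_charset(word))
```

In this sketch every character of the basic set receives the same 8-bit code in all three encodings, which is what makes it a common subset; a word containing any character outside the set is flagged as not composed from that language's basic character set.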