Computer Science Notes v.3
Chapter Three - Data Representation
data compression 0 reduction in the amount of space needed to store a piece of data
compression ratio - the size of the compressed data divided by the size of the original data
a data compression tecniques can be
-lossless, which means the data can be retrieved without any loss of the original information
-lossy, which means some information may be lost in the process of compaction
analog & digital information
Computers are finite!
How do we represent an infinite world?
We represent enough of the world to satisfy our computational needs and our senses of sight and sound.
Information can be represented in one of two ways: analog or digital
Analog data
-A continuous representation, analogous to the actual information it represents
Digital data
-A discrete representation, breaking the information up into separate elements.
Thermometer is an analog device.
Computers cannot work well with analog data, so we digitize the data.
Digitize
-Breaking data into pieces and representing those pieces separately.
How do we use binary to represent digitized data?
Electronic Signals
Important facts about electronic signals
-An analog signal continually fluctuates in voltage up and down
-A digital signal has only a high or low state, corresponding to the two binary digits.
-All electronic signals (both analog and digital) degrade as they moved down a line
-The voltage of the signal flcutuates tdue to envrinomental factors.
Periodically, a digital signal is reclocked to regain its original shape.
Binary Representations
One bit can be either 0 or 1
One bit can repesent two things (Why?) 0 1 2 3
Two bits can represent four things (Why?) 00 01 10 11
Chances of winning Cash 3 Lottery is 1,000. Because you have 000 & 999 and is represented by the 10^umpth power.
2^n power will get you the number in bytes. 2^1=2 2^2=4 2^3=8 2^4=16 2^5=32 2^6=64 2^7=128 2^8=256
How many things can three bits represent?
How many things can four bits represent?
How many things can five bits represent?
When you increase the number of bits it doubles.
Representing Negative Values
Signed-magnitude number representation
The sign represents the ordering, and the digitis represent the magnitude of the number.
Using two decimal digits,
let 1 through 49 represent 1 through 49
let 50 through 99 repesent -50 through -1
To perform addition, add the numbers and discard any carry.
A-B=A+(-B)
Add the negative of the second to the first.
Formula to computer the negative represesntaiton of a number
Negative(I) =10^k - I, where k is the number of digits
The presesntation is called the ten's complement.
Negative(5)=100-50=50
Negative(49)=100-49=51
Negative(35)=100-35=65
Negative(1)=100-1=99
Two's Complement
(Veritcal line is easier to read)
00000000
-128-127
Negative(128)=256-128=128
Negative(127)=256-127=129
Addition and subtraction are the same as in 10's complement arithmetic.
-127 10000001
+ 1 00000001
------
-126 10000010
Real numbers**
A number with a whole part and a fractional part
104.32, 0.999999, 357.0, and 3.14159
Positions to the right of the decimal point are the teneths position: 10^-1, 10^-2, 10^-3
Same ryles apply in binary as in decimal. Decimal point is actually the radix point. Positions to the right of the radix point binary are 2^-1 (one half) 2^-2 (one quarter), 2^-3 (one eight)
Representing text
There are finite number of caracters to represent, so list them all and assign each a binary string.
01000001=65=A
01000010=66=B
01000011=67=C
01000100=68=D
32 is the difference between capital and lower case letters on this string.
01100001=97=a
01100010=98=b
Unicode uses 16bits 2^16=65,536 (for foreign languages like Chinese...etc)
1111111111111111
ASCII stnads form American Standard Code for Information Interchange.
ASCII originally used seven bits to represent each character, allowing for 128 bits.
Text Compression
Assigning 16 bits to each character in a document uses too much files space.
We need ways to store and transmit text effeicently.
Text compression techniques.
-keyword encoding
-run-length encoding
-Huffman encoding
Keyword Encoding
Replace frequently used words with a single character.
Run-length encoding.
A signle character may bre repeated over and over again in a long squence.
Replace a repeated sequence with
-a flag character
-repeated character
-number....
Huffman Encoding
Huffman codes use variable-length bit strings to repesent each character.
More frequently used letters have shorter strings to represent them.
Morse code
ETAOINS
*_ dit and dah
Huffman
Posted at 09:33 am by
lemony