Entry: Computer Science Notes v.3 Wednesday, August 26, 2009



Chapter Three - Data Representation

data compression 0 reduction in the amount of space needed to store a piece of data

compression ratio - the size of the compressed data divided by the size of the original data

a data compression tecniques can be
 -lossless, which means the data can be retrieved without any loss of the original information

-lossy, which means some information may be lost in the process of compaction

analog & digital information

Computers are finite!
How do we represent an infinite world?

We represent enough of the world to satisfy our computational needs and our senses of sight and sound.

Information can be represented in one of two ways: analog or digital

Analog data
-A continuous representation, analogous to the actual information it represents

Digital data
-A discrete representation, breaking the information up into separate elements.

Thermometer is an analog device.

Computers cannot work well with analog data, so we digitize the data.

Digitize
-Breaking data into pieces and representing those pieces separately.
  How do we use binary to represent digitized data?

Electronic Signals
Important facts about electronic signals
-An analog signal continually fluctuates in voltage up and down
-A digital signal has only a high or low state, corresponding to the two binary digits.
-All electronic signals (both analog and digital) degrade as they moved down a line
-The voltage of the signal flcutuates tdue to envrinomental factors.

Periodically, a digital signal is reclocked to regain its original shape.

Binary Representations

One bit can be either 0 or 1

One bit can repesent two things (Why?) 0 1 2 3

Two bits can represent four things (Why?) 00 01  10  11

Chances of winning Cash 3 Lottery is 1,000. Because you have 000 & 999 and is represented by the 10^umpth power.

2^n power will get you the number in bytes. 2^1=2 2^2=4 2^3=8 2^4=16 2^5=32 2^6=64 2^7=128 2^8=256

How many things can three bits represent?

How many things can four bits represent?

How many things can five bits represent?

When you increase the number of bits it doubles.

Representing Negative Values

Signed-magnitude number representation

The sign represents the ordering, and the digitis represent the magnitude of the number.

Using two decimal digits,
 let 1 through 49 represent 1 through 49
 let 50 through 99 repesent -50 through -1

To perform addition, add the numbers and discard any carry.

A-B=A+(-B)
Add the negative of the second to the first.

Formula to computer the negative represesntaiton of a number

Negative(I) =10^k - I, where k is the number of digits

The presesntation is called the ten's complement.

Negative(5)=100-50=50
Negative(49)=100-49=51
Negative(35)=100-35=65
Negative(1)=100-1=99

Two's Complement
(Veritcal line is easier to read)

00000000

-128-127

Negative(128)=256-128=128
Negative(127)=256-127=129

Addition and subtraction are the same as in 10's complement arithmetic.

-127 10000001
+   1  00000001
------
-126  10000010

Real numbers**

A number with a whole part and a fractional part
104.32, 0.999999, 357.0, and 3.14159

Positions to the right of the decimal point are the teneths position: 10^-1, 10^-2, 10^-3

Same ryles apply in binary as in decimal. Decimal point is actually the radix point. Positions to the right of the radix point binary are 2^-1 (one half) 2^-2 (one quarter), 2^-3 (one eight)

Representing text

There are finite number of caracters to represent, so list them all and assign each a binary string.

01000001=65=A
01000010=66=B
01000011=67=C
01000100=68=D

32 is the difference between capital and lower case letters on this string.

01100001=97=a
01100010=98=b

Unicode uses 16bits 2^16=65,536 (for foreign languages like Chinese...etc)

1111111111111111

ASCII stnads form American Standard Code for Information Interchange.

ASCII originally used seven bits to represent each character, allowing for 128 bits.

Text Compression

Assigning 16 bits to each character in a document uses too much files space.

We need ways to store and transmit text effeicently.

Text compression techniques.
-keyword encoding
-run-length encoding
-Huffman encoding

Keyword Encoding

Replace frequently used words with a single character.

Run-length encoding.

A signle character may bre repeated over and over again in a long squence.

Replace a repeated sequence with
-a flag character
-repeated character
-number....

Huffman Encoding

Huffman codes use variable-length bit strings to repesent each character.

More frequently used letters have shorter strings to represent them.

Morse code
ETAOINS
*_                  dit and dah

Huffman


   0 comments

Leave a Comment:

Name


Homepage (optional)


Comments