Chapter 16
Input/Output
In this chapter we discuss the I/O subsystem. The I/O subsystem is the means by which the CPU communicates with the outside world. By “outside world” we mean devices other than the CPU and memory.
As you have learned, the CPU executes instructions, and memory provides a place to store data and instructions. Most programs read data from one or more input devices, process the data, then write the results to one or more output devices.
Typical input devices are keyboards and mice. Common output devices are display screens and printers. Although most people do not think of them as such, magnetic disks, CD drives, and similar storage devices are also I/O devices. It may be a little more obvious that a connection to the internet is also I/O. The reasons will become clearer in this chapter, where we discuss how I/O devices are programmed.
16.1 Memory Timing
Since the CPU accesses I/O devices via the same buses as memory (see Figure 1.1, page 9), it might seem that the CPU could access the I/O devices in the same way as memory. That is, it might seem that I/O could be performed by using the movb instruction to transfer bytes of data between the CPU and the specific I/O device. This can be done with many devices, but there are other issues that must be taken into account in order to make it work correctly. One of the main issues lies in the timing differences between memory and I/O. Before tackling the I/O timing issues, let us consider memory timing characteristics.
Aside: As pointed out in Section 1.2 (page 10), the three-bus description given here shows the logical interaction between the CPU and I/O. Most modern general purpose computers employ several types of buses. The way in which the CPU connects to the various buses is handled by hardware controllers. A programmer generally deals only with the logical view.
Two types of RAM are commonly used in PCs.
- SRAM holds its values as long as power is on. Access times are very fast. It requires more components to do this, so it is more expensive and larger.
- DRAM uses passive components that hold data values for only a fraction of a second. Thus DRAM includes circuitry that automatically refreshes the data values before they are lost. It is much less expensive than SRAM, but also much slower.
Most of the memory in a PC is DRAM because it is much less expensive and smaller than SRAM. Of course, each instruction must be fetched from memory, so slow memory access would limit CPU speed. This problem is solved by using cache systems made from SRAM.
A cache is a small amount of fast memory placed between the CPU and main memory. When the CPU needs to access a byte in main memory, that byte, together with several surrounding bytes, is copied into the cache memory. There is a high probability that the surrounding bytes will be accessed soon, and the CPU can work with the values in the much faster cache. This is handled by the system hardware. See [28] and [31] for more details.
Modern CPUs include cache memory on the same chip, which can be accessed at CPU speeds. Even small cache systems are very effective in speeding up memory access. For example, the CPU in my desktop system (built in 2005) has 64 KB of Level 1 instruction cache, 64 KB of Level 1 data cache, and 512 KB of Level 2 cache (both instructions and data). In contrast, most of the memory in the system consists of 1 GB of DDR 400 memory.
The important point here is that memory is matched to the CPU by the hardware. Very seldom is memory access speed a programming issue.
Aside: There are some cases where knowing how to manipulate memory caches can speed up execution time. The x86 has instructions for working directly with cache. Optimizing cache usage is an advanced topic beyond the scope of this book.
16.2 I/O Device Timing
I/O devices are much slower than memory. Consider a common input device, the keyboard. Typing at 120 words per minute is equivalent to 10 characters per second, or 100 milliseconds between each character. A CPU running at 2 GHz can execute approximately 200 million instructions during that time. And the time intervals between keystrokes are very inconsistent. Many will be much longer than this.
Even a magnetic disk is very slow compared to memory. What if the byte that needs to be read has just passed under the read/write head on a disk that is rotating at 7200 RPM? The system must wait for a full revolution of the disk, which takes 8.33 milliseconds. Again, there is a great deal of variability in the rotational delay between reads from the disk.
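The arithmetic behind these figures can be checked with a short C sketch. The function names are ours, and the keyboard figure assumes the common convention of five characters per word, which the numbers above imply but do not state. The instruction estimate further assumes roughly one instruction per clock cycle.

```c
/* Milliseconds between characters at a given typing speed.
   ASSUMPTION: one "word" is five characters, which is what makes
   120 words per minute equal 10 characters per second. */
double ms_per_char(double words_per_minute)
{
    const double chars_per_word = 5.0;
    double chars_per_second = words_per_minute * chars_per_word / 60.0;
    return 1000.0 / chars_per_second;
}

/* Clock cycles that elapse during an interval of the given length.
   ASSUMPTION: treating cycles as instructions means about one
   instruction per cycle. */
double cycles_during(double clock_hz, double interval_ms)
{
    return clock_hz * interval_ms / 1000.0;
}

/* Milliseconds for one full revolution of a disk. */
double ms_per_revolution(double rpm)
{
    return 60.0 * 1000.0 / rpm;
}
```

Under these assumptions, ms_per_char(120.0) gives 100 milliseconds, cycles_during(2.0e9, 100.0) gives 200 million cycles, and ms_per_revolution(7200.0) gives about 8.33 milliseconds.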
In addition to being much slower, I/O devices exhibit much more variance in their timing. Some people type very fast on a keyboard, some very slowly. The required byte on a magnetic disk might be just coming up to the read/write head, or it may have just passed. We need a mechanism to determine whether an input device has a byte ready for our program to read, and whether an output device is ready to accept a byte that is sent to it.
16.3 Bus Timing
Thus far in this book buses have been shown simply as wires connecting the subsystems. Since more than one device is connected to the same wires, the devices must follow a protocol for deciding which two devices can use the bus at any given time. There are many protocols in use, which fall into one of two types:
- Synchronous — data transfer is controlled by a clock signal. Typically, a centralized bus controller generates the clock signal, which is sent on a separate control line in the bus.
- Asynchronous — data transfer is controlled by a “handshaking” exchange between the two devices. Many asynchronous protocols are handled by the devices themselves over the data and address lines in the bus.
Modern computer systems employ both types of buses. A typical PC arrangement is shown in Figure 16.1.
Figure 16.1: Typical bus controllers in a modern PC. The Memory Controller is often called the North Bridge; it provides synchronous communication with main memory and the graphics interface. The I/O Controller is often called the South Bridge; it provides asynchronous communication with the several types of buses that connect to I/O devices.
16.4 I/O Interfacing
In addition to a very wide range in their timing, there is an enormous range of I/O devices that are commonly attached to computers, which differ greatly in how they handle data. A mouse provides position information. A monitor displays graphic information. Most computers have speakers connected to them. Ultimately, the CPU must be able to communicate with I/O devices in bit patterns at the speed of the device.
The hardware between the CPU and the actual I/O device consists of two subsystems — the controller and the interface. The controller is the portion that works directly with the device. For example, a keyboard controller detects which keys are pressed and converts this to a code. It also detects whether a key is pressed or not. A disk controller moves the read/write head to the requested track. It then detects the sector number and waits until the requested sector comes into position. Some very simple devices do not need a controller.
The interface subsystem provides registers that the CPU can read from or write to. An I/O device is programmed through the interface registers. In general, the following types of registers are provided:
- Transmit — Allows data to be written to an output device.
- Receive — Allows data to be read from an input device.
- Status — Provides information about the current state of the device, including the controller.
- Control — Allows a program to send commands to the controller and to change its settings.
It is common for one register to provide multiple functionality. For example, there may be one register for transmitting and receiving, its functionality depending on whether the CPU writes to or reads from the register. And it is common for an interface to have more than one register of the same type, especially control registers.
16.5 I/O Ports
The CPU communicates with an I/O device through I/O ports. The specific port is specified by a value on the address bus. There are two ways to distinguish an I/O port address from a physical memory address:
- Isolated I/O
- Memory-Mapped I/O
With isolated I/O, the I/O ports can be numbered from 0x0000 to 0xffff. This address space is separate from the physical memory address space. Instructions are provided for accessing the I/O address space. The distinction between the two addressing spaces is made in the control bus.
One instruction to perform input is:
ins source, destination
where s, appended to the mnemonic in, denotes the size of the operand:
s | meaning | number of bits |
b | byte | 8 |
w | word | 16 |
l | longword | 32 |
Intel® syntax: in destination, source
The in instruction moves data from the I/O port specified by the source into the register specified by the destination. The source operand can be either an 8-bit immediate value (ports 0 through 255) or a value in the dx register (any port). The destination must be al, ax, or eax, consistent with the operand size. For example, the instruction
inb $4, %al
reads I/O port number 4, placing the value in the al register.
An instruction to perform output is:
outs source, destination
where s, appended to the mnemonic out, denotes the size of the operand:
s | meaning | number of bits |
b | byte | 8 |
w | word | 16 |
l | longword | 32 |
Intel® syntax: out destination, source
The out instruction moves data to the I/O port specified by the destination from the register specified by the source. The destination operand can be either an 8-bit immediate value (ports 0 through 255) or a value in the dx register (any port). The source must be al, ax, or eax, consistent with the operand size. For example, the instruction
outb %al, $6
writes the value in the al register to I/O port number 6.
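The operand rules for the input instruction can be summarized in a small C sketch that builds the text of the instruction for a given port and operand size. It performs no I/O. The function names are ours, and it relies on the fact that the immediate form of the x86 in instruction encodes the port number in eight bits, so a port above 255 must first be placed in the dx register.

```c
#include <stdio.h>

/* Register that in/out must use for an operand size in bits:
   al for 8, ax for 16, eax for 32. NULL for any other size. */
const char *io_register(int bits)
{
    switch (bits) {
    case 8:  return "%al";
    case 16: return "%ax";
    case 32: return "%eax";
    default: return NULL;
    }
}

/* AT&T size suffix for an operand size in bits, or 0 if unsupported. */
char io_suffix(int bits)
{
    switch (bits) {
    case 8:  return 'b';
    case 16: return 'w';
    case 32: return 'l';
    default: return 0;
    }
}

/* Write the AT&T-syntax input instruction for the given port into buf.
   Ports 0-255 use the immediate form; larger ports go through dx.
   Returns 0 on success, -1 for an unsupported size or port. */
int format_in(char *buf, size_t len, unsigned port, int bits)
{
    const char *reg = io_register(bits);

    if (reg == NULL || port > 0xffff)
        return -1;
    if (port <= 0xff)
        snprintf(buf, len, "in%c $%u, %s", io_suffix(bits), port, reg);
    else
        snprintf(buf, len, "movw $0x%x, %%dx; in%c %%dx, %s",
                 port, io_suffix(bits), reg);
    return 0;
}
```

For port 4 and an 8-bit operand this produces the text inb $4, %al, matching the example above. The output instruction follows the same rules with the operands in the opposite order.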
16.6 Programming Issues
One of the primary jobs of an operating system is to handle I/O. The software that does this is called a device handler. The operating system coordinates the activities of all the device handlers so that the hardware is utilized in an efficient manner. In Linux, a device handler may either be compiled into the kernel or built as a separate module that is loaded into memory only if needed.
Thus, programming I/O devices generally means changing the operating system kernel. This can be done, but it requires considerably more knowledge than is provided in this book. It is possible to give user applications permission to directly access specific I/O devices, but this can produce disastrous results, especially in a multi-user environment.
We will not do any direct I/O programming in this book, but we will look at the general concepts. Listing 16.1 sketches the general algorithms in C. The code was abstracted from some I/O routines that work with a Dual Universal Asynchronous Receiver/Transmitter (DUART) on a single board computer. It is incomplete code and does not run on any known computer, but it illustrates the basic concepts.
This example uses memory-mapped I/O. The program calls three functions:
- init_io — Initialize the I/O interface. This includes placing the hardware in an “all clear” state and setting parameters such as speed, etc.
- charin — Read one character from the input.
- charout — Write one character to the output.
We will examine what each does.
1/*
2 * io_sketch_mm.c
3 * This code sketches the algorithms to initialize
4 * a DUART, read one character and echo it using
5 * memory-mapped I/O.
6 * WARNING: This code does not run on any known
7 * device. It is meant to sketch some
8 * general I/O concepts only.
9 * Bob Plantz - 18 June 2009
10 */
11
12/* register offsets */
13#define MR 0x01 /* mode register */
14#define SR 0x03 /* status register */
15#define CSR 0x03 /* clock select register */
16#define CR 0x05 /* command register */
17#define RR 0x07 /* receiver register */
18#define TR 0x07 /* transmitter register */
19#define ACR 0x09 /* auxiliary control register */
20#define IMR 0x0B /* interrupt mask register */
21
22/* status bits */
23#define RxRDY 1 /* receiver ready */
24#define TxRDY 4 /* transmitter ready */
25
26/* commands */
27#define RESETRECEIVER 0x20
28#define RESETTRANSMIT 0x30
29#define RESETERROR 0x40
30#define RESETMODE 0x10
31#define TIMER 0xF0
32#define NOPARITY8BITS 0x13
33#define STOPBIT2 0x0F
34#define BAUD19200 0xC
35#define BAUDRATE (BAUD19200+(BAUD19200<<4))
36#define ENABLE 0x05
37#define NOINTERRUPT 0x00
38
39void init_io();
40unsigned char charin();
41void charout( unsigned char c );
42
43int main() {
44 unsigned char aCharacter;
45
46 init_io();
47 aCharacter = charin();
48 charout(aCharacter);
49
50 return 0;
51}
52
53void init_io() {
54 unsigned char* port = (unsigned char*) 0xff000;
55
56 *(port+CR) = RESETRECEIVER; /* reset receiver */
57 *(port+CR) = RESETTRANSMIT; /* reset transmitter */
58 *(port+CR) = RESETERROR; /* clear any errors */
59 *(port+CR) = RESETMODE; /* make sure we’re using MR1 */
60
61 *(port+ACR) = TIMER; /* baud set 2, crystal divide by 16 */
62 *(port+MR) = NOPARITY8BITS; /* no parity, 8 bits */
63 *(port+MR) = STOPBIT2; /* stop bit length 2.000 */
64 *(port+CSR) = BAUDRATE; /* set baud */
65 *(port+IMR) = 0; /* turn off interrupts */
66 *(port+CR) = ENABLE; /* enable receiver and transmitter */
67}
68
69unsigned char charin() {
70 unsigned char* port = (unsigned char*) 0xff000;
71 unsigned char character, status;
72
73 do
74 {
75 status = *(port+SR);
76 } while ((status & RxRDY) == 0); /* loop while receiver not ready */
77 character = *(port+RR);
78 return character;
79}
80
81void charout( unsigned char c )
82{
83 unsigned char* port = (unsigned char*) 0xff000;
84 unsigned char status;
85 do
86 {
87 status = *(port+SR);
88 } while ((status & TxRDY) == 0); /* loop while transmitter not ready */
89 *(port+TR) = c;
90}
Listing 16.1: Sketch of basic I/O functions using memory-mapped I/O — C version.
Lines 12 – 37 define symbolic names for values that are used to program the device. Notice that some names have the same value. For example, on lines 17 and 18 the receiver register (RR) and transmitter register (TR) are actually the same register. The CPU receives when it reads from this register and transmits when it writes to it. A similar situation is seen on lines 14 and 15. Reading from register 0x03 provides status information, and the clock selection commands are written to the same register. This illustrates an important point — I/O interface registers are not simply data storage places like CPU registers. It would probably be more accurate to call them “interface ports,” but “registers” is the commonly used terminology.
This example uses memory-mapped I/O, so simple assignment statements are used to access the I/O interface registers. The memory addresses 0xff000 – 0xff020 are associated with I/O registers for this device instead of physical memory. The base address of the device is assigned to a pointer variable on line 54 in the init_io function. Then the commands to initialize the device are written to the appropriate registers on lines 56 – 66. It is not important that you completely understand what this function is doing, but the comments should give you a rough idea.
Lines 56 – 59 assign four different values to the same location:
*(port+CR) = RESETRECEIVER; /* reset receiver */
*(port+CR) = RESETTRANSMIT; /* reset transmitter */
*(port+CR) = RESETERROR; /* clear any errors */
*(port+CR) = RESETMODE; /* make sure we’re using MR1 */
If these were assignments to an actual memory location or to a CPU register, only the final statement would be required. But the Command Register is an I/O interface register. And as described above, it really is not a storage register, even on the I/O interface. In fact, these are four different commands that are sent to the Command Register “port” on the I/O interface.
The order in which commands are sent to the I/O interface may also be important. For example, on this particular device, the sequence on lines 62 – 63
*(port+MR) = NOPARITY8BITS; /* no parity, 8 bits */
*(port+MR) = STOPBIT2; /* stop bit length 2.000 */
must be performed in this order. There are actually two Mode Registers, which are both accessed through the same I/O interface register. The first time the register is accessed, it is connected to Mode Register 1. This access causes the hardware to automatically switch to Mode Register 2 for all subsequent accesses. Now you can understand the reason for sending the “RESETMODE” command to the Command Register on line 59. It’s important to ensure that the first access will be to Mode Register 1.
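The switching behavior can be modeled in ordinary C. The following is only a software model written for illustration: the structure duart_model and the function model_write are hypothetical names, and on the real device this switching is done by the hardware.

```c
#include <stdint.h>

#define MR        0x01 /* mode register offset, as in Listing 16.1 */
#define CR        0x05 /* command register offset */
#define RESETMODE 0x10 /* command: next MR access reaches Mode Register 1 */

/* HYPOTHETICAL software model of the two mode registers that
   share one interface register. */
typedef struct {
    uint8_t mr1;
    uint8_t mr2;
    int     next_is_mr2; /* 0: the next MR access reaches Mode Register 1 */
} duart_model;

/* Model a one-byte write to the interface register at the given offset. */
void model_write(duart_model *d, unsigned offset, uint8_t value)
{
    if (offset == CR && value == RESETMODE) {
        d->next_is_mr2 = 0;          /* point back at Mode Register 1 */
    } else if (offset == MR) {
        if (d->next_is_mr2) {
            d->mr2 = value;          /* later accesses reach Mode Register 2 */
        } else {
            d->mr1 = value;          /* first access reaches Mode Register 1 */
            d->next_is_mr2 = 1;      /* and switches to Mode Register 2 */
        }
    }
}
```

Writing RESETMODE to CR, then 0x13 and 0x0F to MR, leaves 0x13 in mr1 and 0x0F in mr2. Reversing the two MR writes would leave the values in the wrong registers, which is why the order matters.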
When compiling I/O functions, it is very important not to use optimization. If you do, the compiler may try to coalesce the command values into one value. (See Exercise 1.)
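Standard C also provides the volatile qualifier for this situation. An access through a volatile-qualified pointer must be performed exactly as written, so the compiler is not permitted to drop or merge the stores. The sketch below shows the reset sequence written that way. The function name reset_sequence is ours, and when trying it out an ordinary array can stand in for the device registers.

```c
#define CR            0x05 /* command register offset */
#define RESETRECEIVER 0x20
#define RESETTRANSMIT 0x30
#define RESETERROR    0x40
#define RESETMODE     0x10

/* Send the four reset commands to the command register.
   Because port points to volatile data, each of the four stores
   must be carried out, in this order, at any optimization level. */
void reset_sequence(volatile unsigned char *port)
{
    port[CR] = RESETRECEIVER; /* reset receiver */
    port[CR] = RESETTRANSMIT; /* reset transmitter */
    port[CR] = RESETERROR;    /* clear any errors */
    port[CR] = RESETMODE;     /* make sure we're using MR1 */
}
```

With an ordinary array in place of the device, only the last value written remains visible afterward. On real hardware each store would reach the device as a separate command.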
The next function is charin(). Its job is to read a character from the DUART. In the lab where this code was used, the DUART receiver was connected to a keyboard. The DUART must wait until somebody presses a key on the keyboard, then convert the code for that key to an eight-bit ASCII code representing the character. When the DUART has a character ready to be read from its receiver register, it sets the “receiver ready” bit in its status register to one. The do-while loop on lines 73 – 76 in charin shows how the code must wait for this event.
When the status indicates that a character is ready, line 77 shows how it is read from the receiver register.
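The waiting pattern can be tried without hardware by substituting a simulated status register. In the sketch below, fake_status, read_status, and wait_for_receiver are hypothetical names, and the limit on the number of polls is our addition so that the loop cannot run forever if the device never becomes ready.

```c
#define RxRDY 1 /* receiver ready bit, as in Listing 16.1 */

/* HYPOTHETICAL stand-in for a status register: it reports
   "not ready" for a fixed number of reads, then "ready". */
typedef struct {
    unsigned reads_until_ready;
} fake_status;

/* Simulate one read of the status register. */
unsigned char read_status(fake_status *s)
{
    if (s->reads_until_ready > 0) {
        s->reads_until_ready--;
        return 0;
    }
    return RxRDY;
}

/* Poll while the receiver-ready bit is clear.
   Returns the number of polls used, or -1 if max_polls
   reads were made without the bit becoming set. */
int wait_for_receiver(fake_status *s, int max_polls)
{
    int polls = 0;
    unsigned char status;

    do {
        if (polls == max_polls)
            return -1;               /* give up: device never became ready */
        status = read_status(s);
        polls++;
    } while ((status & RxRDY) == 0); /* keep waiting while bit is 0 */
    return polls;
}
```

If the simulated device needs three reads before it is ready, wait_for_receiver() returns 4, since the fourth read is the first to see the bit set.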
The charout() function writes a character to the transmitter. As you might expect, the transmitter was connected to a computer monitor. Although it is clear that keyboard input is very slow, writing on a monitor screen is also slow compared to CPU processing. Thus, we need a similar do-while loop (lines 85 – 88) to wait until the monitor is ready to accept a new character. Once the value provided by the status register shows it is ready, line 89 shows how the character is written to the DUART’s transmitter register.
Listing 16.2 shows the assembly language generated by the gcc compiler for the C program in Listing 16.1. Some comments have been added to explain the general concepts.
1 .file "io_sketch_mm.c"
2 .text
3 .globl main
4 .type main, @function
5main:
6 pushq %rbp
7 movq %rsp, %rbp
8 subq $16, %rsp
9 movl $0, %eax
10 call init_io
11 movl $0, %eax
12 call charin
13 movb %al, -1(%rbp)
14 movzbl -1(%rbp), %eax
15 movl %eax, %edi
16 call charout
17 movl $0, %eax
18 leave
19 ret
20 .size main, .-main
21 .globl init_io
22 .type init_io, @function
23init_io:
24 pushq %rbp
25 movq %rsp, %rbp
26 movq $1044480, -8(%rbp) # initialize pointer variable to 0xff000
27 movq -8(%rbp), %rax # base address of DUART
28 addq $5, %rax # address of command register
29 movb $32, (%rax) # reset receiver
30 movq -8(%rbp), %rax
31 addq $5, %rax
32 movb $48, (%rax) # reset transmitter
33 movq -8(%rbp), %rax
34 addq $5, %rax
35 movb $64, (%rax) # reset error
36 movq -8(%rbp), %rax
37 addq $5, %rax
38 movb $16, (%rax) # reset mode
39 movq -8(%rbp), %rax # base address of DUART
40 addq $9, %rax # address of auxiliary control register
41 movb $-16, (%rax) # baud set, crystal rate
42 movq -8(%rbp), %rax
43 addq $1, %rax
44 movb $19, (%rax)
45 movq -8(%rbp), %rax
46 addq $1, %rax
47 movb $15, (%rax)
48 movq -8(%rbp), %rax
49 addq $3, %rax
50 movb $-52, (%rax)
51 movq -8(%rbp), %rax
52 addq $11, %rax
53 movb $0, (%rax)
54 movq -8(%rbp), %rax
55 addq $5, %rax
56 movb $5, (%rax)
57 popq %rbp
58 ret
59 .size init_io, .-init_io
60 .globl charin
61 .type charin, @function
62charin:
63 pushq %rbp
64 movq %rsp, %rbp
65 movq $1044480, -8(%rbp) # initialize pointer variable to 0xff000
66.L5:
67 movq -8(%rbp), %rax # base address of DUART
68 movzbl 3(%rax), %eax # read status register
69 movb %al, -10(%rbp) # and save locally
70 movzbl -10(%rbp), %eax
71 andl $1, %eax # check receiver status
72 testl %eax, %eax # if bit is 0
73 je .L5 # recheck
74 movq -8(%rbp), %rax # receiver ready, get DUART address
75 movzbl 7(%rax), %eax # read input byte
76 movb %al, -9(%rbp) # store locally
77 movzbl -9(%rbp), %eax # return value
78 popq %rbp
79 ret
80 .size charin, .-charin
81 .globl charout
82 .type charout, @function
83charout:
84 pushq %rbp
85 movq %rsp, %rbp
86 movl %edi, %eax
87 movb %al, -20(%rbp)
88 movq $1044480, -8(%rbp) # initialize pointer variable to 0xff000
89.L8:
90 movq -8(%rbp), %rax # base address of DUART
91 movzbl 3(%rax), %eax # read status register
92 movb %al, -9(%rbp) # and save locally
93 movzbl -9(%rbp), %eax
94 andl $4, %eax # check transmitter status
95 testl %eax, %eax # if bit is 0
96 je .L8 # recheck
97 movq -8(%rbp), %rax # transmitter ready, get DUART address
98 leaq 7(%rax), %rdx # address of transmitter register
99 movzbl -20(%rbp), %eax # load byte to send
100 movb %al, (%rdx) # send it
101 popq %rbp
102 ret
103 .size charout, .-charout
104 .ident "GCC: (Ubuntu/Linaro 4.7.0-7ubuntu3) 4.7.0"
105 .section .note.GNU-stack,"",@progbits
Listing 16.2: Memory-mapped I/O in assembly language. Comments have been added to explain the code.
The comments on lines 26 – 41 in the init_io function describe how values are written to the appropriate memory addresses, which are mapped to I/O registers.
Lines 66 – 73 in the charin function make up a loop that waits until the receiver has a character ready to be read. The readiness of the receiver is indicated by bit 0 in the status register. The character is then read from the receiver register on line 75. A similar loop is used on lines 89 – 96 in the charout function to wait until the status register shows that the transmitter is ready for another character. When it is ready, the address of the transmitter register is computed on lines 97 – 98, the byte to be sent is loaded into the eax register on line 99, and it is written to the transmitter register on line 100.
As we saw in Section 16.5, special instructions are required to access isolated I/O. Linux provides macros to use these instructions from C. The macros used here are defined in the file sys/io.h. Listing 16.3 illustrates the use of these macros to write the same program as in Listing 16.1 if the DUART interface were connected to the isolated I/O system.
1/*
2 * io_sketch_iso.c
3 * This code sketches the algorithms to initialize
4 * a DUART, read one character and echo it using
5 * isolated I/O.
6 * WARNING: This code does not run on any known
7 * device. It is meant to sketch some
8 * general I/O concepts only.
9 * Bob Plantz - 18 June 2009
10 */
11#include <sys/io.h>
12
13/* register offsets */
14#define MR 0x01 /* mode register */
15#define SR 0x03 /* status register */
16#define CSR 0x03 /* clock select register */
17#define CR 0x05 /* command register */
18#define RR 0x07 /* receiver register */
19#define TR 0x07 /* transmitter register */
20#define ACR 0x09 /* auxiliary control register */
21#define IMR 0x0B /* interrupt mask register */
22
23/* status bits */
24#define RxRDY 1 /* receiver ready */
25#define TxRDY 4 /* transmitter ready */
26
27/* commands */
28#define RESETRECEIVER 0x20
29#define RESETTRANSMIT 0x30
30#define RESETERROR 0x40
31#define RESETMODE 0x10
32#define TIMER 0xF0
33#define NOPARITY8BITS 0x13
34#define STOPBIT2 0x0F
35#define BAUD19200 0xC
36#define BAUDRATE (BAUD19200+(BAUD19200<<4))
37#define ENABLE 0x05
38#define NOINTERRUPT 0x00
39#define NOINTERRUPT 0x00
40
41void init_io();
42unsigned char charin();
43void charout( unsigned char c );
44
45int main() {
46 unsigned char aCharacter;
47
48 init_io();
49 aCharacter = charin();
50 charout(aCharacter);
51
52 return 0;
53}
54
55void init_io() {
56 outb(RESETRECEIVER, CR); /* outb(value, port) */
57 outb(RESETTRANSMIT, CR);
58 outb(RESETERROR, CR);
59 outb(RESETMODE, CR);
60 outb(TIMER, ACR);
61 outb(NOPARITY8BITS, MR);
62 outb(STOPBIT2, MR);
63 outb(BAUDRATE, CSR);
64 outb(NOINTERRUPT, IMR);
65 outb(ENABLE, CR);
66}
67
68unsigned char charin() {
69 unsigned char character, status;
70
71 do
72 {
73 status = inb(SR);
74 } while ((status & RxRDY) == 0); /* loop while receiver not ready */
75 character = inb(RR);
76 return character;
77}
78
79void charout( unsigned char c )
80{
81 unsigned char status;
82 do
83 {
84 status = inb(SR);
85 } while ((status & TxRDY) == 0); /* loop while transmitter not ready */
86 outb(c, TR); /* outb(value, port) */
87}
Listing 16.3: Sketch of basic I/O functions, isolated I/O — C version.
On line 11 we need to include the file containing the macros:
#include <sys/io.h>
The use of the outb() macro can be seen in lines 56 – 65. And on line 73 we see the inb() macro being used to read the status register.
The gcc compiler generates assembly language as shown in Listing 16.4.
1 .file "io_sketch_iso.c"
2 .text
3 .type inb, @function # begin inb function
4inb:
5 pushq %rbp
6 movq %rsp, %rbp
7 pushq %rbx
8 movl %edi, %eax
9 movw %ax, -28(%rbp)
10 movzwl -28(%rbp), %edx
11 movw %dx, -30(%rbp)
12 movzwl -30(%rbp), %edx
13#APP
14# 48 "/usr/include/x86_64-linux-gnu/sys/io.h" 1
15 inb %dx,%al # read the byte
16# 0 "" 2
17#NO_APP
18 movl %eax, %ebx
19 movb %bl, -9(%rbp)
20 movzbl -9(%rbp), %eax
21 popq %rbx
22 popq %rbp
23 ret
24 .size inb, .-inb
25 .type outb, @function # begin outb function
26outb:
27 pushq %rbp
28 movq %rsp, %rbp
29 movl %edi, %edx
30 movl %esi, %eax
31 movb %dl, -4(%rbp)
32 movw %ax, -8(%rbp)
33 movzbl -4(%rbp), %eax
34 movzwl -8(%rbp), %edx
35#APP
36# 99 "/usr/include/x86_64-linux-gnu/sys/io.h" 1
37 outb %al,%dx # write the byte
38# 0 "" 2
39#NO_APP
40 popq %rbp
41 ret
42 .size outb, .-outb
43 .globl main
44 .type main, @function
45main:
46 pushq %rbp
47 movq %rsp, %rbp
48 subq $16, %rsp
49 movl $0, %eax
50 call init_io
51 movl $0, %eax
52 call charin
53 movb %al, -1(%rbp)
54 movzbl -1(%rbp), %eax
55 movl %eax, %edi
56 call charout
57 movl $0, %eax
58 leave
59 ret
60 .size main, .-main
61 .globl init_io
62 .type init_io, @function
63init_io:
64 pushq %rbp
65 movq %rsp, %rbp
66 movl $5, %esi
67 movl $32, %edi
68 call outb # outb(RESETRECEIVER, CR);
69 movl $5, %esi
70 movl $48, %edi
71 call outb # outb(RESETTRANSMIT, CR);
72 movl $5, %esi
73 movl $64, %edi
74 call outb # outb(RESETERROR, CR);
75 movl $5, %esi
76 movl $16, %edi
77 call outb # outb(RESETMODE, CR);
78 movl $9, %esi
79 movl $240, %edi
80 call outb # outb(TIMER, ACR);
81 movl $1, %esi
82 movl $19, %edi
83 call outb # outb(NOPARITY8BITS, MR);
84 movl $1, %esi
85 movl $15, %edi
86 call outb # outb(STOPBIT2, MR);
87 movl $3, %esi
88 movl $204, %edi
89 call outb # outb(BAUDRATE, CSR);
90 movl $11, %esi
91 movl $0, %edi
92 call outb # outb(NOINTERRUPT, IMR);
93 movl $5, %esi
94 movl $5, %edi
95 call outb # outb(ENABLE, CR);
96 popq %rbp
97 ret
98 .size init_io, .-init_io
99 .globl charin
100 .type charin, @function
101charin:
102 pushq %rbp
103 movq %rsp, %rbp
104 subq $16, %rsp
105.L8:
106 movl $3, %edi # address of status register
107 call inb # read status
108 movb %al, -2(%rbp)
109 movzbl -2(%rbp), %eax
110 andl $1, %eax # check receiver status
111 testl %eax, %eax # if bit is 0
112 je .L8 # recheck
113 movl $7, %edi # ready, address of receiver register
114 call inb # read input byte
115 movb %al, -1(%rbp) # store locally
116 movzbl -1(%rbp), %eax # return value
117 leave
118 ret
119 .size charin, .-charin
120 .globl charout
121 .type charout, @function
122charout:
123 pushq %rbp
124 movq %rsp, %rbp
125 subq $24, %rsp
126 movl %edi, %eax
127 movb %al, -20(%rbp)
128.L11:
129 movl $3, %edi # address of status register
130 call inb # read status
131 movb %al, -1(%rbp)
132 movzbl -1(%rbp), %eax
133 andl $4, %eax # check transmitter status
134 testl %eax, %eax # if bit is 0
135 je .L11 # recheck
136 movzbl -20(%rbp), %eax # load byte to send
137 movl $7, %esi # address of transmitter
138 movl %eax, %edi
139 call outb # send it
140 leave
141 ret
142 .size charout, .-charout
143 .ident "GCC: (Ubuntu/Linaro 4.7.0-7ubuntu3) 4.7.0"
144 .section .note.GNU-stack,"",@progbits
Listing 16.4: Isolated I/O in assembly language. Comments have been added to explain the code.
Looking at lines 3 – 24 and lines 25 – 42, we see that the inb() and outb() macros generate functions. The actual inb instruction is used on line 15 and outb is used on line 37.
At the points where the macros are called in the C source code, the compiler generates calls to the appropriate function. For example, the C sequence
56 outb(RESETRECEIVER, CR);
57 outb(RESETTRANSMIT, CR);
generates the assembly language (comments added)
66 movl $5, %esi
67 movl $32, %edi
68 call outb # outb(RESETRECEIVER, CR);
69 movl $5, %esi
70 movl $48, %edi
71 call outb # outb(RESETTRANSMIT, CR);
16.7 Interrupt-Driven I/O
Reading the code in Section 16.6, you probably realize that the CPU can waste a lot of time simply waiting for I/O devices. Most I/O interfaces include hardware that can send an interrupt signal to the CPU when they have data ready for input or are able to accept output (see Section 15.1, page 871). While waiting for an I/O device, the operating system will suspend the requesting process and allow another process, perhaps being run by another user, to use the CPU.
The device handler for each I/O device that can interrupt includes a special interrupt handler function. The address of each interrupt handler is stored in a table in the operating system. When the requested I/O device is ready for I/O, it sends an interrupt signal to the CPU on the control bus. The device identifies itself to the CPU, and the CPU consults the table to obtain the address of the corresponding interrupt handler. CPU execution control then transfers to the interrupt handler function, which contains code to read from or write to the device as needed. When the interrupt handler function completes its servicing of the I/O device, the last instruction in the function is an iret (see Section 15.5 on page 875). This causes CPU execution control to return to the control flow where it was interrupted.
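The table lookup described here can be sketched in C as an array of function pointers. This is a software model only: the names vector_table, register_handler, and dispatch, the table size, and the two example handlers are all hypothetical. On real hardware the CPU performs the lookup itself, and the handler returns with iret rather than an ordinary function return.

```c
#include <stddef.h>

#define NUM_VECTORS 4 /* HYPOTHETICAL table size */

typedef void (*handler_fn)(void);

/* Table of interrupt handler addresses, indexed by device number. */
handler_fn vector_table[NUM_VECTORS];

/* Counters so the example handlers have an observable effect. */
int keyboard_count;
int disk_count;

void keyboard_handler(void) { keyboard_count++; }
void disk_handler(void)     { disk_count++; }

/* Store a handler's address in the table. Returns 0, or -1 if
   the vector number is out of range. */
int register_handler(unsigned vector, handler_fn fn)
{
    if (vector >= NUM_VECTORS)
        return -1;
    vector_table[vector] = fn;
    return 0;
}

/* Model an interrupt from the device with the given number:
   look up its handler and transfer control to it. Returns 0,
   or -1 if no handler is registered for that vector. */
int dispatch(unsigned vector)
{
    if (vector >= NUM_VECTORS || vector_table[vector] == NULL)
        return -1;
    vector_table[vector]();
    return 0;
}
```

After register_handler(1, keyboard_handler), a call to dispatch(1) runs the keyboard handler once, while dispatch(2) returns -1 because nothing has been registered for that device.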
This is a highly simplified description. The operating system must perform a great deal of “bookkeeping” in this transfer of control. For example, before allowing the interrupt handler function to execute, at least any registers that will be used in the function must be saved. And more than one process may be waiting for I/O to complete. The operating system must keep track of which process is waiting for which I/O device and make sure that the process gets or sends the correct input or output.
Many other issues face the device handler programmer. For example, I/O devices are left to run on their own time, so one device may attempt to interrupt while another device’s interrupt handling function is being executed. The programmer must decide whether the interrupt should be allowed or not. In general, it cannot be ignored because this would cause the loss of I/O data. On the other hand, spending too much time handling the second interrupt may cause the first device to lose data.
16.8 I/O Instructions
opcode | source | destination | page |
ins | $imm/%dx | %al/%ax/%eax | 890 |
outs | %al/%ax/%eax | $imm/%dx | 891 |
s = b, w, l
16.9 Exercises
16-1 (§16.6) Enter the C program in Listing 16.1. Compile it to the assembly language stage (use the -S option) with different levels of optimization, for example -O1 and -O2. Compare the results with the non-optimized version in Listing 16.2.
16-2 (§16.6) Enter the C program in Listing 16.3. Compile it to the assembly language stage (use the -S option) with different levels of optimization, for example -O1 and -O2. Compare the results with the non-optimized version in Listing 16.4.