xv6 Source Code
xv6 is a reimplementation of Dennis Ritchie's and Ken Thompson's Unix
Version 6 (v6). xv6 loosely follows the structure and style of v6,
but is implemented for a modern x86-based multiprocessor using ANSI C.

ACKNOWLEDGMENTS

xv6 is inspired by John Lions's Commentary on UNIX 6th Edition (Peer
to Peer Communications; ISBN: 1573980137; 1st edition (June 14,
2000)). See also http://pdos.csail.mit.edu/6.828/2007/v6.html, which
provides pointers to online resources for v6.

xv6 borrows code from the following sources:
    JOS (asm.h, elf.h, mmu.h, bootasm.S, ide.c, console.c, and others)
    Plan 9 (entryother.S, mp.h, mp.c, lapic.c)
    FreeBSD (ioapic.c)
    NetBSD (console.c)

The following people made contributions:
    Russ Cox (context switching, locking)
    Cliff Frey (MP)
    Xiao Yu (MP)
    Nickolai Zeldovich
    Austin Clements

In addition, we are grateful for the patches contributed by Greg
Price, Yandong Mao, and Hitoshi Mitake.

The code in the files that constitute xv6 is
Copyright 2006-2011 Frans Kaashoek, Robert Morris, and Russ Cox.

ERROR REPORTS

To run xv6, you can use Bochs or QEMU, both PC simulators.
Bochs makes debugging easier, but QEMU is much faster.
To run in Bochs, run "make bochs" and then type "c" at the bochs prompt.
To run in QEMU, run "make qemu".

The numbers to the left of the file names in the table are sheet numbers.
The source code has been printed in a double column format with fifty
lines per column, giving one hundred lines per sheet (or page).
Thus there is a convenient relationship between line numbers and sheet numbers.

# basic headers
01 types.h
01 param.h
02 memlayout.h
02 defs.h
04 x86.h
06 asm.h
07 mmu.h
09 elf.h

# entering xv6
10 entry.S
11 entryother.S
12 main.c

# locks
14 spinlock.h
14 spinlock.c

# processes
16 vm.c
20 proc.h
21 proc.c
26 swtch.S
27 kalloc.c

# system calls
28 traps.h
28 vectors.pl
29 trapasm.S
29 trap.c
31 syscall.h
31 syscall.c
33 sysproc.c

# file system
34 buf.h
34 fcntl.h
35 stat.h
35 fs.h
36 file.h
36 ide.c
38 bio.c
40 log.c
42 fs.c
50 file.c
52 sysfile.c
57 exec.c

# pipes
58 pipe.c

# string operations
59 string.c

# low-level hardware
61 mp.h
62 mp.c
64 lapic.c
66 ioapic.c
67 picirq.c
68 kbd.h
69 kbd.c
70 console.c
73 timer.c
74 uart.c

# user-level
75 initcode.S
75 usys.S
76 init.c
76 sh.c

# bootloader
82 bootasm.S
83 bootmain.c

For example, the cross-reference entry

    swtch 2658
        0374 2428 2466 2657 2658

indicates that swtch is defined on line 2658 and is mentioned on five lines
on sheets 03, 24, and 26.
4401 4427 4452 4455 4461 2833 3034 7442 7443 KERNBASE 0207 6525 6556 6557 6559 6568
4487 4488 4502 4534 4552 IRQ_ERROR 2835 0207 0208 0212 0213 0217 6569
4574 4610 4654 4685 4702 2835 6477 0218 0220 0221 1321 1533 lcr3 0590
4752 4811 4812 4852 4856 IRQ_IDE 2834 1832 1889 2730 0590 1764 1779
4953 4956 4988 4995 5316 2834 3023 3027 3706 3707 KERNLINK 0208 lgdt 0512
5361 5403 5456 5460 5506 IRQ_KBD 2832 0208 1727 0512 0520 1133 1633 8241
5554 5569 5604 5716 7251 2832 3030 7325 7326 KEY_DEL 6828 lidt 0526
7301 IRQ_SLAVE 6710 6828 6869 6891 6915 0526 0534 2981
INPUT_BUF 7200 6710 6714 6752 6767 KEY_DN 6822 LINT0 6433
7200 7203 7224 7236 7238 IRQ_SPURIOUS 2836 6822 6865 6887 6911 6433 6468
7240 7268 2836 3039 6457 KEY_END 6820 LINT1 6434
insl 0462 IRQ_TIMER 2831 6820 6868 6890 6914 6434 6469
0462 0464 3767 8373 2831 3014 3073 6464 7380 KEY_HOME 6819 LIST 7660
install_trans 4071 isdirempty 5361 6819 6868 6890 6914 7660 7740 7907 8183
4071 4119 4140 5361 5368 5427 KEY_INS 6827 listcmd 7690 7901
INT_DISABLED 6619 ismp 6215 6827 6869 6891 6915 7690 7711 7741 7901 7903
6619 6667 0338 1233 6215 6312 6320 KEY_LF 6823 8046 8157 8184
ioapic 6627 6340 6343 6655 6675 6823 6867 6889 6913 loadgs 0551
6307 6329 6330 6624 6627 itrunc 4654 KEY_PGDN 6826 0551 1634
6636 6637 6643 6644 6658 4274 4561 4654 6826 6866 6888 6912 loaduvm 1803
IOAPIC 6608 iunlock 4534 KEY_PGUP 6825 0426 1803 1809 1812 5745
6608 6658 0293 4534 4537 4576 4971 6825 6866 6888 6912 log 4040 4050
ioapicenable 6673 5084 5114 5178 5334 5532 KEY_RT 6824 4040 4050 4061 4063 4064
0309 3707 6673 7326 7443 5613 7256 7305 6824 6867 6889 6913 4065 4075 4076 4077 4089
ioapicid 6217 iunlockput 4574 KEY_UP 6821 4092 4093 4094 4104 4107
0310 6217 6330 6347 6661 0294 4574 4966 4975 4978 6821 6865 6887 6911 4108 4109 4120 4127 4128
6662 5327 5340 5343 5354 5428 kfree 2756 4129 4131 4132 4138 4141
ioapicinit 6651 5439 5443 5451 5468 5472 0316 1871 1873 1893 1896 4145 4146 4147 4148 4163
0311 1224 6651 6662 5496 5521 5529 5561 5582 2265 2369 2747 2756 2761 4165 4168 4169 4172 4173
ioapicread 6634 5610 5748 5797 5852 5873 4177 4178
6634 6659 6660 iupdate 4427 kill 2575 logheader 4035
ioapicwrite 6641 0295 4427 4563 4680 4778 0362 2575 3059 3334 7567 4035 4046 4057 4058 4090
6641 6667 6668 6681 6682 5333 5353 5437 5442 5483 kinit 2740 4105
IO_PIC1 6707 5487 0317 1236 2740 LOGSIZE 0160
6707 6720 6735 6744 6747 I_VALID 3628 KSTACKSIZE 0151 0160 4037 4163 5167
6752 6762 6776 6777 3628 4516 4526 4555 0151 1054 1063 1300 1775 log_write 4159
IO_PIC2 6708 kalloc 2777 2177 0333 4159 4295 4318 4344
6708 6721 6736 6765 6766 0315 1792 1794 1839 1846 kvmalloc 1753 4416 4440 4630 4772
6767 6770 6779 6780 1923 1931 1934 2173 2209 0418 1218 1753 ltr 0538
IO_RTC 6535 2777 5731 5829 lapiceoi 6522 0538 0540 1776
6535 6548 6549 KBDATAP 6804 0326 3021 3025 3032 3036 mappages 1679
IO_TIMER1 7359 6804 6967 3042 6522 1679 1745 1794 1846 1934
7359 7368 7378 7379 kbdgetc 6956 lapicinit 6451 MAXARG 0159
IPB 3582 6956 6998 0327 1220 1256 6451 0159 5622 5714 5760
3582 3585 3591 4412 4433 kbdintr 6996 lapicstartap 6540 MAXARGS 7663
4518 0321 3031 6996 0328 1304 6540 7663 7671 7672 8140
iput 4552 KBS_DIB 6803 lapicw 6444 MAXFILE 3569
0292 2320 4552 4558 4577 6803 6965 6444 6457 6463 6464 6465 3569 4765
4860 4982 5072 5344 5614 KBSTATP 6802 6468 6469 6474 6477 6480 memcmp 5965
IRQ_COM1 2833 6802 6964 6481 6484 6487 6488 6493 0386 5965 6245 6288
Sep 5 23:39 2011 crossreferences Page 7 Sep 5 23:39 2011 crossreferences Page 8
memmove 5981 6238 6264 6268 6271 2418 2557 2580 2619 5490 5494 7063 7105 7112
0387 1285 1795 1933 1982 multiboot_header 1025 NPTENTRIES 0822 7701 7720 7753 7832 7845
4078 4174 4283 4439 4524 1024 1025 0822 1867 8028 8072 8106 8110 8136
4721 4771 4929 4931 5981 namecmp 4803 NSEGS 2001 8141
6004 7173 0296 4803 4825 5418 1611 2001 2008 panicked 7018
memset 5954 namei 4989 nulterminate 8152 7018 7118 7188
0388 1666 1741 1793 1845 0297 2223 4989 5320 5517 8015 8030 8152 8173 8179 parseblock 8101
2190 2213 2733 2764 4294 5606 5720 8180 8185 8186 8191 8101 8106 8125
4414 5432 5629 5954 7175 nameiparent 4996 NUMLOCK 6813 parsecmd 8018
7787 7858 7869 7885 7906 0298 4954 4969 4981 4996 6813 6846 7702 7825 8018
7919 5336 5410 5463 O_CREATE 3453 parseexec 8117
microdelay 6531 namex 4954 3453 5510 8078 8081 8014 8055 8117
0329 6531 6558 6560 6570 4954 4992 4998 O_RDONLY 3450 parseline 8035
7458 NBUF 0155 3450 5520 8075 8012 8024 8035 8046 8108
min 4273 0155 3881 3903 O_RDWR 3452 parsepipe 8051
4273 4720 4770 ncpu 6216 3452 5538 7614 7616 7807 8013 8039 8051 8058
mp 6102 1222 1287 2019 3707 6216 outb 0471 parseredirs 8064
6102 6208 6237 6244 6245 6318 6319 6323 6324 6325 0471 3711 3720 3731 3732 8064 8112 8131 8142
6246 6255 6260 6264 6265 6345 3733 3734 3735 3736 3738 PCINT 6432
6268 6269 6280 6283 6285 NCPU 0152 3741 6353 6354 6548 6549 6432 6474
6287 6294 6304 6310 6350 0152 2018 6213 6720 6721 6735 6736 6744 pde_t 0103
mpbcpu 6220 NDEV 0157 6747 6752 6762 6765 6766 0103 0420 0421 0422 0423
0339 1220 6220 0157 4708 4758 5007 6767 6770 6776 6777 6779 0424 0425 0426 0427 0430
MPBUS 6152 NDIRECT 3567 6780 7160 7162 7178 7179 0431 1210 1270 1317 1610
6152 6333 3567 3569 3578 3624 4615 7180 7181 7377 7378 7379 1654 1656 1679 1733 1736
mpconf 6113 4620 4624 4625 4660 4667 7423 7426 7427 7428 7429 1739 1786 1803 1827 1855
6113 6279 6282 6287 6305 4668 4675 4676 7430 7431 7459 8228 8236 1883 1903 1915 1916 1918
mpconfig 6280 NELEM 0434 8364 8365 8366 8367 8368 1952 1968 2055 5718
6280 6310 0434 1744 2622 3282 5631 8369 PDX 0812
mpenter 1252 nextpid 2116 outsl 0483 0812 1659
1252 1301 2116 2169 0483 0485 3739 PDXSHIFT 0827
mpinit 6301 NFILE 0154 outw 0477 0812 0818 0827 1321
0340 1219 6301 6319 6339 0154 5010 5026 0477 1181 1183 8274 8276 peek 8001
mpioapic 6139 NINDIRECT 3568 O_WRONLY 3451 8001 8025 8040 8044 8056
6139 6307 6329 6331 3568 3569 4622 4670 3451 5537 5538 8078 8081 8069 8105 8109 8124 8132
MPIOAPIC 6153 NINODE 0156 P2V 0218 PGROUNDDOWN 0830
6153 6328 0156 4385 4461 0218 1726 6262 6550 7152 0830 1685 1686 1975
MPIOINTR 6154 NO 6806 panic 7105 7832 PGROUNDUP 0829
6154 6334 6806 6852 6855 6857 6858 0270 1478 1505 1569 1571 0829 1837 1863 2732 2745
MPLINTR 6155 6859 6860 6862 6874 6877 1691 1743 1778 1791 1809 5752
6155 6335 6879 6880 6881 6882 6884 1812 1871 1888 1909 1927 PGSIZE 0823
mpmain 1262 6902 6903 6905 6906 6907 1929 2210 2310 2340 2458 0823 0829 0830 1316 1666
1209 1239 1257 1262 6908 2460 2462 2464 2506 2509 1695 1696 1741 1790 1793
mpproc 6128 NOFILE 0153 2731 2761 3055 3728 3809 1794 1808 1810 1814 1817
6128 6306 6317 6326 0153 2064 2277 2313 5220 3811 3813 3946 3967 3977 1838 1845 1846 1864 1867
MPPROC 6151 5236 4058 4164 4166 4326 4342 1925 1933 1934 1979 1985
6151 6316 NPDENTRIES 0821 4422 4473 4508 4528 4537 2212 2219 2733 2734 2746
mpsearch 6256 0821 1317 1890 4558 4636 4818 4822 4867 2760 2764 5753 5755
6256 6285 NPROC 0150 4875 5043 5058 5117 5184 PHYSTOP 0203
mpsearch1 6238 0150 2111 2161 2329 2362 5189 5368 5426 5434 5477 0203 1728 1742 1743 2746
2760 2955 3004 3006 3008 3051 readsb 4278 0365 1267 2006 2408 2428
picenable 6725 3059 3060 3062 3068 3073 0285 4062 4278 4311 4337 2466
0344 3706 6725 7325 7380 3077 3155 3167 3179 3197 4409 SCROLLLOCK 6814
7442 3210 3226 3279 3281 3283 readsect 8360 6814 6847
picinit 6732 3286 3287 3306 3340 3358 8360 8395 SECTSIZE 8312
0345 1223 6732 3375 3657 4267 4961 5205 readseg 8379 8312 8373 8386 8389 8394
picsetmask 6717 5220 5237 5238 5296 5614 8314 8327 8338 8379 SEG 0769
6717 6727 6783 5615 5633 5639 5664 5704 recover_from_log 4116 0769 1625 1626 1627 1628
pinit 2123 5781 5784 5785 5786 5787 4052 4066 4116 1631
0363 1227 2123 5788 5789 5804 5887 5907 REDIR 7658 SEG16 0773
pipe 5811 6211 6306 6317 6318 6319 7658 7730 7870 8171 0773 1772
0254 0352 0353 0354 3605 6322 7013 7261 7410 redircmd 7675 7864 SEG_ASM 0660
5069 5109 5159 5811 5823 procdump 2604 7675 7713 7731 7864 7866 0660 1190 1191 8284 8285
5829 5835 5839 5843 5861 0364 2604 7220 8075 8078 8081 8159 8172 segdesc 0752
5880 5901 7563 7752 7753 proghdr 0974 REG_ID 6610 0509 0512 0752 0769 0773
PIPE 7659 0974 5717 8320 8334 6610 6660 1611 2008
7659 7750 7886 8177 PTE_ADDR 0844 REG_TABLE 6612 seginit 1616
pipealloc 5821 0844 1661 1813 1869 1892 6612 6667 6668 6681 6682 0417 1221 1255 1616
0351 5659 5821 1930 1961 REG_VER 6611 SEG_KCODE 0741
pipeclose 5861 PTE_P 0833 6611 6659 0741 1150 1625 2972 2973
0352 5069 5861 0833 1319 1321 1660 1670 release 1502 8253
pipecmd 7684 7880 1690 1692 1868 1891 1928 0381 1502 1505 2164 2170 SEG_KCPU 0743
7684 7712 7751 7880 7882 1957 2377 2384 2435 2477 2487 0743 1631 1634 2916
8058 8158 8178 PTE_PS 0840 2519 2532 2568 2586 2590 SEG_KDATA 0742
piperead 5901 0840 1319 1321 2770 2785 3019 3376 3381 0742 1154 1626 1774 2913
0353 5109 5901 pte_t 0847 3394 3759 3778 3833 3928 8258
PIPESIZE 5809 0847 1653 1657 1661 1663 3942 3991 4132 4148 4464 SEG_NULLASM 0654
5809 5813 5886 5894 5916 1683 1806 1857 1905 1919 4480 4492 4514 4542 4560 0654 1189 8283
pipewrite 5880 1954 4569 5029 5033 5045 5060 SEG_TSS 0746
0354 5159 5880 PTE_U 0835 5066 5872 5875 5888 5897 0746 1772 1773 1776
popcli 1566 0835 1670 1794 1846 1910 5908 5919 7101 7248 7262 SEG_UCODE 0744
0383 1521 1566 1569 1571 1934 1959 7282 7309 0744 1627 2214
1780 PTE_W 0834 ROOTDEV 0158 SEG_UDATA 0745
printint 7026 0834 1319 1321 1670 1726 0158 4062 4065 4959 0745 1628 2215
7026 7077 7081 1728 1729 1794 1846 1934 ROOTINO 3556 SETGATE 0921
proc 2053 PTX 0815 3556 4959 0921 2972 2973
0255 0358 0398 0399 0428 0815 1672 run 2711 setupkvm 1734
1205 1458 1606 1638 1769 PTXSHIFT 0826 2611 2711 2712 2717 2758 0420 1734 1755 1923 2209
1775 2015 2030 2053 2059 0815 0818 0826 2767 2779 5731
2106 2111 2114 2154 2157 pushcli 1555 runcmd 7706 SHIFT 6808
2161 2204 2235 2237 2240 0382 1476 1555 1771 7706 7720 7737 7743 7745 6808 6836 6837 6985
2243 2244 2257 2264 2270 rcr2 0582 7759 7766 7777 7825 skipelem 4915
2271 2272 2278 2279 2280 0582 3054 3061 RUNNING 2050 4915 4963
2284 2306 2309 2314 2315 readeflags 0544 2050 2427 2461 2611 3073 sleep 2503
2316 2320 2321 2326 2329 0544 1559 1568 2463 6508 safestrcpy 6032 0367 2389 2503 2506 2509
2330 2338 2355 2362 2363 read_head 4087 0389 2222 2284 5781 6032 2609 3379 3830 3931 4129
2383 2389 2410 2418 2425 4087 4118 sched 2453 4512 5892 5911 7266 7579
2428 2433 2461 2466 2475 readi 4702 0366 2339 2453 2458 2460 spinlock 1401
2505 2523 2524 2528 2555 0299 1818 4702 4821 4866 2462 2464 2476 2525 0256 0367 0377 0379 0380
2557 2577 2580 2615 2619 5112 5367 5368 5726 5737 scheduler 2408 0381 0409 1401 1459 1462
1474 1502 1544 2107 2110 0258 0285 3560 4060 4278 3111 3261 3248 3266 5277
2503 2709 2716 2958 2963 4308 4334 4407 sys_kill 3328 SYS_write 3117
3660 3675 3876 3880 4003 SVR 6415 3237 3256 3328 3117 3266
4041 4268 4384 5005 5009 6415 6457 SYS_kill 3106 taskstate 0851
5807 5812 7008 7021 7202 switchkvm 1762 3106 3256 0851 2007
7406 0429 1254 1756 1762 2429 sys_link 5313 TDCR 6439
STA_R 0669 0786 switchuvm 1769 3238 3269 5313 6439 6463
0669 0786 1190 1625 1627 0428 1769 1778 2244 2426 SYS_link 3120 T_DEV 3502
8284 5789 3120 3269 3502 4707 4757 5578
start 1125 7508 8211 swtch 2658 sys_mkdir 5551 T_DIR 3500
1124 1125 1167 1175 1177 0374 2428 2466 2657 2658 3239 3270 5551 3500 4817 4965 5326 5427
4042 4063 4076 4089 4104 syscall 3275 SYS_mkdir 3121 5435 5485 5520 5557 5609
4173 7507 7508 8210 8211 0400 3007 3157 3275 3121 3270 T_FILE 3501
8267 SYSCALL 7553 7560 7561 7562 7563 75 sys_mknod 5567 3501 5470 5512
startothers 1274 7560 7561 7562 7563 7564 3240 3267 5567 ticks 2964
1208 1235 1274 7565 7566 7567 7568 7569 SYS_mknod 3118 0407 2964 3017 3018 3373
stat 3504 7570 7571 7572 7573 7574 3118 3267 3374 3379 3393
0257 0281 0300 3504 4265 7575 7576 7577 7578 7579 sys_open 5501 tickslock 2963
4685 5079 5203 5304 7603 7580 3241 3265 5501 0409 2963 2975 3016 3019
stati 4685 sys_chdir 5601 SYS_open 3116 3372 3376 3379 3381 3392
0300 4685 5083 3229 3259 5601 3116 3265 3280 3282 3394
STA_W 0668 0785 SYS_chdir 3109 sys_pipe 5651 TICR 6437
0668 0785 1191 1626 1628 3109 3259 3242 3254 5651 6437 6465
1631 8285 sys_close 5289 SYS_pipe 3104 TIMER 6429
STA_X 0665 0782 3230 3271 5289 3104 3254 6429 6464
0665 0782 1190 1625 1627 SYS_close 3122 sys_read 5265 TIMER_16BIT 7371
8284 3122 3271 3243 3255 5265 7371 7377
sti 0563 sys_dup 5251 SYS_read 3105 TIMER_DIV 7366
0563 0565 1573 2414 3231 3260 5251 3105 3255 7366 7378 7379
stosb 0492 SYS_dup 3110 sys_sbrk 3351 TIMER_FREQ 7365
0492 0494 5960 8340 3110 3260 3244 3262 3351 7365 7366
stosl 0501 sys_exec 5620 SYS_sbrk 3112 timerinit 7374
0501 0503 5958 3232 3257 5620 3112 3262 0403 1234 7374
strlen 6051 SYS_exec 3107 sys_sleep 3365 TIMER_MODE 7368
0390 5762 5763 6051 7819 3107 3257 7512 3245 3263 3365 7368 7377
8023 sys_exit 3315 SYS_sleep 3113 TIMER_RATEGEN 7370
strncmp 6008 3233 3252 3315 3113 3263 7370 7377
0391 4805 6008 SYS_exit 3102 sys_unlink 5401 TIMER_SEL0 7369
strncpy 6018 3102 3252 7517 3246 3268 5401 7369 7377
0392 4872 6018 sys_fork 3309 SYS_unlink 3119 T_IRQ0 2829
STS_IG32 0800 3234 3251 3309 3119 3268 2829 3014 3023 3027 3030
0800 0927 SYS_fork 3101 sys_uptime 3388 3034 3038 3039 3073 6457
STS_T32A 0797 3101 3251 3249 3264 3388 6464 6477 6667 6681 6747
0797 1772 sys_fstat 5301 SYS_uptime 3114 6766
STS_TG32 0801 3235 3258 5301 3114 3264 TPR 6413
0801 0927 SYS_fstat 3108 sys_wait 3322 6413 6493
sum 6226 3108 3258 3247 3253 3322 trap 3001
6226 6228 6230 6232 6233 sys_getpid 3338 SYS_wait 3103 2852 2854 2922 3001 3053
6245 6292 3236 3261 3338 3103 3253 3055 3058
superblock 3560 SYS_getpid 3111 sys_write 5277 trapframe 0602
0100 typedef unsigned int   uint;
0101 typedef unsigned short ushort;
0102 typedef unsigned char  uchar;
0103 typedef uint pde_t;

0150 #define NPROC        64  // maximum number of processes
0151 #define KSTACKSIZE 4096  // size of per-process kernel stack
0152 #define NCPU          8  // maximum number of CPUs
0153 #define NOFILE       16  // open files per process
0154 #define NFILE       100  // open files per system
0155 #define NBUF         10  // size of disk block cache
0156 #define NINODE       50  // maximum number of active inodes
0157 #define NDEV         10  // maximum major device number
0158 #define ROOTDEV       1  // device number of file system root disk
0159 #define MAXARG       32  // max exec arguments
0160 #define LOGSIZE      10  // max data sectors in on-disk log

Sheet 01
Sep 5 23:39 2011 xv6/memlayout.h Page 1 Sep 5 23:39 2011 xv6/defs.h Page 1
Sheet 02 Sheet 02
Sep 5 23:39 2011 xv6/defs.h Page 2 Sep 5 23:39 2011 xv6/defs.h Page 3
Sheet 03 Sheet 03
Sep 5 23:39 2011 xv6/defs.h Page 4

0400 void            syscall(void);
0401
0402 // timer.c
0403 void            timerinit(void);
0404
0405 // trap.c
0406 void            idtinit(void);
0407 extern uint     ticks;
0408 void            tvinit(void);
0409 extern struct spinlock tickslock;
0410
0411 // uart.c
0412 void            uartinit(void);
0413 void            uartintr(void);
0414 void            uartputc(int);
0415
0416 // vm.c
0417 void            seginit(void);
0418 void            kvmalloc(void);
0419 void            vmenable(void);
0420 pde_t*          setupkvm(char* (*alloc)());
0421 char*           uva2ka(pde_t*, char*);
0422 int             allocuvm(pde_t*, uint, uint);
0423 int             deallocuvm(pde_t*, uint, uint);
0424 void            freevm(pde_t*);
0425 void            inituvm(pde_t*, char*, uint);
0426 int             loaduvm(pde_t*, char*, struct inode*, uint, uint);
0427 pde_t*          copyuvm(pde_t*, uint);
0428 void            switchuvm(struct proc*);
0429 void            switchkvm(void);
0430 int             copyout(pde_t*, uint, void*, uint);
0431 void            clearpteu(pde_t *pgdir, char *uva);
0432
0433 // number of elements in fixed-size array
0434 #define NELEM(x) (sizeof(x)/sizeof((x)[0]))

Sep 5 23:39 2011 xv6/x86.h Page 1

0450 // Routines to let C code use special x86 instructions.
0451
0452 static inline uchar
0453 inb(ushort port)
0454 {
0455   uchar data;
0456
0457   asm volatile("in %1,%0" : "=a" (data) : "d" (port));
0458   return data;
0459 }
0460
0461 static inline void
0462 insl(int port, void *addr, int cnt)
0463 {
0464   asm volatile("cld; rep insl" :
0465                "=D" (addr), "=c" (cnt) :
0466                "d" (port), "0" (addr), "1" (cnt) :
0467                "memory", "cc");
0468 }
0469
0470 static inline void
0471 outb(ushort port, uchar data)
0472 {
0473   asm volatile("out %0,%1" : : "a" (data), "d" (port));
0474 }
0475
0476 static inline void
0477 outw(ushort port, ushort data)
0478 {
0479   asm volatile("out %0,%1" : : "a" (data), "d" (port));
0480 }
0481
0482 static inline void
0483 outsl(int port, const void *addr, int cnt)
0484 {
0485   asm volatile("cld; rep outsl" :
0486                "=S" (addr), "=c" (cnt) :
0487                "d" (port), "0" (addr), "1" (cnt) :
0488                "cc");
0489 }
0490
0491 static inline void
0492 stosb(void *addr, int data, int cnt)
0493 {
0494   asm volatile("cld; rep stosb" :
0495                "=D" (addr), "=c" (cnt) :
0496                "0" (addr), "1" (cnt), "a" (data) :
0497                "memory", "cc");
0498 }

Sheet 04
Sep 5 23:39 2011 xv6/x86.h Page 2 Sep 5 23:39 2011 xv6/x86.h Page 3
Sheet 05 Sheet 05
Sep 5 23:39 2011 xv6/x86.h Page 4

0600 // Layout of the trap frame built on the stack by the
0601 // hardware and by trapasm.S, and passed to trap().
0602 struct trapframe {
0603   // registers as pushed by pusha
0604   uint edi;
0605   uint esi;
0606   uint ebp;
0607   uint oesp;      // useless & ignored
0608   uint ebx;
0609   uint edx;
0610   uint ecx;
0611   uint eax;
0612
0613   // rest of trap frame
0614   ushort gs;
0615   ushort padding1;
0616   ushort fs;
0617   ushort padding2;
0618   ushort es;
0619   ushort padding3;
0620   ushort ds;
0621   ushort padding4;
0622   uint trapno;
0623
0624   // below here defined by x86 hardware
0625   uint err;
0626   uint eip;
0627   ushort cs;
0628   ushort padding5;
0629   uint eflags;
0630
0631   // below here only when crossing rings, such as from user to kernel
0632   uint esp;
0633   ushort ss;
0634   ushort padding6;
0635 };

Sep 5 23:39 2011 xv6/asm.h Page 1

0650 //
0651 // assembler macros to create x86 segments
0652 //
0653
0654 #define SEG_NULLASM                                             \
0655         .word 0, 0;                                             \
0656         .byte 0, 0, 0, 0
0657
0658 // The 0xC0 means the limit is in 4096-byte units
0659 // and (for executable segments) 32-bit mode.
0660 #define SEG_ASM(type,base,lim)                                  \
0661         .word (((lim) >> 12) & 0xffff), ((base) & 0xffff);      \
0662         .byte (((base) >> 16) & 0xff), (0x90 | (type)),         \
0663                 (0xC0 | (((lim) >> 28) & 0xf)), (((base) >> 24) & 0xff)
0664
0665 #define STA_X     0x8       // Executable segment
0666 #define STA_E     0x4       // Expand down (non-executable segments)
0667 #define STA_C     0x4       // Conforming code segment (executable only)
0668 #define STA_W     0x2       // Writeable (non-executable segments)
0669 #define STA_R     0x2       // Readable (executable segments)
0670 #define STA_A     0x1       // Accessed

Sheet 06
Sep 5 23:39 2011 xv6/mmu.h Page 1

0700 // This file contains definitions for the
0701 // x86 memory management unit (MMU).
0702
0703 // Eflags register
0704 #define FL_CF           0x00000001      // Carry Flag
0705 #define FL_PF           0x00000004      // Parity Flag
0706 #define FL_AF           0x00000010      // Auxiliary carry Flag
0707 #define FL_ZF           0x00000040      // Zero Flag
0708 #define FL_SF           0x00000080      // Sign Flag
0709 #define FL_TF           0x00000100      // Trap Flag
0710 #define FL_IF           0x00000200      // Interrupt Enable
0711 #define FL_DF           0x00000400      // Direction Flag
0712 #define FL_OF           0x00000800      // Overflow Flag
0713 #define FL_IOPL_MASK    0x00003000      // I/O Privilege Level bitmask
0714 #define FL_IOPL_0       0x00000000      //   IOPL == 0
0715 #define FL_IOPL_1       0x00001000      //   IOPL == 1
0716 #define FL_IOPL_2       0x00002000      //   IOPL == 2
0717 #define FL_IOPL_3       0x00003000      //   IOPL == 3
0718 #define FL_NT           0x00004000      // Nested Task
0719 #define FL_RF           0x00010000      // Resume Flag
0720 #define FL_VM           0x00020000      // Virtual 8086 mode
0721 #define FL_AC           0x00040000      // Alignment Check
0722 #define FL_VIF          0x00080000      // Virtual Interrupt Flag
0723 #define FL_VIP          0x00100000      // Virtual Interrupt Pending
0724 #define FL_ID           0x00200000      // ID flag
0725
0726 // Control Register flags
0727 #define CR0_PE          0x00000001      // Protection Enable
0728 #define CR0_MP          0x00000002      // Monitor coProcessor
0729 #define CR0_EM          0x00000004      // Emulation
0730 #define CR0_TS          0x00000008      // Task Switched
0731 #define CR0_ET          0x00000010      // Extension Type
0732 #define CR0_NE          0x00000020      // Numeric Error
0733 #define CR0_WP          0x00010000      // Write Protect
0734 #define CR0_AM          0x00040000      // Alignment Mask
0735 #define CR0_NW          0x20000000      // Not Writethrough
0736 #define CR0_CD          0x40000000      // Cache Disable
0737 #define CR0_PG          0x80000000      // Paging
0738
0739 #define CR4_PSE         0x00000010      // Page size extension
0740
0741 #define SEG_KCODE 1  // kernel code
0742 #define SEG_KDATA 2  // kernel data+stack
0743 #define SEG_KCPU  3  // kernel per-cpu data
0744 #define SEG_UCODE 4  // user code
0745 #define SEG_UDATA 5  // user data+stack
0746 #define SEG_TSS   6  // this process's task state

Sep 5 23:39 2011 xv6/mmu.h Page 2

0750 #ifndef __ASSEMBLER__
0751 // Segment Descriptor
0752 struct segdesc {
0753   uint lim_15_0 : 16;  // Low bits of segment limit
0754   uint base_15_0 : 16; // Low bits of segment base address
0755   uint base_23_16 : 8; // Middle bits of segment base address
0756   uint type : 4;       // Segment type (see STS_ constants)
0757   uint s : 1;          // 0 = system, 1 = application
0758   uint dpl : 2;        // Descriptor Privilege Level
0759   uint p : 1;          // Present
0760   uint lim_19_16 : 4;  // High bits of segment limit
0761   uint avl : 1;        // Unused (available for software use)
0762   uint rsv1 : 1;       // Reserved
0763   uint db : 1;         // 0 = 16-bit segment, 1 = 32-bit segment
0764   uint g : 1;          // Granularity: limit scaled by 4K when set
0765   uint base_31_24 : 8; // High bits of segment base address
0766 };
0767
0768 // Normal segment
0769 #define SEG(type, base, lim, dpl) (struct segdesc)    \
0770 { ((lim) >> 12) & 0xffff, (uint)(base) & 0xffff,      \
0771   ((uint)(base) >> 16) & 0xff, type, 1, dpl, 1,       \
0772   (uint)(lim) >> 28, 0, 0, 1, 1, (uint)(base) >> 24 }
0773 #define SEG16(type, base, lim, dpl) (struct segdesc)  \
0774 { (lim) & 0xffff, (uint)(base) & 0xffff,              \
0775   ((uint)(base) >> 16) & 0xff, type, 1, dpl, 1,       \
0776   (uint)(lim) >> 16, 0, 0, 1, 0, (uint)(base) >> 24 }
0777 #endif
0778
0779 #define DPL_USER    0x3     // User DPL
0780
0781 // Application segment type bits
0782 #define STA_X       0x8     // Executable segment
0783 #define STA_E       0x4     // Expand down (non-executable segments)
0784 #define STA_C       0x4     // Conforming code segment (executable only)
0785 #define STA_W       0x2     // Writeable (non-executable segments)
0786 #define STA_R       0x2     // Readable (executable segments)
0787 #define STA_A       0x1     // Accessed
0788
0789 // System segment type bits
0790 #define STS_T16A    0x1     // Available 16-bit TSS
0791 #define STS_LDT     0x2     // Local Descriptor Table
0792 #define STS_T16B    0x3     // Busy 16-bit TSS
0793 #define STS_CG16    0x4     // 16-bit Call Gate
0794 #define STS_TG      0x5     // Task Gate / Coum Transmitions
0795 #define STS_IG16    0x6     // 16-bit Interrupt Gate
0796 #define STS_TG16    0x7     // 16-bit Trap Gate
0797 #define STS_T32A    0x9     // Available 32-bit TSS
0798 #define STS_T32B    0xB     // Busy 32-bit TSS
0799 #define STS_CG32    0xC     // 32-bit Call Gate

Sheet 07
Sep 5 23:39 2011 xv6/mmu.h Page 3

0800 #define STS_IG32    0xE     // 32-bit Interrupt Gate
0801 #define STS_TG32    0xF     // 32-bit Trap Gate
0802
0803 // A virtual address 'la' has a three-part structure as follows:
0804 //
0805 // +--------10------+-------10-------+---------12----------+
0806 // | Page Directory |   Page Table   | Offset within Page  |
0807 // |      Index     |      Index     |                     |
0808 // +----------------+----------------+---------------------+
0809 //  \--- PDX(va) --/ \--- PTX(va) --/
0810
0811 // page directory index
0812 #define PDX(va)         (((uint)(va) >> PDXSHIFT) & 0x3FF)
0813
0814 // page table index
0815 #define PTX(va)         (((uint)(va) >> PTXSHIFT) & 0x3FF)
0816
0817 // construct virtual address from indexes and offset
0818 #define PGADDR(d, t, o) ((uint)((d) << PDXSHIFT | (t) << PTXSHIFT | (o)))
0819
0820 // Page directory and page table constants.
0821 #define NPDENTRIES      1024    // # directory entries per page directory
0822 #define NPTENTRIES      1024    // # PTEs per page table
0823 #define PGSIZE          4096    // bytes mapped by a page
0824
0825 #define PGSHIFT         12      // log2(PGSIZE)
0826 #define PTXSHIFT        12      // offset of PTX in a linear address
0827 #define PDXSHIFT        22      // offset of PDX in a linear address
0828
0829 #define PGROUNDUP(sz)  (((sz)+PGSIZE-1) & ~(PGSIZE-1))
0830 #define PGROUNDDOWN(a) (((a)) & ~(PGSIZE-1))
0831
0832 // Page table/directory entry flags.
0833 #define PTE_P           0x001   // Present
0834 #define PTE_W           0x002   // Writeable
0835 #define PTE_U           0x004   // User
0836 #define PTE_PWT         0x008   // Write-Through
0837 #define PTE_PCD         0x010   // Cache-Disable
0838 #define PTE_A           0x020   // Accessed
0839 #define PTE_D           0x040   // Dirty
0840 #define PTE_PS          0x080   // Page Size
0841 #define PTE_MBZ         0x180   // Bits must be zero
0842
0843 // Address in page table or page directory entry
0844 #define PTE_ADDR(pte)   ((uint)(pte) & ~0xFFF)
0845
0846 #ifndef __ASSEMBLER__
0847 typedef uint pte_t;

Sep 5 23:39 2011 xv6/mmu.h Page 4

0850 // Task state segment format
0851 struct taskstate {
0852   uint link;         // Old ts selector
0853   uint esp0;         // Stack pointers and segment selectors
0854   ushort ss0;        //   after an increase in privilege level
0855   ushort padding1;
0856   uint *esp1;
0857   ushort ss1;
0858   ushort padding2;
0859   uint *esp2;
0860   ushort ss2;
0861   ushort padding3;
0862   void *cr3;         // Page directory base
0863   uint *eip;         // Saved state from last task switch
0864   uint eflags;
0865   uint eax;          // More saved state (registers)
0866   uint ecx;
0867   uint edx;
0868   uint ebx;
0869   uint *esp;
0870   uint *ebp;
0871   uint esi;
0872   uint edi;
0873   ushort es;         // Even more saved state (segment selectors)
0874   ushort padding4;
0875   ushort cs;
0876   ushort padding5;
0877   ushort ss;
0878   ushort padding6;
0879   ushort ds;
0880   ushort padding7;
0881   ushort fs;
0882   ushort padding8;
0883   ushort gs;
0884   ushort padding9;
0885   ushort ldt;
0886   ushort padding10;
0887   ushort t;          // Trap on task switch
0888   ushort iomb;       // I/O map base address
0889 };

Sheet 08
Sep 5 23:39 2011 xv6/mmu.h Page 5

0900 // Gate descriptors for interrupts and traps
0901 struct gatedesc {
0902   uint off_15_0 : 16;   // low 16 bits of offset in segment
0903   uint cs : 16;         // code segment selector
0904   uint args : 5;        // # args, 0 for interrupt/trap gates
0905   uint rsv1 : 3;        // reserved(should be zero I guess)
0906   uint type : 4;        // type(STS_{TG,IG32,TG32})
0907   uint s : 1;           // must be 0 (system)
0908   uint dpl : 2;         // descriptor(meaning new) privilege level
0909   uint p : 1;           // Present
0910   uint off_31_16 : 16;  // high bits of offset in segment
0911 };
0912
0913 // Set up a normal interrupt/trap gate descriptor.
0914 // - istrap: 1 for a trap (= exception) gate, 0 for an interrupt gate.
0915 //   interrupt gate clears FL_IF, trap gate leaves FL_IF alone
0916 // - sel: Code segment selector for interrupt/trap handler
0917 // - off: Offset in code segment for interrupt/trap handler
0918 // - dpl: Descriptor Privilege Level -
0919 //        the privilege level required for software to invoke
0920 //        this interrupt/trap gate explicitly using an int instruction.
0921 #define SETGATE(gate, istrap, sel, off, d)                \
0922 {                                                         \
0923   (gate).off_15_0 = (uint)(off) & 0xffff;                 \
0924   (gate).cs = (sel);                                      \
0925   (gate).args = 0;                                        \
0926   (gate).rsv1 = 0;                                        \
0927   (gate).type = (istrap) ? STS_TG32 : STS_IG32;           \
0928   (gate).s = 0;                                           \
0929   (gate).dpl = (d);                                       \
0930   (gate).p = 1;                                           \
0931   (gate).off_31_16 = (uint)(off) >> 16;                   \
0932 }
0933
0934 #endif

Sep 5 23:39 2011 xv6/elf.h Page 1

0950 // Format of an ELF executable file
0951
0952 #define ELF_MAGIC 0x464C457FU  // "\x7FELF" in little endian
0953
0954 // File header
0955 struct elfhdr {
0956   uint magic;  // must equal ELF_MAGIC
0957   uchar elf[12];
0958   ushort type;
0959   ushort machine;
0960   uint version;
0961   uint entry;
0962   uint phoff;
0963   uint shoff;
0964   uint flags;
0965   ushort ehsize;
0966   ushort phentsize;
0967   ushort phnum;
0968   ushort shentsize;
0969   ushort shnum;
0970   ushort shstrndx;
0971 };
0972
0973 // Program section header
0974 struct proghdr {
0975   uint type;
0976   uint off;
0977   uint vaddr;
0978   uint paddr;
0979   uint filesz;
0980   uint memsz;
0981   uint flags;
0982   uint align;
0983 };
0984
0985 // Values for Proghdr type
0986 #define ELF_PROG_LOAD           1
0987
0988 // Flag bits for Proghdr flags
0989 #define ELF_PROG_FLAG_EXEC      1
0990 #define ELF_PROG_FLAG_WRITE     2
0991 #define ELF_PROG_FLAG_READ      4

Sheet 09
Sep 5 23:39 2011 xv6/entry.S Page 1 Sep 5 23:39 2011 xv6/entry.S Page 2
1000 # Multiboot header, for multiboot boot loaders like GNU Grub. 1050 orl $(CR0_PG|CR0_WP), %eax
1001 # http://www.gnu.org/software/grub/manual/multiboot/multiboot.html 1051 movl %eax, %cr0
1002 # 1052
1003 # Using GRUB 2, you can boot xv6 from a file stored in a 1053 # Set up the stack pointer.
1004 # Linux file system by copying kernel or kernelmemfs to /boot 1054 movl $(stack + KSTACKSIZE), %esp
1005 # and then adding this menu entry: 1055
1006 # 1056 # Jump to main(), and switch to executing at
1007 # menuentry "xv6" { 1057 # high addresses. The indirect call is needed because
1008 # insmod ext2 1058 # the assembler produces a PC-relative instruction
1009 # set root=(hd0,msdos1) 1059 # for a direct jump.
1010 # set kernel=/boot/kernel 1060 mov $main, %eax
1011 # echo "Loading ${kernel}..." 1061 jmp *%eax
1012 # multiboot ${kernel} ${kernel} 1062
1013 # boot 1063 .comm stack, KSTACKSIZE
1014 # } 1064
1015 1065
1016 #include "asm.h" 1066
1017 #include "memlayout.h" 1067
1018 #include "mmu.h" 1068
1019 #include "param.h" 1069
1020 1070
1021 # Multiboot header. Data to direct multiboot loader. 1071
1022 .p2align 2 1072
1023 .text 1073
1024 .globl multiboot_header 1074
1025 multiboot_header: 1075
1026 #define magic 0x1badb002 1076
1027 #define flags 0 1077
1028 .long magic 1078
1029 .long flags 1079
1030 .long (-magic-flags) 1080
1031 1081
1032 # By convention, the _start symbol specifies the ELF entry point. 1082
1033 # Since we haven't set up virtual memory yet, our entry point is 1083
1034 # the physical address of entry. 1084
1035 .globl _start 1085
1036 _start = V2P_WO(entry) 1086
1037 1087
1038 # Entering xv6 on boot processor. Machine is mostly set up. 1088
1039 .globl entry 1089
1040 entry: 1090
1041 # Turn on page size extension for 4Mbyte pages 1091
1042 movl %cr4, %eax 1092
1043 orl $(CR4_PSE), %eax 1093
1044 movl %eax, %cr4 1094
1045 # Set page directory 1095
1046 movl $(V2P_WO(entrypgdir)), %eax 1096
1047 movl %eax, %cr3 1097
1048 # Turn on paging. 1098
1049 movl %cr0, %eax 1099
Sheet 10 Sheet 10
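The multiboot header in entry.S ends with a third word, (-magic-flags). The Multiboot specification requires the magic, flags, and checksum words to sum to zero modulo 2^32, which is what that expression arranges. A minimal sketch of the arithmetic in C follows; the helper names multiboot_header_ok and multiboot_checksum are hypothetical and do not appear in xv6.

```c
#include <assert.h>
#include <stdint.h>

#define MULTIBOOT_MAGIC 0x1badb002u  /* same value as `magic` in entry.S */

/* Hypothetical helper: returns 1 if the three header words sum to
   zero in 32-bit arithmetic, 0 otherwise. */
static int multiboot_header_ok(uint32_t magic, uint32_t flags, uint32_t checksum)
{
  return (uint32_t)(magic + flags + checksum) == 0u;
}

/* Hypothetical helper: the checksum word for a given magic and flags,
   i.e. the C equivalent of the assembler expression (-magic-flags). */
static uint32_t multiboot_checksum(uint32_t magic, uint32_t flags)
{
  uint32_t sum = (uint32_t)(0u - magic - flags);
  assert(multiboot_header_ok(magic, flags, sum));  /* postcondition */
  return sum;
}
```

With flags set to 0, as in entry.S, the checksum word is simply the two's complement of the magic number.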
Sep 5 23:39 2011 xv6/entryother.S Page 1 Sep 5 23:39 2011 xv6/entryother.S Page 2
Sheet 11 Sheet 11
Sep 5 23:39 2011 xv6/main.c Page 1 Sep 5 23:39 2011 xv6/main.c Page 2
1200 #include "types.h" 1250 // Other CPUs jump here from entryother.S.
1201 #include "defs.h" 1251 static void
1202 #include "param.h" 1252 mpenter(void)
1203 #include "memlayout.h" 1253 {
1204 #include "mmu.h" 1254 switchkvm();
1205 #include "proc.h" 1255 seginit();
1206 #include "x86.h" 1256 lapicinit(cpunum());
1207 1257 mpmain();
1208 static void startothers(void); 1258 }
1209 static void mpmain(void) __attribute__((noreturn)); 1259
1210 extern pde_t *kpgdir; 1260 // Common CPU setup code.
1211 1261 static void
1212 // Bootstrap processor starts running C code here. 1262 mpmain(void)
1213 // Allocate a real stack and switch to it, first 1263 {
1214 // doing some setup required for memory allocator to work. 1264 cprintf("cpu%d: starting\n", cpu->id);
1215 int 1265 idtinit(); // load idt register
1216 main(void) 1266 xchg(&cpu->started, 1); // tell startothers() we're up
1217 { 1267 scheduler(); // start running processes
1218 kvmalloc(); // kernel page table 1268 }
1219 mpinit(); // collect info about this machine 1269
1220 lapicinit(mpbcpu()); 1270 pde_t entrypgdir[]; // For entry.S
1221 seginit(); // set up segments 1271
1222 cprintf("\ncpu%d: starting xv6\n\n", cpu->id); 1272 // Start the non-boot (AP) processors.
1223 picinit(); // interrupt controller 1273 static void
1224 ioapicinit(); // another interrupt controller 1274 startothers(void)
1225 consoleinit(); // I/O devices & their interrupts 1275 {
1226 uartinit(); // serial port 1276 extern uchar _binary_entryother_start[], _binary_entryother_size[];
1227 pinit(); // process table 1277 uchar *code;
1228 tvinit(); // trap vectors 1278 struct cpu *c;
1229 binit(); // buffer cache 1279 char *stack;
1230 fileinit(); // file table 1280
1231 iinit(); // inode cache 1281 // Write entry code to unused memory at 0x7000.
1232 ideinit(); // disk 1282 // The linker has placed the image of entryother.S in
1233 if(!ismp) 1283 // _binary_entryother_start.
1234 timerinit(); // uniprocessor timer 1284 code = p2v(0x7000);
1235 startothers(); // start other processors (must come before kinit) 1285 memmove(code, _binary_entryother_start, (uint)_binary_entryother_size);
1236 kinit(); // initialize memory allocator 1286
1237 userinit(); // first user process (must come after kinit) 1287 for(c = cpus; c < cpus+ncpu; c++){
1238 // Finish setting up this processor in mpmain. 1288 if(c == cpus+cpunum()) // We've started already.
1239 mpmain(); 1289 continue;
1240 } 1290
1241 1291 // Tell entryother.S what stack to use, where to enter, and what
1242 1292 // pgdir to use. We cannot use kpgdir yet, because the AP processor
1243 1293 // is running in low memory, so we use entrypgdir for the APs too.
1244 1294 // kalloc can return addresses above 4Mbyte (the machine may have
1245 1295 // much more physical memory than 4Mbyte), which aren't mapped by
1246 1296 // entrypgdir, so we must allocate a stack using enter_alloc();
1247 1297 // this introduces the constraint that xv6 cannot use kalloc until
1248 1298 // after these last enter_alloc invocations.
1249 1299 stack = enter_alloc();
Sheet 12 Sheet 12
Sep 5 23:39 2011 xv6/main.c Page 3 Sep 5 23:39 2011 xv6/main.c Page 4
Sheet 13 Sheet 13
Sep 5 23:39 2011 xv6/spinlock.h Page 1 Sep 5 23:39 2011 xv6/spinlock.c Page 1
Sheet 14 Sheet 14
Sep 5 23:39 2011 xv6/spinlock.c Page 2 Sep 5 23:39 2011 xv6/spinlock.c Page 3
1500 // Release the lock. 1550 // Pushcli/popcli are like cli/sti except that they are matched:
1501 void 1551 // it takes two popcli to undo two pushcli. Also, if interrupts
1502 release(struct spinlock *lk) 1552 // are off, then pushcli, popcli leaves them off.
1503 { 1553
1504 if(!holding(lk)) 1554 void
1505 panic("release"); 1555 pushcli(void)
1506 1556 {
1507 lk->pcs[0] = 0; 1557 int eflags;
1508 lk->cpu = 0; 1558
1509 1559 eflags = readeflags();
1510 // The xchg serializes, so that reads before release are 1560 cli();
1511 // not reordered after it. The 1996 PentiumPro manual (Volume 3, 1561 if(cpu->ncli++ == 0)
1512 // 7.2) says reads can be carried out speculatively and in 1562 cpu->intena = eflags & FL_IF;
1513 // any order, which implies we need to serialize here. 1563 }
1514 // But the 2007 Intel 64 Architecture Memory Ordering White 1564
1515 // Paper says that Intel 64 and IA-32 will not move a load 1565 void
1516 // after a store. So lock->locked = 0 would work here. 1566 popcli(void)
1517 // The xchg being asm volatile ensures gcc emits it after 1567 {
1518 // the above assignments (and after the critical section). 1568 if(readeflags()&FL_IF)
1519 xchg(&lk->locked, 0); 1569 panic("popcli - interruptible");
1520 1570 if(--cpu->ncli < 0)
1521 popcli(); 1571 panic("popcli");
1522 } 1572 if(cpu->ncli == 0 && cpu->intena)
1523 1573 sti();
1524 // Record the current call stack in pcs[] by following the %ebp chain. 1574 }
1525 void 1575
1526 getcallerpcs(void *v, uint pcs[]) 1576
1527 { 1577
1528 uint *ebp; 1578
1529 int i; 1579
1530 1580
1531 ebp = (uint*)v - 2; 1581
1532 for(i = 0; i < 10; i++){ 1582
1533 if(ebp == 0 || ebp < (uint*)KERNBASE || ebp == (uint*)0xffffffff) 1583
1534 break; 1584
1535 pcs[i] = ebp[1]; // saved %eip 1585
1536 ebp = (uint*)ebp[0]; // saved %ebp 1586
1537 } 1587
1538 for(; i < 10; i++) 1588
1539 pcs[i] = 0; 1589
1540 } 1590
1541 1591
1542 // Check whether this cpu is holding the lock. 1592
1543 int 1593
1544 holding(struct spinlock *lock) 1594
1545 { 1595
1546 return lock->locked && lock->cpu == cpu; 1596
1547 } 1597
1548 1598
1549 1599
Sheet 15 Sheet 15
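The comment above pushcli says the calls are matched: it takes two popcli to undo two pushcli, and interrupts are re-enabled only if they were on before the outermost pushcli. The sketch below models that bookkeeping in ordinary user-space C. The struct simcpu and the sim_ functions are hypothetical stand-ins: a plain integer replaces the FL_IF bit of %eflags, and assert replaces panic.

```c
#include <assert.h>

/* Simulated per-CPU state; in xv6 these live in struct cpu and %eflags. */
struct simcpu {
  int if_flag;  /* stands in for FL_IF: 1 = interrupts enabled */
  int ncli;     /* depth of pushcli nesting */
  int intena;   /* were interrupts enabled before the outermost pushcli? */
};

static void sim_pushcli(struct simcpu *c)
{
  int was_on = c->if_flag;   /* readeflags() & FL_IF */
  c->if_flag = 0;            /* cli() */
  if(c->ncli++ == 0)
    c->intena = was_on;      /* recorded only at the outermost level */
}

static void sim_popcli(struct simcpu *c)
{
  assert(c->if_flag == 0);   /* xv6 panics here if interrupts are on */
  --c->ncli;
  assert(c->ncli >= 0);      /* xv6 panics on an unmatched popcli */
  if(c->ncli == 0 && c->intena)
    c->if_flag = 1;          /* sti() */
}
```

Starting with interrupts on, two sim_pushcli calls followed by one sim_popcli leave interrupts off; only the second sim_popcli turns them back on.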
Sep 5 23:39 2011 xv6/vm.c Page 1 Sep 5 23:39 2011 xv6/vm.c Page 2
1600 #include "param.h" 1650 // Return the address of the PTE in page table pgdir
1601 #include "types.h" 1651 // that corresponds to virtual address va. If alloc!=0,
1602 #include "defs.h" 1652 // create any required page table pages.
1603 #include "x86.h" 1653 static pte_t *
1604 #include "memlayout.h" 1654 walkpgdir(pde_t *pgdir, const void *va, char* (*alloc)(void))
1605 #include "mmu.h" 1655 {
1606 #include "proc.h" 1656 pde_t *pde;
1607 #include "elf.h" 1657 pte_t *pgtab;
1608 1658
1609 extern char data[]; // defined by kernel.ld 1659 pde = &pgdir[PDX(va)];
1610 pde_t *kpgdir; // for use in scheduler() 1660 if(*pde & PTE_P){
1611 struct segdesc gdt[NSEGS]; 1661 pgtab = (pte_t*)p2v(PTE_ADDR(*pde));
1612 1662 } else {
1613 // Set up CPU's kernel segment descriptors. 1663 if(!alloc || (pgtab = (pte_t*)alloc()) == 0)
1614 // Run once on entry on each CPU. 1664 return 0;
1615 void 1665 // Make sure all those PTE_P bits are zero.
1616 seginit(void) 1666 memset(pgtab, 0, PGSIZE);
1617 { 1667 // The permissions here are overly generous, but they can
1618 struct cpu *c; 1668 // be further restricted by the permissions in the page table
1619 1669 // entries, if necessary.
1620 // Map "logical" addresses to virtual addresses using identity map. 1670 *pde = v2p(pgtab) | PTE_P | PTE_W | PTE_U;
1621 // Cannot share a CODE descriptor for both kernel and user 1671 }
1622 // because it would have to have DPL_USR, but the CPU forbids 1672 return &pgtab[PTX(va)];
1623 // an interrupt from CPL=0 to DPL=3. 1673 }
1624 c = &cpus[cpunum()]; 1674
1625 c->gdt[SEG_KCODE] = SEG(STA_X|STA_R, 0, 0xffffffff, 0); 1675 // Create PTEs for virtual addresses starting at va that refer to
1626 c->gdt[SEG_KDATA] = SEG(STA_W, 0, 0xffffffff, 0); 1676 // physical addresses starting at pa. va and size might not
1627 c->gdt[SEG_UCODE] = SEG(STA_X|STA_R, 0, 0xffffffff, DPL_USER); 1677 // be page-aligned.
1628 c->gdt[SEG_UDATA] = SEG(STA_W, 0, 0xffffffff, DPL_USER); 1678 static int
1629 1679 mappages(pde_t *pgdir, void *va, uint size, uint pa,
1630 // Map cpu, and curproc 1680 int perm, char* (*alloc)(void))
1631 c->gdt[SEG_KCPU] = SEG(STA_W, &c->cpu, 8, 0); 1681 {
1632 1682 char *a, *last;
1633 lgdt(c->gdt, sizeof(c->gdt)); 1683 pte_t *pte;
1634 loadgs(SEG_KCPU << 3); 1684
1635 1685 a = (char*)PGROUNDDOWN((uint)va);
1636 // Initialize cpu-local storage. 1686 last = (char*)PGROUNDDOWN(((uint)va) + size - 1);
1637 cpu = c; 1687 for(;;){
1638 proc = 0; 1688 if((pte = walkpgdir(pgdir, a, alloc)) == 0)
1639 } 1689 return -1;
1640 1690 if(*pte & PTE_P)
1641 1691 panic("remap");
1642 1692 *pte = pa | perm | PTE_P;
1643 1693 if(a == last)
1644 1694 break;
1645 1695 a += PGSIZE;
1646 1696 pa += PGSIZE;
1647 1697 }
1648 1698 return 0;
1649 1699 }
Sheet 16 Sheet 16
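walkpgdir indexes the page directory with PDX(va) and the page table with PTX(va), and mappages rounds its range down to page boundaries before walking it. The sketch below restates that arithmetic as standalone C. It assumes the standard 10/10/12-bit split of 32-bit x86 paging, and the macros are re-declared here for illustration rather than copied from mmu.h. The helper pages_spanned is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PGSIZE    4096u
#define PTXSHIFT  12   /* bit offset of the page table index in a va */
#define PDXSHIFT  22   /* bit offset of the page directory index in a va */

#define PDX(va)         (((uint32_t)(va) >> PDXSHIFT) & 0x3ffu)
#define PTX(va)         (((uint32_t)(va) >> PTXSHIFT) & 0x3ffu)
#define PGROUNDUP(sz)   (((sz) + PGSIZE - 1) & ~(PGSIZE - 1))
#define PGROUNDDOWN(a)  ((a) & ~(PGSIZE - 1))

/* Hypothetical helper: number of PTEs the mappages loop would fill
   for the byte range [va, va+size), using the same first/last
   rounding as the listing. */
static uint32_t pages_spanned(uint32_t va, uint32_t size)
{
  uint32_t first, last;
  assert(size > 0);                    /* size - 1 must not wrap */
  first = PGROUNDDOWN(va);
  last = PGROUNDDOWN(va + size - 1);
  return (last - first) / PGSIZE + 1;
}
```

A two-byte range that straddles a page boundary spans two pages, which is why mappages loops until a == last instead of counting size/PGSIZE iterations.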
Sep 5 23:39 2011 xv6/vm.c Page 3 Sep 5 23:39 2011 xv6/vm.c Page 4
1700 // The mappings from logical to virtual are one to one (i.e., 1750 // Allocate one page table for the machine for the kernel address
1701 // segmentation doesn't do anything). There is one page table per 1751 // space for scheduler processes.
1702 // process, plus one that's used when a CPU is not running any process 1752 void
1703 // (kpgdir). A user process uses the same page table as the kernel; the 1753 kvmalloc(void)
1704 // page protection bits prevent it from accessing kernel memory. 1754 {
1705 // 1755 kpgdir = setupkvm(enter_alloc);
1706 // setupkvm() and exec() set up every page table like this: 1756 switchkvm();
1707 // 0..KERNBASE: user memory (text+data+stack+heap), mapped to some free 1757 }
1708 // phys memory 1758
1709 // KERNBASE..KERNBASE+EXTMEM: mapped to 0..EXTMEM (for I/O space) 1759 // Switch h/w page table register to the kernel-only page table,
1710 // KERNBASE+EXTMEM..KERNBASE+end: mapped to EXTMEM..end kernel, 1760 // for when no process is running.
1711 // w. no write permission 1761 void
1712 // KERNBASE+end..KERNBASE+PHYSTOP: mapped to end..PHYSTOP, 1762 switchkvm(void)
1713 // rw data + free memory 1763 {
1714 // 0xfe000000..0: mapped direct (devices such as ioapic) 1764 lcr3(v2p(kpgdir)); // switch to the kernel page table
1715 // 1765 }
1716 // The kernel allocates memory for its heap and for user memory 1766
1717 // between KERNBASE+end and the end of physical memory (PHYSTOP). 1767 // Switch TSS and h/w page table to correspond to process p.
1718 // The user program sits in the bottom of the address space, and the 1768 void
1719 // kernel at the top at KERNBASE. 1769 switchuvm(struct proc *p)
1720 static struct kmap { 1770 {
1721 void *virt; 1771 pushcli();
1722 uint phys_start; 1772 cpu->gdt[SEG_TSS] = SEG16(STS_T32A, &cpu->ts, sizeof(cpu->ts)-1, 0);
1723 uint phys_end; 1773 cpu->gdt[SEG_TSS].s = 0;
1724 int perm; 1774 cpu->ts.ss0 = SEG_KDATA << 3;
1725 } kmap[] = { 1775 cpu->ts.esp0 = (uint)proc->kstack + KSTACKSIZE;
1726 { P2V(0), 0, 1024*1024, PTE_W}, // I/O space 1776 ltr(SEG_TSS << 3);
1727 { (void*)KERNLINK, V2P(KERNLINK), V2P(data), 0}, // kernel text+rodata 1777 if(p->pgdir == 0)
1728 { data, V2P(data), PHYSTOP, PTE_W}, // kernel data, memory 1778 panic("switchuvm: no pgdir");
1729 { (void*)DEVSPACE, DEVSPACE, 0, PTE_W}, // more devices 1779 lcr3(v2p(p->pgdir)); // switch to new address space
1730 }; 1780 popcli();
1731 1781 }
1732 // Set up kernel part of a page table. 1782
1733 pde_t* 1783 // Load the initcode into address 0 of pgdir.
1734 setupkvm(char* (*alloc)(void)) 1784 // sz must be less than a page.
1735 { 1785 void
1736 pde_t *pgdir; 1786 inituvm(pde_t *pgdir, char *init, uint sz)
1737 struct kmap *k; 1787 {
1738 1788 char *mem;
1739 if((pgdir = (pde_t*)alloc()) == 0) 1789
1740 return 0; 1790 if(sz >= PGSIZE)
1741 memset(pgdir, 0, PGSIZE); 1791 panic("inituvm: more than a page");
1742 if (p2v(PHYSTOP) > (void*)DEVSPACE) 1792 mem = kalloc();
1743 panic("PHYSTOP too high"); 1793 memset(mem, 0, PGSIZE);
1744 for(k = kmap; k < &kmap[NELEM(kmap)]; k++) 1794 mappages(pgdir, 0, PGSIZE, v2p(mem), PTE_W|PTE_U, kalloc);
1745 if(mappages(pgdir, k->virt, k->phys_end - k->phys_start, 1795 memmove(mem, init, sz);
1746 (uint)k->phys_start, k->perm, alloc) < 0) 1796 }
1747 return 0; 1797
1748 return pgdir; 1798
1749 } 1799
Sheet 17 Sheet 17
Sep 5 23:39 2011 xv6/vm.c Page 5 Sep 5 23:39 2011 xv6/vm.c Page 6
1800 // Load a program segment into pgdir. addr must be page-aligned 1850 // Deallocate user pages to bring the process size from oldsz to
1801 // and the pages from addr to addr+sz must already be mapped. 1851 // newsz. oldsz and newsz need not be page-aligned, nor does newsz
1802 int 1852 // need to be less than oldsz. oldsz can be larger than the actual
1803 loaduvm(pde_t *pgdir, char *addr, struct inode *ip, uint offset, uint sz) 1853 // process size. Returns the new process size.
1804 { 1854 int
1805 uint i, pa, n; 1855 deallocuvm(pde_t *pgdir, uint oldsz, uint newsz)
1806 pte_t *pte; 1856 {
1807 1857 pte_t *pte;
1808 if((uint) addr % PGSIZE != 0) 1858 uint a, pa;
1809 panic("loaduvm: addr must be page aligned"); 1859
1810 for(i = 0; i < sz; i += PGSIZE){ 1860 if(newsz >= oldsz)
1811 if((pte = walkpgdir(pgdir, addr+i, 0)) == 0) 1861 return oldsz;
1812 panic("loaduvm: address should exist"); 1862
1813 pa = PTE_ADDR(*pte); 1863 a = PGROUNDUP(newsz);
1814 if(sz - i < PGSIZE) 1864 for(; a < oldsz; a += PGSIZE){
1815 n = sz - i; 1865 pte = walkpgdir(pgdir, (char*)a, 0);
1816 else 1866 if(!pte)
1817 n = PGSIZE; 1867 a += (NPTENTRIES - 1) * PGSIZE;
1818 if(readi(ip, p2v(pa), offset+i, n) != n) 1868 else if((*pte & PTE_P) != 0){
1819 return -1; 1869 pa = PTE_ADDR(*pte);
1820 } 1870 if(pa == 0)
1821 return 0; 1871 panic("kfree");
1822 } 1872 char *v = p2v(pa);
1823 1873 kfree(v);
1824 // Allocate page tables and physical memory to grow process from oldsz to 1874 *pte = 0;
1825 // newsz, which need not be page aligned. Returns new size or 0 on error. 1875 }
1826 int 1876 }
1827 allocuvm(pde_t *pgdir, uint oldsz, uint newsz) 1877 return newsz;
1828 { 1878 }
1829 char *mem; 1879
1830 uint a; 1880 // Free a page table and all the physical memory pages
1831 1881 // in the user part.
1832 if(newsz >= KERNBASE) 1882 void
1833 return 0; 1883 freevm(pde_t *pgdir)
1834 if(newsz < oldsz) 1884 {
1835 return oldsz; 1885 uint i;
1836 1886
1837 a = PGROUNDUP(oldsz); 1887 if(pgdir == 0)
1838 for(; a < newsz; a += PGSIZE){ 1888 panic("freevm: no pgdir");
1839 mem = kalloc(); 1889 deallocuvm(pgdir, KERNBASE, 0);
1840 if(mem == 0){ 1890 for(i = 0; i < NPDENTRIES; i++){
1841 cprintf("allocuvm out of memory\n"); 1891 if(pgdir[i] & PTE_P){
1842 deallocuvm(pgdir, newsz, oldsz); 1892 char * v = p2v(PTE_ADDR(pgdir[i]));
1843 return 0; 1893 kfree(v);
1844 } 1894 }
1845 memset(mem, 0, PGSIZE); 1895 }
1846 mappages(pgdir, (char*)a, PGSIZE, v2p(mem), PTE_W|PTE_U, kalloc); 1896 kfree((char*)pgdir);
1847 } 1897 }
1848 return newsz; 1898
1849 } 1899
Sheet 18 Sheet 18
Sep 5 23:39 2011 xv6/vm.c Page 7 Sep 5 23:39 2011 xv6/vm.c Page 8
1900 // Clear PTE_U on a page. Used to create an inaccessible 1950 // Map user virtual address to kernel address.
1901 // page beneath the user stack. 1951 char*
1902 void 1952 uva2ka(pde_t *pgdir, char *uva)
1903 clearpteu(pde_t *pgdir, char *uva) 1953 {
1904 { 1954 pte_t *pte;
1905 pte_t *pte; 1955
1906 1956 pte = walkpgdir(pgdir, uva, 0);
1907 pte = walkpgdir(pgdir, uva, 0); 1957 if((*pte & PTE_P) == 0)
1908 if(pte == 0) 1958 return 0;
1909 panic("clearpteu"); 1959 if((*pte & PTE_U) == 0)
1910 *pte &= ~PTE_U; 1960 return 0;
1911 } 1961 return (char*)p2v(PTE_ADDR(*pte));
1912 1962 }
1913 // Given a parent process's page table, create a copy 1963
1914 // of it for a child. 1964 // Copy len bytes from p to user address va in page table pgdir.
1915 pde_t* 1965 // Most useful when pgdir is not the current page table.
1916 copyuvm(pde_t *pgdir, uint sz) 1966 // uva2ka ensures this only works for PTE_U pages.
1917 { 1967 int
1918 pde_t *d; 1968 copyout(pde_t *pgdir, uint va, void *p, uint len)
1919 pte_t *pte; 1969 {
1920 uint pa, i; 1970 char *buf, *pa0;
1921 char *mem; 1971 uint n, va0;
1922 1972
1923 if((d = setupkvm(kalloc)) == 0) 1973 buf = (char*)p;
1924 return 0; 1974 while(len > 0){
1925 for(i = 0; i < sz; i += PGSIZE){ 1975 va0 = (uint)PGROUNDDOWN(va);
1926 if((pte = walkpgdir(pgdir, (void *) i, 0)) == 0) 1976 pa0 = uva2ka(pgdir, (char*)va0);
1927 panic("copyuvm: pte should exist"); 1977 if(pa0 == 0)
1928 if(!(*pte & PTE_P)) 1978 return -1;
1929 panic("copyuvm: page not present"); 1979 n = PGSIZE - (va - va0);
1930 pa = PTE_ADDR(*pte); 1980 if(n > len)
1931 if((mem = kalloc()) == 0) 1981 n = len;
1932 goto bad; 1982 memmove(pa0 + (va - va0), buf, n);
1933 memmove(mem, (char*)p2v(pa), PGSIZE); 1983 len -= n;
1934 if(mappages(d, (void*)i, PGSIZE, v2p(mem), PTE_W|PTE_U, kalloc) < 0) 1984 buf += n;
1935 goto bad; 1985 va = va0 + PGSIZE;
1936 } 1986 }
1937 return d; 1987 return 0;
1938 1988 }
1939 bad: 1989
1940 freevm(d); 1990
1941 return 0; 1991
1942 } 1992
1943 1993
1944 1994
1945 1995
1946 1996
1947 1997
1948 1998
1949 1999
Sheet 19 Sheet 19
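copyout cannot copy across a page boundary in one memmove, because consecutive user pages need not be physically contiguous. It therefore copies at most up to the end of the current page and then restarts at the next page. The sketch below isolates that chunking logic. split_by_page is a hypothetical helper that records chunk lengths instead of copying, and PGSIZE is assumed to be 4096.

```c
#include <assert.h>
#include <stdint.h>

#define PGSIZE 4096u

/* Hypothetical helper: split [va, va+len) at page boundaries the way
   the copyout loop does, storing each chunk length in out[]. Returns
   the number of chunks, or -1 if out[] has fewer than that many slots. */
static int split_by_page(uint32_t va, uint32_t len, uint32_t out[], int max)
{
  int k = 0;
  while(len > 0){
    uint32_t va0 = va & ~(PGSIZE - 1);   /* PGROUNDDOWN(va) */
    uint32_t n = PGSIZE - (va - va0);    /* bytes left in this page */
    if(n > len)
      n = len;
    assert(n > 0);                       /* each pass makes progress */
    if(k >= max)
      return -1;
    out[k++] = n;
    len -= n;
    va = va0 + PGSIZE;                   /* continue at the next page */
  }
  return k;
}
```

Only the first chunk can start mid-page; every later chunk starts at a page boundary because va is reset to va0 + PGSIZE.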
Sep 5 23:39 2011 xv6/proc.h Page 1 Sep 5 23:39 2011 xv6/proc.h Page 2
2000 // Segments in proc->gdt. 2050 enum procstate { UNUSED, EMBRYO, SLEEPING, RUNNABLE, RUNNING, ZOMBIE };
2001 #define NSEGS 7 2051
2002 2052 // Per-process state
2003 // Per-CPU state 2053 struct proc {
2004 struct cpu { 2054 uint sz; // Size of process memory (bytes)
2005 uchar id; // Local APIC ID; index into cpus[] below 2055 pde_t* pgdir; // Page table
2006 struct context *scheduler; // swtch() here to enter scheduler 2056 char *kstack; // Bottom of kernel stack for this process
2007 struct taskstate ts; // Used by x86 to find stack for interrupt 2057 enum procstate state; // Process state
2008 struct segdesc gdt[NSEGS]; // x86 global descriptor table 2058 volatile int pid; // Process ID
2009 volatile uint started; // Has the CPU started? 2059 struct proc *parent; // Parent process
2010 int ncli; // Depth of pushcli nesting. 2060 struct trapframe *tf; // Trap frame for current syscall
2011 int intena; // Were interrupts enabled before pushcli? 2061 struct context *context; // swtch() here to run process
2012 2062 void *chan; // If non-zero, sleeping on chan
2013 // Cpu-local storage variables; see below 2063 int killed; // If non-zero, have been killed
2014 struct cpu *cpu; 2064 struct file *ofile[NOFILE]; // Open files
2015 struct proc *proc; // The currently-running process. 2065 struct inode *cwd; // Current directory
2016 }; 2066 char name[16]; // Process name (debugging)
2017 2067 };
2018 extern struct cpu cpus[NCPU]; 2068
2019 extern int ncpu; 2069 // Process memory is laid out contiguously, low addresses first:
2020 2070 // text
2021 // Per-CPU variables, holding pointers to the 2071 // original data and bss
2022 // current cpu and to the current process. 2072 // fixed-size stack
2023 // The asm suffix tells gcc to use "%gs:0" to refer to cpu 2073 // expandable heap
2024 // and "%gs:4" to refer to proc. seginit sets up the 2074
2025 // %gs segment register so that %gs refers to the memory 2075
2026 // holding those two variables in the local cpu's struct cpu. 2076
2027 // This is similar to how thread-local variables are implemented 2077
2028 // in thread libraries such as Linux pthreads. 2078
2029 extern struct cpu *cpu asm("%gs:0"); // &cpus[cpunum()] 2079
2030 extern struct proc *proc asm("%gs:4"); // cpus[cpunum()].proc 2080
2031 2081
2032 2082
2033 // Saved registers for kernel context switches. 2083
2034 // Don't need to save all the segment registers (%cs, etc), 2084
2035 // because they are constant across kernel contexts. 2085
2036 // Don't need to save %eax, %ecx, %edx, because the 2086
2037 // x86 convention is that the caller has saved them. 2087
2038 // Contexts are stored at the bottom of the stack they 2088
2039 // describe; the stack pointer is the address of the context. 2089
2040 // The layout of the context matches the layout of the stack in swtch.S 2090
2041 // at the "Switch stacks" comment. Switch doesn't save eip explicitly, 2091
2042 // but it is on the stack and allocproc() manipulates it. 2092
2043 struct context { 2093
2044 uint edi; 2094
2045 uint esi; 2095
2046 uint ebx; 2096
2047 uint ebp; 2097
2048 uint eip; 2098
2049 }; 2099
Sheet 20 Sheet 20
Sep 5 23:39 2011 xv6/proc.c Page 1 Sep 5 23:39 2011 xv6/proc.c Page 2
2100 #include "types.h" 2150 // Look in the process table for an UNUSED proc.
2101 #include "defs.h" 2151 // If found, change state to EMBRYO and initialize
2102 #include "param.h" 2152 // state required to run in the kernel.
2103 #include "memlayout.h" 2153 // Otherwise return 0.
2104 #include "mmu.h" 2154 static struct proc*
2105 #include "x86.h" 2155 allocproc(void)
2106 #include "proc.h" 2156 {
2107 #include "spinlock.h" 2157 struct proc *p;
2108 2158 char *sp;
2109 struct { 2159
2110 struct spinlock lock; 2160 acquire(&ptable.lock);
2111 struct proc proc[NPROC]; 2161 for(p = ptable.proc; p < &ptable.proc[NPROC]; p++)
2112 } ptable; 2162 if(p->state == UNUSED)
2113 2163 goto found;
2114 static struct proc *initproc; 2164 release(&ptable.lock);
2115 2165 return 0;
2116 int nextpid = 1; 2166
2117 extern void forkret(void); 2167 found:
2118 extern void trapret(void); 2168 p->state = EMBRYO;
2119 2169 p->pid = nextpid++;
2120 static void wakeup1(void *chan); 2170 release(&ptable.lock);
2121 2171
2122 void 2172 // Allocate kernel stack.
2123 pinit(void) 2173 if((p->kstack = kalloc()) == 0){
2124 { 2174 p->state = UNUSED;
2125 initlock(&ptable.lock, "ptable"); 2175 return 0;
2126 } 2176 }
2127 2177 sp = p->kstack + KSTACKSIZE;
2128 2178
2129 2179 // Leave room for trap frame.
2130 2180 sp -= sizeof *p->tf;
2131 2181 p->tf = (struct trapframe*)sp;
2132 2182
2133 2183 // Set up new context to start executing at forkret,
2134 2184 // which returns to trapret.
2135 2185 sp -= 4;
2136 2186 *(uint*)sp = (uint)trapret;
2137 2187
2138 2188 sp -= sizeof *p->context;
2139 2189 p->context = (struct context*)sp;
2140 2190 memset(p->context, 0, sizeof *p->context);
2141 2191 p->context->eip = (uint)forkret;
2142 2192
2143 2193 return p;
2144 2194 }
2145 2195
2146 2196
2147 2197
2148 2198
2149 2199
Sheet 21 Sheet 21
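allocproc carves three regions off the top of a fresh kernel stack, in order: the trap frame, one word holding the address of trapret, and a struct context whose eip is forkret. The sketch below computes the resulting offsets from the bottom of the stack. The struct kstack_layout and the function carve_kstack are hypothetical. KSTACKSIZE is assumed to be 4096 (param.h holds the real value), and the region sizes are passed in as parameters rather than taken from the real structure definitions.

```c
#include <assert.h>
#include <stddef.h>

#define KSTACKSIZE 4096   /* assumed here; param.h defines the real value */

/* Offsets, measured from the bottom of the kernel stack, of the three
   regions allocproc sets up. */
struct kstack_layout {
  size_t tf;       /* trap frame */
  size_t retaddr;  /* word holding the address of trapret */
  size_t context;  /* struct context; its eip field is set to forkret */
};

static struct kstack_layout carve_kstack(size_t tf_size, size_t ctx_size)
{
  struct kstack_layout l;
  size_t sp = KSTACKSIZE;             /* sp = p->kstack + KSTACKSIZE */

  assert(tf_size + 4 + ctx_size <= KSTACKSIZE);
  sp -= tf_size;   l.tf = sp;         /* leave room for trap frame */
  sp -= 4;         l.retaddr = sp;    /* forkret will "return" to trapret */
  sp -= ctx_size;  l.context = sp;    /* swtch pops this context */
  return l;
}
```

Because the context sits directly below the trapret word, swtch's final ret pops the context's eip (forkret), and forkret's own return then finds trapret as its return address.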
Sep 5 23:39 2011 xv6/proc.c Page 3 Sep 5 23:39 2011 xv6/proc.c Page 4
2200 // Set up first user process. 2250 // Create a new process copying p as the parent.
2201 void 2251 // Sets up stack to return as if from system call.
2202 userinit(void) 2252 // Caller must set state of returned proc to RUNNABLE.
2203 { 2253 int
2204 struct proc *p; 2254 fork(void)
2205 extern char _binary_initcode_start[], _binary_initcode_size[]; 2255 {
2206 2256 int i, pid;
2207 p = allocproc(); 2257 struct proc *np;
2208 initproc = p; 2258
2209 if((p->pgdir = setupkvm(kalloc)) == 0) 2259 // Allocate process.
2210 panic("userinit: out of memory?"); 2260 if((np = allocproc()) == 0)
2211 inituvm(p->pgdir, _binary_initcode_start, (int)_binary_initcode_size); 2261 return -1;
2212 p->sz = PGSIZE; 2262
2213 memset(p->tf, 0, sizeof(*p->tf)); 2263 // Copy process state from p.
2214 p->tf->cs = (SEG_UCODE << 3) | DPL_USER; 2264 if((np->pgdir = copyuvm(proc->pgdir, proc->sz)) == 0){
2215 p->tf->ds = (SEG_UDATA << 3) | DPL_USER; 2265 kfree(np->kstack);
2216 p->tf->es = p->tf->ds; 2266 np->kstack = 0;
2217 p->tf->ss = p->tf->ds; 2267 np->state = UNUSED;
2218 p->tf->eflags = FL_IF; 2268 return -1;
2219 p->tf->esp = PGSIZE; 2269 }
2220 p->tf->eip = 0; // beginning of initcode.S 2270 np->sz = proc->sz;
2221 2271 np->parent = proc;
2222 safestrcpy(p->name, "initcode", sizeof(p->name)); 2272 *np->tf = *proc->tf;
2223 p->cwd = namei("/"); 2273
2224 2274 // Clear %eax so that fork returns 0 in the child.
2225 p->state = RUNNABLE; 2275 np->tf->eax = 0;
2226 } 2276
2227 2277 for(i = 0; i < NOFILE; i++)
2228 // Grow current process's memory by n bytes. 2278 if(proc->ofile[i])
2229 // Return 0 on success, -1 on failure. 2279 np->ofile[i] = filedup(proc->ofile[i]);
2230 int 2280 np->cwd = idup(proc->cwd);
2231 growproc(int n) 2281
2232 { 2282 pid = np->pid;
2233 uint sz; 2283 np->state = RUNNABLE;
2234 2284 safestrcpy(np->name, proc->name, sizeof(proc->name));
2235 sz = proc->sz; 2285 return pid;
2236 if(n > 0){ 2286 }
2237 if((sz = allocuvm(proc->pgdir, sz, sz + n)) == 0) 2287
2238 return -1; 2288
2239 } else if(n < 0){ 2289
2240 if((sz = deallocuvm(proc->pgdir, sz, sz + n)) == 0) 2290
2241 return -1; 2291
2242 } 2292
2243 proc->sz = sz; 2293
2244 switchuvm(proc); 2294
2245 return 0; 2295
2246 } 2296
2247 2297
2248 2298
2249 2299
Sheet 22 Sheet 22
Sep 5 23:39 2011 xv6/proc.c Page 5 Sep 5 23:39 2011 xv6/proc.c Page 6
2300 // Exit the current process. Does not return. 2350 // Wait for a child process to exit and return its pid.
2301 // An exited process remains in the zombie state 2351 // Return -1 if this process has no children.
2302 // until its parent calls wait() to find out it exited. 2352 int
2303 void 2353 wait(void)
2304 exit(void) 2354 {
2305 { 2355 struct proc *p;
2306 struct proc *p; 2356 int havekids, pid;
2307 int fd; 2357
2308 2358 acquire(&ptable.lock);
2309 if(proc == initproc) 2359 for(;;){
2310 panic("init exiting"); 2360 // Scan through table looking for zombie children.
2311 2361 havekids = 0;
2312 // Close all open files. 2362 for(p = ptable.proc; p < &ptable.proc[NPROC]; p++){
2313 for(fd = 0; fd < NOFILE; fd++){ 2363 if(p->parent != proc)
2314 if(proc->ofile[fd]){ 2364 continue;
2315 fileclose(proc->ofile[fd]); 2365 havekids = 1;
2316 proc->ofile[fd] = 0; 2366 if(p->state == ZOMBIE){
2317 } 2367 // Found one.
2318 } 2368 pid = p->pid;
2319 2369 kfree(p->kstack);
2320 iput(proc->cwd); 2370 p->kstack = 0;
2321 proc->cwd = 0; 2371 freevm(p->pgdir);
2322 2372 p->state = UNUSED;
2323 acquire(&ptable.lock); 2373 p->pid = 0;
2324 2374 p->parent = 0;
2325 // Parent might be sleeping in wait(). 2375 p->name[0] = 0;
2326 wakeup1(proc->parent); 2376 p->killed = 0;
2327 2377 release(&ptable.lock);
2328 // Pass abandoned children to init. 2378 return pid;
2329 for(p = ptable.proc; p < &ptable.proc[NPROC]; p++){ 2379 }
2330 if(p->parent == proc){ 2380 }
2331 p->parent = initproc; 2381
2332 if(p->state == ZOMBIE) 2382 // No point waiting if we don't have any children.
2333 wakeup1(initproc); 2383 if(!havekids || proc->killed){
2334 } 2384 release(&ptable.lock);
2335 } 2385 return -1;
2336 2386 }
2337 // Jump into the scheduler, never to return. 2387
2338 proc->state = ZOMBIE; 2388 // Wait for children to exit. (See wakeup1 call in proc_exit.)
2339 sched(); 2389 sleep(proc, &ptable.lock);
2340 panic("zombie exit"); 2390 }
2341 } 2391 }
2342 2392
2343 2393
2344 2394
2345 2395
2346 2396
2347 2397
2348 2398
2349 2399
Sheet 23 Sheet 23
Sep 5 23:39 2011 xv6/proc.c Page 7 Sep 5 23:39 2011 xv6/proc.c Page 8
2400 // Per-CPU process scheduler. 2450 // Enter scheduler. Must hold only ptable.lock
2401 // Each CPU calls scheduler() after setting itself up. 2451 // and have changed proc->state.
2402 // Scheduler never returns. It loops, doing: 2452 void
2403 // choose a process to run 2453 sched(void)
2404 // swtch to start running that process 2454 {
2405 // eventually that process transfers control 2455 int intena;
2406 // via swtch back to the scheduler. 2456
2407 void 2457 if(!holding(&ptable.lock))
2408 scheduler(void) 2458 panic("sched ptable.lock");
2409 { 2459 if(cpu->ncli != 1)
2410 struct proc *p; 2460 panic("sched locks");
2411 2461 if(proc->state == RUNNING)
2412 for(;;){ 2462 panic("sched running");
2413 // Enable interrupts on this processor. 2463 if(readeflags()&FL_IF)
2414 sti(); 2464 panic("sched interruptible");
2415 2465 intena = cpu->intena;
2416 // Loop over process table looking for process to run. 2466 swtch(&proc->context, cpu->scheduler);
2417 acquire(&ptable.lock); 2467 cpu->intena = intena;
2418 for(p = ptable.proc; p < &ptable.proc[NPROC]; p++){ 2468 }
2419 if(p->state != RUNNABLE) 2469
2420 continue; 2470 // Give up the CPU for one scheduling round.
2421 2471 void
2422 // Switch to chosen process. It is the process's job 2472 yield(void)
2423 // to release ptable.lock and then reacquire it 2473 {
2424 // before jumping back to us. 2474 acquire(&ptable.lock);
2425 proc = p; 2475 proc->state = RUNNABLE;
2426 switchuvm(p); 2476 sched();
2427 p->state = RUNNING; 2477 release(&ptable.lock);
2428 swtch(&cpu->scheduler, proc->context); 2478 }
2429 switchkvm(); 2479
2430 2480 // A fork child's very first scheduling by scheduler()
2431 // Process is done running for now. 2481 // will swtch here. "Return" to user space.
2432 // It should have changed its p->state before coming back. 2482 void
2433 proc = 0; 2483 forkret(void)
2434 } 2484 {
2435 release(&ptable.lock); 2485 static int first = 1;
2436 2486 // Still holding ptable.lock from scheduler.
2437 } 2487 release(&ptable.lock);
2438 } 2488
2439 2489 if (first) {
2440 2490 // Some initialization functions must be run in the context
2441 2491 // of a regular process (e.g., they call sleep), and thus cannot
2442 2492 // be run from main().
2443 2493 first = 0;
2444 2494 initlog();
2445 2495 }
2446 2496
2447 2497 // Return to "caller", actually trapret (see allocproc).
2448 2498 }
2449 2499
Sheet 24 Sheet 24
Sep 5 23:39 2011 xv6/proc.c Page 9 Sep 5 23:39 2011 xv6/proc.c Page 10
2500 // Atomically release lock and sleep on chan. 2550 // Wake up all processes sleeping on chan.
2501 // Reacquires lock when awakened. 2551 // The ptable lock must be held.
2502 void 2552 static void
2503 sleep(void *chan, struct spinlock *lk) 2553 wakeup1(void *chan)
2504 { 2554 {
2505 if(proc == 0) 2555 struct proc *p;
2506 panic("sleep"); 2556
2507 2557 for(p = ptable.proc; p < &ptable.proc[NPROC]; p++)
2508 if(lk == 0) 2558 if(p->state == SLEEPING && p->chan == chan)
2509 panic("sleep without lk"); 2559 p->state = RUNNABLE;
2510 2560 }
2511 // Must acquire ptable.lock in order to 2561
2512 // change p->state and then call sched. 2562 // Wake up all processes sleeping on chan.
2513 // Once we hold ptable.lock, we can be 2563 void
2514 // guaranteed that we won't miss any wakeup 2564 wakeup(void *chan)
2515 // (wakeup runs with ptable.lock locked), 2565 {
2516 // so it's okay to release lk. 2566 acquire(&ptable.lock);
2517 if(lk != &ptable.lock){ 2567 wakeup1(chan);
2518 acquire(&ptable.lock); 2568 release(&ptable.lock);
2519 release(lk); 2569 }
2520 } 2570
2521 2571 // Kill the process with the given pid.
2522 // Go to sleep. 2572 // Process won't exit until it returns
2523 proc->chan = chan; 2573 // to user space (see trap in trap.c).
2524 proc->state = SLEEPING; 2574 int
2525 sched(); 2575 kill(int pid)
2526 2576 {
2527 // Tidy up. 2577 struct proc *p;
2528 proc->chan = 0; 2578
2529 2579 acquire(&ptable.lock);
2530 // Reacquire original lock. 2580 for(p = ptable.proc; p < &ptable.proc[NPROC]; p++){
2531 if(lk != &ptable.lock){ 2581 if(p->pid == pid){
2532 release(&ptable.lock); 2582 p->killed = 1;
2533 acquire(lk); 2583 // Wake process from sleep if necessary.
2534 } 2584 if(p->state == SLEEPING)
2535 } 2585 p->state = RUNNABLE;
2536 2586 release(&ptable.lock);
2537 2587 return 0;
2538 2588 }
2539 2589 }
2540 2590 release(&ptable.lock);
2541 2591 return -1;
2542 2592 }
2543 2593
2544 2594
2545 2595
2546 2596
2547 2597
2548 2598
2549 2599
Sheet 25 Sheet 25
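The channel matching done by wakeup1 on the sheet above can be tried outside the kernel. The following is a minimal sketch, not xv6 code: the `toy_` names, the four-slot table, and the returned wake count are all assumptions made for illustration, and no locking is modeled.

```c
#include <stddef.h>

/* Toy stand-ins for xv6's process table; names and sizes are assumptions. */
enum toy_state { TOY_UNUSED, TOY_SLEEPING, TOY_RUNNABLE };

struct toy_proc {
  enum toy_state state;
  void *chan;            /* channel the process sleeps on, or NULL */
};

#define TOY_NPROC 4
static struct toy_proc toy_table[TOY_NPROC];

/* Record what sleep() records before it calls sched(). */
static void
toy_sleep(int i, void *chan)
{
  toy_table[i].chan = chan;
  toy_table[i].state = TOY_SLEEPING;
}

/* Mark every sleeper on chan runnable; return how many were woken. */
static int
toy_wakeup(void *chan)
{
  int woken = 0;
  for (struct toy_proc *p = toy_table; p < &toy_table[TOY_NPROC]; p++) {
    if (p->state == TOY_SLEEPING && p->chan == chan) {
      p->state = TOY_RUNNABLE;
      woken++;
    }
  }
  return woken;
}
```

A channel is only an address used as a rendezvous key, so any two sleepers that pass the same pointer are woken together, and a second wakeup on the same channel finds nothing left to do.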
Sep 5 23:39 2011 xv6/proc.c Page 11 Sep 5 23:39 2011 xv6/swtch.S Page 1
2600 // Print a process listing to console. For debugging. 2650 # Context switch
2601 // Runs when user types ^P on console. 2651 #
2602 // No lock to avoid wedging a stuck machine further. 2652 # void swtch(struct context **old, struct context *new);
2603 void 2653 #
2604 procdump(void) 2654 # Save current register context in old
2605 { 2655 # and then load register context from new.
2606 static char *states[] = { 2656
2607 [UNUSED] "unused", 2657 .globl swtch
2608 [EMBRYO] "embryo", 2658 swtch:
2609 [SLEEPING] "sleep ", 2659 movl 4(%esp), %eax
2610 [RUNNABLE] "runble", 2660 movl 8(%esp), %edx
2611 [RUNNING] "run ", 2661
2612 [ZOMBIE] "zombie" 2662 # Save old callee-save registers
2613 }; 2663 pushl %ebp
2614 int i; 2664 pushl %ebx
2615 struct proc *p; 2665 pushl %esi
2616 char *state; 2666 pushl %edi
2617 uint pc[10]; 2667
2618 2668 # Switch stacks
2619 for(p = ptable.proc; p < &ptable.proc[NPROC]; p++){ 2669 movl %esp, (%eax)
2620 if(p->state == UNUSED) 2670 movl %edx, %esp
2621 continue; 2671
2622 if(p->state >= 0 && p->state < NELEM(states) && states[p->state]) 2672 # Load new callee-save registers
2623 state = states[p->state]; 2673 popl %edi
2624 else 2674 popl %esi
2625 state = "???"; 2675 popl %ebx
2626 cprintf("%d %s %s", p->pid, state, p->name); 2676 popl %ebp
2627 if(p->state == SLEEPING){ 2677 ret
2628 getcallerpcs((uint*)p->context->ebp+2, pc); 2678
2629 for(i=0; i<10 && pc[i] != 0; i++) 2679
2630 cprintf(" %p", pc[i]); 2680
2631 } 2681
2632 cprintf("\n"); 2682
2633 } 2683
2634 } 2684
2635 2685
2636 2686
2637 2687
2638 2688
2639 2689
2640 2690
2641 2691
2642 2692
2643 2693
2644 2694
2645 2695
2646 2696
2647 2697
2648 2698
2649 2699
Sheet 26 Sheet 26
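The push order in swtch fixes the layout of the frame it leaves on the old stack: the call instruction pushes the return address first, then ebp, ebx, esi and edi follow, so edi ends up at the lowest address. A hedged sketch of that layout as a C struct, with hypothetical `toy_` names and 32-bit registers assumed:

```c
#include <stddef.h>
#include <stdint.h>

/* Saved frame as swtch leaves it on the stack, lowest address first.
   The struct name is hypothetical; field order mirrors the pushes. */
struct toy_context {
  uint32_t edi;   /* pushed last, so at the lowest address */
  uint32_t esi;
  uint32_t ebx;
  uint32_t ebp;   /* first register pushed by swtch */
  uint32_t eip;   /* return address pushed by the call to swtch */
};

/* Byte offset from the saved stack pointer to the return address. */
static size_t
toy_eip_offset(void)
{
  return offsetof(struct toy_context, eip);
}
```

Because the pops on the new stack run in the reverse order, the final ret consumes the word at offset 16 of whatever frame the new stack pointer designates.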
Sep 5 23:39 2011 xv6/kalloc.c Page 1 Sep 5 23:39 2011 xv6/kalloc.c Page 2
Sheet 27 Sheet 27
Sep 5 23:39 2011 xv6/traps.h Page 1 Sep 5 23:39 2011 xv6/vectors.pl Page 1
Sheet 28 Sheet 28
Sep 5 23:39 2011 xv6/trapasm.S Page 1 Sep 5 23:39 2011 xv6/trap.c Page 1
Sheet 29 Sheet 29
Sep 5 23:39 2011 xv6/trap.c Page 2 Sep 5 23:39 2011 xv6/trap.c Page 3
Sheet 30 Sheet 30
Sep 5 23:39 2011 xv6/syscall.h Page 1 Sep 5 23:39 2011 xv6/syscall.c Page 1
Sheet 31 Sheet 31
Sep 5 23:39 2011 xv6/syscall.c Page 2 Sep 5 23:39 2011 xv6/syscall.c Page 3
3200 // Fetch the nth word-sized system call argument as a pointer 3250 static int (*syscalls[])(void) = {
3201 // to a block of memory of size n bytes. Check that the pointer 3251 [SYS_fork] sys_fork,
3202 // lies within the process address space. 3252 [SYS_exit] sys_exit,
3203 int 3253 [SYS_wait] sys_wait,
3204 argptr(int n, char **pp, int size) 3254 [SYS_pipe] sys_pipe,
3205 { 3255 [SYS_read] sys_read,
3206 int i; 3256 [SYS_kill] sys_kill,
3207 3257 [SYS_exec] sys_exec,
3208 if(argint(n, &i) < 0) 3258 [SYS_fstat] sys_fstat,
3209 return -1; 3259 [SYS_chdir] sys_chdir,
3210 if((uint)i >= proc->sz || (uint)i+size > proc->sz) 3260 [SYS_dup] sys_dup,
3211 return -1; 3261 [SYS_getpid] sys_getpid,
3212 *pp = (char*)i; 3262 [SYS_sbrk] sys_sbrk,
3213 return 0; 3263 [SYS_sleep] sys_sleep,
3214 } 3264 [SYS_uptime] sys_uptime,
3215 3265 [SYS_open] sys_open,
3216 // Fetch the nth word-sized system call argument as a string pointer. 3266 [SYS_write] sys_write,
3217 // Check that the pointer is valid and the string is nul-terminated. 3267 [SYS_mknod] sys_mknod,
3218 // (There is no shared writable memory, so the string can't change 3268 [SYS_unlink] sys_unlink,
3219 // between this check and being used by the kernel.) 3269 [SYS_link] sys_link,
3220 int 3270 [SYS_mkdir] sys_mkdir,
3221 argstr(int n, char **pp) 3271 [SYS_close] sys_close,
3222 { 3272 };
3223 int addr; 3273
3224 if(argint(n, &addr) < 0) 3274 void
3225 return -1; 3275 syscall(void)
3226 return fetchstr(proc, addr, pp); 3276 {
3227 } 3277 int num;
3228 3278
3229 extern int sys_chdir(void); 3279 num = proc->tf->eax;
3230 extern int sys_close(void); 3280 if(num >= 0 && num < SYS_open && syscalls[num]) {
3231 extern int sys_dup(void); 3281 proc->tf->eax = syscalls[num]();
3232 extern int sys_exec(void); 3282 } else if (num >= SYS_open && num < NELEM(syscalls) && syscalls[num]) {
3233 extern int sys_exit(void); 3283 proc->tf->eax = syscalls[num]();
3234 extern int sys_fork(void); 3284 } else {
3235 extern int sys_fstat(void); 3285 cprintf("%d %s: unknown sys call %d\n",
3236 extern int sys_getpid(void); 3286 proc->pid, proc->name, num);
3237 extern int sys_kill(void); 3287 proc->tf->eax = -1;
3238 extern int sys_link(void); 3288 }
3239 extern int sys_mkdir(void); 3289 }
3240 extern int sys_mknod(void); 3290
3241 extern int sys_open(void); 3291
3242 extern int sys_pipe(void); 3292
3243 extern int sys_read(void); 3293
3244 extern int sys_sbrk(void); 3294
3245 extern int sys_sleep(void); 3295
3246 extern int sys_unlink(void); 3296
3247 extern int sys_wait(void); 3297
3248 extern int sys_write(void); 3298
3249 extern int sys_uptime(void); 3299
Sheet 32 Sheet 32
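The dispatch in syscall() relies on two things: a table built with designated initializers, whose unlisted slots are null, and a range check before indexing. A minimal user-space sketch follows. The call numbers, handler names, and return values are hypothetical, and it uses the standard C99 `[index] = value` form rather than the older GNU form without `=` that appears in the listing.

```c
#include <stddef.h>

#define TOY_NELEM(x) (sizeof(x) / sizeof((x)[0]))

enum { TOY_SYS_getpid = 1, TOY_SYS_uptime = 3 };  /* hypothetical numbers */

static int toy_getpid(void) { return 42; }
static int toy_uptime(void) { return 7; }

/* Designated initializers leave unlisted slots (0 and 2 here) null. */
static int (*toy_syscalls[])(void) = {
  [TOY_SYS_getpid] = toy_getpid,
  [TOY_SYS_uptime] = toy_uptime,
};

/* Return the handler's result, or -1 for an out-of-range or empty slot. */
static int
toy_dispatch(int num)
{
  if (num >= 0 && (size_t)num < TOY_NELEM(toy_syscalls) && toy_syscalls[num])
    return toy_syscalls[num]();
  return -1;
}
```

The null check matters as much as the range check: a gap in the numbering is inside the array bounds but must still be rejected.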
Sep 5 23:39 2011 xv6/sysproc.c Page 1 Sep 5 23:39 2011 xv6/sysproc.c Page 2
Sheet 33 Sheet 33
Sep 5 23:39 2011 xv6/buf.h Page 1 Sep 5 23:39 2011 xv6/fcntl.h Page 1
Sheet 34 Sheet 34
Sep 5 23:39 2011 xv6/stat.h Page 1 Sep 5 23:39 2011 xv6/fs.h Page 1
Sheet 35 Sheet 35
Sep 5 23:39 2011 xv6/file.h Page 1 Sep 5 23:39 2011 xv6/ide.c Page 1
3600 struct file { 3650 // Simple PIO-based (non-DMA) IDE driver code.
3601 enum { FD_NONE, FD_PIPE, FD_INODE } type; 3651
3602 int ref; // reference count 3652 #include "types.h"
3603 char readable; 3653 #include "defs.h"
3604 char writable; 3654 #include "param.h"
3605 struct pipe *pipe; 3655 #include "memlayout.h"
3606 struct inode *ip; 3656 #include "mmu.h"
3607 uint off; 3657 #include "proc.h"
3608 }; 3658 #include "x86.h"
3609 3659 #include "traps.h"
3610 3660 #include "spinlock.h"
3611 // in-core file system types 3661 #include "buf.h"
3612 3662
3613 struct inode { 3663 #define IDE_BSY 0x80
3614 uint dev; // Device number 3664 #define IDE_DRDY 0x40
3615 uint inum; // Inode number 3665 #define IDE_DF 0x20
3616 int ref; // Reference count 3666 #define IDE_ERR 0x01
3617 int flags; // I_BUSY, I_VALID 3667
3618 3668 #define IDE_CMD_READ 0x20
3619 short type; // copy of disk inode 3669 #define IDE_CMD_WRITE 0x30
3620 short major; 3670
3621 short minor; 3671 // idequeue points to the buf now being read/written to the disk.
3622 short nlink; 3672 // idequeue->qnext points to the next buf to be processed.
3623 uint size; 3673 // You must hold idelock while manipulating queue.
3624 uint addrs[NDIRECT+1]; 3674
3625 }; 3675 static struct spinlock idelock;
3626 3676 static struct buf *idequeue;
3627 #define I_BUSY 0x1 3677
3628 #define I_VALID 0x2 3678 static int havedisk1;
3629 3679 static void idestart(struct buf*);
3630 // device implementations 3680
3631 3681 // Wait for IDE disk to become ready.
3632 struct devsw { 3682 static int
3633 int (*read)(struct inode*, char*, int); 3683 idewait(int checkerr)
3634 int (*write)(struct inode*, char*, int); 3684 {
3635 }; 3685 int r;
3636 3686
3637 extern struct devsw devsw[]; 3687 while(((r = inb(0x1f7)) & (IDE_BSY|IDE_DRDY)) != IDE_DRDY)
3638 3688 ;
3639 #define CONSOLE 1 3689 if(checkerr && (r & (IDE_DF|IDE_ERR)) != 0)
3640 3690 return -1;
3641 3691 return 0;
3642 3692 }
3643 3693
3644 3694
3645 3695
3646 3696
3647 3697
3648 3698
3649 3699
Sheet 36 Sheet 36
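The two tests in idewait are plain bit-mask checks on the status byte. The sketch below separates them into two helpers so they can be exercised with literal values. The helper names are hypothetical; the bit values are the ones defined in the listing.

```c
#define TOY_IDE_BSY  0x80
#define TOY_IDE_DRDY 0x40
#define TOY_IDE_DF   0x20
#define TOY_IDE_ERR  0x01

/* Ready means DRDY set and BSY clear; other bits are ignored. */
static int
toy_ide_ready(int status)
{
  return (status & (TOY_IDE_BSY | TOY_IDE_DRDY)) == TOY_IDE_DRDY;
}

/* A device fault or error bit marks the finished command as failed. */
static int
toy_ide_failed(int status)
{
  return (status & (TOY_IDE_DF | TOY_IDE_ERR)) != 0;
}
```

A status of 0x41 is therefore both ready and failed, which is why idewait checks the error bits only after the busy loop has ended.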
Sep 5 23:39 2011 xv6/ide.c Page 2 Sep 5 23:39 2011 xv6/ide.c Page 3
Sheet 37 Sheet 37
Sep 5 23:39 2011 xv6/ide.c Page 4 Sep 5 23:39 2011 xv6/bio.c Page 1
Sheet 38 Sheet 38
Sep 5 23:39 2011 xv6/bio.c Page 2 Sep 5 23:39 2011 xv6/bio.c Page 3
3900 // Create linked list of buffers 3950 // Return a B_BUSY buf with the contents of the indicated disk sector.
3901 bcache.head.prev = &bcache.head; 3951 struct buf*
3902 bcache.head.next = &bcache.head; 3952 bread(uint dev, uint sector)
3903 for(b = bcache.buf; b < bcache.buf+NBUF; b++){ 3953 {
3904 b->next = bcache.head.next; 3954 struct buf *b;
3905 b->prev = &bcache.head; 3955
3906 b->dev = -1; 3956 b = bget(dev, sector);
3907 bcache.head.next->prev = b; 3957 if(!(b->flags & B_VALID))
3908 bcache.head.next = b; 3958 iderw(b);
3909 } 3959 return b;
3910 } 3960 }
3911 3961
3912 // Look through buffer cache for sector on device dev. 3962 // Write b's contents to disk. Must be locked.
3913 // If not found, allocate fresh block. 3963 void
3914 // In either case, return locked buffer. 3964 bwrite(struct buf *b)
3915 static struct buf* 3965 {
3916 bget(uint dev, uint sector) 3966 if((b->flags & B_BUSY) == 0)
3917 { 3967 panic("bwrite");
3918 struct buf *b; 3968 b->flags |= B_DIRTY;
3919 3969 iderw(b);
3920 acquire(&bcache.lock); 3970 }
3921 3971
3922 loop: 3972 // Release the buffer b.
3923 // Try for cached block. 3973 void
3924 for(b = bcache.head.next; b != &bcache.head; b = b->next){ 3974 brelse(struct buf *b)
3925 if(b->dev == dev && b->sector == sector){ 3975 {
3926 if(!(b->flags & B_BUSY)){ 3976 if((b->flags & B_BUSY) == 0)
3927 b->flags |= B_BUSY; 3977 panic("brelse");
3928 release(&bcache.lock); 3978
3929 return b; 3979 acquire(&bcache.lock);
3930 } 3980
3931 sleep(b, &bcache.lock); 3981 b->next->prev = b->prev;
3932 goto loop; 3982 b->prev->next = b->next;
3933 } 3983 b->next = bcache.head.next;
3934 } 3984 b->prev = &bcache.head;
3935 3985 bcache.head.next->prev = b;
3936 // Allocate fresh block. 3986 bcache.head.next = b;
3937 for(b = bcache.head.prev; b != &bcache.head; b = b->prev){ 3987
3938 if((b->flags & B_BUSY) == 0){ 3988 b->flags &= ~B_BUSY;
3939 b->dev = dev; 3989 wakeup(b);
3940 b->sector = sector; 3990
3941 b->flags = B_BUSY; 3991 release(&bcache.lock);
3942 release(&bcache.lock); 3992 }
3943 return b; 3993
3944 } 3994
3945 } 3995
3946 panic("bget: no buffers"); 3996
3947 } 3997
3948 3998
3949 3999
Sheet 39 Sheet 39
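The pointer surgery in brelse moves a released buffer to the front of a circular doubly linked list, while bget recycles from the back, which together give least-recently-used replacement. A small sketch of only the list handling, with hypothetical `toy_` names, three buffers, and no locking or flags:

```c
#include <stddef.h>

struct toy_buf {
  int id;
  struct toy_buf *prev, *next;
};

static struct toy_buf toy_head;          /* list sentinel */
static struct toy_buf toy_bufs[3];

/* Build a circular list; each buffer is inserted right after the head. */
static void
toy_binit(void)
{
  toy_head.prev = toy_head.next = &toy_head;
  for (int i = 0; i < 3; i++) {
    struct toy_buf *b = &toy_bufs[i];
    b->id = i;
    b->next = toy_head.next;
    b->prev = &toy_head;
    toy_head.next->prev = b;
    toy_head.next = b;
  }
}

/* Unlink b and reinsert it at the front (most recently used end). */
static void
toy_touch(struct toy_buf *b)
{
  b->next->prev = b->prev;
  b->prev->next = b->next;
  b->next = toy_head.next;
  b->prev = &toy_head;
  toy_head.next->prev = b;
  toy_head.next = b;
}

/* The least recently used buffer sits at the back of the list. */
static struct toy_buf *
toy_lru(void)
{
  return toy_head.prev == &toy_head ? NULL : toy_head.prev;
}
```

After toy_binit the order from the front is 2, 1, 0; touching buffer 0 moves it to the front and leaves buffer 1 as the next candidate for reuse.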
Sep 5 23:39 2011 xv6/log.c Page 1 Sep 5 23:39 2011 xv6/log.c Page 2
Sheet 40 Sheet 40
Sep 5 23:39 2011 xv6/log.c Page 3 Sep 5 23:39 2011 xv6/log.c Page 4
4100 // Write in-memory log header to disk, committing log entries till head 4150 // Caller has modified b->data and is done with the buffer.
4101 static void 4151 // Append the block to the log and record the block number,
4102 write_head(void) 4152 // but don't write the log header (which would commit the write).
4103 { 4153 // log_write() replaces bwrite(); a typical use is:
4104 struct buf *buf = bread(log.dev, log.start); 4154 // bp = bread(...)
4105 struct logheader *hb = (struct logheader *) (buf->data); 4155 // modify bp->data[]
4106 int i; 4156 // log_write(bp)
4107 hb->n = log.lh.n; 4157 // brelse(bp)
4108 for (i = 0; i < log.lh.n; i++) { 4158 void
4109 hb->sector[i] = log.lh.sector[i]; 4159 log_write(struct buf *b)
4110 } 4160 {
4111 bwrite(buf); 4161 int i;
4112 brelse(buf); 4162
4113 } 4163 if (log.lh.n >= LOGSIZE || log.lh.n >= log.size - 1)
4114 4164 panic("too big a transaction");
4115 static void 4165 if (!log.intrans)
4116 recover_from_log(void) 4166 panic("write outside of trans");
4117 { 4167
4118 read_head(); 4168 for (i = 0; i < log.lh.n; i++) {
4119 install_trans(); // if committed, copy from log to disk 4169 if (log.lh.sector[i] == b->sector) // log absorption?
4120 log.lh.n = 0; 4170 break;
4121 write_head(); // clear the log 4171 }
4122 } 4172 log.lh.sector[i] = b->sector;
4123 4173 struct buf *lbuf = bread(b->dev, log.start+i+1);
4124 void 4174 memmove(lbuf->data, b->data, BSIZE);
4125 begin_trans(void) 4175 bwrite(lbuf);
4126 { 4176 brelse(lbuf);
4127 acquire(&log.lock); 4177 if (i == log.lh.n)
4128 while (log.intrans) { 4178 log.lh.n++;
4129 sleep(&log, &log.lock); 4179 }
4130 } 4180
4131 log.intrans = 1; 4181
4132 release(&log.lock); 4182
4133 } 4183
4134 4184
4135 void 4185
4136 commit_trans(void) 4186
4137 { 4187
4138 if (log.lh.n > 0) { 4188
4139 write_head(); // Causes all blocks till log.head to be committed 4189
4140 install_trans(); // Install all the transactions till head 4190
4141 log.lh.n = 0; 4191
4142 write_head(); // Reclaim log 4192
4143 } 4193
4144 4194
4145 acquire(&log.lock); 4195
4146 log.intrans = 0; 4196
4147 wakeup(&log); 4197
4148 release(&log.lock); 4198
4149 } 4199
Sheet 41 Sheet 41
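The loop in log_write that searches the header for a matching sector is what lets repeated writes to one block share a single log slot. The sketch below models only that bookkeeping. The `toy_` names and the capacity of 4 are assumptions, and unlike the listing, which panics on an oversized transaction, this version returns -1 when the log is full.

```c
#define TOY_LOGSIZE 4   /* hypothetical capacity */

struct toy_log {
  int n;                          /* number of logged sectors */
  unsigned sector[TOY_LOGSIZE];   /* home sector of each log slot */
};

/* Record sector in the log. A sector already present reuses its slot
   (absorption). Returns the slot index, or -1 if the log is full. */
static int
toy_log_write(struct toy_log *lg, unsigned sector)
{
  int i;
  for (i = 0; i < lg->n; i++)
    if (lg->sector[i] == sector)
      return i;                   /* absorbed: no new slot needed */
  if (lg->n >= TOY_LOGSIZE)
    return -1;
  lg->sector[i] = sector;
  lg->n++;
  return i;
}
```

In the listing the slot index also selects the on-disk log block, log.start+i+1, so an absorbed write overwrites the earlier copy of the same block.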
Sep 5 23:39 2011 xv6/log.c Page 5 Sep 5 23:39 2011 xv6/fs.c Page 1
Sheet 42 Sheet 42
Sep 5 23:39 2011 xv6/fs.c Page 2 Sep 5 23:39 2011 xv6/fs.c Page 3
Sheet 43 Sheet 43
Sep 5 23:39 2011 xv6/fs.c Page 4 Sep 5 23:39 2011 xv6/fs.c Page 5
4400 // Allocate a new inode with the given type on device dev. 4450 // Find the inode with number inum on device dev
4401 struct inode* 4451 // and return the in-memory copy.
4402 ialloc(uint dev, short type) 4452 static struct inode*
4403 { 4453 iget(uint dev, uint inum)
4404 int inum; 4454 {
4405 struct buf *bp; 4455 struct inode *ip, *empty;
4406 struct dinode *dip; 4456
4407 struct superblock sb; 4457 acquire(&icache.lock);
4408 4458
4409 readsb(dev, &sb); 4459 // Try for cached inode.
4410 for(inum = 1; inum < sb.ninodes; inum++){ // loop over inode blocks 4460 empty = 0;
4411 bp = bread(dev, IBLOCK(inum)); 4461 for(ip = &icache.inode[0]; ip < &icache.inode[NINODE]; ip++){
4412 dip = (struct dinode*)bp->data + inum%IPB; 4462 if(ip->ref > 0 && ip->dev == dev && ip->inum == inum){
4413 if(dip->type == 0){ // a free inode 4463 ip->ref++;
4414 memset(dip, 0, sizeof(*dip)); 4464 release(&icache.lock);
4415 dip->type = type; 4465 return ip;
4416 log_write(bp); // mark it allocated on the disk 4466 }
4417 brelse(bp); 4467 if(empty == 0 && ip->ref == 0) // Remember empty slot.
4418 return iget(dev, inum); 4468 empty = ip;
4419 } 4469 }
4420 brelse(bp); 4470
4421 } 4471 // Allocate fresh inode.
4422 panic("ialloc: no inodes"); 4472 if(empty == 0)
4423 } 4473 panic("iget: no inodes");
4424 4474
4425 // Copy inode, which has changed, from memory to disk. 4475 ip = empty;
4426 void 4476 ip->dev = dev;
4427 iupdate(struct inode *ip) 4477 ip->inum = inum;
4428 { 4478 ip->ref = 1;
4429 struct buf *bp; 4479 ip->flags = 0;
4430 struct dinode *dip; 4480 release(&icache.lock);
4431 4481
4432 bp = bread(ip->dev, IBLOCK(ip->inum)); 4482 return ip;
4433 dip = (struct dinode*)bp->data + ip->inum%IPB; 4483 }
4434 dip->type = ip->type; 4484
4435 dip->major = ip->major; 4485 // Increment reference count for ip.
4436 dip->minor = ip->minor; 4486 // Returns ip to enable ip = idup(ip1) idiom.
4437 dip->nlink = ip->nlink; 4487 struct inode*
4438 dip->size = ip->size; 4488 idup(struct inode *ip)
4439 memmove(dip->addrs, ip->addrs, sizeof(ip->addrs)); 4489 {
4440 log_write(bp); 4490 acquire(&icache.lock);
4441 brelse(bp); 4491 ip->ref++;
4442 } 4492 release(&icache.lock);
4443 4493 return ip;
4444 4494 }
4445 4495
4446 4496
4447 4497
4448 4498
4449 4499
Sheet 44 Sheet 44
Sep 5 23:39 2011 xv6/fs.c Page 6 Sep 5 23:39 2011 xv6/fs.c Page 7
4500 // Lock the given inode. 4550 // Caller holds reference to unlocked ip. Drop reference.
4501 void 4551 void
4502 ilock(struct inode *ip) 4552 iput(struct inode *ip)
4503 { 4553 {
4504 struct buf *bp; 4554 acquire(&icache.lock);
4505 struct dinode *dip; 4555 if(ip->ref == 1 && (ip->flags & I_VALID) && ip->nlink == 0){
4506 4556 // inode is no longer used: truncate and free inode.
4507 if(ip == 0 || ip->ref < 1) 4557 if(ip->flags & I_BUSY)
4508 panic("ilock"); 4558 panic("iput busy");
4509 4559 ip->flags |= I_BUSY;
4510 acquire(&icache.lock); 4560 release(&icache.lock);
4511 while(ip->flags & I_BUSY) 4561 itrunc(ip);
4512 sleep(ip, &icache.lock); 4562 ip->type = 0;
4513 ip->flags |= I_BUSY; 4563 iupdate(ip);
4514 release(&icache.lock); 4564 acquire(&icache.lock);
4515 4565 ip->flags = 0;
4516 if(!(ip->flags & I_VALID)){ 4566 wakeup(ip);
4517 bp = bread(ip->dev, IBLOCK(ip->inum)); 4567 }
4518 dip = (struct dinode*)bp->data + ip->inum%IPB; 4568 ip->ref--;
4519 ip->type = dip->type; 4569 release(&icache.lock);
4520 ip->major = dip->major; 4570 }
4521 ip->minor = dip->minor; 4571
4522 ip->nlink = dip->nlink; 4572 // Common idiom: unlock, then put.
4523 ip->size = dip->size; 4573 void
4524 memmove(ip->addrs, dip->addrs, sizeof(ip->addrs)); 4574 iunlockput(struct inode *ip)
4525 brelse(bp); 4575 {
4526 ip->flags |= I_VALID; 4576 iunlock(ip);
4527 if(ip->type == 0) 4577 iput(ip);
4528 panic("ilock: no type"); 4578 }
4529 } 4579
4530 } 4580
4531 4581
4532 // Unlock the given inode. 4582
4533 void 4583
4534 iunlock(struct inode *ip) 4584
4535 { 4585
4536 if(ip == 0 || !(ip->flags & I_BUSY) || ip->ref < 1) 4586
4537 panic("iunlock"); 4587
4538 4588
4539 acquire(&icache.lock); 4589
4540 ip->flags &= ~I_BUSY; 4590
4541 wakeup(ip); 4591
4542 release(&icache.lock); 4592
4543 } 4593
4544 4594
4545 4595
4546 4596
4547 4597
4548 4598
4549 4599
Sheet 45 Sheet 45
Sep 5 23:39 2011 xv6/fs.c Page 8 Sep 5 23:39 2011 xv6/fs.c Page 9
Sheet 46 Sheet 46
Sep 5 23:39 2011 xv6/fs.c Page 10 Sep 5 23:39 2011 xv6/fs.c Page 11
Sheet 47 Sheet 47
Sep 5 23:39 2011 xv6/fs.c Page 12 Sep 5 23:39 2011 xv6/fs.c Page 13
4800 // Directories 4850 // Write a new directory entry (name, inum) into the directory dp.
4801 4851 int
4802 int 4852 dirlink(struct inode *dp, char *name, uint inum)
4803 namecmp(const char *s, const char *t) 4853 {
4804 { 4854 int off;
4805 return strncmp(s, t, DIRSIZ); 4855 struct dirent de;
4806 } 4856 struct inode *ip;
4807 4857
4808 // Look for a directory entry in a directory. 4858 // Check that name is not present.
4809 // If found, set *poff to byte offset of entry. 4859 if((ip = dirlookup(dp, name, 0)) != 0){
4810 // Caller must have already locked dp. 4860 iput(ip);
4811 struct inode* 4861 return -1;
4812 dirlookup(struct inode *dp, char *name, uint *poff) 4862 }
4813 { 4863
4814 uint off, inum; 4864 // Look for an empty dirent.
4815 struct dirent de; 4865 for(off = 0; off < dp->size; off += sizeof(de)){
4816 4866 if(readi(dp, (char*)&de, off, sizeof(de)) != sizeof(de))
4817 if(dp->type != T_DIR) 4867 panic("dirlink read");
4818 panic("dirlookup not DIR"); 4868 if(de.inum == 0)
4819 4869 break;
4820 for(off = 0; off < dp->size; off += sizeof(de)){ 4870 }
4821 if(readi(dp, (char*)&de, off, sizeof(de)) != sizeof(de)) 4871
4822 panic("dirlink read"); 4872 strncpy(de.name, name, DIRSIZ);
4823 if(de.inum == 0) 4873 de.inum = inum;
4824 continue; 4874 if(writei(dp, (char*)&de, off, sizeof(de)) != sizeof(de))
4825 if(namecmp(name, de.name) == 0){ 4875 panic("dirlink");
4826 // entry matches path element 4876
4827 if(poff) 4877 return 0;
4828 *poff = off; 4878 }
4829 inum = de.inum; 4879
4830 return iget(dp->dev, inum); 4880
4831 } 4881
4832 } 4882
4833 4883
4834 return 0; 4884
4835 } 4885
4836 4886
4837 4887
4838 4888
4839 4889
4840 4890
4841 4891
4842 4892
4843 4893
4844 4894
4845 4895
4846 4896
4847 4897
4848 4898
4849 4899
Sheet 48 Sheet 48
Sep 5 23:39 2011 xv6/fs.c Page 14 Sep 5 23:39 2011 xv6/fs.c Page 15
4900 // Paths 4950 // Look up and return the inode for a path name.
4901 4951 // If parent != 0, return the inode for the parent and copy the final
4902 // Copy the next path element from path into name. 4952 // path element into name, which must have room for DIRSIZ bytes.
4903 // Return a pointer to the element following the copied one. 4953 static struct inode*
4904 // The returned path has no leading slashes, 4954 namex(char *path, int nameiparent, char *name)
4905 // so the caller can check *path=='\0' to see if the name is the last one. 4955 {
4906 // If no name to remove, return 0. 4956 struct inode *ip, *next;
4907 // 4957
4908 // Examples: 4958 if(*path == '/')
4909 // skipelem("a/bb/c", name) = "bb/c", setting name = "a" 4959 ip = iget(ROOTDEV, ROOTINO);
4910 // skipelem("///a//bb", name) = "bb", setting name = "a" 4960 else
4911 // skipelem("a", name) = "", setting name = "a" 4961 ip = idup(proc->cwd);
4912 // skipelem("", name) = skipelem("////", name) = 0 4962
4913 // 4963 while((path = skipelem(path, name)) != 0){
4914 static char* 4964 ilock(ip);
4915 skipelem(char *path, char *name) 4965 if(ip->type != T_DIR){
4916 { 4966 iunlockput(ip);
4917 char *s; 4967 return 0;
4918 int len; 4968 }
4919 4969 if(nameiparent && *path == '\0'){
4920 while(*path == '/') 4970 // Stop one level early.
4921 path++; 4971 iunlock(ip);
4922 if(*path == 0) 4972 return ip;
4923 return 0; 4973 }
4924 s = path; 4974 if((next = dirlookup(ip, name, 0)) == 0){
4925 while(*path != '/' && *path != 0) 4975 iunlockput(ip);
4926 path++; 4976 return 0;
4927 len = path - s; 4977 }
4928 if(len >= DIRSIZ) 4978 iunlockput(ip);
4929 memmove(name, s, DIRSIZ); 4979 ip = next;
4930 else { 4980 }
4931 memmove(name, s, len); 4981 if(nameiparent){
4932 name[len] = 0; 4982 iput(ip);
4933 } 4983 return 0;
4934 while(*path == '/') 4984 }
4935 path++; 4985 return ip;
4936 return path; 4986 }
4937 } 4987
4938 4988 struct inode*
4939 4989 namei(char *path)
4940 4990 {
4941 4991 char name[DIRSIZ];
4942 4992 return namex(path, 0, name);
4943 4993 }
4944 4994
4945 4995 struct inode*
4946 4996 nameiparent(char *path, char *name)
4947 4997 {
4948 4998 return namex(path, 1, name);
4949 4999 }
Sheet 49 Sheet 49
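The examples in the skipelem comment can be checked directly with a user-space port. This is a sketch under two assumptions: the `toy_` names are hypothetical, and `TOY_DIRSIZ` is set to 14 because the listing on this sheet only names DIRSIZ without giving its value.

```c
#include <string.h>

#define TOY_DIRSIZ 14   /* assumed value; the listing only names DIRSIZ */

/* User-space port of skipelem. As in the listing, name is not
   nul-terminated when the element is TOY_DIRSIZ bytes or longer. */
static const char *
toy_skipelem(const char *path, char *name)
{
  const char *s;
  size_t len;

  while (*path == '/')
    path++;
  if (*path == '\0')
    return NULL;
  s = path;
  while (*path != '/' && *path != '\0')
    path++;
  len = (size_t)(path - s);
  if (len >= TOY_DIRSIZ) {
    memmove(name, s, TOY_DIRSIZ);
  } else {
    memmove(name, s, len);
    name[len] = '\0';
  }
  while (*path == '/')
    path++;
  return path;
}
```

Calling it in a loop until it returns NULL walks a path one element at a time, which is how namex uses it; an empty remainder after the call signals that the element just copied was the last one.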
Sep 5 23:39 2011 xv6/file.c Page 1 Sep 5 23:39 2011 xv6/file.c Page 2
5000 #include "types.h" 5050 // Close file f. (Decrement ref count, close when reaches 0.)
5001 #include "defs.h" 5051 void
5002 #include "param.h" 5052 fileclose(struct file *f)
5003 #include "fs.h" 5053 {
5004 #include "file.h" 5054 struct file ff;
5005 #include "spinlock.h" 5055
5006 5056 acquire(&ftable.lock);
5007 struct devsw devsw[NDEV]; 5057 if(f->ref < 1)
5008 struct { 5058 panic("fileclose");
5009 struct spinlock lock; 5059 if(--f->ref > 0){
5010 struct file file[NFILE]; 5060 release(&ftable.lock);
5011 } ftable; 5061 return;
5012 5062 }
5013 void 5063 ff = *f;
5014 fileinit(void) 5064 f->ref = 0;
5015 { 5065 f->type = FD_NONE;
5016 initlock(&ftable.lock, "ftable"); 5066 release(&ftable.lock);
5017 } 5067
5018 5068 if(ff.type == FD_PIPE)
5019 // Allocate a file structure. 5069 pipeclose(ff.pipe, ff.writable);
5020 struct file* 5070 else if(ff.type == FD_INODE){
5021 filealloc(void) 5071 begin_trans();
5022 { 5072 iput(ff.ip);
5023 struct file *f; 5073 commit_trans();
5024 5074 }
5025 acquire(&ftable.lock); 5075 }
5026 for(f = ftable.file; f < ftable.file + NFILE; f++){ 5076
5027 if(f->ref == 0){ 5077 // Get metadata about file f.
5028 f->ref = 1; 5078 int
5029 release(&ftable.lock); 5079 filestat(struct file *f, struct stat *st)
5030 return f; 5080 {
5031 } 5081 if(f->type == FD_INODE){
5032 } 5082 ilock(f->ip);
5033 release(&ftable.lock); 5083 stati(f->ip, st);
5034 return 0; 5084 iunlock(f->ip);
5035 } 5085 return 0;
5036 5086 }
5037 // Increment ref count for file f. 5087 return -1;
5038 struct file* 5088 }
5039 filedup(struct file *f) 5089
5040 { 5090
5041 acquire(&ftable.lock); 5091
5042 if(f->ref < 1) 5092
5043 panic("filedup"); 5093
5044 f->ref++; 5094
5045 release(&ftable.lock); 5095
5046 return f; 5096
5047 } 5097
5048 5098
5049 5099
Sheet 50 Sheet 50
Sep 5 23:39 2011 xv6/file.c Page 3 Sep 5 23:39 2011 xv6/file.c Page 4
5100 // Read from file f. Addr is kernel address. 5150 // Write to file f. Addr is kernel address.
5101 int 5151 int
5102 fileread(struct file *f, char *addr, int n) 5152 filewrite(struct file *f, char *addr, int n)
5103 { 5153 {
5104 int r; 5154 int r;
5105 5155
5106 if(f->readable == 0) 5156 if(f->writable == 0)
5107 return -1; 5157 return -1;
5108 if(f->type == FD_PIPE) 5158 if(f->type == FD_PIPE)
5109 return piperead(f->pipe, addr, n); 5159 return pipewrite(f->pipe, addr, n);
5110 if(f->type == FD_INODE){ 5160 if(f->type == FD_INODE){
5111 ilock(f->ip); 5161 // write a few blocks at a time to avoid exceeding
5112 if((r = readi(f->ip, addr, f->off, n)) > 0) 5162 // the maximum log transaction size, including
5113 f->off += r; 5163 // inode, indirect block, allocation blocks,
5114 iunlock(f->ip); 5164 // and 2 blocks of slop for non-aligned writes.
5115 return r; 5165 // this really belongs lower down, since writei()
5116 } 5166 // might be writing a device like the console.
5117 panic("fileread"); 5167 int max = ((LOGSIZE-1-1-2) / 2) * 512;
5118 } 5168 int i = 0;
5119 5169 while(i < n){
5120 5170 int n1 = n - i;
5121 5171 if(n1 > max)
5122 5172 n1 = max;
5123 5173
5124 5174 begin_trans();
5125 5175 ilock(f->ip);
5126 5176 if ((r = writei(f->ip, addr + i, f->off, n1)) > 0)
5127 5177 f->off += r;
5128 5178 iunlock(f->ip);
5129 5179 commit_trans();
5130 5180
5131 5181 if(r < 0)
5132 5182 break;
5133 5183 if(r != n1)
5134 5184 panic("short filewrite");
5135 5185 i += r;
5136 5186 }
5137 5187 return i == n ? n : -1;
5138 5188 }
5139 5189 panic("filewrite");
5140 5190 }
5141 5191
5142 5192
5143 5193
5144 5194
5145 5195
5146 5196
5147 5197
5148 5198
5149 5199
Sheet 51 Sheet 51
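The chunking in filewrite is easy to check with arithmetic alone. The sketch below computes the per-transaction limit from the listing's formula and counts how many transactions a write would need. `TOY_LOGSIZE` is assumed to be 10 because LOGSIZE is defined outside this sheet, and the `toy_` names are hypothetical.

```c
#define TOY_LOGSIZE 10   /* assumed; the listing defines LOGSIZE elsewhere */
#define TOY_BSIZE   512

/* Largest write per transaction, following the listing's formula:
   subtract blocks for the inode, an indirect block, and two blocks of
   slop; the division by two presumably leaves room for allocation
   blocks, as the listing's comment suggests. */
static int
toy_max_chunk(void)
{
  return ((TOY_LOGSIZE - 1 - 1 - 2) / 2) * TOY_BSIZE;
}

/* Count how many transactions a write of n bytes would be split into. */
static int
toy_count_chunks(int n)
{
  int max = toy_max_chunk();
  int chunks = 0;
  for (int i = 0; i < n; ) {
    int n1 = n - i;
    if (n1 > max)
      n1 = max;
    i += n1;
    chunks++;
  }
  return chunks;
}
```

With the assumed log size the limit is 1536 bytes, so a 4000-byte write would be committed as three separate transactions, each of which is atomic on its own but not as a group.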
Sep 5 23:39 2011 xv6/sysfile.c Page 1 Sep 5 23:39 2011 xv6/sysfile.c Page 2
Sheet 52 Sheet 52
Sep 5 23:39 2011 xv6/sysfile.c Page 3 Sep 5 23:39 2011 xv6/sysfile.c Page 4
Sheet 53 Sheet 53
Sep 5 23:39 2011 xv6/sysfile.c Page 5 Sep 5 23:39 2011 xv6/sysfile.c Page 6
Sheet 54 Sheet 54
Sep 5 23:39 2011 xv6/sysfile.c Page 7 Sep 5 23:39 2011 xv6/sysfile.c Page 8
Sheet 55 Sheet 55
Sep 5 23:39 2011 xv6/sysfile.c Page 9 Sep 5 23:39 2011 xv6/sysfile.c Page 10
Sheet 56 Sheet 56
Sep 5 23:39 2011 xv6/exec.c Page 1 Sep 5 23:39 2011 xv6/exec.c Page 2
5700 #include "types.h" 5750 // Allocate two pages at the next page boundary.
5701 #include "param.h" 5751 // Make the first inaccessible. Use the second as the user stack.
5702 #include "memlayout.h" 5752 sz = PGROUNDUP(sz);
5703 #include "mmu.h" 5753 if((sz = allocuvm(pgdir, sz, sz + 2*PGSIZE)) == 0)
5704 #include "proc.h" 5754 goto bad;
5705 #include "defs.h" 5755 clearpteu(pgdir, (char*)(sz - 2*PGSIZE));
5706 #include "x86.h" 5756 sp = sz;
5707 #include "elf.h" 5757
5708 5758 // Push argument strings, prepare rest of stack in ustack.
5709 int 5759 for(argc = 0; argv[argc]; argc++) {
5710 exec(char *path, char **argv) 5760 if(argc >= MAXARG)
5711 { 5761 goto bad;
5712 char *s, *last; 5762 sp = (sp - (strlen(argv[argc]) + 1)) & ~3;
5713 int i, off; 5763 if(copyout(pgdir, sp, argv[argc], strlen(argv[argc]) + 1) < 0)
5714 uint argc, sz, sp, ustack[3+MAXARG+1]; 5764 goto bad;
5715 struct elfhdr elf; 5765 ustack[3+argc] = sp;
5716 struct inode *ip; 5766 }
5717 struct proghdr ph; 5767 ustack[3+argc] = 0;
5718 pde_t *pgdir, *oldpgdir; 5768
5719 5769 ustack[0] = 0xffffffff; // fake return PC
5720 if((ip = namei(path)) == 0) 5770 ustack[1] = argc;
5721 return -1; 5771 ustack[2] = sp - (argc+1)*4; // argv pointer
5722 ilock(ip); 5772
5723 pgdir = 0; 5773 sp -= (3+argc+1) * 4;
5724 5774 if(copyout(pgdir, sp, ustack, (3+argc+1)*4) < 0)
5725 // Check ELF header 5775 goto bad;
5726 if(readi(ip, (char*)&elf, 0, sizeof(elf)) < sizeof(elf)) 5776
5727 goto bad; 5777 // Save program name for debugging.
5728 if(elf.magic != ELF_MAGIC) 5778 for(last=s=path; *s; s++)
5729 goto bad; 5779 if(*s == '/')
5730 5780 last = s+1;
5731 if((pgdir = setupkvm(kalloc)) == 0) 5781 safestrcpy(proc->name, last, sizeof(proc->name));
5732 goto bad; 5782
5733 5783 // Commit to the user image.
5734 // Load program into memory. 5784 oldpgdir = proc->pgdir;
5735 sz = 0; 5785 proc->pgdir = pgdir;
5736 for(i=0, off=elf.phoff; i<elf.phnum; i++, off+=sizeof(ph)){ 5786 proc->sz = sz;
5737 if(readi(ip, (char*)&ph, off, sizeof(ph)) != sizeof(ph)) 5787 proc->tf->eip = elf.entry; // main
5738 goto bad; 5788 proc->tf->esp = sp;
5739 if(ph.type != ELF_PROG_LOAD) 5789 switchuvm(proc);
5740 continue; 5790 freevm(oldpgdir);
5741 if(ph.memsz < ph.filesz) 5791 return 0;
5742 goto bad; 5792
5743 if((sz = allocuvm(pgdir, sz, ph.vaddr + ph.memsz)) == 0) 5793 bad:
5744 goto bad; 5794 if(pgdir)
5745 if(loaduvm(pgdir, (char*)ph.vaddr, ip, ph.off, ph.filesz) < 0) 5795 freevm(pgdir);
5746 goto bad; 5796 if(ip)
5747 } 5797 iunlockput(ip);
5748 iunlockput(ip); 5798 return -1;
5749 ip = 0; 5799 }
Sheet 57 Sheet 57
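The stack arithmetic in exec can be followed without any page tables. The sketch below repeats only the address computation: each argument string is placed below the previous one at a 4-byte aligned address, then room is reserved for the three header words, argc pointers, and a terminating null. The function name and its parameters are hypothetical, and nothing is actually copied.

```c
#include <stdint.h>
#include <string.h>

/* Compute where exec would place each argument string below top,
   keeping every address 4-byte aligned. Fills addrs[0..argc-1],
   stores the argument count, and returns the stack pointer after
   the ustack words are reserved. Pure arithmetic on 32-bit values. */
static uint32_t
toy_layout(uint32_t top, const char *const argv[], uint32_t addrs[],
           uint32_t *argc_out)
{
  uint32_t sp = top;
  uint32_t argc;

  for (argc = 0; argv[argc]; argc++) {
    sp = (sp - (uint32_t)(strlen(argv[argc]) + 1)) & ~(uint32_t)3;
    addrs[argc] = sp;
  }
  *argc_out = argc;
  /* Three header words (fake return PC, argc, argv pointer), then
     argc string pointers and a terminating null pointer. */
  sp -= (3 + argc + 1) * 4;
  return sp;
}
```

For a stack top of 0x3000 and the arguments "ls" and "-l", the strings land at 0x2FFC and 0x2FF8, and the final stack pointer is 0x2FE0, 24 bytes below the last string.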
Sep 5 23:39 2011 xv6/pipe.c Page 1 Sep 5 23:39 2011 xv6/pipe.c Page 2
Sheet 58 Sheet 58
Sep 5 23:39 2011 xv6/pipe.c Page 3 Sep 5 23:39 2011 xv6/string.c Page 1
Sheet 59 Sheet 59
Sep 5 23:39 2011 xv6/string.c Page 2 Sep 5 23:39 2011 xv6/string.c Page 3
Sheet 60 Sheet 60
Sep 5 23:39 2011 xv6/mp.h Page 1 Sep 5 23:39 2011 xv6/mp.h Page 2
6100 // See MultiProcessor Specification Version 1.[14] 6150 // Table entry types
6101 6151 #define MPPROC 0x00 // One per processor
6102 struct mp { // floating pointer 6152 #define MPBUS 0x01 // One per bus
6103 uchar signature[4]; // "_MP_" 6153 #define MPIOAPIC 0x02 // One per I/O APIC
6104 void *physaddr; // phys addr of MP config table 6154 #define MPIOINTR 0x03 // One per bus interrupt source
6105 uchar length; // 1 6155 #define MPLINTR 0x04 // One per system interrupt source
6106 uchar specrev; // [14] 6156
6107 uchar checksum; // all bytes must add up to 0 6157
6108 uchar type; // MP system config type 6158
6109 uchar imcrp; 6159
6110 uchar reserved[3]; 6160
6111 }; 6161
6112 6162
6113 struct mpconf { // configuration table header 6163
6114 uchar signature[4]; // "PCMP" 6164
6115 ushort length; // total table length 6165
6116 uchar version; // [14] 6166
6117 uchar checksum; // all bytes must add up to 0 6167
6118 uchar product[20]; // product id 6168
6119 uint *oemtable; // OEM table pointer 6169
6120 ushort oemlength; // OEM table length 6170
6121 ushort entry; // entry count 6171
6122 uint *lapicaddr; // address of local APIC 6172
6123 ushort xlength; // extended table length 6173
6124 uchar xchecksum; // extended table checksum 6174
6125 uchar reserved; 6175
6126 }; 6176
6127 6177
6128 struct mpproc { // processor table entry 6178
6129 uchar type; // entry type (0) 6179
6130 uchar apicid; // local APIC id 6180
6131 uchar version; // local APIC version 6181
6132 uchar flags; // CPU flags 6182
6133 #define MPBOOT 0x02 // This proc is the bootstrap processor. 6183
6134 uchar signature[4]; // CPU signature 6184
6135 uint feature; // feature flags from CPUID instruction 6185
6136 uchar reserved[8]; 6186
6137 }; 6187
6138 6188
6139 struct mpioapic { // I/O APIC table entry 6189
6140 uchar type; // entry type (2) 6190
6141 uchar apicno; // I/O APIC id 6191
6142 uchar version; // I/O APIC version 6192
6143 uchar flags; // I/O APIC flags 6193
6144 uint *addr; // I/O APIC address 6194
6145 }; 6195
6146 6196
6147 6197
6148 6198
6149 6199
Sheet 61 Sheet 61
Sep 5 23:39 2011 xv6/mp.c Page 1 Sep 5 23:39 2011 xv6/mp.c Page 2
6200 // Multiprocessor support 6250 // Search for the MP Floating Pointer Structure, which according to the
6201 // Search memory for MP description structures. 6251 // spec is in one of the following three locations:
6202 // http://developer.intel.com/design/pentium/datashts/24201606.pdf 6252 // 1) in the first KB of the EBDA;
6203 6253 // 2) in the last KB of system base memory;
6204 #include "types.h" 6254 // 3) in the BIOS ROM between 0xE0000 and 0xFFFFF.
6205 #include "defs.h" 6255 static struct mp*
6206 #include "param.h" 6256 mpsearch(void)
6207 #include "memlayout.h" 6257 {
6208 #include "mp.h" 6258 uchar *bda;
6209 #include "x86.h" 6259 uint p;
6210 #include "mmu.h" 6260 struct mp *mp;
6211 #include "proc.h" 6261
6212 6262 bda = (uchar *) P2V(0x400);
6213 struct cpu cpus[NCPU]; 6263 if((p = ((bda[0x0F]<<8)| bda[0x0E]) << 4)){
6214 static struct cpu *bcpu; 6264 if((mp = mpsearch1(p, 1024)))
6215 int ismp; 6265 return mp;
6216 int ncpu; 6266 } else {
6217 uchar ioapicid; 6267 p = ((bda[0x14]<<8)|bda[0x13])*1024;
6218 6268 if((mp = mpsearch1(p-1024, 1024)))
6219 int 6269 return mp;
6220 mpbcpu(void) 6270 }
6221 { 6271 return mpsearch1(0xF0000, 0x10000);
6222 return bcpu-cpus; 6272 }
6223 } 6273
6224 6274 // Search for an MP configuration table. For now,
6225 static uchar 6275 // don't accept the default configurations (physaddr == 0).
6226 sum(uchar *addr, int len) 6276 // Check for correct signature, calculate the checksum and,
6227 { 6277 // if correct, check the version.
6228 int i, sum; 6278 // To do: check extended table checksum.
6229 6279 static struct mpconf*
6230 sum = 0; 6280 mpconfig(struct mp **pmp)
6231 for(i=0; i<len; i++) 6281 {
6232 sum += addr[i]; 6282 struct mpconf *conf;
6233 return sum; 6283 struct mp *mp;
6234 } 6284
6235 6285 if((mp = mpsearch()) == 0 || mp->physaddr == 0)
6236 // Look for an MP structure in the len bytes at addr. 6286 return 0;
6237 static struct mp* 6287 conf = (struct mpconf*) p2v((uint) mp->physaddr);
6238 mpsearch1(uint a, int len) 6288 if(memcmp(conf, "PCMP", 4) != 0)
6239 { 6289 return 0;
6240 uchar *e, *p, *addr; 6290 if(conf->version != 1 && conf->version != 4)
6241 6291 return 0;
6242 addr = p2v(a); 6292 if(sum((uchar*)conf, conf->length) != 0)
6243 e = addr+len; 6293 return 0;
6244 for(p = addr; p < e; p += sizeof(struct mp)) 6294 *pmp = mp;
6245 if(memcmp(p, "_MP_", 4) == 0 && sum(p, sizeof(struct mp)) == 0) 6295 return conf;
6246 return (struct mp*)p; 6296 }
6247 return 0; 6297
6248 } 6298
6249 6299
Sheet 62 Sheet 62
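The checksum rule used by sum() and mpsearch1() above can be tried outside the kernel. The following is a minimal user-space sketch, not part of xv6: it assumes a 16-byte record (the size of the MP floating pointer structure in the Intel specification), and the names checksum, seal and find_mp are hypothetical helpers introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MP_RECORD_SIZE 16  /* assumption: size of the floating pointer structure */

/* Add up all bytes modulo 256, mirroring sum() in the listing. */
static unsigned char checksum(const unsigned char *addr, size_t len)
{
  unsigned int total = 0;
  for (size_t i = 0; i < len; i++)
    total += addr[i];
  return (unsigned char)total;
}

/* Hypothetical helper: set buf[pos] so that the whole block sums to zero. */
static void seal(unsigned char *buf, size_t len, size_t pos)
{
  assert(pos < len);
  buf[pos] = 0;
  buf[pos] = (unsigned char)(0u - checksum(buf, len));
}

/* Scan buf in record-sized steps for a "_MP_" signature with a valid
   checksum, as mpsearch1() does; return the offset or -1 if none. */
static long find_mp(const unsigned char *buf, size_t len)
{
  for (size_t off = 0; off + MP_RECORD_SIZE <= len; off += MP_RECORD_SIZE)
    if (memcmp(buf + off, "_MP_", 4) == 0 &&
        checksum(buf + off, MP_RECORD_SIZE) == 0)
      return (long)off;
  return -1;
}
```

A record that carries the signature but a wrong checksum is skipped; once seal() fixes the checksum byte, find_mp() reports its offset.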
Sep 5 23:39 2011 xv6/mp.c Page 3 Sep 5 23:39 2011 xv6/mp.c Page 4
Sheet 63 Sheet 63
Sep 5 23:39 2011 xv6/lapic.c Page 1 Sep 5 23:39 2011 xv6/lapic.c Page 2
6400 // The local APIC manages internal (non-I/O) interrupts. 6450 void
6401 // See Chapter 8 & Appendix C of Intel processor manual volume 3. 6451 lapicinit(int c)
6402 6452 {
6403 #include "types.h" 6453 if(!lapic)
6404 #include "defs.h" 6454 return;
6405 #include "memlayout.h" 6455
6406 #include "traps.h" 6456 // Enable local APIC; set spurious interrupt vector.
6407 #include "mmu.h" 6457 lapicw(SVR, ENABLE | (T_IRQ0 + IRQ_SPURIOUS));
6408 #include "x86.h" 6458
6409 6459 // The timer repeatedly counts down at bus frequency
6410 // Local APIC registers, divided by 4 for use as uint[] indices. 6460 // from lapic[TICR] and then issues an interrupt.
6411 #define ID (0x0020/4) // ID 6461 // If xv6 cared more about precise timekeeping,
6412 #define VER (0x0030/4) // Version 6462 // TICR would be calibrated using an external time source.
6413 #define TPR (0x0080/4) // Task Priority 6463 lapicw(TDCR, X1);
6414 #define EOI (0x00B0/4) // EOI 6464 lapicw(TIMER, PERIODIC | (T_IRQ0 + IRQ_TIMER));
6415 #define SVR (0x00F0/4) // Spurious Interrupt Vector 6465 lapicw(TICR, 10000000);
6416 #define ENABLE 0x00000100 // Unit Enable 6466
6417 #define ESR (0x0280/4) // Error Status 6467 // Disable logical interrupt lines.
6418 #define ICRLO (0x0300/4) // Interrupt Command 6468 lapicw(LINT0, MASKED);
6419 #define INIT 0x00000500 // INIT/RESET 6469 lapicw(LINT1, MASKED);
6420 #define STARTUP 0x00000600 // Startup IPI 6470
6421 #define DELIVS 0x00001000 // Delivery status 6471 // Disable performance counter overflow interrupts
6422 #define ASSERT 0x00004000 // Assert interrupt (vs deassert) 6472 // on machines that provide that interrupt entry.
6423 #define DEASSERT 0x00000000 6473 if(((lapic[VER]>>16) & 0xFF) >= 4)
6424 #define LEVEL 0x00008000 // Level triggered 6474 lapicw(PCINT, MASKED);
6425 #define BCAST 0x00080000 // Send to all APICs, including self. 6475
6426 #define BUSY 0x00001000 6476 // Map error interrupt to IRQ_ERROR.
6427 #define FIXED 0x00000000 6477 lapicw(ERROR, T_IRQ0 + IRQ_ERROR);
6428 #define ICRHI (0x0310/4) // Interrupt Command [63:32] 6478
6429 #define TIMER (0x0320/4) // Local Vector Table 0 (TIMER) 6479 // Clear error status register (requires back-to-back writes).
6430 #define X1 0x0000000B // divide counts by 1 6480 lapicw(ESR, 0);
6431 #define PERIODIC 0x00020000 // Periodic 6481 lapicw(ESR, 0);
6432 #define PCINT (0x0340/4) // Performance Counter LVT 6482
6433 #define LINT0 (0x0350/4) // Local Vector Table 1 (LINT0) 6483 // Ack any outstanding interrupts.
6434 #define LINT1 (0x0360/4) // Local Vector Table 2 (LINT1) 6484 lapicw(EOI, 0);
6435 #define ERROR (0x0370/4) // Local Vector Table 3 (ERROR) 6485
6436 #define MASKED 0x00010000 // Interrupt masked 6486 // Send an Init Level De-Assert to synchronise arbitration IDs.
6437 #define TICR (0x0380/4) // Timer Initial Count 6487 lapicw(ICRHI, 0);
6438 #define TCCR (0x0390/4) // Timer Current Count 6488 lapicw(ICRLO, BCAST | INIT | LEVEL);
6439 #define TDCR (0x03E0/4) // Timer Divide Configuration 6489 while(lapic[ICRLO] & DELIVS)
6440 6490 ;
6441 volatile uint *lapic; // Initialized in mp.c 6491
6442 6492 // Enable interrupts on the APIC (but not on the processor).
6443 static void 6493 lapicw(TPR, 0);
6444 lapicw(int index, int value) 6494 }
6445 { 6495
6446 lapic[index] = value; 6496
6447 lapic[ID]; // wait for write to finish, by reading 6497
6448 } 6498
6449 6499
Sheet 64 Sheet 64
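The register macros above divide each byte offset by 4 because lapic is accessed as an array of 32-bit words. The sketch below illustrates that arithmetic and two of the values lapicinit() builds. It is a user-space illustration only; T_IRQ0 = 32 and IRQ_TIMER = 0 are assumed values from traps.h (not shown on these sheets), and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define TIMER_OFFSET 0x0320      /* byte offset of the timer LVT register */
#define PERIODIC     0x00020000u /* periodic-mode bit, as in the listing */
#define T_IRQ0       32          /* assumption: value from traps.h */
#define IRQ_TIMER    0           /* assumption: value from traps.h */

/* Convert a register's byte offset into an index into a uint32_t array. */
static unsigned reg_index(unsigned byte_offset)
{
  assert(byte_offset % 4 == 0);
  return byte_offset / 4;
}

/* Value written to the timer LVT: periodic mode plus the interrupt vector. */
static uint32_t timer_lvt(unsigned vector)
{
  return PERIODIC | vector;
}

/* Highest LVT entry number, taken from bits 16..23 of the version register. */
static unsigned max_lvt_entry(uint32_t version_reg)
{
  return (version_reg >> 16) & 0xFF;
}
```

Under these assumptions the timer register sits at word index 200, and lapicinit() only masks PCINT when max_lvt_entry() reports at least 4.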
Sep 5 23:39 2011 xv6/lapic.c Page 3 Sep 5 23:39 2011 xv6/lapic.c Page 4
Sheet 65 Sheet 65
Sep 5 23:39 2011 xv6/ioapic.c Page 1 Sep 5 23:39 2011 xv6/ioapic.c Page 2
6600 // The I/O APIC manages hardware interrupts for an SMP system. 6650 void
6601 // http://www.intel.com/design/chipsets/datashts/29056601.pdf 6651 ioapicinit(void)
6602 // See also picirq.c. 6652 {
6603 6653 int i, id, maxintr;
6604 #include "types.h" 6654
6605 #include "defs.h" 6655 if(!ismp)
6606 #include "traps.h" 6656 return;
6607 6657
6608 #define IOAPIC 0xFEC00000 // Default physical address of IO APIC 6658 ioapic = (volatile struct ioapic*)IOAPIC;
6609 6659 maxintr = (ioapicread(REG_VER) >> 16) & 0xFF;
6610 #define REG_ID 0x00 // Register index: ID 6660 id = ioapicread(REG_ID) >> 24;
6611 #define REG_VER 0x01 // Register index: version 6661 if(id != ioapicid)
6612 #define REG_TABLE 0x10 // Redirection table base 6662 cprintf("ioapicinit: id isn't equal to ioapicid; not a MP\n");
6613 6663
6614 // The redirection table starts at REG_TABLE and uses 6664 // Mark all interrupts edge-triggered, active high, disabled,
6615 // two registers to configure each interrupt. 6665 // and not routed to any CPUs.
6616 // The first (low) register in a pair contains configuration bits. 6666 for(i = 0; i <= maxintr; i++){
6617 // The second (high) register contains a bitmask telling which 6667 ioapicwrite(REG_TABLE+2*i, INT_DISABLED | (T_IRQ0 + i));
6618 // CPUs can serve that interrupt. 6668 ioapicwrite(REG_TABLE+2*i+1, 0);
6619 #define INT_DISABLED 0x00010000 // Interrupt disabled 6669 }
6620 #define INT_LEVEL 0x00008000 // Level-triggered (vs edge) 6670 }
6621 #define INT_ACTIVELOW 0x00002000 // Active low (vs high) 6671
6622 #define INT_LOGICAL 0x00000800 // Destination is CPU id (vs APIC ID) 6672 void
6623 6673 ioapicenable(int irq, int cpunum)
6624 volatile struct ioapic *ioapic; 6674 {
6625 6675 if(!ismp)
6626 // IO APIC MMIO structure: write reg, then read or write data. 6676 return;
6627 struct ioapic { 6677
6628 uint reg; 6678 // Mark interrupt edge-triggered, active high,
6629 uint pad[3]; 6679 // enabled, and routed to the given cpunum,
6630 uint data; 6680 // which happens to be that cpu's APIC ID.
6631 }; 6681 ioapicwrite(REG_TABLE+2*irq, T_IRQ0 + irq);
6632 6682 ioapicwrite(REG_TABLE+2*irq+1, cpunum << 24);
6633 static uint 6683 }
6634 ioapicread(int reg) 6684
6635 { 6685
6636 ioapic->reg = reg; 6686
6637 return ioapic->data; 6687
6638 } 6688
6639 6689
6640 static void 6690
6641 ioapicwrite(int reg, uint data) 6691
6642 { 6692
6643 ioapic->reg = reg; 6693
6644 ioapic->data = data; 6694
6645 } 6695
6646 6696
6647 6697
6648 6698
6649 6699
Sheet 66 Sheet 66
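The redirection-table layout described above (two registers per interrupt, starting at REG_TABLE) can be modelled with an ordinary array. This is a sketch under stated assumptions: fake_regs stands in for the real memory-mapped device, T_IRQ0 = 32 is the assumed value from traps.h, and fake_write, disable_irq and route_irq are hypothetical names that mirror ioapicwrite(), the loop in ioapicinit() and ioapicenable().

```c
#include <assert.h>
#include <stdint.h>

#define REG_TABLE    0x10        /* redirection table base, as in the listing */
#define INT_DISABLED 0x00010000u /* interrupt disabled, as in the listing */
#define T_IRQ0       32          /* assumption: value from traps.h */
#define NIRQ         24          /* assumption: entries in this simulated table */

/* Simulated register file; the real device is reached through MMIO. */
static uint32_t fake_regs[REG_TABLE + 2 * NIRQ];

static void fake_write(int reg, uint32_t data)
{
  assert(reg >= 0 && reg < REG_TABLE + 2 * NIRQ);
  fake_regs[reg] = data;
}

/* Mask one interrupt and route it to no CPU (the ioapicinit loop body). */
static void disable_irq(int irq)
{
  fake_write(REG_TABLE + 2 * irq, INT_DISABLED | (uint32_t)(T_IRQ0 + irq));
  fake_write(REG_TABLE + 2 * irq + 1, 0);
}

/* Enable one interrupt and route it to the CPU with the given APIC ID. */
static void route_irq(int irq, int apicid)
{
  fake_write(REG_TABLE + 2 * irq, (uint32_t)(T_IRQ0 + irq));
  fake_write(REG_TABLE + 2 * irq + 1, (uint32_t)apicid << 24);
}
```

The low register of each pair carries the vector and flag bits; the destination occupies the top byte of the high register.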
Sep 5 23:39 2011 xv6/picirq.c Page 1 Sep 5 23:39 2011 xv6/picirq.c Page 2
6700 // Intel 8259A programmable interrupt controllers. 6750 // ICW3: (master PIC) bit mask of IR lines connected to slaves
6701 6751 // (slave PIC) 3-bit # of slave's connection to master
6702 #include "types.h" 6752 outb(IO_PIC1+1, 1<<IRQ_SLAVE);
6703 #include "x86.h" 6753
6704 #include "traps.h" 6754 // ICW4: 000nbmap
6705 6755 // n: 1 = special fully nested mode
6706 // I/O Addresses of the two programmable interrupt controllers 6756 // b: 1 = buffered mode
6707 #define IO_PIC1 0x20 // Master (IRQs 0-7) 6757 // m: 0 = slave PIC, 1 = master PIC
6708 #define IO_PIC2 0xA0 // Slave (IRQs 8-15) 6758 // (ignored when b is 0, as the master/slave role
6709 6759 // can be hardwired).
6710 #define IRQ_SLAVE 2 // IRQ at which slave connects to master 6760 // a: 1 = Automatic EOI mode
6711 6761 // p: 0 = MCS-80/85 mode, 1 = intel x86 mode
6712 // Current IRQ mask. 6762 outb(IO_PIC1+1, 0x3);
6713 // Initial IRQ mask has interrupt 2 enabled (for slave 8259A). 6763
6714 static ushort irqmask = 0xFFFF & ~(1<<IRQ_SLAVE); 6764 // Set up slave (8259A-2)
6715 6765 outb(IO_PIC2, 0x11); // ICW1
6716 static void 6766 outb(IO_PIC2+1, T_IRQ0 + 8); // ICW2
6717 picsetmask(ushort mask) 6767 outb(IO_PIC2+1, IRQ_SLAVE); // ICW3
6718 { 6768 // NB Automatic EOI mode doesn't tend to work on the slave.
6719 irqmask = mask; 6769 // Linux source code says it's "to be investigated".
6720 outb(IO_PIC1+1, mask); 6770 outb(IO_PIC2+1, 0x3); // ICW4
6721 outb(IO_PIC2+1, mask >> 8); 6771
6722 } 6772 // OCW3: 0ef01prs
6723 6773 // ef: 0x = NOP, 10 = clear specific mask, 11 = set specific mask
6724 void 6774 // p: 0 = no polling, 1 = polling mode
6725 picenable(int irq) 6775 // rs: 0x = NOP, 10 = read IRR, 11 = read ISR
6726 { 6776 outb(IO_PIC1, 0x68); // clear specific mask
6727 picsetmask(irqmask & ~(1<<irq)); 6777 outb(IO_PIC1, 0x0a); // read IRR by default
6728 } 6778
6729 6779 outb(IO_PIC2, 0x68); // OCW3
6730 // Initialize the 8259A interrupt controllers. 6780 outb(IO_PIC2, 0x0a); // OCW3
6731 void 6781
6732 picinit(void) 6782 if(irqmask != 0xFFFF)
6733 { 6783 picsetmask(irqmask);
6734 // mask all interrupts 6784 }
6735 outb(IO_PIC1+1, 0xFF); 6785
6736 outb(IO_PIC2+1, 0xFF); 6786
6737 6787
6738 // Set up master (8259A-1) 6788
6739 6789
6740 // ICW1: 0001g0hi 6790
6741 // g: 0 = edge triggering, 1 = level triggering 6791
6742 // h: 0 = cascaded PICs, 1 = master only 6792
6743 // i: 0 = no ICW4, 1 = ICW4 required 6793
6744 outb(IO_PIC1, 0x11); 6794
6745 6795
6746 // ICW2: Vector offset 6796
6747 outb(IO_PIC1+1, T_IRQ0); 6797
6748 6798
6749 6799
Sheet 67 Sheet 67
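The 16-bit irqmask kept by picsetmask() and picenable() holds one bit per IRQ line, where a set bit means masked; the low byte goes to the master 8259A and the high byte to the slave. The sketch below reproduces only that bit arithmetic in user space. The function names are hypothetical, and no port I/O is performed.

```c
#include <assert.h>
#include <stdint.h>

#define IRQ_SLAVE 2  /* IRQ at which the slave connects to the master */

/* Clear the mask bit for one IRQ line, as picenable() does. */
static uint16_t enable_irq(uint16_t mask, int irq)
{
  assert(irq >= 0 && irq < 16);
  return (uint16_t)(mask & ~(1u << irq));
}

/* Byte that picsetmask() sends to the master's data port (IRQs 0-7). */
static uint8_t master_byte(uint16_t mask)
{
  return (uint8_t)(mask & 0xFF);
}

/* Byte that picsetmask() sends to the slave's data port (IRQs 8-15). */
static uint8_t slave_byte(uint16_t mask)
{
  return (uint8_t)(mask >> 8);
}
```

Starting from 0xFFFF with only the cascade line enabled gives the listing's initial mask; enabling a line on the slave then changes only the high byte.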
Sep 5 23:39 2011 xv6/kbd.h Page 1 Sep 5 23:39 2011 xv6/kbd.h Page 2
Sheet 68 Sheet 68
Sep 5 23:39 2011 xv6/kbd.h Page 3 Sep 5 23:39 2011 xv6/kbd.c Page 1
Sheet 69 Sheet 69
Sep 5 23:39 2011 xv6/console.c Page 1 Sep 5 23:39 2011 xv6/console.c Page 2
7000 // Console input and output. 7050 // Print to the console. only understands %d, %x, %p, %s.
7001 // Input is from the keyboard or serial port. 7051 void
7002 // Output is written to the screen and serial port. 7052 cprintf(char *fmt, ...)
7003 7053 {
7004 #include "types.h" 7054 int i, c, state, locking;
7005 #include "defs.h" 7055 uint *argp;
7006 #include "param.h" 7056 char *s;
7007 #include "traps.h" 7057
7008 #include "spinlock.h" 7058 locking = cons.locking;
7009 #include "fs.h" 7059 if(locking)
7010 #include "file.h" 7060 acquire(&cons.lock);
7011 #include "memlayout.h" 7061
7012 #include "mmu.h" 7062 if (fmt == 0)
7013 #include "proc.h" 7063 panic("null fmt");
7014 #include "x86.h" 7064
7015 7065 argp = (uint*)(void*)(&fmt + 1);
7016 static void consputc(int); 7066 state = 0;
7017 7067 for(i = 0; (c = fmt[i] & 0xff) != 0; i++){
7018 static int panicked = 0; 7068 if(c != '%'){
7019 7069 consputc(c);
7020 static struct { 7070 continue;
7021 struct spinlock lock; 7071 }
7022 int locking; 7072 c = fmt[++i] & 0xff;
7023 } cons; 7073 if(c == 0)
7024 7074 break;
7025 static void 7075 switch(c){
7026 printint(int xx, int base, int sign) 7076 case 'd':
7027 { 7077 printint(*argp++, 10, 1);
7028 static char digits[] = "0123456789abcdef"; 7078 break;
7029 char buf[16]; 7079 case 'x':
7030 int i; 7080 case 'p':
7031 uint x; 7081 printint(*argp++, 16, 0);
7032 7082 break;
7033 if(sign && (sign = xx < 0)) 7083 case 's':
7034 x = -xx; 7084 if((s = (char*)*argp++) == 0)
7035 else 7085 s = "(null)";
7036 x = xx; 7086 for(; *s; s++)
7037 7087 consputc(*s);
7038 i = 0; 7088 break;
7039 do{ 7089 case '%':
7040 buf[i++] = digits[x % base]; 7090 consputc('%');
7041 }while((x /= base) != 0); 7091 break;
7042 7092 default:
7043 if(sign) 7093 // Print unknown % sequence to draw attention.
7044 buf[i++] = '-'; 7094 consputc('%');
7045 7095 consputc(c);
7046 while(--i >= 0) 7096 break;
7047 consputc(buf[i]); 7097 }
7048 } 7098 }
7049 7099
Sheet 70 Sheet 70
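printint() above produces digits least-significant first and then emits the buffer in reverse. The following user-space sketch uses the same technique but writes into a caller-supplied string instead of calling consputc(), so the result can be inspected. The name format_int and its out/cap parameters are hypothetical; the buffer is also sized for base 2, which the listing's 16-byte buffer is not.

```c
#include <assert.h>
#include <stddef.h>

/* Format xx in the given base into out (capacity cap, NUL-terminated).
   If sign is nonzero, negative values get a leading '-'.
   Returns the number of characters written, not counting the NUL. */
static size_t format_int(char *out, size_t cap, int xx, int base, int sign)
{
  static const char digits[] = "0123456789abcdef";
  char buf[34];            /* 32 binary digits plus a sign, with room to spare */
  size_t i = 0, n = 0;
  int neg = sign && xx < 0;
  unsigned int x = neg ? 0u - (unsigned int)xx : (unsigned int)xx;

  assert(cap > 0 && base >= 2 && base <= 16);

  /* Generate digits in reverse order; do-while so that 0 prints as "0". */
  do {
    buf[i++] = digits[x % (unsigned int)base];
  } while ((x /= (unsigned int)base) != 0);

  if (neg)
    buf[i++] = '-';

  /* Copy out back to front, stopping early if out is too small. */
  while (i > 0 && n + 1 < cap)
    out[n++] = buf[--i];
  out[n] = '\0';
  return n;
}
```

Passing sign = 0 treats the argument as unsigned, which is how the listing handles %x and %p.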
Sep 5 23:39 2011 xv6/console.c Page 3 Sep 5 23:39 2011 xv6/console.c Page 4
Sheet 71 Sheet 71
Sep 5 23:39 2011 xv6/console.c Page 5 Sep 5 23:39 2011 xv6/console.c Page 6
Sheet 72 Sheet 72
Sep 5 23:39 2011 xv6/console.c Page 7 Sep 5 23:39 2011 xv6/timer.c Page 1
Sheet 73 Sheet 73
Sep 5 23:39 2011 xv6/uart.c Page 1 Sep 5 23:39 2011 xv6/uart.c Page 2
Sheet 74 Sheet 74
Sep 5 23:39 2011 xv6/initcode.S Page 1 Sep 5 23:39 2011 xv6/usys.S Page 1
Sheet 75 Sheet 75
Sep 5 23:39 2011 xv6/init.c Page 1 Sep 5 23:39 2011 xv6/sh.c Page 1
Sheet 76 Sheet 76
Sep 5 23:39 2011 xv6/sh.c Page 2 Sep 5 23:39 2011 xv6/sh.c Page 3
7700 int fork1(void); // Fork but panics on failure. 7750 case PIPE:
7701 void panic(char*); 7751 pcmd = (struct pipecmd*)cmd;
7702 struct cmd *parsecmd(char*); 7752 if(pipe(p) < 0)
7703 7753 panic("pipe");
7704 // Execute cmd. Never returns. 7754 if(fork1() == 0){
7705 void 7755 close(1);
7706 runcmd(struct cmd *cmd) 7756 dup(p[1]);
7707 { 7757 close(p[0]);
7708 int p[2]; 7758 close(p[1]);
7709 struct backcmd *bcmd; 7759 runcmd(pcmd->left);
7710 struct execcmd *ecmd; 7760 }
7711 struct listcmd *lcmd; 7761 if(fork1() == 0){
7712 struct pipecmd *pcmd; 7762 close(0);
7713 struct redircmd *rcmd; 7763 dup(p[0]);
7714 7764 close(p[0]);
7715 if(cmd == 0) 7765 close(p[1]);
7716 exit(); 7766 runcmd(pcmd->right);
7717 7767 }
7718 switch(cmd->type){ 7768 close(p[0]);
7719 default: 7769 close(p[1]);
7720 panic("runcmd"); 7770 wait();
7721 7771 wait();
7722 case EXEC: 7772 break;
7723 ecmd = (struct execcmd*)cmd; 7773
7724 if(ecmd->argv[0] == 0) 7774 case BACK:
7725 exit(); 7775 bcmd = (struct backcmd*)cmd;
7726 exec(ecmd->argv[0], ecmd->argv); 7776 if(fork1() == 0)
7727 printf(2, "exec %s failed\n", ecmd->argv[0]); 7777 runcmd(bcmd->cmd);
7728 break; 7778 break;
7729 7779 }
7730 case REDIR: 7780 exit();
7731 rcmd = (struct redircmd*)cmd; 7781 }
7732 close(rcmd->fd); 7782
7733 if(open(rcmd->file, rcmd->mode) < 0){ 7783 int
7734 printf(2, "open %s failed\n", rcmd->file); 7784 getcmd(char *buf, int nbuf)
7735 exit(); 7785 {
7736 } 7786 printf(2, "$ ");
7737 runcmd(rcmd->cmd); 7787 memset(buf, 0, nbuf);
7738 break; 7788 gets(buf, nbuf);
7739 7789 if(buf[0] == 0) // EOF
7740 case LIST: 7790 return -1;
7741 lcmd = (struct listcmd*)cmd; 7791 return 0;
7742 if(fork1() == 0) 7792 }
7743 runcmd(lcmd->left); 7793
7744 wait(); 7794
7745 runcmd(lcmd->right); 7795
7746 break; 7796
7747 7797
7748 7798
7749 7799
Sheet 77 Sheet 77
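The PIPE case above relies on dup() returning the lowest free descriptor: after close(1), dup(p[1]) makes the pipe's write end the child's standard output. The sketch below shows the same pattern on a POSIX system rather than on xv6. The function pipe_from_child is a hypothetical helper, and the child writes a fixed message instead of running a command.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child whose standard output is the write end of a pipe, have it
   write msg, and collect what arrives in out. Returns bytes read or -1. */
static ssize_t pipe_from_child(const char *msg, size_t len, char *out, size_t cap)
{
  int p[2];
  pid_t pid;
  size_t n = 0;
  ssize_t r;

  assert(cap > 0);
  if (pipe(p) < 0)
    return -1;
  pid = fork();
  if (pid < 0) {
    close(p[0]);
    close(p[1]);
    return -1;
  }
  if (pid == 0) {
    close(1);                 /* free descriptor 1 ... */
    if (dup(p[1]) != 1)       /* ... so dup() hands it back */
      _exit(1);
    close(p[0]);
    close(p[1]);
    if (write(1, msg, len) != (ssize_t)len)
      _exit(1);
    _exit(0);
  }
  /* The parent must close its copy of the write end, or read() would
     never see end-of-file after the child exits. */
  close(p[1]);
  while (n < cap && (r = read(p[0], out + n, cap - n)) > 0)
    n += (size_t)r;
  close(p[0]);
  waitpid(pid, 0, 0);
  return (ssize_t)n;
}
```

This is also why runcmd() closes both p[0] and p[1] in the parent before calling wait(): any descriptor left open on the write end keeps the reader blocked.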
Sep 5 23:39 2011 xv6/sh.c Page 4 Sep 5 23:39 2011 xv6/sh.c Page 5
Sheet 78 Sheet 78
Sep 5 23:39 2011 xv6/sh.c Page 6 Sep 5 23:39 2011 xv6/sh.c Page 7
Sheet 79 Sheet 79
Sep 5 23:39 2011 xv6/sh.c Page 8 Sep 5 23:39 2011 xv6/sh.c Page 9
Sheet 80 Sheet 80
Sep 5 23:39 2011 xv6/sh.c Page 10 Sep 5 23:39 2011 xv6/sh.c Page 11
Sheet 81 Sheet 81
Sep 5 23:39 2011 xv6/bootasm.S Page 1 Sep 5 23:39 2011 xv6/bootasm.S Page 2
8200 #include "asm.h" 8250 # Complete transition to 32-bit protected mode by using long jmp
8201 #include "memlayout.h" 8251 # to reload %cs and %eip. The segment descriptors are set up with no
8202 #include "mmu.h" 8252 # translation, so that the mapping is still the identity mapping.
8203 8253 ljmp $(SEG_KCODE<<3), $start32
8204 # Start the first CPU: switch to 32-bit protected mode, jump into C. 8254
8205 # The BIOS loads this code from the first sector of the hard disk into 8255 .code32 # Tell assembler to generate 32-bit code now.
8206 # memory at physical address 0x7c00 and starts executing in real mode 8256 start32:
8207 # with %cs=0 %ip=7c00. 8257 # Set up the protected-mode data segment registers
8208 8258 movw $(SEG_KDATA<<3), %ax # Our data segment selector
8209 .code16 # Assemble for 16-bit mode 8259 movw %ax, %ds # -> DS: Data Segment
8210 .globl start 8260 movw %ax, %es # -> ES: Extra Segment
8211 start: 8261 movw %ax, %ss # -> SS: Stack Segment
8212 cli # BIOS enabled interrupts; disable 8262 movw $0, %ax # Zero segments not ready for use
8213 8263 movw %ax, %fs # -> FS
8214 # Set up the important data segment registers (DS, ES, SS). 8264 movw %ax, %gs # -> GS
8215 xorw %ax,%ax # Segment number zero 8265
8216 movw %ax,%ds # -> Data Segment 8266 # Set up the stack pointer and call into C.
8217 movw %ax,%es # -> Extra Segment 8267 movl $start, %esp
8218 movw %ax,%ss # -> Stack Segment 8268 call bootmain
8219 8269
8220 # Physical address line A20 is tied to zero so that the first PCs 8270 # If bootmain returns (it shouldn't), trigger a Bochs
8221 # with 2 MB would run software that assumed 1 MB. Undo that. 8271 # breakpoint if running under Bochs, then loop.
8222 seta20.1: 8272 movw $0x8a00, %ax # 0x8a00 -> port 0x8a00
8223 inb $0x64,%al # Wait for not busy 8273 movw %ax, %dx
8224 testb $0x2,%al 8274 outw %ax, %dx
8225 jnz seta20.1 8275 movw $0x8ae0, %ax # 0x8ae0 -> port 0x8a00
8226 8276 outw %ax, %dx
8227 movb $0xd1,%al # 0xd1 -> port 0x64 8277 spin:
8228 outb %al,$0x64 8278 jmp spin
8229 8279
8230 seta20.2: 8280 # Bootstrap GDT
8231 inb $0x64,%al # Wait for not busy 8281 .p2align 2 # force 4 byte alignment
8232 testb $0x2,%al 8282 gdt:
8233 jnz seta20.2 8283 SEG_NULLASM # null seg
8234 8284 SEG_ASM(STA_X|STA_R, 0x0, 0xffffffff) # code seg
8235 movb $0xdf,%al # 0xdf -> port 0x60 8285 SEG_ASM(STA_W, 0x0, 0xffffffff) # data seg
8236 outb %al,$0x60 8286
8237 8287 gdtdesc:
8238 # Switch from real to protected mode. Use a bootstrap GDT that makes 8288 .word (gdtdesc - gdt - 1) # sizeof(gdt) - 1
8239 # virtual addresses map directly to physical addresses so that the 8289 .long gdt # address gdt
8240 # effective memory map doesn't change during the transition. 8290
8241 lgdt gdtdesc 8291
8242 movl %cr0, %eax 8292
8243 orl $CR0_PE, %eax 8293
8244 movl %eax, %cr0 8294
8245 8295
8246 8296
8247 8297
8248 8298
8249 8299
Sheet 82 Sheet 82
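The bootstrap GDT above holds three 8-byte descriptors, and the selectors loaded into the segment registers are the table index shifted left by 3. The sketch below computes those values in C. It assumes the SEG_ASM layout from asm.h (not shown on this sheet): 4 KB granularity, 32-bit segments, present bit set. It also assumes STA_X = 0x8 and STA_R = STA_W = 0x2, and SEG_KCODE = 1, SEG_KDATA = 2 from mmu.h. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define STA_X 0x8u  /* assumption: executable segment (asm.h) */
#define STA_W 0x2u  /* assumption: writeable, data segments (asm.h) */
#define STA_R 0x2u  /* assumption: readable, code segments (asm.h) */

/* Pack one descriptor the way SEG_ASM is assumed to lay it out:
   limit in 4 KB pages, base split across three fields, flags 0x90 and 0xC0. */
static uint64_t encode_seg(uint32_t type, uint32_t base, uint32_t lim)
{
  uint64_t d = 0;
  assert(type <= 0xf);
  d |= (uint64_t)((lim >> 12) & 0xffff);
  d |= (uint64_t)(base & 0xffff) << 16;
  d |= (uint64_t)((base >> 16) & 0xff) << 32;
  d |= (uint64_t)(0x90u | type) << 40;
  d |= (uint64_t)(0xC0u | ((lim >> 28) & 0xf)) << 48;
  d |= (uint64_t)((base >> 24) & 0xff) << 56;
  return d;
}

/* Selector for GDT entry index, with table indicator and privilege bits 0. */
static uint16_t selector(unsigned index)
{
  return (uint16_t)(index << 3);
}

/* The limit field stored in gdtdesc: size of the table in bytes, minus 1. */
static uint16_t gdt_limit(unsigned entries)
{
  return (uint16_t)(entries * 8 - 1);
}
```

With base 0 and limit 0xffffffff both segments cover the whole 4 GB address space, which is why addresses are unchanged by the switch to protected mode.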
Sep 5 23:39 2011 xv6/bootmain.c Page 1 Sep 5 23:39 2011 xv6/bootmain.c Page 2
Sheet 83 Sheet 83