XV6

entry.S 的任务：开启大页模式、设置栈寄存器 %esp，然后通过设置 %eip 进入了 main() 函数。

所以我们这节将开始调试 main 函数，将追踪 XV6 是如何初始化各种资源的。像前面几节一样进入调试，然后将断点打开 main 入口。

(gdb) b main
Breakpoint 1 at 0x80102eb0: file main.c, line 19.
(gdb) c
Continuing.
The target architecture is assumed to be i386
=> 0x80102eb0 <main>:	lea    0x4(%esp),%ecx

Thread 1 hit Breakpoint 1, main () at main.c:19
19	{
(gdb) l
14	// Bootstrap processor starts running C code here.
15	// Allocate a real stack and switch to it, first
16	// doing some setup required for memory allocator to work.
17	int
18	main(void)
19	{
20	  kinit1(end, P2V(4*1024*1024)); // phys page allocator
21	  kvmalloc();      // kernel page table
22	  mpinit();        // detect other processors
23	  lapicinit();     // interrupt controller

我们可以看到，main 的逻辑地址是 0x80102eb0，系统运行在 i386 架构上，main 函数位于 main.c 的第 19 行。

1. 初始化内存

main 函数一开始就调用了 kinit1 函数，收集空闲物理页帧，用于之后的页表分配。进入函数查看一下：

(gdb) s
=> 0x80102ebf <main+15>:	sub    $0x8,%esp
main () at main.c:20
20	  kinit1(end, P2V(4*1024*1024)); // phys page allocator
(gdb) s
=> 0x80102405 <kinit1+5>:	mov    0xc(%ebp),%esi
kinit1 (vstart=0x801154a8, vend=0x80400000) at kalloc.c:33
33	{

kinit1 收集的是 0x1154a8~0x400000 的空闲页帧，此时虽然是大页模式，但是收集页帧的单位已经是 4 K 了，这里的 end 是 kernel 的结束位置，也就是说物理内存中 0x100000 ~ end 已经被内核使用，而 end ~ 0x400000 依旧是空闲内存。

这里的 end 是由链接器定义的，从 ELF 中加载进来的，所以这里使用的是声明 extern。接着往下看

(gdb) n
=> 0x80102408 <kinit1+8>:	sub    $0x8,%esp
34	  initlock(&kmem.lock, "kmem");
(gdb) n
=> 0x8010241a <kinit1+26>:	mov    0x8(%ebp),%eax
36	  freerange(vstart, vend);

为了管理内存，这里使用到了 kmem 结构体，由于内存是共享资源（临界资源），所以提供了自旋锁实现互斥，一开始 kmem.use_lock = 0（这里没有显示，是因为编译的时候会打乱一些指令的顺序，这涉及到流水线的知识），因为处理器上只有进程也就是内核本身，所以不需要互斥锁。

紧接着就调用 freerange 函数回收 [vstart, vend) 的内存。

freerange 函数其实就是对每一页帧调用 kfree 回收罢了。因此 main 的第一个函数分析完毕。

2. 内核页表初始化

在调用 kinit1 函数的时候，内核还是使用大页模式，用的是 entry.S 中定义的简易的页表 entrypgdir[]，此时查看 %cr3 的值即可找到 entrypgdir[] 的物理地址。在 QEMU 调试器下查看 CR3=00109000，说明页表 entrypgdir[] 就在该物理地址处。

我们看 kvmalloc 函数是如何设置内核页表的。打断点进入该函数：

(gdb) b kvmalloc
Breakpoint 2 at 0x80106ca0: file vm.c, line 142.
(gdb) c
Continuing.
=> 0x80106ca0 <kvmalloc>:	push   %ebp

Thread 1 hit Breakpoint 2, kvmalloc () at vm.c:142
142	{
(gdb) s
=> 0x80106ca6 <kvmalloc+6>:	call   0x80106c20 <setupkvm>
143	  kpgdir = setupkvm();

函数使用了一个指针 kpgdir 指向 setupkvm 函数的返回值， setupkvm 函数从 kmem 链表中摘取一页物理页帧作为页表，首先将该页帧初始化为全 0。然后根据 kmap 的布局对页表进行设置。

kmap 结构体的定义如下：

// This table defines the kernel's mappings, which are present in
// every process's page table.
static struct kmap {
  void *virt;
  uint phys_start;
  uint phys_end;
  int perm;
} kmap[] = {
 { (void*)KERNBASE, 0,             EXTMEM,    PTE_W}, // I/O space
 { (void*)KERNLINK, V2P(KERNLINK), V2P(data), 0},     // kern text+rodata
 { (void*)data,     V2P(data),     PHYSTOP,   PTE_W}, // kern data+memory
 { (void*)DEVSPACE, DEVSPACE,      0,         PTE_W}, // more devices
};

布局很简单，共有四个主要区，分别是 IO 区、kernel 的代码和只读数据、kernel 的变量和空闲内存、设备区。区域划分如下表，其中 data 变量是由 kernel.ld 定义的，在 GDB 中用 p/x data 查看发现 data=0x80108000

	虚拟起始地址	物理起始地址	物理结束地址	读写权限
IO	0x80000000	0	0x100000	可写
代码和只读数据	0x80100000	0x100000	0x108000	0
变量和内存	0x80108000	0x108000	0xE000000	可写
设备区	0xFE000000	0xFE000000	0xFFFFFFFF	可写

当内核页表设置好，我们就可以调用 switchkvm 函数来切换并使用新的内核页表了。我们查看调用 switchkvm() 前后 %cr3 的值。调用前 CR3=00109000，调用后 CR3=003ff000。

至此，内核页表初始化完毕。

3. 检测其他 CPU 核

XV6 是支持多处理器的，接下来 main 调用 mpinit 函数来检测其他处理器，并根据收集的信息设置 cpus[] 数组和 ismp 多核标志。首先我们看看 cpu 结构体：

 // Per-CPU state
struct cpu {
  uchar apicid;                // Local APIC ID
  struct context *scheduler;   // swtch() here to enter scheduler
  struct taskstate ts;         // Used by x86 to find stack for interrupt
  struct segdesc gdt[NSEGS];   // x86 global descriptor table
  volatile uint started;       // Has the CPU started?
  int ncli;                    // Depth of pushcli nesting.
  int intena;                  // Were interrupts enabled before pushcli?
  struct proc *proc;           // The process running on this cpu or null
};

mpinit 只是定义了 cpus[] 数组并设置了 cpus->apicid

接下来开始调试：

(gdb) b mpinit
Breakpoint 1 at 0x80103050: file mp.c, line 93.
(gdb) c
Continuing.
The target architecture is assumed to be i386
=> 0x80103050 <mpinit>:	push   %ebp

Thread 1 hit Breakpoint 1, mpinit () at mp.c:93
93	{

这里有个关键字 thread 1，说明还有其他的线程，我们查看一下：

(gdb) info threads
  Id   Target Id         Frame 
* 1    Thread 1 (CPU#0 [running]) mpinit () at mp.c:93
  2    Thread 2 (CPU#1 [halted ]) 0x000fd412 in ?? ()

果然，已经有两个 CPU 了，不过 CPU#1 还处于 halted 状态。往下继续，mpinit 首先调用 mpconfig 函数，mpconfig 又调用 mpsearch 函数，这里涉及到一个 mp 结构体，如下：

struct mp {             // floating pointer
  uchar signature[4];           // "_MP_"
  void *physaddr;               // phys addr of MP config table
  uchar length;                 // 1
  uchar specrev;                // [14]
  uchar checksum;               // all bytes must add up to 0
  uchar type;                   // MP system config type
  uchar imcrp;
  uchar reserved[3];
};

其实就是为了找到内存上存储 MP 结构体信息，然后再根据 MP->physaddr 找到 mpconf 结构体，内容如下：

struct mpconf {         // configuration table header
  uchar signature[4];           // "PCMP"
  ushort length;                // total table length
  uchar version;                // [14]
  uchar checksum;               // all bytes must add up to 0
  uchar product[20];            // product id
  uint *oemtable;               // OEM table pointer
  ushort oemlength;             // OEM table length
  ushort entry;                 // entry count
  uint *lapicaddr;              // address of local APIC
  ushort xlength;               // extended table length
  uchar xchecksum;              // extended table checksum
  uchar reserved;
};

如果没找到 moconf 就说明是单核，将执行在一个 SMP （Symmetrical Multi-Processing）上。

接下来将根据 mpconf 表头信息配置 lapic（Local APIC）。配置信息存储在两个结构体当中，分别是 mpproc 和 mpioapic。

struct mpproc {         // processor table entry
  uchar type;                   // entry type (0)
  uchar apicid;                 // local APIC id
  uchar version;                // local APIC verison
  uchar flags;                  // CPU flags
    #define MPBOOT 0x02           // This proc is the bootstrap processor.
  uchar signature[4];           // CPU signature
  uint feature;                 // feature flags from CPUID instruction
  uchar reserved[8];
};

struct mpioapic {       // I/O APIC table entry
  uchar type;                   // entry type (2)
  uchar apicno;                 // I/O APIC id
  uchar version;                // I/O APIC version
  uchar flags;                  // I/O APIC flags
  uint *addr;                  // I/O APIC address
};

APIC 全称 Advanced Programmable Interrupt Controller， APIC 是为了多核平台而设计的。它由两个部分组成 IOAPIC 和 LAPIC。其中 IOAPIC 用于处理设备所产生的各种中断，LAPIC 则是每个 CPU 都会有一个。

所以检测其他 CPU 核其实就是配置这些相关信息，为多核 CPU 启动作准备。

4. 中断控制器的初始化

接下来开始初始化中断控制器 LAPIC，每个 CPU 都有一个。首先开启 LAPIC：

// Enable local APIC; set spurious interrupt vector.
lapicw(SVR, ENABLE | (T_IRQ0 + IRQ_SPURIOUS));

然后开启时钟中断：

// The timer repeatedly counts down at bus frequency
// from lapic[TICR] and then issues an interrupt.
// If xv6 cared more about precise timekeeping,
// TICR would be calibrated using an external time source.
lapicw(TDCR, X1);
lapicw(TIMER, PERIODIC | (T_IRQ0 + IRQ_TIMER));
lapicw(TICR, 10000000);

然后关闭逻辑中断线：

// Disable logical interrupt lines.
lapicw(LINT0, MASKED);
lapicw(LINT1, MASKED);

禁止计数器溢出中断：

// Disable performance counter overflow interrupts
// on machines that provide that interrupt entry.
if(((lapic[VER]>>16) & 0xFF) >= 4)
  lapicw(PCINT, MASKED);

错误中断映射：

// Map error interrupt to IRQ_ERROR.
lapicw(ERROR, T_IRQ0 + IRQ_ERROR);

清除错误状态寄存器，置 0

// Clear error status register (requires back-to-back writes).
lapicw(ESR, 0);
lapicw(ESR, 0);

同步总裁机制：

 // Send an Init Level De-Assert to synchronise arbitration ID's.
 lapicw(ICRHI, 0);
 lapicw(ICRLO, BCAST | INIT | LEVEL);
 while(lapic[ICRLO] & DELIVS)
  ;

启用 APIC 上的中断（但处理器还没开启）

// Enable interrupts on the APIC (but not on the processor).
lapicw(TPR, 0);

5. 初始化段表

之前 bootmain 已经对分段机制做了一些处理，现在我们继续完善。首先我们看看原来在 bootasm.S 中设置的 GDT 如下

# Bootstrap GDT
.p2align 2                                # force 4 byte alignment
gdt:
  SEG_NULLASM                             # null seg
  SEG_ASM(STA_X|STA_R, 0x0, 0xffffffff)   # code seg
  SEG_ASM(STA_W, 0x0, 0xffffffff)         # data seg

gdtdesc:
  .word   (gdtdesc - gdt - 1)             # sizeof(gdt) - 1
  .long   gdt                             # address gdt

相应的段寄存器如下：

ES =0010 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
CS =0008 00000000 ffffffff 00cf9a00 DPL=0 CS32 [-R-]
SS =0010 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
DS =0010 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
FS =0000 00000000 00000000 00000000
GS =0000 00000000 00000000 00000000

相应的段表地址如下：

GDT=     00007c60 00000017
IDT=     00000000 000003ff

接下来我们看看执行完 seginit 函数后这些都有什么变化。首先看看 seginit 函数如下：

// Set up CPU's kernel segment descriptors.
// Run once on entry on each CPU.
void
seginit(void)
{
  struct cpu *c;

  // Map "logical" addresses to virtual addresses using identity map.
  // Cannot share a CODE descriptor for both kernel and user
  // because it would have to have DPL_USR, but the CPU forbids
  // an interrupt from CPL=0 to DPL=3.
  c = &cpus[cpuid()];
  c->gdt[SEG_KCODE] = SEG(STA_X|STA_R, 0, 0xffffffff, 0);
  c->gdt[SEG_KDATA] = SEG(STA_W, 0, 0xffffffff, 0);
  c->gdt[SEG_UCODE] = SEG(STA_X|STA_R, 0, 0xffffffff, DPL_USER);
  c->gdt[SEG_UDATA] = SEG(STA_W, 0, 0xffffffff, DPL_USER);
  lgdt(c->gdt, sizeof(c->gdt));
}

首先获取当前 cpu 的私有变量，接着设置 GDT，此时的 GDT 比 bootblock 时期多了用户态段，说明此时开启了段保护机制，我们制作一个表格来直观感受：

	权限	段长	权限
SEG_KNODE	可读可执行	4 G	0
SEG_KDATA	可写	4 G	0
SEG_UCODE	可读可执行	4 G	3
SEG_UDATA	可写	4 G	3

最后使用 lgdt 更新 GDTR 寄存器。更新后 GDT= 801127f0 0000002f 。由于相应的偏移没有变，所以段寄存器的值也不需要变化。

6. 禁用 PIC

main 接着采用 picinit 函数禁用 8259A 中断控制器（单核），XV6 是使用 SMP 硬件（多核）。

// Don't use the 8259A interrupt controllers.  Xv6 assumes SMP hardware.
void
picinit(void)
{
  // mask all interrupts
  outb(IO_PIC1+1, 0xFF);
  outb(IO_PIC2+1, 0xFF);
}

7. 启用 APIC

XV6 是运行在多核 SMP 系统上的，因此中断控制器是使用 APIC。

void
ioapicinit(void)
{
  int i, id, maxintr;

  ioapic = (volatile struct ioapic*)IOAPIC;
  maxintr = (ioapicread(REG_VER) >> 16) & 0xFF;
  id = ioapicread(REG_ID) >> 24;
  if(id != ioapicid)
    cprintf("ioapicinit: id isn't equal to ioapicid; not a MP\n");

  // Mark all interrupts edge-triggered, active high, disabled,
  // and not routed to any CPUs.
  for(i = 0; i <= maxintr; i++){
    ioapicwrite(REG_TABLE+2*i, INT_DISABLED | (T_IRQ0 + i));
    ioapicwrite(REG_TABLE+2*i+1, 0);
  }
}

8. 控制台初始化

使用 consoleinit 初始化终端设备，主要是将读端和写端配置好。

void
consoleinit(void)
{
  initlock(&cons.lock, "console");

  devsw[CONSOLE].write = consolewrite;
  devsw[CONSOLE].read = consoleread;
  cons.locking = 1;

  ioapicenable(IRQ_KBD, 0);
}

9. 初始化串口

紧接着调用 uartinit 函数初始化串口。关掉 FIFO，设置接收中断，然后向其输出 xv6... 字符，说明初始化完成。

void
uartinit(void)
{
  char *p;

  // Turn off the FIFO
  outb(COM1+2, 0);

  // 9600 baud, 8 data bits, 1 stop bit, parity off.
  outb(COM1+3, 0x80);    // Unlock divisor
  outb(COM1+0, 115200/9600);
  outb(COM1+1, 0);
  outb(COM1+3, 0x03);    // Lock divisor, 8 data bits.
  outb(COM1+4, 0);
  outb(COM1+1, 0x01);    // Enable receive interrupts.

  // If status is 0xFF, no serial port.
  if(inb(COM1+5) == 0xFF)
    return;
  uart = 1;

  // Acknowledge pre-existing interrupt conditions;
  // enable interrupts.
  inb(COM1+2);
  inb(COM1+0);
  ioapicenable(IRQ_COM1, 0);

  // Announce that we're here.
  for(p="xv6...\n"; *p; p++)
    uartputc(*p);
}

10. 初始化进程信息表

其实就是对 ptable 锁初始化，使其处于开锁状态。

void
pinit(void)
{
  initlock(&ptable.lock, "ptable");
}

11. 设置中断向量

对中断向量初始化，总共 256 个。

void
tvinit(void)
{
  int i;

  for(i = 0; i < 256; i++)
    SETGATE(idt[i], 0, SEG_KCODE<<3, vectors[i], 0);
  SETGATE(idt[T_SYSCALL], 1, SEG_KCODE<<3, vectors[T_SYSCALL], DPL_USER);

  initlock(&tickslock, "time");
}

12. 磁盘初始化

对内核的读写其实是对 bcache 结构体的操作，我们先看看 bcache 结构体：

struct {
  struct spinlock lock;
  struct buf buf[NBUF];

  // Linked list of all buffers, through prev/next.
  // head.next is most recently used.
  struct buf head;
} bcache;

其中有一个自旋锁，还有 NBUF 个缓存存，我们看看盘块的结构体 buf 如下：

struct buf {
  int flags;
  uint dev;
  uint blockno;
  struct sleeplock lock;
  uint refcnt;
  struct buf *prev; // LRU cache list
  struct buf *next;
  struct buf *qnext; // disk queue
  uchar data[BSIZE];
};

其中包括设备号 dev，数据 data。bcache 是磁盘的代表，初始化主要让这些盘块串成一个双链表，为之后的调度做准备。

void
binit(void)
{
  struct buf *b;

  initlock(&bcache.lock, "bcache");

//PAGEBREAK!
  // Create linked list of buffers
  bcache.head.prev = &bcache.head;
  bcache.head.next = &bcache.head;
  for(b = bcache.buf; b < bcache.buf+NBUF; b++){
    b->next = bcache.head.next;
    b->prev = &bcache.head;
    initsleeplock(&b->lock, "buffer");
    bcache.head.next->prev = b;
    bcache.head.next = b;
  }
}

13. 初始化文件表

文件表 ftable 记录所有打开的文件，可类比程序表 ptable。

void
fileinit(void)
{
  initlock(&ftable.lock, "ftable");
}

14. IDE 控制器初始化

IDE 磁盘驱动器的初始化，IDE 属于临界资源，故也需要自旋锁保护。

void
ideinit(void)
{
  int i;

  initlock(&idelock, "ide");
  ioapicenable(IRQ_IDE, ncpu - 1);
  idewait(0);

  // Check if disk 1 is present
  outb(0x1f6, 0xe0 | (1<<4));
  for(i=0; i<1000; i++){
    if(inb(0x1f7) != 0){
      havedisk1 = 1;
      break;
    }
  }

  // Switch back to disk 0.
  outb(0x1f6, 0xe0 | (0<<4));
}

15. 多核启动

此时开始初始化其他 CPU 核，调用 startothers 函数。AP 的启动过程和 BP（boot processer）基本一致，首先将 entryother.S 拷贝到无用的物理地址 0x7000（之前是 bootblock 的内存），然后设置好栈（通过 kalloc 分配和设置 %esp）、入口地址（设置 %eip）、页表（使用 entrypgdir）等。然后 BP 发出中断，进入死循环等待 AP 开启完毕。

// Start the non-boot (AP) processors.
static void
startothers(void)
{
  extern uchar _binary_entryother_start[], _binary_entryother_size[];
  uchar *code;
  struct cpu *c;
  char *stack;

  // Write entry code to unused memory at 0x7000.
  // The linker has placed the image of entryother.S in
  // _binary_entryother_start.
  code = P2V(0x7000);
  memmove(code, _binary_entryother_start, (uint)_binary_entryother_size);

  for(c = cpus; c < cpus+ncpu; c++){
    if(c == mycpu())  // We've started already.
      continue;

    // Tell entryother.S what stack to use, where to enter, and what
    // pgdir to use. We cannot use kpgdir yet, because the AP processor
    // is running in low  memory, so we use entrypgdir for the APs too.
    stack = kalloc();
    *(void**)(code-4) = stack + KSTACKSIZE;
    *(void(**)(void))(code-8) = mpenter;
    *(int**)(code-12) = (void *) V2P(entrypgdir);

    lapicstartap(c->apicid, V2P(code));

    // wait for cpu to finish mpmain()
    while(c->started == 0)
      ;
  }
}

设置好后会调用 entryother.S 中的代码，entryother.S 相当于 bootasm.S 和 entry.S 的结合体，AP 设置完毕后调用 mpenter 函数，并设置页表等，然后在进入 scheduler 之前让 BP 继续完成接下来的工作。

16. 收集空闲物理页帧

AP 启动完成后，会调用 mpmain 进入调度程序，进入前会告诉 BP 自己已经启动，于是 BP 从 startothers 函数返回，开始将所有的空闲物理内存收集起来，可用物理地址为 [0, 240 MB]，其中 [0,4 MB] 以及由 kernel 占用或被 kinit1 收集，剩下的 [4 MB, 240 MB] 交给 kinit2 函数负责。

void
kinit2(void *vstart, void *vend)
{
  freerange(vstart, vend);
  kmem.use_lock = 1;
}

之后就开始使用 kmem.lock 了。

17. 第一个用户进程

通过调用 userinit 来启动第一个用户进程，首先调用 allocproc 函数分配 PCB，我们看看 PCB 结构体是如何的：

// Per-process state
struct proc {
  uint sz;                     // Size of process memory (bytes)
  pde_t* pgdir;                // Page table
  char *kstack;                // Bottom of kernel stack for this process
  enum procstate state;        // Process state
  int pid;                     // Process ID
  struct proc *parent;         // Parent process
  struct trapframe *tf;        // Trap frame for current syscall
  struct context *context;     // swtch() here to run process
  void *chan;                  // If non-zero, sleeping on chan
  int killed;                  // If non-zero, have been killed
  struct file *ofile[NOFILE];  // Open files
  struct inode *cwd;           // Current directory
  char name[16];               // Process name (debugging)
};

上面的注释已经很清楚了，allocproc 函数首先找一个空闲的 PCB，然后将 PCB 状态置为 EMBRYO，pid 由全局自增变量赋值，接着分配内核栈，每个进程都有自己的内核栈，然后将栈首地址赋给 sp 指针。

由于栈的地址是从高到低的，所以栈地址应该记录栈顶，也就是加上 KSTACKSIZE。

进程的内核栈初始化后的结构如下：

/*
进程的kernel stack初始化状态，

                  /   +---------------+ <-- stack base(= p->kstack + KSTACKSIZE)
                  |   | ss            |                           
                  |   +---------------+                           
                  |   | esp           |                           
                  |   +---------------+                           
                  |   | eflags        |                           
                  |   +---------------+                           
                  |   | cs            |                           
                  |   +---------------+                           
                  |   | eip           | <-- 从此往上部分，在iret时自动弹出到相关寄存器中，只需把%esp指到这里即可
                  |   +---------------+    
                  |   | err           |  
                  |   +---------------+  
                  |   | trapno        |  
                  |   +---------------+                       
                  |   | ds            |                           
                  |   +---------------+                           
                  |   | es            |                           
                  |   +---------------+                           
                  |   | fs            |                           
 struct trapframe |   +---------------+                           
                  |   | gs            |                           
                  |   +---------------+   
                  |   | eax           |   
                  |   +---------------+   
                  |   | ecx           |   
                  |   +---------------+   
                  |   | edx           |   
                  |   +---------------+   
                  |   | ebx           |   
                  |   +---------------+                        
                  |   | oesp          |   
                  |   +---------------+   
                  |   | ebp           |   
                  |   +---------------+   
                  |   | esi           |   
                  |   +---------------+   
                  |   | edi           |   
                  \   +---------------+ <-- p->tf                 
                      | trapret       |                           
                  /   +---------------+ <-- forkret will return to
                  |   | eip(=forkret) | <-- return addr           
                  |   +---------------+                           
                  |   | ebp           |                           
                  |   +---------------+                           
   struct context |   | ebx           |                           
                  |   +---------------+                           
                  |   | esi           |                           
                  |   +---------------+                           
                  |   | edi           |                           
                  \   +-------+-------+ <-- p->context            
                      |       |       |                           
                      |       v       |                           
                      |     empty     |                           
                      +---------------+ <-- p->kstack           
 */

紧接着全局指针 initproc 标记第一个进程。然后配置进程空间的内核区，紧接着配置第一个进程空间的用户区，如下：

// Load the initcode into address 0 of pgdir.
// sz must be less than a page.
void
inituvm(pde_t *pgdir, char *init, uint sz)
{
  char *mem;

  if(sz >= PGSIZE)
    panic("inituvm: more than a page");
  mem = kalloc();
  memset(mem, 0, PGSIZE);
  mappages(pgdir, 0, PGSIZE, V2P(mem), PTE_W|PTE_U);
  memmove(mem, init, sz);
}

配置完继续完善 PCB，包括文件大小、段寄存器、EFLAGS 标志寄存器、栈寄存器、指令寄存器、名字、目录。最后设置状态为 RUNNABLE。

//PAGEBREAK: 32
// Set up first user process.
void
userinit(void)
{
  struct proc *p;
  extern char _binary_initcode_start[], _binary_initcode_size[];

  p = allocproc();
  
  initproc = p;
  if((p->pgdir = setupkvm()) == 0)
    panic("userinit: out of memory?");
  inituvm(p->pgdir, _binary_initcode_start, (int)_binary_initcode_size);
  p->sz = PGSIZE;
  memset(p->tf, 0, sizeof(*p->tf));
  p->tf->cs = (SEG_UCODE << 3) | DPL_USER;
  p->tf->ds = (SEG_UDATA << 3) | DPL_USER;
  p->tf->es = p->tf->ds;
  p->tf->ss = p->tf->ds;
  p->tf->eflags = FL_IF;
  p->tf->esp = PGSIZE;
  p->tf->eip = 0;  // beginning of initcode.S

  safestrcpy(p->name, "initcode", sizeof(p->name));
  p->cwd = namei("/");

  // this assignment to p->state lets other cores
  // run this process. the acquire forces the above
  // writes to be visible, and the lock is also needed
  // because the assignment might not be atomic.
  acquire(&ptable.lock);

  p->state = RUNNABLE;

  release(&ptable.lock);
}

18. BP 初始化完成

最后 BP 也调用 mpmain 函数进入调度函数 scheduler()，所以我们可以发现，其实 BP 是最后进入调度函数的，每个 CPU 进入调度函数之前都会向终端输出启动信息，mpmain 代码如下：

// Common CPU setup code.
static void
mpmain(void)
{
  cprintf("cpu%d: starting %d\n", cpuid(), cpuid());
  idtinit();       // load idt register
  xchg(&(mycpu()->started), 1); // tell startothers() we're up
  scheduler();     // start running processes
}

This site is open source. Improve this page.