OmegaGB Nav
Mail Me
bit (a) mutantlemon.com
Check it
GB Resources
Intro
OmegaGB is a Nintendo Game Boy emulator I am developing using the Haskell programming language. This Devlog tracks my progress.
Enjoy,
-- bit
Map Viewer - Part 2
April 22 2006
The code snapshot relevant to this post is snapshot 20060422
My previous post ended with me still on my short-term quest to get a map viewer working. I had discovered that I would need to emulate the V-Blank interrupt in order to proceed. That is what I set out to do.
The code I had for periodically incrementing the LY register would be the
base for emulation of the V-Blank interrupt. I added to the IrqStates record
a field for counting down to the V-Blank cycle. I updated the
irqUpdate function so that it would decrement the counter,
and when it reaches zero, trigger the V-Blank interrupt by pushing the
Program Counter to the stack and then modify the Program Counter to the
V-Blank jump address. The already ugly irqUpdate function
became even uglier. Querying and modifying state is not very pretty when
done functional style. I plan to later convert this function to use a
state monad.
But at least it seems to work! Games no longer seem to get stuck in an
infinite loop waiting for the V-Blank. However, my program soon crashes
with a friendly error message: <<loop>>.
Apparently my program is getting stuck in what I think they call a
blackhole. The GHC runtime is detecting that a computation contains
an infinite loop, and so it bails out. When I run my program from within
ghci, I don't get this error, the program simply freezes.
Here begins my first great haskell debugging saga.
I've never really had to debug this kind of problem in haskell before, so I'm not sure what to do. There don't seem to really be any debuggers for haskell, for some reason people claim to not need them. So I decide to use the old fashioned "printf" method of debugging: everywhere you can in the code, put calls to print something to the console. Then when the crash happens, you look at the last thing that was printed. You now know that the error happened somewhere between the point in the code that prints this message, and the next point in code that prints a message.
I scatter putStrLn calls in my step function,
and figure out the the crash is occuring somewhere inside my
updateDebugPanel function, so I scatter putStrLn
calls in there. This function is responsible for updating the GUI so
that all of the displayed values of the registers are updated. My
"printf" debugging tells me that the error is happening somewhere in
here:
reg_a `entrySetText` showHex1 (getRegState regS M_A)
I find this totally baffling. I ask myself how this simple call can be causing an infinite loop? It doesn't make any sense. All it's doing is taking the value of the A register from the state, and updating the GUI with it. How could this be causing some blackhole error?
The obvious answer soon hits me: haskell is lazy! The reading of the A register can cause an entire chain of computations to be executed, computations that have been delayed until the value of A is actually requested. Now I have a problem. How am I supposed to find out where the bug is? My code base is no longer trivially short, so there is no way I'm going through it line by line, looking for a needle in a haystack. The "printf" debugging method won't help me, since the bug is in "pure" functional code, where printf is not allowed.
I could probably use Debug.trace,
which is like printf, but it's allowed also in pure functional code.
But I have a better idea. I switch from my graphical debugger back
to my previous text emulator, the one that simply prints the CPU
instructions as they are executed. I want to know what's going on
that causes the strange behaviour in the A register. So I modify
the program to also print the value of the A register along with
each instruction.
I run the program, it dies with the <<loop>>
error. I look at the last executed CPU instruction before the crash,
and it's an ADD instruction that I had just recently implemented. I
look at my code for this ADD instruction, and sure enough, there it is:
let v'' = v + v''
Arrgh! How could I make such a stupid mistake? And why did it take me so long to find it? (In imperative programming languages, like C, code like this is used all the time for incrementing variables. But haskell doesn't have variables, and is more similar to mathematics, where such an equation makes no sense)
Overall my debugging session went pretty well. The important thing
is that I eventually found the bug
.
This experience helped make lazy evaluation more "real" for me.
My way of going about the debugging wasn't that great, but I
feel that from now on, I should be able to nail similar bugs
quicker.
After examining some game roms, I realize that I'm going to need to emulate more then just the V-Blank interrupt if I want to be able to execute them up to the beginning of the title screen even. I read through the pan docs gameboy interrupts sections over and over again. Some aspects aren't totally clear to me. "joat" from a gameboy irc channel helps me out, and gives me the following algorithm for dealing with gameboy interrupts:
joat: how to work it good:
joat: whenever a peripheral triggers an irq, set the bit in the IF latch
joat: whenever the cpu writes to IF, latch the value
joat: every cycle (this is the slow way, depends on your emu arch on how to
speed it up), if IME is 1 and IE & IF != 0, trigger the highest
priority interrupt set in IF, and clear that bit in IF
joat: servicing the interrupt also disables IME
joat: the bits that don't get servied remain set in IF
joat: waiting for a future time when their corresponding IE bit gets set (if it
isn't), and they too get serviced, or the user clears them manually
Thanks joat! Things are a lot clearer now. It's time to clean
up the irqUpdate function. I switch to using a state
monad, clean things up a bit, and implement full emulation support
of the STAT register, including all of it's various interrupts.
Everything I need for the the map viewer should be working now! I hack up the debugger, and add a viewport for displaying a real-time dump of the tiles in video ram:
Yes! Finally some graphics! A confirmation that my
emulator actually does something, and doesn't just print random
numbers to the screen
.
Gameboy background maps index into these tiles. I'm just one
step away to getting such a map viewer working, and here it is:
That is the map that "Dropzone" uses for its title screen. Kind of boring, isn't it? The previous screenshot showing the tile patterns dump is a bit cooler I think. Now that I have this map viewer working I realize that my emulator should in fact be complete enough in order to actually display the intro and title screen of some games!
I quickly put together a new minimal gui, and hack up some code to render the display of the gameboy. I only implement the minimum of the minimum needed to display the background map, which should be enough for the simple scroll effects that games use for their intros. It seems to work great! Here is a screenshot and two animated gifs. Cowabunga!
"Brainbender" has got some pretty funky scaling going on, and I
seem to be emulating it properly
.
It appears that the emulator is working awesome so far! It still
needs a lot of work though. Out the games I've tested, the
emulator only succeeded in running and displaying the title
screen of these two. The majority of games will not work for the
simple reason that they use cartridges with memory banking support,
and I have not implemented emulation for this yet. This is why
I've been using "Dropzone", "Brainbender" and "Asteroids" for
testing. They are of the few 32Kb(meaning no banking) games that
I have. By the way, "Asteroids" sort of shows the title screen,
but it's a bit garbled. This is almost definitely because of lack of
rendering support for sprites or the gameboy "window".
The big problem though is speed. The emulator is
slow
.
I was expecting it to be slow, since my strategy so far
has been to not think about performance at all; just get
things working. At least things are working! The emulator runs
at about 1/10th the speed it needs to for real time emulation.
I will eventually need to do some optimizations. That might be
the theme of the next post.
Peace,
-- bit
Map Viewer - Part 1
April 17 2006
The code snapshot relevant to this post is snapshot 20060417
Now that I can execute game ROM code, I set a short term goal to get a gameboy background map-tile viewer working. It should have been simple enough.
Gameboy has video ram that contains tile data and also a tilemap indexing the tiles. All I would need to do to get the map viewer working would be to read this video ram every V-Blank period(LCD display refresh period) and draw it.
First I implemented a missing cpu instruction, so that "Asteroids" wouldn't bail out after a few hundred instructions. Now this sucker executes, and executes... and sucks up an assload of my system ram!
I use the iterate function to make an infinite list of the
machine state as it changes after each CPU instruction, and then i simply
fetch the instruction from each state and print it. I was hoping that GHC
would be able to figure out that it only needs to have one state in memory
that can be mutated, but I guess this is not happening: it seems that all
of the old states are building up in ram, until all available ram runs
out and I get a "Killed" message - I'm guessing as a result of the operating
system shutting down the process.
Someone from #haskell mentions using STArray instead of regular
Array which I've been using. The array is used for storing
the gameboy's RAM data. So I look through the GHC library docs, and along
with STArray, I also see UArray, which is supposed
to be unboxed arrays.
According to what it says there, my regular Array that I've been using
boxes all of it's elements! Since my array contains thousands of
Word8 elements, this is not cool! The docs say that switching
from boxed Array to unboxed UArray can be
done by simply changing the type signature of your array.
I replace "Array" with "UArray" in a few places in my code, and all of my problems seem to be solved! Memory usage now stays constant, and I can execute roms without the process getting "Killed". I'm not sure why GHC was able to figure out that it can mutate unboxed arrays, while boxed arrays it couldn't. Puzzling indeed, but I don't spend too much time thinking about it since everything works now, and I want to continue and get the map viewer working!
Before I can start working on the map viewer, I notice that all roms that I test get stuck in a loop like this one:
$60e LDH A,($FF00+$44) $610 CP A,$91 $612 JR NZ,-6
These 3 instructions are executed over and over again. Memory address
$FF44 on the gameboy is called the LY register. According
to the Pan Docs:
The LY indicates the vertical line to which the present data is transferred to the LCD Driver. The LY can take on any value between 0 through 153. The values between 144 and 153 indicate the V-Blank period. Writing will reset the counter.
So this loop monitors the LY register in order to wait for the
V-Blank(Why do they wait for LY to be 145($91 = 145) and not 144? I guess to
play it safe or something
).
Unfortunately for the game, this will never happen since I haven't
implemented emulation for any of the features of the gameboy hardware
except for it's CPU
The big question is: How should I implement the "ticking" of the LY register? This is actually part of the bigger question: How should I implement interrupts and joypad input and sound output and all these other things things? I start thinking about some sort of design that would allow pluggable memory-register-injection routines and registering to notifications of register changes, and all kinds of other support systems so that everything needed to be done would be doable, yet the system would still be flexible.
I forget about this idea and go with the more practical, "Do The Simplest Thing That Could Possibly Work". This post finally gets some code:
hBlankPeriod = 456 data IrqStates = IrqStates { irqStateIME :: Bool, -- Interrupt Master Enable irqStateHBlankCounter :: Int -- CPU cycles until next hblank } initialIrqStates = IrqStates { irqStateIME = False, irqStateHBlankCounter = hBlankPeriod }
IrqStates is additional state data that must now be a part
of the machine state. Not sure if "Irq" is the greatest name for it,
since the H-Blank counter(which will be used to incremement the LY register)
isn't really related to interrupts in this context. But I couldn't think
of any other name, and one could conceptually think about LY incrementing
as some sort of internal interrupt that executes builtin code for incrementing
the LY register. Anyway... I need to transform this state after each CPU
instruction is executed, so I come up with this really ugly function:
irqUpdate :: IrqStates -> CycleCount -> (IrqStates, (RegisterStates, Memory) -> (RegisterStates, Memory)) irqUpdate is c = let is' = is { irqStateHBlankCounter = (irqStateHBlankCounter is)-c } in if irqStateHBlankCounter is' <= 0 then (is' { irqStateHBlankCounter = (irqStateHBlankCounter is')+hBlankPeriod }, \(r, m) -> let oldLY = (memRam m)!0xFF44 newLY = incWrap 153 oldLY in (r, m { memRam = (memRam m)//[(0xFF44, newLY)] }) ) else (is', id) where incWrap limit n = if (n+1) > limit then 0 else n+1
Notice that this works with the Memory type directly, instead
of the abstract MemoryModel type class. This is because I need direct
access to the memRam field; I can't use the writeMem function because it will
filter writes to the various special memory areas that must be modified. (Check
out Memory.hs to see exactly what's going on.)
I Modified the updateMachine function so that it works with and
updates this new additional state:
updateMachine :: ((RegisterStates, Memory), IrqStates) -> ((RegisterStates, Memory), IrqStates) updateMachine (state@(regS, memS), irqS) = let stepInstruction = machineStepInstruction state pc = getReg2State regS M_PC opcode = readMem memS pc cycles = opcodeCycleCount opcode (irqS', interrupt) = irqUpdate irqS cycles in (interrupt stepInstruction, irqS')
This allows the H-Blank counter to decrease the amount of cpu cycles that the
executed instruction takes. The machine state is now:
((RegisterStates, Memory), IrqStates). I use a nested tuple
like this, since the few other existing functions work with
(RegisterStates, Memory). It's kind of ugly. This and that new
irqUpdate function is causing severe uglification of the code!
I'm not too worried though, since Haskell is great for code refactoring:
The language is powerful enough to keep code short, so the ugly code is only like
20 lines. And all of the code that directly interfaces with the ugly code is
only another 20 lines. In addition, Haskell's strong type system helps
you feel confident that your refactoring is correct.
Enough rambling! LY should be working! I run the "Dropzone" game, and it does indeed get past that previously problematic loop! But now it gets stuck in another loop:
$27c4 LD A,($c103) $27c7 OR A,A $27c8 JR Z,-6
This apparently is a loop that waits for memory address $C103 to
be non-zero. $C103 is not an IO Register or anything like that.
It's in the gameboy's internal RAM area. The only explanation I can think of
is that an interrupt routine is supposed to modify $C103
I put an error function in my memory writing routine to catch
when there is a write to the Interrupt Enable register(IE), so I'll be able to find
out which interrupt the game wants enabled(V-Blank? Timer? Joypad?).
Unfortunately the game zeroes the IE register near start up so this triggers
the error and I am stuck with no new clues. I could hack up some
Debug.Trace calls and scatter some putStrLns in
order to get some information, but I decide that I might as well do things
right.
I fire up glade-2, and design some sort of gameboy debugger GUI, inspired from the NO$GMB GameBoy emulator/debugger. Unfortunately, glade freezes up just before I save and I lose all my work. Bad glade! I design the whole thing again from scratch saving every 5 seconds.
Now I start coding the debugger, using gtk2hs, of course, which is a gtk+ binding for haskell, with support for loading glade user interfaces. I encounter some bizarre haskell behaviour. For some reason, this code doesn't compile:
do let bindWidget = xmlGetWidget windowXml window_main <- bindWidget castToWindow "window_main" button_open <- bindWidget castToButton "button_open"
If I change that bindWidget line to
let bindWidget x y = xmlGetWidget windowXml x y
then it compiles and works! I'm not sure why I can't curry the function
in this way, probably because of some weirdness of do notation.
I don't have time to get to the bottom of this though, it works, and I am
eager to get the debugger into working shape! I soon encounter another bizarre
oddity. An error that is triggered from a write to an undocumented
IO register seems to shutdown/crash ghci! I try to catch it as an exeption but
it's useless. Not sure why this is happening. I find it very disturbing.
It doesn't matter though, because the debugger works! Check it:
Looks pretty cool, right?
It's pretty minimal, but it gets the job done, and it's less then 200 lines
of haskell. Programming with gtk2hs has a similar feel to programming in
Python PyGtk, but with haskell you have
all these really cool monad combinators and very well defined mutable state.
The resulting code, even though it's imperative, looks like no other gui
code I've seen in other languages. When they say, "Haskell is the best
imperative language", it's not just a joke!
Let's get back to $C103. In the screenshot you can see that
I've run "Dropzone" and paused it once it got stuck in the $C103
loop. The debug panel on the lower left shows me what I need to know.
The Interrupt Master Enable(IME) is set, confirming that interrupts have
been enabled with the EI CPU instruction. The Interrupt Enable(IE) Register
shows $03 meaning that bits 0 and 1 are set. This means that the only
interrupts that are enabled are V-Blank and LCDC. LCDC is actually several
interrupts related to the LCD display controller. The STAT IO Register is
used to select which LCD controller interrupts are enabled. The debug panel
shows that STAT equals $80 which means that only bit 7 is set. According
to the Pan Docs, bit 7
isn't used. Since no other bits are set, all of the LCDC interrupts are
disabled. This means that the only interrupt that we are dealing with right
now is V-Blank! The mystery is solved.
I will need to implement emulation of the V-Blank interrupt in order to
continue. But before I do that, I hexdump the game rom. The V-Blank
interrupt start address is $0040, and here we have:
C3 C0 1F. C3 is the opcode for JP,
so this code disassembles to the instruction JP $1FC0.
I scroll down the hexdump of the game rom till I get to $1FC0.
I scan the following bytes, and sure enough, I see the sequence
EA 03 C1, which disassembles to LD ($C103),A.
The V-Blank interrupt does indeed seem to write to our good friend
$C103
Implementation of the V-Blank interrupt will have to wait until next time. What was supposed to be a simple task -- getting a map viewer working -- turned out to be littered with obstacles, that forced me even to build a graphical debugger. Will I succeed in getting the anticipated map viewer working in time for the next post?
Tune in next time to find out!
-- bit
Hello, World!
April 16 2006
I finally decided to learn Haskell, that weird programming language that is supposed to be so great. I played around with it for a while and did a few small tests and discovered that it really is great.
So I was looking for a cool project idea to practice my Haskell. Something that would be serious, fun to do, and useful. A Nintendo Game Boy emulator may not be the most useful program in the universe, but everyone loves the gameboy, and writing an emulator is considered a pretty serious programming job. I deliberated for a while between gameboy or NES, and decided to go with gameboy since it's simpler.
I actually started coding the emulator about a week ago, so I've already got a bit of code done. I'll describe what I've got so far. (Source code is in snapshot 20060414)
CPU Emulation
CPU Emulation is pretty much working, except for a few instructions left that
need to be implemented. I use an Instruction type that looks like
this:
data Instruction = LDR Register Register | -- LD r,r LDRN Register Word8 | -- LD r,n {- ... -} PUSH StackRegister | -- PUSH sr POP StackRegister | -- POP sr ADD Register | -- ADD A,r {- ... -} AND Register | -- AND A,r ANDN Word8 | -- AND A,n ANDHL | -- AND A,(HL) {- ... -} SCF | -- SCF NOP | -- NOP HALT | -- * HALT STOP | -- * STOP DI | -- * DI EI | -- * EI {- ... -} JP (Maybe FlagCondition) Word16 | -- JP nn JP cc,nn JPHL | -- JP (HL) JR (Maybe FlagCondition) Int8 | -- JR a JR cc,a CALL (Maybe FlagCondition) Word16 | -- CALL a CALL cc,a RST RestartAddress | -- RST n RET (Maybe FlagCondition) | -- RET RET cc RETI -- * RETI
It has a total of 94 constructors. Yikes! But at least it staticly guarantees
that all instructions are valid. Then there is a function
machineCodeToInstruction that assembles an instruction
from machine code bytes.
In order to execute an Instruction, I wanted to be able to describe,
using some sort of mini-language, the operation of each cpu instruction. This
is where haskell shows its power. With the help of chatting with Philippa
on #haskell, and her
MonadArticleThingy,
I was able to get this ExecutionAST type:
data ExecutionAST result where Return :: result -> ExecutionAST result Bind :: (ExecutionAST oldres) -> (oldres -> ExecutionAST result) -> ExecutionAST result WriteRegister :: M_Register -> Word8 -> ExecutionAST () ReadRegister :: M_Register -> ExecutionAST Word8 WriteRegister2 :: M_Register2 -> Word16 -> ExecutionAST () ReadRegister2 :: M_Register2 -> ExecutionAST Word16 WriteMemory :: Word16 -> Word8 -> ExecutionAST () ReadMemory :: Word16 -> ExecutionAST Word8 instance Monad ExecutionAST where return = Return (>>=) = Bind
Thanks Philippa! It took me a while to get this right, but in the process, my
understanding of monads improved a lot, and I also learned about GADTs(Generic
Algebraic Data Types). This type does exactly what I wanted, it describes a
mini language with a minimal set of operations for reading/writing registers
and memory. Now I can implement an executeInstruction function:
executeInstruction :: Instruction -> ExecutionAST () executeInstruction ins = case ins of LDR r1 r2 -> do a <- readRegister (mRegister r2) writeRegister (mRegister r1) a incPC1 {- ... -} PUSH sr -> do a <- readRegister2 (mRegister2 SP) let a' = a - 1 let a'' = a - 2 v <- readRegister2 (mRegisterS sr) let (hi, lo) = splitWord16 v writeMemory a' hi writeMemory a'' lo writeRegister2 (mRegister2 SP) a'' incPC1 {- ... -} where incPC1 :: ExecutionAST () incPC1 = do pc <- readRegister2 M_PC let pc' = pc + 1 writeRegister2 M_PC pc'
The do haskell syntax sugar makes writing this code easy, in
an imperative style, but of course it's all still purely functional. Now comes
the fun part, the last step of the CPU emulation. I need an interpreter for the
ExecutionAST "mini-language". Philippa's pretty much showed us
how to do it:
class MemoryModel a where readMem :: a -> Word16 -> Word8 writeMem :: a -> Word16 -> Word8 -> a machineCpuExecute :: (MemoryModel m) => (RegisterStates, m) -> ExecutionAST () -> (RegisterStates, m) machineCpuExecute s e = fst (machineCpuExecute' s e) machineCpuExecute' :: (MemoryModel m) => (RegisterStates, m) -> ExecutionAST a -> ((RegisterStates, m), a) machineCpuExecute' state@(regS, memS) e = case e of Return result -> (state, result) Bind l r -> let (s, result) = machineCpuExecute' state l in machineCpuExecute' s (r result) WriteRegister reg n -> ((setRegState regS reg n, memS), ()) ReadRegister reg -> (state, getRegState regS reg) WriteRegister2 reg2 nn -> ((setReg2State regS reg2 nn, memS), ()) ReadRegister2 reg2 -> (state, getReg2State regS reg2) WriteMemory a n -> ((regS, writeMem memS a n), ()) ReadMemory a -> (state, readMem memS a)
This code is pretty complicated, at least for me
.
It works by transforming
the state of the registers and memory, plus returning an additional computation
result. (I use a MemoryModel type class because
I wanted to test the CPU with a ReallySimpleMemory to make sure
that it works, before implementing gameboy's complicated memory system, with
its switchable banks, echoed areas and IO registers.) machineCpuExecute'
pattern matches against the CpuExecution constructors:
-
Return- The state is unchanged, the computation result is the value passed to Return -
Bind- This is where the magic happens. this is pretty much function composition with an additional threading of state. If you are a newbie like me then read the code over and over again until you understand it
-
WriteXXX- Transforms the state and returns as the computation result the unit value -
ReadXXX- The state is unchanged, computation result is the value of what we want to read
I really love this function
It shows off some of haskell's great features.
Together, all of the code I've described so far shows how haskell helps you
organize your software into layers. Anyway, I made two more functions,
fetchInstruction and stepInstruction and I am able
to execute some test programs.
Game Boy ROM loading
I coded a simple function for loading rom images into memory, and extracting the header information. It works great for 32KB and 64KB roms, but for some reason it takes a while for it to load a 128KB rom, and loading a 256KB rom is real slow. 512KB roms take forever. Ok, it's not super bad, but loading of all of these should be instantaneous.
Now that I can load gameboy ROM images, I try and succeed in executing them! Totally sweet. Check it:
Here I am reminded of another great quality of Haskell: Your programs work the first time you run them. Haskell's awesome type system and it's way of encouraging(forcing?) you to code in a certain(functional?) style keeps bugs to a minimum. No hours of debugging, only to find that you made some stupid coding mistake. I, a Haskell newbie, was able to get an almost fully functional gameboy CPU emulator working with only a few intense hours of work. Pretty impressive I think.
This is the end of the first post to this devlog! I hope you enjoyed it. I had to cover quite a bit of development with this post, so I didn't get to go into as much detail as I would have liked to. Next time I will talk in depth about problems encountered during development, design thoughts, my struggles with understanding parts of the gameboy hardware, and more!
Stay tuned...
-- bit
Game Boy is a registered trademark of Nintendo. The author is in no way affiliated with Nintendo.