R5900 Emulation

Because I never even try to write meaningful things here anymore, I thought it might be a nice change of pace to actually mention something I've been working on lately. That would be a new R5900 emulator for Noesis. This was kind of born on a tangent from initially just wanting to be able to debug retail PS2 games. More related to the task, though, I've been wanting a way to avoid hand-translating a bunch of disassembled R5900 code every time I run into a new compression method. Which happens a whole hell of a lot. I know that when I'm developing a game and have need to compress some data, my first thought is generally not "I think I will waste hours of development time by making a brand new compression algorithm." Apparently, however, a lot of other developers differ from me in this respect. At least they did, back in the Saturn-PS2 eras. I suppose good platform compression standards were a tiny bit harder to come by back then, though. These days everyone seems to just be lazy and use zlib. Hell, why not? It's usually good enough.

So then I thought, wouldn't it be great if, instead of having to disassemble code and translate it to extract encrypted/compressed data, I could just run the actual native decompression code in Noesis, after doing a little bit of work to figure out what the function expects as stack/register inputs? As a bonus, tons of PS2 ELF binaries leave their symbol table fully intact. So in a good number of cases, I can execute my code from symbol offsets, likely retaining compatibility with different versions of the same game from different regions and so on.
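To make the symbol-offset idea concrete, here is a minimal sketch of looking up a symbol's value in a 32-bit little-endian ELF image, with a hard-coded fallback when the symbol is missing. It is not the Noesis implementation: `findSymbolValue` and `resolveCodeOffset` are hypothetical names, and the field offsets are the standard ELF32 header, section header, and `Elf32_Sym` layouts.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>
#include <string>
#include <vector>

// Little-endian readers over a raw file image.
static uint16_t rd16(const std::vector<uint8_t>& b, size_t o) {
    assert(o + 2 <= b.size());
    return static_cast<uint16_t>(b[o] | (b[o + 1] << 8));
}
static uint32_t rd32(const std::vector<uint8_t>& b, size_t o) {
    assert(o + 4 <= b.size());
    return uint32_t(b[o]) | (uint32_t(b[o + 1]) << 8) |
           (uint32_t(b[o + 2]) << 16) | (uint32_t(b[o + 3]) << 24);
}

// Hypothetical helper: returns st_value of the first symbol called `name`.
std::optional<uint32_t> findSymbolValue(const std::vector<uint8_t>& elf,
                                        const std::string& name) {
    if (elf.size() < 0x34 || std::memcmp(elf.data(), "\x7F" "ELF", 4) != 0)
        return std::nullopt;
    if (elf[4] != 1 || elf[5] != 1) return std::nullopt; // ELFCLASS32, ELFDATA2LSB
    const uint32_t shoff = rd32(elf, 0x20);     // e_shoff
    const uint16_t shentsize = rd16(elf, 0x2E); // e_shentsize
    const uint16_t shnum = rd16(elf, 0x30);     // e_shnum
    if (shentsize < 40) return std::nullopt;
    for (uint16_t i = 0; i < shnum; ++i) {
        const size_t sh = size_t(shoff) + size_t(i) * shentsize;
        if (sh + 40 > elf.size()) return std::nullopt;
        if (rd32(elf, sh + 4) != 2) continue;   // sh_type != SHT_SYMTAB
        const uint32_t symOff = rd32(elf, sh + 16), symSize = rd32(elf, sh + 20);
        const uint32_t link = rd32(elf, sh + 24); // index of the string table
        if (link >= shnum) continue;
        const size_t strSh = size_t(shoff) + size_t(link) * shentsize;
        if (strSh + 40 > elf.size()) continue;
        const uint32_t strOff = rd32(elf, strSh + 16), strSize = rd32(elf, strSh + 20);
        if (uint64_t(symOff) + symSize > elf.size() ||
            uint64_t(strOff) + strSize > elf.size()) continue;
        for (uint32_t s = 0; s + 16 <= symSize; s += 16) { // 16-byte Elf32_Sym
            const uint32_t nameOfs = rd32(elf, symOff + s);
            if (nameOfs >= strSize) continue;
            const char* symName =
                reinterpret_cast<const char*>(elf.data() + strOff + nameOfs);
            // Bounded compare so a malformed string table cannot overrun.
            if (name.size() < strSize - nameOfs &&
                std::memcmp(symName, name.data(), name.size()) == 0 &&
                symName[name.size()] == '\0')
                return rd32(elf, symOff + s + 4); // st_value
        }
    }
    return std::nullopt;
}

// Symbol value plus a delta into the routine, or a fixed address otherwise.
uint32_t resolveCodeOffset(const std::vector<uint8_t>& elf, const std::string& name,
                           uint32_t delta, uint32_t fallback) {
    const std::optional<uint32_t> v = findSymbolValue(elf, name);
    return v ? *v + delta : fallback;
}
```

A call like `resolveCodeOffset(elf, "decomp", 0x40, 0x00125140)` would then follow the symbol when a build keeps its symbol table, and fall back to a known address for one specific build when it does not.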

On top of that, I really like writing CPU emulators. Sometimes. Once I start getting into it and writing logic for yet another variant on a bit shift instruction, it becomes less fun. Nonetheless, I found the idea of writing an R5900 emulator really appealing. I figured it would also improve my MIPS-reading skills, which it has. My MIPS was pretty rusty before I started the emulator.

Bujingai was the first to fall victim to my emulator of ruin and decompression. I was immediately able to find the game's decompression routines through the ELF symbol names. It then only took a few minutes to figure out what should be on the stack and how to set the registers before executing the routine, and the output would be in the R5900's memory right where I pointed one of the values fed in via register. Using data breakpoints in my emulator made figuring all of that out pretty trivial.
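The data breakpoints mentioned above are not shown in this post, so what follows is only a rough sketch of the general idea, assuming a byte-addressed emulated memory and write-only watch ranges. `WatchedMemory`, `addWriteBreakpoint`, and the hit record are hypothetical names, not part of Noesis.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One recorded hit: which instruction wrote what, and where.
struct BreakpointHit {
    uint32_t pc;
    uint32_t addr;
    uint8_t value;
};

class WatchedMemory {
public:
    explicit WatchedMemory(size_t size) : mem_(size, 0) {}

    // Watch writes to [start, start + length).
    void addWriteBreakpoint(uint32_t start, uint32_t length) {
        ranges_.push_back({start, length});
    }

    // Performs the write; returns true if it landed inside a watched range.
    bool write8(uint32_t addr, uint8_t value, uint32_t pc) {
        assert(addr < mem_.size());
        mem_[addr] = value;
        for (const Range& r : ranges_) {
            // Unsigned subtraction wraps for addr < start, so one compare
            // covers both ends of the range.
            if (addr - r.start < r.length) {
                hits_.push_back({pc, addr, value});
                return true;
            }
        }
        return false;
    }

    uint8_t read8(uint32_t addr) const {
        assert(addr < mem_.size());
        return mem_[addr];
    }

    const std::vector<BreakpointHit>& hits() const { return hits_; }

private:
    struct Range { uint32_t start; uint32_t length; };
    std::vector<uint8_t> mem_;
    std::vector<Range> ranges_;
    std::vector<BreakpointHit> hits_;
};
```

Watching the buffer holding the compressed file, or the region where output is expected, and then reading back the recorded `pc` values is one way to see which instructions touch the data and therefore which registers carry the source and destination pointers.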

Bujingai also stores ROM tables for each of its asset containers within the executable itself. These are even attached to appropriately-named symbols, so the info table can be retrieved automatically based on the name of the file being extracted. All of this basically turns into some code like this in Noesis:
	BYTE *elfData = Noesis_LoadPairedFile("Bujingai US ELF (SLUS_208.95)", ".95",
				elfDataSize, NULL);
	DWORD sysMemSize = 0x02000000;
	sysContext_t *ctx = R5900_AllocSysContext(sysMemSize);
	bool loadedElf = R5900_LoadELF(elfData, elfDataSize, ctx);
	unpooled_free(elfData); //no longer needed once it's been loaded
	if (!loadedElf)
	{
		richprintf("ERROR: Unable to load ELF.\n");
		return false;
	}

	//try to base offsets on symbols if possible
	elfSymbol_t *decompSym = R5900_FindSymbol(ctx, "decomp", true);
	elfSymbol_t *decompGetByteSym = R5900_FindSymbol(ctx, "decompGetByte", true);
	DWORD codeOfsDecomp = (decompSym) ? decompSym->val+0x40 : 0x00125140;
	DWORD codeOfsHalt = (decompSym) ? decompSym->val+0x9C : 0x0012519C;
	DWORD codeOfsReadEv = (decompGetByteSym) ? decompGetByteSym->val+0x0C :

	//stop the simulation at this offset using a custom instruction
	R5900_MakeInstruction(ctx, codeOfsHalt, OP_CUST_STOPEXEC);
	//v0 is guaranteed to be the source offset here. the readevent function sets
	//the source buffer pointer for local tracking.
	R5900_SetLogicHook(ctx, BujingaiBin_ReadEvent, codeOfsReadEv);
	ctx->cpu.regs[REG_SP].ur[0] = 0x01000000; //put the stack here

	//finds the symbol pointing to the file table for this container
	elfSymbol_t *rlSym = R5900_FindSymbol(ctx, romTableSymName, false);
	if (rlSym && rlSym->size >= sizeof(bujinRomInfo_t))
	{
		romInfo = (bujinRomInfo_t *)(ctx->sysMem+rlSym->val);
		romInfoNum = rlSym->size / sizeof(bujinRomInfo_t);
		richprintf("Found romInfo table for '%s'!\n", romTableSymName);
	}
	else
	{
		richprintf("WARNING: Couldn't find symbol, bruteforcing.\n");
	}

	DWORD dataOfs = 0x01800000; //stick the compressed file here
	memcpy(ctx->sysMem+dataOfs, fileBuffer, bufferLen); //put it in system memory

	//loop through all of the rom table entries and do this
		ctx->cpu.pc = codeOfsDecomp; //start of decompression code

		//s0/s1 will be used in looking for pointers around here
		DWORD *x = (DWORD *)(ctx->sysMem + dataOfs + bufferLen);
		DWORD decompDestOfs = dataOfs + bufferLen + 0x00100000;
		x[1] = decompDestOfs;
		x[2] = dataOfs + cmpOfs; //start of the data at the compression header

		//make s0 and s1 point to x
		ctx->cpu.regs[REG_S0].ui[0] = dataOfs + bufferLen;
		ctx->cpu.regs[REG_S1].ui[0] = dataOfs + bufferLen;

		BYTE *decompChunk = ctx->sysMem + decompDestOfs; //output dest
		R5900_RunCPUForCycles(ctx, -1);
		DWORD decompChunkSize = (ri) ? ri->decompSize :
		if (!romInfo)
		{ //couldn't find rom info, so just guessing
			DWORD readSourceBytes = g_lastBujRead-dataOfs;
			cmpOfs = readSourceBytes;
			int tryReadUp = 0;
			while (cmpOfs < bufferLen-4 && tryReadUp < 16 &&
				memcmp(ctx->sysMem+dataOfs+cmpOfs, "CMP3", 4))
			{ //read up, since there is no consistent alignment

		int outFlags = (ri) ? ri->flags[1] : -2;
		BujingaiBin_AddFileForOutput(decompChunk, dstTotalOfs,
			decompChunkSize, outFlags, bitStream, dstOfsList);
So that covers it. No translating disassembly and writing custom decompression routines. The decompression happens pretty quickly, too, all things considered. I also plan to expose the R5900 core to plugin authors, once the interface is more stable and I've implemented a few more games to put my instruction logic to the test. I did write about 50 instructions while drunk over St. Patrick's Day, so I was surprised Bujingai's decompression routine actually managed to run correctly on the first attempt.
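The two tricks the listing leans on are patching a custom stop instruction over the routine's exit and hooking host logic onto a code address. They can be illustrated with a toy interpreter. This is only a sketch under loud assumptions: `ToyCpu`, `run`, and the `kOpCustomStop` marker word are hypothetical, the registers are 32 bits wide rather than the R5900's 128, and only `ADDIU` is decoded.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <functional>
#include <unordered_map>
#include <vector>

// Hypothetical marker word, not the real OP_CUST_STOPEXEC encoding. A real core
// would need an encoding the target ISA leaves unused.
constexpr uint32_t kOpCustomStop = 0x4C000000u;

struct ToyCpu {
    uint32_t pc = 0;
    uint32_t regs[32] = {};
    std::vector<uint8_t> mem;
    // Host callbacks run just before the instruction at a given address.
    std::unordered_map<uint32_t, std::function<void(ToyCpu&)>> hooks;

    explicit ToyCpu(size_t memSize) : mem(memSize, 0) {}

    // Words are stored and loaded in host byte order, so the two stay consistent.
    uint32_t fetch(uint32_t addr) const {
        assert(addr % 4 == 0 && size_t(addr) + 4 <= mem.size());
        uint32_t w;
        std::memcpy(&w, &mem[addr], 4);
        return w;
    }
    void poke(uint32_t addr, uint32_t word) {
        assert(addr % 4 == 0 && size_t(addr) + 4 <= mem.size());
        std::memcpy(&mem[addr], &word, 4);
    }
};

// Runs until the stop marker is fetched (returns true), an undecoded opcode is
// hit, or maxCycles instructions have executed. maxCycles < 0 means no limit.
bool run(ToyCpu& cpu, long long maxCycles) {
    for (long long n = 0; maxCycles < 0 || n < maxCycles; ++n) {
        auto hook = cpu.hooks.find(cpu.pc);
        if (hook != cpu.hooks.end()) hook->second(cpu);

        const uint32_t op = cpu.fetch(cpu.pc);
        if (op == kOpCustomStop) return true;

        const uint32_t rs = (op >> 21) & 31, rt = (op >> 16) & 31;
        const uint32_t imm = static_cast<uint32_t>(
            static_cast<int32_t>(static_cast<int16_t>(op & 0xFFFF))); // sign-extend
        switch (op >> 26) {
        case 0x09: // ADDIU rt, rs, imm
            if (rt != 0) cpu.regs[rt] = cpu.regs[rs] + imm; // $zero stays zero
            break;
        default:
            return false; // not decoded by this toy
        }
        cpu.pc += 4;
    }
    return false;
}
```

Overwriting the instruction at the halt address with the marker means the host never has to understand the routine's control flow. It just sets up the registers, runs with no cycle limit, and reads the results out of memory once the loop returns.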

I'm sure Señor Casaroja will be very pleased to get this new functionality in his amazing Noesis.


17 comments in total.



February 12, 2014 at 2:21 pm (CST)
Perhaps I can upload the game's ELF file somewhere and give you the link + address of where the data I'm looking at is located. It's a music game - Konami's DDRMAX2 - Dance Dance Revolution 7th Mix... part of me is sure what I'm looking at either is, or was, variable related not just because of the text, and how each piece looked like a variable, but also because it seems consistent with how variables were labeled in other Konami BEMANI series games - like, for example, beatmania 5th Mix for the Playstation [whose source code was inadvertently added to pad out the disk for beatmania Extra Mix ... heh, oops...]


Rich Whitehouse

February 8, 2014 at 8:45 am (CST)
It could be that you're looking at a data section rather than symbol data. I couldn't really say more without looking at the ELF myself.



February 7, 2014 at 1:20 pm (CST)
I loaded the ELF into IDA Pro, but it isn't finding the specific thing I seemed to have stumbled across at the end of the file - neither does creating a dump and running it through available tools. A shame, since it is clear the strings I am finding could be, or could have been at one time, variable names - and it is clear it isn't just a list of names per se, since there are substantial bits of data between each string. Maybe I shall peruse that PDF.


Rich Whitehouse

September 16, 2013 at 4:35 pm (CST)
Debuggers like IDA will automatically find and use symbol tables. I use it in Noesis as well for games that have it, in order to specify routine addresses by symbol so that my code works on different binaries and regions. If you want to dig out the symbol table yourself, here's a random ELF spec found by Googling:




September 15, 2013 at 7:00 pm (CST)
"As a bonus, tons of PS2 ELF binaries leave their symbol table fully intact."

I have been investigating a Playstation 2 game's ELF file a couple days ago, and I saw what looked like a symbol table at the very end of the file - how can I tell, and is it possible to use the information, if it is what I think it is, to aid in disassembling said game?


Rich Whitehouse

May 16, 2011 at 7:55 pm (CST)
Oh, hey, it's nice to hear from one of you guys! Thanks for the IRC invitation. I'll definitely take you up on that, if I can clear my plate off enough to get time to revisit PCSX2 and the idea of IDA Pro integration. I'm in the middle of trying to finish up a rather ambitious project on a deadline, at the moment.

I'm also happy to hear that you guys do care about having a workable debugger in PCSX2. I had actually assumed it was a low-to-non-existent priority, as that seems to be the way debuggers go too often in emulation projects. The one in DosBox has been pretty sadly neglected as well, although it is pretty usable once you get used to all that typing. :)

I've generally been of the mind that IDA Pro integration would be the best approach for PCSX2, mainly because making a really good debugger from scratch is a hell of a lot of work. The down-side is that IDA Pro is not free (well, at least the more feature-ful version of it isn't), which makes things less accessible. I have also had some thoughts on how to handle debugging of r5900-native code in dynarec mode that I would love to run by you guys, but I guess I'll save that for a time when I'm more freed up to actually contribute myself.



May 16, 2011 at 11:38 am (CST)
Hey Rich, if you need any help / thoughts / advice for fixing (or redoing) the PCSX2 debugger, feel free to come chat with the emu devs in #pcsx2-dev on Efnet.
Fixing the debugger has long been on top of our wish list, but we had to focus on making games work (fast), first.
We've just released a milestone though and can now focus on the interesting stuff again ;)


Rich Whitehouse

April 15, 2011 at 2:36 am (CST)
Looks like bad pointweighting on the model itself. Likely went unnoticed because it's harder to see at PSX resolution and battle cam perspective, and looks fine in the idle anim. (0009)

Likewise, Ragtime Mouse looks like some of his anims are busted. But they are in fact just broken in the game itself. For him in particular I was able to find a video of someone getting one of his questions wrong, where he plays the anim with his arm angle screwed up in-game.

So if you think something's broken with a monster in the future, find me pics of it not being broken in the actual game. This game is pretty full of screwups and bad UV'ing, I've found it to be quite a step down from FF8 in the tech and asset departments.



April 14, 2011 at 9:25 pm (CST)
I was recently using your "modelviewer" and I noticed in FF9 on Disc 3 file 53 the Torama model seems to have some kind of issue on its hind legs. Just wanted to bring that to your attention.


Rich Whitehouse

April 9, 2011 at 3:39 pm (CST)
I've been using PCSX2, IDA, and my own R5900 debugger/disassembler. Which has not been too pleasant, but it's a usable setup. Since PCSX2's debugger is totally broken and unusable, I've actually been debugging PCSX2 itself, and doing stuff like inserting custom code segments into the VIF unpack routine in order to catch model data that I want to take a look at, and otherwise just setting data breakpoints in system/VU memory to find code that deals with the particular data I'm interested in.

Unfortunately, I haven't found any good alternatives. The most important thing for me is to be able to debug games in their normal operating conditions, and to that end I'm not aware of any PS2 emulators that actually do a better job than PCSX2. I would consider fixing their debugger myself, if not for the fact that it is a really terrible hack that just sits on top of the app and does not play nicely with anything else. (e.g. handling breakpoints by trying to step the CPU in its own private loop, endlessly, in interpreted mode, choking every other system in the emulator) So I think if I were going to take up that job, my time would be much better-spent integrating support with IDA Pro.

It's also a big pain in the ass to try to step through binary in interpreted mode and match it to my pre-disassembled code. Occasionally I've just left dynarec on, and stepped through dynarec'd x86 for EE/VU code instead.


Site design and contents (c) 2009 Rich Whitehouse. Except those contents which happen to be images or screenshots containing shit that is (c) someone/something else entirely. That shit isn't really mine. Fair use though! FAIR USE!
All works on this web site are the result of my own personal efforts, and are not in any way supported by any given company. You alone are responsible for any damages which you may incur as a result of this web site or files related to this web site.