Some (possibly) New Ideas for RPCS3
#31
(05-17-2014, 07:33 PM)gamenoob Wrote: Are you asking that people should decompile game code to something workable and then recompile it to x86? That is an enormous task. And it is not generic either; you have to do the work for every game. Basically, porting a huge collection of PS3 games.

Isn't this what the SPU recompiler already does? I see a ton of .log files with x86 assembler in them ;)
Asus N55SF, i7-2670QM (~2,8 ghz under typical load), GeForce GT 555M (only OpenGL)
Reply
#32
(05-17-2014, 09:53 PM)ssshadow Wrote:
(05-17-2014, 07:33 PM)gamenoob Wrote: Are you asking that people should decompile game code to something workable and then recompile it to x86? That is an enormous task. And it is not generic either; you have to do the work for every game. Basically, porting a huge collection of PS3 games.

Isn't this what the SPU recompiler already does? I see a ton of .log files with x86 assembler in them ;)

I think he meant statically (and I doubt RPCS3 is going to suddenly gain a static PPU recompiler), whereas the SPU recompiler is a JIT.
Reply
#33
Well, that is just AOT (ahead of time) instead of JIT (just in time). I thought I read that Short Waves was doing AOT, but I might be wrong.

As long as self-modifying code is 100% forbidden, AOT is probably a good idea. But either JIT or AOT will get rid of the decode stage. An intermediate representation can be created to simplify creating multiple native versions, but there's no point interpreting that IL. No matter what it should end up executing as native code for the CPU to decode in hardware.
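
To make the "get rid of the decode stage" point concrete, here is a minimal sketch. It assumes a made-up 16-bit toy ISA (nothing to do with the real PPU/SPU encodings), and the names `encode`, `decode`, `interpret` and `translate` are hypothetical. The interpreter decodes every word each time the block runs; the translator decodes once and emits host code (Python source here, standing in for native code).

```python
# Toy 3-operand ISA invented for illustration: each instruction is a 16-bit
# word laid out as [opcode:4][dst:4][a:4][b:4]. This is NOT the Cell encoding.
OP_ADD, OP_MUL = 0, 1


def encode(op, dst, a, b):
    """Pack one toy instruction into a 16-bit word."""
    return (op << 12) | (dst << 8) | (a << 4) | b


def decode(word):
    """Unpack a 16-bit word into (opcode, dst, a, b)."""
    return (word >> 12) & 0xF, (word >> 8) & 0xF, (word >> 4) & 0xF, word & 0xF


def interpret(program, regs):
    """Interpreter: pays the decode cost on every execution of the block."""
    for word in program:
        op, dst, a, b = decode(word)
        if op == OP_ADD:
            regs[dst] = regs[a] + regs[b]
        elif op == OP_MUL:
            regs[dst] = regs[a] * regs[b]
        else:
            raise ValueError(f"unknown opcode {op}")
    return regs


def translate(program):
    """Translator: decodes once and emits a host function for the block."""
    symbols = {OP_ADD: "+", OP_MUL: "*"}
    lines = ["def block(regs):"]
    for word in program:
        op, dst, a, b = decode(word)
        lines.append(f"    regs[{dst}] = regs[{a}] {symbols[op]} regs[{b}]")
    lines.append("    return regs")
    namespace = {}
    exec("\n".join(lines), namespace)  # stands in for emitting native code
    return namespace["block"]


program = [encode(OP_ADD, 2, 0, 1), encode(OP_MUL, 3, 2, 2)]
block = translate(program)  # done once: ahead of time (AOT) or on first use (JIT)
result_interp = interpret(program, [3, 4, 0, 0])
result_native = block([3, 4, 0, 0])
```

Whether `translate` runs before the game starts or the first time a block is hit is the only difference between AOT and JIT in this sketch; either way the per-instruction decode is gone from the hot path.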

-[Unknown]
Reply
#34
(05-18-2014, 12:01 AM)derpf Wrote:
(05-17-2014, 09:53 PM)ssshadow Wrote:
(05-17-2014, 07:33 PM)gamenoob Wrote: Are you asking that people should decompile game code to something workable and then recompile it to x86? That is an enormous task. And it is not generic either; you have to do the work for every game. Basically, porting a huge collection of PS3 games.

Isn't this what the SPU recompiler already does? I see a ton of .log files with x86 assembler in them ;)

I think he meant statically (and I doubt RPCS3 is going to suddenly gain a static PPU recompiler), whereas the SPU recompiler is a JIT.

I grab the executable, let it run, and analyze the PPU/SPU instructions and data pointers. Once I have all the instructions, I trace them back to a high-level language (i.e. decompiling). Then, depending on the runtime platform (x86, ARM, etc.), I recompile them and finally run the result on the host hardware. Yes, it is a static approach. The problem is that every game has different instructions, so you have to do it for every game. The upside is that a system of similar or even lower raw power should be able to emulate the PS3 pretty well.

There is another "probable" problem. I say "probable" because I am not sure of it. Let's consider a Unity3D game that contains a script. Consider this pseudo-code snippet from that script:

Code:
if(boolA){assemblyFuncA();}
else if(boolB){assemblyFuncB();}
else{assemblyFuncC();}


Here there are three functions. For simplicity, let's assume the body of each function is pure assembly. So, depending on the situation (the bools), a different set of instructions will run. When the game engine compiles the whole game project, one or more executables are produced. Let's say you are analyzing this game. Which of the following can you do?

Case 1: We grab the executable and decompile it. We can get the instructions inside every function.
Case 2: We grab the executable and let it run. When the appropriate condition is met, the corresponding set of instructions will run. For example, if "boolB" is never true during an emulator session (with the respective game title running), the instructions within "assemblyFuncB()" will never be collected either. So how can you decode them in case 2?


If you can do case 1, then prior decoding will be great. If not, then I guess we have to do something else.
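
The difference between the two cases can be sketched roughly like this. The `PROGRAM` table is a hypothetical stand-in for the executable (block name, its instructions, and its possible branch targets), mirroring the three-function snippet above; none of it comes from a real game or from RPCS3.

```python
# Hypothetical toy program: block name -> (instructions, possible successors).
PROGRAM = {
    "entry": (["cmp boolA"], ["funcA", "check_b"]),
    "check_b": (["cmp boolB"], ["funcB", "funcC"]),
    "funcA": (["asm A"], ["exit"]),
    "funcB": (["asm B"], ["exit"]),
    "funcC": (["asm C"], ["exit"]),
    "exit": (["ret"], []),
}


def static_discover(program, entry):
    """Case 1: follow every branch target without running anything."""
    seen, work = set(), [entry]
    while work:
        name = work.pop()
        if name in seen:
            continue
        seen.add(name)
        work.extend(program[name][1])
    return seen


def dynamic_trace(program, entry, bool_a, bool_b):
    """Case 2: record only the blocks that one concrete run executes."""
    seen, name = set(), entry
    while True:
        seen.add(name)
        successors = program[name][1]
        if not successors:
            return seen
        if name == "entry":
            name = successors[0] if bool_a else successors[1]
        elif name == "check_b":
            name = successors[0] if bool_b else successors[1]
        else:
            name = successors[0]


all_blocks = static_discover(PROGRAM, "entry")
traced = dynamic_trace(PROGRAM, "entry", bool_a=True, bool_b=False)
missed = all_blocks - traced  # code a pure "run and collect" pass never sees
```

In this toy the static walk finds everything because every branch target is listed explicitly. Real binaries also contain indirect branches whose targets are only known at run time, which is one reason a JIT that translates a block the first time it is reached is the usual fallback.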
Reply
#35
Wouldn't AOT compilation enable PS3 games to be much faster through the emulator? Because if it's already compiled to a targetable execution format that needs no decoding it would make sense that the games would be almost like native executables on the target system ... and since it's all taken care of ahead of time all the emulator has to do is use a light abstraction layer-like engine to handle graphics, emulated RAM ranges, sound, etc.? Or do I have this wrong?
Reply
#36
So... what we'll get is some cool-ass decoder or some stuff like that? Since I get kinda lost without tutorials, I hope we get some "click here, load this" button, so people like me can just test games without knowing about this decoding magic.
Reply
#37
(06-07-2014, 01:19 AM)Threule Wrote: So... what we'll get is some cool-ass decoder or some stuff like that? Since I get kinda lost without tutorials, I hope we get some "click here, load this" button, so people like me can just test games without knowing about this decoding magic.

That's what's already in place. Anyone who cannot figure it out probably should not be using it at this point in time, though.
Reply
#38
Mhmm, nice discussion; I guess the devs know best.
By the way, did any of you ever look at the Cell BE "simulator"? Whether the developers of the Cell, or a team close to them with full insight into the docs, did it AOT or JIT or maybe some mix, that should be the "golden" way, as they should know best.

Back in 2007 (the time when I played around with this nice piece of software), as far as I remember, YDL ran at a decent speed. I programmed some little speed-testing stuff in it (no gfx/audio, certainly), but I was impressed by how fast it was. OK, that was 7 years ago, so that's as far as my recollection goes...
Reply
#39
Hey all ...

Well... let me first say that I have already read all the stuff about "words in the air" and about doing a POC before posting :P

But I've had this idea on my mind for a good while, and I haven't read many lines of RPCS3 code yet. This is more of a brainstorm, where I don't know exactly what to do.

Okay... so. All those discussions were about how to fetch, transcode, interpret, etc. from the Cell instruction set to x86. Good... If I'm right, all that workload is done by the CPU, and the GPU is used only as the OpenGL backend. Is that right?

What I'm raising here is just an organization and architecture idea that came to me... anyway, straight to the point: what if part of the ISA is *executed* on the GPU, mostly (or only, or partly) the SIMD instructions dispatched to the SPEs? They could be cached up to X instructions and then dispatched to the GPU.

I'm not saying that it's better or easily done... I'm just raising the idea: "What if some workload is done by OpenCL on the GPU, releasing or reducing the workload on the CPU?" And I know all the problems with accessing memory outside the GPU, and the gap between dispatch and execution per instruction or per block of instructions.

I have a little HPC experience on a hybrid cluster, using OpenMP and OpenCL (for fun... even though it was a project financed by my university :}), and *maybe* something can be gained by dividing the workload between CPU and GPU for emulation purposes.

Right now I'm reading "Emu / CPU /", "Emu / Cell /" and "Emu / Memory", so I'm posting this before actually understanding all the code... because I will do that in my spare time ^^ ...and it can take some days.

Well, thanks :D
Reply
#40
As far as I remember, Dolphin experimented with OpenCL, and even something as well suited to it as texture decoding did not yield significant speed-ups and more often than not slowed things down.

I don't think the GPU is going to help us a lot with CPU emulation for the time being. I can imagine doing game-specific "SPU kernel replacements" later on, similar to the way that, for example, PPSSPP detects some copy functions by hashing them and then does the copying natively.
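
A rough sketch of that hash-and-replace idea, not PPSSPP's or RPCS3's actual code: the guest bytes, the choice of SHA-256 and the names `REPLACEMENTS`, `native_copy` and `find_replacement` are all assumptions made for illustration.

```python
import hashlib


def native_copy(dst, src, n):
    """Host-side replacement for a recognised guest copy routine."""
    dst[:n] = src[:n]


# Made-up bytes standing in for the machine code of a guest copy loop.
KNOWN_COPY_CODE = bytes.fromhex("deadbeef00112233")

# Table of known routines: hash of the guest code -> native replacement.
REPLACEMENTS = {hashlib.sha256(KNOWN_COPY_CODE).hexdigest(): native_copy}


def find_replacement(guest_code):
    """Return a native replacement if the code's hash is known, else None."""
    return REPLACEMENTS.get(hashlib.sha256(guest_code).hexdigest())


src = bytearray(b"hello world")
dst = bytearray(len(src))
handler = find_replacement(KNOWN_COPY_CODE)
if handler is not None:
    # Recognised routine: run the host version instead of emulating the loop.
    handler(dst, src, len(src))
unknown = find_replacement(b"\x00\x01\x02")  # unrecognised code: emulate as usual
```

The lookup only has to happen once per function, so the hashing cost is small next to what is saved by skipping the emulated loop.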

I can't imagine a general-purpose SPU-to-OpenCL translation working out too well, but you're free to try.

As a side note, if anyone has some real-world compute-heavy SPU kernels, you could post them here for s8box to do some preliminary tests (just hand-write a CPU and an OpenCL implementation outside of RPCS3 and benchmark those to see whether offloading is even worth the CPU time spent copying and retrieving the data).
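
For anyone setting up such a test, a minimal sketch of the comparison could look like the following. It is only a cost model plus a timing helper: `time_it` and `offload_pays_off` are hypothetical names, and the actual OpenCL kernel and the bandwidth and launch-overhead figures would have to be measured on real hardware.

```python
import time


def time_it(fn, repeats=5):
    """Return the best wall-clock time in seconds over a few runs of fn()."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def offload_pays_off(cpu_s, gpu_s, bytes_moved, bandwidth_bytes_per_s, launch_overhead_s):
    """True if GPU compute plus copying and launch overhead beats the CPU time."""
    transfer_s = bytes_moved / bandwidth_bytes_per_s
    return cpu_s > gpu_s + transfer_s + launch_overhead_s


# Time a stand-in CPU kernel; a real test would time the hand-written kernel.
cpu_time = time_it(lambda: sum(i * i for i in range(10_000)))

# Placeholder numbers, NOT measurements: 1 MB moved over a 1 GB/s link.
worth_it = offload_pays_off(0.0015, 0.0002, 1_000_000, 1e9, 0.0001)
not_worth_it = offload_pays_off(0.0010, 0.0002, 1_000_000, 1e9, 0.0001)
```

With the placeholder figures the copy alone costs about as much as the faster CPU run, which is the kind of result that would confirm the doubts above; real numbers could of course come out differently.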
Reply

