
QJ.net Game Discussion - PSP, Xbox, Wii, PS3, PSP Homebrew, and PSP Guides


C/C++ Optimizations


Old 09-09-2006, 09:27 AM   #1

sceKernelExitGame();
 
Bronx's Avatar
 
Join Date: Jan 2006
Location: New York
Posts: 3,125
Trader Feedback: 0
Default C/C++ Optimizations

Post here whatever code optimizations you know of, and I'll add to this first post!

Here are some links I picked up from psp-programming.com (thanks):

c optimization
c++ optimization
pspgu optimization

Nexis2600

There are a few reasons to use while(!ExitLoop). The benefit is that code anywhere in the program can set the variable to true and force the main loop to exit.

Another reason why someone might use a variable instead of a break is that you have to make sure you exit the loop at the proper time.

For instance, you enter a loop and start a new display list at the top. If you then break before the end of the display list is called, the next time you try to start a new display list you're bound to lock up the PSP or cause an error.

Fanjita

Get yourself a copy of Mike Abrash's "Zen of Code Optimisation". That will teach you an enormous amount about optimisation techniques, and ways of thinking.

At the end of the day, the best route to optimisation is to get to know your target platform, how computers work at a low level, and to understand the theory of algorithms.


Insomniac197

Use sceCtrlPeekBufferPositive over sceCtrlReadBufferPositive.

Use hardware acceleration wherever possible instead of software.

Don't overuse sceKernelDcacheWritebackInvalidateAll().

Always swizzle your images.

Use the texture cache.

VRAM is faster than RAM (but you have very little of it to use) - so place textures that are ALWAYS on screen in VRAM (for example the player sprite).

Harleyg

Use >> and << instead of / and * when dividing or multiplying by a power of two.

Never use while(1), use while(foo) so you can stop the loop by setting foo to 0.
If you just want a while(1) equivalent use:
Code:
for(;;)

Exophase

For programming on the PSP it's important to minimize memory usage overhead. Reusing the same buffer spaces for multiple things is dirty but can achieve this; you'll get higher cache hit rates. For small temporary allocations it's also better to allocate on the stack with dynamically sized arrays than on the heap with malloc/new, because that'll improve spatial locality of your neighboring elements, improve temporal locality over the stack, and decrease memory overhead overall. The PSP only has a small amount of cache, so this is important when doing computationally expensive work. If your programming leads you to balance icache usage against dcache usage, it's probably better to favor dcache usage, because icache can be prefetched more transparently and thus efficiently (which is probably the reason why some platforms have less icache than dcache).

If you're not using VRAM that heavily anyway (happens, well, in emulators usually) you can place non-video-related things there... The ME's eDRAM is good for this too, but can only be accessed on the ME of course. If you're just going to use it like an extension of normal RAM you shouldn't have to worry about caching problems, but be sure the memory there is aligned to the cache line width (64 bytes).

AnonymousTipster

Use VFPU accelerated functions where applicable. There are VFPU ASM code fragments on ps2dev.org here:
http://forums.ps2dev.org/viewtopic.php?t=6609
http://forums.ps2dev.org/viewtopic.php?t=5557
http://forums.ps2dev.org/viewtopic.php?t=6478
You need to be careful of the overhead of moving data when utilising the VFPU, or the performance gain will be negated.
I've only made minimal use of the VFPU thus far, but it could potentially be very powerful in skilled hands.

Yaustar

-Profiling:
Under GCC, IIRC, you add -pg to the CFLAGS and recompile. When you run the executable, it will record the time that functions take to finish, the call graph, etc. This information will be in a file gmon.out (I think), which can be read using a tool like gprof.

Here is what one of mine looks like for the project I am working on:


Code:
Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls  Ts/call  Ts/call  name    
100.00      0.02     0.02                             zoomSurfaceRGBA
  0.00      0.02     0.00       10     0.00     0.00  Core::CLogger::operator<<(std::string const&)
  0.00      0.02     0.00        2     0.00     0.00  Core::CSurfaceObject::FreeSurface()
  0.00      0.02     0.00        1     0.00     0.00  global destructors keyed to _ZN4Core14CSurfaceObjectC2Ev
  0.00      0.02     0.00        1     0.00     0.00  global destructors keyed to _ZN4Core6ErrLogE
  0.00      0.02     0.00        1     0.00     0.00  __static_initialization_and_destruction_0(int, int)
  0.00      0.02     0.00        1     0.00     0.00  __static_initialization_and_destruction_0(int, int)
  0.00      0.02     0.00        1     0.00     0.00  Core::CSurfaceObject::ScaleImage(float)
  0.00      0.02     0.00        1     0.00     0.00  Core::CSurfaceObject::LoadImage(std::string const&)
  0.00      0.02     0.00        1     0.00     0.00  Core::CSurfaceObject::CSurfaceObject()
  0.00      0.02     0.00        1     0.00     0.00  Core::CSurfaceObject::~CSurfaceObject()
  0.00      0.02     0.00        1     0.00     0.00  Core::CLogger::~CLogger()

 %         the percentage of the total running time of the
time       program used by this function.

cumulative a running sum of the number of seconds accounted
 seconds   for by this function and those listed above it.

 self      the number of seconds accounted for by this
seconds    function alone.  This is the major sort for this
           listing.

calls      the number of times this function was invoked, if
           this function is profiled, else blank.
 
 self      the average number of milliseconds spent in this
ms/call    function per call, if this function is profiled,
	   else blank.

 total     the average number of milliseconds spent in this
ms/call    function and its descendents per call, if this 
	   function is profiled, else blank.

name       the name of the function.  This is the minor sort
           for this listing. The index shows the location of
	   the function in the gprof listing. If the index is
	   in parenthesis it shows where it would appear in
	   the gprof listing if it were to be printed.

		     Call graph (explanation follows)


granularity: each sample hit covers 4 byte(s) for 50.00% of 0.02 seconds

index % time    self  children    called     name
                                                 <spontaneous>
[1]    100.0    0.02    0.00                 zoomSurfaceRGBA [1]
-----------------------------------------------
                0.00    0.00       1/10          Core::CSurfaceObject::ScaleImage(float) [11]
                0.00    0.00       2/10          Core::CSurfaceObject::CSurfaceObject() [13]
                0.00    0.00       2/10          Core::CSurfaceObject::~CSurfaceObject() [14]
                0.00    0.00       2/10          Core::CSurfaceObject::LoadImage(std::string const&) [12]
                0.00    0.00       3/10          Core::CSurfaceObject::FreeSurface() [6]
[5]      0.0    0.00    0.00      10         Core::CLogger::operator<<(std::string const&) [5]
-----------------------------------------------
                0.00    0.00       1/2           Core::CSurfaceObject::~CSurfaceObject() [14]
                0.00    0.00       1/2           Core::CSurfaceObject::LoadImage(std::string const&) [12]
[6]      0.0    0.00    0.00       2         Core::CSurfaceObject::FreeSurface() [6]
                0.00    0.00       3/10          Core::CLogger::operator<<(std::string const&) [5]
-----------------------------------------------
                0.00    0.00       1/1           __do_global_dtors [1263]
[7]      0.0    0.00    0.00       1         global destructors keyed to _ZN4Core14CSurfaceObjectC2Ev [7]
                0.00    0.00       1/1           __static_initialization_and_destruction_0(int, int) [10]
-----------------------------------------------
                0.00    0.00       1/1           __do_global_dtors [1263]
[8]      0.0    0.00    0.00       1         global destructors keyed to _ZN4Core6ErrLogE [8]
                0.00    0.00       1/1           __static_initialization_and_destruction_0(int, int) [9]
-----------------------------------------------
                0.00    0.00       1/1           global destructors keyed to _ZN4Core6ErrLogE [8]
[9]      0.0    0.00    0.00       1         __static_initialization_and_destruction_0(int, int) [9]
                0.00    0.00       1/1           Core::CLogger::~CLogger() [15]
-----------------------------------------------
                0.00    0.00       1/1           global destructors keyed to _ZN4Core14CSurfaceObjectC2Ev [7]
[10]     0.0    0.00    0.00       1         __static_initialization_and_destruction_0(int, int) [10]
-----------------------------------------------
                0.00    0.00       1/1           SDL_main [58]
[11]     0.0    0.00    0.00       1         Core::CSurfaceObject::ScaleImage(float) [11]
                0.00    0.00       1/10          Core::CLogger::operator<<(std::string const&) [5]
-----------------------------------------------
                0.00    0.00       1/1           SDL_main [58]
[12]     0.0    0.00    0.00       1         Core::CSurfaceObject::LoadImage(std::string const&) [12]
                0.00    0.00       2/10          Core::CLogger::operator<<(std::string const&) [5]
                0.00    0.00       1/2           Core::CSurfaceObject::FreeSurface() [6]
-----------------------------------------------
                0.00    0.00       1/1           SDL_main [58]
[13]     0.0    0.00    0.00       1         Core::CSurfaceObject::CSurfaceObject() [13]
                0.00    0.00       2/10          Core::CLogger::operator<<(std::string const&) [5]
-----------------------------------------------
                0.00    0.00       1/1           SDL_main [58]
[14]     0.0    0.00    0.00       1         Core::CSurfaceObject::~CSurfaceObject() [14]
                0.00    0.00       2/10          Core::CLogger::operator<<(std::string const&) [5]
                0.00    0.00       1/2           Core::CSurfaceObject::FreeSurface() [6]
-----------------------------------------------
                0.00    0.00       1/1           __static_initialization_and_destruction_0(int, int) [9]
[15]     0.0    0.00    0.00       1         Core::CLogger::~CLogger() [15]
-----------------------------------------------

 This table describes the call tree of the program, and was sorted by
 the total amount of time spent in each function and its children.

 Each entry in this table consists of several lines.  The line with the
 index number at the left hand margin lists the current function.
 The lines above it list the functions that called this function,
 and the lines below it list the functions this one called.
 This line lists:
     index	A unique number given to each element of the table.
		Index numbers are sorted numerically.
		The index number is printed next to every function name so
		it is easier to look up where the function in the table.

     % time	This is the percentage of the `total' time that was spent
		in this function and its children.  Note that due to
		different viewpoints, functions excluded by options, etc,
		these numbers will NOT add up to 100%.

     self	This is the total amount of time spent in this function.

     children	This is the total amount of time propagated into this
		function by its children.

     called	This is the number of times the function was called.
		If the function called itself recursively, the number
		only includes non-recursive calls, and is followed by
		a `+' and the number of recursive calls.

     name	The name of the current function.  The index number is
		printed after it.  If the function is a member of a
		cycle, the cycle number is printed between the
		function's name and the index number.


 For the function's parents, the fields have the following meanings:

     self	This is the amount of time that was propagated directly
		from the function into this parent.

     children	This is the amount of time that was propagated from
		the function's children into this parent.

     called	This is the number of times this parent called the
		function `/' the total number of times the function
		was called.  Recursive calls to the function are not
		included in the number after the `/'.

     name	This is the name of the parent.  The parent's index
		number is printed after it.  If the parent is a
		member of a cycle, the cycle number is printed between
		the name and the index number.

 If the parents of the function cannot be determined, the word
 `<spontaneous>' is printed in the `name' field, and all the other
 fields are blank.

 For the function's children, the fields have the following meanings:

     self	This is the amount of time that was propagated directly
		from the child into the function.

     children	This is the amount of time that was propagated from the
		child's children to the function.

     called	This is the number of times the function called
		this child `/' the total number of times the child
		was called.  Recursive calls by the child are not
		listed in the number after the `/'.

     name	This is the name of the child.  The child's index
		number is printed after it.  If the child is a
		member of a cycle, the cycle number is printed
		between the name and the index number.

 If there are any cycles (circles) in the call graph, there is an
 entry for the cycle-as-a-whole.  This entry shows who called the
 cycle (as parents) and the members of the cycle (as children.)
 The `+' recursive calls entry shows the number of function calls that
 were internal to the cycle, and the calls entry for each member shows,
 for that member, how many times it was called from other members of
 the cycle.


Index by function name

   [7] global destructors keyed to _ZN4Core14CSurfaceObjectC2Ev (CSurfaceObject.cpp) [11] Core::CSurfaceObject::ScaleImage(float) [14] Core::CSurfaceObject::~CSurfaceObject()
   [8] global destructors keyed to _ZN4Core6ErrLogE (CLogger.cpp) [6] Core::CSurfaceObject::FreeSurface() [15] Core::CLogger::~CLogger()
  [10] __static_initialization_and_destruction_0(int, int) (CSurfaceObject.cpp) [12] Core::CSurfaceObject::LoadImage(std::string const&) [5] Core::CLogger::operator<<(std::string const&)
   [9] __static_initialization_and_destruction_0(int, int) (CLogger.cpp) [13] Core::CSurfaceObject::CSurfaceObject() [1] zoomSurfaceRGBA

- Don't allocate or deallocate memory between frames. Do it at the end or beginning of a 'game state', or even better, use fixed memory pools.
- Keep class hierarchies as flat as possible.
- Avoid pointer chains (e.g. a->b->c->d->blah = 10).
- Trust your compiler to deal with little optimisations (e.g. loop unrolling).
- In loops, count down to 0 rather than up to some number X (this can save a few operations per iteration).
- Sometimes -O2 will give better results than -O3 (don't ask me why, because it shouldn't :/).

Last edited by Bronx; 09-10-2006 at 05:07 PM..
Old 09-09-2006, 10:08 AM   #2

Developer
 
Join Date: Mar 2006
Posts: 1,026
Trader Feedback: 0
Default

Use sceCtrlPeekBufferPositive over sceCtrlReadBufferPositive.

Use hardware acceleration wherever possible instead of software.

Don't overuse sceKernelDcacheWritebackInvalidateAll().

Always swizzle your images.

Use the texture cache.

VRAM is faster than RAM (but you have very little of it to use) - so place textures that are ALWAYS on screen in VRAM (for example the player sprite).
__________________

Check out my homebrew & C tutorials at http://insomniac.0x89.org/
Coder formerly known as Insomniac197

Quote:
tshirtz: what is irshell ??
Atarian_: it's where people who work for the IRS go when they die
Old 09-09-2006, 10:13 AM   #3

sceKernelExitGame();
 
Bronx's Avatar
 
Join Date: Jan 2006
Location: New York
Posts: 3,125
Trader Feedback: 0
Default

Added
Old 09-09-2006, 10:56 AM   #4

total-Z
 
youresam's Avatar
 
Join Date: Jul 2005
Location: texas
Posts: 2,803
Trader Feedback: 0
Default

Nice.
__________________
牧来栠摩琠敨映汩獥
PSN: youresam
From Earth the Frozen Ipaqs shall rise and be silenced and all will live free.
--Mike Hollingsworth
Old 09-09-2006, 11:09 AM   #5
 
Twin891's Avatar
 
Join Date: Jul 2006
Location: Canada
Posts: 70
Trader Feedback: 0
Default

Great idea! But maybe you could add some explanations to it. For example, "sceCtrlPeekBufferPositive" and "sceCtrlReadBufferPositive": what are the differences? Why should I use one over the other?
Old 09-09-2006, 11:14 AM   #6

AKA Homer
 
Moonchild's Avatar
 
Join Date: Jan 2006
Location: Sweden
Posts: 1,779
Trader Feedback: 0
Default

Quote:
Originally Posted by Twin891
Great idea! But maybe you could add some explanations to it. For example, "sceCtrlPeekBufferPositive" and "sceCtrlReadBufferPositive": what are the differences? Why should I use one over the other?
Yeah, I believe there's no documentation on the Peek thing.
Old 09-09-2006, 11:22 AM   #7

My name is Mud
 
Join Date: Dec 2005
Posts: 1,538
Trader Feedback: 0
Default

Use >> and << instead of / and * when dividing or multiplying by a power of two.
Only one i can think of...
Homer: Your signature, it's a cartoon but i still love it.
__________________

Old 09-09-2006, 12:09 PM   #8

Developer
 
Join Date: Aug 2006
Posts: 209
Trader Feedback: 0
Default

Quote:
Originally Posted by hàrléyg²
Use >> and << instead of / and * when dividing or multiplying by a power of two.
Only one i can think of...
Homer: Your signature, it's a cartoon but i still love it.
Any optimizing compiler will pick this one up very easily. Instead I'd say when doing integer division by powers of two make sure to be using unsigned types if you don't need negatives, so it doesn't add in the sign correction to round it towards 0.

There are a lot of good techniques that compilers may get now, like strength reduction over loops, inner loop hoisting.. but sometimes if you do those things yourself you might open up new opportunities that the compiler hasn't seen.

For programming on the PSP it's important to minimize memory usage overhead. Reusing the same buffer spaces for multiple things is dirty but can achieve this; you'll get higher cache hit rates. For small temporary allocations it's also better to allocate on the stack with dynamically sized arrays than on the heap with malloc/new, because that'll improve spatial locality of your neighboring elements, improve temporal locality over the stack, and decrease memory overhead overall. The PSP only has a small amount of cache, so this is important when doing computationally expensive work. If your programming leads you to balance icache usage against dcache usage, it's probably better to favor dcache usage, because icache can be prefetched more transparently and thus efficiently (which is probably the reason why some platforms have less icache than dcache).

If you're not using VRAM that heavily anyway (happens, well, in emulators usually) you can place non-video-related things there... The ME's eDRAM is good for this too, but can only be accessed on the ME of course. If you're just going to use it like an extension of normal RAM you shouldn't have to worry about caching problems, but be sure the memory there is aligned to the cache line width (64 bytes).
Old 09-09-2006, 12:44 PM   #9

Developer
 
Join Date: Mar 2006
Posts: 1,026
Trader Feedback: 0
Default

Quote:
Originally Posted by Twin891
Great idea! But maybe you could add some explanations to it. For example, "sceCtrlPeekBufferPositive" and "sceCtrlReadBufferPositive": what are the differences? Why should I use one over the other?
sceCtrlReadBufferPositive is blocking and thus waits for vsync before returning a value - sceCtrlPeekBufferPositive doesn't wait for vsync and therefore is faster.
__________________

Check out my homebrew & C tutorials at http://insomniac.0x89.org/
Coder formerly known as Insomniac197

Quote:
tshirtz: what is irshell ??
Atarian_: it's where people who work for the IRS go when they die
Old 09-09-2006, 12:55 PM   #10

My name is Mud
 
Join Date: Dec 2005
Posts: 1,538
Trader Feedback: 0
Default

Never use while(1), use while(foo) so you can stop the loop by setting foo to 0.
If you just want a while(1) equivalent use:
Code:
for(;;)
__________________

Old 09-09-2006, 01:12 PM   #11
 
Join Date: Jan 2006
Posts: 4,288
Trader Feedback: 0
Default

Quote:
Originally Posted by hàrléyg²
Never use while(1), use while(foo) so you can stop the loop by setting foo to 0.
I'm just wondering, why/what does that do?
Quote:
If you just want a while(1) equivalent use:
Code:
for(;;)
I think I read somewhere that most compilers already make that optimization themselves, but I'm not sure...
Old 09-09-2006, 01:26 PM   #12

Developer
 
AnonymousTipster's Avatar
 
Join Date: Jun 2005
Location: Under a Large rock called Fred
Posts: 693
Trader Feedback: 0
Default

Use VFPU accelerated functions where applicable. There are VFPU ASM code fragments on ps2dev.org here:
http://forums.ps2dev.org/viewtopic.php?t=6609
http://forums.ps2dev.org/viewtopic.php?t=5557
http://forums.ps2dev.org/viewtopic.php?t=6478
You need to be careful of the overhead of moving data when utilising the VFPU, or the performance gain will be negated.
I've only made minimal use of the VFPU thus far, but it could potentially be very powerful in skilled hands.
__________________
Developer of
Tipster Unzip/Unrar ThrottleX RoboTORN3D ODEPsp


Now, with the power of my PSP, I will finally RULE THE WORLD. Muhahahah.
Old 09-09-2006, 01:41 PM   #13

...in a dream...
 
SG57's Avatar
 
Join Date: Jul 2005
Posts: 4,957
Trader Feedback: 0
Default

Quote:
Originally Posted by Insomniac197
sceCtrlReadBufferPositive is blocking and thus waits for vsync before returning a value - sceCtrlPeekBufferPositive doesn't wait for vsync and therefore is faster.
Yes, but it has a better chance of the button not even working.

@harleyg - Using while(1) is perfectly fine, as a statement such as 'break' will stop the loop just as easily. You can't necessarily 'restart' the loop as you could with a variable, but you could just make it into a function and call it again to start it over...
__________________
Old 09-09-2006, 02:06 PM   #14

My name is Mud
 
Join Date: Dec 2005
Posts: 1,538
Trader Feedback: 0
Default

Quote:
@harleyg - Using while(1) is perfectly fine, as a statement such as 'break' will stop the loop just as easily. You can't necessarily 'restart' the loop as you could with a variable, but you could just make it into a function and call it again to start it over...
It's not about that. I'm sure I've read somewhere that while(1) checks whether 1 is true on every iteration, whereas for(;;) doesn't.
Not much, but bleh.
__________________

Old 09-09-2006, 02:07 PM   #15

sceKernelExitGame();
 
Bronx's Avatar
 
Join Date: Jan 2006
Location: New York
Posts: 3,125
Trader Feedback: 0
Default

Everything is added! Great input guys, keep it up!
Old 09-09-2006, 02:23 PM   #16

...in a dream...
 
SG57's Avatar
 
Join Date: Jul 2005
Posts: 4,957
Trader Feedback: 0
Default

Any loop, you can 'break'...
Code:
for (;;) { break; }
while(1) { break; }
do { break; } while(1);
__________________
Old 09-09-2006, 02:27 PM   #17

My name is Mud
 
Join Date: Dec 2005
Posts: 1,538
Trader Feedback: 0
Default

Quote:
It's not about that,
Do you even read before you post?
__________________

Old 09-09-2006, 02:38 PM   #18

Developer
 
yaustar's Avatar
 
Join Date: Jun 2006
Location: UK
Posts: 2,317
Trader Feedback: 0
Default

The greatest optimisation? Use a profiler to find out WHERE optimisation is truly needed in a program.
Old 09-09-2006, 03:04 PM   #19

Your Fate is Grim...
 
Grimfate126's Avatar
 
Join Date: Oct 2005
Posts: 2,269
Trader Feedback: 0
Default

Quote:
Originally Posted by head_54us
The greatest optimisation? Use a profiler to find out WHERE optimisation is truly needed in a program.
how do u do that?? guess i need to
__________________
--------------------------------------------------------------------------------------
Grimfate126 is offline  
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Old 09-09-2006, 03:06 PM   #20

...in a dream...
 
SG57's Avatar
 
Join Date: Jul 2005
Posts: 4,957
Trader Feedback: 0
Default

Okkk.... Then what's it about?
__________________
Old 09-09-2006, 03:09 PM   #21

My name is Mud
 
Join Date: Dec 2005
Posts: 1,538
Trader Feedback: 0
Default

I guess you don't or can't read.
Quote:
I'm sure I've read somewhere that while(1) checks whether 1 is true on every iteration, whereas for(;;) doesn't.
__________________

Old 09-09-2006, 03:12 PM   #22

OMFG
 
Slasher's Avatar
 
Join Date: Jul 2005
Location: Toronto
Posts: 2,816
Trader Feedback: 0
Default

Quote:
Originally Posted by SG57
Okkk.... Then what's it about?
The for loop doesn't check if while(something) is true. The for loop skips that entirely and just loops continuously, just to give it an extra oomph.
Old 09-09-2006, 03:13 PM   #23
 
Join Date: Jan 2006
Posts: 4,288
Trader Feedback: 0
Default

Quote:
Originally Posted by Alexisonfire
The for loop doesn't check if while(something) is true. The for loop skips that entirely and just loops continuously, just to give it an extra oomph.
Yep, one (or more?) less instruction statements that the CPU has to compute.
Old 09-09-2006, 03:14 PM   #24

My name is Mud
 
Join Date: Dec 2005
Posts: 1,538
Trader Feedback: 0
Default

For someone who calls themself a coder/hacker, he really can't read.

But anyway, back on topic.
__________________

Old 09-09-2006, 03:16 PM   #25

Developer
 
yaustar's Avatar
 
Join Date: Jun 2006
Location: UK
Posts: 2,317
Trader Feedback: 0
Default

Quote:
Originally Posted by Grimfate126
how do u do that?? guess i need to
Under GCC, IIRC, you add -pg to the CFLAGS and recompile. When you run the executable, it will record the time that functions take to finish, the call graph, etc. This information will be in a file gmon.out (I think), which can be read using a tool like gprof.

Here is what one of mine looks like for the project I am working on:
Spoiler for gprof output:
Code:
Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls  Ts/call  Ts/call  name    
100.00      0.02     0.02                             zoomSurfaceRGBA
  0.00      0.02     0.00       10     0.00     0.00  Core::CLogger::operator<<(std::string const&)
  0.00      0.02     0.00        2     0.00     0.00  Core::CSurfaceObject::FreeSurface()
  0.00      0.02     0.00        1     0.00     0.00  global destructors keyed to _ZN4Core14CSurfaceObjectC2Ev
  0.00      0.02     0.00        1     0.00     0.00  global destructors keyed to _ZN4Core6ErrLogE
  0.00      0.02     0.00        1     0.00     0.00  __static_initialization_and_destruction_0(int, int)
  0.00      0.02     0.00        1     0.00     0.00  __static_initialization_and_destruction_0(int, int)
  0.00      0.02     0.00        1     0.00     0.00  Core::CSurfaceObject::ScaleImage(float)
  0.00      0.02     0.00        1     0.00     0.00  Core::CSurfaceObject::LoadImage(std::string const&)
  0.00      0.02     0.00        1     0.00     0.00  Core::CSurfaceObject::CSurfaceObject()
  0.00      0.02     0.00        1     0.00     0.00  Core::CSurfaceObject::~CSurfaceObject()
  0.00      0.02     0.00        1     0.00     0.00  Core::CLogger::~CLogger()

 %         the percentage of the total running time of the
time       program used by this function.

cumulative a running sum of the number of seconds accounted
 seconds   for by this function and those listed above it.

 self      the number of seconds accounted for by this
seconds    function alone.  This is the major sort for this
           listing.

calls      the number of times this function was invoked, if
           this function is profiled, else blank.
 
 self      the average number of milliseconds spent in this
ms/call    function per call, if this function is profiled,
	   else blank.

 total     the average number of milliseconds spent in this
ms/call    function and its descendents per call, if this 
	   function is profiled, else blank.

name       the name of the function.  This is the minor sort
           for this listing. The index shows the location of
	   the function in the gprof listing. If the index is
	   in parenthesis it shows where it would appear in
	   the gprof listing if it were to be printed.

		     Call graph (explanation follows)


granularity: each sample hit covers 4 byte(s) for 50.00% of 0.02 seconds

index % time    self  children    called     name
                                                 <spontaneous>
[1]    100.0    0.02    0.00                 zoomSurfaceRGBA [1]
-----------------------------------------------
                0.00    0.00       1/10          Core::CSurfaceObject::ScaleImage(float) [11]
                0.00    0.00       2/10          Core::CSurfaceObject::CSurfaceObject() [13]
                0.00    0.00       2/10          Core::CSurfaceObject::~CSurfaceObject() [14]
                0.00    0.00       2/10          Core::CSurfaceObject::LoadImage(std::string const&) [12]
                0.00    0.00       3/10          Core::CSurfaceObject::FreeSurface() [6]
[5]      0.0    0.00    0.00      10         Core::CLogger::operator<<(std::string const&) [5]
-----------------------------------------------
                0.00    0.00       1/2           Core::CSurfaceObject::~CSurfaceObject() [14]
                0.00    0.00       1/2           Core::CSurfaceObject::LoadImage(std::string const&) [12]
[6]      0.0    0.00    0.00       2         Core::CSurfaceObject::FreeSurface() [6]
                0.00    0.00       3/10          Core::CLogger::operator<<(std::string const&) [5]
-----------------------------------------------
                0.00    0.00       1/1           __do_global_dtors [1263]
[7]      0.0    0.00    0.00       1         global destructors keyed to _ZN4Core14CSurfaceObjectC2Ev [7]
                0.00    0.00       1/1           __static_initialization_and_destruction_0(int, int) [10]
-----------------------------------------------
                0.00    0.00       1/1           __do_global_dtors [1263]
[8]      0.0    0.00    0.00       1         global destructors keyed to _ZN4Core6ErrLogE [8]
                0.00    0.00       1/1           __static_initialization_and_destruction_0(int, int) [9]
-----------------------------------------------
                0.00    0.00       1/1           global destructors keyed to _ZN4Core6ErrLogE [8]
[9]      0.0    0.00    0.00       1         __static_initialization_and_destruction_0(int, int) [9]
                0.00    0.00       1/1           Core::CLogger::~CLogger() [15]
-----------------------------------------------
                0.00    0.00       1/1           global destructors keyed to _ZN4Core14CSurfaceObjectC2Ev [7]
[10]     0.0    0.00    0.00       1         __static_initialization_and_destruction_0(int, int) [10]
-----------------------------------------------
                0.00    0.00       1/1           SDL_main [58]
[11]     0.0    0.00    0.00       1         Core::CSurfaceObject::ScaleImage(float) [11]
                0.00    0.00       1/10          Core::CLogger::operator<<(std::string const&) [5]
-----------------------------------------------
                0.00    0.00       1/1           SDL_main [58]
[12]     0.0    0.00    0.00       1         Core::CSurfaceObject::LoadImage(std::string const&) [12]
                0.00    0.00       2/10          Core::CLogger::operator<<(std::string const&) [5]
                0.00    0.00       1/2           Core::CSurfaceObject::FreeSurface() [6]
-----------------------------------------------
                0.00    0.00       1/1           SDL_main [58]
[13]     0.0    0.00    0.00       1         Core::CSurfaceObject::CSurfaceObject() [13]
                0.00    0.00       2/10          Core::CLogger::operator<<(std::string const&) [5]
-----------------------------------------------
                0.00    0.00       1/1           SDL_main [58]
[14]     0.0    0.00    0.00       1         Core::CSurfaceObject::~CSurfaceObject() [14]
                0.00    0.00       2/10          Core::CLogger::operator<<(std::string const&) [5]
                0.00    0.00       1/2           Core::CSurfaceObject::FreeSurface() [6]
-----------------------------------------------
                0.00    0.00       1/1           __static_initialization_and_destruction_0(int, int) [9]
[15]     0.0    0.00    0.00       1         Core::CLogger::~CLogger() [15]
-----------------------------------------------

 This table describes the call tree of the program, and was sorted by
 the total amount of time spent in each function and its children.

 Each entry in this table consists of several lines.  The line with the
 index number at the left hand margin lists the current function.
 The lines above it list the functions that called this function,
 and the lines below it list the functions this one called.
 This line lists:
     index	A unique number given to each element of the table.
		Index numbers are sorted numerically.
		The index number is printed next to every function name so
		it is easier to look up where the function in the table.

     % time	This is the percentage of the `total' time that was spent
		in this function and its children.  Note that due to
		different viewpoints, functions excluded by options, etc,
		these numbers will NOT add up to 100%.

     self	This is the total amount of time spent in this function.

     children	This is the total amount of time propagated into this
		function by its children.

     called	This is the number of times the function was called.
		If the function called itself recursively, the number
		only includes non-recursive calls, and is followed by
		a `+' and the number of recursive calls.

     name	The name of the current function.  The index number is
		printed after it.  If the function is a member of a
		cycle, the cycle number is printed between the
		function's name and the index number.


 For the function's parents, the fields have the following meanings:

     self	This is the amount of time that was propagated directly
		from the function into this parent.

     children	This is the amount of time that was propagated from
		the function's children into this parent.

     called	This is the number of times this parent called the
		function `/' the total number of times the function
		was called.  Recursive calls to the function are not
		included in the number after the `/'.

     name	This is the name of the parent.  The parent's index
		number is printed after it.  If the parent is a
		member of a cycle, the cycle number is printed between
		the name and the index number.

 If the parents of the function cannot be determined, the word
 `<spontaneous>' is printed in the `name' field, and all the other
 fields are blank.

 For the function's children, the fields have the following meanings:

     self	This is the amount of time that was propagated directly
		from the child into the function.

     children	This is the amount of time that was propagated from the
		child's children to the function.

     called	This is the number of times the function called
		this child `/' the total number of times the child
		was called.  Recursive calls by the child are not
		listed in the number after the `/'.

     name	This is the name of the child.  The child's index
		number is printed after it.  If the child is a
		member of a cycle, the cycle number is printed
		between the name and the index number.

 If there are any cycles (circles) in the call graph, there is an
 entry for the cycle-as-a-whole.  This entry shows who called the
 cycle (as parents) and the members of the cycle (as children.)
 The `+' recursive calls entry shows the number of function calls that
 were internal to the cycle, and the calls entry for each member shows,
 for that member, how many times it was called from other members of
 the cycle.


Index by function name

   [7] global destructors keyed to _ZN4Core14CSurfaceObjectC2Ev (CSurfaceObject.cpp) [11] Core::CSurfaceObject::ScaleImage(float) [14] Core::CSurfaceObject::~CSurfaceObject()
   [8] global destructors keyed to _ZN4Core6ErrLogE (CLogger.cpp) [6] Core::CSurfaceObject::FreeSurface() [15] Core::CLogger::~CLogger()
  [10] __static_initialization_and_destruction_0(int, int) (CSurfaceObject.cpp) [12] Core::CSurfaceObject::LoadImage(std::string const&) [5] Core::CLogger::operator<<(std::string const&)
   [9] __static_initialization_and_destruction_0(int, int) (CLogger.cpp) [13] Core::CSurfaceObject::CSurfaceObject() [1] zoomSurfaceRGBA
yaustar is offline  
Old 09-09-2006, 03:18 PM   #26
 
Join Date: Jan 2006
Posts: 4,288
Trader Feedback: 0
Default

Quote:
Originally Posted by head_54us
Under GCC, IIRC, you add -pg to the CFLAGS and recompile. When you run the executable, it will record the time that functions take to finish, the call graph, etc. This information is written to a file called gmon.out (I think), which can be read using a tool like gprof.
Cool, thanks for the info.
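
For anyone who wants to try the quoted workflow, here is a rough sketch. It assumes a desktop GCC toolchain rather than the PSP one, and the file name demo.c and the functions hot_sum, fill_buffer and profile_demo are made up for illustration. Note that -pg has to be passed when linking as well as when compiling, and gmon.out is only written when the program exits normally.

```c
/* Build with profiling instrumentation (desktop GCC assumed):
 *   gcc -pg demo.c -o demo
 * Run it once; on a normal exit it writes gmon.out to the current directory:
 *   ./demo
 * Turn the recorded samples into a readable report:
 *   gprof ./demo gmon.out > profile.txt
 */
#include <assert.h>
#include <stddef.h>

/* Hypothetical hot spot: sums the buffer over and over so that it
 * dominates the flat profile. */
unsigned long hot_sum(const unsigned char *buf, size_t len, unsigned passes)
{
    unsigned long total = 0;
    assert(buf != NULL);
    for (unsigned p = 0; p < passes; ++p)
        for (size_t i = 0; i < len; ++i)
            total += buf[i];
    return total;
}

/* Hypothetical cheap function: one pass, should show up near 0%. */
void fill_buffer(unsigned char *buf, size_t len, unsigned char value)
{
    assert(buf != NULL);
    for (size_t i = 0; i < len; ++i)
        buf[i] = value;
}

/* Call this from main() (or SDL_main) so there is something to measure. */
unsigned long profile_demo(void)
{
    static unsigned char buf[4096];
    fill_buffer(buf, sizeof buf, 1);
    return hot_sum(buf, sizeof buf, 20000);
}
```

In the resulting report, hot_sum should sit at the top of the flat profile, the same way zoomSurfaceRGBA does in the output posted above. If the program runs for less than a few samples (0.01 seconds each), most entries will read 0.00, so give it enough work to do.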
soccerPMN is offline  
Old 09-09-2006, 03:19 PM   #27

...in a dream...
 
SG57's Avatar
 
Join Date: Jul 2005
Posts: 4,957
Trader Feedback: 0
Default

For someone who is working on a new firmware and is 26 on a 'kiddies' forum, they really have too much time on their hands, eh? (pedophile?)

On Topic:

Why would you want to set a variable to determine whether to loop or not? 'true' is already defined as 1, so putting '1' in the loop condition will make it loop forever. To stop it, why not just call 'break'? That doesn't take up any memory on the stack the way a variable does, and using extra memory isn't too good when optimizing code...

And what does my sudden misunderstanding of your poorly written response have to do with me being a 'coder/hacker'? Seems like you need to get back to coding that firmware as all the newly acquired fresh air you got from leaving your house is messing with your brain

@head_54us - Nice, would never have known that. I'm gonna give it a test run...
SG57 is offline  
Old 09-09-2006, 03:21 PM   #28

My name is Mud
 
Join Date: Dec 2005
Posts: 1,538
Trader Feedback: 0
Default

Reported to moderator.
hàrléyg² is offline  
Old 09-09-2006, 03:24 PM   #29

...in a dream...
 
SG57's Avatar
 
Join Date: Jul 2005
Posts: 4,957
Trader Feedback: 0
Default

Quit going off topic, Harleyg, you're ruining this thread. Oh, and nice response, really backs yourself up.

On Topic:

@head_54us - Know any other helpful things for optimization? Seems like you've done your homework.
SG57 is offline  
Old 09-09-2006, 03:24 PM   #30

sceKernelExitGame();
 
Bronx's Avatar
 
Join Date: Jan 2006
Location: New York
Posts: 3,125
Trader Feedback: 0
Default

Quote:
Originally Posted by SG57
For someone who is working on a new firmware and is 26 on a 'kiddies' forum, they really have too much time on their hands, eh? (pedophile?)

On Topic:

Why would you want to set a variable to determine whether to loop or not? 'true' is already defined as 1, so putting '1' in the loop condition will make it loop forever. To stop it, why not just call 'break'? That doesn't take up any memory on the stack the way a variable does, and using extra memory isn't too good when optimizing code...

And what does my sudden misunderstanding of your poorly written response have to do with me being a 'coder/hacker'? Seems like you need to get back to coding that firmware as all the newly acquired fresh air you got from leaving your house is messing with your brain

@head_54us - Nice, would never have known that. I'm gonna give it a test run...
We weren't talking about using break at all... Harley was suggesting a more efficient way to run a loop that goes on forever. And as slasher suggested, the for (;;) form has no condition for the CPU to check every time around, therefore providing an extra "oomph".
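
On the flag-versus-break question in the quote, here is a small sketch of the difference. The names FrameStats, begin_frame, end_frame, run_with_flag and run_with_break are all hypothetical; the two frame functions are only stand-ins for starting and ending a display list, not real PSP calls.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for "start display list" / "end display list". */
typedef struct { int begun; int ended; } FrameStats;

static void begin_frame(FrameStats *s) { s->begun++; }
static void end_frame(FrameStats *s)   { s->ended++; }

/* Flag-controlled loop: the flag is only tested at the top, so every
 * begin_frame() is matched by an end_frame() before the loop exits. */
FrameStats run_with_flag(int frames_until_exit)
{
    FrameStats s = {0, 0};
    bool exit_loop = false;
    assert(frames_until_exit > 0);
    while (!exit_loop) {
        begin_frame(&s);
        if (s.begun == frames_until_exit)
            exit_loop = true;   /* request exit, but finish this frame */
        end_frame(&s);
    }
    return s;
}

/* break in the middle of the body: leaves the loop at once and skips
 * end_frame() for the last frame. */
FrameStats run_with_break(int frames_until_exit)
{
    FrameStats s = {0, 0};
    assert(frames_until_exit > 0);
    for (;;) {
        begin_frame(&s);
        if (s.begun == frames_until_exit)
            break;
        end_frame(&s);
    }
    return s;
}
```

With three frames, run_with_flag ends up with three begins and three ends, while run_with_break ends up with three begins and only two ends. A break is fine as long as it sits after the cleanup step; the flag just makes that harder to get wrong. Which form is actually faster is something to check with the profiler rather than assume.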
Bronx is offline  
Copyright © 2009, QJ.NET. All Rights Reserved.