GCC Myths and Facts

tram · posted 2003-6-1 16:27:22

GCC Myths and Facts (some notes on GCC 3 that are very useful for compiler optimization)
by Joao Seabra, in Editorials - Saturday, February 15th 2003 00:00 PDT

Since my good old Pentium 166 days, I've liked to search for the best optimizations possible so programs can take the maximum advantage of hardware/CPU cycles. If I have a nice piece of hardware, why not run it at its full power, using every little feature? Shouldn't we all try to get the best results from the money invested in our machines?

Copyright notice: All reader-contributed material on freshmeat.net is the property and responsibility of its author; for reprint rights, please contact the author directly.

This article is written for the average desktop Linux user and with the x86 architecture and C/C++ in mind, but some of its content can be applied to all architectures and languages.
GCC 3 Improvements
GCC 3 is the biggest step forward since GCC 2 and represents more than ten years of work and two of hard development. It has major benefits over its predecessor, including:
Target Improvements

* A new x86 backend, generating much-improved code.
* Support for a generic i386-elf target.
* A new option to emit x86 assembly code using an Intel-style syntax.
* Better code generated for floating point-to-integer conversions, leading to better performance by many 3D applications.

Language Improvements

* A new C++ ABI. On the IA-64 platform, GCC is capable of interoperating with other IA-64 compilers.
* A significant reduction in the size of symbol and debugging information (thanks to the new ABI).
* A new C++ support library and many C++ bugfixes, vastly improving conformance to the ISO C++ standard.
* A new inliner for C++.
* A rewritten C preprocessor, integrated into the C, C++, and Objective C compilers, with many improvements, including ISO C99 support and improvements to dependency generation.

General Optimizations

* Infrastructure for profile-driven optimizations.
* Support for data prefetching.
* Support for SSE, SSE2, 3DNOW!, and MMX instructions.
* A basic block reordering pass.
* New tail call and sibling call elimination optimizations.

Why do some programmers and users fail to take advantage of these amazing new features? I admit that some of them are still "experimental", but not all of them. Perhaps the PGCC (Pentium compiler group) project gave rise to several misunderstandings which persist today. (PGCC offered several Pentium-specific optimizations. I looked at it when it first started, but benchmarks showed that the improvement was only about 2%-5% over GCC 2.7.2.3.)

We should clear the air about the GCC misconceptions. Let's start with the most loved and hated optimization: -Ox.
Myths
I use -O69 because it is faster than -O3.

This is wrong!

The highest optimization is -O3.

From the GCC 3.2.1 manual:

   -O3 Optimize yet more.  -O3 turns on all optimizations
            specified by -O2 and also turns  on  the
            -finline-functions and -frename-registers options.

The most skeptical can verify this in gcc/toplev.c:

/* Scan to see what optimization level has been specified.
   That will determine the default value of many flags. */

-snip-

  if (optimize >= 3)
    {
      flag_inline_functions = 1;
      flag_rename_registers = 1;
    }

If you are using GCC, there's no point in using anything higher than 3.
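To make the point concrete, here is a minimal sketch in C of why -O69 cannot do more than -O3. It is a simplified model of the test quoted above, not GCC's actual source: the struct and the function names flags_for and same_flags are hypothetical, and only the two -O3 flags are modelled.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for two flag variables from toplev.c. */
struct opt_flags {
    int inline_functions;   /* models flag_inline_functions */
    int rename_registers;   /* models flag_rename_registers */
};

/* Read the number after "-O" and derive the flags with a ">=" test,
   the same shape as the snippet quoted above. */
struct opt_flags flags_for(const char *arg)
{
    struct opt_flags f = {0, 0};
    int optimize = atoi(arg + 2);   /* skip the leading "-O" */

    if (optimize >= 3) {
        f.inline_functions = 1;
        f.rename_registers = 1;
    }
    return f;
}

/* Returns 1 when both options end up with identical flags. */
int same_flags(const char *a, const char *b)
{
    struct opt_flags fa = flags_for(a);
    struct opt_flags fb = flags_for(b);

    return fa.inline_functions == fb.inline_functions
        && fa.rename_registers == fb.rename_registers;
}
```

Under this model, same_flags("-O3", "-O69") is true and same_flags("-O2", "-O3") is false: because the level is only ever compared with ">=", any number above 3 selects exactly what 3 selects.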
-O2 turns on loop unrolling.

In the GCC manpage, it's clearly written that:

-O2 turns on all optional optimizations except for loop unrolling [...]

Skeptics: check toplev.c.

So when you use -O2, which optimizations are you using?

The -O2 flag turns on the following flags:

* -O1, which turns on:
      o defer pop (see -fno-defer-pop)
      o -fthread-jumps
      o -fdelayed-branch (on, but specific machines may handle it differently)
      o -fomit-frame-pointer (only on if the machine can debug without a frame pointer; otherwise, you need to specify)
      o guess-branch-prob (see -fno-guess-branch-prob)
      o cprop-registers (see -fno-cprop-registers)
* -foptimize-sibling-calls
* -fcse-follow-jumps
* -fcse-skip-blocks
* -fgcse
* -fexpensive-optimizations
* -fstrength-reduce
* -frerun-cse-after-loop
* -frerun-loop-opt
* -fcaller-saves
* -fforce-mem
* peephole2 (a machine-dependent option; see -fno-peephole2)
* -fschedule-insns (if supported by the target machine)
* -fregmove
* -fstrict-aliasing
* -fdelete-null-pointer-checks
* reorder blocks

There's no point in using -O2 -fstrength-reduce, etc., since -O2 already implies all of these.
Facts
The truth about -O*

This leaves us with -O3, which is the same as -O2 and:

* -finline-functions
* -frename-registers

Inline-functions is useful in some cases (mainly with C++) because it lets you define the size of inlined functions (600 by default) with -finline-limit. Unfortunately, if you set a high number, at compile time you will probably get an error complaining about lack of memory. This option needs a huge amount of memory, takes more time to compile, and makes the binary big. Sometimes, you can see a profit, and sometimes, you can't.

Rename-registers attempts to avoid false dependencies in scheduled code by making use of registers left over after register allocation. This optimization will most benefit processors with lots of registers. It can, however, make debugging impossible, since variables will no longer stay in a "home register". Since i386 is not a register-rich architecture, I don't think this will have much impact.

A higher -O does not always mean improved performance. -O3 increases the code size and may introduce cache penalties and become slower than -O2. However, -O2 is almost always faster than -O.
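If you want to see what -finline-functions does on your own machine, a tiny test case is enough. The sketch below is only an illustration; the names square and sum_of_squares are made up for it. Compile it once with "gcc -O2 -S" and once with "gcc -O3 -S" and look for a remaining call to square in the generated assembly. Whether the call disappears depends on your GCC version and target, so treat the result as something to check, not something guaranteed.

```c
/* A small helper: the kind of function -finline-functions may
   decide to integrate into its callers. */
int square(int x)
{
    return x * x;
}

/* Calls square() in a loop, so an inlined copy would save one
   call per iteration at the cost of slightly larger code. */
int sum_of_squares(int n)
{
    int i;
    int total = 0;

    for (i = 1; i <= n; i++)
        total += square(i);
    return total;
}
```

Multiply that trade-off by every small function in a large C++ program and you get the bigger binaries and longer compile times mentioned above.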
-march and -mcpu

With GCC 3, you can specify the type of processor you're using with -march or -mcpu. Although they seem the same, they're not, since one specifies the architecture, and the other the CPU. The available options are:

* i386
* i486
* i586
* i686

* pentium
* pentium-mmx
* pentiumpro
* pentium2
* pentium3
* pentium4

* k6
* k6-2
* k6-3

* athlon
* athlon-tbird
* athlon-4
* athlon-xp
* athlon-mp

-march implies -mcpu, so when you use -march, there's no need to use -mcpu.

-mcpu generates code tuned for the specified CPU, but it does not alter the ABI or the set of available instructions, so you can still run the resulting binary on other CPUs.

When you use -march, you generate code for the specified machine type, and its available instructions will be used (it turns on flags like mmx/3dnow, etc.), which means that you probably cannot run the binary on other machine types.
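One way to see which instruction sets a given -march value actually switched on is to ask the preprocessor. The sketch below relies on the macros __MMX__, __SSE__, __SSE2__, and __3dNOW__, which GCC predefines on x86 when the matching instruction set is enabled; that your particular GCC 3.x release defines all of them is an assumption, and "gcc -dM -E" on an empty file will list what your compiler really defines. The function name enabled_isa is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Fills buf with a space-separated list of the instruction sets the
   compiler enabled for this file and returns buf.  size must be at
   least 1.  On a non-x86 target the list is simply empty. */
const char *enabled_isa(char *buf, size_t size)
{
    buf[0] = '\0';
#ifdef __MMX__
    strncat(buf, "mmx ", size - strlen(buf) - 1);
#endif
#ifdef __SSE__
    strncat(buf, "sse ", size - strlen(buf) - 1);
#endif
#ifdef __SSE2__
    strncat(buf, "sse2 ", size - strlen(buf) - 1);
#endif
#ifdef __3dNOW__
    strncat(buf, "3dnow ", size - strlen(buf) - 1);
#endif
    return buf;
}
```

Build the same file with different -march values and print the result. A list that grows with a newer -march is exactly why the binary may stop running on older CPUs.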
Conclusion

Fine-tune your Makefile, remove those redundant options, and take a look at the GCC manpage. I bet you will save yourself a lot of time. There's probably a bug somewhere that can be smashed by turning off some of GCC's default flags.

This article discusses only a few of GCC's features, but I won't broaden its scope. I just want to try to clarify some of the myths and misunderstandings. There's a lot left to say, but nothing that can't be found in the Fine Manual, HOWTOs, or around the Internet. If you have patience, a look at the GCC sources can be very rewarding.

When you're coding a program, you'll inevitably run into bugs. Occasionally, you'll find one that's GCC's fault. When you do, stop to think about the time and effort that's gone into the compiler project and all that it's given you. You might think twice before simply flaming GCC.
Interesting Links

* http://www.gnu.org/software/gcc/
* http://gcc.gnu.org/onlinedocs/gcc-3.2/gcc/
* http://gcc.gnu.org/onlinedocs/gcc/Gcov-and-Optimization.html
* http://www.redhat.com/software/gnupro/technical/gnupro_gcc.html
* http://www.freshmeat.net/projects/prelink/
* http://www.tldp.org/HOWTO/GCC-HOWTO/index.html (last updated in May of 1999)

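realhyg · posted 2003-6-1 17:27:26

How do I install gcc 3.3? After I installed it with Slackware's build script, it told me to edit some *.la files. I didn't know how to edit them, so I left them alone, and later I ran into problems compiling glibc 2.3.2.
This is Slackware's notice about gcc 3.3: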

cat << EOF

************************************************
* OK, now you must edit the *.la files to make *
* sure libtool in it's infinate wisdom has not *
* inserted lots of $TMP paths into the       *
* dependancy_libs section.  I've verified that *
* it's not the "make install prefix=/...."    *
* (or now, the "make install DESTDIR=$PKG1") *
* causing this -- it's libtool failing to have *
* a clue.  It's possible that a sed script    *
* could also strip this garbage out, but I    *
* trust doing it myself more.  YMMV.          *
************************************************

NOTE:  failure to edit the .la files with produce
   no ill effects, but looks sloppy.

Edit the .la files in the package-gcc-*
directories in another console now (or don't),
and then press ENTER to build the .tgz packages.

EOF
echo -n "Hit Enter to build final packages -->"
read inputjunk;

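tram · posted 2003-6-1 20:03:26

I recommend not installing gcc-3.3 for now. It changed quite a lot from the gcc-3.2 series and will leave a lot of software unable to compile. gcc-3.2.3 works just fine, heh.
Installation guide: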
http://lfs.cosoft.org.cn/lfscvs/chapter06/gcc.html

realhyg · posted 2003-6-1 23:24:34

"If you change this package's default optimization flags (including -march and -mcpu), it will behave very badly. It is best not to optimize this package."
Why??

The gcc-3.2.3 page carries the same note.
Even without gcc 3.3, my build of glibc 2.3.2 did not succeed either.

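tram · posted 2003-6-2 02:23:03

Better not to upgrade glibc; you will get a lot of segfaults.
As for installing gcc-3.3, where did the install script you used come from?
"If you change this package's default optimization flags (including -march and -mcpu), it will behave very badly. It is best not to optimize this package."
Why??
Try it and you'll know.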

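realhyg · posted 2003-6-2 02:38:44

I downloaded it from the Slackware website. Slackware itself uses these scripts to build its .tgz packages.
Actually, I find Slackware easier to upgrade than Red Hat; with these scripts alone I upgraded to qt 3.1.2 and kde 3.1.2.
Have a look at its script for building the gcc 3.3 .tgz package (I added the ".txt" so I could upload it):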

realhyg · posted 2003-6-2 02:40:38

I can't understand the English notice it prints. Could moderator tram explain it?

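tram · posted 2003-6-2 15:27:31

The editing of .la files it mentions is there so the package works well with libtool. libtool is what works out library dependencies at build time. If you don't edit the .la files, nothing bad happens, because it will call gcc to find the dependencies. Heh, that is not why your glibc build failed.
On the LFS mailing list someone did compile glibc with gcc-3.3, but a few things have to be changed. For details see: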
http://archive.linuxfromscratch. ... v/2003/05/1191.html
There is one more thread that isn't compressed, so I'll paste it here:

Hi,

I tried to compile glibc 2.3.2 with gcc 3.3 and I got this:

...
sscanf.c:31: warning: conflicting types for built-in function `sscanf'
sscanf.c: In function `sscanf':
sscanf.c:37: error: `va_start' used in function with fixed args
make[2]: *** [/root/glibc-build/stdio-common/sscanf.o] Error 1
make[2]: Leaving directory `/root/glibc-2.3.2/stdio-common'
make[1]: *** [stdio-common/subdir_lib] Error 2
make[1]: Leaving directory `/root/glibc-2.3.2'
make: *** [all] Error 2

When I compile with gcc 3.2.3 all is ok.

Anyone can help me ?

--
Florian

Florian Fernandez wrote:

> Hi,
>
> I tried to compile glibc 2.3.2 with gcc 3.3 and I got this:
>
> ...
> sscanf.c:31: warning: conflicting types for built-in function `sscanf'
> sscanf.c: In function `sscanf':
> sscanf.c:37: error: `va_start' used in function with fixed args
> make[2]: *** [/root/glibc-build/stdio-common/sscanf.o] Error 1
> make[2]: Leaving directory `/root/glibc-2.3.2/stdio-common'
> make[1]: *** [stdio-common/subdir_lib] Error 2
> make[1]: Leaving directory `/root/glibc-2.3.2'
> make: *** [all] Error 2
>
> When I compile with gcc 3.2.3 all is ok.
>
> Anyone can help me ?
>

The only workaround I found is:
in the file usr/lib/gcc-lib/i686-pc-linux-gnu/3.3/include/stdarg.h
comment out line 50:

//#define va_start(v,l) __builtin_va_start(v,l)

Does anyone have another idea?

Florian Fernandez wrote:
> Hi,
>
> I tried to compile glibc 2.3.2 with gcc 3.3 and I got this:
>
> ...
> sscanf.c:31: warning: conflicting types for built-in function `sscanf'
> sscanf.c: In function `sscanf':
> sscanf.c:37: error: `va_start' used in function with fixed args
> make[2]: *** [/root/glibc-build/stdio-common/sscanf.o] Error 1
> make[2]: Leaving directory `/root/glibc-2.3.2/stdio-common'
> make[1]: *** [stdio-common/subdir_lib] Error 2
> make[1]: Leaving directory `/root/glibc-2.3.2'
> make: *** [all] Error 2
>
> When I compile with gcc 3.2.3 all is ok.
>
> Anyone can help me ?
>

glibc-2.3.2-sscanf.patch

--- stdio-common/sscanf.c.~1.8.~ 2003-01-16 11:25:20.000000000 +0100
+++ stdio-common/sscanf.c 2003-03-05 12:07:34.000000000 +0100
@@ -1,4 +1,4 @@
-/* Copyright (C) 1991,95,96,98,2002 Free Software Foundation, Inc.
+/* Copyright (C) 1991,95,96,98,2002, 2003 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
 
    The GNU C Library is free software; you can redistribute it and/or
@@ -27,9 +27,7 @@
 /* Read formatted input from S, according to the format string FORMAT.  */
 /* VARARGS2 */
 int
-sscanf (s, format)
-    const char *s;
-    const char *format;
+sscanf (const char *s, const char *format, ...)
 {
   va_list arg;
   int done;

--
Confucius:  He who play in root, eventually kill tree.
Registered with The Linux Counter.  http://counter.li.org/
Slackware 9.0 Kernel 2.4.20 i686 (GCC) 3.3
Uptime: 13 days, 21:49, 1 user, load average: 1.35, 1.25, 1.34
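For readers wondering why this patch is the right fix: the old K&R-style definition "sscanf (s, format)" declares exactly two fixed arguments, so gcc 3.3 rejects the va_start inside it with the "used in function with fixed args" error shown in the build log. Adding "..." to the definition makes the function variadic. The sketch below shows the same pattern in isolation; sum_ints is a hypothetical function made up for this illustration, not part of glibc.

```c
#include <stdarg.h>

/* Sums `count` int arguments.  The "..." in the parameter list is
   what makes the va_start call below legal. */
int sum_ints(int count, ...)
{
    va_list ap;
    int i;
    int total = 0;

    va_start(ap, count);          /* count is the last named parameter */
    for (i = 0; i < count; i++)
        total += va_arg(ap, int);
    va_end(ap);
    return total;
}
```

Remove the "..." from the definition and gcc refuses to compile the va_start line, which is the situation the unpatched sscanf.c was in.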

David wrote:

> You'll also find that it fails during the "test-lfs-timeout" and
> can use this patch for that.
>
> glibc-2.3.2-test-lfs-timeout.patch
>
> --- ./io/test-lfs.c~  Fri Feb  9 18:04:07 2001
> +++ ./io/test-lfs.c Sat Feb 17 04:30:18 2001
> @@ -34,7 +34,7 @@
> #define PREPARE do_prepare
>
> /* We might need a bit longer timeout.  */
> -#define TIMEOUT 20 /* sec */
> +#define TIMEOUT 120 /* sec */
>
> /* This defines the `main' function and some more.  */
> #include <test-skeleton.c>
>

Great, thanks David. Explicit and fast answer!

Finally, let me paste a passage from the LFS FAQ:

gcc-3.3: Don't use this yet unless you're a developer and can fix errors because it breaks many packages.
