Short:        NCC - C source program analysis tool for GCC
Author:       sxanth@ceid.upatras.gr
Type:         dev/gg
Architecture: m68k-amigaos
Uploaded:     louise@louise.amiga.hu (LouiSe)
Url:          http://students.ceid.upatras.gr/~sxanth/ncc/

Ported by LouiSe
more info and other AMIGA ports at: http://louise.amiga.hu

-----------------------------------------

ncc 0.5

WHATIS
======

Basically, ncc is a tool for hackers designed to provide program analysis data of C source code, that is, program flow and usage of variables. Some big programs out there are by default obfuscated, whether due to extreme size, programming style, hacks upon hacks or other craziness.

In order to do program analysis correctly, expressions have to be compiled, and thus ncc is really a compiler (supporting zero architectures). At the same time, ncc is small and easy to understand, so you can hack it and add custom features and extensions at any stage of the compilation to match what you expect and consider useful as output.

Most common GNU extensions are supported, and there has been an effort to be practically useful in the GNU system (which is not easy because the GNU system is very gcc-friendly). The goal is to be able to replace 'gcc' with 'ncc' in Makefiles and work with the big open source projects.

INSTALL
=======

Run 'make' and copy the file doc/nognu to /usr/include. This file is used to fix some madness of the libc header files and to remove some GNU extensions which violate the C grammar and can be removed without problems. If you don't want to copy it to /usr/include, edit config.h and recompile.

USAGE
=====

ncc uses gcc for preprocessing, because the standard library headers eventually need other architecture-specific headers which live in places only gcc knows about. Any options starting with -D and -I will be passed to gcc for preprocessing.
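
As an illustration of the option handling described above, here is a minimal shell sketch of picking the -D and -I options out of a command line. It is not taken from ncc's source: the function name cpp_opts and the sample arguments are made up for this example, and how ncc really selects the options is an assumption.

```shell
# Hypothetical helper, not part of ncc: keep only the options that
# start with -D or -I, i.e. the ones the text says go to gcc for
# preprocessing. Everything else is ignored here.
cpp_opts() {
    out=""
    for arg in "$@"; do
        case "$arg" in
            -D*|-I*) out="$out $arg" ;;
        esac
    done
    # Drop the leading space left by the loop before printing.
    printf '%s\n' "${out# }"
}

# Sample command line (made-up file and macro names).
cpp_opts -O2 -DDEBUG=1 -Iinclude -ncoo main.c
# prints: -DDEBUG=1 -Iinclude
```

The -O2 and -ncoo options and the file name are filtered out; only the two preprocessing options survive.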
Generally, because ncc should be able to work from makefiles in place of gcc, options that do not start with '-nc' never produce an error (and may even be passed to gcc in a special mode). The files compiled with ncc will have the __GCC__ macro defined, because many programs are written for gcc and take some gcc extensions for granted. ncc additionally defines the __NCC__ macro.

The default output (on stdout) is a report of function calls, use of global variables and use of members of structures.

 with "-ncmv" each use of a global variable or member of a structure is reported every time it occurs, so the same name can appear multiple times. This is a way to understand better how the code works, by looking at the use of variables between function calls.

 with "-nc2dm" the output is suitable for the 2dmap viewer and includes only the function calls.

 with "-nchelp" help is displayed on the -nc options.

 with "-ncoo" the output goes to a file sourcefile.c.nccout

HACKING mpg123
==============

This one is easy, because it's done "the right way". (Programs are exponential: the number of tasks a program can do is N^2 if N is the number of lines of code. Thus any program of more than 50000 lines probably has design flaws, unless it's device drivers.)

Anyway, to view the calls of mpg123, the commands are:

    for a in *.c; do ncc "$a" -ncoo -nc2dm; done
    cat *.nccout > code.map
    2dmap

HACKING LARGER PROGRAMS
=======================

The obvious way is to use make with ncc, so that the required -D and -I options are passed and only the right files are compiled (if there are dependencies). Normally, changing "CC=gcc" to "CC=ncc -ncoo" would be enough. But alas, often it isn't, and you have to devise ways to hack the Makefiles or think of other tricks to get the job done. Sometimes the make procedure expects object files, which ncc does not produce, and it may fail. Other programs even compile and run helpers during the make procedure.

If all else fails, the last resort that always works is using the "-ncgcc" option.
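
The last-resort setup can be rehearsed with a dry run. The sketch below builds a throwaway Makefile (hypothetical, as are the file names) and overrides CC on the make command line; make -n only prints the commands, so neither gcc nor ncc has to be installed. Combining -ncoo with -ncgcc in one CC value is an assumption here, not something the text spells out.

```shell
# Throwaway project: a one-rule Makefile that defaults to gcc.
demo_dir=$(mktemp -d)
printf 'CC=gcc\nhello.o: hello.c\n\t$(CC) -c hello.c\n' > "$demo_dir/Makefile"
: > "$demo_dir/hello.c"

# A variable given on the make command line overrides the assignment
# inside the Makefile, so the Makefile itself stays untouched.
# -n prints the commands instead of running them.
plan=$(cd "$demo_dir" && make -n CC="ncc -ncoo -ncgcc")
printf '%s\n' "$plan"

rm -rf "$demo_dir"
```

If the printed command line looks right, dropping -n runs the real build with the same override.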

With "-ncgcc", ncc will also run gcc in parallel with all its options except the -nc ones. So nobody will notice that ncc was even run, and the makefiles will be happy. It takes 1000% more time, but computers do get faster every day. In this case, it is generally a good idea to remove any '-O2 -g' options.

BYTECODE
========

With "-nccc", the output is some bytecode for the expressions. In this mode ncc does full syntax and semantics checks, unlike the other modes, which had better be used with sources known to be correct. The output is definitely incomplete and of little use, but it's fun to look at.

A tip is that variables are taken one level down: the '&' operator disappears and an extra '*' goes in front of variables. The enlightening example is:

---------------------------------
int **pp, *pa [10], a [10][10];
pp[1][1];
pa[1][1];
a[1][1];
---------------------------------

BTW, since in C &a == a == &a[0] for an array 'a', ncc supports &(&(&(&a))) == a, which is mathematically and logically correct, as the address-of operator may have no effect and still be valid (for pointer operands that are not lvalues).

TROUBLESHOOTING
===============

As this is the first release of ncc, braindead bugs are surely still in here. However, thanks to open source, there are infinite test cases. ncc has been tested with:

    linux kernel (partial according to depend), Imagemagick, gcc,
    xanim, mpg123, bladeenc, bzip2, gtk, gnu-fileutils, less,
    mpeg_play, nasm, ncftp, vim, sox, bind, gdb

However, these programs are correct, and ncc has had little testing on reporting errors in wrong programs.

Also read the file doc/TROUBLES

TODO
====

- GNU statements in expressions are parsed, but the return type is not saved and is taken to be int. That of course is wrong, but since there have been no problems during testing so far, it stays for the moment. Will be fixed.

- The bytecode can be: optimized, turned into architecture assembly, run with an interpreter, etc.
  Bytecode for the statements will be added if there is interest in any of the above.

- It is easy to implement parsing of structures lazily, when the lookup for a member is done for the first time. That would save both space and time, as most structures declared in header files are never used. But there is no reason to get paranoid about optimization. The major slowdown factor is having to use -ncgcc, after all.

- Maybe get into C++.

THEREST
=======

Program written by Stelios Xanthakis.
e-mail: sxanth@ceid.upatras.gr

ncc latest download:
    http://students.ceid.upatras.gr/~sxanth/ncc/

Check out:
    http://students.ceid.upatras.gr/~sxanth/PP/
for the solution to symmetrical cryptography.