alexey.zaytsev | 3 Sep 23:52

[PATCH 0/10] Sparse linker

Hello.

I've been working on a "sparse linker" this summer as my Google
Summer of Code project. I wasn't nearly as productive as I had
hoped, but I've got some results that I would like to share.
Moreover, I plan to continue this work, and would like to hear
comments on what has been done so far.

The design didn't change much from what was proposed. We run
sparse to generate a "sparse object" file containing a list of
symbols, then run the "linker" to unite those object files into
bigger ones. This way, in the end we get a file containing all
the global symbols appearing in the program. After learning
more about the subject, I now agree that we should include the
intermediate code representation in the object files.
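
To make the "unite object files into bigger ones" step concrete, here
is a minimal sketch. It is not code from the patches: struct
sparse_object, link_objects() and find_symbol() are hypothetical names,
and a "sparse object" is reduced to a plain array of global symbol
names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a "sparse object": a list of global symbol names. */
struct sparse_object {
	const char **symbols;
	size_t nr_symbols;
};

/*
 * Unite two objects into a bigger one that holds the symbols of both.
 * Returns 0 on success, -1 if the allocation fails.
 */
int link_objects(const struct sparse_object *a,
		 const struct sparse_object *b,
		 struct sparse_object *out)
{
	size_t total = a->nr_symbols + b->nr_symbols;
	const char **merged;

	assert(out != NULL);
	merged = malloc((total ? total : 1) * sizeof(*merged));
	if (!merged)
		return -1;
	if (a->nr_symbols)
		memcpy(merged, a->symbols, a->nr_symbols * sizeof(*merged));
	if (b->nr_symbols)
		memcpy(merged + a->nr_symbols, b->symbols,
		       b->nr_symbols * sizeof(*merged));
	out->symbols = merged;
	out->nr_symbols = total;
	return 0;
}

/* Look a global symbol up by name; returns its index, or -1 if absent. */
long find_symbol(const struct sparse_object *obj, const char *name)
{
	size_t i;

	for (i = 0; i < obj->nr_symbols; i++)
		if (strcmp(obj->symbols[i], name) == 0)
			return (long)i;
	return -1;
}
```

Repeating link_objects() over every object in a build would leave one
object listing all global symbols of the program, which is the end
state described above.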

The implementation is built around a generic serialization
mechanism [PATCH 01]. It handles many sorts of complex data
structures, with pointers, cycles, unions, etc. For example, it
is able to serialize beasts like the sparse pointer lists. The
price for this is a four-byte overhead prepended to every
serializable structure by the allocation wrapper. Also, you
have to use a macro when declaring a serializable structure
(or an array of such) statically. One limitation I was unable
to overcome is the inability to work with structures used both
stand-alone and embedded into bigger ones. Luckily, we have no
such cases in the sparse codebase. The serializer produces C
code containing the data structures being serialized. For the
structure definitions, the generated code includes the original
headers that define the structures. After serializing a bunch of
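
As an illustration of an allocation wrapper that prepends a four-byte
header to each structure, here is a rough sketch. It is not the code
from PATCH 01: wrap_alloc(), wrap_type() and wrap_free() are
hypothetical names, and it is only an assumption that the header would
hold a type tag. The sketch also pads the allocation so the payload
stays aligned, so its real memory cost is more than four bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Padding in front of the payload, sized to keep the payload aligned. */
union wrap_align {
	long double ld;
	long long ll;
	void *p;
};
#define WRAP_PAD sizeof(union wrap_align)

/*
 * Allocate `size` zeroed bytes and store a four-byte type tag (an
 * assumed use of the header) directly in front of the returned pointer.
 */
void *wrap_alloc(uint32_t type_id, size_t size)
{
	unsigned char *base, *payload;

	assert(WRAP_PAD >= sizeof(type_id));
	base = malloc(WRAP_PAD + size);
	if (!base)
		return NULL;
	payload = base + WRAP_PAD;
	memcpy(payload - sizeof(type_id), &type_id, sizeof(type_id));
	memset(payload, 0, size);
	return payload;
}

/* Read the tag back from the four bytes preceding the payload. */
uint32_t wrap_type(const void *payload)
{
	uint32_t type_id;

	memcpy(&type_id, (const unsigned char *)payload - sizeof(type_id),
	       sizeof(type_id));
	return type_id;
}

/* Release a block obtained from wrap_alloc(). */
void wrap_free(void *payload)
{
	if (payload)
		free((unsigned char *)payload - WRAP_PAD);
}
```

With a tag in front of every block, a serializer that follows a pointer
can tell what kind of structure it has reached without extra
bookkeeping, which is one plausible reason for paying the per-structure
overhead.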

alexey.zaytsev | 3 Sep 23:55

[PATCH 01/10] Serialization engine

From: Alexey Zaytsev <alexey.zaytsev <at> gmail.com>

Signed-off-by: Alexey Zaytsev <alexey.zaytsev <at> gmail.com>
---
 Makefile             |   11 ++-
 serialization-test.c |  120 +++++++++++++++++++++++++
 serialization-test.h |   32 +++++++
 serialization.c      |   99 +++++++++++++++++++++
 serialization.h      |  240 ++++++++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 499 insertions(+), 3 deletions(-)
 create mode 100644 serialization-test.c
 create mode 100644 serialization-test.h
 create mode 100644 serialization.c
 create mode 100644 serialization.h

diff --git a/Makefile b/Makefile
index 077003c..721979e 100644
--- a/Makefile
+++ b/Makefile
@@ -27,7 +27,7 @@ INCLUDEDIR=$(PREFIX)/include
 PKGCONFIGDIR=$(LIBDIR)/pkgconfig

 PROGRAMS=test-lexing test-parsing obfuscate compile graph sparse test-linearize example \
-	 test-unssa test-dissect ctags
+	 test-unssa test-dissect ctags serialization-test

 
 INST_PROGRAMS=sparse cgcc
@@ -40,12 +40,13 @@ endif


alexey.zaytsev | 3 Sep 23:55

[PATCH 02/10] Handle -emit_code and the -o file options.

From: Alexey Zaytsev <alexey.zaytsev <at> gmail.com>

Signed-off-by: Alexey Zaytsev <alexey.zaytsev <at> gmail.com>
---
 lib.c |   19 ++++++++++++++++---
 lib.h |    3 +++
 2 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/lib.c b/lib.c
index e274750..edb8ac9 100644
--- a/lib.c
+++ b/lib.c
@@ -214,6 +214,8 @@ int dbg_entry = 0;
 int dbg_dead = 0;

 int preprocess_only;
+int emit_code = 0;
+const char *output_file = "a.out";

 static enum { STANDARD_C89,
               STANDARD_C94,
@@ -347,11 +349,14 @@ static char **handle_switch_m(char *arg, char **next)

 static char **handle_switch_o(char *arg, char **next)
 {
-	if (!strcmp (arg, "o")) {       // "-o foo"
-		if (!*++next)
+	if (!strcmp (arg, "o")) { 	// "-o foo"
+		next++;
+		if (!*next)
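
For readers without the full hunk, the "-o foo" handling can be
sketched roughly as follows. This is not the patched
handle_switch_o(): parse_output_switch() is a hypothetical helper, and
accepting the joined "-ofoo" form is an assumption about the intended
behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * *argp points at an "-o" or "-ofoo" argument. Returns the output file
 * name, or NULL if "-o" is the last argument. For the separate form
 * ("-o foo"), *argp is advanced to the file name so that the caller
 * skips it.
 */
const char *parse_output_switch(char ***argp)
{
	char **next = *argp;
	const char *arg = *next;

	assert(arg && arg[0] == '-' && arg[1] == 'o');
	if (arg[2] != '\0')
		return arg + 2;		/* "-ofoo": name follows the switch */
	next++;				/* "-o foo": name is the next argument */
	if (!*next)
		return NULL;		/* argument to "-o" is missing */
	*argp = next;
	return *next;
}
```

A caller would typically store the result in something like the
output_file variable added by this patch and report an error when NULL
comes back.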

alexey.zaytsev | 3 Sep 23:55

[PATCH 03/10] Check stdin if no input files given, like cc1.

From: Alexey Zaytsev <alexey.zaytsev <at> gmail.com>

Signed-off-by: Alexey Zaytsev <alexey.zaytsev <at> gmail.com>
---
 lib.c    |   42 +++++++++++++++++++++---------------------
 lib.h    |    6 +++---
 sparse.c |    3 +++
 3 files changed, 27 insertions(+), 24 deletions(-)

diff --git a/lib.c b/lib.c
index edb8ac9..0e30424 100644
--- a/lib.c
+++ b/lib.c
@@ -215,7 +215,7 @@ int dbg_dead = 0;

 int preprocess_only;
 int emit_code = 0;
-const char *output_file = "a.out";
+const char *output_file = NULL;

 static enum { STANDARD_C89,
               STANDARD_C94,
@@ -856,38 +856,38 @@ struct symbol_list *sparse_initialize(int argc, char **argv, struct string_list
 		if (!arg)
 			break;

-		if (arg[0] == '-' && arg[1]) {
-			args = handle_switch(arg+1, args);
+		if (arg[0] == '-') {
+			if (arg[1])
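
The behaviour named in the subject line, falling back to stdin when no
input files are given, can be sketched like this. It is not the code
from the patch: collect_inputs() and open_input() are hypothetical
helpers, and treating a bare "-" as a name for stdin is an assumption
based on how cc1-style tools usually behave.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Collect the input names from argv into files[]. Switches such as
 * "-Wall" are skipped in this sketch; a bare "-" counts as an input.
 * If nothing was named, fall back to "-", meaning stdin.
 * Returns the number of entries stored in files[].
 */
int collect_inputs(int argc, char **argv, const char **files, int max_files)
{
	int n = 0;

	assert(max_files >= 1);
	for (int i = 1; i < argc; i++) {
		const char *arg = argv[i];

		if (arg[0] == '-' && arg[1] != '\0')
			continue;		/* a real switch */
		if (n < max_files)
			files[n++] = arg;	/* file name or bare "-" */
	}
	if (n == 0)
		files[n++] = "-";		/* no inputs: read stdin */
	return n;
}

/* Open an input by name, mapping "-" to the already-open stdin. */
FILE *open_input(const char *name)
{
	return strcmp(name, "-") == 0 ? stdin : fopen(name, "r");
}
```

The split condition in the hunk above (testing arg[0] == '-' first and
arg[1] separately) is what lets a bare "-" be told apart from a real
switch.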

alexey.zaytsev | 3 Sep 23:55

[PATCH 04/10] Add char *first_string(struct string_list *)

From: Alexey Zaytsev <alexey.zaytsev <at> gmail.com>

Signed-off-by: Alexey Zaytsev <alexey.zaytsev <at> gmail.com>
---
 lib.h     |    5 +++++
 ptrlist.h |   17 ++++++++++++++++-
 2 files changed, 21 insertions(+), 1 deletions(-)

diff --git a/lib.h b/lib.h
index 19a724f..532e7a4 100644
--- a/lib.h
+++ b/lib.h
@@ -186,6 +186,11 @@ static inline pseudo_t first_pseudo(struct pseudo_list *head)
 	return first_ptr_list((struct ptr_list *)head);
 }

+static inline char *first_string(struct string_list *head)
+{
+	return first_ptr_list_notag((struct ptr_list *)head);
+}
+
 static inline void concat_symbol_list(struct symbol_list *from, struct symbol_list **to)
 {
 	concat_ptr_list((struct ptr_list *)from, (struct ptr_list **)to);
diff --git a/ptrlist.h b/ptrlist.h
index dae0906..fe43de1 100644
--- a/ptrlist.h
+++ b/ptrlist.h
@@ -73,15 +73,30 @@ static inline void *first_ptr_list(struct ptr_list *list)
 	return PTR_ENTRY(list, 0);
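
A likely reason for a separate "notag" accessor, stated here as an
assumption since the rest of the hunk is not shown: ptr lists can keep
a small tag in the low bits of each stored pointer, and the regular
accessor masks those bits off. That is harmless for aligned objects
but would corrupt a char pointer that is not 4-byte aligned. A sketch
with hypothetical helper names:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: the two low bits of a stored pointer carry a tag. */
#define TAG_MASK ((uintptr_t)3)

/* Attach a tag to a pointer; only valid if the low bits are free. */
void *tag_ptr(void *ptr, unsigned tag)
{
	assert(tag <= TAG_MASK);
	assert(((uintptr_t)ptr & TAG_MASK) == 0);
	return (void *)((uintptr_t)ptr | tag);
}

/* Accessor that strips the tag bits, as a tagged list entry needs. */
void *entry_tagged(void *raw)
{
	return (void *)((uintptr_t)raw & ~TAG_MASK);
}

/* Accessor that returns the stored pointer untouched. */
void *entry_notag(void *raw)
{
	return raw;
}
```

Strings may start at any byte address, so a string list has to read
its entries through the untouched variant; masking would silently move
the pointer back to the previous 4-byte boundary.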

alexey.zaytsev | 3 Sep 23:55

[PATCH 05/10] Serializable ptr lists.

From: Alexey Zaytsev <alexey.zaytsev <at> gmail.com>

Signed-off-by: Alexey Zaytsev <alexey.zaytsev <at> gmail.com>
---
 ptrlist.c |   30 +++++++++++++++++++++++++++---
 ptrlist.h |    7 +++++++
 2 files changed, 34 insertions(+), 3 deletions(-)

diff --git a/ptrlist.c b/ptrlist.c
index 2620412..fb6b6db 100644
--- a/ptrlist.c
+++ b/ptrlist.c
@@ -12,9 +12,33 @@
 #include "ptrlist.h"
 #include "allocate.h"
 #include "compat.h"
+#include "lib.h"
+#include "serialization.h"
+
+static int ptr_list_serializer(struct serialization_stream *s, struct ptr_list *w)
+{
+	die("Don't serialize abstract ptr lists, serialize your custom ones.");
+	return 0;
+}
+
+__DECLARE_ALLOCATOR(struct ptr_list, ptr_list_core);
+__ALLOCATOR(struct ptr_list, "ptr list", ptr_list_core);
+DO_WRAP(struct ptr_list, ptr_list, "ptrlist.h", __alloc_ptr_list_core,
+	__alloc_ptrlist, __free_ptr_list_core, __free_ptrlist,
+	ptr_list_serializer);

alexey.zaytsev | 3 Sep 23:55

[PATCH 06/10] Linker core, serialization and helper functions.

From: Alexey Zaytsev <alexey.zaytsev <at> gmail.com>

Signed-off-by: Alexey Zaytsev <alexey.zaytsev <at> gmail.com>
---
 Makefile |    6 +++-
 link.c   |   57 ++++++++++++++++++++++++++++++++++++++++
 link.h   |   87 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 148 insertions(+), 2 deletions(-)
 create mode 100644 link.c
 create mode 100644 link.h

diff --git a/Makefile b/Makefile
index 721979e..dd1fe8a 100644
--- a/Makefile
+++ b/Makefile
@@ -40,13 +40,13 @@ endif

 LIB_H=    token.h parse.h lib.h symbol.h scope.h expression.h target.h \
 	  linearize.h bitmap.h ident-list.h compat.h flow.h allocate.h \
-	  storage.h ptrlist.h dissect.h serialization.h
+	  storage.h ptrlist.h dissect.h serialization.h link.h

 LIB_OBJS= target.o parse.o tokenize.o pre-process.o symbol.o lib.o scope.o \
 	  expression.o show-parse.o evaluate.o expand.o inline.o linearize.o \
 	  sort.o allocate.o compat-$(OS).o ptrlist.o \
 	  flow.o cse.o simplify.o memops.o liveness.o storage.o unssa.o dissect.o \
-	  serialization.o
+	  serialization.o link.o

 LIB_FILE= libsparse.a

alexey.zaytsev | 3 Sep 23:55

[PATCH 07/10] Let sparse serialize the symbol table of the checked file

From: Alexey Zaytsev <alexey.zaytsev <at> gmail.com>

Signed-off-by: Alexey Zaytsev <alexey.zaytsev <at> gmail.com>
---
 sparse.c |  105 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 files changed, 102 insertions(+), 3 deletions(-)

diff --git a/sparse.c b/sparse.c
index b7a1f8b..1b25d7e 100644
--- a/sparse.c
+++ b/sparse.c
@@ -8,6 +8,8 @@
  *
  *  Licensed under the Open Software License version 1.1
  */
+#define _GNU_SOURCE
+
 #include <stdarg.h>
 #include <stdlib.h>
 #include <stdio.h>
@@ -17,6 +19,7 @@
 #include <fcntl.h>

 #include "lib.h"
+#include "link.h"
 #include "allocate.h"
 #include "token.h"
 #include "parse.h"
@@ -595,18 +598,114 @@ static void check_symbols(struct symbol_list *list)
 	} END_FOR_EACH_PTR(sym);

alexey.zaytsev | 3 Sep 23:55

[PATCH 08/10] Sparse Object Link eDitor

From: Alexey Zaytsev <alexey.zaytsev <at> gmail.com>

Signed-off-by: Alexey Zaytsev <alexey.zaytsev <at> gmail.com>
---
 Makefile |    9 +++-
 sold.c   |  127 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 133 insertions(+), 3 deletions(-)
 create mode 100644 sold.c

diff --git a/Makefile b/Makefile
index dd1fe8a..4fa2f82 100644
--- a/Makefile
+++ b/Makefile
@@ -2,7 +2,6 @@ VERSION=0.4.1

 OS = linux

-
 CC = gcc
 CFLAGS = -O2 -finline-functions -fno-strict-aliasing -g
 CFLAGS += -Wall -Wwrite-strings
@@ -27,10 +26,10 @@ INCLUDEDIR=$(PREFIX)/include
 PKGCONFIGDIR=$(LIBDIR)/pkgconfig

 PROGRAMS=test-lexing test-parsing obfuscate compile graph sparse test-linearize example \
-	 test-unssa test-dissect ctags serialization-test
+	 test-unssa test-dissect ctags serialization-test sold

 
-INST_PROGRAMS=sparse cgcc

alexey.zaytsev | 3 Sep 23:55

[PATCH 09/10] Rewrite cgcc, add cld and car to wrap ld and ar

From: Alexey Zaytsev <alexey.zaytsev <at> gmail.com>

cgcc now compiles the serialized data produced by sparse
and is also able to handle multiple source files
properly.

cld and car are there to integrate the sparse linker
into your build environment, wrapping ld and ar, and
compiling the data serialized by the sparse linker.

Signed-off-by: Alexey Zaytsev <alexey.zaytsev <at> gmail.com>
---
 Makefile |    2 +-
 car      |   72 ++++++++++++++++++
 cgcc     |  252 ++++++++++++++++++++++++++++++++++++++++++++++++++++----------
 cld      |   84 +++++++++++++++++++++
 4 files changed, 369 insertions(+), 41 deletions(-)
 create mode 100755 car
 create mode 100755 cld

diff --git a/Makefile b/Makefile
index 4fa2f82..877634c 100644
--- a/Makefile
+++ b/Makefile
@@ -29,7 +29,7 @@ PROGRAMS=test-lexing test-parsing obfuscate compile graph sparse test-linearize
 	 test-unssa test-dissect ctags serialization-test sold

 
-INST_PROGRAMS=sparse cgcc sold
+INST_PROGRAMS=sparse sold cgcc cld car

alexey.zaytsev | 3 Sep 23:55

[PATCH 10/10] A simple demonstrational program that looks up symbols in sparse object files.

From: Alexey Zaytsev <alexey.zaytsev <at> gmail.com>

Signed-off-by: Alexey Zaytsev <alexey.zaytsev <at> gmail.com>
---
 Makefile |    6 +++++-
 where.c  |   61 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 66 insertions(+), 1 deletions(-)
 create mode 100644 where.c

diff --git a/Makefile b/Makefile
index 877634c..446e053 100644
--- a/Makefile
+++ b/Makefile
@@ -26,7 +26,7 @@ INCLUDEDIR=$(PREFIX)/include
 PKGCONFIGDIR=$(LIBDIR)/pkgconfig

 PROGRAMS=test-lexing test-parsing obfuscate compile graph sparse test-linearize example \
-	 test-unssa test-dissect ctags serialization-test sold
+	 test-unssa test-dissect ctags serialization-test sold where

 
 INST_PROGRAMS=sparse sold cgcc cld car
@@ -141,6 +141,9 @@ serialization-test: serialization-test.o $(LIBS)
 sold: sold.o $(LIBS)
 	$(QUIET_LINK)$(CC) $(LDFLAGS) -o $@ $< $(LIBS) -ldl

+where: where.o $(LIBS)
+	$(QUIET_LINK)$(CC) $(LDFLAGS) -o $@ $< $(LIBS) -ldl
+
 $(LIB_FILE): $(LIB_OBJS)

