[haiku-commits] haiku: hrev53415 - src/tools

  • From: waddlesplash <waddlesplash@xxxxxxxxx>
  • To: haiku-commits@xxxxxxxxxxxxx
  • Date: Wed, 28 Aug 2019 23:00:35 -0400 (EDT)

hrev53415 adds 4 changesets to branch 'master'
old head: 43895d31477772300e0bc8933004dcd7ea9df5b7
new head: cca88a8133f0c9ddfbc8b6f0f83697f31edaab1d
overview: 
https://git.haiku-os.org/haiku/log/?qt=range&q=cca88a8133f0+%5E43895d314777

----------------------------------------------------------------------------

a5f58aba5700: tools: Add an "exec" tool.
  
  This utility takes command-strings, e.g. "gcc -c file.c -D...",
  parses them into an argv, and then execvp()s that. The use-case
  is Jam, which cannot do this itself, but instead simply calls
  JAMSHELL (usually just "/bin/sh -c") to do that for it.
  
  Shells in general have a large amount of overhead (and bash in
  particular is especially bad here), so using a utility like this
  as JAMSHELL in most cases can be a significant speed-up.
  
  For example, on Haiku (32-bit):
  
  $ time sh -c 'for i in {1..100}; do sh -c "./exec test"; done'
  real    0m3.335s
  user    0m1.603s
  sys     0m1.612s
  
  $ time sh -c 'for i in {1..100}; do ./exec test; done'
  real    0m1.547s
  user    0m0.597s
  sys     0m0.867s
  
  So for every 100 executions, going through bash costs about 3.3s in
  total, and this tool cuts out over half of that (about 1.8s). For
  longer command strings, the overhead is probably significantly
  greater, but that should be clear soon enough...

b9b6a688e343: tools/exec: Exit with an error upon attempting to run multiple commands.
  
  This way, things that need a real shell will be more clear.

3cfe881d8840: OverriddenJamRules: Remove an unneeded and erroneous ";"

cca88a8133f0: tools/exec: Implement basic environment overrides.
  
  VAL=xxx... and VAL=$VAL:xxx... are supported; all other syntaxes
  will fail with an error message.
  
  When combined with a build/jam patch that will come in a later
  commit, this makes it possible to build a large number of targets
  using exec as JAMSHELL, including all of libroot. The performance
  difference is extremely obvious:
  
  jam -j2 libroot, JAMSHELL=/bin/sh (32-bit Haiku)
  real 1m43.571s
  user 1m10.961s
  sys  1m7.965s
  
  jam -j2 libroot, JAMSHELL=exec
  real 1m28.364s
  user 0m58.190s
  sys  0m57.563s
  
  So that is a savings of 15.21 seconds, or 15% of the build time.
  Something that is less I/O bound and more fork-bound (e.g.
  linking application catalogs) will almost certainly see
  an even bigger performance difference.
  
  Changes to add the necessary JAMSHELL overrides for those
  targets which need them, in order to make it possible to
  enable usage of "exec" by default, will be coming
  over the next few days/weeks...

                              [ Augustin Cavalier <waddlesplash@xxxxxxxxx> ]

----------------------------------------------------------------------------

2 files changed, 206 insertions(+), 1 deletion(-)
build/jam/OverriddenJamRules |   2 +-
src/tools/exec.c             | 205 +++++++++++++++++++++++++++++++++++++++

############################################################################

Commit:      a5f58aba5700ad79115ccd94a93985182d7c9f59
URL:         https://git.haiku-os.org/haiku/commit/?id=a5f58aba5700
Author:      Augustin Cavalier <waddlesplash@xxxxxxxxx>
Date:        Thu Aug 29 00:18:59 2019 UTC

tools: Add an "exec" tool.

This utility takes command-strings, e.g. "gcc -c file.c -D...",
parses them into an argv, and then execvp()s that. The use-case
is Jam, which cannot do this itself, but instead simply calls
JAMSHELL (usually just "/bin/sh -c") to do that for it.

Shells in general have a large amount of overhead (and bash in
particular is especially bad here), so using a utility like this
as JAMSHELL in most cases can be a significant speed-up.

For example, on Haiku (32-bit):

$ time sh -c 'for i in {1..100}; do sh -c "./exec test"; done'
real    0m3.335s
user    0m1.603s
sys     0m1.612s

$ time sh -c 'for i in {1..100}; do ./exec test; done'
real    0m1.547s
user    0m0.597s
sys     0m0.867s

So for every 100 executions, going through bash costs about 3.3s in
total, and this tool cuts out over half of that (about 1.8s). For
longer command strings, the overhead is probably significantly
greater, but that should be clear soon enough...
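
For readers who want the parsing step in isolation, here is a minimal
sketch of splitting a command string into an argv. It is not the code
from this commit: split_command() is a hypothetical helper, and it
sizes its buffers up front instead of growing them with realloc(). Like
the tool, it treats a backslash as "take the next character literally"
both inside and outside quotes, and it drops empty arguments.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of the commit): split str into a
 * NULL-terminated argv. Returns NULL on allocation failure or on
 * mismatched quotes. Each argument is a separately malloc()ed string. */
char**
split_command(const char* str)
{
	size_t len = strlen(str);
	size_t argsLen = 0, currentLen = 0, pos;
	/* At most len / 2 + 1 arguments fit in str, plus the final NULL. */
	char** args = calloc(len / 2 + 2, sizeof(char*));
	/* No single argument can be longer than the whole string. */
	char* current = malloc(len + 1);

	if (args == NULL || current == NULL)
		goto fail;

	for (pos = 0; ; pos++) {
		char c = str[pos];
		if (c == ' ' || c == '\t' || c == '\r' || c == '\n' || c == '\0') {
			/* Whitespace or end of input finishes the current argument. */
			if (currentLen > 0) {
				current[currentLen] = '\0';
				args[argsLen] = malloc(currentLen + 1);
				if (args[argsLen] == NULL)
					goto fail;
				memcpy(args[argsLen], current, currentLen + 1);
				argsLen++;
				currentLen = 0;
			}
			if (c == '\0')
				break;
		} else if (c == '\'' || c == '"') {
			/* Copy everything up to the matching closing quote. */
			for (pos++; str[pos] != c; pos++) {
				if (str[pos] == '\\')
					pos++;	/* take the next character literally */
				if (str[pos] == '\0')
					goto fail;	/* mismatched quotes */
				current[currentLen++] = str[pos];
			}
		} else {
			if (c == '\\' && str[pos + 1] != '\0')
				c = str[++pos];	/* escaped character */
			current[currentLen++] = c;
		}
	}

	free(current);
	return args;	/* calloc() already left the terminating NULL in place */

fail:
	if (args != NULL) {
		for (pos = 0; pos < argsLen; pos++)
			free(args[pos]);
	}
	free(args);
	free(current);
	return NULL;
}
```

A caller would check that the result and its first element are both
non-NULL (an empty command string yields an empty argv) before passing
it on to execvp().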

----------------------------------------------------------------------------

diff --git a/src/tools/exec.c b/src/tools/exec.c
new file mode 100644
index 0000000000..5b93fcdab8
--- /dev/null
+++ b/src/tools/exec.c
@@ -0,0 +1,132 @@
+/*
+ * Copyright 2019, Haiku, Inc. All rights reserved.
+ * Distributed under the terms of the MIT License.
+ *
+ * Authors:
+ *             Augustin Cavalier <waddlesplash>
+ */
+
+/* Pass this tool a string, and it will parse it into an argv and execvp(). */
+
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <errno.h>
+
+
+static void
+append_char(char c, char** arg, int* argLen, int* argBufferLen)
+{
+       if ((*argLen + 1) >= *argBufferLen) {
+               *arg = realloc(*arg, *argBufferLen + 32);
+               if (*arg == NULL) {
+                       puts("oom");
+                       exit(1);
+               }
+               *argBufferLen += 32;
+       }
+
+       (*arg)[*argLen] = c;
+       (*argLen)++;
+}
+
+
+static void
+parse_quoted(const char* str, int* pos, char** currentArg, int* currentArgLen,
+       int* currentArgBufferLen)
+{
+       char end = str[*pos];
+       while (1) {
+               char c;
+               (*pos)++;
+               c = str[*pos];
+               if (c == '\0') {
+                       puts("mismatched quotes");
+                       exit(1);
+               }
+               if (c == end)
+                       break;
+
+               switch (c) {
+               case '\\':
+                       (*pos)++;
+                       // fall through
+               default:
+                       append_char(str[*pos], currentArg, currentArgLen,
+                               currentArgBufferLen);
+               break;
+               }
+       }
+}
+
+
+int
+main(int argc, const char* argv[])
+{
+       char** args = NULL, *currentArg = NULL;
+       const char* str;
+       int argsLen = 0, argsBufferLen = 0, currentArgLen = 0,
+               currentArgBufferLen = 0, pos;
+
+       if (argc != 2) {
+               printf("usage: %s \"program arg 'arg1' ...\"\n", argv[0]);
+               return 1;
+       }
+
+       str = argv[1];
+       pos = 0;
+       while (1) {
+               switch (str[pos]) {
+               case ' ':
+               case '\t':
+               case '\r':
+               case '\n':
+               case '\0':
+                       if (currentArgLen == 0)
+                               break; // do nothing
+
+                       append_char('\0', &currentArg, &currentArgLen,
+                               &currentArgBufferLen);
+
+                       if ((argsLen + 2) >= argsBufferLen) {
+                               args = realloc(args, (argsBufferLen + 8) * sizeof(char*));
+                               if (args == NULL) {
+                                       puts("oom");
+                                       return 1;
+                               }
+                               argsBufferLen += 8;
+                       }
+
+                       args[argsLen] = currentArg;
+                       args[argsLen + 1] = NULL;
+                       argsLen++;
+
+                       currentArg = NULL;
+                       currentArgLen = 0;
+                       currentArgBufferLen = 0;
+               break;
+
+               case '\'':
+               case '"':
+                       parse_quoted(str, &pos, &currentArg, &currentArgLen,
+                               &currentArgBufferLen);
+               break;
+
+               case '\\':
+                       pos++;
+                       // fall through
+               default:
+                       append_char(str[pos], &currentArg, &currentArgLen,
+                               &currentArgBufferLen);
+               break;
+               }
+               if (str[pos] == '\0')
+                       break;
+               pos++;
+       }
+
+       pos = execvp(args[0], args);
+       if (pos != 0)
+               printf("exec failed: %s\n", strerror(errno));
+       return pos;
+}

############################################################################

Commit:      b9b6a688e343f60eb30827360354385f2d9adcb6
URL:         https://git.haiku-os.org/haiku/commit/?id=b9b6a688e343
Author:      Augustin Cavalier <waddlesplash@xxxxxxxxx>
Date:        Thu Aug 29 00:52:10 2019 UTC

tools/exec: Exit with an error upon attempting to run multiple commands.

This way, things that need a real shell will be more clear.
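
The rule being enforced can be sketched on its own, under some
assumptions: has_multiple_commands() is a hypothetical helper rather
than the code in the diff below, and it only reports whether an
unquoted, unescaped newline has words on both sides of it. Leading and
trailing newlines, backslash-newline continuations and newlines inside
quotes are all accepted.

```c
#include <stddef.h>

/* Hypothetical helper (not part of the commit): returns 1 if str holds
 * more than one command, i.e. an unquoted, unescaped newline that has
 * words both before and after it. Returns 0 otherwise. */
int
has_multiple_commands(const char* str)
{
	int seenWord = 0, seenNewline = 0;
	char quote = '\0';
	size_t pos;

	for (pos = 0; str[pos] != '\0'; pos++) {
		char c = str[pos];

		if (quote != '\0') {
			/* Inside quotes, newlines are ordinary characters. */
			if (c == '\\' && str[pos + 1] != '\0')
				pos++;
			else if (c == quote)
				quote = '\0';
			continue;
		}

		if (c == '\\' && str[pos + 1] != '\0') {
			pos++;
			if (str[pos] == '\r' || str[pos] == '\n')
				continue;	/* line continuation, not a new command */
		} else if (c == '\r' || c == '\n') {
			/* Only newlines that follow a word can end a command. */
			if (seenWord)
				seenNewline = 1;
			continue;
		} else if (c == ' ' || c == '\t') {
			continue;
		} else if (c == '\'' || c == '"') {
			quote = c;
		}

		/* We are looking at word content here. */
		if (seenNewline)
			return 1;
		seenWord = 1;
	}

	return 0;
}
```

Anything for which this returns 1 is a candidate for keeping a real
shell as its JAMSHELL.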

----------------------------------------------------------------------------

diff --git a/src/tools/exec.c b/src/tools/exec.c
index 5b93fcdab8..12a0851cab 100644
--- a/src/tools/exec.c
+++ b/src/tools/exec.c
@@ -20,7 +20,7 @@ append_char(char c, char** arg, int* argLen, int* argBufferLen)
        if ((*argLen + 1) >= *argBufferLen) {
                *arg = realloc(*arg, *argBufferLen + 32);
                if (*arg == NULL) {
-                       puts("oom");
+                       puts("exec: oom");
                        exit(1);
                }
                *argBufferLen += 32;
@@ -41,7 +41,7 @@ parse_quoted(const char* str, int* pos, char** currentArg, int* currentArgLen,
                (*pos)++;
                c = str[*pos];
                if (c == '\0') {
-                       puts("mismatched quotes");
+                       puts("exec: mismatched quotes");
                        exit(1);
                }
                if (c == end)
@@ -66,7 +66,7 @@ main(int argc, const char* argv[])
        char** args = NULL, *currentArg = NULL;
        const char* str;
        int argsLen = 0, argsBufferLen = 0, currentArgLen = 0,
-               currentArgBufferLen = 0, pos;
+               currentArgBufferLen = 0, encounteredNewlineAt = -1, pos;
 
        if (argc != 2) {
                printf("usage: %s \"program arg 'arg1' ...\"\n", argv[0]);
@@ -77,13 +77,25 @@ main(int argc, const char* argv[])
        pos = 0;
        while (1) {
                switch (str[pos]) {
-               case ' ':
-               case '\t':
                case '\r':
                case '\n':
+                       // In normal shells, this would imply a second command.
+                       // We don't support that here, so we need to make sure
+                       // that either we have not parsed any arguments yet,
+                       // or there are no more arguments pushed after this.
+                       if (argsLen == 0 && currentArgLen == 0)
+                               break;
+                       encounteredNewlineAt = argsLen + 1;
+                       // fall through
+               case ' ':
+               case '\t':
                case '\0':
                        if (currentArgLen == 0)
                                break; // do nothing
+                       if (encounteredNewlineAt == argsLen) {
+                               puts("exec: running multiple commands not supported!");
+                               return 1;
+                       }
 
                        append_char('\0', &currentArg, &currentArgLen,
                                &currentArgBufferLen);
@@ -91,7 +103,7 @@ main(int argc, const char* argv[])
                        if ((argsLen + 2) >= argsBufferLen) {
                                args = realloc(args, (argsBufferLen + 8) * sizeof(char*));
                                if (args == NULL) {
-                                       puts("oom");
+                                       puts("exec: oom");
                                        return 1;
                                }
                                argsBufferLen += 8;
@@ -114,6 +126,9 @@ main(int argc, const char* argv[])
 
                case '\\':
                        pos++;
+                       // don't append newlines to the current argument
+                       if (str[pos] == '\r' || str[pos] == '\n')
+                               break;
                        // fall through
                default:
                        append_char(str[pos], &currentArg, &currentArgLen,

############################################################################

Commit:      3cfe881d8840593a02d6eb4fa3fa5b2db9e09772
URL:         https://git.haiku-os.org/haiku/commit/?id=3cfe881d8840
Author:      Augustin Cavalier <waddlesplash@xxxxxxxxx>
Date:        Thu Aug 29 02:52:22 2019 UTC

OverriddenJamRules: Remove an unneeded and erroneous ";"

----------------------------------------------------------------------------

diff --git a/build/jam/OverriddenJamRules b/build/jam/OverriddenJamRules
index cf6652ead1..01659c9a1a 100644
--- a/build/jam/OverriddenJamRules
+++ b/build/jam/OverriddenJamRules
@@ -208,7 +208,7 @@ rule As
 
 actions As
 {
-       $(CC) -c "$(2)" -O2 $(ASFLAGS) -D_ASSEMBLER $(ASDEFS) $(ASHDRS) -o "$(1)" ;
+       $(CC) -c "$(2)" -O2 $(ASFLAGS) -D_ASSEMBLER $(ASDEFS) $(ASHDRS) -o "$(1)"
 }
 
 rule Lex

############################################################################

Revision:    hrev53415
Commit:      cca88a8133f0c9ddfbc8b6f0f83697f31edaab1d
URL:         https://git.haiku-os.org/haiku/commit/?id=cca88a8133f0
Author:      Augustin Cavalier <waddlesplash@xxxxxxxxx>
Date:        Thu Aug 29 02:53:22 2019 UTC

tools/exec: Implement basic environment overrides.

VAL=xxx... and VAL=$VAL:xxx... are supported; all other syntaxes
will fail with an error message.

When combined with a build/jam patch that will come in a later
commit, this makes it possible to build a large number of targets
using exec as JAMSHELL, including all of libroot. The performance
difference is extremely obvious:

jam -j2 libroot, JAMSHELL=/bin/sh (32-bit Haiku)
real 1m43.571s
user 1m10.961s
sys  1m7.965s

jam -j2 libroot, JAMSHELL=exec
real 1m28.364s
user 0m58.190s
sys  0m57.563s

So that is a savings of 15.21 seconds, or 15% of the build time.
Something that is less I/O bound and more fork-bound (e.g.
linking application catalogs) will almost certainly see
an even bigger performance difference.

Changes to add the necessary JAMSHELL overrides for those
targets which need them, in order to make it possible to
enable usage of "exec" by default, will be coming
over the next few days/weeks...
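
As an illustration of the two supported forms, here is a minimal sketch
of applying one VAL=xxx or VAL=$VAL:xxx word to the environment. It is
not the code from the diff below: apply_env_override() is a hypothetical
helper, and it makes two choices of its own that should be read as
assumptions. An unset variable is treated as empty instead of being
passed to strlen(), and an underscore after the name also counts as
"some other variable".

```c
#define _POSIX_C_SOURCE 200809L	/* for setenv() under strict -std modes */
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of the commit): apply one "NAME=value"
 * or "NAME=$NAME..." word to the environment. Returns 0 on success and
 * -1 for unsupported syntax or on failure. Modifies word in place. */
int
apply_env_override(char* word)
{
	char* val = strchr(word, '=');
	const char* oldVal;
	char* newVal;
	size_t nameLen, oldValLen, restLen;
	int result;

	if (val == NULL || val == word)
		return -1;	/* not an assignment at all */
	*val++ = '\0';	/* word is now just the variable name */

	if (strchr(val, '$') == NULL)
		return setenv(word, val, 1);	/* plain NAME=value */
	if (val[0] != '$')
		return -1;	/* expansion anywhere but the start is unsupported */

	val++;	/* skip the '$' */
	nameLen = strlen(word);
	/* Only a reference to the variable being assigned is accepted. The
	 * '_' check is an addition of this sketch. */
	if (strncmp(val, word, nameLen) != 0
		|| isalnum((unsigned char)val[nameLen]) || val[nameLen] == '_')
		return -1;

	oldVal = getenv(word);
	if (oldVal == NULL)
		oldVal = "";	/* assumption: an unset variable expands to nothing */
	oldValLen = strlen(oldVal);
	restLen = strlen(val + nameLen);	/* text following "$NAME" */

	newVal = malloc(oldValLen + restLen + 1);
	if (newVal == NULL)
		return -1;
	memcpy(newVal, oldVal, oldValLen);
	memcpy(newVal + oldValLen, val + nameLen, restLen + 1);

	result = setenv(word, newVal, 1);	/* setenv() copies the string */
	free(newVal);
	return result;
}
```

With PATH set to /bin, the word "PATH=$PATH:/boot/tools" would leave
PATH as /bin:/boot/tools, while "PATH=$HOME/bin" would be rejected.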

----------------------------------------------------------------------------

diff --git a/src/tools/exec.c b/src/tools/exec.c
index 12a0851cab..d8e7b28986 100644
--- a/src/tools/exec.c
+++ b/src/tools/exec.c
@@ -12,6 +12,7 @@
 #include <stdio.h>
 #include <stdlib.h>
 #include <errno.h>
+#include <unistd.h>
 
 
 static void
@@ -66,7 +67,8 @@ main(int argc, const char* argv[])
        char** args = NULL, *currentArg = NULL;
        const char* str;
        int argsLen = 0, argsBufferLen = 0, currentArgLen = 0,
-               currentArgBufferLen = 0, encounteredNewlineAt = -1, pos;
+               currentArgBufferLen = 0, encounteredNewlineAt = -1,
+               modifiedEnvironment = 0, pos;
 
        if (argc != 2) {
                printf("usage: %s \"program arg 'arg1' ...\"\n", argv[0]);
@@ -83,7 +85,7 @@ main(int argc, const char* argv[])
                        // We don't support that here, so we need to make sure
                        // that either we have not parsed any arguments yet,
                        // or there are no more arguments pushed after this.
-                       if (argsLen == 0 && currentArgLen == 0)
+                       if (argsLen == 0 && currentArgLen == 0 && !modifiedEnvironment)
                                break;
                        encounteredNewlineAt = argsLen + 1;
                        // fall through
@@ -97,9 +99,65 @@ main(int argc, const char* argv[])
                                return 1;
                        }
 
+                       // the current argument hasn't been terminated, do that now
                        append_char('\0', &currentArg, &currentArgLen,
                                &currentArgBufferLen);
 
+                       // handle environs
+                       {
+                       char* val;
+                       if (argsLen == 0 && (val = strstr(currentArg, "=")) != NULL) {
+                               const char* dollar;
+                               char* newVal = NULL;
+                               *val = '\0';
+                               val++;
+
+                               // handle trivial variable substitution, i.e. VAL=$VAL:...
+                               dollar = strstr(val, "$");
+                               if (dollar != NULL) {
+                                       const char* oldVal;
+                                       int oldValLen, valLen, nameLen;
+
+                                       if (dollar != val) {
+                                               puts("exec: environ expansion not at start of "
+                                                       "line unsupported");
+                                               return 1;
+                                       }
+                                       val++; // skip the $
+                                       valLen = strlen(val);
+                                       nameLen = strlen(currentArg);
+
+                                       // if the new value does not start with the environ name
+                                       // (which is broken by a non-alphanumeric character), bail.
+                                       if (strncmp(val, currentArg, nameLen) != 0
+                                                       || isalnum(val[nameLen])) {
+                                               puts("exec: environ expansion of other variables "
+                                                       "unsupported");
+                                               return 1;
+                                       }
+
+                                       // get the old value and actually do the expansion
+                                       oldVal = getenv(currentArg);
+                                       oldValLen = strlen(oldVal);
+                                       newVal = malloc(valLen + oldValLen + 1);
+                                       memcpy(newVal, oldVal, oldValLen);
+                                       memcpy(newVal + oldValLen, val + nameLen, valLen + 1);
+                                       val = newVal;
+                               }
+
+                               setenv(currentArg, val, 1);
+                               free(newVal);
+                               modifiedEnvironment = 1;
+
+                               free(currentArg);
+                               currentArg = NULL;
+                               currentArgLen = 0;
+                               currentArgBufferLen = 0;
+                               break;
+                       }
+                       }
+
+                       // actually add the argument to the array
                        if ((argsLen + 2) >= argsBufferLen) {
                                args = realloc(args, (argsBufferLen + 8) * sizeof(char*));
                                if (args == NULL) {

