GA returning empty set as plausible repair #7

P.D. Reiter pdreiter · Jun'19

Using the same BugZoo bug infrastructure, I am running Darjeeling with two different yml configuration files to find a patch for my fauxware toy problem:

exhaustive search - this yml + Darjeeling finds an appropriate patch
genetic search - this yml + Darjeeling returns an empty set as the plausible patch.

I've been tuning the GA's hyperparameters, in particular increasing the population size and mutation rate, but the GA keeps returning the empty set, and it doesn't seem to evaluate more than one generation.

Latest GA configuration (the rest of the yml file is identical to the exhaustive search yml, see below):

algorithm:
  type: genetic
  population: 200
  generations: 200
  tournament-size: 20
  mutation-rate: 0.8
  crossover-rate: 0.4
  # look at entire test suite for test sampling [subset of testsuite is 100%]
  test-sample-size: null

Have you seen this before? Do you have any ideas?
I'm currently looking for the test passing rate in the logger output to understand the underlying GA execution and why the empty set of mutations is reported as a solution with the genetic yml file.

exhaustive search yml:

version: '1.0'
snapshot: 'sefcom:fauxware'
language: c
seed: 0
threads: 16
localization:
  type: spectrum
#  metric: genprog
  metric: tarantula
  exclude-files:
    - foo.c
algorithm:
  type: exhaustive
transformations:
  schemas:
    - type: delete-statement
    - type: replace-statement
    - type: prepend-statement
optimizations:
  ignore-equivalent-prepends: yes 
  ignore-dead-code: yes 
  ignore-string-equivalent-snippets: yes 
resource-limits:
  candidates: 5000
  time-minutes: 3600

patch diff found by exhaustive search:

cat patches/0.diff 
--- fauxware.c
+++ fauxware.c
@@ -14,7 +14,7 @@
 	int pwfile;
 
 	// evil back d00r
-	if (strcmp(password, sneaky) == 0) return 1;
+	
 
 	pwfile = open(username, O_RDONLY);
 	read(pwfile, stored_pw, 8);

P.D. Reiter pdreiter · Jun'19 Author

I added some debug logging about the candidate patch (mostly in evaluator.py), and it looks like the test list is empty for the candidate that is being reported as the solution:

finished waiting for pending evaluations to complete.
found 1 plausible patches
 [0] Candidate<#1ca7323d> 

time taken: 0.10 minutes
# test evaluations: 196
# candidate evaluations: 16

egrep -rw '1ca7323d' ga.darjeeling.log 
evaluating candidate: Candidate<#1ca7323d>
building candidate: Candidate<#1ca7323d>
evaluating candidate: Candidate<#1ca7323d>
evaluating candidate: Candidate<#1ca7323d>
evaluating candidate: Candidate<#1ca7323d>
building candidate: Candidate<#1ca7323d>
building candidate: Candidate<#1ca7323d>
building candidate: Candidate<#1ca7323d>
built candidate: Candidate<#1ca7323d>
executing tests for candidate: Candidate<#1ca7323d>
known_bad_patch = False for candidate: Candidate<#1ca7323d>
self.__terminate_early = False for candidate: Candidate<#1ca7323d>
tests = [[]] for candidate: Candidate<#1ca7323d>
evaluated candidate: Candidate<#1ca7323d>
built candidate: Candidate<#1ca7323d>
executing tests for candidate: Candidate<#1ca7323d>
known_bad_patch = False for candidate: Candidate<#1ca7323d>
self.__terminate_early = False for candidate: Candidate<#1ca7323d>
tests = [[]] for candidate: Candidate<#1ca7323d>
evaluated candidate: Candidate<#1ca7323d>
built candidate: Candidate<#1ca7323d>
executing tests for candidate: Candidate<#1ca7323d>
known_bad_patch = False for candidate: Candidate<#1ca7323d>
self.__terminate_early = False for candidate: Candidate<#1ca7323d>
tests = [[]] for candidate: Candidate<#1ca7323d>
evaluated candidate: Candidate<#1ca7323d>
built candidate: Candidate<#1ca7323d>
executing tests for candidate: Candidate<#1ca7323d>
known_bad_patch = False for candidate: Candidate<#1ca7323d>
self.__terminate_early = False for candidate: Candidate<#1ca7323d>
tests = [[]] for candidate: Candidate<#1ca7323d>
evaluated candidate: Candidate<#1ca7323d>
 [0] Candidate<#1ca7323d>

P.D. Reiter pdreiter · Jun'19 Author

...and it looks like all tests are being filtered out for this zero-mutation candidate.

        # select a subset of tests to use for this evaluation
        tests, remainder = self._select_tests()
        logger.info("[ALL] tests = [%s] for candidate: %s", tests, candidate)
        tests, redundant = self._filter_redundant_tests(candidate, tests)
        logger.info("[FILTERED] tests = [%s] for candidate: %s", tests, candidate)

I'm looking at the _filter_redundant_tests method right now.

P.D. Reiter pdreiter · Jun'19 Author

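I was able to reproduce the exact repair found by the exhaustive search by adding the conjunct len(lines_changed) > 0 to the condition "if not any(line in test_line_coverage for line in lines_changed):" in _filter_redundant_tests (evaluator.py). For a candidate with zero mutations, lines_changed is empty, so any(...) is False for every test and every test gets dropped as redundant.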

    def _filter_redundant_tests(self,
                                candidate: Candidate,
                                tests: List[Test]
                                ) -> Tuple[List[Test], Set[Test]]:
        line_coverage_by_test = self.__problem.coverage
        lines_changed = candidate.lines_changed(self.__problem)
        logger.info("lines_changed %s : %s", lines_changed, candidate)
        keep = []  # type: List[Test]
        drop = set()  # type: Set[Test]
        for test in tests:
            test_line_coverage = line_coverage_by_test[test.name]
            logger.info("test_line_coverage %s : %s", test_line_coverage, candidate)
            # only drop a test when the candidate changes at least one line
            # and none of the changed lines are covered by this test
            if len(lines_changed) > 0 and not any(line in test_line_coverage for line in lines_changed):
                drop.add(test)
            else:
                keep.append(test)
        return (keep, drop)

I'm not sure this is the ideal fix, but I like that the GA now produces a patch equivalent to the one from the exhaustive search. I'll go ahead and open an issue on GitHub for this.
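
For anyone following along, here is a minimal standalone sketch of the behaviour described above. It is not Darjeeling's code: the free function filter_redundant_tests, the guard_empty flag, the test names, and the line numbers are all hypothetical stand-ins, and coverage is modelled as a plain dict of line-number sets. It only illustrates that any() over an empty collection is False, so without the guard an empty candidate ends up with no tests to run.

```python
from typing import Dict, List, Set, Tuple


def filter_redundant_tests(lines_changed: Set[int],
                           tests: List[str],
                           coverage: Dict[str, Set[int]],
                           guard_empty: bool
                           ) -> Tuple[List[str], Set[str]]:
    """Hypothetical stand-in for the filter discussed in this thread."""
    keep = []  # type: List[str]
    drop = set()  # type: Set[str]
    for test in tests:
        covered = coverage[test]
        # any() over an empty lines_changed is always False
        touches = any(line in covered for line in lines_changed)
        # with guard_empty=True, a candidate that changes nothing drops no tests
        if (not guard_empty or lines_changed) and not touches:
            drop.add(test)
        else:
            keep.append(test)
    return keep, drop


# made-up coverage data: test name -> set of covered line numbers
coverage = {"p1": {10, 11, 14}, "n1": {10, 14, 17}}
tests = ["p1", "n1"]

# empty candidate, original condition: every test is treated as redundant
keep_unguarded, drop_unguarded = filter_redundant_tests(set(), tests, coverage, False)

# empty candidate, guarded condition: every test is kept
keep_guarded, drop_guarded = filter_redundant_tests(set(), tests, coverage, True)

# non-empty candidate touching line 17: only the test covering it is kept
keep_line17, drop_line17 = filter_redundant_tests({17}, tests, coverage, True)
```

With the original condition the empty candidate keeps no tests, which would be consistent with the "tests = [[]]" log lines earlier in the thread; with the guard, both tests are kept and the empty candidate is actually exercised against the test suite.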