Incremental Debug for new test cases #3

P.D. Reiter pdreiter · May'19

#1

@ChrisTimperley -
When adding a new Darjeeling test and the related BugZoo infrastructure, there are a lot of moving parts, and it would be great to know where or on what I should focus to isolate a configuration issue. It would be nice to have a set of local commands, one per stage, that can be run to verify the infrastructure from the ground up.

For example: I'm debugging my failing Darjeeling coverage with the logger enabled locally and printing the TestSuiteCoverage dictionary, and I can see that test.sh isn't working properly.

DEBUG:bugzoo.core.coverage:[TestSuiteCoverage] coverage instruction dictionary: 
{'n1': {'coverage': {}, 'outcome': {'passed': False, 'response': {'code': 127, 'duration': 0.1892676030001894, 'output': '/bin/bash: ./test.sh: No such file or directory\r'}}, 'test': 'n1'}, 'n2': {'coverage': {}, 'outcome': {'passed': False, 'response': {'code': 127, 'duration': 0.16958742499991786, 'output': '/bin/bash: ./test.sh: No such file or directory\r'}}, 'test': 'n2'}, 'p1': {'coverage': {}, 'outcome': {'passed': False, 'response': {'code': 127, 'duration': 0.18321107100018708, 'output': '/bin/bash: ./test.sh: No such file or directory\r'}}, 'test': 'p1'},

even though I've set the context properly in the YAML files:

DEBUG:bugzoo.mgr.source:parsing bug: {"name": "sefcom:fauxware", "image": "sefcom:fauxware", "dataset": "sefcom", "program": "fauxware.i", "languages": ["c"], "source-location": "/experiment/src", "test-harness": {"context": "/experiment/src", "command": "./test.sh __ID__", "time-limit": 2500, "failing": 2, "passing": 16, "type": "genprog"}, "compiler": {"time-limit": 300, "type": "configure-and-make"}, "coverage": {"context": "/experiment/src", "type": "gcov", "files-to-instrument": ["fauxware.c"]}}

It would be great to be able to verify the commands being executed in, or sent to, the Docker container, either independently of the Darjeeling infrastructure or directly from the Darjeeling logs.
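One way to check the harness independently of Darjeeling is to run the manifest's test command in a throwaway container with the working directory set explicitly. The following is a minimal sketch, not part of BugZoo: the helper names `build_test_command` and `run_test` are hypothetical, it assumes the `docker` CLI is on the PATH, and it assumes that `__ID__` in the manifest command stands for the test name.

```python
import shlex
import subprocess
from typing import List


def build_test_command(image: str, context: str, command: str, test_id: str) -> List[str]:
    """Build a `docker run` invocation that executes one test inside `context`."""
    # Assumption: __ID__ in the manifest command is replaced by the test name.
    shell_cmd = command.replace("__ID__", shlex.quote(test_id))
    # `-w` sets the working directory inside the container; `--rm` discards it afterwards.
    return ["docker", "run", "--rm", "-w", context, image, "/bin/bash", "-c", shell_cmd]


def run_test(image: str, context: str, command: str, test_id: str) -> subprocess.CompletedProcess:
    """Run one test in a fresh container and capture its combined output."""
    argv = build_test_command(image, context, command, test_id)
    return subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
```

Calling `run_test("sefcom:fauxware", "/experiment/src", "./test.sh __ID__", "p1")` and inspecting `returncode` and `stdout` shows what the test does when the context is honoured. A fresh container starts from the image as built, so any compile step that BugZoo would run first is not reproduced here.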

Chris Timperley ChrisTimperley · May'19 Team

With some minor tweaks to the logging, that should be possible. In either case, you should be able to fetch the logs for the BugZoo daemon at ~/.bugzoo/logs.

What happens when you run $ bugzoo bug validate BUGNAME on this particular scenario?

P.D. Reiter pdreiter · May'19 Author

#3

Here's a snippet of the bugzoo bug validate <> output:

validating bug: sefcom:fauxware
Compiling...                                                                [OK]
Running test: n1...                                                         [OK]
Running test: n2...                                                         [OK]
Running test: p1...                                           [UNEXPECTED: FAIL]


Running test: p2...                                           [UNEXPECTED: FAIL]


Running test: p3...                                           [UNEXPECTED: FAIL]

This doesn't seem consistent with what I was seeing in the TestSuiteCoverage output?

Chris Timperley ChrisTimperley · May'19 Team

Does it not seem consistent? The results are the same as those in the test coverage. It looks like a configuration issue in the BugZoo manifest file for the bug. Do you mind sharing the manifest? Also, could you list the contents of the directory that contains the test script?

P.D. Reiter pdreiter · May'19 Author

FYI: I launched a Docker container and ran test p1 inside it, and the test results look correct there.

docker@1eb8ad7364e2:/experiment/src$ egrep -w p1 test.sh
  p1) timeout 3 $1 <<< $(echo foo_pos0 && cat foo_pos0) | diff trusted_user.output - 
docker@1eb8ad7364e2:/experiment/src$ ./fauxware <<< $(echo foo_pos0 && cat foo_pos0)
Username: 
Password: 
Welcome to the admin console, trusted user!
docker@1eb8ad7364e2:/experiment/src$ ./fauxware <<< $(echo foo_pos0 && cat foo_pos0) | diff trusted_user.output -
docker@1eb8ad7364e2:/experiment/src$

P.D. Reiter pdreiter · May'19 Author

Oh, I see: there was a field noting whether each test passed. My eye was drawn to
'output': '/bin/bash: ./test.sh: No such file or directory\r'
which is what I was trying to debug.
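To separate "the test ran and failed" from "the test command never ran", it can help to filter the dictionary on the exit code. Bash reports status 127 when it cannot find the command, which is what the output above shows. A minimal sketch, assuming the dictionary has the shape printed in the first post; the function name is hypothetical, and the `p1` entry in the sample is made up for contrast.

```python
from typing import Dict, List

COMMAND_NOT_FOUND = 127  # bash exit status when the command cannot be found


def harness_failures(coverage: Dict[str, dict]) -> List[str]:
    """Return the names of tests whose command never ran (exit status 127)."""
    broken = []
    for name, entry in sorted(coverage.items()):
        response = entry["outcome"]["response"]
        if response["code"] == COMMAND_NOT_FOUND:
            broken.append(name)
    return broken


sample = {
    # Shape copied from the logged dictionary (duration shortened).
    "n1": {
        "coverage": {},
        "outcome": {
            "passed": False,
            "response": {
                "code": 127,
                "duration": 0.19,
                "output": "/bin/bash: ./test.sh: No such file or directory\r",
            },
        },
        "test": "n1",
    },
    # Hypothetical entry: a test that actually ran and passed.
    "p1": {
        "coverage": {},
        "outcome": {"passed": True, "response": {"code": 0, "duration": 0.2, "output": ""}},
        "test": "p1",
    },
}

print(harness_failures(sample))  # ['n1']
```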

P.D. Reiter pdreiter · May'19 Author

#7

contents of /experiment/src:

docker@1eb8ad7364e2:/experiment/src$ ls
16p2n0s.repair.c  bkdoor03  compile_commands.json          coverage.path        fauxware.i           repair.cache
Dockerfile        bkdoor04  configuration-default          coverage.path.neg    foo_pos0             repaired
Makefile          bkdoor05  configuration.0                coverage.path.pos    foo_pos1             test.sh
backdoor          bkdoor06  configuration.i.0              fauxware             foo_pos2             test.sh.old
bar_pos0          bkdoor07  configuration.i.0.brute        fauxware.0.c         ga.log               testruns.txt
bar_pos1          bkdoor08  configuration.i.0.ga           fauxware.bugzoo.yml  invalid_user.output  trusted_user.output
bkdoor01          bkdoor09  configuration.i.2382391420557  fauxware.c           repair
bkdoor02          bkdoor10  coverage.c                     fauxware.cache       repair.c
docker@1eb8ad7364e2:/experiment/src$

fauxware.bugzoo.yml:

version: "1.0"

plugins:
  - name: genprog
    image: squareslab/genprog
    environment:
       PATH: "/opt/genprog/bin:${PATH}"

blueprints:
  - type: docker
    tag: sefcom:fauxware
    file: Dockerfile
    context: .

bugs:
  - name: sefcom:fauxware
    image: "sefcom:fauxware"
    dataset: sefcom
    program: fauxware.i
    languages:
    - c
    source-location: /experiment/src
    test-harness:
      context: /experiment/src
      command: ./test.sh __ID__
      time-limit: 5
      failing: 2
      passing: 16
      time-limit: 2500
      type: genprog
    compiler:
      time-limit: 300
      type: configure-and-make
    coverage:
      context: /experiment/src
      type: gcov
      files-to-instrument: 
         - fauxware.c
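
One detail in the manifest above: `time-limit` appears twice under `test-harness` (5 and 2500). The parsed bug logged in the first post reports 2500, which suggests the later value wins. Duplicates like this are easy to miss, so here is a minimal standard-library sketch for flagging them. It is a simplified scanner for plain block-style YAML only (no flow mappings, multi-line scalars, or anchors), and the function name is hypothetical.

```python
import re
from typing import List, Tuple

# Optional indent, optional "- " list marker, then "key:" followed by space or end of line.
KEY_RE = re.compile(r"^(\s*)(-\s+)?([A-Za-z0-9_.-]+)\s*:(\s|$)")


def duplicate_keys(text: str) -> List[Tuple[int, str]]:
    """Report (line number, key) for keys repeated within one block mapping."""
    stack = []  # entries are (key column, set of keys seen in that mapping)
    duplicates = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        match = KEY_RE.match(line)
        if not match:
            continue
        indent = len(match.group(1))
        if match.group(2):
            # "- key:" starts a new mapping; its keys align after the dash.
            column = indent + len(match.group(2))
            while stack and stack[-1][0] >= column:
                stack.pop()
            stack.append((column, set()))
        else:
            column = indent
            while stack and stack[-1][0] > column:
                stack.pop()
            if not stack or stack[-1][0] < column:
                stack.append((column, set()))
        key = match.group(3)
        seen = stack[-1][1]
        if key in seen:
            duplicates.append((lineno, key))
        seen.add(key)
    return duplicates


sample = "\n".join([
    "test-harness:",
    "  context: /experiment/src",
    "  time-limit: 5",
    "  failing: 2",
    "  time-limit: 2500",
    "compiler:",
    "  time-limit: 300",
])

print(duplicate_keys(sample))  # [(5, 'time-limit')]
```

The `time-limit` under `compiler` is not reported because it belongs to a different mapping.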

P.D. Reiter pdreiter · May'19 Author

#8

FYI: GenProg could not process the input source code, so I'm using the workaround of passing the temp file as the source code input. (Changing files-to-instrument to fauxware.i did not address the error I'm seeing.)

Chris Timperley ChrisTimperley · May'19 Team

This looks like a bug in BugZoo. Specifically, it looks like the context in test-harness is being ignored. Just to double check, ./test.sh works when you manually invoke it from inside /experiment/src?

P.D. Reiter pdreiter · May'19 Author

#10

Yep! When I start a container from the Docker image, go to /experiment/src, and run test.sh for both the positive and negative test cases, it works.

Chris Timperley ChrisTimperley · May'19 Team

#11

I don't suppose that I can get access to your repository with fauxware? That will make debugging a little easier :-)
Alternatively, uploading sefcom:fauxware to DockerHub would also work.

P.D. Reiter pdreiter · May'19 Author

#12

Sure. It's a toy security patch from another project's git repo, so it should be no problem to share. I'll email a tarball.

Chris Timperley ChrisTimperley · May'19 Team

#13

From a quick glance, it looks like this is due to the Makefile. BugZoo calls the command correctly (take a look at ~/.bugzoo/logs to confirm), but the binary is destroyed after make is called (should clean_all be included in the all rule?)

P.D. Reiter pdreiter · May'19 Author

#14

Hey @ChrisTimperley, thanks for looking at this. It looks like the version of the Makefile in my test directory was GenProg-specific, so I poisoned my own directory. :(
I ended up just migrating the bug YAML files from configure-and-make to command and command_with_instrumentation.
It's working more cleanly now, but I'm still hobbled by the missing context for test execution: I'm getting the NoImplicatedLines exception in Darjeeling runs.

Should I file an issue on the BugZoo repo for the context not being applied to tests, or did you already do this?

P.D. Reiter pdreiter · May'19 Author

#15

@ChrisTimperley, I saw you closed the issue I filed that referenced your earlier statement that BugZoo had a bug with context.
So why does bugzoo bug coverage sefcom:fauxware fail? These ExecResponse outputs seem to indicate that the command wasn't run in the context directory: '_ExecResponse__output': '/bin/bash: ./test.sh: No such file or directory\r'

(djling_venv) bss-lab-1@bsslab1-Precision-Tower-5810:~/Darjeeling/tea_sampler.fauxware$ bugzoo bug coverage sefcom:fauxware |& tail -n 38
/home/bss-lab-1/Darjeeling/djling_venv/lib/python3.6/site-packages/bugzoo/core/coverage.py:215: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  d = yaml.load(f)
[n1: FAILED]

[n2: FAILED]

[p1: FAILED]

[p10: FAILED]

[p11: FAILED]

[p12: FAILED]

[p13: FAILED]

[p14: FAILED]

[p15: FAILED]

[p16: FAILED]

[p2: FAILED]

[p3: FAILED]

[p4: FAILED]

[p5: FAILED]

[p6: FAILED]

[p7: FAILED]

[p8: FAILED]

[p9: FAILED]

Here's the test coverage information from TestCoverage's __repr__ method (I added a logger.debug call to print it):

DEBUG:bugzoo.mgr.container:initialised container manager
DEBUG:bugzoo.core.coverage:coverage = '', test = 'n1', status = 'FAILED' => outcome.response = '{'_ExecResponse__code': 127, '_ExecResponse__duration': 0.1892676030001894, '_ExecResponse__output': '/bin/bash: ./test.sh: No such file or directory\r'}'
DEBUG:bugzoo.core.coverage:coverage = '', test = 'n2', status = 'FAILED' => outcome.response = '{'_ExecResponse__code': 127, '_ExecResponse__duration': 0.16958742499991786, '_ExecResponse__output': '/bin/bash: ./test.sh: No such file or directory\r'}'
DEBUG:bugzoo.core.coverage:coverage = '', test = 'p1', status = 'FAILED' => outcome.response = '{'_ExecResponse__code': 127, '_ExecResponse__duration': 0.18321107100018708, '_ExecResponse__output': '/bin/bash: ./test.sh: No such file or directory\r'}'

P.D. Reiter pdreiter · May'19 Author

#16

So I carved out some time to do some debugging (I added some logging messages), and it looks like a pre-existing coverage map file is being reused even after invoking bugzoo bug remove <mybug>:

DEBUG:bugzoo.mgr.bug:2019-05-20 12:26:31,796: Looking for cached coverage map @ /home/bss-lab-1/.bugzoo/coverage/sefcom:fauxware.coverage.yml
DEBUG:bugzoo.mgr.bug:2019-05-20 12:26:31,796: coverage map already generated at @ /home/bss-lab-1/.bugzoo/coverage/sefcom:fauxware.coverage.yml

So when developing new test cases, it would be great to know which new files are generated and which need to be removed manually, since not all intermediate files are dealt with by bugzoo bug remove <mybug>.

P.D. Reiter pdreiter · May'19 Author

#17

Soooo, by getting rid of the stale and wrong cached coverage file, I was able to get the coverage generated correctly.
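
For anyone hitting the same thing, removing the cached map can be scripted. This is a minimal sketch that assumes the cache layout seen in the log lines of post #16 (a `coverage` directory under `~/.bugzoo`, one `<bug name>.coverage.yml` file per bug). The helper names are hypothetical and this is not a BugZoo command.

```python
from pathlib import Path


def cached_coverage_path(bug_name: str, bugzoo_dir: Path) -> Path:
    """Location of the cached coverage map, following the layout seen in the logs."""
    return bugzoo_dir / "coverage" / "{}.coverage.yml".format(bug_name)


def drop_cached_coverage(bug_name: str, bugzoo_dir: Path = Path.home() / ".bugzoo") -> bool:
    """Delete the cached coverage map if present; return True if a file was removed."""
    path = cached_coverage_path(bug_name, bugzoo_dir)
    if path.is_file():
        path.unlink()
        return True
    return False
```

Calling `drop_cached_coverage("sefcom:fauxware")` before rerunning the coverage command forces the map to be regenerated, assuming the layout above still applies to the installed version.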

It would be nice for BugZoo to let the user know that it is using a cached version of the coverage for that test.

ChrisTimperley created GitHub issue "Use image hash as key when caching coverage" based on post #17 5 years ago
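
The idea in the issue title can be illustrated with a short sketch: if the cache key depends on the image ID as well as the bug name, rebuilding the image yields a different key and the stale map is never picked up. This is only an illustration of the idea, not BugZoo's implementation. The function name and the sample IDs are hypothetical, and the image ID is assumed to come from something like `docker image inspect --format '{{.Id}}' IMAGE`.

```python
import hashlib


def coverage_cache_key(bug_name: str, image_id: str) -> str:
    """Derive a filesystem-safe cache key from the bug name and the image ID."""
    payload = "{}\n{}".format(bug_name, image_id).encode("utf-8")
    # A truncated SHA-256 digest keeps the key short and free of ':' or '/'.
    return hashlib.sha256(payload).hexdigest()[:16]


# Hypothetical image IDs: the same bug name maps to different keys after a rebuild.
old_key = coverage_cache_key("sefcom:fauxware", "sha256:aaaa")
new_key = coverage_cache_key("sefcom:fauxware", "sha256:bbbb")
print(old_key != new_key)  # True
```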
