Paul Rogers edited this page Nov 27, 2016 · 26 revisions

Testing Tips

Drill makes extensive use of JUnit and other libraries for testing. This page provides pointers to the information you need to work with Drill tests. We don't repeat that information; you will want to follow the links and read the original material to get a complete understanding of the libraries that Drill uses.

Caveat: information here about Drill is "reverse engineered" from the code; this page has not yet had the benefit of insight from the developers who created Drill's test structure.

JUnit

Most of us know the basics of JUnit. Drill uses many advanced features that we mention here. Drill uses JUnit 4, currently version 4.11.

Good references:

JUnit/Hamcrest Idioms

Drill tests use the JUnit 4 series that uses annotations to identify tests. Drill makes use of the "Hamcrest" additions (which seem to have come from a separate project, later merged into JUnit, hence the strange naming.) Basic rules:

  • All tests are packaged into classes, and all test class names start or end with the word "Test". In Drill, most tests use the prefix format: "TestMumble".
  • Test methods are indicated with @Test.
  • Disabled tests are indicated with @Ignore("reason for ignoring")
  • Tests use "classic" JUnit assertions such as assertEquals(opt_msg, expected, actual). Note that the optional message comes first.
  • Tests also use the newer "Hamcrest" assertThat formulation. The Hamcrest project provided a system based on assertions and matchers that are quite handy for cases that are cumbersome with the JUnit-style assertions.
  • Many tests make use of the test fixture annotations. These include methods marked to run before or after all tests in a class (@BeforeClass and @AfterClass) and those that run before or after each test (@Before and @After).
  • The base DrillTest class uses the ExceptionRule to declare that no test should throw an exception.
  • Some Drill tests verify exceptions directly using the expected parameter of @Test:
  @Test(expected = ExpressionParsingException.class)
  public void testSomething() {
    // ... code that is expected to throw ExpressionParsingException ...
  }
  • Other code uses the try/catch idiom.
  • Drill tests have the potential to run for a long time, or hang, if things go wrong. To prevent this, Drill tests use a timeout. The main Drill test base class, DrillTest, uses a timeout rule to set a default timeout of 50 seconds:
@Rule public final TestRule TIMEOUT = TestTools.getTimeoutRule(50000);
  • Individual tests (override?) this rule with the timeout parameter to the Test annotation: @Test(timeout=1000). This form can only decrease (but not increase) the timeout set by the timeout rule.
  • Tests that need a temporary file system folder use the TemporaryFolder rule.
  • The base DrillTest class uses the TestName rule to make the current test name available to code: System.out.println(TEST_NAME.getMethodName());.
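
The idioms above can be pulled together in one small class. This is only a sketch, not taken from Drill: the class name TestIdiomsExample and all of its members are hypothetical, it does not extend DrillTest, and it assumes JUnit 4.11 (with its bundled Hamcrest matchers) is on the class path.

```java
import static org.hamcrest.CoreMatchers.containsString;
import static org.hamcrest.CoreMatchers.is;
import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertThat;
import static org.junit.Assert.assertTrue;

import java.io.File;
import java.io.IOException;

import org.junit.Before;
import org.junit.Ignore;
import org.junit.Rule;
import org.junit.Test;
import org.junit.rules.TemporaryFolder;
import org.junit.rules.TestName;
import org.junit.rules.TestRule;
import org.junit.rules.Timeout;

// Hypothetical stand-alone test class; it does not extend DrillTest.
public class TestIdiomsExample {

  // Default timeout for every test in this class, in milliseconds.
  @Rule public final TestRule timeout = new Timeout(50_000);

  // Creates a scratch directory before each test and deletes it afterwards.
  @Rule public final TemporaryFolder tempDir = new TemporaryFolder();

  // Exposes the name of the currently running test method.
  @Rule public final TestName testName = new TestName();

  private StringBuilder buffer;

  @Before
  public void setUp() {
    // Runs before each test method.
    buffer = new StringBuilder("drill");
  }

  @Test
  public void testClassicAssertion() {
    // Classic form: optional message first, then expected, then actual.
    assertEquals("unexpected length", 5, buffer.length());
  }

  @Test
  public void testHamcrestAssertion() {
    // Hamcrest form: actual value first, then a matcher.
    assertThat(buffer.toString(), is("drill"));
    assertThat(testName.getMethodName(), containsString("Hamcrest"));
  }

  @Test(expected = NumberFormatException.class)
  public void testExpectedException() {
    // The test passes only if this call throws NumberFormatException.
    Integer.parseInt("not a number");
  }

  @Test(timeout = 1000)
  public void testTemporaryFolder() throws IOException {
    File scratch = tempDir.newFile("scratch.txt");
    assertTrue(scratch.exists());
  }

  @Ignore("illustrates how a disabled test is marked")
  @Test
  public void testDisabled() {
    throw new IllegalStateException("never runs");
  }
}
```

Run from an IDE or a JUnit runner, this should report four tests run and one ignored. The rule fields must be public and non-static for JUnit to pick them up.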

Additional Resources

Some other resources that may be of interest moving forward:

  • JUnitParams - a cleaner way to parameterize tests.
  • Assumptions for declaring dependencies and environment setup that a test assumes.
  • JUnit Rules may occasionally be helpful for specialized tests.
  • Categories to, perhaps, identify those "smoke" tests that should be run frequently, and a larger, more costly set of "full" tests to be run before commits, etc.
  • System Rules (http://stefanbirkner.github.io/system-rules/) - A collection of JUnit rules for testing code that uses java.lang.System such as printing to System.out, environment variables, etc.
  • The Stopwatch rule added in JUnit 4.12 to measure the time a test takes.
  • The DisableOnDebug rule added in JUnit 4.12, which can turn off other rules when needed in a debug session (to prevent, say, timeouts, etc.)

JMockit

Drill is built as a highly-coupled network of classes. That is, each part of Drill directly depends on many other parts. This structure, in part, reflects the complexity of the problem that Drill solves, but the resulting tight coupling works against us when testing components in isolation. (The very definition of tight coupling is that components cannot be easily isolated.)

A solution to this problem is "mocking": the idea that we substitute an "artificial" dependency for the real one. Drill uses the JMockit framework for this purpose.

Consider a test such as org.apache.drill.exec.expr.ExpressionTest, which requires a RecordBatch to test each expression. (In Drill, we use the data itself, rather than the schema, to get the definition of columns.) Building a RecordBatch is complex and requires building up a set of value vectors and other machinery. Yet, all the test needs is to return the type of a single field. This is a perfect case for mocking. Some tests need the RecordBatch but never use it. Those tests look like this:

  @Test
  public void testBasicExpression(@Injectable RecordBatch batch) throws Exception {
    checkExpressionCode("if(true) then 1 else 0 end", batch, "/code/expr/basicExpr.txt");
  }

Others need a single method to return the correct (or "expected") value:

  @Test
  public void testExprParseUpperExponent(@Injectable RecordBatch batch) throws Exception {
    final TypeProtos.MajorType type = Types.optional(MinorType.FLOAT8);
    final TypedFieldId tfid = new TypedFieldId(type, false, 0);

    new Expectations( ) {{
      batch.getValueVectorId(new SchemaPath("$f0", ExpressionPosition.UNKNOWN)); result=tfid;
    }};

    checkExpressionCode("multiply(`$f0`, 1.0E-4)", batch, "/code/expr/upperExponent.txt");
  }

Mocking is a quick and easy way to focus on one bit of code (the expression parser) without doing a bunch of work to set up other parts that are not actually needed. The mechanism is, however, not entirely obvious, so you'll want to spend time reading up on JMockit and looking at the existing examples in Drill.
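
To make the underlying idea concrete, here is a minimal sketch of mocking that uses only the JDK's dynamic proxies rather than JMockit. It is not how JMockit works internally and is not Drill code: FieldSource, TypeDescriber and HandRolledMock are hypothetical names invented for the illustration, with FieldSource standing in for a hard-to-build dependency such as RecordBatch. The sketch assumes Java 8 or later for the lambda.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

// Hypothetical stand-in for a hard-to-build dependency.
interface FieldSource {
  String fieldType(String fieldName);
  int rowCount();
}

// Hypothetical code under test: it only needs the type of one field.
class TypeDescriber {
  static String describe(FieldSource source, String fieldName) {
    return fieldName + ":" + source.fieldType(fieldName);
  }
}

class HandRolledMock {
  // Builds a FieldSource whose fieldType() returns a canned value and whose
  // other methods fail loudly, so accidental use shows up in the test.
  static FieldSource mockFieldSource(final String cannedType) {
    InvocationHandler handler = (proxy, method, args) -> {
      if (method.getName().equals("fieldType")) {
        return cannedType;
      }
      throw new UnsupportedOperationException(method.getName());
    };
    return (FieldSource) Proxy.newProxyInstance(
        FieldSource.class.getClassLoader(),
        new Class<?>[] { FieldSource.class },
        handler);
  }

  public static void main(String[] args) {
    FieldSource mock = mockFieldSource("FLOAT8");
    // Prints "$f0:FLOAT8" without ever building a real data source.
    System.out.println(TypeDescriber.describe(mock, "$f0"));
  }
}
```

JMockit's @Injectable and Expectations play the same two roles with far less ceremony: the first supplies the artificial object, the second records which calls should return which values.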

References:

Mockito

Drill depends on the Mockito framework. (Need to find & document usages.)

Drill Test Structure

Test Logging

Drill code makes extensive use of logging to provide the developer with in-depth information about Drill's internals. The logging is vital when tracking down a problem, but a distraction at other times. To control logging, you must understand the various forms of logging and logging control.

A production Drill uses Logback Classic as its logger and controls logging via the $DRILL_HOME/conf/logback.xml (or, alternatively, $DRILL_SITE/logback.xml) file. This configuration is not used when testing, however.

When you run Drill tests via an IDE, the tests still use Logback Classic, which looks for its logback.xml file on the class path. Logback will find one in drill-common/src/test/resources (which is available on the class path as just logback.xml.) If you see a bunch of Logback logging messages, then the key problem is that multiple copies of logback.xml exist on the class path. (At the time of this writing, we're working to clean up the multiple copies.) Configure logging in this test file. For example, to set the debug log level:

  <logger name="org.apache.drill" additivity="false">
    <level value="debug" />
    ...
  </logger>

However, when you run tests via Maven, they run under the Surefire plugin (both described below) but, strangely, do not use Maven's own logging. Instead, Surefire seems to use Java util logging. You can control the logging output of tests in several ways.

To control logging for every build, configure logging per this blog post. While Logback looks for its configuration file on the class path, Java util logging does not. Instead, you must specify the path to the config file via a system property: java.util.logging.config.file.
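
As a rough illustration of what such a configuration does once it is loaded, the sketch below feeds configuration text directly to the JDK's LogManager and reads back the root logger's level. The class JulConfigSketch is hypothetical and is not part of Drill. Loading from a string is only a convenience for the example, since normally the JVM reads the file named by java.util.logging.config.file at startup. Note that Java util logging uses the level names SEVERE, WARNING, INFO, CONFIG, FINE, FINER and FINEST.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.logging.Level;
import java.util.logging.LogManager;
import java.util.logging.Logger;

class JulConfigSketch {
  // Loads a Java util logging configuration from text and returns the level
  // that the root logger ends up with.
  static Level loadRootLevel(String properties) {
    try {
      LogManager.getLogManager().readConfiguration(
          new ByteArrayInputStream(properties.getBytes(StandardCharsets.UTF_8)));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    // The root logger has the empty string as its name.
    return Logger.getLogger("").getLevel();
  }

  public static void main(String[] args) {
    String config = "handlers = java.util.logging.ConsoleHandler\n"
        + ".level = WARNING\n";
    System.out.println("Root level: " + loadRootLevel(config));
    // Shows whether a config file was named on the command line.
    System.out.println("Config file property: "
        + System.getProperty("java.util.logging.config.file"));
  }
}
```

Running the same main method inside a test JVM is one way to check whether the system property actually reached the forked process.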

First, create a Java util logging configuration file (which is distinct from the normal Logback configuration):

drill-root/src/test/resources/logging.properties

Add the following content:

handlers = java.util.logging.ConsoleHandler
.level = WARNING

Then, modify the Drill root pom.xml file:

 <artifactId>maven-surefire-plugin</artifactId>
 ...
  <systemPropertyVariables>
   ...
   <java.util.logging.config.file>
   src/test/resources/logging.properties
   </java.util.logging.config.file>
  ...

However, I could not get the above to actually work... This Stack Overflow post suggests using a command line argument:

mvn surefire:test -Dtest=TestClassPathScanner \
-Djava.util.logging.config.file=/path/to/drill/src/test/resources/logging.properties

But, that does not work for me either...

This is yet another variation that also does not work...

The version that does work is based on this Stack Overflow post:

mvn surefire:test -Dtest=TestClassPathScanner \
        -Dlogback.configurationFile=/path/to/drill/common/src/test/resources/logback.xml

To help diagnose problems:

mvn surefire:test -Dtest=TestClassPathScanner \
       -Dlogback.statusListenerClass=ch.qos.logback.core.status.OnConsoleStatusListener

Maven

The root Drill pom.xml declares a test-time dependency on JUnit 4.11:

    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.11</version>
      <scope>test</scope>
    </dependency>

Since this dependency is in the root POM, there is no need to add it to the POM files of each Drill module.

It is often helpful to know the set of properties that Maven defines and are available for use in the POM file.

Surefire Plugin

Drill uses the Maven Surefire plugin to run tests. The root POM contains significant test configuration that you may or may not want to include when running tests in a debugger:

          <configuration>
            <argLine>-Xms512m -Xmx3g -Ddrill.exec.http.enabled=false
              -Ddrill.exec.sys.store.provider.local.write=false
              -Dorg.apache.drill.exec.server.Drillbit.system_options=\
               "org.apache.drill.exec.compile.ClassTransformer.scalar_replacement=on"
              -Ddrill.test.query.printing.silent=true
              -Ddrill.catastrophic_to_standard_out=true
              -XX:MaxPermSize=512M -XX:MaxDirectMemorySize=3072M
              -Djava.net.preferIPv4Stack=true
              -Djava.awt.headless=true
              -XX:+CMSClassUnloadingEnabled -ea</argLine>
            <forkCount>${forkCount}</forkCount>
            <reuseForks>true</reuseForks>
            <additionalClasspathElements>
              <additionalClasspathElement>./exec/jdbc/src/test/resources/storage-plugins.json</additionalClasspathElement>
            </additionalClasspathElements>

You can run individual tests as follows. First, cd into the project that contains the test. Then, run the test. (You'll get an error if you try to run the test from the root Drill directory.)

cd /path/to/drill/project
mvn surefire:test -Dtest=TestCircle

Logging

Maven output is controlled by Maven's logging mechanism which seems to be SLF4J Simple.

To "quiet" the amount of Maven output, you must modify the logging configuration file in your Maven installation. (There seems to be no way to modify logging in the POM file itself.)

$ cd `dirname \`which mvn\``/../conf/logging
$ pwd # Should print /some/path/apache-maven-3.x.y/conf/logging
$ vi simplelogger.properties

The following suppresses Maven's "info" logging, leaving only warnings and errors:

#org.slf4j.simpleLogger.defaultLogLevel=info
org.slf4j.simpleLogger.defaultLogLevel=warn

Since this is a Maven configuration (not a POM file change), you must make it on each machine on which you run Maven.

Unfortunately, the maven-checkstyle-plugin writes its errors to the log at the INFO log level, so the above configuration hides them. (Need to find a way to enable this plugin's output.)

Eclipse

Using JUnit with Eclipse is trivial:

  • To run all tests in a class, select the class name (or ensure no text is selected) and use the context menu option Debug As... --> JUnit.
  • To run a single test, select the name of the test method, and invoke the same menu command.

Using JMockit is a bit more fussy. See the Getting Started page for info. See also the Development Tips page for properly setting up Eclipse.

Logback Logging

There seems to be a subtle difference in how Eclipse and Maven build the class path used to run tests. Eclipse includes the test resources from all dependencies, whereas Surefire does not. This means that, if you run a test in exec/java-exec, such as TestClassPathScanner, the test resources from common will be on the class path if run from Eclipse, but not if run from Surefire.

For the most part, having extra items on the class path in Eclipse is benign. Problems occur, however, when files of the same name appear multiple times on the class path. This is the situation with Logback.

When run under Maven Surefire, each project requires its own logback.xml (or, preferably, logback-test.xml) in src/test/resources.

But, when run under Eclipse, the test program sees all the Logback config files on the path, causing Logback to print a bunch of log messages.
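
One way to see which copies are involved is to ask the class loader for every location of the configuration file. The following is a small diagnostic sketch using only the JDK. The class ClasspathDuplicates is hypothetical and not part of Drill, and it needs to be run with the same class path as the test in question (for example, from the same Eclipse project).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

class ClasspathDuplicates {
  // Returns every class path location from which the named resource can be loaded.
  static List<URL> locate(String resourceName) {
    try {
      ClassLoader loader = ClasspathDuplicates.class.getClassLoader();
      return Collections.list(loader.getResources(resourceName));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    for (String name : new String[] {"logback.xml", "logback-test.xml"}) {
      List<URL> hits = locate(name);
      System.out.println(name + ": " + hits.size() + " location(s)");
      for (URL url : hits) {
        System.out.println("  " + url);
      }
    }
  }
}
```

More than one location for the same file name indicates the duplicate-configuration situation described above.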

IntelliJ

Hey IntelliJ users, what advice can we put here?
