Build a production-ready isolated test execution service for CI/CD pipelines. This cookbook demonstrates how to create a test execution platform that runs tests in isolated sandboxes, supports parallel execution, and aggregates results.

Overview

An isolated test execution service runs every test in its own clean environment, which keeps results reliable and prevents interference between runs. This pattern is essential for CI/CD pipelines, test automation platforms, and quality assurance systems. HopX provides the secure isolation this use case requires.

Prerequisites

  • HopX API key (Get one here)
  • Python 3.8+ or Node.js 16+
  • Understanding of test frameworks (pytest, jest, etc.)
  • Basic knowledge of CI/CD concepts

Architecture

┌──────────────┐
│  CI/CD       │ Test execution requests
│   Pipeline   │
└──────┬───────┘
       │
       ▼
┌─────────────────┐
│  Test Runner    │ Parallel execution
└──────┬──────────┘
       │
       ▼
┌─────────────────┐
│  HopX Sandboxes │ Isolated execution
│  (One per test) │
└──────┬──────────┘
       │
       ▼
┌─────────────────┐
│  Result         │ Aggregate & report
│  Aggregator     │
└─────────────────┘

Implementation

Step 1: Basic Isolated Test Execution

Execute tests in isolated sandboxes:
from hopx_ai import Sandbox
import os
import json
from typing import Dict, List, Any
from concurrent.futures import ThreadPoolExecutor, as_completed

class IsolatedTestExecutor:
    def __init__(self, api_key: str):
        self.api_key = api_key
    
    def execute_test(self, test_code: str, test_framework: str = "pytest", timeout: int = 60) -> Dict[str, Any]:
        """Execute a single test in isolated sandbox"""
        sandbox = None
        try:
            # Create fresh sandbox for each test
            sandbox = Sandbox.create(
                template="code-interpreter",
                api_key=self.api_key,
                timeout_seconds=timeout + 10
            )
            
            # Write test file
            test_file = "/workspace/test_file.py"
            sandbox.files.write(test_file, test_code)
            
            # Install test framework if needed
            if test_framework == "pytest":
                sandbox.commands.run("pip install pytest", timeout=30)
                # Execute test
                result = sandbox.commands.run(f"pytest {test_file} -v", timeout=timeout)
            elif test_framework == "unittest":
                # Run via unittest discovery so the test file is found regardless of the sandbox's working directory
                result = sandbox.commands.run(
                    f"python -m unittest discover -s /workspace -p {os.path.basename(test_file)} -v",
                    timeout=timeout
                )
            else:
                result = sandbox.run_code(test_code, timeout=timeout)
            
            # Parse test results
            test_results = self._parse_test_results(result.stdout, test_framework)
            
            return {
                "success": result.exit_code == 0,
                "exit_code": result.exit_code,
                "stdout": result.stdout,
                "stderr": result.stderr,
                "test_results": test_results,
                "execution_time": getattr(result, 'execution_time', 0)
            }
            
        except Exception as e:
            return {
                "success": False,
                "error": str(e),
                "stderr": str(e)
            }
        finally:
            if sandbox:
                sandbox.kill()
    
    def _parse_test_results(self, output: str, framework: str) -> Dict[str, Any]:
        """Parse test framework output"""
        if framework == "pytest":
            # Parse pytest output
            passed = output.count("PASSED")
            failed = output.count("FAILED")
            return {
                "passed": passed,
                "failed": failed,
                "total": passed + failed
            }
        else:
            return {
                "passed": 0,
                "failed": 0,
                "total": 0
            }

# Usage
executor = IsolatedTestExecutor(api_key=os.getenv("HOPX_API_KEY"))

test_code = """
def test_addition():
    assert 1 + 1 == 2

def test_subtraction():
    assert 5 - 3 == 2
"""

result = executor.execute_test(test_code, test_framework="pytest")
print(json.dumps(result, indent=2))
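
Counting PASSED and FAILED tokens works for pytest -v output, but it can over-count if a test happens to print those words. A slightly more robust heuristic, using only the standard library, is to parse the summary line pytest prints at the end of a run (for example "2 passed, 1 failed in 0.12s"). The helper below is a sketch that assumes the default pytest summary format:
import re
from typing import Any, Dict

def parse_pytest_summary(output: str) -> Dict[str, Any]:
    """Parse counts from pytest's final summary line, e.g. '2 passed, 1 failed in 0.12s'."""
    counts = {"passed": 0, "failed": 0, "errors": 0, "skipped": 0}
    for number, status in re.findall(r"(\d+) (passed|failed|errors?|skipped)", output):
        key = "errors" if status.startswith("error") else status
        counts[key] = int(number)
    counts["total"] = counts["passed"] + counts["failed"] + counts["errors"]
    return counts

If you need skipped and error counts in the aggregated results, call this helper from the pytest branch of _parse_test_results.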

Step 2: Parallel Test Execution

Execute multiple tests in parallel:
class ParallelTestExecutor(IsolatedTestExecutor):
    def execute_test_suite(self, test_suite: List[Dict], max_workers: int = 5) -> Dict[str, Any]:
        """Execute multiple tests in parallel"""
        results = []
        
        with ThreadPoolExecutor(max_workers=max_workers) as executor:
            # Submit all tests
            future_to_test = {
                executor.submit(
                    self.execute_test,
                    test["code"],
                    test.get("framework", "pytest"),
                    test.get("timeout", 60)
                ): test
                for test in test_suite
            }
            
            # Collect results as they complete
            for future in as_completed(future_to_test):
                test = future_to_test[future]
                try:
                    result = future.result()
                    results.append({
                        "test_name": test.get("name", "unknown"),
                        "result": result
                    })
                except Exception as e:
                    results.append({
                        "test_name": test.get("name", "unknown"),
                        "result": {
                            "success": False,
                            "error": str(e)
                        }
                    })
        
        # Aggregate results
        total_tests = len(results)
        passed_tests = sum(1 for r in results if r["result"].get("success", False))
        failed_tests = total_tests - passed_tests
        
        return {
            "total_tests": total_tests,
            "passed": passed_tests,
            "failed": failed_tests,
            "success_rate": (passed_tests / total_tests * 100) if total_tests > 0 else 0,
            "results": results
        }

# Usage
executor = ParallelTestExecutor(api_key=os.getenv("HOPX_API_KEY"))

test_suite = [
    {
        "name": "test_math",
        "code": """
def test_addition():
    assert 1 + 1 == 2
""",
        "framework": "pytest",
        "timeout": 30
    },
    {
        "name": "test_strings",
        "code": """
def test_concatenation():
    assert "hello" + " " + "world" == "hello world"
""",
        "framework": "pytest",
        "timeout": 30
    }
]

results = executor.execute_test_suite(test_suite, max_workers=2)
print(json.dumps(results, indent=2))

Step 3: Test Isolation Strategies

Ensure complete test isolation:
class IsolatedTestRunner:
    def __init__(self, api_key: str):
        self.api_key = api_key
    
    def run_isolated_test(self, test_config: Dict) -> Dict[str, Any]:
        """Run test with complete isolation"""
        sandbox = None
        try:
            # Create sandbox with clean environment
            sandbox = Sandbox.create(
                template="code-interpreter",
                api_key=self.api_key,
                timeout_seconds=test_config.get("timeout", 60) + 10,
                env_vars=test_config.get("env_vars", {})
            )
            
            # Set up test environment
            setup_code = test_config.get("setup", "")
            if setup_code:
                setup_result = sandbox.run_code(setup_code, timeout=30)
                if not setup_result.success:
                    return {
                        "success": False,
                        "error": f"Setup failed: {setup_result.stderr}",
                        "stage": "setup"
                    }
            
            # Write test files
            for file_config in test_config.get("files", []):
                sandbox.files.write(
                    file_config["path"],
                    file_config["content"]
                )
            
            # Execute test
            test_result = sandbox.run_code(
                test_config["test_code"],
                language=test_config.get("language", "python"),
                timeout=test_config.get("timeout", 60)
            )
            
            # Cleanup (teardown)
            teardown_code = test_config.get("teardown", "")
            if teardown_code:
                sandbox.run_code(teardown_code, timeout=30)
            
            return {
                "success": test_result.success,
                "stdout": test_result.stdout,
                "stderr": test_result.stderr,
                "exit_code": test_result.exit_code,
                "execution_time": test_result.execution_time
            }
            
        except Exception as e:
            return {
                "success": False,
                "error": str(e)
            }
        finally:
            # Always cleanup - fresh sandbox for next test
            if sandbox:
                sandbox.kill()

# Usage
runner = IsolatedTestRunner(api_key=os.getenv("HOPX_API_KEY"))

test_config = {
    "test_code": """
import unittest

class TestMath(unittest.TestCase):
    def test_addition(self):
        self.assertEqual(1 + 1, 2)
    
    def test_multiplication(self):
        self.assertEqual(2 * 3, 6)

if __name__ == '__main__':
    unittest.main()
""",
    "setup": "import unittest",
    "timeout": 30,
    "env_vars": {
        "TEST_MODE": "true"
    },
    "files": [
        {
            "path": "/workspace/test_utils.py",
            "content": "def helper(): return 'helper'"
        }
    ]
}

result = runner.run_isolated_test(test_config)
print(json.dumps(result, indent=2))

Step 4: Result Aggregation and Reporting

Aggregate test results and generate reports:
class TestResultAggregator:
    def __init__(self):
        self.results = []
    
    def add_result(self, test_name: str, result: Dict[str, Any]):
        """Add test result to aggregation"""
        self.results.append({
            "test_name": test_name,
            "result": result,
            "timestamp": self._get_timestamp()
        })
    
    def generate_report(self) -> Dict[str, Any]:
        """Generate aggregated test report"""
        total = len(self.results)
        passed = sum(1 for r in self.results if r["result"].get("success", False))
        failed = total - passed
        
        # Calculate execution times
        execution_times = [
            r["result"].get("execution_time", 0)
            for r in self.results
            if "execution_time" in r["result"]
        ]
        avg_time = sum(execution_times) / len(execution_times) if execution_times else 0
        total_time = sum(execution_times)
        
        # Group by status
        passed_tests = [r for r in self.results if r["result"].get("success", False)]
        failed_tests = [r for r in self.results if not r["result"].get("success", False)]
        
        return {
            "summary": {
                "total": total,
                "passed": passed,
                "failed": failed,
                "success_rate": (passed / total * 100) if total > 0 else 0,
                "average_execution_time": avg_time,
                "total_execution_time": total_time
            },
            "passed_tests": [
                {
                    "name": r["test_name"],
                    "execution_time": r["result"].get("execution_time", 0)
                }
                for r in passed_tests
            ],
            "failed_tests": [
                {
                    "name": r["test_name"],
                    "error": r["result"].get("error") or r["result"].get("stderr", ""),
                    "execution_time": r["result"].get("execution_time", 0)
                }
                for r in failed_tests
            ],
            "all_results": self.results
        }
    
    def _get_timestamp(self) -> str:
        from datetime import datetime
        return datetime.now().isoformat()

# Usage
aggregator = TestResultAggregator()

# Add results
aggregator.add_result("test_1", {"success": True, "execution_time": 1.5})
aggregator.add_result("test_2", {"success": False, "error": "Assertion failed"})
aggregator.add_result("test_3", {"success": True, "execution_time": 2.1})

# Generate report
report = aggregator.generate_report()
print(json.dumps(report, indent=2))
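
Most CI systems can ingest JUnit-style XML test reports. The sketch below converts the aggregated report into a minimal JUnit XML document with only the standard library; it assumes the field names produced by generate_report() above and does not cover the full JUnit schema:
import xml.etree.ElementTree as ET

def report_to_junit_xml(report: dict, suite_name: str = "isolated-tests") -> str:
    """Render the aggregated report as a minimal JUnit-style XML string."""
    summary = report["summary"]
    suite = ET.Element(
        "testsuite",
        name=suite_name,
        tests=str(summary["total"]),
        failures=str(summary["failed"]),
        time=f"{summary['total_execution_time']:.3f}",
    )
    for test in report["passed_tests"]:
        ET.SubElement(suite, "testcase", name=test["name"], time=f"{test['execution_time']:.3f}")
    for test in report["failed_tests"]:
        case = ET.SubElement(suite, "testcase", name=test["name"], time=f"{test['execution_time']:.3f}")
        failure = ET.SubElement(case, "failure", message="test failed")
        failure.text = test["error"]
    return ET.tostring(suite, encoding="unicode")

# Write the XML wherever your CI job expects it
with open("junit_report.xml", "w") as f:
    f.write(report_to_junit_xml(report))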

Best Practices

Isolation

Always create a fresh sandbox for each test to ensure complete isolation, and never reuse sandboxes between tests. A context-manager sketch after this list shows one way to enforce that rule.
  1. Fresh Sandboxes: One sandbox per test execution
  2. Clean Environment: Start with clean environment variables
  3. No Shared State: Tests should not depend on each other
  4. Resource Cleanup: Always clean up sandboxes after tests
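
One lightweight way to make the one-sandbox-per-test rule hard to violate is a context manager that always kills the sandbox, even when the test code raises. This is a sketch built from the same Sandbox.create and kill calls used in the steps above:
import os
from contextlib import contextmanager
from hopx_ai import Sandbox

@contextmanager
def fresh_sandbox(api_key: str, timeout_seconds: int = 60):
    """Create a sandbox for exactly one test and guarantee it is killed afterwards."""
    sandbox = Sandbox.create(
        template="code-interpreter",
        api_key=api_key,
        timeout_seconds=timeout_seconds
    )
    try:
        yield sandbox
    finally:
        sandbox.kill()

# Usage: the sandbox never outlives the with-block, even if the test raises
with fresh_sandbox(os.getenv("HOPX_API_KEY")) as sandbox:
    sandbox.files.write("/workspace/test_file.py", "def test_ok():\n    assert True\n")
    sandbox.commands.run("pip install pytest", timeout=30)
    print(sandbox.commands.run("pytest /workspace/test_file.py -v", timeout=60).stdout)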

Performance

Use parallel execution for independent tests, but ensure each test runs in its own isolated sandbox.
  1. Parallel Execution: Run independent tests in parallel
  2. Resource Limits: Set appropriate timeouts per test
  3. Result Caching: Cache results for deterministic tests so unchanged code is not re-run (see the sketch after this list)
  4. Efficient Cleanup: Kill sandboxes as soon as their test finishes so resources are released promptly
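
Result caching only makes sense for deterministic tests, but when it applies it can skip entire sandbox launches. The sketch below keys an in-memory cache on a hash of the test code and framework; the class name and caching policy are illustrative, not part of the HopX SDK:
import hashlib
from typing import Any, Dict

class CachedTestExecutor(IsolatedTestExecutor):
    """Skip re-running a test whose code has not changed (in-memory cache, sketch only)."""
    def __init__(self, api_key: str):
        super().__init__(api_key)
        self._cache: Dict[str, Dict[str, Any]] = {}

    def execute_test(self, test_code: str, test_framework: str = "pytest",
                     timeout: int = 60) -> Dict[str, Any]:
        key = hashlib.sha256(f"{test_framework}:{test_code}".encode()).hexdigest()
        if key in self._cache:
            return self._cache[key]
        result = super().execute_test(test_code, test_framework, timeout)
        if result.get("success"):
            # Only cache passing runs so transient failures are always retried
            self._cache[key] = result
        return result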

Reliability

  1. Error Handling: Handle all error cases gracefully
  2. Timeout Management: Set appropriate timeouts
  3. Retry Logic: Retry transient infrastructure failures, not assertion failures (see the sketch after this list)
  4. Logging: Log all test executions for debugging
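
Retries should target infrastructure failures (sandbox creation errors, network timeouts), not genuine assertion failures, otherwise flaky tests get masked. A sketch of that distinction, based on the result shape returned by execute_test() above:
import time
from typing import Any, Dict

def run_with_retry(executor: IsolatedTestExecutor, test_code: str,
                   retries: int = 2, backoff: float = 2.0) -> Dict[str, Any]:
    """Retry only when the failure came from the execution layer, not from the test itself."""
    last_result: Dict[str, Any] = {}
    for attempt in range(retries + 1):
        last_result = executor.execute_test(test_code)
        # execute_test() only sets "error" when the sandbox or SDK raised an exception
        infrastructure_failure = not last_result.get("success") and "error" in last_result
        if not infrastructure_failure:
            return last_result  # passed, or failed on a real assertion: do not retry
        time.sleep(backoff * (attempt + 1))
    return last_result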

Real-World Examples

This pattern is used by:
  • GitHub Actions: CI/CD test execution
  • Jenkins: Continuous integration platform
  • CircleCI: Cloud-based CI/CD
  • Travis CI: Continuous integration service

Next Steps

  1. Integrate with CI/CD platforms (GitHub Actions, GitLab CI, etc.); a minimal entry-point sketch follows this list
  2. Add test result persistence and history
  3. Implement test result notifications
  4. Create test execution dashboard
  5. Add support for more test frameworks
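
For CI/CD integration, the simplest hook is a small script the pipeline can invoke, failing the job whenever a test fails. The file names and suite format below are illustrative assumptions, reusing ParallelTestExecutor from Step 2:
# ci_run_tests.py (hypothetical entry point; call it from any CI step)
import json
import os
import sys

from test_runner import ParallelTestExecutor  # Step 2 class; the module name is an assumption

def main() -> int:
    executor = ParallelTestExecutor(api_key=os.environ["HOPX_API_KEY"])
    # test_suite.json is assumed to hold a list in the Step 2 format:
    # [{"name": ..., "code": ..., "framework": ..., "timeout": ...}, ...]
    with open("test_suite.json") as f:
        test_suite = json.load(f)
    report = executor.execute_test_suite(test_suite, max_workers=4)
    print(json.dumps({k: report[k] for k in ("total_tests", "passed", "failed", "success_rate")}, indent=2))
    return 0 if report["failed"] == 0 else 1  # non-zero exit fails the CI job

if __name__ == "__main__":
    sys.exit(main())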